Hi,
First of all - thank you for your amazing work. pybedtools has been super useful for my research so far and I am very grateful.
I'm trying to run 'randomstats' with the following args:
results_dict = a.randomstats(b, iterations=1000, new=True, genome_fn=chromsizes_fn, processes=4, shuffle_kwargs={"chrom": True}, intersect_kwargs={"f": 1})
```
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
~/.local/lib/python3.7/site-packages/pybedtools/bedtool.py in parallel_apply(self, iterations, func, func_args, func_kwargs, processes, _orig_pool)
2932 for it in range(iterations):
-> 2933 yield func(*func_args, **func_kwargs)
2934 raise StopIteration
~/.local/lib/python3.7/site-packages/pybedtools/stats.py in random_intersection(x, y, genome_fn, shuffle_kwargs, intersect_kwargs)
16 result = len(zz)
---> 17 helpers.close_or_delete(z, zz)
18 return result
~/.local/lib/python3.7/site-packages/pybedtools/helpers.py in close_or_delete(*args)
547 if hasattr(x.fn, "throw"):
--> 548 x.fn.throw(StopIteration)
549
StopIteration:

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in
~/.local/lib/python3.7/site-packages/pybedtools/bedtool.py in randomstats(self, other, iterations, new, genome_fn, include_distribution, **kwargs)
2846 )
2847 distribution = self._randomintersection(
-> 2848 other, iterations=iterations, genome_fn=genome_fn, **kwargs
2849 )
2850
~/.local/lib/python3.7/site-packages/pybedtools/bedtool.py in _randomintersection(self, other, iterations, genome_fn, intersect_kwargs, _orig_pool, shuffle_kwargs, processes)
3038 ),
3039 processes=processes,
-> 3040 _orig_pool=_orig_pool,
3041 )
3042 )
RuntimeError: generator raised StopIteration
```
When I remove the 'processes' argument, 'randomstats' works just fine, but every time I run it with 'processes' (even with a value of 1), I get the aforementioned error.
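In case it helps narrow things down, the final `RuntimeError: generator raised StopIteration` can be reproduced without pybedtools. The sketch below is only my guess at the mechanism: it assumes the cause is the Python 3.7 behavior change (PEP 479), under which a `StopIteration` that escapes a generator body is converted to a `RuntimeError`. All function names here are hypothetical and are not taken from pybedtools.

```python
def helper_that_raises():
    # Hypothetical stand-in for a call made inside a generator body
    # that lets StopIteration escape.
    raise StopIteration


def old_style_generator(n):
    # Yields n values, then lets StopIteration escape from the body.
    # On Python 3.7+ (PEP 479) this surfaces as RuntimeError.
    for i in range(n):
        yield i
    helper_that_raises()


def new_style_generator(n):
    # Same values, but ends with a plain return, which is the
    # supported way to finish a generator.
    for i in range(n):
        yield i
    return


def try_consume(gen):
    # Returns (values, None) on success or (None, message) on RuntimeError.
    try:
        return list(gen), None
    except RuntimeError as exc:
        return None, str(exc)


old_result = try_consume(old_style_generator(3))
new_result = try_consume(new_style_generator(3))
print(old_result)
print(new_result)
```

If this is indeed what happens, replacing the explicit `raise StopIteration` with `return` inside the generator would avoid the conversion, but I have not verified that against the pybedtools code itself.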
Other relevant data:
- 'a' and 'b' are both bedtool objects generated from a df. A regular a.intersect(b, f=1) works perfectly.
- 'chromsizes_fn' is the name of a genome file generated from a dict with:
chromsizes_fn = pybedtools.chromsizes_to_file(chromsizes_dic, fn=temp_genome.name)
I tried both fn=False and fn=temp_genome.name, and I tried running it both with new=True and without it. It crashed in every case.
I would really appreciate your help. I'm planning to run 'randomstats' on a large number of files, with at least 1000 iterations each, and being able to use multiprocessing would make that feasible.