Skip reruns of resumed workflow based on published files #6417

@gevro

Hi,
For a pipeline with workflows A and B, I want to give users the ability to run any combination of consecutive workflows via a --workflow parameter.

This should be a fairly straightforward capability for any pipelining software such as Nextflow.

I have implemented this successfully except for one issue. Suppose a user runs workflows A and B, all workflow A tasks complete successfully, but some workflow B tasks fail. If the user then resumes from workflow B, Nextflow should skip the workflow A processes that already completed successfully.

But that behavior does not seem possible unless I use storeDir for workflow A rather than publishDir, because on resume from workflow B, Nextflow does not know how to 'hydrate' the prior work cache of workflow A.

The expected behavior of Nextflow (and any pipelining software) would be for -resume to skip tasks whose publishDir files already exist, just as it does with storeDir. I'm not sure why Nextflow doesn't do that.
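
To make the requested behavior concrete, here is a minimal Python sketch of the check I have in mind. It is not Nextflow code and does not reflect how Nextflow implements -resume or storeDir internally. The function names (`can_skip_task`, `plan_run`), the task names, and the file names are all hypothetical, and it assumes each task's expected published file names are known up front.

```python
from pathlib import Path


def can_skip_task(publish_dir, expected_outputs):
    """Return True when every expected output already exists under publish_dir."""
    root = Path(publish_dir)
    # An empty output list gives no evidence of completion, so never skip on it.
    if not expected_outputs:
        return False
    return all((root / name).is_file() for name in expected_outputs)


def plan_run(tasks, publish_dir):
    """Split tasks into (skipped, to_run) based on files present in publish_dir.

    `tasks` maps a hypothetical task name to the list of output file names
    that task is expected to publish.
    """
    skipped, to_run = [], []
    for name, outputs in tasks.items():
        if can_skip_task(publish_dir, outputs):
            skipped.append(name)
        else:
            to_run.append(name)
    return skipped, to_run
```

With this check, resuming from workflow B would put the workflow A tasks in `skipped` as long as their published files are still present, and only the failed workflow B tasks would land in `to_run`. Note that the sketch only tests for file existence. It does not verify that the files match the current inputs or script, which the work-directory cache does account for.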
