This holds all files related to the workshop "From a Collection of Scripts to a Pipeline – Writing Nextflow Workflows with nf-core Best Practices" at the GCB 2025.
Authors: Famke Bäuerle (@famosab), Mark Polster (@mapo9)
Bioinformatics analyses often begin as a set of scattered scripts, but scaling them into reproducible and maintainable workflows can be challenging. In this hands-on workshop we aim to guide you through transforming your scripts into a robust Nextflow pipeline using nf-core components and best practices. We will cover essential topics such as pipeline structuring, version control, and best practices for collaboration and reproducibility. Whether you're new to Nextflow or looking to refine your workflow development skills, this workshop will provide practical insights and hands-on experience to help you understand and utilize the nf-core framework for your own research.
Here we provide the materials for the workshop. You can find everything related to "bash-scripts" in the bash/ folder and everything related to "python-scripts" in the py_scripts/ folder. Example VCF files for testing are available in the vcf-files/ folder. Each folder comes with a short README that explains how to use the files and scripts it contains.
Our objective is to guide you through the process of transforming existing scripts into a working Nextflow pipeline by utilizing nf-core components.
Warning
Do not use _ in your pipeline or module names. They are only allowed in subworkflow names.
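The naming rule above can be checked before you create anything. This is a minimal sketch, assuming a hypothetical helper called check_name that is not part of nf-core tools; it only tests for underscores and does not cover any other naming rules that nf-core may enforce.

```shell
# check_name NAME
# Hypothetical helper: returns 0 if NAME contains no underscore, 1 otherwise.
check_name() {
    case "$1" in
        *_*)
            echo "invalid: '$1' contains an underscore" >&2
            return 1
            ;;
        *)
            return 0
            ;;
    esac
}

# Example names (both are illustrations only)
check_name "gcbworkshop" && echo "gcbworkshop is fine"
check_name "gcb_workshop" || echo "gcb_workshop should be renamed"
```

Subworkflow names are exempt from this rule, so the check is only meant for pipeline and module names.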
- You need the nf-core toolbox.
    conda create --name nf-core nf-core nextflow
    conda activate nf-core
  You can also install the toolbox via pip:
    pip install nf-core
- Use the pipelines create command to create a new pipeline. Do this in a new and empty folder. The pipeline name can be anything, for example gcbworkshop.
    nf-core pipelines create
- Further steps. You will find a lot of TODO statements in the pipeline code. You do not need to worry about those for now. We will work on simple modules and subworkflows to help you understand the template a little more.
Note
The nf-core documentation is a great source for recommendations and tricks. You can find a lot of answers over there. Anything Nextflow related can be looked up in the Nextflow docs.
If you work with our example scripts you will need to adapt files in the assets folder of your pipeline. The adapted files are given under files_to_change/assets.
- modules: these hold process definitions
  - local: modules you can create and adapt yourself
  - nf-core: modules you can install with nf-core tools
- subworkflows: structural elements that help keep the main.nf short
  - local: subworkflows you can create and adapt yourself
  - nf-core: subworkflows you can install with nf-core tools
- workflows: this holds the main workflow in your nf-core template
- main.nf: Nextflow script file for each element (module, subworkflow, workflow)
Note
You will only need to adapt the main.nf files in the created folder. Do not change the main.nf at the root of your pipeline folder.
- Try running the bash scripts by following the advice in the README.
- The provided bash scripts are available as nf-core modules. Look for them and install them to your pipeline with nf-core tools. You can find the tool by having a look in the bash script or the provided conda environment.
    nf-core modules install ...
  The modules we want to use here are used in much the same way in the qbic-pipelines/vcftomaf pipeline. You can refer to that pipeline if you get stuck, but note that it has more functionality than what we are aiming for.
- Integrate these scripts into a subworkflow called bash_scripts.nf in your pipeline.
    nf-core subworkflows create
  This creates the files necessary for a new subworkflow within your pipeline. You can adapt those to your needs.
- Try running the python scripts by following the advice in the README.
- Go to Seqera Containers and create your own container with the required tools to run the scripts (see conda_py.yml).
- Create a subworkflow called py_scripts.nf in your pipeline (see above).
- Copy the python scripts to the bin folder of the pipeline and make them executable.
- Create a module for each of the scripts and call them in the subworkflow (you can use the fasta2peptides module for orientation).
    nf-core modules create
  This creates the files necessary for a new module within your pipeline. You can adapt those to your needs.
To understand how Python scripts can be utilized you can check out the nf-core/epitopeprediction pipeline, e.g. the fasta2peptides module.
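The step of copying the python scripts to the bin folder and making them executable can be sketched as follows. This is a minimal sketch, not part of the workshop materials: install_scripts is a hypothetical helper, and both paths in the usage line are placeholders that you need to adapt to your own folder layout.

```shell
# install_scripts SRC_DIR PIPELINE_DIR
# Hypothetical helper: copies every *.py file from SRC_DIR into
# PIPELINE_DIR/bin and sets the executable bit on the copies.
install_scripts() {
    src_dir="$1"
    pipeline_dir="$2"

    # Create the bin folder if the template does not contain it yet
    mkdir -p "$pipeline_dir/bin"

    # Copy the scripts and make them executable
    cp "$src_dir"/*.py "$pipeline_dir/bin/"
    chmod +x "$pipeline_dir"/bin/*.py
}

# Usage (placeholder paths, adapt to your setup):
# install_scripts py_scripts path/to/your-pipeline
```

Nextflow adds the bin folder of a pipeline to the PATH of each task, so a script placed there can be called by its file name from a module. For that to work the script also needs a shebang line such as `#!/usr/bin/env python3` at the top.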
Include the created subworkflows in the workflows/<name-of-your-workflow>.nf script. Call them in the main: section.
    nextflow run main.nf
This runs your pipeline. If all the include statements are correct, the pipeline should at least start (you will see that other things are still missing, like the --input flag).
Many bioinformatics problems have an nf-core pipeline that you can use to run your own analysis. Feel free to check out the nf-core website and search for keywords matching your data. You can then read the documentation and try to run the pipeline for your own analysis. If you get stuck you can always ask in the Slack channel of the pipeline. You can join the Slack here.
- Wrong nf-core version.
    pip install --upgrade nf-core
  should upgrade to nf-core version > 3.0
- Java is missing
    sdk install java 17.0.10-tem
  Java also needs to be added to your path:
    export JAVA_CMD="/home-link/paifb01/.sdkman/candidates/java/17.0.10-tem/bin/java"
  Replace /home-link/paifb01/ with the path that contains your sdkman installation.
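If you do not want to hard-code the path, the export can be built from your environment instead. This is a minimal sketch under one assumption: sdkman is installed in the directory named by SDKMAN_DIR, or in $HOME/.sdkman when that variable is not set. The Java version is the one used above.

```shell
# Assumption: sdkman lives in $SDKMAN_DIR, or in $HOME/.sdkman if unset.
sdkman_root="${SDKMAN_DIR:-$HOME/.sdkman}"

# Same Java version as installed with `sdk install java 17.0.10-tem`
export JAVA_CMD="$sdkman_root/candidates/java/17.0.10-tem/bin/java"

# Warn early if the path does not point to an executable
if [ ! -x "$JAVA_CMD" ]; then
    echo "warning: no executable found at $JAVA_CMD" >&2
fi
```

You can put these lines into your shell startup file so that the variable is set in every new session.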