Merged
63 commits
5bbcc07
back to dev
maxulysse Aug 18, 2025
072dbcc
Merge pull request #1597 from nf-core/back_to_dev
maxulysse Aug 18, 2025
346e42a
First pass allow bam input
pinin4fjords Sep 5, 2025
83e312f
Add samplesheet output with bams
pinin4fjords Sep 8, 2025
fac2cee
Fix up bam samplesheet generation
pinin4fjords Sep 8, 2025
fd2d5ee
Misc fixes
pinin4fjords Sep 8, 2025
6e32ed7
Fix alignment
pinin4fjords Sep 8, 2025
b0c3d10
Remove view
pinin4fjords Sep 8, 2025
627fa9c
Rework percent mapping usage to allow for bam input
pinin4fjords Sep 9, 2025
68ea671
Fix ID issue
pinin4fjords Sep 9, 2025
a1eb7a1
Fix mapping rate to samplesheet
pinin4fjords Sep 9, 2025
a172f46
Linting fixes
pinin4fjords Sep 9, 2025
6748397
Strip trailing whitespace
pinin4fjords Sep 9, 2025
0e3d4ac
Correct function tests for function changes
pinin4fjords Sep 9, 2025
662cca5
Fix percent mapped handling
pinin4fjords Sep 9, 2025
fca5dfd
Fix usage formatting for pre-commit
pinin4fjords Sep 9, 2025
51d9ca5
Skipping alignment no longer necessarily means skipping bam processing
pinin4fjords Sep 9, 2025
776d903
Don't output failed mappings with null mapping rate
pinin4fjords Sep 9, 2025
346178e
More config fixes now we might be post-processing bams after we skip …
pinin4fjords Sep 9, 2025
13749c7
Add not on bam reprocessing
pinin4fjords Sep 9, 2025
235589e
pre-commit fix
pinin4fjords Sep 9, 2025
ccbf323
update snap
pinin4fjords Sep 9, 2025
1a3175e
Bump multiqc
pinin4fjords Sep 9, 2025
89ddb3c
Add bam input test
pinin4fjords Sep 9, 2025
ce1327a
RO-crate fix
pinin4fjords Sep 9, 2025
9b07d79
Update test exclusions
pinin4fjords Sep 9, 2025
544a4de
Merge branch 'bam_entry_point' of github.com:nf-core/rnaseq into bam_…
pinin4fjords Sep 9, 2025
efd6236
Try some manual snapshot fixes
pinin4fjords Sep 9, 2025
3f34c9a
Merge branch 'bam_entry_point' of github.com:nf-core/rnaseq into bam_…
pinin4fjords Sep 9, 2025
e16b8dc
Improve docs
pinin4fjords Sep 9, 2025
d427bf1
Test fixes
pinin4fjords Sep 9, 2025
58a6b92
Add missing files in bam_input snap
pinin4fjords Sep 9, 2025
9034f6f
Update changelog
pinin4fjords Sep 9, 2025
8fcb935
Fix rocrate... again
pinin4fjords Sep 9, 2025
a1c66a3
Merge pull request #1603 from nf-core/bam_entry_point
pinin4fjords Sep 10, 2025
6d3ef9c
Bump rsem module
pinin4fjords Sep 15, 2025
0c41b2a
Align separately to RSEM
pinin4fjords Sep 15, 2025
21e5465
Fix STAR parameter handling to restore star_salmon compatibility
pinin4fjords Sep 15, 2025
e5add62
Remove commented rsem index
pinin4fjords Sep 15, 2025
fc0686f
Update rsem subworkflow test, stripping hard-to-test sentieon test
pinin4fjords Sep 15, 2025
d50bb9b
Bump rsem sentieon module
pinin4fjords Sep 15, 2025
9f40616
Readd versions from rsem swf
pinin4fjords Sep 15, 2025
f9cd5cd
Update rsem snaps
pinin4fjords Sep 15, 2025
8d27efc
Delete pipeline star_rsem sention tests
pinin4fjords Sep 15, 2025
4a97a75
Update bam input snap
pinin4fjords Sep 15, 2025
450cc33
Don't need to publish bam files from RSEM
pinin4fjords Sep 15, 2025
68488c5
RSEM now needs an index even without alignment
pinin4fjords Sep 15, 2025
a467b3d
Fix bam input test for rsem
pinin4fjords Sep 15, 2025
1f983cb
Add target quantifier warning
pinin4fjords Sep 15, 2025
0d2764d
Don't snapshot star logs now for rsem
pinin4fjords Sep 15, 2025
fd6de75
Fix star_rsem snap for log exclusion
pinin4fjords Sep 15, 2025
e6f1cc0
Update changelog
pinin4fjords Sep 15, 2025
d35b674
Prettier
pinin4fjords Sep 15, 2025
818d79d
Remove old alignment assertions from rsem test
pinin4fjords Sep 16, 2025
91d7d26
Tweak to only use BAM files with --skip_alignment
pinin4fjords Sep 16, 2025
00a0a15
Merge pull request #1604 from nf-core/rsem_bam_input
pinin4fjords Sep 16, 2025
ead60fd
Fix default for umi_discard_read to prevent validation errors in Plat…
pinin4fjords Sep 17, 2025
ef4c511
Update changelog
pinin4fjords Sep 17, 2025
c003e2d
Merge pull request #1605 from nf-core/fix_umi_discard_default
pinin4fjords Sep 17, 2025
6aa18da
Bump versions for 3.21.
pinin4fjords Sep 18, 2025
287980d
update changelog
pinin4fjords Sep 18, 2025
722bb47
Prettier
pinin4fjords Sep 18, 2025
13c6e21
Merge pull request #1606 from nf-core/pre_3.21.0_bumps
pinin4fjords Sep 18, 2025
6 changes: 3 additions & 3 deletions .nf-core.yml
@@ -14,11 +14,11 @@ nf_core_version: 3.3.2
repository_type: pipeline
template:
author: "Harshil Patel, Phil Ewels, Rickard Hammarén"
description: RNA sequencing analysis pipeline for gene/isoform quantification
and extensive quality control.
description: RNA sequencing analysis pipeline for gene/isoform quantification and
extensive quality control.
force: false
is_nfcore: true
name: rnaseq
org: nf-core
outdir: .
version: 3.20.0
version: 3.21.0
27 changes: 26 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,32 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 3.20.0
## [[3.21.0](https://github.com/nf-core/rnaseq/releases/tag/3.21.0)] - 2025-09-18

### Credits

Special thanks to the following for their contributions to the release:

- [Edmund Miller](https://github.com/edmundmiller)
- [Friederike Hanssen](https://github.com/friederikehanssen)
- [Maxime Garcia](https://github.com/maxulysse)
- [Jonathan Manning](https://github.com/pinin4fjords)

### Enhancements & fixes

- [PR #1597](https://github.com/nf-core/rnaseq/pull/1597) - Bump version after release 3.20.0
- [PR #1603](https://github.com/nf-core/rnaseq/pull/1603) - Add bam input pathway
- [PR #1604](https://github.com/nf-core/rnaseq/pull/1604) - Enable BAM input for RSEM
- [PR #1605](https://github.com/nf-core/rnaseq/pull/1605) - Fix default for umi_discard_read to prevent validation errors in Platform
- [PR #1606](https://github.com/nf-core/rnaseq/pull/1606) - Bump version to 3.21.0 ahead of release

### Software dependencies

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| `MultiQC` | 1.30 | 1.31 |

## [[3.20.0](https://github.com/nf-core/rnaseq/releases/tag/3.20.0)] - 2025-08-18

### Credits

4 changes: 3 additions & 1 deletion README.md
@@ -20,7 +20,7 @@

## Introduction

**nf-core/rnaseq** is a bioinformatics pipeline that can be used to analyse RNA sequencing data obtained from organisms with a reference genome and annotation. It takes a samplesheet and FASTQ files as input, performs quality control (QC), trimming and (pseudo-)alignment, and produces a gene expression matrix and extensive QC report.
**nf-core/rnaseq** is a bioinformatics pipeline that can be used to analyse RNA sequencing data obtained from organisms with a reference genome and annotation. It takes a samplesheet with FASTQ files or pre-aligned BAM files as input, performs quality control (QC), trimming and (pseudo-)alignment, and produces a gene expression matrix and extensive QC report.

![nf-core/rnaseq metro map](docs/images/nf-core-rnaseq_metro_map_grey_animated.svg)

@@ -76,6 +76,8 @@ CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,a

Each row represents a fastq file (single-end) or a pair of fastq files (paired-end). Rows with the same sample identifier are considered technical replicates and merged automatically. The strandedness refers to the library preparation and will be automatically inferred if set to `auto`.

The pipeline supports a two-step reprocessing workflow using BAM files from previous runs. Run initially with `--save_align_intermeds` to generate a samplesheet with BAM paths, then reprocess using `--skip_alignment` for efficient downstream analysis without repeating expensive alignment steps. This feature is designed specifically for pipeline-generated BAMs.
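
As a rough sketch of the two-step workflow described above, the snippet below assembles the two `nextflow run` command lines in Python. The `reprocessing_commands` helper, the `samplesheet.csv` name, the `results` output directory and the `docker` profile are illustrative assumptions, as is the location of the generated samplesheet under `<outdir>/samplesheets/`. Reference genome options are omitted and would be needed in a real run.

```python
import shlex


def reprocessing_commands(samplesheet, outdir="results", profile="docker"):
    """Return the two command lines for the two-step reprocessing workflow.

    All argument values are placeholders chosen for illustration.
    """
    base = ["nextflow", "run", "nf-core/rnaseq", "-profile", profile]

    # Step 1: align from FASTQ and keep the BAMs, so that a samplesheet
    # with BAM paths is written alongside the results.
    first = base + [
        "--input", samplesheet,
        "--outdir", outdir,
        "--save_align_intermeds",
    ]

    # Step 2: feed the generated samplesheet back in and skip alignment.
    # The path below is an assumption about where the file is written.
    bam_sheet = f"{outdir}/samplesheets/samplesheet_with_bams.csv"
    second = base + [
        "--input", bam_sheet,
        "--outdir", f"{outdir}_reprocessed",
        "--skip_alignment",
    ]
    return shlex.join(first), shlex.join(second)


for command in reprocessing_commands("samplesheet.csv"):
    print(command)
```

Writing the second run to a separate output directory is a design choice of this sketch, made so the first run's BAMs are not overwritten; the pipeline itself does not require it.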

> [!WARNING]
> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).

21 changes: 21 additions & 0 deletions assets/schema_input.json
@@ -32,6 +32,27 @@
"errorMessage": "Strandedness must be provided and be one of 'auto', 'forward', 'reverse' or 'unstranded'",
"enum": ["forward", "reverse", "unstranded", "auto"],
"meta": ["strandedness"]
},
"genome_bam": {
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.(bam|BAM)$",
"errorMessage": "Genome BAM file cannot contain spaces and must have extension '.bam'"
},
"transcriptome_bam": {
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^([\\S\\s]*\\/)?[^\\s\\/]+\\.(bam|BAM)$",
"errorMessage": "Transcriptome BAM file cannot contain spaces and must have extension '.bam'"
},
"percent_mapped": {
"type": "number",
"minimum": 0,
"maximum": 100,
"errorMessage": "Percent mapped must be a number between 0 and 100",
"meta": "percent_mapped"
}
},
"required": ["sample", "fastq_1", "strandedness"]
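
To see what the new `genome_bam`, `transcriptome_bam` and `percent_mapped` constraints accept, here is a minimal Python sketch. The regular expression is copied from the schema entries above with the JSON escaping removed; the helper names `is_valid_bam_path` and `is_valid_percent_mapped` are hypothetical, and the pipeline's own validation is done by its schema tooling rather than by code like this.

```python
import re

# Pattern from the `genome_bam` / `transcriptome_bam` schema entries,
# with the JSON string escaping (doubled backslashes) removed.
BAM_PATTERN = re.compile(r"^([\S\s]*\/)?[^\s\/]+\.(bam|BAM)$")


def is_valid_bam_path(path):
    """True if the path satisfies the schema's BAM file-name pattern."""
    return BAM_PATTERN.fullmatch(path) is not None


def is_valid_percent_mapped(value):
    """True if the value is a number in the closed range 0 to 100."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and 0 <= value <= 100


print(is_valid_bam_path("results/star_salmon/SAMPLE.Aligned.out.bam"))
print(is_valid_bam_path("my sample.bam"))
print(is_valid_percent_mapped(87.5))
```

Note that the pattern only constrains the file name itself: whitespace in a parent directory such as `my dir/sample.bam` still matches, and mixed-case extensions such as `.Bam` do not.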
1 change: 0 additions & 1 deletion conf/test.config
@@ -35,7 +35,6 @@ params {
bbsplit_fasta_list = 'https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a/reference/bbsplit_fasta_list.txt'
hisat2_index = 'https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a/reference/hisat2.tar.gz'
salmon_index = 'https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a/reference/salmon.tar.gz'
rsem_index = 'https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a/reference/rsem.tar.gz'

// Other parameters
skip_bbsplit = false
20 changes: 13 additions & 7 deletions docs/output.md
@@ -10,6 +10,10 @@ nextflow run nf-core/rnaseq -profile test_full,<docker/singularity/institute>

The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.

:::tip
Many of the BAM files produced by this pipeline can be reused as input for future runs with `--skip_alignment`. This is particularly useful for reprocessing data or running downstream analysis steps without repeating computationally expensive alignment. See the [usage documentation](https://nf-co.re/rnaseq/usage#bam-input-for-reprocessing-workflow) for details on using BAM files as input.
:::

## Pipeline overview

The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:
@@ -213,8 +217,8 @@ When `--remove_ribo_rna` is specified, the pipeline uses [SortMeRNA](https://git
<summary>Output files</summary>

- `star_salmon/`
- `*.Aligned.out.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the reference genome will be placed in this directory.
- `*.Aligned.toTranscriptome.out.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the transcriptome will be placed in this directory.
- `*.Aligned.out.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the reference genome will be placed in this directory. These files can be reused as `genome_bam` input in future pipeline runs.
- `*.Aligned.toTranscriptome.out.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the transcriptome will be placed in this directory. These files can be reused as `transcriptome_bam` input in future pipeline runs.
- `salmon.merged.gene_counts.tsv`: Matrix of gene-level raw counts across all samples.
- `salmon.merged.gene_tpm.tsv`: Matrix of gene-level TPM values across all samples.
- `salmon.merged.gene.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated counts (`counts`) and gene length (`length`), estimated library size-scaled counts (`counts_scaled`), estimated length-scaled counts (`counts_length_scaled`) in the assays slot for genes.
@@ -276,16 +280,16 @@ The STAR section of the MultiQC report shows a bar plot with alignment rates: go
- `rsem.merged.transcript_tpm.tsv`: Matrix of isoform-level TPM values across all samples.
- `*.genes.results`: RSEM gene-level quantification results for each sample.
- `*.isoforms.results`: RSEM isoform-level quantification results for each sample.
- `*.STAR.genome.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the reference genome will be placed in this directory.
- `*.transcript.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the transcriptome will be placed in this directory.
- `*.STAR.genome.bam`: If `--save_align_intermeds` is specified the BAM file from STAR alignment containing read alignments to the reference genome will be placed in this directory. These files can be reused as `genome_bam` input in future pipeline runs.
- `*.transcript.bam`: If `--save_align_intermeds` is specified the BAM file from STAR alignment containing read alignments to the transcriptome will be placed in this directory. These files can be reused as `transcriptome_bam` input in future pipeline runs.
- `star_rsem/<SAMPLE>.stat/`
- `*.cnt`, `*.model`, `*.theta`: RSEM counts and statistics for each sample.
- `star_rsem/log/`
- `*.log`: STAR alignment report containing the mapping results summary.

</details>

[RSEM](https://github.com/deweylab/RSEM) is a software package for estimating gene and isoform expression levels from RNA-seq data. It has been widely touted as one of the most accurate quantification tools for RNA-seq analysis. RSEM wraps other popular tools to map the reads to the genome (i.e. STAR, Bowtie2, HISAT2; STAR is used in this pipeline) which are then subsequently filtered relative to a transcriptome before quantifying at the gene- and isoform-level. Other advantages of using RSEM are that it performs both the alignment and quantification in a single package and its ability to effectively use ambiguously-mapping reads.
[RSEM](https://github.com/deweylab/RSEM) is a software package for estimating gene and isoform expression levels from RNA-seq data. It has been widely touted as one of the most accurate quantification tools for RNA-seq analysis. When using `--aligner star_rsem`, the pipeline first runs STAR alignment with RSEM-compatible parameters to generate genome and transcriptome BAM files, then RSEM quantifies expression using these pre-aligned BAMs via the `--alignments` mode. This approach ensures optimal compatibility while maintaining RSEM's ability to effectively use ambiguously-mapping reads.

You can choose to align and quantify your data with RSEM by providing the `--aligner star_rsem` parameter.
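
The quantification step described above can be pictured with a short Python sketch that builds an `rsem-calculate-expression` command line in `--alignments` mode, where RSEM reads an existing transcriptome BAM instead of aligning reads itself. The helper name, file names and reference prefix are assumptions for illustration, and the pipeline's actual module may pass additional options.

```python
import shlex


def rsem_alignments_command(transcriptome_bam, reference_prefix, sample,
                            paired_end=True):
    """Build an RSEM command that quantifies from a pre-aligned BAM.

    RSEM's usage in this mode is:
    rsem-calculate-expression [options] --alignments [--paired-end]
        input reference_name sample_name
    """
    args = ["rsem-calculate-expression", "--alignments"]
    if paired_end:
        args.append("--paired-end")
    args += [transcriptome_bam, reference_prefix, sample]
    return shlex.join(args)


# Hypothetical file names for illustration only.
print(rsem_alignments_command("SAMPLE.transcript.bam", "rsem/genome", "SAMPLE"))
```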

@@ -299,7 +303,7 @@ You can choose to align and quantify your data with RSEM by providing the `--ali
<summary>Output files</summary>

- `hisat2/`
- `<SAMPLE>.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the reference genome will be placed in this directory.
- `<SAMPLE>.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the reference genome will be placed in this directory. These files can be reused as `genome_bam` input in future pipeline runs.
- `hisat2/log/`
- `*.log`: HISAT2 alignment report containing the mapping results summary.
- `hisat2/unmapped/`
@@ -323,7 +327,7 @@ The pipeline has been written in a way where all the files generated downstream
<summary>Output files</summary>

- `<ALIGNER>/`
- `<SAMPLE>.sorted.bam`: If `--save_align_intermeds` is specified the original coordinate sorted BAM file containing read alignments will be placed in this directory.
- `<SAMPLE>.sorted.bam`: If `--save_align_intermeds` is specified the original coordinate sorted BAM file containing read alignments will be placed in this directory. These files can be reused as `genome_bam` input in future pipeline runs.
- `<SAMPLE>.sorted.bam.bai`: If `--save_align_intermeds` is specified the BAI index file for the original coordinate sorted BAM file will be placed in this directory.
- `<SAMPLE>.sorted.bam.csi`: If `--save_align_intermeds --bam_csi_index` is specified the CSI index file for the original coordinate sorted BAM file will be placed in this directory.
- `<ALIGNER>/samtools_stats/`
@@ -864,6 +868,8 @@ A number of genome-specific files are generated by the pipeline because they are
- Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameters are used when running the pipeline.
- Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
- Parameters used by the pipeline run: `params.json`.
- `samplesheets/`
- `samplesheet_with_bams.csv`: **Auto-generated samplesheet for BAM reprocessing** (only created when using `--save_align_intermeds`) containing all samples with BAM file paths. For samples processed from FASTQ, includes paths to newly generated BAMs; for samples that were BAM input, preserves the original input paths. This samplesheet can be used directly for future pipeline runs with `--skip_alignment`, enabling efficient reprocessing without re-alignment.

</details>
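
Before reusing `samplesheet_with_bams.csv`, it can be handy to check which rows actually carry BAM paths. The sketch below does this with Python's standard `csv` module. The column order, sample names and file paths in the example data are hypothetical; only the column names `sample`, `fastq_1`, `fastq_2`, `strandedness`, `genome_bam`, `transcriptome_bam` and `percent_mapped` come from the samplesheet schema.

```python
import csv
import io

# Hypothetical samplesheet contents; a real file is written by the pipeline.
EXAMPLE = """\
sample,fastq_1,fastq_2,strandedness,genome_bam,transcriptome_bam,percent_mapped
CONTROL_REP1,c_R1.fastq.gz,c_R2.fastq.gz,auto,star_salmon/CONTROL_REP1.sorted.bam,star_salmon/CONTROL_REP1.Aligned.toTranscriptome.out.bam,92.4
TREATED_REP1,t_R1.fastq.gz,,auto,,,
"""


def samples_with_bams(handle):
    """Yield (sample, genome_bam, transcriptome_bam) for rows with a BAM path.

    Empty cells are reported as None.
    """
    for row in csv.DictReader(handle):
        genome = row.get("genome_bam") or None
        transcriptome = row.get("transcriptome_bam") or None
        if genome or transcriptome:
            yield row["sample"], genome, transcriptome


reusable = list(samples_with_bams(io.StringIO(EXAMPLE)))
for sample, genome, transcriptome in reusable:
    print(sample, genome, transcriptome)
```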
