Tetranscripts #9635
Merged
59 commits
2fdcba5
adding tetranscripts module
d07b869
Updating test files and correcting command typo
681c7ed
Successful singularity run test
718a444
Adding multiple files to test - wip
51573d4
First effort to include multiple .bam
1e120f3
Multiple .bam processing add
40215a3
Amending to test data
08062eb
Add --sortByPos to config for test
6268b4c
Adding config to add parameters for test bams
4db3aa6
Trying locally
9966c64
Main working but stub failing with test data
fe4b5d6
Public data working main test and stub
8b5ba95
Fixing linting!
1546e5f
Still linting
70cf2d3
Linted and test passing public data - no .bai
5e07e24
Working Singularity
2e1acfc
Remove TODO
9bc319a
Removed lingering TODO, linting 2 warnings, pass test
1775924
Changing main.nf to topics
76b0b01
Updating meta.yml to topics
c068363
Updating topics main.nf.test
78216e2
Topics update linting pass test fail
502d95f
Updated to topics passing linting and test
c563b5b
Fixing meta.yml typo - now passing linting and testing.
b9a4e53
adding tetranscripts module
7aeeed9
Updating test files and correcting command typo
d16a48d
Successful singularity run test
8ca293d
Adding multiple files to test - wip
b077342
First effort to include multiple .bam
ca23268
Multiple .bam processing add
c4c1a85
Amending to test data
f529230
Add --sortByPos to config for test
09e2942
Adding config to add parameters for test bams
11ef148
Trying locally
6c7264b
Main working but stub failing with test data
5c5d1e4
Public data working main test and stub
49420a2
Fixing linting!
0cd342d
Still linting
0d9e0e2
Linted and test passing public data - no .bai
07feee9
Working Singularity
fdb7492
Remove TODO
8c1f3a6
Removed lingering TODO, linting 2 warnings, pass test
2d4aa49
Changing main.nf to topics
37d57c5
Updating meta.yml to topics
e5a1fed
Updating topics main.nf.test
620d59b
Topics update linting pass test fail
f8b81a6
Updated to topics passing linting and test
a4f5295
Fixing meta.yml typo - now passing linting and testing.
0478893
Merge branch 'nf-core:master' into tetranscripts
hanalysis 7b1a8d2
Merge branch 'tetranscripts' of github.com:hanalysis/nf-core-modules …
29b6648
Adding test for multiple bam files
dda863b
adding ontologies to meta.yml
2618f98
Adding .bai optional functionality
4ffdfc4
Adding optional .txt output channels
f67602d
Removing bai input adding column test check for multiple files
e2dc82c
Run prettier to fix pre-linting
123bdff
Merge branch 'master' into tetranscripts
hanalysis 6661809
Removing whitespace to pass prek
hanalysis 711abfe
Adding back in additional lines end of file
hanalysis
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| dependencies: | ||
| - "bioconda::tetranscripts=2.2.3" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| process TETRANSCRIPTS { | ||
| tag "$meta_c.id" | ||
| label 'process_single' | ||
| | ||
| conda "${moduleDir}/environment.yml" | ||
| container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? | ||
| 'https://depot.galaxyproject.org/singularity/tetranscripts:2.2.3--pyh7cba7a3_0': | ||
| 'biocontainers/tetranscripts:2.2.3--pyh7cba7a3_0' }" | ||
| | ||
| input: | ||
| tuple val(meta_t), path(bam_t) | ||
| tuple val(meta_c), path(bam_c) | ||
| tuple val(meta_ggtf), path(g_gtf) | ||
| tuple val(meta_tegtf), path(te_gtf) | ||
| | ||
| output: | ||
| tuple val(meta_t), path("*.cntTable"), emit: countTable | ||
| tuple val(meta_t), path("*.R"), emit: log2fc | ||
| tuple val(meta_t), path("*_analysis.txt"), emit: analysis, optional: true | ||
| tuple val(meta_t), path("*_gene_TE.txt"), emit: sigdiff, optional: true | ||
| tuple val("${task.process}"), val('tetranscripts'), eval("tetranscripts version | sed '1!d;s/.* //'"), emit: versions_tetranscripts, topic: versions | ||
| | ||
| when: | ||
| task.ext.when == null || task.ext.when | ||
| | ||
| script: | ||
| def args = task.ext.args ?: '' | ||
| def prefix = task.ext.prefix ?: "${meta_c.id}" | ||
| // Join multiple BAM files with spaces for -t and -c arguments | ||
| def treatment_bams = [bam_t].flatten().join(' ') | ||
| def control_bams = [bam_c].flatten().join(' ') | ||
| """ | ||
| TEtranscripts \\ | ||
| -t ${treatment_bams} \\ | ||
| -c ${control_bams} \\ | ||
| --GTF $g_gtf \\ | ||
| --TE $te_gtf \\ | ||
| --project ${prefix} \\ | ||
| $args | ||
| | ||
| """ | ||
| | ||
| stub: | ||
| def args = task.ext.args ?: '' | ||
| def prefix = task.ext.prefix ?: "${meta_c.id}" | ||
| """ | ||
| echo $args | ||
| | ||
| touch ${prefix}.R | ||
| touch ${prefix}.cntTable | ||
| touch ${prefix}_gene_TE_analysis.txt | ||
| touch ${prefix}_sigdiff_gene_TE.txt | ||
| | ||
| """ | ||
| } |
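The script block above builds the `-t` and `-c` arguments with `[bam].flatten().join(' ')`, so that a single BAM and a list of BAMs are handled the same way. A minimal sketch of that idiom in plain Groovy, using strings in place of staged paths; the `joinBams` closure and the file names are hypothetical and only serve the illustration:

```groovy
// Hypothetical helper mirroring the expression used in the script block.
// Wrapping the input in a list and flattening it normalises both cases:
// a single value and a list of values.
def joinBams = { bams -> [bams].flatten().join(' ') }

// A single file name is passed through unchanged.
assert joinBams('treated.bam') == 'treated.bam'

// A list of file names becomes one space-separated string.
assert joinBams(['rep1.bam', 'rep2.bam']) == 'rep1.bam rep2.bam'

// Show how the joined values would appear on the command line.
println "TEtranscripts -t ${joinBams(['rep1.bam', 'rep2.bam'])} -c ${joinBams('ctrl.bam')}"
```

Because the wrapping happens inside the process, callers can supply either one BAM or several per condition without changing the input declaration.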
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,150 @@ | ||
| # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json | ||
| name: "tetranscripts" | ||
| description: Runs TEtranscripts, which quantifies gene and transposable element | ||
| expression from BAM files and compares treatment against control. | ||
| keywords: | ||
| - transposable | ||
| - TE | ||
| - transcriptomics | ||
| tools: | ||
| - "tetranscripts": | ||
| description: A package for including transposable elements in differential | ||
| enrichment analysis of sequencing datasets. | ||
| homepage: https://github.com/mhammell-laboratory/TEtranscripts | ||
| documentation: https://hammelllab.labsites.cshl.edu/software/#TEtranscripts | ||
| tool_dev_url: https://github.com/mhammell-laboratory/TEtranscripts | ||
| doi: 10.1093/bioinformatics/btv422 | ||
| licence: ["GPL v3"] | ||
| identifier: biotools:tetranscripts | ||
| | ||
| input: | ||
| # Treatment BAM | ||
| - - meta_t: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing treatment sample information. e.g. `[ | ||
| id:'sample1' ]` | ||
| - bam_t: | ||
| type: file | ||
| description: A BAM file for the treatment condition | ||
| pattern: "*.{bam}" | ||
| ontologies: | ||
| - edam: "http://edamontology.org/format_2572" | ||
| | ||
| # Control BAM | ||
| - - meta_c: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing control sample information, | ||
| e.g. `[ id:'control1']` | ||
| - bam_c: | ||
| type: file | ||
| description: A BAM file for the control condition | ||
| pattern: "*.{bam}" | ||
| ontologies: | ||
| - edam: "http://edamontology.org/format_2572" | ||
| | ||
| # Genome GTF | ||
| - - meta_ggtf: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing genome GTF information, | ||
| e.g. `[ id:'genome_gtf' ]` | ||
| - g_gtf: | ||
| type: file | ||
| description: A GTF file of gene annotations for the genome | ||
| pattern: "*.{gtf}" | ||
| ontologies: | ||
| - edam: "http://edamontology.org/format_2306" | ||
| | ||
| # TE GTF | ||
| - - meta_tegtf: | ||
| type: map | ||
| description: | | ||
| Groovy map containing TE GTF information | ||
| e.g. `[ id:'te_gtf' ]` | ||
| | ||
| - te_gtf: | ||
| type: file | ||
| description: A curated GTF file of transposable element annotations | ||
| pattern: "*.{gtf}" | ||
| ontologies: | ||
| - edam: "http://edamontology.org/format_2306" | ||
| output: | ||
| countTable: | ||
| - - meta_t: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing treatment sample information. | ||
| e.g. `[ id:'sample1' ]` | ||
| - "*.cntTable": | ||
| type: file | ||
| description: Counts table of transposable element families | ||
| pattern: "*.cntTable" | ||
| ontologies: [] | ||
| log2fc: | ||
| - - meta_t: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing treatment sample information. | ||
| e.g. `[ id:'sample1' ]` | ||
| - "*.R": | ||
| type: file | ||
| description: R script for the differential expression analysis | ||
| pattern: "*.R" | ||
| | ||
| ontologies: | ||
| - edam: http://edamontology.org/format_3999 # R script | ||
| | ||
| analysis: | ||
| - - meta_t: | ||
| type: map | ||
| description: | | ||
| Groovy map containing treatment sample information. | ||
| e.g. `[ id:'sample1' ]` | ||
| - "*_analysis.txt": | ||
| type: file | ||
| description: DESeq2 analysis file | ||
| pattern: "*_analysis.txt" | ||
| ontologies: | ||
| - edam: http://edamontology.org/format_2330 # text format | ||
| | ||
| sigdiff: | ||
| - - meta_t: | ||
| type: map | ||
| description: | | ||
| Groovy map containing treatment sample information. | ||
| e.g. `[ id:'sample1' ]` | ||
| - "*_gene_TE.txt": | ||
| type: file | ||
| description: Significantly differentially expressed genes and TEs from the DESeq2 analysis | ||
| pattern: "*_gene_TE.txt" | ||
| ontologies: | ||
| - edam: http://edamontology.org/format_2330 # text format | ||
| | ||
| versions_tetranscripts: | ||
| - - ${task.process}: | ||
| type: string | ||
| description: The process the versions were collected from | ||
| - tetranscripts: | ||
| type: string | ||
| description: The tool name | ||
| - "tetranscripts version | sed '1!d;s/.* //'": | ||
| type: eval | ||
| description: The expression to obtain the version of the tool | ||
| | ||
| topics: | ||
| versions: | ||
| - - ${task.process}: | ||
| type: string | ||
| description: The process the versions were collected from | ||
| - tetranscripts: | ||
| type: string | ||
| description: The tool name | ||
| - "tetranscripts version | sed '1!d;s/.* //'": | ||
| type: eval | ||
| description: The expression to obtain the version of the tool | ||
| authors: | ||
| - "@hanalysis" | ||
| maintainers: | ||
| - "@hanalysis" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,158 @@ | ||
| // nf-core modules test tetranscripts | ||
| nextflow_process { | ||
| | ||
| name "Test Process TETRANSCRIPTS" | ||
| script "../main.nf" | ||
| process "TETRANSCRIPTS" | ||
| | ||
| tag "modules" | ||
| tag "modules_nfcore" | ||
| tag "tetranscripts" | ||
| | ||
| test("single file c and t - bam") { | ||
| config "./nextflow.config" | ||
| | ||
| when { | ||
| process { | ||
| """ | ||
| | ||
| input[0] = [ | ||
| [ id:'ctrl_bam' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), | ||
| [], | ||
| ] | ||
| | ||
| input[1] = [ | ||
| [ id:'test_bam' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true), | ||
| [], | ||
| ] | ||
| input[2] = [ | ||
| [ id:'genome_gtf' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.gtf', checkIfExists: true), | ||
| ] | ||
| input[3] = [ | ||
| [ id:'te_gtf' ], | ||
| file('https://raw.githubusercontent.com/hanalysis/test-datasets/refs/heads/modules/data/genomics/homo_sapiens/genome/chr21/sequence/GRCh38_GENCODE_rmsk_TE_chr21.gtf', checkIfExists: true), | ||
| ] | ||
| """ | ||
| | ||
| } | ||
| } | ||
| | ||
| then { | ||
| assert process.success | ||
| assertAll( | ||
| { assert snapshot( | ||
| process.out, | ||
| process.out.findAll { key, val -> key.startsWith('versions') } | ||
| ).match() } | ||
| ) | ||
| } | ||
| | ||
| } | ||
| | ||
| test("multiple files t - bam") { | ||
| config "./nextflow.config" | ||
| | ||
| when { | ||
| process { | ||
| """ | ||
| | ||
| input[0] = [ | ||
| [ id:'ctrl_bam' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), | ||
| [], | ||
| ] | ||
| | ||
| input[1] = [ | ||
| [ id:'test_bam' ], | ||
| [file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true), | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.rna.paired_end.sorted.chr6.bam', checkIfExists: true),], | ||
| [] | ||
| ] | ||
| input[2] = [ | ||
| [ id:'genome_gtf' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.gtf', checkIfExists: true), | ||
| ] | ||
| input[3] = [ | ||
| [ id:'te_gtf' ], | ||
| file('https://raw.githubusercontent.com/hanalysis/test-datasets/refs/heads/modules/data/genomics/homo_sapiens/genome/chr21/sequence/GRCh38_GENCODE_rmsk_TE_chr21.gtf', checkIfExists: true), | ||
| ] | ||
| """ | ||
| | ||
| } | ||
| } | ||
| | ||
| then { | ||
| assert process.success | ||
| | ||
| def outputFile = path(process.out.countTable[0][1]) // [0] is first sample, [1] is file (not meta) | ||
| assert outputFile.exists() | ||
| | ||
| def lines = outputFile.readLines() | ||
| // take first 5 lines to check | ||
| def firstLines = lines.take(5) | ||
| | ||
| // Check 4 columns (row names + 3 samples) | ||
| firstLines.eachWithIndex { line, idx -> | ||
| def columns = line.split(/\s+/) | ||
| assert columns.size() == 4: "Line ${idx + 1}: Expected 4 columns but got ${columns.size()} in line: '${line}'" | ||
| } | ||
| | ||
| assertAll( | ||
| { assert snapshot( | ||
| process.out, | ||
| process.out.findAll { key, val -> key.startsWith('versions') } | ||
| ).match() } | ||
| ) | ||
| } | ||
| | ||
| } | ||
| | ||
| test("single file c and t - bam - stub") { | ||
| | ||
| options "-stub" | ||
| | ||
| config "./nextflow.config" | ||
| | ||
| when { | ||
| process { | ||
| """ | ||
| input[0] = [ | ||
| [ id:'ctrl_bam' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), | ||
| [], | ||
| ] | ||
| | ||
| input[1] = [ | ||
| [ id:'test_bam' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true), | ||
| [], | ||
| ] | ||
| input[2] = [ | ||
| [ id:'genome_gtf' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.gtf', checkIfExists: true), | ||
| ] | ||
| input[3] = [ | ||
| [ id:'te_gtf' ], | ||
| file('https://raw.githubusercontent.com/hanalysis/test-datasets/refs/heads/modules/data/genomics/homo_sapiens/genome/chr21/sequence/GRCh38_GENCODE_rmsk_TE_chr21.gtf', checkIfExists: true), | ||
| ] | ||
| | ||
| """ | ||
| } | ||
| } | ||
| | ||
| then { | ||
| assert process.success | ||
| assertAll( | ||
| { assert snapshot( | ||
| process.out, | ||
| process.out.findAll { key, val -> key.startsWith('versions') } | ||
| ).match() } | ||
| ) | ||
| } | ||
| | ||
| } | ||
| | ||
| } | ||
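Each test above loads `./nextflow.config`, which is not shown on this page. Going by the commit message "Add --sortByPos to config for test" and the `task.ext.args` lookup in the process script, it presumably passes extra command-line flags to the tool. A minimal sketch of what such a config might look like; the exact contents are an assumption, not the file from this pull request:

```groovy
// Sketch only: the real ./nextflow.config is not part of this page.
process {
    withName: 'TETRANSCRIPTS' {
        // Assumption: flag taken from the commit message
        // "Add --sortByPos to config for test". The process script
        // appends task.ext.args to the TEtranscripts command line.
        ext.args = '--sortByPos'
    }
}
```

Keeping tool flags in `ext.args` rather than in the script lets pipelines adjust options without editing the module itself.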