fix: gprofiler2 output files missing gene names in intersection columns #497 #9304
The gprofiler2 module outputs were missing the actual gene names/IDs in the expected columns, making it impossible to identify which specific genes contribute to pathway enrichment.
Expected behavior:
*.gprofiler2.all_enriched_pathways.tsv should contain an intersection column with gene names/IDs
*.gprofiler2.[source].sub_enriched_pathways.tsv should contain actual gene names in the DE_genes_names column
Actual behavior:
The all_enriched_pathways.tsv file lacks the intersection column entirely
The sub_enriched_pathways.tsv files have a DE_genes_names column containing numeric values (the same as DE_genes) instead of gene names
Now with the fix
Enable g:Profiler evidence codes so the intersection column is emitted.
Populate sub-tables with both Ensembl IDs and symbols:
DE_genes_ids = original intersection IDs
DE_genes_names = gene symbols (from the DE table where available, else gprofiler2::gconvert), falling back to IDs if unmapped

nextflow run . -profile test,docker \
  --gprofiler2_run true \
  --gprofiler2_organism mmusculus \
  --gprofiler2_evcodes true \
  --outdir test_gprofile_symbols
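
The symbol lookup order described for DE_genes_names (DE table first, then a converter result, then the raw ID) can be illustrated with a small sketch. This is not the module's R code: the function name `resolve_gene_names`, the two dictionaries standing in for the DE table and the gprofiler2::gconvert result, and the comma separator for the intersection string are all assumptions made for illustration.

```python
from typing import Mapping, Optional


def resolve_gene_names(
    intersection: str,
    de_symbols: Mapping[str, str],
    converted: Optional[Mapping[str, str]] = None,
    sep: str = ",",
) -> str:
    """Map each ID in an intersection string to a display name.

    Lookup order (hypothetical stand-ins for the real data sources):
      1. de_symbols: ID -> symbol taken from the DE table
      2. converted:  ID -> symbol from an ID converter (e.g. a gconvert result)
      3. the ID itself, if neither source has a non-empty symbol
    """
    converted = converted or {}
    names = []
    for gene_id in (part.strip() for part in intersection.split(sep)):
        if not gene_id:
            continue  # skip empty fragments such as a trailing separator
        # `or` also skips empty-string symbols, so unmapped IDs fall through
        names.append(de_symbols.get(gene_id) or converted.get(gene_id) or gene_id)
    return sep.join(names)
```

Keeping the raw ID as the last resort means the names column always has one entry per ID in the intersection, so the two columns stay aligned even when some genes cannot be mapped.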
*all_enriched_pathways.tsv now contains intersection.
*sub_enriched_pathways.tsv now has DE_genes_ids and DE_genes_names (symbols present; IDs used as fallback).
Notes
DE_genes_ids in sub tables.
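
One way to confirm the column changes listed above is to inspect the header row of each output table. The helper below is a minimal sketch, assuming tab-separated files with the column names in the first row; `missing_columns` is a hypothetical name and not part of the pipeline.

```python
import csv
import io
from typing import Iterable, List


def missing_columns(tsv_text: str, required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent from the TSV header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])  # empty input yields an empty header
    return [name for name in required if name not in header]


# Example with inline text; for a real file, pass pathlib.Path(path).read_text().
sample = "term_id\tDE_genes_ids\tDE_genes_names\nGO:0000001\tENSMUSG1\tActb\n"
print(missing_columns(sample, ["DE_genes_ids", "DE_genes_names"]))
```

An empty list means every expected column is present; any returned name points at a table that still lacks that column.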