
Support Nextflow typed syntax for process inputs/outputs #4

Merged
ewels merged 10 commits into main from fix/typed-process-io-parsing on Mar 3, 2026

@ewels ewels commented Mar 3, 2026

Summary

Adds support for Nextflow's typed syntax (nextflow.preview.types = true). Previously, processes written in this syntax produced documentation with no inputs, no outputs, and no Groovydoc descriptions.

Reported via community feedback on fulcrumgenomics/nf-fgsv.

Changes

Typed process I/O parsing (nf_parser.py)

  • Parse typed tuple inputs: (meta, bam): Tuple<Map, Path> and Tuple<?, Path> (the LSP reports Map as ?)
  • Parse typed simple inputs: x: Integer, bam: Path
  • Parse bare emit-name outputs returned by the LSP for typed processes (e.g. txt, bam)
  • Enrich bare outputs from .nf source files to recover full declarations (e.g. txt = tuple(meta, file("*.txt")))
  • Handle mixed qualified/bare tuple components (e.g. tuple(meta, file("*.txt")))
  • Add topic: as a section delimiter so topic blocks don't bleed into output parsing
  • Guard against traditional qualifiers (each, val, etc.) being misparsed as typed inputs
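To make the typed-input rules above concrete, here is a minimal sketch of how such lines could be recognised. It is not the code in nf_parser.py: the function name `parse_typed_input`, the regexes, the returned dict shape, and the choice to display `?` as `Map` are all assumptions made for illustration. It also splits type arguments on plain commas, so nested generics are out of scope.

```python
import re

# Traditional DSL2 qualifiers. A line that starts with one of these (and is not
# itself a "name: Type" declaration) is left to the traditional parser.
TRADITIONAL_RE = re.compile(r"(?:val|path|file|env|stdin|tuple|each)\b(?!\s*:)")

# (meta, bam): Tuple<Map, Path>
TYPED_TUPLE_RE = re.compile(r"^\(\s*([^)]*?)\s*\)\s*:\s*Tuple<\s*(.*?)\s*>$")
# bam: Path
TYPED_SIMPLE_RE = re.compile(r"^(\w+)\s*:\s*([\w.?]+(?:<.*>)?)$")


def parse_typed_input(line):
    """Return a dict describing a typed input, or None if the line is not typed."""
    line = line.strip()
    if TRADITIONAL_RE.match(line):
        return None
    m = TYPED_TUPLE_RE.match(line)
    if m:
        names = [n.strip() for n in m.group(1).split(",")]
        types = [t.strip() for t in m.group(2).split(",")]
        # Assumption: show the LSP's "?" placeholder as Map.
        types = ["Map" if t == "?" else t for t in types]
        return {"kind": "tuple", "components": list(zip(names, types))}
    m = TYPED_SIMPLE_RE.match(line)
    if m:
        return {"kind": "simple", "name": m.group(1), "type": m.group(2)}
    return None
```

With this sketch, `parse_typed_input("val x")` and `parse_typed_input("each x")` return None, so traditional declarations fall through to the existing parser, while an input that happens to be named like a qualifier (`path: Path`) is still treated as typed.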

Groovydoc support (extractor.py, lsp_client.py)

  • Apply @param / @return tag descriptions to process inputs and outputs
  • Support named @return tags (@return txt Description) so each output gets its own description
  • Fall back to reading Groovydoc directly from .nf source files when the LSP omits it (common with typed processes)
  • Support both @param/@return tags and Inputs:/Outputs: bullet-list Groovydoc formats
  • Handle Groovydoc not immediately adjacent to process declaration (e.g. nextflow.preview.types = true between comment and process)
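As an illustration of the @param and named @return handling listed above, the following sketch splits a Groovydoc comment into per-input and per-output descriptions. The function name `parse_groovydoc_tags`, the `output_names` argument, and the use of a `None` key for an unnamed @return are assumptions for this example, not the API in extractor.py.

```python
import re

TAG_RE = re.compile(r"@(param|return)\s+(.*)")


def parse_groovydoc_tags(comment, output_names=()):
    """Split @param / @return tags into (params, returns) description dicts."""
    params, returns = {}, {}
    for raw in comment.splitlines():
        # Strip comment decoration such as "/**", " * " and " */".
        line = re.sub(r"^\s*/?\*+/?\s?", "", raw)
        line = re.sub(r"\s*\*/\s*$", "", line).strip()
        m = TAG_RE.match(line)
        if not m:
            continue
        tag, rest = m.group(1), m.group(2).strip()
        name, _, desc = rest.partition(" ")
        if tag == "param":
            params[name] = desc.strip()
        elif name in output_names:
            # Named form: "@return txt Description" documents one output.
            returns[name] = desc.strip()
        else:
            # Unnamed form: the whole text is a shared description.
            returns[None] = rest
    return params, returns
```

Deciding between the named and unnamed forms needs the set of known emit names, which is why the sketch takes `output_names`; without it, the first word of an unnamed description would be mistaken for an output name.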

HTML template fixes (html.html, html.py)

  • Fix params page appearing empty due to ID collision: README headings like ## Inputs generated id="inputs", shadowing the <section id="inputs"> used for navigation. Prefix all README heading IDs with readme-
  • Show pattern restriction as an inline yellow badge in parameter cards (with dark mode support)
  • Render all LSP-sourced descriptions through the |markdown filter (process inputs, outputs, workflow I/O, function params) for consistent formatting
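The ID-collision fix in the first bullet can be sketched as a post-processing step on the rendered README HTML. This is only one possible approach: the PR does not say where the prefix is applied, and the helper name `prefix_heading_ids` and its regex are hypothetical.

```python
import re

# Matches the id attribute of an <h1>..<h6> opening tag.
HEADING_ID_RE = re.compile(r'(<h[1-6]\b[^>]*?\sid=")([^"]+)(")')


def prefix_heading_ids(readme_html, prefix="readme-"):
    """Prefix heading ids in README HTML so they cannot shadow page section ids."""
    def add_prefix(m):
        heading_id = m.group(2)
        if heading_id.startswith(prefix):
            return m.group(0)  # already prefixed, leave untouched
        return f"{m.group(1)}{prefix}{heading_id}{m.group(3)}"

    return HEADING_ID_RE.sub(add_prefix, readme_html)
```

Only heading tags are rewritten, so `<section id="inputs">` keeps its ID for navigation. Any in-README anchor links such as `href="#inputs"` would need the same prefix, which this sketch does not handle.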

Tests

  • 60 tests in new tests/test_nf_parser.py covering traditional and typed syntax
  • 8 tests in tests/test_extractor.py for Groovydoc parsing and param description matching

Examples

  • Add nf-fgsv as a new example pipeline demonstrating typed syntax with Groovydoc

ewels and others added 10 commits March 3, 2026 15:06

Add parsing support for Nextflow's typed I/O declarations
(nextflow.preview.types = true) which use a different syntax than
traditional DSL2:

- Typed tuple inputs: (meta, bam): Tuple<Map, Path>
- Typed simple inputs: x: Integer, bam: Path
- Named assignment outputs: txt = tuple(meta, file("*.txt"))
- Topic sections: topic:/>> 'name' as section delimiters

Also adds comprehensive test suite for nf_parser module (53 tests)
covering both traditional and typed syntax patterns.

Fixes issue where pipelines using typed syntax would generate
documentation with no inputs or outputs listed.

The Nextflow LSP returns bare emit names (e.g. 'txt', 'bam') as
outputs for typed processes, rather than qualified declarations.
Add handling for these bare identifier outputs in _parse_single_output.

Also add tests for ? type in typed tuples (LSP representation of
Map type) and integration tests matching real LSP hover output from
nf-fgsv pipeline.

Verified against fulcrumgenomics/nf-fgsv: all 4 processes now
correctly show both inputs and outputs.

When the Nextflow LSP returns only bare emit names for typed process
outputs (e.g. 'txt', 'bam'), read the actual .nf source file to get
the full output declarations (e.g. txt = tuple(meta, file("*.txt"))).

- Add enrich_outputs_from_source() in nf_parser.py to parse the
  source output block and replace bare outputs with rich declarations
- Add _parse_tuple_components() to handle mixed qualified/bare tuple
  elements (e.g. tuple(meta, file("*.txt")) -> val(meta), file(...))
- Call enrichment from extractor when bare outputs are detected
- Add 6 tests for the enrichment function
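The enrichment step described in this commit can be sketched roughly as follows. The helpers below are illustrative stand-ins, not the `enrich_outputs_from_source()` or `_parse_tuple_components()` functions from the PR; their names, signatures, and return shapes are assumptions, and the sketch expects the text of the output block to have been extracted already.

```python
import re

# name = declaration, one per line of the output block
NAMED_OUTPUT_RE = re.compile(r"^\s*(\w+)\s*=\s*(.+?)\s*$", re.MULTILINE)


def split_top_level(text):
    """Split on commas that are not nested inside parentheses or quotes."""
    parts, current, depth, quote = [], [], 0, None
    for ch in text:
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def split_tuple_components(declaration):
    """Turn tuple(meta, file("*.txt")) into ['val(meta)', 'file("*.txt")']."""
    m = re.fullmatch(r"tuple\((.*)\)", declaration.strip())
    if not m:
        return [declaration.strip()]
    # Bare identifiers are treated as values; qualified calls are kept as written.
    return [p if "(" in p else f"val({p})" for p in split_top_level(m.group(1))]


def enrich_bare_outputs(bare_names, output_block):
    """Map each bare emit name to the components declared in the source block."""
    declared = dict(NAMED_OUTPUT_RE.findall(output_block))
    return {n: split_tuple_components(declared[n]) for n in bare_names if n in declared}
```

Names that the LSP reports but the source block does not declare are simply left out of the result, so the caller can keep the bare output as a fallback.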

Verified against nf-fgsv: outputs now show val(meta), file("*") with
type=tuple and emit names, matching the quality of typed inputs.

README headings like '## Inputs' generated id='inputs', colliding
with the page-level section id='inputs' used for Parameters navigation.
When JavaScript called getElementById('inputs'), it found the README
heading first (earlier in DOM), leaving the Parameters section hidden.

Fix by prefixing all README heading IDs with 'readme-' (e.g.
'readme-inputs'), keeping section IDs intact for navigation.

The pattern field from nextflow_schema.json was already included in
JSON, YAML, and Markdown output but was missing from the HTML template.
Add it after the enum block in the parameter card.

- Parse @param and @return tags from LSP hover and apply descriptions
  to matching ProcessInput/ProcessOutput objects
- Support named @return tags (@return txt Description) so each output
  gets its own description instead of sharing a single _return key
- Fall back to reading Groovydoc directly from .nf source files when
  the LSP returns no param docs (common with typed processes)
- Support both @param/@return and Inputs:/Outputs: bullet-list formats
- Handle Groovydoc not immediately adjacent to process declaration
  (e.g. nextflow.preview.types = true between comment and process)
- Render all LSP-sourced descriptions through |markdown filter in HTML
  (consistent with meta.yml descriptions)
- Move pattern field from a separate block to an inline yellow badge
  in the parameter card, with dark mode support
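For the source-file fallback, the non-adjacent case is the interesting one. A minimal sketch, assuming the whole .nf file is available as a string: take the last `/** ... */` comment before the process declaration and accept it only if no other definition sits in between. The function name `find_process_groovydoc` and the rule for what may appear between the comment and the process are assumptions, not the logic in extractor.py.

```python
import re


def find_process_groovydoc(source, process_name):
    """Return the /** ... */ comment preceding `process <name>`, or None."""
    decl = re.search(
        rf"^\s*process\s+{re.escape(process_name)}\b", source, re.MULTILINE
    )
    if not decl:
        return None
    before = source[: decl.start()]
    comments = list(re.finditer(r"/\*\*.*?\*/", before, re.DOTALL))
    if not comments:
        return None
    last = comments[-1]
    between = before[last.end():]
    # Blank lines and simple statements such as
    # "nextflow.preview.types = true" are allowed between the comment and
    # the process; another definition means the comment belongs elsewhere.
    if re.search(r"^\s*(process|workflow|def)\b", between, re.MULTILINE):
        return None
    return last.group(0)
```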

Adds 8 tests for Groovydoc parsing and param description matching.

Fulcrum Genomics structural variant calling pipeline, demonstrating
typed Nextflow syntax (nextflow.preview.types) with Groovydoc support.

Read .nf source files once per process instead of twice (once for
Groovydoc parsing, once for output enrichment). Extract duplicated
section-boundary regexes and _return_ key prefix strings as shared
constants to keep parsers in sync.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@ewels ewels merged commit c207ac6 into main Mar 3, 2026
8 checks passed
@ewels ewels deleted the fix/typed-process-io-parsing branch March 3, 2026 22:51