Add HDL shared test fixtures #1932

Open: lyh970817 wants to merge 5 commits into nf-core:modules from lyh970817:hdl-reference-fixtures

lyh970817 commented Mar 15, 2026

PR checklist

  • I read the test-data specifications
  • I tried testing my module with available data and there is no compatible data in test-datasets
  • If I added data: I described the new files on the root README.md
  • If I modified data: I checked whether the root README.md is still up-to-date
  • I made sure to submit the smallest dataset possible (e.g. try .tar.gz for larger directories)

This PR adds the smallest shared HDL fixture set needed to exercise HDL-compatible LD reference inputs for the companion module submission.
Existing popgen fixtures do not provide the HDL reference directory layout consumed by hdl/h2 and hdl/rg.

All HDL files in this branch are fully synthetic toy fixtures, generated in-repo by data/genomics/homo_sapiens/popgen/hdl/generate_toy_hdl_data.R to match HDL's expected format and layout. They are compatible with HDL inputs and do not redistribute any upstream HDL reference bundle.

Added files:

  • reference/chr1.1_toy.bim
  • reference/chr1.1_toy.rda
  • reference/chr1.2_toy.bim
  • reference/chr1.2_toy.rda
  • reference/toy_snp_counter.RData
  • reference/toy_snp_list.RData
  • sumstats/trait1_canonical.tsv
  • sumstats/trait2_canonical.tsv

Size footprint: 915 bytes total.
These toy HDL inputs were minimized to the smallest set that still supports real HDL execution for the companion module tests.
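
A size footprint like the one quoted above can be re-derived by summing file sizes under the fixture directory. The following is a minimal sketch, not code from this PR: `fixture_footprint` is a hypothetical helper, and the demo writes placeholder files into a temporary directory rather than reading the real fixtures.

```r
# Hypothetical helper (not part of the PR): per-file byte counts under `root`.
fixture_footprint <- function(root) {
  # list.files() returns paths relative to `root` and skips directories.
  rel <- list.files(root, recursive = TRUE)
  data.frame(
    file = rel,
    bytes = file.size(file.path(root, rel)),
    stringsAsFactors = FALSE
  )
}

# Demo on a throwaway directory that mimics the reference/ and sumstats/ layout.
# The file contents here are placeholders, not the real fixture contents.
demo_root <- file.path(tempdir(), "hdl_footprint_demo")
dir.create(file.path(demo_root, "reference"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo_root, "sumstats"), recursive = TRUE, showWarnings = FALSE)
writeLines("1 rs1 0 101 A G", file.path(demo_root, "reference", "chr1.1_toy.bim"))
writeLines("placeholder", file.path(demo_root, "sumstats", "trait1_canonical.tsv"))

footprint <- fixture_footprint(demo_root)
print(footprint)
cat("Total bytes:", sum(footprint$bytes), "\n")
```

Pointing `fixture_footprint()` at a checkout of data/genomics/homo_sapiens/popgen/hdl/ should report the fixture files together with any README or generator script stored alongside them, so the total may need filtering before comparing it with the figure in the PR description.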

Companion modules PR: nf-core/modules#10912

lyh970817 marked this pull request as ready for review March 15, 2026 17:45
lyh970817 requested a review from Copilot March 15, 2026 17:46

Copilot AI left a comment

Pull request overview

Adds a minimal HDL-formatted shared fixture set (HDL LD reference layout + canonical sumstats) to support companion module tests that require HDL reference directory structure.

Changes:

  • Added toy canonical HDL summary statistics for two traits.
  • Added toy HDL LD reference .bim sidecars for two chr1 chunks.
  • Documented the new HDL fixture files in the root README.md.

Reviewed changes

Copilot reviewed 5 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| data/genomics/homo_sapiens/popgen/hdl/sumstats/trait1_canonical.tsv | Adds toy canonical HDL summary statistics (trait 1) |
| data/genomics/homo_sapiens/popgen/hdl/sumstats/trait2_canonical.tsv | Adds toy canonical HDL summary statistics (trait 2) |
| data/genomics/homo_sapiens/popgen/hdl/reference/chr1.1_toy.bim | Adds toy HDL reference BIM sidecar for chunk 1.1 |
| data/genomics/homo_sapiens/popgen/hdl/reference/chr1.2_toy.bim | Adds toy HDL reference BIM sidecar for chunk 1.2 |
| README.md | Documents the newly added HDL fixture files |

Comment on lines +3 to +5
rs2 C T 1 202 rs2 C T 10000 -0.2
rs3 A C 1 303 rs3 A C 10000 0.4
rs4 G T 1 404 rs4 G T 10000 -0.1
Comment on lines +2 to +5
rs1 A G 1 101 rs1 A G 12000 0.3
rs2 C T 1 202 rs2 C T 12000 -0.4
rs3 A C 1 303 rs3 A C 12000 0.2
rs4 G T 1 404 rs4 G T 12000 -0.2
lyh970817 requested a review from Copilot March 16, 2026 07:58

Copilot AI left a comment

Pull request overview

Adds a minimal synthetic HDL-format reference + sumstats fixture set to support companion HDL module tests, along with regeneration tooling and documentation.

Changes:

  • Added toy HDL reference sidecars/payloads and canonical sumstats TSVs under data/genomics/homo_sapiens/popgen/hdl/.
  • Added an R script to regenerate the fixtures deterministically from in-repo constants.
  • Documented the dataset locally and registered the new files in the root README.md.

Reviewed changes

Copilot reviewed 7 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| data/genomics/homo_sapiens/popgen/hdl/sumstats/trait2_canonical.tsv | Adds toy canonical sumstats fixture for trait 2 |
| data/genomics/homo_sapiens/popgen/hdl/sumstats/trait1_canonical.tsv | Adds toy canonical sumstats fixture for trait 1 |
| data/genomics/homo_sapiens/popgen/hdl/reference/chr1.2_toy.bim | Adds toy BIM sidecar for HDL chunk 1.2 |
| data/genomics/homo_sapiens/popgen/hdl/reference/chr1.1_toy.bim | Adds toy BIM sidecar for HDL chunk 1.1 |
| data/genomics/homo_sapiens/popgen/hdl/generate_toy_hdl_data.R | Adds generator script for recreating all toy fixtures |
| data/genomics/homo_sapiens/popgen/hdl/README.md | Documents intent, layout, and regeneration steps for the HDL toy dataset |
| README.md | Registers the new HDL toy dataset files in the root inventory |

README.md Outdated
Comment on lines +695 to +702
- hdl/reference/chr1.1_toy.bim: Toy HDL LD reference BIM sidecar for chunk 1.1. From [HDL](https://github.com/zhenin/HDL).
- hdl/reference/chr1.1_toy.rda: Toy HDL LD reference payload for chunk 1.1. From [HDL](https://github.com/zhenin/HDL).
- hdl/reference/chr1.2_toy.bim: Toy HDL LD reference BIM sidecar for chunk 1.2. From [HDL](https://github.com/zhenin/HDL).
- hdl/reference/chr1.2_toy.rda: Toy HDL LD reference payload for chunk 1.2. From [HDL](https://github.com/zhenin/HDL).
- hdl/reference/toy_snp_counter.RData: Toy HDL SNP count metadata. From [HDL](https://github.com/zhenin/HDL).
- hdl/reference/toy_snp_list.RData: Toy HDL SNP list metadata. From [HDL](https://github.com/zhenin/HDL).
- hdl/sumstats/trait1_canonical.tsv: Canonical HDL summary statistics for trait 1. From [HDL](https://github.com/zhenin/HDL).
- hdl/sumstats/trait2_canonical.tsv: Canonical HDL summary statistics for trait 2. From [HDL](https://github.com/zhenin/HDL).
Author

@copilot apply changes based on this feedback

Author

Updated the root README entries to describe these as synthetic HDL-format fixtures generated by generate_toy_hdl_data.R for HDL-compatible inputs, and aligned the PR body wording/checklist with that provenance.


args <- commandArgs(trailingOnly = FALSE)
file_arg <- "--file="
script_path <- sub(file_arg, "", args[grep(file_arg, args)])
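
The excerpt above relies on Rscript passing the script location as a `--file=` argument. A minimal sketch of how that idiom might be wrapped so it also behaves when no such argument exists (for example in an interactive session) is shown below. The helper name `resolve_script_dir` and the fall back to the working directory are assumptions for illustration, not taken from the PR.

```r
# Hypothetical helper (not part of the PR): directory of the running script.
# `args` has the shape returned by commandArgs(trailingOnly = FALSE).
resolve_script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- "--file="
  hits <- args[startsWith(args, file_arg)]
  if (length(hits) == 0L) {
    # Assumption: fall back to the working directory when no script path is given.
    return(getwd())
  }
  # Strip the "--file=" prefix from the first match.
  script_path <- substring(hits[1L], nchar(file_arg) + 1L)
  dirname(normalizePath(script_path, mustWork = FALSE))
}

# Demo with a simulated argument vector pointing at a throwaway script file.
demo_script <- file.path(tempdir(), "generate_demo.R")
file.create(demo_script)
demo_dir <- resolve_script_dir(c("R", "--no-echo", paste0("--file=", demo_script)))
print(demo_dir)
```

Taking the argument vector as a parameter keeps the helper testable without launching a separate Rscript process.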
Comment on lines +20 to +29
"1 rs1 0 101 A G",
"1 rs2 0 102 C T"
),
file.path(reference_dir, "chr1.1_toy.bim")
)

writeLines(
c(
"1 rs3 0 201 A C",
"1 rs4 0 202 G T"
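
The excerpt above shows only the middle of the generator's `writeLines()` calls. As a self-contained sketch of the same pattern, the block below writes one toy BIM sidecar and reads it back to confirm it parses as the six standard PLINK BIM columns. The temporary output directory, the column names, and the checks are assumptions for illustration; only the two row strings are copied from the excerpt.

```r
# Write to a temp directory so the sketch never touches the repository fixtures.
reference_dir <- file.path(tempdir(), "hdl_bim_demo")
dir.create(reference_dir, recursive = TRUE, showWarnings = FALSE)

bim_path <- file.path(reference_dir, "chr1.1_toy.bim")
writeLines(
  c(
    "1 rs1 0 101 A G",
    "1 rs2 0 102 C T"
  ),
  bim_path
)

# Read back with explicit column classes so allele codes stay character.
bim <- read.table(
  bim_path,
  col.names = c("chr", "snp", "cm", "bp", "a1", "a2"),
  colClasses = c("integer", "character", "numeric", "integer", "character", "character")
)

# Basic sanity checks: six columns, unique SNP IDs, positions in ascending order.
stopifnot(ncol(bim) == 6L, !anyDuplicated(bim$snp), !is.unsorted(bim$bp))
print(bim)
```

The explicit `colClasses` matters for allele columns: without it, `read.table()` converts a column consisting only of `T` or `F` values to logical, which would silently corrupt thymine alleles.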