fd:12: hPutBuf: resource vanished (Broken pipe) #153

@uloeber

Hi Luis,
I am currently getting an error that I cannot trace back, and our IT department does not know what the issue is either. I am running the job on one of our large nodes with 2 TB of memory, so memory should not be an issue. I get the following error (I only removed some personal information from the paths):


```
Exiting after fatal error:
An unhandled error occurred (this should not happen)!

        If you can reproduce this issue, please run your script
        with the --trace flag and report a bug (including the script and the trace) at
                https://github.com/ngless-toolkit/ngless/issues

The error message was: `fd:12: hPutBuf: resource vanished (Broken pipe)`)
```
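
For context, `hPutBuf: resource vanished (Broken pipe)` is how the Haskell runtime reports an `EPIPE` error, that is, a write to a pipe whose reading end has already been closed. The sketch below is in Python and is not taken from NGLess; it only reproduces that general condition by writing to a child process that has already exited. Whether something similar happened here, and which process was on the other end of fd 12, is an assumption and cannot be read from the output above. The function name `write_to_exited_child` is made up for illustration, and the example assumes a POSIX system.

```python
import errno
import subprocess
import sys


def write_to_exited_child() -> str:
    """Write to the stdin pipe of a child that has already exited.

    Returns the symbolic errno name of the failure, or "no error".
    """
    # The child runs an empty program and exits without reading its stdin.
    # bufsize=0 makes child.stdin unbuffered, so the write hits the pipe directly.
    child = subprocess.Popen(
        [sys.executable, "-c", "pass"],
        stdin=subprocess.PIPE,
        bufsize=0,
    )
    child.wait()  # the reading end of the pipe is now closed
    try:
        child.stdin.write(b"x" * 65536)
    except BrokenPipeError as err:
        # Python ignores SIGPIPE, so the failed write surfaces as EPIPE.
        return errno.errorcode[err.errno]
    finally:
        child.stdin.close()
    return "no error"


if __name__ == "__main__":
    print(write_to_exited_child())
```

The point of the sketch is that the writer only sees the broken pipe; the real cause is usually whatever made the reading process stop early, so its own error output is the place to look.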

Here are the last lines from the trace log:
[main] CMD:/bactopia-20201013/anaconda3/envs/ngless/bin/bwa mem -t 1 -K 100000000 -p -a /NGLESSmodules/Modules/gmgc.ngm/1.0/cached/gmgc:no-rare.fna.splits_250000m.0-bwa-0.7.17.fna -
[main] Real time: 79285.296 sec; CPU: 79095.875 sec

[Wed 22-06-2022 12:52:27] Line 16: Success
[Wed 22-06-2022 12:52:27] Line 16: Mapped readset stats (/NGLESSmodules/Modules/gmgc.ngm/1.0/cached/gmgc:no-rare.fna.splits_250000m.0.fna):
[Wed 22-06-2022 12:52:27] Line 16: Total reads: 21950778
[Wed 22-06-2022 12:52:27] Line 16: Total reads aligned: 13303553 [60.61%]
[Wed 22-06-2022 12:52:27] Line 16: Total reads Unique map: 7767800 [35.39%]
[Wed 22-06-2022 12:52:27] Line 16: Total reads Non-Unique map: 5535753 [25.22%]
[Wed 22-06-2022 12:52:27] Line 17: Running garbage collection.
[Wed 22-06-2022 12:52:27] Line 17: Interpreting [interpretIO]: gmgc_mapped_post = select(Lookup 'gmgc_mapped' as NGLMappedReadSet)using {Block {blockVariable = Variable "mr", blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=95; action={drop} ),if [UOpNot((Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "flag"}( Just {mapped} ))] then {Sequence [discard]} else {Sequence []}]}}
[Wed 22-06-2022 12:52:27] Line 17: Interpreting [assignment]: select(Lookup 'gmgc_mapped' as NGLMappedReadSet)using {Block {blockVariable = Variable "mr", blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=95; action={drop} ),if [UOpNot((Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "flag"}( Just {mapped} ))] then {Sequence [discard]} else {Sequence []}]}}
[Wed 22-06-2022 12:52:27] Line 17: Executing blocked select on file /temp/mapped_gmgc:no-rare.sam6057-0.zstd
[Wed 22-06-2022 12:52:27] Line 17: Created & opened temporary file //temp/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd
[Wed 22-06-2022 13:00:19] Line 21: Running garbage collection.
[Wed 22-06-2022 13:00:19] Line 21: Interpreting [interpretIO]: temp$4 = mapstats(Lookup 'gmgc_mapped_post' as NGLMappedReadSet)
[Wed 22-06-2022 13:00:19] Line 21: Interpreting [assignment]: mapstats(Lookup 'gmgc_mapped_post' as NGLMappedReadSet)
[Wed 22-06-2022 13:00:19] Line 21: Computing mapstats on File /temp/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd
[Wed 22-06-2022 13:03:03] Line 21: Created & opened temporary file /temp/sam_stats_block_selected_mapped_gmgc:no-rare6057-2.stats
[Wed 22-06-2022 13:03:03] Line 21: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [interpretIO]: __check_ofile(BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- "gmgc_norare.stats.txt")); original_lno=21)
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [executing module function: '__check_ofile']: NGOString "gmgc/D16gmgc_norare.stats.txt"
[Wed 22-06-2022 13:03:03] Line 21: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [interpretIO]: write(Lookup 'temp$4' as NGLCounts; __can_move=True; __hash="7b8566ff57fbce4d04ada723875d32b8"; ofile=BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- "gmgc_norare.stats.txt")))
[Wed 22-06-2022 13:03:03] Line 21: Interpreting [write]: NGOCounts File /temp/sam_stats_block_selected_mapped_gmgc:no-rare6057-2.stats
[Wed 22-06-2022 13:03:03] Line 21: Writing counts to: gmgc/D16gmgc_norare.stats.txt
[Wed 22-06-2022 13:03:03] Line 22: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [interpretIO]: __check_ofile(BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- ".gmgc_norare.bam")); original_lno=22)
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [executing module function: '__check_ofile']: NGOString "gmgc/D16.gmgc_norare.bam"
[Wed 22-06-2022 13:03:03] Line 22: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [interpretIO]: write(Lookup 'gmgc_mapped_post' as NGLMappedReadSet; __can_move=True; __hash="dcb070e78c288900aed230a2e77b322c"; ofile=BinaryOp("gmgc" -BOpPathAppend- BinaryOp(Lookup 'RESULTS' as NGLString -BOpAdd- ".gmgc_norare.bam")))
[Wed 22-06-2022 13:03:03] Line 22: Interpreting [write]: NGOMappedReadSet {nglgroupName = "preprocessed/D16filtered_HG_HS_qc.pair.1.fq.gz", nglSamFile = File /temp/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd, nglReference = Just "gmgc:no-rare"}
[Wed 22-06-2022 13:03:03] Line 22: Created & opened temporary file /temp/converted_block_selected_mapped_gmgc:no-rare6057-3.bam
[Wed 22-06-2022 13:03:03] Line 22: SAM->BAM Conversion start ('/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd' -> '/temp/converted_block_selected_mapped_gmgc:no-rare6057-3.bam')
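
To see at a glance which script line was active when the run stopped, the trace can be scanned for its last `Line N:` entry. The following is a minimal sketch that assumes the line layout visible in the log above (`[timestamp] Line N: message`); the helper name `last_trace_step` is hypothetical and not part of NGLess.

```python
import re
from typing import Optional, Tuple

# Matches trace entries such as "[Wed 22-06-2022 13:03:03] Line 22: message".
# Lines without a "Line N:" part (for example the "[main] CMD:" lines) are skipped.
TRACE_LINE = re.compile(r"^\[(?P<stamp>[^\]]+)\] Line (?P<lno>\d+): (?P<msg>.*)$")


def last_trace_step(trace_text: str) -> Optional[Tuple[int, str]]:
    """Return (script line number, message) of the last matching trace entry."""
    last = None
    for line in trace_text.splitlines():
        match = TRACE_LINE.match(line)
        if match:
            last = (int(match.group("lno")), match.group("msg"))
    return last


# The final three entries of the trace, copied verbatim from the log above.
SAMPLE = """\
[Wed 22-06-2022 13:03:03] Line 22: Running garbage collection.
[Wed 22-06-2022 13:03:03] Line 22: Created & opened temporary file /temp/converted_block_selected_mapped_gmgc:no-rare6057-3.bam
[Wed 22-06-2022 13:03:03] Line 22: SAM->BAM Conversion start ('/block_selected_mapped_gmgc:no-rare.sam6057-1.zstd' -> '/temp/converted_block_selected_mapped_gmgc:no-rare6057-3.bam')
"""

if __name__ == "__main__":
    print(last_trace_step(SAMPLE))
```

For this trace the last entry belongs to script line 22, the `write(...)` call that produces the BAM file, and its message is the start of the SAM->BAM conversion.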

This is the script:

ngless "1.4"
import "gmgc" version "1.0"
import "parallel" version "1.0"
import "samtools" version "1.0"

samples = readlines('samplelist.txt')
current = lock1(samples)
input = paired ("preprocessed"</>current + "filtered_HG_HS_qc.pair.1.fq.gz","preprocessed"</>current + "filtered_HG_HS_qc.pair.2.fq.gz")

#different result outputdirs not necessary anymore, since all results will be "collected" in one output with the collect function
RESULTS = current


##GMGC
##"How counts are adjusted in the presence of multiple annotations is defined by the multiple argument. Generally, for obtaining gene abundances, distribution of multiple mappers is the best (using multiple={dist1}), while for functional annotations, you want to count them all (using multiple={all1}). This implies that the functional annotations will sum to a higher value than the number of reads. This may seem strange at first, but it is the intended behaviour."
gmgc_mapped = map (input, reference='gmgc:no-rare',mode_all=True,block_size_megabases=250000)
gmgc_mapped_post = select(gmgc_mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=95, action={drop})
    if not mr.flag({mapped}):
        discard
write(mapstats(gmgc_mapped_post),ofile="gmgc"</>RESULTS+'gmgc_norare.stats.txt')
write(gmgc_mapped_post,ofile="gmgc"</>RESULTS+'.gmgc_norare.bam')

Thanks in advance!
Ulrike
