Closed
Description
Bug report
Expected behavior and actual behavior
When a task fails with a container out-of-memory (OOM) error, Nextflow reports the following error:
Process `UseMem (1)` terminated for an unknown reason -- Likely it has been terminated by the external system
This makes it impossible to retry on OOM errors. Because no exit code is reported, users cannot define a retry strategy that checks whether the exit status is 137.
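For context, a retry rule of the kind this bug defeats might look like the following `nextflow.config` sketch. The process name is taken from the error message above; the retry count and memory values are illustrative assumptions, not settings from the test pipeline.

```groovy
// nextflow.config (illustrative sketch; values are assumptions)
process {
    withName: 'UseMem' {
        // Retry only when the task was killed with exit status 137 (OOM kill)
        errorStrategy = { task.exitStatus == 137 ? 'retry' : 'terminate' }
        maxRetries    = 2
        // Request more memory on each attempt
        memory        = { 2.GB * task.attempt }
    }
}
```

When no exit code is reported, the `task.exitStatus == 137` comparison never matches, so the task is terminated instead of retried.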
Steps to reproduce the problem
Run a pipeline with a task that exhausts the available memory:
nextflow run 'https://github.com/robsyme/nf-test' -r mem-testing
Program output
The log file shows that the error occurs because the .exitcode file is not generated.
In other executors, such as AWS Batch, the exit code is first obtained from the job status returned by the API, and the .exitcode file is read only as a fallback when the API does not provide it.
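The fallback order described above can be sketched roughly as follows. This is a hypothetical Groovy helper written for illustration, not Nextflow's actual code; the name `resolveExitCode` and its parameters are assumptions.

```groovy
import java.nio.file.Path

// Hypothetical helper: prefer the exit code reported by the executor API,
// and fall back to the .exitcode file only when the API gave none.
Integer resolveExitCode(Integer apiExitCode, Path exitFile) {
    if (apiExitCode != null) {
        return apiExitCode
    }
    try {
        // The file is expected to hold a single integer, e.g. "137"
        return exitFile.text.trim() as Integer
    }
    catch (Exception e) {
        // File missing or unreadable: the exit code stays unknown
        return null
    }
}
```

With this order, a container killed for exceeding its memory limit would still surface 137 as long as the API reports it, even though the .exitcode file was never written.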
Environment
- Nextflow version: 25.04.6
- Java version: [?]
- Operating system: [macOS, Linux, etc]
- Bash version: (use the command `$SHELL --version`)
Additional context
(Add any other context about the problem here)