Description
What are you really trying to do?
I have a long-running workflow that uses `workflow.uuid4` to set an ID for a child workflow, then sleeps. I modified the retry policy of an activity and set `initial_interval` with `workflow.random().randint`. The modified activity is called before the child workflow is started. I deployed the new workflow.
Describe the bug
All the running workflows failed with a `[TMPRL1100] Nondeterminism` error after they woke up from the sleep. All the errors complained about the ID of the child workflow. Example error message:
[TMPRL1100] Nondeterminism error: Child workflow id of scheduled event 'child_workflow_6bfd6073-2403-413f-acc3-7090cf837d0a' does not match child workflow id of command 'child_workflow_96a033ac-6730-4f6d-abfd-60732403113f
If I make any other change to the retry policy, e.g., changing `initial_interval` from one number to another, I don't get the above error. It only happens if a `workflow.random()` function call is added.
Minimal Reproduction
I managed to reproduce the issue with the simple workflow code below.
- Deploy the worker
- Start the workflow
- Uncomment `workflow.random().randint`
- Re-deploy the worker
- After the workflow wakes up, it dies with a non-determinism error
I also got the same error with other random functions, e.g., `workflow.random().uniform`. However, just adding `workflow.random()` doesn't cause the workflow to fail with non-determinism.
If `workflow.uuid4` is called BEFORE `workflow.random().randint`, then the workflow doesn't fail with non-determinism after `workflow.random().randint` is added.
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class ChildWorkflowUuid:
    @workflow.run
    async def run(self) -> str:
        return workflow.info().workflow_id


@workflow.defn
class WorkflowUuid:
    @workflow.run
    async def run(self) -> str:
        # IF CHILD ID IS SET HERE, THE WORKFLOW COMPLETES WITHOUT ANY ERRORS AFTER ADDING `workflow.random().randint(1, 5)`
        # child_workflow_id = str(workflow.uuid4())

        # UNCOMMENT `workflow.random().randint(1, 5)` WHILE WORKFLOW IS SLEEPING (just `workflow.random()` does not cause non-determinism error)
        # workflow.random().randint(1, 5)

        # IF CHILD ID IS SET HERE, WORKFLOW FAILS WITH
        # [TMPRL1100] Nondeterminism error: Child workflow id of scheduled event '0d4bdc56-4950-4478-94ad-076b61f06fb1' does not match child workflow id of command 'ab05243a-0d4b-4c56-8950-547854ad076b
        child_workflow_id = str(workflow.uuid4())

        child_workflow_uuid = await workflow.execute_child_workflow(
            ChildWorkflowUuid.run,
            id=child_workflow_id,
        )
        await workflow.sleep(timedelta(seconds=30))
        return child_workflow_uuid
Environment/Versions
- OS and processor: M1 Mac Pro and x86 Linux
- Temporal Version: 1.17.0
- Are you using Docker or Kubernetes or building Temporal from source: no
Additional context
I am not sure if this is a bug or expected behaviour. I assumed that the `workflow.uuid4` seed depends only on the run ID, which should be unchanged after the workflow was modified.
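
A minimal sketch of one possible explanation, using only the Python standard library rather than the Temporal SDK. It assumes that `workflow.uuid4` takes its bits from the same seeded generator that `workflow.random()` returns. If so, the seed can stay the same while the generator's position in its stream changes. The names `uuid4_from`, `child_id`, and `SEED` are hypothetical and exist only for this illustration.

```python
import random
import uuid


def uuid4_from(rng: random.Random) -> uuid.UUID:
    # Assumption: a determinism-safe UUID is built from 128 bits of the seeded stream.
    return uuid.UUID(bytes=rng.getrandbits(128).to_bytes(16, "big"), version=4)


def child_id(seed: int, draw_before_uuid: bool) -> str:
    # A fresh generator with the same seed stands in for replaying the same run.
    rng = random.Random(seed)
    if draw_before_uuid:
        # The newly added call consumes part of the stream before the UUID is drawn.
        rng.randint(1, 5)
    return str(uuid4_from(rng))


SEED = 1234  # hypothetical seed, fixed for the whole run

original = child_id(SEED, draw_before_uuid=False)
replayed_same_code = child_id(SEED, draw_before_uuid=False)
replayed_new_code = child_id(SEED, draw_before_uuid=True)

print(original == replayed_same_code)  # same seed, same call order: IDs match
print(original == replayed_new_code)   # extra draw before the UUID: IDs differ
```

Under this assumption, merely obtaining the generator draws nothing, which would fit the observation that a bare `workflow.random()` is harmless, and generating the child ID before the new `randint` call keeps the ID stable. The two IDs in the error messages above also look like shifted copies of each other, which would be consistent with a shared stream.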