Conversation

Maximilien-R
Contributor

Context

When a TaskRun references StepActions, TaskRun.Status.Steps is partially populated during resolution (to capture provenance). Later, pod-based status processing appends inline steps, which can result in StepAction steps being shown first and inline steps last, regardless of the real pod order.

This breaks dashboards and tools that assume Status.Steps reflects the true execution order:

  • Incorrect ordering: StepAction steps appear first and inline steps are appended, regardless of their real execution order in the pod.
  • UI "popping": In the Tekton Dashboard, StepAction-backed steps show up immediately (post-resolution) while inline steps only appear later (after pod creation and status reconciliation). This causes steps to "pop" into view and reshuffle, which is confusing.

Fixes #9037

Changes

This PR includes two complementary commits that address both completeness and ordering of Status.Steps.

  • Commit n°1:
    • setTaskRunStatusBasedOnStepStatus now constructs a new ordered slice the size of the step container list and replaces trs.Steps in one shot.
    • The input stepStatuses were already sorted to match pod.Spec.Containers; we leverage that to ensure strict pod-order.
    • Provenance for matching steps (by name) is preserved.
  • Commit n°2:
    • During GetStepActionsData, if at least one StepAction is present in the Task, create a StepState for every step:
      • StepAction steps get Provenance.RefSource when available.
      • Inline steps get nil provenance.
    • This makes Status.Steps complete earlier in the lifecycle, removing the Dashboard "popping" effect where StepAction steps appear first and inline steps arrive later.
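
The ordering rebuild from commit n°1 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `StepState` and `Provenance` are simplified stand-ins for the Tekton types, and `orderSteps` is a hypothetical helper name. It assumes the pod step order is available as a list of step names.

```go
package main

import "fmt"

// Provenance and StepState are simplified stand-ins for the Tekton types
// (assumption: the real structs carry many more fields).
type Provenance struct {
	RefSource string
}

type StepState struct {
	Name       string
	Provenance *Provenance
}

// orderSteps is a hypothetical helper. It returns a new slice that follows
// podOrder exactly and carries over any provenance already recorded for a
// step with the same name.
func orderSteps(existing []StepState, podOrder []string) []StepState {
	// Index existing provenance by step name so each lookup is constant time.
	provenanceByName := make(map[string]*Provenance, len(existing))
	for _, s := range existing {
		if s.Provenance != nil {
			provenanceByName[s.Name] = s.Provenance
		}
	}

	// Build the replacement slice in pod order; steps without recorded
	// provenance get a nil pointer from the map lookup.
	ordered := make([]StepState, 0, len(podOrder))
	for _, name := range podOrder {
		ordered = append(ordered, StepState{Name: name, Provenance: provenanceByName[name]})
	}
	return ordered
}

func main() {
	// "scan" was recorded during resolution; "build" and "publish" are inline.
	existing := []StepState{{Name: "scan", Provenance: &Provenance{RefSource: "git://example/scan"}}}
	podOrder := []string{"build", "scan", "publish"}
	for _, s := range orderSteps(existing, podOrder) {
		fmt.Printf("%s hasProvenance=%t\n", s.Name, s.Provenance != nil)
	}
}
```

Building a fresh slice and swapping it in, rather than appending to the existing one, is what prevents steps recorded at resolution time from staying at the front.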

Release Notes

- `TaskRun.Status.Steps` now strictly follows the pod’s step container order.
- When `StepActions` are used, inline steps are also included in `TaskRun.Status.Steps` (with nil provenance) early, preventing steps from "popping" into view later and improving dashboard stability.

@tekton-robot tekton-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Sep 19, 2025
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 19, 2025
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|------|--------------|--------------|-------|
| pkg/pod/status.go | 92.2% | 92.4% | 0.1 |

@waveywaves
Member

/retest

@waveywaves waveywaves requested review from vdemeester and waveywaves and removed request for jerop September 23, 2025 17:26
@waveywaves
Member

/assign

Member

@waveywaves waveywaves left a comment

Thank you for this fix! I have one performance fix and one comment suggestion. I also wanted to point out that when no StepActions exist, StepStates won't get created, since we return early on lines 202-204 in taskspec.go. We should have StepStates created regardless, for consistent behaviour, right?

@Maximilien-R Maximilien-R force-pushed the fix-taskrun-status-steps-ordering branch from bb65aa8 to faa13e8 Compare September 30, 2025 10:56
@Maximilien-R
Contributor Author

Hi @waveywaves 👋

I've taken your feedback into account and made the following changes:

  • added a lookup map for the step state provenances in setTaskRunStatusBasedOnStepStatus
  • added a comment for readability on updateTaskRunProvenance usages
  • added a third commit to populate TaskRun.Status.Steps in all scenarios (including when there are only inline steps).

For the third commit: strictly speaking, when there are only inline steps, the issue I'm trying to solve doesn't occur. However, populating TaskRun.Status.Steps in this case keeps the behaviour consistent and lets the UI display the steps a little earlier (no need to wait for the pod to be created).

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|------|--------------|--------------|-------|
| pkg/pod/status.go | 92.2% | 92.4% | 0.2 |

@waveywaves
Member

/retest

@waveywaves
Member

Thank you @Maximilien-R. Can you rebase the PR? There seems to be an issue with the CI.

The `TaskRun.Status.Steps` list was not guaranteed to reflect the true
execution order when StepActions were involved. Steps populated earlier
(from StepAction resolution) were left at the front, with inline steps
appended later, causing a mismatch with the pod container order and
confusing dashboards.

This change fixes the issue by creating a temporary slice aligned with
the (already sorted) pod step container sequence and then replacing
`trs.Steps` in one shot. Existing `Provenance` is still preserved for
steps matched by name.
…ns are used

When at least one StepAction is referenced, `TaskRun.Status.Steps` is
pre-populated to capture provenance for remote steps during resolution.
However, inline steps were not included at that stage and only appeared
later during pod-based status reconciliation. This contributed to confusing
ordering and visibility for dashboards relying on `Status.Steps`.

In the Tekton Dashboard specifically, StepAction-backed steps showed up
first (because they were present in `Status.Steps` right after resolution),
while inline steps appeared only after the pod was created and
`MakeTaskRunStatus` had run to append them. This resulted in steps
"popping" into view and reshuffling, which was confusing.

When a `TaskRun` references at least one `StepAction`, the resolution
flow should also append an entry to `Status.Steps` for each inline
`Step`, even though it has no `Provenance`.
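
As a rough sketch of the pre-population described in this commit message (not the PR's actual code): every step gets a state entry in spec order, and only StepAction-backed steps receive provenance. The `Step`, `StepState`, and `Provenance` types are simplified stand-ins, `prepopulate` is a hypothetical name, and the `resolved` map from StepAction reference to source is an assumption made for illustration.

```go
package main

import "fmt"

// Simplified stand-ins for the Tekton types (assumption).
type Provenance struct {
	RefSource string
}

type StepState struct {
	Name       string
	Provenance *Provenance
}

// Step is a reduced view of a Task step: Ref is non-empty only when the
// step references a StepAction.
type Step struct {
	Name string
	Ref  string
}

// prepopulate is a hypothetical helper that creates one StepState per step,
// in spec order. resolved maps a StepAction reference to the source it was
// fetched from; inline steps are left with nil provenance.
func prepopulate(steps []Step, resolved map[string]string) []StepState {
	states := make([]StepState, 0, len(steps))
	for _, s := range steps {
		state := StepState{Name: s.Name}
		if s.Ref != "" {
			// Provenance is attached only when a source is actually known.
			if src, ok := resolved[s.Ref]; ok {
				state.Provenance = &Provenance{RefSource: src}
			}
		}
		states = append(states, state)
	}
	return states
}

func main() {
	steps := []Step{
		{Name: "build"},                    // inline step
		{Name: "scan", Ref: "scan-action"}, // StepAction-backed step
		{Name: "publish"},                  // inline step
	}
	resolved := map[string]string{"scan-action": "git://example/scan"}
	for _, st := range prepopulate(steps, resolved) {
		fmt.Printf("%s hasProvenance=%t\n", st.Name, st.Provenance != nil)
	}
}
```

With every step present from the start, a consumer that renders the list right after resolution already sees the full set in spec order.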

add comment for readability
With this change, `TaskRun.Status.Steps` is now populated even when only
inline steps are defined.
@Maximilien-R Maximilien-R force-pushed the fix-taskrun-status-steps-ordering branch from faa13e8 to 52a30a3 Compare October 1, 2025 07:13
@Maximilien-R
Contributor Author

@waveywaves I did the rebase but the e2e tests still don't pass 😞

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 1, 2025
@vdemeester
Member

2025-10-01T07:45:58.1625444Z   step-perl:
2025-10-01T07:45:58.1625619Z     Container ID:  
2025-10-01T07:45:58.1626055Z     Image:         mirror.gcr.io/perl:devel-bullseye
2025-10-01T07:45:58.1626222Z     Image ID:      
2025-10-01T07:45:58.1626412Z     Port:          <none>
2025-10-01T07:45:58.1626621Z     Host Port:     <none>
2025-10-01T07:45:58.1626772Z     Command:
2025-10-01T07:45:58.1626987Z       /tekton/bin/entrypoint
2025-10-01T07:45:58.1627127Z     Args:
2025-10-01T07:45:58.1627288Z       -wait_file
2025-10-01T07:45:58.1627482Z       /tekton/run/6/out
2025-10-01T07:45:58.1627644Z       -post_file
2025-10-01T07:45:58.1627837Z       /tekton/run/7/out
2025-10-01T07:45:58.1628042Z       -termination_path
2025-10-01T07:45:58.1628230Z       /tekton/termination
2025-10-01T07:45:58.1628428Z       -step_metadata_dir
2025-10-01T07:45:58.1628642Z       /tekton/run/7/status
2025-10-01T07:45:58.1628813Z       -entrypoint
2025-10-01T07:45:58.1629060Z       /tekton/scripts/script-7-zgq5v
2025-10-01T07:45:58.1629201Z       --
2025-10-01T07:45:58.1629378Z     State:          Waiting
2025-10-01T07:45:58.1629612Z       Reason:       ImagePullBackOff
2025-10-01T07:45:58.1629788Z     Ready:          False

Hmm, are those images not available anymore ?

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

TaskRun Status.Steps order incorrect when using StepAction
4 participants