Skip to content

Conversation

perlfu
Copy link
Contributor

@perlfu perlfu commented Sep 10, 2025

Attempt to avoid marking PHIs divergent in join blocks. Only mark PHIs which contain values with sync dependence on the divergent terminator condition.

Attempt to avoid marking PHIs divergent in join blocks.
Only mark PHIs which contain values with sync dependence on the
divergent terminator condition.
Copy link
Collaborator

@dstutt dstutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor things...

Copy link
Collaborator

@nhaehnle nhaehnle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some inline comments, but I think they ultimately boil down to this:

What I suspect the change should really be doing is to open-code isConstantOrUndefValuePhi, but skip over predecessors that aren't on any of the diverged paths.

Does that make sense?

@perlfu
Copy link
Contributor Author

perlfu commented Sep 25, 2025

What I suspect the change should really be doing is to open-code isConstantOrUndefValuePhi, but skip over predecessors that aren't on any of the diverged paths.

I agree. I already had the feeling that half of this was open coding isConstantOrUndefValuePhi.
I have reworked the methodology inline with that.

There is now one test change where I PHI becomes uniform.
I think this case is correct, but have not spent too much time on it.

One oddity is that hasConstantOrUndefValue normally ignores self references in PHIs.
I found that apply that logic here produced uniformity results which I didn't agree with.
I have left comments in that indicate where this logic should be, but will remove them if this is merged.

@jayfoad
Copy link
Contributor

jayfoad commented Sep 25, 2025

@ssahasra it would be great if you could take a close look at this.

%merge.y = phi i32 [ 42, %X ], [ %b, %B ]
ret void
; CHECK: DIVERGENT: %div.merge.y =
; CHECK-NOT: DIVERGENT: %merge.y =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is wrong. Once we go entry->H->G, there is a divergent branch which can choose either G->X->Y or G->L->C->H->B->Y (!). Therefore the choice of whether we arrive at Y from X or from B is divergent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this by reinstating my previous code which respected temporal divergence.

Comment on lines 951 to 965
// Track common value for all inputs.
if (!PhiCommon)
PhiCommon = Values[I];
else if (Values[I] != *PhiCommon)
HasSingleValue = false;

// Only consider predecessors on divergent path.
if (Blocks[I] != &DivTermBlock &&
!DivDesc.BlockLabels.lookup_or(Blocks[I], nullptr))
continue;

// Phi uniformity is maintained if all values on divergent path match.
if (!PathCommon)
PathCommon = Values[I];
else if (Values[I] != *PathCommon)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not able to understand the intuition behind considering the values on a divergent path here? Isn't HasSingleValue = false sufficient here for marking it divergent?

if we really don't want to consider a non-divergent path, can we skip that early to make logic cleaner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HasSingleValue here exists because this loop has absorbed the logic from isConstantOrUndefValuePhi.
We want to track both if the PHI has a single common value across all inputs, or failing that, if it has a single value across all predecessors on the same divergent path.
It should be uniform in both cases -- the existing code only handles the trivial case that the PHI only has a single value across all inputs.
The new test cases provide examples of this uniformity.

@PankajDwivedi-25
Copy link
Contributor

In the case when all the divergent path has the same value and `hasSingleValue = false', are we considering it as uniform?


// Allow an empty Blocks/Values list to signify getPhiInputs is not
// implemented; in which case no uniformity is possible.
bool Uniform = !Values.empty();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just exit early at this point instead of going through the for-loop below over all incoming blocks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loop is already predicated on Uniform so it won't go through blocks in this case.

if (!PathCommon)
PathCommon = Values[I];
else if (Values[I] != *PathCommon)
Uniform = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can break here and just skip the remaining blocks. It might be possible to eliminate the Uniform variable if we moved the for-loop into a separate function or lambda with early returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only break here if HasSingleValue is also false -- the loop condition will handle this.

Uniform = false;

// Phi is reached via divergent exit (i.e. respect temporal divergence).
if (DivDesc.CycleDivBlocks.contains(Blocks[I]))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe check this first and avoid all other work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move it up in the loop a bit.

for (unsigned I = 0, E = Phi->getNumIncomingValues(); I != E; ++I) {
const Value *Incoming = Phi->getIncomingValue(I);
const BasicBlock *Block = Phi->getIncomingBlock(I);
// FIXME: should this also consider Incoming == &Instr undef?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all the dominating inputs are well-defined, then this self-reference is not undef, right? If at this point we find that indeed Incoming == &Instr, then I am not able to decide if treating it as undef is always correct. Ultimately, it the uniformity itself has to be false unless proven otherwise.

@@ -0,0 +1,114 @@
; RUN: opt %s -mtriple amdgcn-- -passes='print<uniformity>' -disable-output 2>&1 | FileCheck %s

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be useful to have one or two lines explaining what's the distinguishing pattern in each function compared to other functions in the file.

; CHECK: DIVERGENT: %div.val =
ret void
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a test with temporal divergence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the existing tests cover this.

@perlfu
Copy link
Contributor Author

perlfu commented Oct 6, 2025

I have refactor the loop, and change variable naming to try and make its function more obvious.


// Phi is reached via divergent exit (i.e. respect temporal divergence).
if (DivDesc.CycleDivBlocks.contains(Blocks[I])) {
UniformOnPath = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, HasSingleValue might still be true, and then the PHINode will not get marked as divergent. But temporal divergence means that even the same input can have different values for different threads. Either this check should happen early to mark the PHINode divergent, or in fact this check is not needed at all, because eventually UA will mark all temporally divergent uses anyway when it calls propagateTemporalDivergence().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, HasSingleValue might still be true, and then the PHINode will not get marked as divergent.
Yes, this is what the current implementation prior to this patch does.

The key difference between this test and propagateTemporalDivergence is whether it is related to the CFG edge or the value definition.
propagateTemporalDivergence only marks PHIs if they use a value defined within a loop.
This marks the PHI based on its reachability from a divergent exit.

The test hidden_doublebreak_diverge in hidden_loopdiverge.ll highlights this.
The PHI %div.merge.y = phi i32 [ 42, %X ], [ %b, %B ] only uses values defined outside the function, so the values themselves will not cause temporal divergence.

To borrow Jay's summary:

Once we go entry->H->G, there is a divergent branch which can choose either G->X->Y or G->L->C->H->B->Y. Therefore the choice of whether we arrive at Y from X or from B is divergent.

The divergent path via B only has a single value (%b), so this PHI would not be marked divergent (UniformOnPath = true) unless we consider the temporal divergence of the path as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants