-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Uniformity] Avoid marking all PHIs as divergent in join blocks #157808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Attempt to avoid marking PHIs divergent in join blocks. Only mark PHIs which contain values with sync dependence on the divergent terminator condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor things...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some inline comments, but I think they ultimately boil down to this:
What I suspect the change should really be doing is to open-code isConstantOrUndefValuePhi
, but skip over predecessors that aren't on any of the diverged paths.
Does that make sense?
I agree. I already had the feeling that half of this was open coding There is now one test change where I PHI becomes uniform. One oddity is that |
@ssahasra it would be great if you could take a close look at this. |
%merge.y = phi i32 [ 42, %X ], [ %b, %B ] | ||
ret void | ||
; CHECK: DIVERGENT: %div.merge.y = | ||
; CHECK-NOT: DIVERGENT: %merge.y = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is wrong. Once we go entry->H->G, there is a divergent branch which can choose either G->X->Y or G->L->C->H->B->Y (!). Therefore the choice of whether we arrive at Y from X or from B is divergent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've fixed this by reinstating my previous code which respected temporal divergence.
- Address reviewer comments
// Track common value for all inputs. | ||
if (!PhiCommon) | ||
PhiCommon = Values[I]; | ||
else if (Values[I] != *PhiCommon) | ||
HasSingleValue = false; | ||
|
||
// Only consider predecessors on divergent path. | ||
if (Blocks[I] != &DivTermBlock && | ||
!DivDesc.BlockLabels.lookup_or(Blocks[I], nullptr)) | ||
continue; | ||
|
||
// Phi uniformity is maintained if all values on divergent path match. | ||
if (!PathCommon) | ||
PathCommon = Values[I]; | ||
else if (Values[I] != *PathCommon) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not able to understand the intuition behind considering the values on a divergent path here? Isn't HasSingleValue = false
sufficient here for marking it divergent
?
if we really don't want to consider a non-divergent path, can we skip that early to make logic cleaner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasSingleValue
here exists because this loop has absorbed the logic from isConstantOrUndefValuePhi
.
We want to track both if the PHI has a single common value across all inputs, or failing that, if it has a single value across all predecessors on the same divergent path.
It should be uniform in both cases -- the existing code only handles the trivial case that the PHI only has a single value across all inputs.
The new test cases provide examples of this uniformity.
In the case when all the |
|
||
// Allow an empty Blocks/Values list to signify getPhiInputs is not | ||
// implemented; in which case no uniformity is possible. | ||
bool Uniform = !Values.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just exit early at this point instead of going through the for-loop below over all incoming blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loop is already predicated on Uniform
so it won't go through blocks in this case.
if (!PathCommon) | ||
PathCommon = Values[I]; | ||
else if (Values[I] != *PathCommon) | ||
Uniform = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can break here and just skip the remaining blocks. It might be possible to eliminate the Uniform
variable if we moved the for-loop into a separate function or lambda with early returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can only break here if HasSingleValue
is also false -- the loop condition will handle this.
Uniform = false; | ||
|
||
// Phi is reached via divergent exit (i.e. respect temporal divergence). | ||
if (DivDesc.CycleDivBlocks.contains(Blocks[I])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe check this first and avoid all other work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can move it up in the loop a bit.
for (unsigned I = 0, E = Phi->getNumIncomingValues(); I != E; ++I) { | ||
const Value *Incoming = Phi->getIncomingValue(I); | ||
const BasicBlock *Block = Phi->getIncomingBlock(I); | ||
// FIXME: should this also consider Incoming == &Instr undef? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all the dominating inputs are well-defined, then this self-reference is not undef, right? If at this point we find that indeed Incoming == &Instr
, then I am not able to decide if treating it as undef
is always correct. Ultimately, it the uniformity itself has to be false
unless proven otherwise.
@@ -0,0 +1,114 @@ | |||
; RUN: opt %s -mtriple amdgcn-- -passes='print<uniformity>' -disable-output 2>&1 | FileCheck %s | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be useful to have one or two lines explaining what's the distinguishing pattern in each function compared to other functions in the file.
; CHECK: DIVERGENT: %div.val = | ||
ret void | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a test with temporal divergence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the existing tests cover this.
I have refactor the loop, and change variable naming to try and make its function more obvious. |
|
||
// Phi is reached via divergent exit (i.e. respect temporal divergence). | ||
if (DivDesc.CycleDivBlocks.contains(Blocks[I])) { | ||
UniformOnPath = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, HasSingleValue might still be true, and then the PHINode will not get marked as divergent. But temporal divergence means that even the same input can have different values for different threads. Either this check should happen early to mark the PHINode divergent, or in fact this check is not needed at all, because eventually UA will mark all temporally divergent uses anyway when it calls propagateTemporalDivergence().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, HasSingleValue might still be true, and then the PHINode will not get marked as divergent.
Yes, this is what the current implementation prior to this patch does.
The key difference between this test and propagateTemporalDivergence
is whether it is related to the CFG edge or the value definition.
propagateTemporalDivergence
only marks PHIs if they use a value defined within a loop.
This marks the PHI based on its reachability from a divergent exit.
The test hidden_doublebreak_diverge
in hidden_loopdiverge.ll
highlights this.
The PHI %div.merge.y = phi i32 [ 42, %X ], [ %b, %B ]
only uses values defined outside the function, so the values themselves will not cause temporal divergence.
To borrow Jay's summary:
Once we go entry->H->G, there is a divergent branch which can choose either G->X->Y or G->L->C->H->B->Y. Therefore the choice of whether we arrive at Y from X or from B is divergent.
The divergent path via B only has a single value (%b
), so this PHI would not be marked divergent (UniformOnPath = true
) unless we consider the temporal divergence of the path as well.
Attempt to avoid marking PHIs divergent in join blocks. Only mark PHIs which contain values with sync dependence on the divergent terminator condition.