Skip to content

Conversation

@asukaminato0721
Copy link
Contributor

@asukaminato0721 asukaminato0721 commented Oct 26, 2025

fix #793

Added collect_var_from_hint/hint_from_types to gather element candidates across union branches and reused them in the container decomposers to keep contextual hints from collapsing to the first branch

Taught list inference to try each union branch with its own hint before falling back, plus factored the existing logic into a reusable helper

Unblocked the regression test by removing the expected errors and bug marker now that the behavior is fixed

@meta-cla meta-cla bot added the cla signed label Oct 26, 2025
@asukaminato0721 asukaminato0721 changed the title 793 fix Contextual typing works only with the first element in a union #793 Oct 26, 2025
@asukaminato0721 asukaminato0721 marked this pull request as ready for review October 28, 2025 05:33
@rchen152 rchen152 requested a review from samwgoldman October 30, 2025 19:59
@yangdanny97
Copy link
Contributor

cc @samwgoldman @grievejia for review

@meta-codesync
Copy link

meta-codesync bot commented Nov 24, 2025

@stroxler has imported this pull request. If you are a Meta employee, you can view this in D87788142.

} else {
let elem_tys = self.elts_infer(&x.elts, elt_hint, errors);
self.stdlib.list(self.unions(elem_tys)).to_type()
if let Some(hint_ref) = hint.as_ref()
Copy link
Contributor

@stroxler stroxler Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asukaminato0721 do you remember why this change is needed for list, but set and dict work correctly out of the box given the unwrap.rs changes?

I'm not seeing why list is obviously different. Overall this looks pretty good to me, I'll try to get a second review on it as soon as I understand the list behavior a little better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, list literals are the only container literals that we feed into boolean short‑circuit contexts.

In those expressions we don’t want to collapse a contextual hint like list[A] | list[C] into a single “soft” hint for their elements, because list_with_hint otherwise builds one list[self.unions(elem_tys)], i.e. list[A | C], and that loses the ability to prove the result is assignable to either list[A] or list[C].

The new branch iteration lets us reuse the prior decomposition work from unwrap.rs verbatim. Still, it also keeps each branch’s element inference isolated, so we can short‑circuit as soon as we find a branch whose fully built list is a subtype of the contextual branch.

Without that, even though decompose_list now collects per-branch hints, the final list(...) we return is still “the union of all element hints”, which is exactly what flagged list[A|C] vs list[A] | list[C] in boolop_soft.


Set and dict literals don’t currently appear in any places where we need that stronger guarantee. We still decompose their hints per branch, but after unions, the result stays a single set[...]/dict[...].

Because those expressions aren’t fed back into another branch-sensitive context (like boolop mixing container unions), we haven’t yet found a case that needs the same “try each branch independently and bail out on first success” treatment.

If/when we run into a repro similar to the list case—for example, x or {1} against set[int] | set[str]—we can apply the same pattern there. For now, the regression tests only exercised the list form, hence the localized change.

Copy link
Contributor

@stroxler stroxler Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing along some notes from @samwgoldman, who has thought far more about contextual typing (and general issues with our constraint solver) than anyone else on the Pyrefly team:


After reading the diff somewhat closely, I think the main idea is good, but the specific execution is off.

The main idea is that a union-like annotation with N branches should be treated as N hints, i.e. Hint(Vec<Type>) not Hint(Type). For example x: T | U = e should contextually type e in the context of Hint(vec![T, U]). When we decompose a lint, we should decompose each branch of the hint separately (to avoid Var confusion). When we go to check an expression w.r.t. some hint, we treat multiple branches disjunctively -- only one branch needs to match.

This diff fixes the issue with Var confusion during decomposition by splitting union hints (collect_var_from_hint), but then packages the result back up into a union (hint_from_types). This is why the ExprList case needs to match on union in order to actually implement the disjunctive contextual typing behavior. This has two problems:

  1. It confuses the hint representations for list[A] | list[B] and list[A | B], since these are now the same. As a result, when checking against either case, we need to check against each of list[A] and list[B] and list[A, B]. We might have cases like list[BigUnionTypeAlias] which now check the RHS expression N+1 times unnecessarily.

  2. More importantly, if we don't check for the union case, we do the wrong thing! See the following cases:

xs: dict[int, int] | dict[str, str] = {} # error: `dict[int | str, int | str]` is not assignable to `dict[int, int] | dict[str, str]`
ys: set[int] | set[str] = reveal_type({1}) # error: `set[int | str]` is not assignable to `set[int] | set[str]`

The dict case is particularly bad, since we can't recover dict[int, int] | dict[str, str] from dict[int | str, int | str].

(From me, not Sam: our internal CI includes an error delta against Pytorch that we haven't figured out how to reproduce in github cheaply and another one against instagram and this issue does seem to cause a significant number of new errors in both codebases).

There are a few other issues:

  • The disjunctive logic isn't disjunctive enough, and can leak errors between branches. We rarely emit errors when contextually typing expressions, but we do sometimes (for typed dict specifically).
from typing import TypedDict

class A(TypedDict):
    x: int

class B(TypedDict):
    x: str

xs: list[A] | list[B] = [{"x": "foo"}] # error: `Literal['foo']` is not assignable to TypedDict key `x` with type `int`

Note that the above errors when we consider the option of list[A], but then continues with the next one and actually succeeds. The error is from a failed disjunctive branch. This is a smaller bug, but one that needs to be fixed.

  • This diff implements it's own kind of map_over_union logic, but it only handles UnionType. We should probably also handle unions of the form type[A | B]. We should also handle Quantified types with union-typed bounds (I'm not sure if we handle bounded quantifieds even in the non-union case yet, though...). Do we need to handle Var? I wouldn't expect to see Var from an annotation type, but maybe it comes up?

My preference is to more-or-less start over, using the same idea, but a different approach:

  1. Change the representation of Hint to be a Vec1 of types in disjunctive normal form.
  2. Instead of trying to break up the unions when we go to decompose, we should do this at construction of the hint. Decomposition doesn't create new unions, so this seems like the wrong place to split the unions.

I had some additional ideas to optimize the disjunctive logic, to avoid re-inferring expressions unnecessarily, but this is just an optimization.

Added a union-aware fast path in expr_infer_with_hint plus branch-aware
list inference so we can explore hints that don’t contain type variables
before the generic ones. Together with the new type_contains_var (now
driven by Type::universe) and prefer_union_branch_without_vars,
contextual typing no longer eagerly binds callable/lambda type vars when
a non-generic branch of a union would have sufficed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Contextual typing works only with the first element in a union

4 participants