-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[red-knot] Support multiple overloads when binding parameters at call sites #16568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/cc @dhruvmanila |
| Type::Dynamic(_) | Type::Never => { | ||
| Ok(CallOutcome::Single(CallBinding::from_return_type(self))) | ||
| } | ||
| Type::Dynamic(_) | Type::Never => OverloadBinding::from_return_type(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the one remaining call to from_return_type. Removing it introduces a salsa cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind! Picking up the latest main lets me remove this too...
* main: [red-knot] Understand `typing.Callable` (#16493) [red-knot] Support unpacking `with` target (#16469) [red-knot] Attribute access and the descriptor protocol (#16416) [`pep8-naming`] Add links to `ignore-names` options in various rules' documentation (#16557) [red-knot] avoid inferring types if unpacking fails (#16530)
| Type::Dynamic(_) | Type::Never => { | ||
| Ok(CallOutcome::Single(CallBinding::from_return_type(self))) | ||
| } | ||
| Type::Dynamic(_) | Type::Never => OverloadBinding::from_return_type(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind! Picking up the latest main lets me remove this too...
| // TODO remove this constructor and construct always from `bind_call` | ||
| pub(crate) fn from_return_type(return_ty: Type<'db>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...which lets me address this TODO!
| let first_argument_is_none = | ||
| arguments.first_argument().is_some_and(|ty| ty.is_none(db)); | ||
|
|
||
| let signature = Signature::new( | ||
| Parameters::new([ | ||
| Parameter::new( | ||
| Some("instance".into()), | ||
| Some(Type::object(db)), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ), | ||
| if first_argument_is_none { | ||
| let not_none = Type::none(db).negate(db); | ||
| let overloads = Overloads::from_overloads([ | ||
| Signature::new( | ||
| Parameters::new([ | ||
| Parameter::new( | ||
| Some("instance".into()), | ||
| Some(Type::none(db)), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these updates will help separate call binding into two phases, since all of the Signatures that we construct no longer depend on the inferred argument types. (We still have special case logic to update the return type of the call based on argument types, but that won't conflict with the two-phase change)
| // TODO annotated return type on `__new__` or metaclass `__call__` | ||
| // TODO check call vs signatures of `__new__` and/or `__init__` | ||
| Type::ClassLiteral(ClassLiteralType { .. }) => { | ||
| let signature = | ||
| Signature::new(Parameters::gradual_form(), Some(self.to_instance(db))); | ||
| let binding = bind_call(db, arguments, &signature.into(), self); | ||
| binding.into_outcome() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That's exactly what I am changing.
I see that you've extracted the known classes into separate arms with hard coded signatures. That's awesome as it improves the current code.
However, as we've decided yesterday I was going to add handling of __new__, which should enable signature inference from the typeshed.
Not a big deal. If this PR is merged first I'll rebase and adjust accordingly. If mine happens to be first, these additional arms may or may not be necessary.
This all looks great and I like simpler API to create the outcome! Thanks for keeping me updated too!
| # error: [missing-argument] "No arguments provided for required parameters `instance`, `owner` of overload 1 of wrapper descriptor `FunctionType.__get__`" | ||
| # error: [missing-argument] "No argument provided for required parameter `instance` of overload 2 of wrapper descriptor `FunctionType.__get__`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not suggesting that we have to solve this in your PR but raising multiple diagnostics doesn't seem to be a great user experience.
I think I'd expect a diagnostic in the form of:
- Failed to call
FunctionType.__get__because - No arguments provided for the required parameters
instance, andownerof the overload defined here (...code frame) - No arguments provided for the required parameter
instanceof the overload defined here (...code frame).
However, this does raise an interesting question and may require us to rethink our diagnostic groups because I'd expect that we use the same pattern when one overload can't be called because of a missing argument and another because of an invalid argument type (or too many positional etc)
Unless emitting a single diagnostic would change your approach in this PR, then I think it's worthwhile to take a closer look on how we want to support this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like mypy's diagnostics for overloads quite a lot. I find them very readable and helpful: https://mypy-play.net/?mypy=latest&python=3.12&gist=7bd36f8b5b3fce49a59f4bdfb34c5fef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't change the binding machinery. For now, I've updated this PR to print out a single new no-matching-overload error, mimicking Alex's mypy link, which is only shown when there are multiple overloads in the thing you're trying to call. If you call a non-overloaded function, you get the existing multiple diagnostics for each parameter error.
The information is there to add sub-diagnostics about how each particular overload failed, which I think we should tackle in future PRs.
* main: [red-knot] Add support for calling `type[…]` (#16597) Update migration guide with the new `ruff.configuration` (#16567) [red-knot] Add 'mypy_primer' workflow (#16554) Update Rust crate indoc to v2.0.6 (#16585) Update Rust crate syn to v2.0.100 (#16590) Update Rust crate thiserror to v2.0.12 (#16591) Update Rust crate serde_json to v1.0.140 (#16589) Update Rust crate quote to v1.0.39 (#16587) Update Rust crate serde to v1.0.219 (#16588) Update Rust crate proc-macro2 to v1.0.94 (#16586) Update Rust crate anyhow to v1.0.97 (#16584) Update dependency ruff to v0.9.10 (#16593) Update Rust crate unicode-ident to v1.0.18 (#16592) [red-knot] Do not ignore typeshed stubs for 'venv' module (#16596) [red-knot] Reduce Salsa lookups in `Type::find_name_in_mro` (#16582) Fix broken red-knot property tests (#16574) [red-knot] Consistent spelling of "metaclass" and "meta-type" (#16576)
|
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent, thank you!
None of my comments are blocking, any/all can become TODO comments (if that).
| can. | ||
|
|
||
| ```py | ||
| type("Foo", ()) # error: [no-matching-overload] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this test located in invalid_argument_type.md? It looks to me like this is an arity error, not an argument type issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved it into a new no_matching_overloads.md file
| // TODO: This checks every overload. Consider short-circuiting this loop once we find the first | ||
| // overload that is a successful match against the argument list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a fair bit more complexity involved in fully binding overloads according to the spec. See python/typing#1839 for the proposed spec, which isn't merged yet but will likely be merged in something similar to that form.
(Not suggesting to do more of it in this PR, just that the TODO comment could be broader.)
One particular item that might be worth noting in the context of your next steps around splitting call checking, is that the spec says overloads should first all be checked for arity match, and only matching ones should be checked for type assignability. I don't think this actually changes the effective semantics, but it should be better for perf. (It might just be what falls out naturally with call splitting.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link to the proposed spec, that's very helpful!
I've updated this TODO to call out that we'll only type-check against overloads whose arity matches.
| /// If the callable has multiple overloads, the first one that matches is used as the overall | ||
| /// binding match. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should perhaps be a TODO here as well, as the specified overload-matching algorithm has quite a few wrinkles missing from this description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| /// Returns whether there were any errors binding this call site. If the callable has multiple | ||
| /// overloads, they must _all_ have errors. | ||
| pub(crate) fn has_binding_errors(&self) -> bool { | ||
| self.overloads | ||
| .iter() | ||
| .all(OverloadBinding::has_binding_errors) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be simpler to frame (and implement) this as self.matching_overload().is_none()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| if let [overload] = self.overloads.as_ref() { | ||
| return overload.return_type(); | ||
| } | ||
| UnionType::from_elements(db, self.overloads.iter().map(OverloadBinding::return_type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should actually do this. It can cause a lot of bogus cascading errors due to getting an argument wrong at a call site. It is likely the case that subsequent code may be (correctly) written to handle only one or some of the possible overload results, and evaluating it as if it were required to handle any possible overload result will probably result in false positives.
I think this should just be Unknown (or arguably Never, but we've typically preferred Unknown in such cases) instead.
(Of course we should keep the prior special case in case there's actually just one "overload"; in that case we know the correct return type.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| } | ||
| KnownFunction::AssertType => { | ||
| if let [actual_ty, asserted_ty] = binding.parameter_types() { | ||
| if let [actual_ty, asserted_ty] = overload.parameter_types() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Orthogonal to this PR, but since you've changed this line: should this use overload.two_parameter_types() as below?
Or alternately, do we not really need those per-arity methods, when it's this easy to just match directly on overload.parameter_types()?
(Feel free to do nothing with this comment in this PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or alternately, do we not really need those per-arity methods, when it's this easy to just match directly on
overload.parameter_types()?
Went with this. The main drawback is that matching against the slice gives you &Type references, so there are a couple of places where you have to add some dereferences.
| let not_none = Type::none(db).negate(db); | ||
| let overloads = Overloads::from_overloads([ | ||
| Signature::new( | ||
| Parameters::new([ | ||
| Parameter::new( | ||
| Some(Name::new_static("instance")), | ||
| Some(Type::none(db)), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ), | ||
| Parameter::new( | ||
| Some("owner".into()), | ||
| Some(Name::new_static("owner")), | ||
| Some(KnownClass::Type.to_instance(db)), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ) | ||
| } else { | ||
| ), | ||
| ]), | ||
| None, | ||
| ), | ||
| Signature::new( | ||
| Parameters::new([ | ||
| Parameter::new( | ||
| Some("owner".into()), | ||
| Some(Name::new_static("instance")), | ||
| Some(not_none), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ), | ||
| Parameter::new( | ||
| Some(Name::new_static("owner")), | ||
| Some(UnionType::from_elements( | ||
| db, | ||
| [KnownClass::Type.to_instance(db), Type::none(db)], | ||
| )), | ||
| ParameterKind::PositionalOnly { | ||
| default_ty: Some(Type::none(db)), | ||
| }, | ||
| ) | ||
| }, | ||
| ]), | ||
| if function.has_known_class_decorator(db, KnownClass::Classmethod) | ||
| && function.decorators(db).len() == 1 | ||
| { | ||
| if let Some(owner) = arguments.second_argument() { | ||
| Some(Type::Callable(CallableType::BoundMethod( | ||
| BoundMethodType::new(db, function, owner), | ||
| ))) | ||
| } else if let Some(instance) = arguments.first_argument() { | ||
| Some(Type::Callable(CallableType::BoundMethod( | ||
| BoundMethodType::new(db, function, instance.to_meta_type(db)), | ||
| ))) | ||
| } else { | ||
| Some(Type::unknown()) | ||
| } | ||
| } else { | ||
| Some(match arguments.first_argument() { | ||
| Some(ty) if ty.is_none(db) => Type::FunctionLiteral(function), | ||
| Some(instance) => Type::Callable(CallableType::BoundMethod( | ||
| BoundMethodType::new(db, function, instance), | ||
| )), | ||
| _ => Type::unknown(), | ||
| }) | ||
| }, | ||
| ); | ||
| ), | ||
| ]), | ||
| None, | ||
| ), | ||
| ]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this depends on any context from the current call; what with union construction and all, it feels like a lot of work to be unnecessarily repeating on every call to any __get__ method? I wonder how much difference it would make to memoize this behind a zero-argument Salsa query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I had also thought about a LazyLock, but that wouldn't work because of how you need the db to construct things. The nullary Salsa query is a good suggestion!
| // Here, we also model `types.FunctionType.__get__`, but now we consider a call to | ||
| // this as a function, i.e. we also expect the `self` argument to be passed in. | ||
|
|
||
| let second_argument_is_none = | ||
| arguments.second_argument().is_some_and(|ty| ty.is_none(db)); | ||
|
|
||
| let signature = Signature::new( | ||
| Parameters::new([ | ||
| Parameter::new( | ||
| Some("self".into()), | ||
| Some(KnownClass::FunctionType.to_instance(db)), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ), | ||
| Parameter::new( | ||
| Some("instance".into()), | ||
| Some(Type::object(db)), | ||
| ParameterKind::PositionalOnly { default_ty: None }, | ||
| ), | ||
| if second_argument_is_none { | ||
| let not_none = Type::none(db).negate(db); | ||
| let overloads = Overloads::from_overloads([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like we ought to be able to unify this with the previous case, which is a degenerate case. E.g. do we really need both signatures, or just this one and call it with prepended self type above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a TODO for this one
dhruvmanila
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes looks good to me 👍
I don't think this needs to be addressed in this PR but I think using the term "overloads" for a function with a single signature (no @overload) is a bit confusing although I do see why that might be required. One possibility would be to use Signatures (plural) instead of Overloads which can either be a Single signature or an Overloaded variant which has a list of signatures.
Per discussion in Discord, I've renamed this to We'll also want to mimic that structure on the binding side, where (as of this PR) we have |
* main: [red-knot] Rework `Type::to_instance()` to return `Option<Type>` (#16428) [red-knot] Add tests asserting that `KnownClass::to_instance()` doesn't unexpectedly fallback to `Type::Unknown` with full typeshed stubs (#16608) [red-knot] Handle gradual intersection types in assignability (#16611) [red-knot] mypy_primer: split installation and execution (#16622) [red-knot] mypy_primer: pipeline improvements (#16620) [red-knot] Infer `lambda` expression (#16547) [red-knot] mypy_primer: strip ANSI codes (#16604) [red-knot] mypy_primer: comment on PRs (#16599)
sharkdp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work — just two minor post-merge comments.
| /// Failing to provide the correct arguments to one of the overloads will raise a `TypeError` | ||
| /// at runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor point: it's not necessarily true that this will raise a TypeError. Calling a function with arguments of the wrong type can result in any kind of error, or none at all.
| /// @overload | ||
| /// def func(x: int): ... | ||
| /// @overload | ||
| /// def func(x: bool): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can choose something else as an example here, because int is a supertype of bool, which would mean that the second overload would never be selected(?).
| // TODO: This checks every overload. In the proposed more detailed call checking spec [1], | ||
| // arguments are checked for arity first, and are only checked for type assignability against | ||
| // the matching overloads. Make sure to implement that as part of separating call binding into | ||
| // two phases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I specifically called out the arity thing (that's step 1 and 2 in the proposed spec), but there are also a number of other wrinkles, like union-argument expansion (step 3), preference for variadic signature matches to variadic arguments (step 4), and special handling for non-fully-static argument types matching multiple arguments (step 5). Not at all urgent, but at some point would be good to broaden this TODO comment even more to cover these things, as long as we haven't yet implemented them.
EDIT: oh, actually perhaps this is sufficiently covered by the TODO on CallBinding below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, here I wanted to focus on the bit about only type-checking the overloads that arity-match
|
|
||
| /// Binding information for one of the overloads of a callable. | ||
| #[derive(Debug, Clone, PartialEq, Eq)] | ||
| pub(crate) struct OverloadBinding<'db> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you decide against the parallel rename here, where this would just be Binding, and we'd have CallableBinding and Bindings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had planned on doing that in a follow-on PR, which also renames/consolidates CallOutcome
This cleans up how we handle calling unions of types. #16568 adding a three-level structure for callable signatures (`Signatures`, `CallableSignature`, and `Signature`) to handle unions and overloads. This PR updates the bindings side to mimic that structure. What used to be called `CallOutcome` is now `Bindings`, and represents the result of binding actual arguments against a possible union of callables. `CallableBinding` is the result of binding a single, possibly overloaded, callable type. `Binding` is the result of binding a single overload. While we're here, this also cleans up `CallError` greatly. It was previously extracting error information from the bindings and storing it in the error result. It is now a simple enum, carrying no data, that's used as a status code to talk about whether the overall binding was successful or not. We are now more consistent about walking the binding itself to get detailed information about _how_ the binding was unsucessful. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: Carl Meyer <carl@astral.sh>
This updates the
SignatureandCallBindingmachinery to support multiple overloads for a callable. This is currently only used forKnownFunctions that we special-case in our type inference code. It does not yet update the semantic index builder to handle@overloaddecorators and construct a multi-signatureOverloadsinstance for real Python functions.While I was here, I updated many of the
try_callspecial cases to use signatures (possibly overloaded ones now) andbind_callto check parameter lists. We still need some of the mutator methods onOverloadBindingfor the special cases where we need to update return types based on some Rust code.