Enable professional-grade mypy #129
46dca74
to
9173823
Compare
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@           Coverage Diff           @@
##            main    #129    +/-   ##
=======================================
+ Coverage   85.9%   86.9%   +1.0%
=======================================
  Files        227     228      +1
  Lines       7583    8141    +558
=======================================
+ Hits        6514    7081    +567
+ Misses      1069    1060      -9
If we adopt this PR, we should also present a mypy badge in our readme :)
pyproject.toml
Outdated
plugins = ['numpy.typing.mypy_plugin', 'pandera.mypy', 'pydantic.mypy', 'sqlalchemy.ext.mypy.plugin']
disallow_untyped_defs = true
disallow_any_unimported = true
# disallow_any_generics = true
Another note: while working on this PR, I stumbled upon the setting `disallow_any_generics = true`. It is not active as of now, as that would introduce a few hundred more errors, but it would be an even stricter rule we could employ (to follow the spirit of using the strictest possible settings for mypy). This rule would prevent us from returning `dict` or `list` somewhere: all such generics would need to be explicitly specified as `list[int]` or some such. Happy to hear your thoughts on whether we should do that.
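To illustrate what the stricter setting would ask for, here is a minimal sketch. The function names are made up for this example and are not taken from ixmp4. With `disallow_any_generics = true`, mypy would flag the bare `list` and `dict` in the first function for missing type parameters, while the second one passes:

```python
# Bare generics: `list` and `dict` silently mean list[Any] and dict[Any, Any],
# which is what disallow_any_generics rejects.
def tally_untyped(values: list) -> dict:
    counts: dict = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return counts


# Parameterised generics: every container states what it holds.
def tally(values: list[int]) -> dict[int, int]:
    counts: dict[int, int] = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return counts
```

Both functions behave identically at runtime; only the second gives mypy enough information to check callers.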
If we are already at it and it's a comparatively small amount of work, I would tend to go ahead! "If you're going to do it, do it properly."
Thanks for this suggestion, this PR now enables all strict mypy rules except for `no_implicit_reexport`, which we discussed in person and wanted to keep for another PR as it might change the API.
Hi, what a feat! Thanks a lot, this is very cool.
I couldn't find what you mentioned in point 5. It sounds like that should be possible, but I've seen ParamSpec for the first time in this PR and couldn't tell you exactly what to do.
I also left some comments inline.
ixmp4/data/abstract/model.py
Outdated
name__notlike: str
name__notilike: str
iamc: (
    dict[
Can we not solve this with a more general, recursive type-hint?
For example:
FilterDict = dict[
    str,
    int
    | str
    | Iterable[int]
    | Iterable[str]
    | "FilterDict"
]
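A runnable variant of this idea, as a sketch only: it spells the union with `typing.Union` so that the string forward reference works at runtime regardless of which `Iterable` is imported, and it assumes a mypy version with recursive type aliases enabled. `count_leaves` is a hypothetical helper added to show that the nesting is usable:

```python
from collections.abc import Iterable
from typing import Union

# Recursive alias: a filter value is a scalar, an iterable of scalars,
# or another filter dictionary.
FilterDict = dict[
    str,
    Union[int, str, Iterable[int], Iterable[str], "FilterDict"],
]


def count_leaves(filters: FilterDict) -> int:
    """Count all non-dict values in an arbitrarily nested filter dict."""
    total = 0
    for value in filters.values():
        if isinstance(value, dict):
            total += count_leaves(value)
        else:
            total += 1
    return total
```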
Okay, the update is here :) I still need to do some cleanup, I want to look into your suggestion of recursive hints (and maybe employ some |
e44a0c9
to
72f9bb1
Compare
This took longer than expected again, but here it is: calling out every single thing I had a question about or an issue with! I'll push one or two more clean-up commits with things I noticed while going through all files, but after that, I'm happy with this PR :)
if res.status_code != 200:
    raise ManagerApiError(f"[{str(res.status_code)}] {res.text}")
return res.json()
# TODO Can we really assume this type?
return cast(dict[str, Any], res.json())
This seems to work based on all tests we have, but I'm not sure if this is actually true since `res.json()` is just annotated to return `Any`. I think we can leave this, but please let me know if this should be tested in some way.
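If we wanted a runtime check instead of trusting `cast`, one option could look like the sketch below. `ensure_json_object` is a hypothetical helper, not something that exists in ixmp4; it narrows the `Any` payload with `isinstance` so that mypy and the running program agree on the type:

```python
from typing import Any


def ensure_json_object(payload: object) -> dict[str, Any]:
    """Return payload unchanged if it is a JSON object, else raise TypeError."""
    if not isinstance(payload, dict):
        raise TypeError(f"Expected a JSON object, got {type(payload).__name__}")
    if not all(isinstance(key, str) for key in payload):
        raise TypeError("Expected all keys of the JSON object to be strings")
    return payload
```

Unlike `cast`, which does nothing at runtime, this would fail loudly if the API ever returned a list or a scalar.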
# TODO can't find a similar open issues, should I open one?
# For reference, this occurred after adding disallow_subclassing_any
# NOTE mypy seems to say Exception has type Any, don't see how we could change that
class RemoteExceptionMeta(ExcMeta): # type: ignore[misc]
Mypy complained here about subclassing from `Any`, which I don't see how we could avoid if `Exception` really inherits from that. But in that case, I would have expected this to be fairly common and noted in some mypy issue, but I couldn't find any. Should I open one or do you know what's going on here?
    return None
elif isinstance(variable, str):
    # TODO this check seems kind of redundant...
    if isinstance(variable, str):
This check seems kind of redundant since `variable` can only be a string, anyway. But I didn't know what to do about the error: would we instead want to raise some error if the variable name is unknown?
    return None
elif isinstance(model, str):
    # TODO this check seems kind of redundant...
    if isinstance(model, str):
Same as for iamc/variable: what should we do about this error message when removing the if-check?
    return None
elif isinstance(scenario, str):
    # TODO this check seems kind of redundant...
    if isinstance(scenario, str):
Another redundant check: what do we do with the error?
@@ -84,6 +87,7 @@ def add(

@guard("view")
def get(self, run_id: int, name: str) -> Variable:
    # TODO by avoiding super().select, don't we also miss out on filters and auth?
Something I noticed while working on this PR, likely warrants its own follow-up issue.
# raise ProgrammingError(
# "Field argument `lookups` must be `list` of `str`."
# )
override_lookups = _ensure_str_list(jschema_lookups)
This solution isn't ideal, as mentioned in the TODO comment above, but it seems to never be called (at least by our test suite), so the performance impact should be negligible. Please raise better ideas, though :)
def deepgetattr(obj, attr):
# TODO How to type hint this to return Repos with .list() and .count()?
def deepgetattr(obj: Backend, attr: str) -> Any:
This type hint is not great, but I don't know how to make it better.
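One possible direction, sketched under assumptions: a `Protocol` that only promises `.list()` and `.count()`, combined with `cast`. The protocol name `Enumerable`, the `object` parameter type, and the attribute-walking body are all hypothetical here, since the actual implementation of `deepgetattr` is not shown in this thread:

```python
from collections.abc import Sequence
from typing import Any, Protocol, cast


class Enumerable(Protocol):
    """Hypothetical protocol: anything that offers list() and count()."""

    def list(self) -> Sequence[Any]: ...

    def count(self) -> int: ...


def deepgetattr(obj: object, attr: str) -> Enumerable:
    """Follow a dotted attribute path and treat the result as Enumerable."""
    target: Any = obj
    for name in attr.split("."):
        target = getattr(target, name)
    # cast() is not checked at runtime; it only tells mypy what to expect.
    return cast(Enumerable, target)
```

This keeps the dynamic lookup untyped internally but gives callers a checked interface, at the cost of trusting that every path really ends in such an object.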
platform: ixmp4.Platform,
# TODO: improve type hints for fixtures
profiled: Any,
benchmark: Any,
I haven't figured out how to properly annotate these fixtures. Do you know? If not, should we figure this out or accept them as they are now?
ixmp4/data/api/base.py
Outdated
# TODO The order is important here at the moment, in particular having int before
# float! This should likely not be the case, but using StrictInt and StrictFloat
# from pydantic only created even more errors.
data: (
    list[
        list[
            bool
            | datetime
            | int
            | float
            | str
            | dict[str, Any]
            # TODO should be able to remove this once PR#122 is merged
            | list[float | int | str]
            | None
        ]
    ]
    | None
)
This type hint is not great as explained in the comment. I think I really don't like pydantic using the order in type hints to coerce types, because that's something that I wouldn't expect. Here, it seems to be important that `int` come before `float`. Similar to `meta`, I tried using `StrictInt` etc., but this broke some other tests. Please let me know if you have a good idea for this.
* Complete type hints, even for tests
* Bump mypy to latest release version
72f9bb1
to
3fb15b9
Compare
Finally, here it is!
After reading this blog post about "professional-grade" mypy, I began adopting the professional-grade config for ixmp4. This PR is the result of the process.
It type hints everything, though some type hints might still need relaxation or tightening. Even our test suite is type-hinted here. This allows us to trust all variables we're dealing with have the correct type. In particular, refactoring can be done with improved confidence as mypy would catch a wide variety of bugs if the refactoring doesn't fit the existing code. Having these type hints also helps with readability, I think, as it is one kind of documentation in itself. I found this process especially insightful for understanding how the filters work :)
Of course, this PR also bumps mypy to its latest version.
That being said, there are a few points where I think we can still improve the PR, or add improvements through a future PR:
1. For `create()` functions in the DB layer, I often kept the `*args` and `**kwargs` since the facade layer often uses positional arguments, while the rest/api layer uses keywords. This is different from `tabulate()` and `list()`, which are generally only called via `enumerate()` and with kwargs, so this would be a place to streamline the signatures.
2. In `data/abstract/meta`, I needed to type hint the `_type_map`, for which I needed to use the built-in `type`. This led to confusion for mypy since `meta` already had an attribute called `type`. I have renamed the attribute to `dtype` to enable the type hint.
3. For `kwargs`, I have learned that the best way to type hint them is by creating a `TypedDict` subclass, which explicitly spells out all possible kwargs and their types. I have not done so for some `/base/` functions, which mainly act as templates or pass on all `*args, **kwargs` that they receive. When `args` or `kwargs` were unused, I sometimes removed them from the signature and at other times type-hinted them as `Any`, depending on how I understood the function's purpose. So there are currently two approaches visible, which come down to different understandings of type hints, I suppose:
   On the one hand, one could try to spell out everything explicitly as it is currently used. This improves the documentation aspect and enables mypy to catch more bugs before they reach production. On the other hand, this would block newly introduced kwargs until they are added to these kwargs classes, which increases the maintenance burden.
   Alternatively, one could convey the purpose of the function through the type hints, being more general. The pros and cons are essentially the inverse of the above.
   For code that I didn't write, it's harder to correctly guess the purpose of functions, so I have generally opted more for the first approach. However, I'm open to discussion here.
4. `EnumerateKwargs` could often inherit from `CreateKwargs` at the moment. For the filter classes, spelling out everything like `name__like` is also lengthy; we could e.g. think of using mixins there.
5. For `/base/` functions that pass on their arguments, I used `Any` because I couldn't get the alternative to work: in theory, this sounds like a job for `ParamSpec`: "They are used to forward the parameter types of one callable to another callable". However, I couldn't figure out exactly how to do that. On that note, if you know a good tutorial on `ParamSpec`, I'd be happy to read it as this almost seems like magic to me right now.

Lastly, I applied some `type: ignore` markers where I couldn't quite understand why mypy was complaining or where I intend to refactor the code very soon, anyway. In principle, though, almost none of that should be necessary, strictly speaking. "Code I want to change soon" mainly refers to the DB model for parameters, equations, and variables, where I expect a more performant model can make do without the whole `column` thing. And I would maybe also drop `table` entirely since this is used nowhere in the tutorials. This is less certain, however, which is why I still type-hinted `table`.

Note
As so nicely displayed by codecov, we don't have 100% test coverage yet (that's another juicy small project goal...). So when reviewing this, please use the branch like `main` might be used to ensure that things we don't test did not get broken. In particular, this probably means running some CLI commands from this branch.
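As a rough illustration of the `TypedDict` approach to kwargs described above, including the idea of letting `EnumerateKwargs` inherit from `CreateKwargs`: the field names `name` and `unit` and the function `enumerate_items` are assumptions made up for this sketch, not the actual ixmp4 definitions.

```python
from typing import TypedDict

try:  # Unpack lives in typing from Python 3.11 onwards
    from typing import Unpack
except ImportError:  # older interpreters need the typing_extensions backport
    from typing_extensions import Unpack


class CreateKwargs(TypedDict, total=False):
    # total=False makes every key optional.
    name: str
    unit: str


class EnumerateKwargs(CreateKwargs, total=False):
    # Inherits name and unit, then adds a filter-style lookup.
    name__like: str


def enumerate_items(**kwargs: Unpack[EnumerateKwargs]) -> list[str]:
    """Return the names of the keyword arguments that were passed, sorted."""
    return sorted(kwargs)
```

With this signature, mypy should reject unknown keywords such as `enumerate_items(nme="a")`, which is the bug-catching benefit of the first approach; the maintenance cost is that every new kwarg has to be added to one of the classes.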