-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LIU-383: Initial work towards linter workflows #266
Conversation
Initial work to get our CI providing Error reports when running the pylint workflow.
Initial work to get our CI providing Error reports when running the pylint workflow.
Reviewer's Guide by SourceryThis pull request (PR) addresses the issue LIU-383 by enabling the 'Error' level of pylint messages in the CI workflows. The changes include fixing various pylint errors across multiple files, updating the linting configuration, and adding or modifying code to comply with pylint's error-level checks. File-Level Changes
Tips
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @myxie - I've reviewed your changes and found some issues that need to be addressed.
Blocking issues:
- String interpolation issue (link)
Here's what I looked at during the review
- 🔴 General issues: 1 blocking issue, 18 other issues
- 🟢 Security: all looks good
- 🟡 Testing: 4 issues found
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
else: | ||
param = default_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Removed else block
The else block that assigns default_value to param has been removed. Ensure that this change does not affect the logic where param should be assigned default_value if no other conditions are met.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need this any more now that we are initialising the param
variable with the default, and then modifying the value based on the conditionals. This protects us from a potential variable undeclared in scope error.
daliuge-engine/dlg/apps/app_base.py
Outdated
raise Exception( | ||
"%r: More effective inputs (%d) than inputs (%d)" | ||
"%r: More effective inputs (%s) than inputs (%d)" | ||
% (self, self.n_effective_inputs, n_inputs) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Raise a specific error instead of the general Exception
or BaseException
(raise-specific-error
)
Explanation
If a piece of code raises a specific exception type rather than the generic [`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException) or [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception), the calling code can:- get more information about what type of error it is
- define specific exception handling for it
This way, callers of the code can handle the error appropriately.
How can you solve this?
- Use one of the built-in exceptions of the standard library.
- Define your own error class that subclasses
Exception
.
So instead of having code raising Exception
or BaseException
like
if incorrect_input(value):
raise Exception("The input is incorrect")
you can have code raising a specific error like
if incorrect_input(value):
raise ValueError("The input is incorrect")
or
class IncorrectInputError(Exception):
pass
if incorrect_input(value):
raise IncorrectInputError("The input is incorrect")
@@ -133,7 +133,7 @@ def setCompleted(self): | |||
|
|||
@property | |||
def dataURL(self) -> str: | |||
return "ngas://%s:%d/%s" % (self.ngasSrv, self.ngasPort, self.fileId) | |||
return "ngas://%s:%s/%s" % (self.ngasSrv, self.ngasPort, self.fileId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Replace interpolated string formatting with f-string (replace-interpolation-with-fstring
)
return "ngas://%s:%s/%s" % (self.ngasSrv, self.ngasPort, self.fileId) | |
return f"ngas://{self.ngasSrv}:{self.ngasPort}/{self.fileId}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be taken care of another day when addressing Pylint's warnings .
daliuge-engine/dlg/drop.py
Outdated
return "<%s oid=%s, uid=%s>" % ( | ||
self.__class__.__name__, | ||
self.oid, | ||
self.uid, | ||
"self.oid", | ||
"self.uid", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): We've found these issues:
- Replace interpolated string formatting with f-string (
replace-interpolation-with-fstring
) - Simplify unnecessary nesting, casting and constant values in f-strings (
simplify-fstring-formatting
)
return "<%s oid=%s, uid=%s>" % ( | |
self.__class__.__name__, | |
self.oid, | |
self.uid, | |
"self.oid", | |
"self.uid", | |
) | |
return f"<{self.__class__.__name__} oid=self.oid, uid=self.uid>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be addressed at some point in the future as we transition to removing all the Pylint warnings.
daliuge-engine/dlg/graph_loader.py
Outdated
roots: List[AbstractDROP] = [] | ||
for drop in drops.values(): | ||
if not droputils.getUpstreamObjects(drop): | ||
roots.append(drop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Convert for loop into list comprehension (list-comprehension
)
roots: List[AbstractDROP] = [] | |
for drop in drops.values(): | |
if not droputils.getUpstreamObjects(drop): | |
roots.append(drop) | |
roots: List[AbstractDROP] = [ | |
drop | |
for drop in drops.values() | |
if not droputils.getUpstreamObjects(drop) | |
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great.
@@ -73,7 +74,7 @@ | |||
""" | |||
block_a = DlgSharedMemory("A") | |||
data = pickle.dumps(3) | |||
block_a.buf[0 : len(data)] = data | |||
block_a.buf[0: len(data)] = data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] (remove-redundant-slice-index
)
block_a.buf[0: len(data)] = data | |
block_a.buf[:len(data)] = data |
@@ -139,7 +139,7 @@ def __exit__(self, typ, value, traceback): | |||
|
|||
def _get_json(self, url): | |||
ret = self._GET(url) | |||
return json.load(ret) if ret else None | |||
return json.load(ret) if ret else {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated this to return a dictionary as we were not checking for None
when returning the value in get_json
. This means we could receive a TypeError: NoneType is not iterable
error at runtime.
Returning the dictionary iterable here is more 'Pythonic' as we are able to use the empty dictionary as an iterable regardless of how many elements are in it (i.e. duck-typing).
@@ -383,8 +383,8 @@ def submit_and_monitor_pgt(self): | |||
""" | |||
Combines submission and monitoring steps of a pgt. | |||
""" | |||
session_id = self.submit_pgt() | |||
monitoring_thread = self._monitor(session_id) | |||
self.submit_pgt() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.submit_pgt
was never returning session_id
(and from what I can tell, that would be a bit of work to derive). Hence, we've always been passing None
as a parameter all this time. It is cleaner to just remove this.
@@ -5,6 +5,8 @@ | |||
To run it standalone, change the directories, which are now hardcoded | |||
""" | |||
|
|||
# pylint: skip-file | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am skipping this file completely because it is using mstransform
, which is a CASA method that, at some point in time, was available in this code (although I'm not sure how/where it ever ran). This file is broken/un-runable code, but I don't want to remove it from the code base as I do not know it's use case. A decision can be made either now or in a future PR about what to do with it.
@@ -100,52 +100,6 @@ def __init__(self, jd, group_q, done_dict, ssid): | |||
# def __str__(self): | |||
# return json.dumps(self.jd) | |||
|
|||
@property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are being removed because there were duplicate definitions below. As Python is interpreted, these definitions would have been over-written and so the latter definitions are the ones we have been using anyway.
|
||
tw = 1 | ||
sz = 1 | ||
dst = "outputs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if "outputs"/"inputs" are what we want to be defaults here, but it is important to have defaults. An alternative may simply be empty strings.
@@ -513,7 +513,7 @@ def __init__(self, drop_list, max_dop=8, dag=None): | |||
else: | |||
self._dag = dag | |||
self._max_dop = max_dop | |||
self._parts = None # partitions | |||
self._parts = [] # partitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to changes above, in which we have None
types being returned but we are treating them like iterators. This makes it clear across the board that we only expect to have a list, which can be empty, rather than having to go to all the places self._parts
is mentioned and check for None
.
Hi @awicenec, Apologies that there's quite a few changes here; I thought it would be worthwhile starting this off, and then at least our pipelines are doing something. There's a (minimal) plan of action set out in the PR for future work that could be incremental; I'd be curious to hear your thoughts. This is lower priority so I don't expect to get a review before you're back from your travels. |
@@ -22,4 +22,4 @@ jobs: | |||
|
|||
- name: Run pylint | |||
run: | | |||
pylint daliuge-common daliuge-translator daliuge-engine | |||
pylint daliuge-common daliuge-translator daliuge-engine --fail-under=9 --fail-on=E |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These settings mean Pylint will return an error code of 0 unless we are either under 9 or fail on Errors.
We should be at a 10.0 currently, and should have no Errors after this PR. What these settings also allow is for us to enable Warnings (thus seeing them introduced in the code), without us Failing any tests. This means we can, if we want to, enable Warnings at some point whilst we address them incrementally, without it affecting the CI. Something to think about moving forward I expect.
We score a 9.47(?) with the Warnings enabled, so it's unlikely we'd ever go below 9. It is necessary to get the --fail-on=E
to work properly, however.
@awicenec just a 'bump' to have a look at this PR when you get the chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good start, indeed! Thanks for taking this on. We will to work in this direction quite a bit more in the future as well.
LIU-383: Initial work towards linter workflows
Issue
This PR partially address LIU-383: currently, we run pylint as part of our CI workflows, which is good practice. However, we also disable all checkers, so the value of running that workflow is questionable.
There are a number of levels of linter warnings, and it is rare that we would want all warnings enabled for all files and across such a large codebase. It would be valuable to have some form of linting moving forward, especially to improve and maintain the quality of the codebase.
Solution
This PR starts our progress to using linting across the codebase by enabling (only) the Error level of messages. This should alleviate the most significant potential issues that are currently in our code base and preventing them from being introduced in the future.
A few messages have been disabled globally due to their being false positives (e.g. when interfacing with a 3rd party API, or when the class MRO confuses the checker). I have also disabled some checks for particular files where there are Python version number checks that are not taking into account by Pylint.
I will put PR comments on non-trivial fixes to provide reasoning behind the decisions I have made.
Moving forward
I think we should start working towards adding Warnings to pylint, but do this on specific modules (or even sub-modules). I would start with
daliuge-translator
, as it has less warnings than the daliuge-engine, and also is less robust a piece of code. Pylint let's us specify the pylintrc file, so there is scope for having separate pylint settings for the different modules whilst we move towards the 'final' settings.Summary by Sourcery
This pull request introduces initial work towards enabling linter workflows by configuring pylint to check for error-level messages in the CI pipeline. It includes various code refactorings to improve code quality, consistency, and readability across multiple files. Additionally, it addresses false positives in pylint checks by adding appropriate disable comments.
daliuge-engine/test/test_drop.py
to remove redundant try-finally blocks.daliuge-translator/dlg/dropmake/lg_node.py
.daliuge-engine/dlg/apps/app_base.py
.daliuge-engine/dlg/graph_loader.py
anddaliuge-engine/dlg/data/drops/s3_drop.py
.List
andDefaultDict
fromtyping
module.daliuge-engine/dlg/deploy/helm_client.py
to remove unused variables.daliuge-engine/dlg/data/drops/s3_drop.py
.daliuge-engine/dlg/deploy/create_dlg_job.py
anddaliuge-engine/dlg/lifecycle/dlm.py
.daliuge-engine/dlg/manager/node_manager.py
to useOptional
for type hinting.