LIU-383: Initial work towards linter workflows #266

myxie · 2024-06-13T13:21:53Z

Issue

This PR partially addresses LIU-383. We currently run pylint as part of our CI workflows, which is good practice. However, we also disable all checkers, so the value of running that workflow is questionable.

There are a number of levels of linter warnings, and it is rare that we would want all warnings enabled for all files and across such a large codebase. It would be valuable to have some form of linting moving forward, especially to improve and maintain the quality of the codebase.

Solution

This PR starts our move towards linting across the codebase by enabling (only) the Error level of messages. This should address the most significant issues currently in our codebase and prevent new ones from being introduced in the future.

A few messages have been disabled globally because they are false positives (e.g. when interfacing with a 3rd party API, or when the class MRO confuses the checker). I have also disabled some checks for particular files where there are Python version number checks that are not taken into account by Pylint.
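
To illustrate the version-check case, here is a minimal sketch of a version-gated import with an inline disable comment. The choice of `math.lcm` (added in Python 3.9) and the specific message name are assumptions for illustration only; whether Pylint reports anything here depends on the Python version it is configured for, and this is not one of the files touched in the PR.

```python
import sys

if sys.version_info >= (3, 9):
    # Pylint may not take the version guard into account; silence it locally.
    from math import lcm  # pylint: disable=no-name-in-module
else:
    from math import gcd

    def lcm(a: int, b: int) -> int:
        """Fallback for interpreters older than 3.9."""
        return abs(a * b) // gcd(a, b) if a and b else 0


print(lcm(4, 6))
```

Keeping the disable comment on the single offending line, rather than disabling the message globally, leaves the check active everywhere else.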

I will put PR comments on non-trivial fixes to provide reasoning behind the decisions I have made.

Moving forward

I think we should start working towards adding Warnings to pylint, but do this on specific modules (or even sub-modules). I would start with daliuge-translator, as it has fewer warnings than the daliuge-engine, and is also a less robust piece of code. Pylint lets us specify the pylintrc file, so there is scope for having separate pylint settings for the different modules whilst we move towards the 'final' settings.
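
As a rough sketch of the per-module idea, the snippet below builds one pylint invocation per package, each pointing at its own settings file through Pylint's `--rcfile` option. The rcfile names and the `build_pylint_commands` helper are hypothetical and not part of this PR.

```python
from typing import Dict, List

# Hypothetical mapping of package directory -> pylintrc file (names are assumptions).
MODULE_RCFILES: Dict[str, str] = {
    "daliuge-translator": "pylintrc.translator",  # e.g. Errors and Warnings
    "daliuge-engine": "pylintrc.engine",          # e.g. Errors only, for now
}


def build_pylint_commands(rcfiles: Dict[str, str]) -> List[List[str]]:
    """Return one pylint command line per package, each with its own rcfile."""
    return [
        ["pylint", f"--rcfile={rcfile}", package]
        for package, rcfile in rcfiles.items()
    ]


for command in build_pylint_commands(MODULE_RCFILES):
    print(" ".join(command))
```

Each command could then become its own step in the linting workflow, so one module can be tightened without affecting the others.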

Summary by Sourcery

This pull request introduces initial work towards enabling linter workflows by configuring pylint to check for error-level messages in the CI pipeline. It includes various code refactorings to improve code quality, consistency, and readability across multiple files. Additionally, it addresses false positives in pylint checks by adding appropriate disable comments.

Enhancements:
- Enabled pylint error-level messages in CI workflows to improve code quality.
- Refactored test cases in daliuge-engine/test/test_drop.py to remove redundant try-finally blocks.
- Removed unused properties and methods in daliuge-translator/dlg/dropmake/lg_node.py.
- Improved formatting and consistency in daliuge-engine/dlg/apps/app_base.py.
- Fixed string formatting issues in daliuge-engine/dlg/graph_loader.py and daliuge-engine/dlg/data/drops/s3_drop.py.
- Updated type annotations in various files to use List and DefaultDict from typing module.
- Added pylint disable comments for false positives in multiple test files.
- Refactored daliuge-engine/dlg/deploy/helm_client.py to remove unused variables.
- Improved error handling and logging in daliuge-engine/dlg/data/drops/s3_drop.py.
- Updated method signatures and fixed minor issues in daliuge-engine/dlg/deploy/create_dlg_job.py and daliuge-engine/dlg/lifecycle/dlm.py.
- Refactored daliuge-engine/dlg/manager/node_manager.py to use Optional for type hinting.
- Fixed minor issues and improved code readability in various test files.
CI:
- Updated GitHub Actions workflow to run pylint with error-level checks and set a minimum score threshold.

Initial work to get our CI providing Error reports when running the pylint workflow.

sourcery-ai · 2024-06-13T13:22:11Z

Reviewer's Guide by Sourcery

This pull request (PR) addresses the issue LIU-383 by enabling the 'Error' level of pylint messages in the CI workflows. The changes include fixing various pylint errors across multiple files, updating the linting configuration, and adding or modifying code to comply with pylint's error-level checks.

File-Level Changes

| Files | Changes |
| --- | --- |
| `daliuge-engine/test/test_drop.py`, `daliuge-engine/test/manager/test_smm.py`, `daliuge-engine/test/test_shared_memory.py`, `daliuge-engine/test/apps/test_simple.py`, `daliuge-engine/test/memoryUsage.py`, `daliuge-engine/test/test_input_fired_app_drop.py`, `daliuge-engine/test/test_S3Drop.py` | Fixed pylint errors and added pylint disable comments where necessary in test files. |
| `daliuge-engine/dlg/apps/app_base.py`, `daliuge-engine/dlg/graph_loader.py`, `daliuge-engine/dlg/deploy/helm_client.py`, `daliuge-engine/dlg/drop.py`, `daliuge-engine/dlg/event.py`, `daliuge-engine/dlg/data/io.py`, `daliuge-engine/dlg/data/drops/s3_drop.py`, `daliuge-engine/dlg/droputils.py`, `daliuge-engine/dlg/deploy/configs/__init__.py`, `daliuge-engine/dlg/deploy/create_dlg_job.py`, `daliuge-engine/dlg/lifecycle/dlm.py`, `daliuge-engine/dlg/manager/node_manager.py`, `daliuge-engine/dlg/named_port_utils.py`, `daliuge-engine/dlg/deploy/deployment_utils.py` | Fixed pylint errors and improved code quality in various engine and deploy files. |
| `daliuge-translator/dlg/dropmake/lg_node.py`, `daliuge-translator/dlg/dropmake/pgtp.py`, `daliuge-translator/dlg/dropmake/scheduler.py`, `daliuge-translator/dlg/dropmake/lg.py` | Fixed pylint errors and improved code quality in dropmake files. |
| `daliuge-common/dlg/common/__init__.py`, `daliuge-common/dlg/common/reproducibility/reproducibility.py`, `daliuge-common/dlg/restutils.py` | Fixed pylint errors and improved code quality in common files. |
| `.github/workflows/linting.yml` | Updated linting configuration to include --fail-under=9 and --fail-on=E options. |

sourcery-ai

Hey @myxie - I've reviewed your changes and found some issues that need to be addressed.

Blocking issues:

String interpolation issue (link)

Here's what I looked at during the review

🔴 General issues: 1 blocking issue, 18 other issues
🟢 Security: all looks good
🟡 Testing: 4 issues found
🟢 Complexity: all looks good
🟢 Documentation: all looks good

sourcery-ai · 2024-06-13T13:25:05Z

daliuge-engine/dlg/drop.py

-            else:
-                param = default_value


question: Removed else block

The else block that assigns default_value to param has been removed. Ensure that this change does not affect the logic where param should be assigned default_value if no other conditions are met.

We don't need this any more now that we are initialising the param variable with the default, and then modifying the value based on the conditionals. This protects us from a potential variable undeclared in scope error.
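
The pattern described in this reply can be sketched as follows. The `resolve_param` helper and its lookup rules are hypothetical and only illustrate the shape of the change, not the actual code in `drop.py`.

```python
def resolve_param(attr_params: dict, name: str, default_value=None):
    """Hypothetical helper: start from the default, then override if a value exists."""
    param = default_value  # always bound, whichever branch runs below
    if name in attr_params:
        param = attr_params[name]
    elif name.lower() in attr_params:
        param = attr_params[name.lower()]
    # No trailing else needed: param already holds default_value.
    return param


print(resolve_param({"timeout": 30}, "timeout", 10))
print(resolve_param({}, "timeout", 10))
```

Because `param` is assigned before any conditional, no path through the function can reach the `return` with the name unbound.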

sourcery-ai · 2024-06-13T13:25:07Z

daliuge-engine/dlg/apps/app_base.py

            raise Exception(
-                "%r: More effective inputs (%d) than inputs (%d)"
+                "%r: More effective inputs (%s) than inputs (%d)"
                % (self, self.n_effective_inputs, n_inputs)
            )


issue (code-quality): Raise a specific error instead of the general Exception or BaseException (raise-specific-error)

Explanation
If a piece of code raises a specific exception type rather than the generic [`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException) or [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception), the calling code can:

- get more information about what type of error it is
- define specific exception handling for it

This way, callers of the code can handle the error appropriately.

How can you solve this?

- Use one of the built-in exceptions of the standard library.
- Define your own error class that subclasses Exception.

So instead of having code raising Exception or BaseException like

    if incorrect_input(value):
        raise Exception("The input is incorrect")

you can have code raising a specific error like

    if incorrect_input(value):
        raise ValueError("The input is incorrect")

or

    class IncorrectInputError(Exception):
        pass

    if incorrect_input(value):
        raise IncorrectInputError("The input is incorrect")

sourcery-ai · 2024-06-13T13:25:07Z

daliuge-engine/dlg/data/drops/ngas.py

@@ -133,7 +133,7 @@ def setCompleted(self):

    @property
    def dataURL(self) -> str:
-        return "ngas://%s:%d/%s" % (self.ngasSrv, self.ngasPort, self.fileId)
+        return "ngas://%s:%s/%s" % (self.ngasSrv, self.ngasPort, self.fileId)


suggestion (code-quality): Replace interpolated string formatting with f-string (replace-interpolation-with-fstring)

Suggested change

-        return "ngas://%s:%s/%s" % (self.ngasSrv, self.ngasPort, self.fileId)
+        return f"ngas://{self.ngasSrv}:{self.ngasPort}/{self.fileId}"

This can be taken care of another day when addressing Pylint's warnings.

sourcery-ai · 2024-06-13T13:25:08Z

daliuge-engine/dlg/drop.py

        return "<%s oid=%s, uid=%s>" % (
            self.__class__.__name__,
-            self.oid,
-            self.uid,
+            "self.oid",
+            "self.uid",
        )


suggestion (code-quality): We've found these issues:

Replace interpolated string formatting with f-string (replace-interpolation-with-fstring)

Simplify unnecessary nesting, casting and constant values in f-strings (simplify-fstring-formatting)

Suggested change

-        return "<%s oid=%s, uid=%s>" % (
-            self.__class__.__name__,
-            "self.oid",
-            "self.uid",
-        )
+        return f"<{self.__class__.__name__} oid=self.oid, uid=self.uid>"

This will be addressed at some point in the future as we transition to removing all the Pylint warnings.

sourcery-ai · 2024-06-13T13:25:08Z

daliuge-engine/dlg/graph_loader.py

+    roots: List[AbstractDROP] = []
    for drop in drops.values():
        if not droputils.getUpstreamObjects(drop):
            roots.append(drop)


suggestion (code-quality): Convert for loop into list comprehension (list-comprehension)

Suggested change

-    roots: List[AbstractDROP] = []
-    for drop in drops.values():
-        if not droputils.getUpstreamObjects(drop):
-            roots.append(drop)
+    roots: List[AbstractDROP] = [
+        drop
+        for drop in drops.values()
+        if not droputils.getUpstreamObjects(drop)
+    ]

This is great.

sourcery-ai · 2024-06-13T13:25:08Z

daliuge-engine/test/test_shared_memory.py

@@ -73,7 +74,7 @@
        """
        block_a = DlgSharedMemory("A")
        data = pickle.dumps(3)
-        block_a.buf[0 : len(data)] = data
+        block_a.buf[0: len(data)] = data


suggestion (code-quality): Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] (remove-redundant-slice-index)

Suggested change

-        block_a.buf[0: len(data)] = data
+        block_a.buf[:len(data)] = data

coveralls · 2024-06-13T13:30:37Z

coverage: 79.544% (+0.08%) from 79.469%
when pulling b4340ab on LIU-383
into db91b75 on master.

myxie · 2024-06-14T05:05:08Z

daliuge-common/dlg/restutils.py

@@ -139,7 +139,7 @@ def __exit__(self, typ, value, traceback):

    def _get_json(self, url):
        ret = self._GET(url)
-        return json.load(ret) if ret else None
+        return json.load(ret) if ret else {}


I have updated this to return a dictionary as we were not checking for None when returning the value in get_json. This means we could receive a TypeError: NoneType is not iterable error at runtime.
Returning the dictionary iterable here is more 'Pythonic' as we are able to use the empty dictionary as an iterable regardless of how many elements are in it (i.e. duck-typing).
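
A minimal sketch of the difference, using a hypothetical `get_json` stand-in rather than the real `restutils` client: with an empty dictionary as the fallback, callers can iterate over the result without first checking for None.

```python
import json
from io import StringIO


def get_json(stream):
    """Hypothetical stand-in for _get_json: parse the stream, or fall back to {}."""
    return json.load(stream) if stream else {}


# Callers can iterate without a None check, whether or not a response came back.
for response in (StringIO('{"node": "up"}'), None):
    for key, value in get_json(response).items():
        print(key, value)
```

With the previous `None` fallback, the second iteration above would fail at `.items()` instead of simply doing nothing.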

myxie · 2024-06-14T05:44:56Z

daliuge-engine/dlg/deploy/helm_client.py

@@ -383,8 +383,8 @@ def submit_and_monitor_pgt(self):
        """
        Combines submission and monitoring steps of a pgt.
        """
-        session_id = self.submit_pgt()
-        monitoring_thread = self._monitor(session_id)
+        self.submit_pgt()


self.submit_pgt was never returning session_id (and from what I can tell, that would be a bit of work to derive). Hence, we've always been passing None as a parameter all this time. It is cleaner to just remove this.
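
For context, a function without a return statement evaluates to None when called, which is the kind of pattern Pylint's assignment-from-no-return check is meant to catch. The `Client` class below is a hypothetical stand-in, not the real Helm client.

```python
class Client:
    """Hypothetical stand-in for the Helm client."""

    def submit_pgt(self):
        print("submitting")  # no return statement, so the call evaluates to None


session_id = Client().submit_pgt()
print(session_id)  # None
```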

myxie · 2024-06-14T05:50:46Z

daliuge-engine/test/integrate/example_split.py

@@ -5,6 +5,8 @@
 To run it standalone, change the directories, which are now hardcoded
 """

+# pylint: skip-file
+


I am skipping this file completely because it uses mstransform, which is a CASA method that, at some point in time, was available in this code (although I'm not sure how/where it ever ran). This file is broken, unrunnable code, but I don't want to remove it from the codebase as I do not know its use case. A decision can be made either now or in a future PR about what to do with it.

myxie · 2024-06-14T06:20:12Z

daliuge-translator/dlg/dropmake/lg_node.py

@@ -100,52 +100,6 @@ def __init__(self, jd, group_q, done_dict, ssid):
    # def __str__(self):
    #     return json.dumps(self.jd)

-    @property


These are being removed because there were duplicate definitions below. Python executes a class body from top to bottom, so the later definitions rebind the same names and overwrite these ones; the later definitions are the ones we have been using anyway.
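
A small illustration of that behaviour, using a hypothetical `Node` class rather than the real `lg_node.py` code. Pylint reports this pattern at Error level as function-redefined.

```python
class Node:
    """Hypothetical class with the same property defined twice."""

    @property
    def group_id(self):
        return "first definition"

    @property
    def group_id(self):  # rebinds the name; the earlier property is discarded
        return "second definition"


print(Node().group_id)  # second definition
```

Removing the first definition therefore leaves runtime behaviour unchanged.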

myxie · 2024-06-14T06:21:00Z

daliuge-translator/dlg/dropmake/pgtp.py

-
+        tw = 1
+        sz = 1
+        dst = "outputs"


I don't know if "outputs"/"inputs" are what we want to be defaults here, but it is important to have defaults. An alternative may simply be empty strings.

myxie · 2024-06-14T06:22:12Z

daliuge-translator/dlg/dropmake/scheduler.py

@@ -513,7 +513,7 @@ def __init__(self, drop_list, max_dop=8, dag=None):
        else:
            self._dag = dag
        self._max_dop = max_dop
-        self._parts = None  # partitions
+        self._parts = []  # partitions


This is similar to the changes above, in which None is being returned or stored but we are treating it like an iterable. This makes it clear across the board that we only expect to have a list, which can be empty, rather than having to go to all the places self._parts is mentioned and check for None.

coveralls · 2024-06-14T06:46:08Z

coverage: 79.677% (+0.2%) from 79.469%
when pulling 4eec538 on LIU-383
into db91b75 on master.

myxie · 2024-06-14T08:34:52Z

Hi @awicenec,

Apologies that there's quite a few changes here; I thought it would be worthwhile starting this off, and then at least our pipelines are doing something. There's a (minimal) plan of action set out in the PR for future work that could be incremental; I'd be curious to hear your thoughts.

This is lower priority so I don't expect to get a review before you're back from your travels.

myxie · 2024-06-14T08:39:59Z

.github/workflows/linting.yml

@@ -22,4 +22,4 @@ jobs:

      - name: Run pylint
        run: |
-          pylint daliuge-common daliuge-translator daliuge-engine
+          pylint daliuge-common daliuge-translator daliuge-engine --fail-under=9 --fail-on=E                    


These settings mean Pylint will return an exit code of 0 unless the score is under 9 or any Error-level message is reported.
We should be at a 10.0 currently, and should have no Errors after this PR. What these settings also allow is for us to enable Warnings (thus seeing them introduced in the code), without us Failing any tests. This means we can, if we want to, enable Warnings at some point whilst we address them incrementally, without it affecting the CI. Something to think about moving forward I expect.

We score a 9.47(?) with the Warnings enabled, so it's unlikely we'd ever go below 9. It is necessary to get the --fail-on=E to work properly, however.
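
The combined effect of the two options can be modelled with a short sketch. This is a simplified illustration of the pass/fail decision as described above, not Pylint's actual implementation; the `ci_should_fail` helper is hypothetical and the message IDs are only sample inputs.

```python
from typing import Iterable


def ci_should_fail(score: float, message_ids: Iterable[str],
                   fail_under: float = 9.0, fail_on: str = "E") -> bool:
    """Simplified model: fail on a low score or on any message in the fail_on category."""
    has_blocking_message = any(msg_id.startswith(fail_on) for msg_id in message_ids)
    return score < fail_under or has_blocking_message


print(ci_should_fail(9.47, ["W0611", "W0612"]))  # False: warnings alone do not fail CI
print(ci_should_fail(9.47, ["W0611", "E1101"]))  # True: an Error-level message is present
print(ci_should_fail(8.5, []))                   # True: score is under the threshold
```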

myxie · 2024-06-28T07:27:03Z

@awicenec just a 'bump' to have a look at this PR when you get the chance.

coveralls · 2024-06-28T07:34:54Z

coverage: 79.555% (+0.08%) from 79.479%
when pulling ecbf3aa on LIU-383
into 2efb3ba on master.

awicenec

This is a very good start, indeed! Thanks for taking this on. We will work in this direction quite a bit more in the future as well.

LIU-383: Initial work towards linter workflows

myxie added 2 commits June 13, 2024 20:54

LIU-383: Transition to reporting pylint errors

348cacf

Initial work to get our CI providing Error reports when running the pylint workflow.

LIU-383: Transition to reporting pylint errors

b4340ab

Initial work to get our CI providing Error reports when running the pylint workflow.

sourcery-ai bot reviewed Jun 13, 2024

myxie commented Jun 14, 2024

LIU-383: AI code-review updates.

4eec538

myxie assigned awicenec Jun 14, 2024

myxie commented Jun 14, 2024

Merge branch 'master' of https://github.com/icrar/daliuge into LIU-383

ecbf3aa

awicenec approved these changes Jul 2, 2024

myxie merged commit 2c4049f into master Jul 2, 2024
21 checks passed

awicenec pushed a commit that referenced this pull request Oct 10, 2024

Merge pull request #266 from ICRAR/LIU-383

f2e5c9f

LIU-383: Initial work towards linter workflows
