Adopt pre-commit and pre-commit CI #1039

mattwthompson · 2021-08-09T20:17:31Z

Here's my pitch for adopting pre-commit and pre-commit CI which I discussed at this morning's developers' call. There's a trojan horse in #1022 which should have followed up this proposal instead of the order it's happening now (#1040).

I made a little demo in the Interchange (openforcefield/openff-interchange#277) which I annotated there in some detail. I can reproduce it all in this repo, but I wanted to avoid changing this repo's settings for a demo.

I tried to be brief but failed - probably about half of below will be merged into developer docs if we do adopt this.

Happy to continue discussion below.

Set up and adoption

At the level of maintainers' responsibility, everything with pre-commit is configured with a single YAML file. This includes the "hooks" that are used, their versions, and whatever other settings (files to include/exclude, custom arguments to pass to other programs, etc.).

For all developers that opt in (or manually run linters, or use some other tooling in their IDE, or magically write perfect code) pre-commit pretty much stays out of the way. Aside from bugs and versions of hooks being out of date, anything that passes local hooks should pass hooks on CI. The only things that might be noticed are the lack of commits named "Run black" and, in cases in which the bot makes a commit, those changes do need to be pulled in locally.

Since it effectively "wraps" other tools, it does have machinery for updating the versions of those tools. This can be done by running pre-commit autoupdate locally and pushing changes, but pre-commit CI automatically does this once a week (for example, https://github.com/openforcefield/openff-interchange/pull/276/files). The frequency of these updates can be configured, but if we're only using black and isort it shouldn't generate much unwanted noise. Instead, it'll automatically detect and fix problems relating to updates that we currently have to detect (huh, why is CI failing?) and fix (https://github.com/openforcefield/openff-interchange/pull/276/files) manually. So, some PRs will have merge commits from upstream/feature-branch-4 in their history; I don't personally find this to be objectionable.

Adding new hooks can sometimes be a bit of a work, but not really because of this tool. Changing/fixing/enforcing a style guide tends to lead to large and/or cumbersome diffs in the git history; I don't think there's any way around this.

Pre-commit CI requires a marketplace app be installed. I've already installed it for Interchange, but I think not given it access anywhere else.

Developer experience

Its use is opt-in for each developer. If somebody doesn't want to use it, they don't have to. The opt-in is installing pre-commit locally; not installing it (or a liberal use of git commit --no-veriify) declines opting in.

Installation is brief and roughly follows two steps. First, install the pre-commit tool itself:

$ conda install -c conda-forge --yes
...

Then, install the "hooks":

$ pre-commit install

Here, it does something in isolated virtual environments (not conda environments) intentionally hidden somewhere away from developers. (This means that the actual executable called by running black openff will literally differ, but should not functionally differ if everything is kept up to date. This may mean that auto-formatters and linters don't need to be included in test environments, but I'll leave that open to preference.)

Now, the developer runs off and makes something cool. When it comes time for commit, the tool will run black and isort. It can look something like this

(openff-dev) [openforcefield] vi openff/toolkit/topology/molecule.py                                       5a0ba3f5  ✱
(openff-dev) [openforcefield] git add openff/toolkit/topology/molecule.py                                5a0ba3f5  ✭ ✱
(openff-dev) [openforcefield] git commit -m "Dummy commit"                                               5a0ba3f5  ✈ ✱
black....................................................................Passed
isort....................................................................Passed
[detached HEAD fc4ee5de] Dummy commit
 1 file changed, 1 insertion(+), 1 deletion(-)

... if the changes need no fixing in the eyes of the auto-formatters. If changes ARE needed (I added a line with foo= ['1', "1" ]) an error will be printed when git commit is attempted and the commit will be aborted:

(openff-dev) [openforcefield] vi openff/toolkit/topology/molecule.py                                       fc4ee5de  ✱
(openff-dev) [openforcefield] git add openff/toolkit/topology/molecule.py                                fc4ee5de  ✭ ✱
(openff-dev) [openforcefield] git commit -m "Commit with bad formatting"                                 fc4ee5de  ✈ ✱
black....................................................................Failed
- hook id: black
- files were modified by this hook

reformatted openff/toolkit/topology/molecule.py
All done! ✨ 🍰 ✨
1 file reformatted.

isort....................................................................Passed

At this point, changes are still being changed, so we can see what happened when black ran:

(openff-dev) [openforcefield] git diff | cat                                                           fc4ee5de  ✭ ✈ ✱
diff --git a/openff/toolkit/topology/molecule.py b/openff/toolkit/topology/molecule.py
index d8e810a5..9af9d2ee 100644
--- a/openff/toolkit/topology/molecule.py
+++ b/openff/toolkit/topology/molecule.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python
-foo=  ['1', "1"   ]
+foo = ["1", "1"]

 # =============================================================================================
 # MODULE DOCSTRING

then add them in and continue with the commiit

(openff-dev) [openforcefield] git add openff/toolkit/topology/molecule.py                              fc4ee5de  ✭ ✈ ✱
(openff-dev) [openforcefield] git commit -m "Commit with fixed formatting"                                 fc4ee5de  ✈ ✱
black....................................................................Passed
isort....................................................................Passed
[detached HEAD 3cb44e60] Commit with fixed formatting
 1 file changed, 2 insertions(+), 1 deletion(-)

Now everything passes the formatting checks!

Pre-commit CI

I made a pitch for this elsewhere and I think the demo above roughly summarizes it in action. It uses the same config file as developers' machines would and runs pretty quickly (15 seconds in my demo) compared to our other checks. Most of the time (small fixes and/or developers using pre-commit locally) it stays out of the way, i.e. a random PR from last week I didn't even think about it: openforcefield/openff-interchange#274

The developer of the project has promised "pre-commit.ci will always be free for open source repositories." (https://pre-commit.ci/#pricing).

Git history

The bot/CI service directly makes commits to PRs. In virtually all cases, these should not affect the code at a functional level. The actual changes are likely to be bumping the version of a commit or making small formatting fix(es). While we are squash-merging PRs, this only impacts the history of the master branch by adding another author to the squash-merge commit (and possibly extra lines of "bot did automatic fixes" to the description). I consider these changes to be of no meaningful consequence. If in the future we move away from squash-merges and move away from using GitHub and happen to have functional changes to our code that are not caught by our test suite, it is possible that this bot will be the commit author of a commit that breaks something. I expect, however, that this will be extremely unlikely. I also suspect that this scenario shifts the blame from our current safeguards: PR reviews (by humans), automated tests, deployment tests, and (sometimes) pre-release testing.

Later considerations

The linting action should probably go away eventually. It doesn't necessarily need to be removed, but its purpose is completely duplicated by pre-commit/pre-commit CI. I would think that keeping it installed for a short period of time would give us confidence that everything is synced up, but it wasteful maintenance burden to hold on to long-term.

I'd like to remove LGTM (sooner than later). I personally don't like it since I can't run its checks locally, so in some cases I don't understand why it's complaining, and it's also a little embarrassing to be yelled at by a bot whose complaints I cannot preempt because I can't install the same tools it uses.

Over time, it would be useful to add more hooks. There are tons that can be "installed" by adding a few lines of YAML, ranging from small safeguards (end-of-file-fixer) to more involved frameworkes like flake8. These would happen alongside actual changes to the code, which is where the cost lies. I'm not proposing we do anything right now but move the black and isort checks/automation to pre-commit. For now, though, the scope is small (#1040).

The text was updated successfully, but these errors were encountered:

mikemhenry · 2021-08-10T17:31:39Z

I really like pre-commit CI, in the signac project we use it in all of our main repos: https://github.com/glotzerlab/signac/blob/master/.pre-commit-config.yaml
It makes it easier for new people to contribute code since they don't need to setup git hooks locally, which we could never enforce anyway.

mattwthompson · 2021-08-19T16:09:16Z

After #1040 and #1022, this is implemented. I'll leave this open for a bit in case people have questions. I added some rough developer docs which explain what came to mind immediately; as always, it's a living document so we can make changes there as people run into any issues.

Eventually we should drop the linting action, but I'm not in a rush to do that.

mattwthompson · 2022-03-08T15:14:00Z

Closing as the original feature has been implemented. Future changes should be reflected in the developer docs!

mattwthompson mentioned this issue Aug 9, 2021

Add .pre-commit-config.yaml #1040

Merged

1 task

mattwthompson mentioned this issue Aug 13, 2021

Lint notebooks #1022

Merged

2 tasks

mattwthompson closed this as completed Mar 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adopt pre-commit and pre-commit CI #1039

Adopt pre-commit and pre-commit CI #1039

mattwthompson commented Aug 9, 2021 •

edited

Loading

mikemhenry commented Aug 10, 2021

mattwthompson commented Aug 19, 2021

mattwthompson commented Mar 8, 2022

Adopt pre-commit and pre-commit CI #1039

Adopt pre-commit and pre-commit CI #1039

Comments

mattwthompson commented Aug 9, 2021 • edited Loading

Set up and adoption

Developer experience

Pre-commit CI

Git history

Later considerations

mikemhenry commented Aug 10, 2021

mattwthompson commented Aug 19, 2021

mattwthompson commented Mar 8, 2022

mattwthompson commented Aug 9, 2021 •

edited

Loading