
Split the GitHub workflow into CI and CD #1063

Merged
janfb merged 34 commits into main from feat/split_ci_cd on Apr 3, 2024
Conversation

famura
Contributor

@famura famura commented Mar 20, 2024

What does this implement/fix? Explain your changes

The general idea is that we want a CI workflow that is more lightweight and hence allows for faster turnaround, and a CD workflow that runs more tests, computes the coverage, and maybe, in the future, directly pushes the latest package version to PyPI.

Does this close any currently open issues?

No.

Any relevant code examples, logs, error output, etc?

See the actions.

Any other comments?

I chose to run the pre-commit hooks differently than in test.yml, so that we only need to specify one container and do everything in it.

For the initial checks, the CD workflow will also run on draft PRs. This should be changed once this PR is close to acceptance.

Let's discuss whether the CD workflow should run the slow tests.
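
For concreteness, here is a minimal sketch of how the two workflows could be split. The file names ci.yml and cd.yml match this PR, but the step details (Python version, install command) are illustrative assumptions, not the final configuration:

```yaml
# .github/workflows/ci.yml: lightweight, fast turnaround on PRs (sketch)
name: CI
on:
  pull_request:
    branches: [main]

jobs:
  fast-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - run: pip install -e ".[dev]"  # install command is an assumption
      # Fast suite only; slow and GPU tests are excluded.
      - run: pytest -m "not slow and not gpu"
```

```yaml
# .github/workflows/cd.yml: heavier checks on pushes to main (sketch)
name: CD
on:
  push:
    branches: [main]

jobs:
  all-cpu-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - run: pip install -e ".[dev]"  # install command is an assumption
      # Everything except GPU tests, including the slow suite.
      - run: pytest -m "not gpu"
```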

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. We're here to
help! This is simply a reminder of what we are going to look for before merging
your code.

  • I have read and understood the contribution
    guidelines
  • I agree with re-licensing my contribution from AGPLv3 to Apache-2.0.
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have reported how long the new tests run and potentially marked them
    with pytest.mark.slow.
  • New and existing unit tests pass locally with my changes
  • I performed linting and formatting as described in the contribution
    guidelines
  • I rebased on main (or there are no conflicts with main)

@famura famura added the enhancement, architecture, less-urgent, and hackathon labels Mar 20, 2024
@famura famura self-assigned this Mar 20, 2024
@Baschdl
Contributor

Baschdl commented Mar 20, 2024

I like the idea of running the slow tests during CD to make sure they are run at all.
Did you test how big the impact of running with/without coverage is? The downside of running CI without coverage would be that we don't get a notification if a PR reduces test coverage significantly.

@famura famura marked this pull request as ready for review March 20, 2024 17:17
@famura
Contributor Author

famura commented Mar 20, 2024

Did you test how big the impact of running with/without coverage is? The downside of running CI without coverage would be that we don't get a notification if a PR reduces test coverage significantly.

I very much agree. I am thinking of putting the coverage back in for CI, especially since the CD will only run once a PR is merged (if it stays as I intended).

I did not check the difference in coverage with the slow tests included.

@famura
Contributor Author

famura commented Mar 20, 2024

Even though this PR is no longer marked as a draft, it effectively still is one. The reason is that I want to debug the if clause for ignoring draft PRs, but first I need to make sure that the respective workflow triggers at all xD

(resolved review comment on .github/workflows/cd.yml, now outdated)
@famura famura marked this pull request as draft March 21, 2024 09:22
@famura
Contributor Author

famura commented Mar 21, 2024

Ignoring draft PRs works
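
For reference, skipping draft PRs boils down to a condition on the pull_request event payload; github.event.pull_request.draft is the real event field, while the job name and steps below are placeholders:

```yaml
on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]

jobs:
  ci:
    # Skip the whole job while the PR is still a draft.
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - run: echo "Running CI for a non-draft PR"
```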

@famura
Contributor Author

famura commented Mar 25, 2024

@tomMoral @janfb @Baschdl I now configured it to run only the fast tests for every push and the fast+slow tests for every push on a PR. Note that this will lead to double execution of tests. I don't have a strong opinion on this and can easily revert it. As I said before:

I don't think we should do that in the CD workflow, as its purpose is a bit different (computing max coverage, docs, etc.). Moreover, I think it is very unlikely that a failure introduced by a Python version incompatibility only occurs in the slow tests and not the others.

I don't believe that we will catch many errors with this, but the coverage estimate will be better for PRs.
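
A sketch of what this double-trigger setup could look like; the marker expressions are the ones discussed in this thread, while the job layout is an illustrative assumption:

```yaml
name: Tests
on: [push, pull_request]

jobs:
  fast-tests:
    # Every push runs only the fast suite.
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      - run: pytest -m "not slow and not gpu"

  fast-and-slow-tests:
    # A push to a PR branch also fires a pull_request event, so the slow
    # suite runs on top of the fast one; hence the double execution.
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      - run: pytest -m "not gpu"
```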

@Baschdl
Contributor

Baschdl commented Mar 25, 2024

To quote myself (#1063 (comment)):

I think @michaeldeistler had a strong opinion against increasing the runtime of our CI.

and increasing it to over 30 min is quite a lot. I would be in favor of running the slow tests, as well as tests on other Python versions, somewhere; but even running them only once a day, etc., would already be an improvement over the current state.
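
If the once-a-day option is preferred, a scheduled trigger would do it; a minimal sketch (the workflow name and cron time are arbitrary):

```yaml
name: Nightly slow tests
on:
  schedule:
    - cron: "0 3 * * *"  # once a day at 03:00 UTC; time chosen arbitrarily
  workflow_dispatch:  # keep a manual escape hatch

jobs:
  slow-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      - run: pytest -m "slow and not gpu"
```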

@famura
Contributor Author

famura commented Mar 25, 2024

Interesting test failure. I think it is unrelated to this PR. What do you think, @Baschdl?

Note: it happened during the slow tests, which are newly integrated in this PR.

I think it might be due to test parallelism.

@famura
Contributor Author

famura commented Mar 25, 2024

To quote myself (#1063 (comment)):

I think @michaeldeistler had a strong opinion against increasing the runtime of our CI.

and increasing it to over 30 min is quite a lot. I would be in favor of running the slow tests, as well as tests on other Python versions, somewhere; but even running them only once a day, etc., would already be an improvement over the current state.

Yeah sure, I mean my original proposal was to only run them once a PR is merged (not ideal, but minimally invasive compared to the current state).

@Baschdl
Contributor

Baschdl commented Mar 25, 2024

Interesting test failure. I think it is unrelated to this PR. What do you think, @Baschdl?

Quick summary for everyone who doesn't want to look at the CI logs: this test fails for sim_batch_size=1, num_workers=10 but works for all other configurations:

```python
@pytest.mark.slow
@pytest.mark.parametrize("num_workers", [10, -2])
@pytest.mark.parametrize("sim_batch_size", ((1, 10, 100)))
def test_benchmarking_parallel_simulation(sim_batch_size, num_workers):
```

I also had a look and wanted to open an issue for it. I could imagine multiple reasons: the overhead of creating 10 workers (on a machine with only 2 cores) is not worth it if we only have 1 sample per batch, or problems with running this test with -n auto. I think the first is more likely because all other variants work.

@famura
Contributor Author

famura commented Mar 25, 2024

I also had a look and wanted to open an issue for it. I could imagine multiple reasons for it: overhead of creating 10 workers (on a machine with only 2 cores) is not worth it if we only have 1 sample per batch or problems with running this test with -n auto. I think the first is more likely because all other variants work.

It is a bit unintuitive, though, that it fails for the smallest batch size.
Anyhow, I think one could safely use 2 or 3 workers instead of 10. It proves the same point.

@janfb
Contributor

janfb commented Mar 25, 2024

A bit late to the party but here is my input:

  • The slow tests used to be really slow, like 8 h or so. Therefore, we really wanted to run them separately, usually before every new release.
  • They are much faster now, but I still like the separation and the possibility to add more extensive functional tests without having to worry so much about CI time.
  • Thus, I think we should stick with -m "not slow and not gpu" during the "CI" phase.
  • But it would be good to run all tests before merging into main, or at least before each new release. Thus, I really like the separation into CI and CD.
  • However, we would still need to run the GPU tests locally, right?

@Baschdl
Contributor

Baschdl commented Mar 25, 2024

However, we would still need to run the GPU tests locally, right?

Yes, they aren't run anywhere currently. We could set up a self-hosted runner with a GPU, but we would need to think about long-term maintenance.
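
Should a GPU machine become available, the GPU-marked tests could be picked up by a self-hosted runner; a minimal sketch, assuming a runner registered with the labels self-hosted and gpu:

```yaml
name: GPU tests
on:
  workflow_dispatch:  # manual trigger while long-term maintenance is open

jobs:
  gpu-tests:
    # Requires a self-hosted runner registered with these labels (assumption).
    runs-on: [self-hosted, gpu]
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      - run: pytest -m "gpu"
```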

@famura
Contributor Author

famura commented Mar 26, 2024

  • They are much faster now, but I still like the separation and the possibility to add more extensive functional tests without having to worry so much about CI time.

  • Thus, I think we should stick with -m "not slow and not gpu" during the "CI" phase.

  • But it would be good to run all tests before merging into main, or at least before each new release. Thus, I really like the separation into CI and CD.

OK, so this PR is almost ready then. Just a few questions/points from my side. I added the verbose -v flag; however, I am not sure if we really want this in the CI tests, since we only care whether one fails. The -x (exit on first failure) flag seems to be ineffective with parallelism. Finally, the new workflow, which just runs the tests, cannot be tested before it is merged (quite ironic). The idea here is to run the slow tests manually if desired. I can also make CD run the slow non-GPU tests, but this workflow will only run once a PR is merged, i.e., when the damage has already been done.
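
For the "run the slow tests manually" part, GitHub's workflow_dispatch trigger is the standard mechanism; a minimal sketch (the flags mirror the discussion above, the workflow name and remaining steps are assumed):

```yaml
name: Manual slow tests
on:
  workflow_dispatch:

jobs:
  slow-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      # -n auto (pytest-xdist) parallelizes across cores; note that -x
      # (exit on first failure) is unreliable under this parallelism.
      - run: pytest -v -n auto -m "slow and not gpu"
```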

@famura
Contributor Author

famura commented Mar 28, 2024

So, what is left to do here? The PR is still missing a review... 🙏 @Baschdl @janfb

Contributor

@Baschdl Baschdl left a comment


I would say it's done, but we should first fix #1111; otherwise we'll get a failing main.

Contributor

@janfb janfb left a comment


Great! Thanks for adding this and discussing it in detail!
One question: the main difference between CI and CD is the slow tests and building the docs, right? I.e., codecov is checked in both?

I suggest that we merge this now, see how it works in practice, and then adapt. The CI-CD separation gives a lot of flexibility.

@janfb janfb merged commit 6bbb446 into main Apr 3, 2024
5 checks passed
@janfb janfb deleted the feat/split_ci_cd branch April 3, 2024 06:23
@famura
Contributor Author

famura commented Apr 3, 2024

The main difference between CI and CD is the slow tests and building the docs, right? I.e., codecov is checked in both?

Yes, but there is also a difference in when they are triggered:

  • CI: as soon as people commit to a non-draft PR
  • CD: as soon as a PR is merged (actually, on every push to main)

Both compute and upload the coverage, but under different (new) names, i.e., codecov-sbi-fast-cpu and codecov-sbi-all-cpu.

Moreover, CI runs the OS-PyTorch matrix, while CD only runs Python 3.8 with pip's choice for the torch version.

CD also builds the docs to check that the process runs without errors. However, it does not upload them.
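
A sketch of how the two coverage uploads could be told apart on Codecov; the flag names are the ones quoted above, while the action version and the pytest-cov invocation are assumptions:

```yaml
# In ci.yml: fast CPU tests, uploaded under the fast-cpu flag.
- run: pytest -m "not slow and not gpu" --cov=sbi --cov-report=xml
- uses: codecov/codecov-action@v4
  with:
    flags: codecov-sbi-fast-cpu

# In cd.yml: all CPU tests, uploaded under the all-cpu flag.
- run: pytest -m "not gpu" --cov=sbi --cov-report=xml
- uses: codecov/codecov-action@v4
  with:
    flags: codecov-sbi-all-cpu
```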
