Refactor groovy script to include batching. #1008

nuclearsandwich · 2023-06-26T20:45:52Z

The last several years, we've added batching to the this template script when executing it on the official build farms but never actually updated it here.

This change includes what is essentially, a rewrite of the batch process example script to more closely match the one I've been using the last year or so.

I someday hope to get even more meta about this and create a management job on Jenkins that runs variations of this script on templated job types in order to remove the need for batching and the chance for a typo when copy-pasting and modifying the example script.

The last several years, we've added batching to the this template script when executing it on the official build farms but never actually updated it here. This change includes what is essentially, a rewrite of the batch process example script to more closely match the one I've been using the last year or so. I someday hope to get even more meta about this and create a management job on Jenkins that runs variations of this script on templated job types in order to remove the need for batching and the chance for a typo when copy-pasting and modifying the example script.

nuclearsandwich · 2023-06-26T20:47:33Z

doc/ongoing_operations.rst


 The following Groovy script is a good starting point for various actions:

 .. code-block:: groovy

-   import hudson.model.Cause
+   DISTRO_LETTER = "F"
+   PREFIXES = ["ci_", "dev_", "pr_", "rel", "src_", "bin_"].collect({pre -> DISTRO_LETTER + pre})


It might be overkill to add a DISTRO_LETTER constant one line before its only use but I like making the bits that are meant to be adjusted very obvious rather than just hard-coding the letter into the map/collect that generates all of the prefixes.

Just as an FYI, if we wanted to get just a bit fancy, we could do this at the top:

DISTRO_NAME = "foxy" DISTRO_LETTER = DISTRO_NAME[0..0].toUpperCase() PREFIXES = [DISTRO_NAME] PREFIXES += ["ci_", "dev_", "pr_", "rel", "src_", "bin_"].collect({pre -> DISTRO_LETTER + pre})

That way we'll pick up the jobs that are prefixed with the DISTRO_NAME as well.

FYI, this is also missing the doc jobs, so it should also be added to the list above.

nuclearsandwich · 2023-06-26T20:48:36Z

doc/ongoing_operations.rst

+   /* Some operations take time and an http timeout
+    * creates performance issues with dangling script
+    * runs.
+    * To prevent them use a batch size that returns
+    * fairly instant results and run the script multiple
+    * times
+    */


I hoisted this comment up into the documentation above as well. I am torn between leaving it in the script and cutting it since it's documented above.

nuclearsandwich · 2023-06-26T20:52:27Z

doc/ongoing_operations.rst

+   {
+     if (count >= BATCH_SIZE)
+     {
+       println("Reached ${BATCH_SIZE} limit before processing ${job.name}.")


I could see someone omitting the print of processed job names, which is otherwise the only feedback when processing batches that you aren't processing the same 100 jobs over again (such as if there's a mismatch between the predicate and operation) so I added this print as an indicator of where we stopped.

If I was more bored I might be tempted to implement some kind of cursor pattern.

nuclearsandwich · 2023-06-26T20:54:29Z

doc/ongoing_operations.rst

+
+   if (count <= BATCH_SIZE)
+   {
+     println("Completed execution of the last batch.")


I've always previously just run the script until it returns 0 results but this way the script positively reports that it's processed the last batch (based on the fact that the batch is smaller than the max batch size).

It is possible if the number of jobs to process is precisely divisible by the batch size that this will print only after an empty batch, but it should still print.

nuclearsandwich · 2023-06-26T20:55:37Z

doc/ongoing_operations.rst

   }

 This script will print only the matched job names.
-You can uncomment any of the actions to disable, enable, trigger or delete these projects.
+You can uncomment any of the actions to disable, enable, or delete these projects.


I removed the instructions for triggering builds using this script because the _trigger-jobs jobs should cover those use cases now and there is not an entirely reasonable way to track that state using the batch setup.

quarkytale

Makes sense to me, but wouldn't mind waiting for Scott's comments.

quarkytale · 2023-06-26T21:53:31Z

doc/ongoing_operations.rst

+Some operations take time and an HTTP timeout creates performance issues with dangling script runs.
+To prevent them the example script uses a ``BATCH_SIZE`` constant to control how many jobs to change before returning.
+You can leave the default, fairly conservative batch size or increase that constant steadily until results are no longer instantaneous and then adjusting back down.


I'm confused, does it process jobs even if they're processed/disabled in the previous iteration?

I think it's fair to say that in this context, "processing" means changing. That's what seems to take significant time and causes the script invocation to time out.

Simply inspecting all of the jobs can happen in every invocation. The continue in the loop for each should avoid incrementing the counter if the operation would have been a no-op, so "processing" be 1-to-1 with "changing" here.

Good question! @cottsay is correct here that iterating over jobs does not take much time at all but performing operations on them (enable, disable, delete) does.

So each subsequent run of the script would iterate over a growing number of (previously processed jobs) but checking the predicate is not very time intensive compared to actually performing an operation, so it does not have much of an observable effect on the return time of the script.

I resisted the urge to do even more programming and create predicate,operation pairs and setting another top-level variable so that we could actually add the operation-predicate to the filter closure that's passed to getItems and thus only jobs that meet the predicate would be iterated over.

cottsay

🎉

cottsay · 2023-06-26T22:39:13Z

doc/ongoing_operations.rst

+Some operations take time and an HTTP timeout creates performance issues with dangling script runs.
+To prevent them the example script uses a ``BATCH_SIZE`` constant to control how many jobs to change before returning.
+You can leave the default, fairly conservative batch size or increase that constant steadily until results are no longer instantaneous and then adjusting back down.


I think it's fair to say that in this context, "processing" means changing. That's what seems to take significant time and causes the script invocation to time out.

Simply inspecting all of the jobs can happen in every invocation. The continue in the loop for each should avoid incrementing the counter if the operation would have been a no-op, so "processing" be 1-to-1 with "changing" here.

quarkytale

Edits from the recent run

doc/ongoing_operations.rst

Groovy does not allow a closure variable to shadow another one. Co-authored-by: Dharini Dutia <[email protected]>

Co-authored-by: Dharini Dutia <[email protected]>

nuclearsandwich requested a review from cottsay June 26, 2023 20:45

nuclearsandwich self-assigned this Jun 26, 2023

nuclearsandwich commented Jun 26, 2023

View reviewed changes

quarkytale approved these changes Jun 26, 2023

View reviewed changes

cottsay approved these changes Jun 26, 2023

View reviewed changes

quarkytale reviewed Jun 30, 2023

View reviewed changes

doc/ongoing_operations.rst Outdated Show resolved Hide resolved

doc/ongoing_operations.rst Show resolved Hide resolved

nuclearsandwich and others added 3 commits July 5, 2023 13:41

Fix shadwowed variable name.

1505d62

Groovy does not allow a closure variable to shadow another one. Co-authored-by: Dharini Dutia <[email protected]>

Print job name on each run as a progress indicator.

5a80a00

Co-authored-by: Dharini Dutia <[email protected]>

Merge branch 'master' into batched-scripts

869be77

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor groovy script to include batching. #1008

Refactor groovy script to include batching. #1008

nuclearsandwich commented Jun 26, 2023

nuclearsandwich Jun 26, 2023

clalancette Jul 5, 2023

clalancette Jul 7, 2023

nuclearsandwich Jun 26, 2023

nuclearsandwich Jun 26, 2023

nuclearsandwich Jun 26, 2023

nuclearsandwich Jun 26, 2023

quarkytale left a comment

quarkytale Jun 26, 2023

cottsay Jun 26, 2023

nuclearsandwich Jun 26, 2023 •

edited

Loading

cottsay left a comment

cottsay Jun 26, 2023

quarkytale left a comment

Refactor groovy script to include batching. #1008

Are you sure you want to change the base?

Refactor groovy script to include batching. #1008

Conversation

nuclearsandwich commented Jun 26, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

quarkytale left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nuclearsandwich Jun 26, 2023 • edited Loading

Choose a reason for hiding this comment

cottsay left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

quarkytale left a comment

Choose a reason for hiding this comment

nuclearsandwich Jun 26, 2023 •

edited

Loading