
fix: always request total needed memory, as snakemake seems to count as single process #4

Merged
BEFH merged 3 commits into BEFH:main from dlaehnemann:patch-1 on Mar 13, 2024

Conversation

dlaehnemann
Contributor

We have a cluster with the following setting:

        LSB_JOB_MEMLIMIT = N

For this case, the LSF docs on LSB_JOB_MEMLIMIT specify:

When LSB_JOB_MEMLIMIT is N or not defined, the LSF-enforced per-job limit is disabled, and the OS-enforced per-process limit is enabled.

So your previous interpretation was that each requested CPU is its own process. However, I think that the whole job is always a single process, because it simply runs snakemake again inside the main bsub job. So I think we always need to request the full amount of needed memory via -R rusage[mem={mem_}], no matter what the LSB_JOB_MEMLIMIT setting is.

As a bit of backup / reasoning, here's what I saw for a rule that has threads: 8 specified and resources: mem_mb = lambda wc, threads: threads * 4000, when run with the executor-plugin-lsf without the change proposed here. TL;DR: it only reserves the memory for one thread/CPU, but the process requires the total memory.
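
For context, such a rule would look roughly like the following sketch (input/output paths and the shell command are placeholders, not the actual workflow rule):

    rule bwa_mem:
        input:
            reads="reads/{sample}.fastq.gz",   # placeholder path
            idx="index/genome",                # placeholder path
        output:
            "mapped/{sample}.bam",             # placeholder path
        threads: 8
        resources:
            # 4000 MB per thread, i.e. 32000 MB in total for the whole job
            mem_mb=lambda wc, threads: threads * 4000,
            lsf_queue="long",
        shell:
            "bwa mem -t {threads} {input.idx} {input.reads} | samtools sort -o {output} -"

Here's the logging output (shortened and redacted a bit):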

TERM_MEMLIMIT: job killed after reaching LSF memory usage limit.
Exited with exit code 1.

Resource usage summary:

    CPU time :                                   15.81 sec.
    Max Memory :                                 4000 MB
    Average Memory :                             1403.00 MB
    Total Requested Memory :                     4000.00 MB
    Delta Memory :                               0.00 MB
    Max Swap :                                   -
    Max Processes :                              10
    Max Threads :                                26
    Run time :                                   57 sec.
    Turnaround time :                            57 sec.

The output (if any) follows:

[...]
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided remote nodes: 1
Provided resources: mem_mb=32000, mem_mib=30518, disk_mb=11452, disk_mib=10922
Select jobs to execute...
Execute 1 jobs...

[Tue Mar 12 23:20:42 2024]
rule bwa_mem:
[...]
    jobid: 0
    reason: Forced execution
[...]
    threads: 8
    resources: mem_mb=32000, mem_mib=30518, disk_mb=11452, disk_mib=10922, tmpdir=<TBD>, lsf_queue=long

[...]
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=32000, mem_mib=30518, disk_mb=11452, disk_mib=10922
Select jobs to execute...
Execute 1 jobs...

[Tue Mar 12 23:20:51 2024]
localrule bwa_mem:
[...]
    jobid: 0
    reason: Forced execution
[...]
    threads: 8
    resources: mem_mb=32000, mem_mib=30518, disk_mb=11452, disk_mib=10922, tmpdir=/local/<user>/cluster_tmp, lsf_queue=long

Activating conda environment: .snakemake/conda/5dbb9e4bb1f45b4c08879cdb19603237_
Activating conda environment: .snakemake/conda/5dbb9e4bb1f45b4c08879cdb19603237_
Traceback (most recent call last):
  File "/omics/odcf/analysis/OE0377_projects/ec_dna/analyses/colo_320_dm/.snakemake/scripts/tmpdkme9zmf.wrapper.py", line 68, in <module>
    shell(
  File "/omics/groups/OE0377/internal/laehnemann/mambaforge/envs/snakemake_8/lib/python3.12/site-packages/snakemake/shell.py", line 297, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail;  (<command>' returned non-zero exit status 1.
RuleException:
CalledProcessError in file https://raw.githubusercontent.com/snakemake-workflows/dna-seq-short-read-circle-map/v1.2.3/workflow/rules/mapping.smk, line 18:
Command '<command>' returned non-zero exit status 1.
  File "https://raw.githubusercontent.com/snakemake-workflows/dna-seq-short-read-circle-map/v1.2.3/workflow/rules/mapping.smk", line 18, in __rule_bwa_mem
[Tue Mar 12 23:21:17 2024]
Error in rule bwa_mem:
    jobid: 0
[...]

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Storing output in storage.
WorkflowError:
At least one job did not complete successfully.

[Tue Mar 12 23:21:17 2024]
Error in rule bwa_mem:
    jobid: 0
[...]

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Storing output in storage.
WorkflowError:
At least one job did not complete successfully.

@dlaehnemann
Contributor Author

P.S.: With the fix, it reserves the memory as I would expect and runs through. Forgot to clearly state this... 😅

@BEFH
Owner

BEFH commented Mar 13, 2024

I'm going to need to take a look at this. The issue is that on some clusters, -R rusage[mem={mem_}] is per job, and on some clusters, it's per thread. On our cluster, it is per-thread.

The previous LSF profile actually used the Snakemake mem_mb value in -R rusage[mem=MEM], which went against the Snakemake definitions for most LSF clusters. This is an attempt to address that, which I believe was working on our cluster.

If you have provided per-thread memory requests, I have a helper function to dynamically correct that.
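
Purely to illustrate what I mean (this is not the plugin's actual helper, and the per-thread flag is a placeholder), the correction could look roughly like this:

    # Hypothetical sketch, not the actual plugin code: scale a per-thread
    # memory request up to the total that the whole job needs.
    def total_mem_mb(mem_mb: int, threads: int, mem_is_per_thread: bool) -> int:
        """Return the memory (in MB) to request for the entire job."""
        # If the workflow declared mem_mb per thread, multiply by the thread
        # count; if it already declared the job total, pass it through.
        return mem_mb * threads if mem_is_per_thread else mem_mb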

Could you please let me know if -R rusage[mem=MEM] is per job or per thread on your cluster?

@dlaehnemann
Contributor Author

I amended my PR after some further digging. Here's the reasoning behind this, with the relevant documentation being:
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=o-r#bsub_R_description__title__3

From what I understand, the -R rusage[mem={mem_}/job] setting will overwrite any existing cluster-wide settings and ensure that memory is always reserved for the entire job (not per host or task). I think this is what we want snakemake to always do in this context, as this always submits one job and we know how much memory the entire job should use.
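
For the example rule above, the relevant part of the resource request would then, as far as I understand it, be assembled roughly like this (a sketch, not the exact plugin code):

    # Illustrative only: the rusage string with the "/job" reservation method
    # for a job that needs 32000 MB in total.
    mem_mb = 32000
    rusage = f"-R 'rusage[mem={mem_mb}/job]'"
    # -> "-R 'rusage[mem=32000/job]'"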

Does this make sense? Could you try this "at home"?

@BEFH
Owner

BEFH commented Mar 13, 2024

I have confirmed that -R rusage[mem={mem_}/job] works! We now need to simplify the code to get the LSF settings, since I do not believe we are using that setting.

@dlaehnemann
Contributor Author

Good point, I think I have now done this. But please double-check in your code review.

@BEFH
Owner

BEFH commented Mar 13, 2024

This looks good from my reading of it, but we don't have any real unit tests in the repo other than code quality checks. I am going to manually change the pyproject.toml and make a new release later today, and then I will do some more testing. (I don't have Release Please set up right now.)

Thanks so much for the help!

@BEFH merged commit ffc0c8b into BEFH:main on Mar 13, 2024
3 checks passed
@dlaehnemann deleted the patch-1 branch on March 13, 2024 at 17:56
@dlaehnemann
Contributor Author

Many thanks for setting up this executor. It's a huge head start for the snakemake 8 transition.

And thanks for being so responsive. I'll keep filing PRs as further stuff comes up. And should you want to share the maintenance burden at some point, let me know.

@dlaehnemann
Contributor Author

Also, I'll watch out for the bioconda bot auto-update of the respective recipe, and shepherd that along. Unless you beat me to it... ;)

@dlaehnemann
Contributor Author

It seems like this /job syntax actually runs into an LSF bug on the system that I work on. They check for a correct setting of the LSB_SUB_MEM_USAGE parameter mentioned here:
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=controls-configuration-enable-job-submission-execution

And this does not get parsed correctly by LSF if you add a resource reservation method to the rusage string. So currently this breaks our setup, but I haven't found a way to check for this more generally. So this is basically just a ping, in case you have another idea for how to achieve a more general behaviour. Otherwise, we'll have to wait for a workaround on our system, or a fix of the bug in LSF...

@BEFH
Owner

BEFH commented Mar 19, 2024 via email

@dlaehnemann
Contributor Author

That's a good idea, we could try that.

An alternative thought: Do you know which exact setting determines that memory is requested per thread? If so, we could query that setting and only append the /job specifier in this case. That way, we would not have it in there unnecessarily...
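
Something along those lines is what I have in mind (just a sketch, since we don't yet know which lsf.conf parameter to query; the setting name below is a placeholder):

    # Hypothetical sketch: only append the "/job" reservation method when the
    # cluster reserves rusage[mem] per thread/slot. "PER_SLOT_MEM_SETTING" is a
    # placeholder for whichever lsf.conf parameter actually controls this.
    def rusage_mem(mem_mb: int, lsf_config: dict) -> str:
        per_slot = lsf_config.get("PER_SLOT_MEM_SETTING", "n").lower() == "y"
        suffix = "/job" if per_slot else ""
        return f"rusage[mem={mem_mb}{suffix}]"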

@BEFH
Owner

BEFH commented Mar 19, 2024

I have looked and not been able to find this. I thought I had, but no luck.

@dlaehnemann
Contributor Author

So I got some extra pointers from our IT. From what I understand, this is how they set up things so that memory requests are set and enforced per job (and per host):

  1. LSB_JOB_MEMLIMIT = N in lsf.conf (to disable per-job memory limits)
  2. LSB_MEMLIMIT_ENFORCE=N in lsf.conf (to disable per-process memory limits)

And instead they enforce it via setting:

  1. LSB_RESOURCE_ENFORCE="memory" in lsf.conf (where memory is one of several possible space-separated entries)

There is a docs page for LSB_RESOURCE_ENFORCE and a page providing more detail on this setting for memory.

Do these variables point to any differences that you can see in your local setup that would explain the per-thread behaviour?

Otherwise, I think this command-line option sounds like a good alternative to the /job syntax:

https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=options-hl

So I will open a PR with this, and then you can also test this in your setting.
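
Roughly, the idea would be to add that flag to the submission call, something like the following (only a sketch; the other options shown are just for illustration):

    # Illustrative only: pass "-hl" so that memory limits are enforced per job
    # on the execution host, instead of tagging the rusage string with "/job".
    # The surrounding options and the trailing "..." are placeholders.
    mem_mb = 32000
    bsub_cmd = f"bsub -hl -M {mem_mb} -R 'rusage[mem={mem_mb}]' ..."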

@BEFH
Owner

BEFH commented Mar 19, 2024

I was previously using and maybe misunderstanding mem_perjob = self.lsf_config.get("LSB_JOB_MEMLIMIT", "n").lower(). This seems to be a custom config edge case on your server, and I wonder if other sysadmins are doing the same thing in different ways.

@dlaehnemann
Contributor Author

Just to link this up, the new PR is #5.

@dlaehnemann
Contributor Author

I was previously using and maybe misunderstanding mem_perjob = self.lsf_config.get("LSB_JOB_MEMLIMIT", "n").lower(). This seems to be a custom config edge case on your server, and I wonder if other sysadmins are doing the same thing in different ways.

With the state of the docs, my working theory is that any server setup will end up being a config edge case... 👀

Also, there is a lot to misunderstand in all these docs, with things seemingly happening implicitly before and during submission that are not transparent to the user. But let's try to at least get a solution that works for both our setups. 🙈
