feat: env vars for setting time, memory, shebang and set [#5]
mbhall88 committed Aug 21, 2024
1 parent 74d2e0f commit 9e87374
Showing 2 changed files with 167 additions and 56 deletions.
142 changes: 91 additions & 51 deletions README.md
@@ -90,6 +90,30 @@ Submit a job that needs 8 CPUs
$ ssubmit -m 16g -t 1d align "minimap2 -t 8 ref.fa query.fq > out.paf" -- -c 8
```

```
$ ssubmit -h
Submit sbatch jobs without having to create a submission script
Usage: ssubmit [OPTIONS] <NAME> <COMMAND> [-- <REMAINDER>...]
Arguments:
<NAME> Name of the job
<COMMAND> Command to be executed by the job
[REMAINDER]... Options to be passed on to sbatch
Options:
-o, --output <OUTPUT> File to write job stdout to. (See `man sbatch | grep -A 3 'output='`) [default: %x.out]
-e, --error <ERROR> File to write job stderr to. (See `man sbatch | grep -A 3 'error='`) [default: %x.err]
-m, --mem <size[unit]> Specify the real memory required per node. e.g., 4.3kb, 7 Gb, 9000, 4.1MB become 5KB, 7000M, 9000M, and 5M, respectively [env: SSUBMIT_MEMORY=] [default: 1G]
-t, --time <TIME> Time limit for the job. e.g. 5d, 10h, 45m21s (case-insensitive) [env: SSUBMIT_TIME=] [default: 1d]
-S, --shebang <SHEBANG> The shell shebang for the submission script [env: SSUBMIT_SHEBANG=] [default: "#!/usr/bin/env bash"]
-s, --set <SET> Options for the set command in the shell script [env: SSUBMIT_SET=] [default: "euxo pipefail"]
-n, --dry-run Print the sbatch command and submission script that would be executed, but do not execute them
-T, --test-only Return an estimate of when the job would be scheduled to run given the current queue. No job is actually submitted. [sbatch --test-only]
-h, --help Print help (see more with '--help')
-V, --version Print version
```

The basic anatomy of a `ssubmit` call is

```
@@ -114,6 +138,8 @@ For example, 1.1M will be rounded up to 2M. If you want to use the default memor

For simplicity's sake, all values over one megabyte are passed to sbatch as megabytes - e.g., 1.1G will be passed as 1100M.
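
The rounding rule described above can be sketched as follows. This is only an illustration of the documented behaviour, not ssubmit's actual parser: the function name `normalise_memory`, the decimal (1000-based) multipliers, and the `K` suffix for sub-megabyte values are assumptions made for the example. Integer arithmetic is used so that inputs such as `1.1G` are not affected by floating-point error.

```rust
/// Hypothetical sketch (not ssubmit's code): turn a size such as "1.1G" into a
/// whole-number request, rounding up, with decimal (1000-based) units.
fn normalise_memory(input: &str) -> Option<String> {
    let s = input.trim().to_ascii_lowercase();
    // Split the numeric part from the (optional) unit.
    let split = s
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(s.len());
    let (number, unit) = s.split_at(split);

    // Bytes per unit; a bare number is treated as megabytes.
    let multiplier: u128 = match unit.trim().chars().next() {
        None | Some('m') => 1_000_000,
        Some('k') => 1_000,
        Some('g') => 1_000_000_000,
        Some('t') => 1_000_000_000_000,
        Some(_) => return None, // units this sketch does not handle
    };

    // Parse "whole.frac" without going through floating point.
    let (whole, frac) = number.split_once('.').unwrap_or((number, ""));
    if whole.is_empty() && frac.is_empty() {
        return None;
    }
    let whole: u128 = if whole.is_empty() { 0 } else { whole.parse().ok()? };
    let frac_value: u128 = if frac.is_empty() { 0 } else { frac.parse().ok()? };
    let frac_scale = 10u128.checked_pow(frac.len() as u32)?;

    // Total bytes, rounding the fractional contribution up.
    let frac_bytes = (frac_value.checked_mul(multiplier)? + frac_scale - 1) / frac_scale;
    let bytes = whole.checked_mul(multiplier)?.checked_add(frac_bytes)?;

    if bytes >= 1_000_000 {
        // One megabyte or more: report whole megabytes, rounded up.
        Some(format!("{}M", (bytes + 999_999) / 1_000_000))
    } else {
        // Below one megabyte: report whole kilobytes, rounded up.
        Some(format!("{}K", (bytes + 999) / 1_000))
    }
}
```

Under these assumptions, `normalise_memory("1.1G")` yields `1100M` and `normalise_memory("1.1M")` yields `2M`, matching the examples in the text.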

The environment variable `SSUBMIT_MEMORY` can be set to a default memory limit. This can be overridden by passing `-m`.

### Time

As with memory, time (`-t,--time`) is intended to be simple. If you want a time limit of
@@ -124,6 +150,8 @@ a full list of supported time units, check out the
[`duration-str`](https://github.com/baoyachi/duration-str) repo. One thing to note is that passing a single digit, without a unit, will be interpreted by
Slurm as minutes. However, a trailing number without a unit, as in `5m3`, is treated as seconds, so `5m3` means 5 minutes and 3 seconds.

The environment variable `SSUBMIT_TIME` can be set to a default time limit. This can be overridden by passing `-t`.
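
The precedence described here (command-line flag first, then environment variable, then built-in default) can be sketched in a few lines. This is a minimal illustration, not how ssubmit implements it; in the commit the lookup is delegated to clap's `env` attribute. The helper name `resolve_setting` and the injected `lookup` closure are assumptions made for the example.

```rust
/// Hypothetical helper (not part of ssubmit): pick the value for a setting.
/// Order: explicit command-line value, then environment variable, then default.
/// `lookup` stands in for `std::env::var(key).ok()` so the function can be
/// exercised without touching the real process environment.
fn resolve_setting<F>(cli_value: Option<&str>, env_key: &str, default: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    cli_value
        .map(str::to_owned)
        .or_else(|| lookup(env_key))
        .unwrap_or_else(|| default.to_owned())
}
```

In real use the closure would be `|key| std::env::var(key).ok()`. Passing the lookup in as a parameter is a design choice for the sketch: the process environment is global, so code that mutates it in tests can interfere with other tests running in parallel.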

### Dry run

You can see what `ssubmit` would do without actually submitting a job using dry run
@@ -137,7 +165,7 @@ sbatch -c 8 <script>
=====<script>=====
#!/usr/bin/env bash
#SBATCH --job-name=dry
#SBATCH --mem=4G
#SBATCH --mem=4000M
#SBATCH --time=24:0:0
#SBATCH --error=%x.err
#SBATCH --output=%x.out
@@ -150,12 +178,12 @@ rsync -az src/ dest/
### Script settings

The default shebang for the script is `#!/usr/bin/env bash`. However, if you'd prefer
something else, pass this with `-S,--shebang`.
something else, pass this with `-S,--shebang` or set the environment variable `SSUBMIT_SHEBANG`.

Additionally, we use `set -euxo pipefail` by default, which will exit when a command exits with a
non-zero exit code (`e`), error when trying to use an unset variable (`u`), print
all commands that were run to stderr (`x`), and exit if a command in a pipeline fails
(`-o pipefail`). You can change these setting with `-s,--set`. You can turn this off
(`-o pipefail`). You can change these settings with `-s,--set` or the environment variable `SSUBMIT_SET`. You can turn this off
by passing `-s ''`.
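
To make the effect of the shebang and `set` options concrete, here is a rough sketch of how a submission script like the one in the dry-run output could be assembled. It is an assumption-laden illustration, not ssubmit's implementation: the `Job` struct and `render_script` function are hypothetical, and the log file patterns are hard-coded to the documented defaults.

```rust
/// Hypothetical stand-in for the values ssubmit gathers from its options.
struct Job<'a> {
    shebang: &'a str,
    name: &'a str,
    memory: &'a str,
    time: &'a str,
    set: &'a str,
    command: &'a str,
}

/// Hypothetical sketch: build the text of a submission script.
fn render_script(job: &Job<'_>) -> String {
    let mut lines = vec![
        job.shebang.to_string(),
        format!("#SBATCH --job-name={}", job.name),
        format!("#SBATCH --mem={}", job.memory),
        format!("#SBATCH --time={}", job.time),
        // Log file patterns fixed to the documented defaults for brevity.
        "#SBATCH --error=%x.err".to_string(),
        "#SBATCH --output=%x.out".to_string(),
    ];
    // An empty `set` value (as with `-s ''`) omits the `set` line entirely.
    if !job.set.trim().is_empty() {
        lines.push(format!("set -{}", job.set));
    }
    lines.push(job.command.to_string());
    lines.join("\n") + "\n"
}
```

With `set: "euxo pipefail"` the script contains the line `set -euxo pipefail` before the command; with `set: ""` that line is left out.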

### Log files
@@ -168,7 +196,7 @@ You don't have to use patterns of course.

### Full usage

```shell
```
$ ssubmit --help
ssubmit 0.2.0
Michael Hall <[email protected]>
@@ -185,80 +213,92 @@ $ ssubmit -m 600m rsync_my_data "rsync -az src/ dest/"
Submit a command that involves piping the output into another command. sbatch options
are passed after a `--`.
$ ssubmit -m 4G align "minimap2 -t 8 ref.fa reads.fq | samtools sort -o sorted.bam" -- -c 8
Submit sbatch jobs without having to create a submission script
-----------
# EXAMPLES
-----------
Submit a simple rsync command with a 600MB memory limit.
$ ssubmit -m 600m rsync_my_data "rsync -az src/ dest/"
Submit a command that involves piping the output into another command. sbatch options
are passed after a `--`.
$ ssubmit -m 4G align "minimap2 -t 8 ref.fa reads.fq | samtools sort -o sorted.bam" -- -c 8
USAGE:
ssubmit [OPTIONS] <NAME> <COMMAND> [-- <REMAINDER>...]
Usage: ssubmit [OPTIONS] <NAME> <COMMAND> [-- <REMAINDER>...]
ARGS:
<NAME>
Name of the job
Arguments:
<NAME>
Name of the job
See `man sbatch | grep -A 2 'job-name='` for more details.
See `man sbatch | grep -A 2 'job-name='` for more details.
<COMMAND>
Command to be executed by the job
<COMMAND>
Command to be executed by the job
<REMAINDER>...
Options to be passed on to sbatch
[REMAINDER]...
Options to be passed on to sbatch
OPTIONS:
-e, --error <ERROR>
File to write job stderr to. (See `man sbatch | grep -A 3 'error='`)
Options:
-o, --output <OUTPUT>
File to write job stdout to. (See `man sbatch | grep -A 3 'output='`)
Run `man sbatch | grep -A 37 '^filename pattern'` to see available patterns.
Run `man sbatch | grep -A 37 '^filename pattern'` to see available patterns.
[default: %x.err]
[default: %x.out]
-h, --help
Print help information
-e, --error <ERROR>
File to write job stderr to. (See `man sbatch | grep -A 3 'error='`)
-m, --mem <size[units]>
Specify the real memory required per node. e.g., 4.3kb, 7G, 9000, 4.1MB
Run `man sbatch | grep -A 37 '^filename pattern'` to see available patterns.
Note, floating point numbers will be rounded up. e.g., 10.1G will request 11G. This is
because sbatch only allows integers. See `man sbatch | grep -A 4 'mem='` for the full
details.
[default: %x.err]
[default: 1G]
-m, --mem <size[unit]>
Specify the real memory required per node. e.g., 4.3kb, 7 Gb, 9000, 4.1MB become 5KB, 7000M, 9000M, and 5M, respectively.
-n, --dry-run
Print the sbatch command and submission script would be executed, but do not execute
them
If no unit is specified, megabytes will be used, as per the sbatch default. The value will be rounded up to the nearest megabyte. If the value is less than 1M, it will be rounded up to the nearest kilobyte. See `man sbatch | grep -A 4 'mem='` for the full details.
-o, --output <OUTPUT>
File to write job stdout to. (See `man sbatch | grep -A 3 'output='`)
[env: SSUBMIT_MEMORY=]
[default: 1G]
Run `man sbatch | grep -A 37 '^filename pattern'` to see available patterns.
-t, --time <TIME>
Time limit for the job. e.g. 5d, 10h, 45m21s (case-insensitive)
[default: %x.out]
Run `man sbatch | grep -A 7 'time=<'` for more details. If a single digit is passed, it will be passed straight to sbatch (i.e. minutes). However, 5m5 will be considered 5 minutes and 5 seconds.
-s, --set <SET>
Options for the set command in the shell script
[env: SSUBMIT_TIME=]
[default: 1d]
For example, to exit when the command exits with a non-zero code and to treat unset
variables as an error during substitution, pass 'eu'. Pass '' or "" to set nothing
-S, --shebang <SHEBANG>
The shell shebang for the submission script
[default: "euxo pipefail"]
[env: SSUBMIT_SHEBANG=]
[default: "#!/usr/bin/env bash"]
-S, --shebang <SHEBANG>
The shell shebang for the submission script
-s, --set <SET>
Options for the set command in the shell script
[default: "#!/usr/bin/env bash"]
For example, to exit when the command exits with a non-zero code and to treat unset variables as an error during substitution, pass 'eu'. Pass '' or "" to set nothing
-t, --time <TIME>
Time limit for the job. e.g. 5d, 10h, 45m21s (case insensitive)
[env: SSUBMIT_SET=]
[default: "euxo pipefail"]
Run `man sbatch | grep -A 7 'time=<'` for more details.
-n, --dry-run
Print the sbatch command and submission script that would be executed, but do not execute them
[default: 1w]
-T, --test-only
Return an estimate of when the job would be scheduled to run given the current queue. No job is actually submitted. [sbatch --test-only]
-T, --test-only
Return an estimate of when the job would be scheduled to run given the current queue. No
job is actually submitted. [sbatch --test-only]
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version information
-V, --version
Print version
```


81 changes: 76 additions & 5 deletions src/cli.rs
@@ -4,6 +4,11 @@ use regex::Regex;

use ssubmit::SlurmTime;

const SSUBMIT_SHEBANG: &str = "SSUBMIT_SHEBANG";
const SSUBMIT_MEMORY: &str = "SSUBMIT_MEMORY";
const SSUUBMIT_TIME: &str = "SSUBMIT_TIME";
const SSUBMIT_SET: &str = "SSUBMIT_SET";

/// Submit sbatch jobs without having to create a submission script
///
/// -----------
@@ -47,17 +52,17 @@ pub struct Cli {
/// be rounded up to the nearest megabyte. If the value is less than 1M, it will be rounded up
/// to the nearest kilobyte.
/// See `man sbatch | grep -A 4 'mem='` for the full details.
#[arg(short, long = "mem", value_name = "size[unit]", default_value = "1G", value_parser = parse_memory)]
#[arg(short, long = "mem", value_name = "size[unit]", default_value = "1G", value_parser = parse_memory, env = SSUBMIT_MEMORY)]
pub memory: String,
/// Time limit for the job. e.g. 5d, 10h, 45m21s (case-insensitive)
///
/// Run `man sbatch | grep -A 7 'time=<'` for more details. If a single digit is passed, it will
/// be passed straight to sbatch (i.e. minutes). However, 5m5 will be considered 5 minutes and
/// 5 seconds.
#[arg(short, long, value_parser = parse_time, default_value = "1w")]
#[arg(short, long, value_parser = parse_time, default_value = "1d", env = SSUUBMIT_TIME)]
pub time: String,
/// The shell shebang for the submission script
#[arg(short = 'S', long, default_value = "#!/usr/bin/env bash")]
#[arg(short = 'S', long, default_value = "#!/usr/bin/env bash", env = SSUBMIT_SHEBANG)]
pub shebang: String,
/// Options for the set command in the shell script
///
@@ -67,10 +72,11 @@ pub struct Cli {
short,
long,
default_value = "euxo pipefail",
allow_hyphen_values = true
allow_hyphen_values = true,
env = SSUBMIT_SET
)]
pub set: String,
/// Print the sbatch command and submission script would be executed, but do not execute them
/// Print the sbatch command and submission script that would be executed, but do not execute them
#[arg(short = 'n', long)]
pub dry_run: bool,
/// Return an estimate of when the job would be scheduled to run given the current
@@ -460,4 +466,69 @@ mod tests {
let expected = "0";
assert_eq!(actual, expected);
}

#[test]
fn test_cli_parse_remainder() {
let args = Cli::parse_from(["ssubmit", "name", "command", "--", "-c", "8"]);

let actual = args.remainder.join(" ");
let expected = "-c 8";
assert_eq!(actual, expected);
}

#[test]
fn test_cli_parse_set_shebang_with_environment_variable() {
let shebang = "#!/bin/zsh";
unsafe {
std::env::set_var(SSUBMIT_SHEBANG, shebang);
}

let args = Cli::parse_from(["ssubmit", "name", "command"]);

let actual = args.shebang;
let expected = shebang;
assert_eq!(actual, expected);
}

#[test]
fn test_cli_parse_set_memory_with_environment_variable() {
let memory = "4M";
unsafe {
std::env::set_var(SSUBMIT_MEMORY, memory);
}

let args = Cli::parse_from(["ssubmit", "name", "command"]);

let actual = args.memory;
let expected = memory;
assert_eq!(actual, expected);
}

#[test]
fn test_cli_parse_set_time_with_environment_variable() {
let time = "1:0";
unsafe {
std::env::set_var(SSUUBMIT_TIME, time);
}

let args = Cli::parse_from(["ssubmit", "name", "command"]);

let actual = args.time;
let expected = time;
assert_eq!(actual, expected);
}

#[test]
fn test_cli_parse_set_with_environment_variable() {
let set = "eu";
unsafe {
std::env::set_var(SSUBMIT_SET, set);
}

let args = Cli::parse_from(["ssubmit", "name", "command"]);

let actual = args.set;
let expected = set;
assert_eq!(actual, expected);
}
}
