forked from viash-hub/biobox
Commit
Merge branch 'main' into samtools_fastq
Showing 14 changed files with 615 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
``` | ||
samtools view | ||
``` | ||
|
||
Usage: samtools view [options] <in.bam>|<in.sam>|<in.cram> [region ...] | ||
|
||
Output options: | ||
-b, --bam Output BAM | ||
-C, --cram Output CRAM (requires -T) | ||
-1, --fast Use fast BAM compression (and default to --bam) | ||
-u, --uncompressed Uncompressed BAM output (and default to --bam) | ||
-h, --with-header Include header in SAM output | ||
-H, --header-only Print SAM header only (no alignments) | ||
--no-header Print SAM alignment records only [default] | ||
-c, --count Print only the count of matching records | ||
-o, --output FILE Write output to FILE [standard output] | ||
-U, --unoutput FILE, --output-unselected FILE | ||
Output reads not selected by filters to FILE | ||
-p, --unmap Set flag to UNMAP on reads not selected | ||
then write to output file. | ||
-P, --fetch-pairs Retrieve complete pairs even when outside of region | ||
Input options: | ||
-t, --fai-reference FILE FILE listing reference names and lengths | ||
-M, --use-index Use index and multi-region iterator for regions | ||
--region[s]-file FILE Use index to include only reads overlapping FILE | ||
-X, --customized-index Expect extra index file argument after <in.bam> | ||
|
||
Filtering options (Only include in output reads that...): | ||
-L, --target[s]-file FILE ...overlap (BED) regions in FILE | ||
-N, --qname-file [^]FILE ...whose read name is listed in FILE ("^" negates) | ||
-r, --read-group STR ...are in read group STR | ||
-R, --read-group-file [^]FILE | ||
...are in a read group listed in FILE | ||
-d, --tag STR1[:STR2] ...have a tag STR1 (with associated value STR2) | ||
-D, --tag-file STR:FILE ...have a tag STR whose value is listed in FILE | ||
-q, --min-MQ INT ...have mapping quality >= INT | ||
-l, --library STR ...are in library STR | ||
-m, --min-qlen INT ...cover >= INT query bases (as measured via CIGAR) | ||
-e, --expr STR ...match the filter expression STR | ||
-f, --require-flags FLAG ...have all of the FLAGs present | ||
-F, --excl[ude]-flags FLAG ...have none of the FLAGs present | ||
--rf, --incl-flags, --include-flags FLAG | ||
...have some of the FLAGs present | ||
-G FLAG EXCLUDE reads with all of the FLAGs present | ||
--subsample FLOAT Keep only FLOAT fraction of templates/read pairs | ||
--subsample-seed INT Influence WHICH reads are kept in subsampling [0] | ||
-s INT.FRAC Same as --subsample 0.FRAC --subsample-seed INT | ||
|
||
Processing options: | ||
--add-flags FLAG Add FLAGs to reads | ||
--remove-flags FLAG Remove FLAGs from reads | ||
-x, --remove-tag STR | ||
Comma-separated read tags to strip (repeatable) [null] | ||
--keep-tag STR | ||
Comma-separated read tags to preserve (repeatable) [null]. | ||
Equivalent to "-x ^STR" | ||
-B, --remove-B Collapse the backward CIGAR operation | ||
-z, --sanitize FLAGS Perform sanity checking and fixing on records. | ||
FLAGS is comma separated (see manual). [off] | ||
|
||
General options: | ||
-?, --help Print long help, including note about region specification | ||
-S Ignored (input format is auto-detected) | ||
--no-PG Do not add a PG line | ||
--input-fmt-option OPT[=VAL] | ||
Specify a single input file format option in the form | ||
of OPTION or OPTION=VALUE | ||
-O, --output-fmt FORMAT[,OPT[=VAL]]... | ||
Specify output format (SAM, BAM, CRAM) | ||
--output-fmt-option OPT[=VAL] | ||
Specify a single output file format option in the form | ||
of OPTION or OPTION=VALUE | ||
-T, --reference FILE | ||
Reference sequence FASTA FILE [null] | ||
-@, --threads INT | ||
Number of additional threads to use [0] | ||
--write-index | ||
Automatically index the output files [off] | ||
--verbosity INT | ||
Set level of verbosity |
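
The help text states that `-s INT.FRAC` is the same as `--subsample 0.FRAC --subsample-seed INT`. As a rough illustration of that equivalence, the sketch below splits such a value using plain Bash parameter expansion. The function name `split_subsample` is hypothetical and not part of samtools or this commit, and the sketch assumes the value always contains a dot.

```shell
#!/bin/bash

# Hypothetical helper (not part of samtools): rewrite an "INT.FRAC" value
# into the long-option form described in the help text.
# Assumption: the value contains a "." separating seed and fraction.
split_subsample() {
  local value="$1"
  local seed="${value%%.*}"   # text before the first dot
  local frac="${value#*.}"    # text after the first dot
  # An empty seed (e.g. ".25") falls back to the documented default of 0.
  echo "--subsample 0.${frac} --subsample-seed ${seed:-0}"
}

split_subsample 42.25
```

The wrapper script in this commit exposes `--subsample` and `--subsample-seed` separately, so a conversion like this would only matter when porting existing `-s` command lines.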
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
set -e | ||
|
||
[[ "$par_bam" == "false" ]] && unset par_bam | ||
[[ "$par_cram" == "false" ]] && unset par_cram | ||
[[ "$par_fast" == "false" ]] && unset par_fast | ||
[[ "$par_uncompressed" == "false" ]] && unset par_uncompressed | ||
[[ "$par_with_header" == "false" ]] && unset par_with_header | ||
[[ "$par_header_only" == "false" ]] && unset par_header_only | ||
[[ "$par_no_header" == "false" ]] && unset par_no_header | ||
[[ "$par_count" == "false" ]] && unset par_count | ||
[[ "$par_unmap" == "false" ]] && unset par_unmap | ||
[[ "$par_use_index" == "false" ]] && unset par_use_index | ||
[[ "$par_fetch_pairs" == "false" ]] && unset par_fetch_pairs | ||
[[ "$par_customized_index" == "false" ]] && unset par_customized_index | ||
[[ "$par_no_PG" == "false" ]] && unset par_no_PG | ||
[[ "$par_write_index" == "false" ]] && unset par_write_index | ||
[[ "$par_remove_B" == "false" ]] && unset par_remove_B | ||
|
||
samtools view \ | ||
${par_bam:+-b} \ | ||
${par_cram:+-C} \ | ||
${par_fast:+--fast} \ | ||
${par_uncompressed:+-u} \ | ||
${par_with_header:+--with-header} \ | ||
${par_header_only:+-H} \ | ||
${par_no_header:+--no-header} \ | ||
${par_count:+-c} \ | ||
${par_output:+-o "$par_output"} \ | ||
${par_output_unselected:+-U "$par_output_unselected"} \ | ||
${par_unmap:+-p} \ | ||
${par_fetch_pairs:+-P} \ | ||
${par_fai_reference:+-t "$par_fai_reference"} \ | ||
${par_use_index:+-M} \ | ||
${par_region_file:+--region-file "$par_region_file"} \ | ||
${par_customized_index:+-X} \ | ||
${par_target_file:+-L "$par_target_file"} \ | ||
${par_qname_file:+-N "$par_qname_file"} \ | ||
${par_read_group:+-r "$par_read_group"} \ | ||
${par_read_group_file:+-R "$par_read_group_file"} \ | ||
${par_tag:+-d "$par_tag"} \ | ||
${par_tag_file:+-D "$par_tag_file"} \ | ||
${par_min_MQ:+-q "$par_min_MQ"} \ | ||
${par_library:+-l "$par_library"} \ | ||
${par_min_qlen:+-m "$par_min_qlen"} \ | ||
${par_expr:+-e "$par_expr"} \ | ||
${par_require_flags:+-f "$par_require_flags"} \ | ||
${par_excl_flags:+-F "$par_excl_flags"} \ | ||
${par_incl_flags:+--rf "$par_incl_flags"} \ | ||
${par_excl_all_flags:+-G "$par_excl_all_flags"} \ | ||
${par_subsample:+--subsample "$par_subsample"} \ | ||
${par_subsample_seed:+--subsample-seed "$par_subsample_seed"} \ | ||
${par_add_flags:+--add-flags "$par_add_flags"} \ | ||
${par_remove_flags:+--remove-flags "$par_remove_flags"} \ | ||
${par_remove_tag:+-x "$par_remove_tag"} \ | ||
${par_keep_tag:+--keep-tag "$par_keep_tag"} \ | ||
${par_remove_B:+-B} \ | ||
${par_sanitize:+-z "$par_sanitize"} \ | ||
${par_input_fmt_option:+--input-fmt-option "$par_input_fmt_option"} \ | ||
${par_output_fmt:+-O "$par_output_fmt"} \ | ||
${par_output_fmt_option:+--output-fmt-option "$par_output_fmt_option"} \ | ||
${par_reference:+-T "$par_reference"} \ | ||
${par_write_index:+--write-index} \ | ||
${par_no_PG:+--no-PG} \ | ||
"$par_input" | ||
|
||
exit 0 |
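
The script above relies on the `${var:+word}` expansion, which yields `word` only when `var` is set and non-empty. A minimal sketch of the two shapes involved follows: a boolean flag expands to the bare flag, while a valued option expands to the flag plus its quoted value. The function `build_args` and its positional parameters are hypothetical and exist only for this illustration; it assumes booleans arrive as the strings `true` or `false`, as the `unset` lines in the script suggest.

```shell
#!/bin/bash

# Hypothetical illustration of the expansion pattern used in the wrapper.
# $1: boolean-like value ("true"/"false"), $2: output path (may be empty).
build_args() {
  local par_unmap="$1" par_output="$2"
  local args=()
  # Treat "false" as unset so the flag is omitted entirely.
  [[ "$par_unmap" == "false" ]] && par_unmap=""
  # Boolean flag: expands to the flag alone, never followed by "true".
  args+=( ${par_unmap:+-p} )
  # Valued option: the inner quotes keep a path with spaces as one word.
  args+=( ${par_output:+-o "$par_output"} )
  # Print one argument per line so word boundaries are visible.
  printf '%s\n' "${args[@]}"
}

build_args true "out dir/a.bam"
```

Printing one argument per line makes it easy to see that a boolean contributes exactly one word and a valued option exactly two.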
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
#!/bin/bash | ||
|
||
test_dir="${meta_resources_dir}/test_data" | ||
temp_dir="${meta_resources_dir}/out" | ||
|
||
############################################################################################ | ||
|
||
echo ">>> Test 1: Import SAM to BAM when @SQ lines are present in the header" | ||
"$meta_executable" \ | ||
--bam \ | ||
--output "$temp_dir/a.bam" \ | ||
--input "$test_dir/a.sam" | ||
|
||
echo ">>> Checking whether output exists" | ||
[ ! -f "$temp_dir/a.bam" ] && echo "File 'a.bam' does not exist!" && exit 1 | ||
|
||
echo ">>> Checking whether output is non-empty" | ||
[ ! -s "$temp_dir/a.bam" ] && echo "File 'a.bam' is empty!" && exit 1 | ||
|
||
echo ">>> Checking whether output is correct" | ||
# compare output of "samtools view" for both files | ||
diff <(samtools view "$temp_dir/a.bam") <(samtools view "$test_dir/a.bam") || \ | ||
{ echo "Output file a.bam does not match expected output"; exit 1; } | ||
|
||
############################################################################################ | ||
|
||
echo ">>> Test 2: ${meta_functionality_name} with CRAM format output" | ||
|
||
"$meta_executable" \ | ||
--cram \ | ||
--output "$temp_dir/a.cram" \ | ||
--input "$test_dir/a.sam" | ||
|
||
echo ">>> Checking whether output exists" | ||
[ ! -f "$temp_dir/a.cram" ] && echo "File 'a.cram' does not exist!" && exit 1 | ||
|
||
echo ">>> Checking whether output is non-empty" | ||
[ ! -s "$temp_dir/a.cram" ] && echo "File 'a.cram' is empty!" && exit 1 | ||
|
||
echo ">>> Checking whether output is correct" | ||
# compare output of "samtools view" for both files | ||
diff <(samtools view "$temp_dir/a.cram") <(samtools view "$test_dir/a.cram") || \ | ||
{ echo "Output file a.cram does not match expected output"; exit 1; } | ||
|
||
############################################################################################ | ||
|
||
echo ">>> Test 3: ${meta_functionality_name} with --count option" | ||
|
||
"$meta_executable" \ | ||
--count \ | ||
--output "$temp_dir/a.count" \ | ||
--input "$test_dir/a.sam" | ||
|
||
echo ">>> Checking whether output exists" | ||
[ ! -f "$temp_dir/a.count" ] && echo "File 'a.count' does not exist!" && exit 1 | ||
|
||
echo ">>> Checking whether output is non-empty" | ||
[ ! -s "$temp_dir/a.count" ] && echo "File 'a.count' is empty!" && exit 1 | ||
|
||
echo ">>> Checking whether output is correct" | ||
diff "$temp_dir/a.count" "$test_dir/a.count" || \ | ||
{ echo "Output file a.count does not match expected output"; exit 1; } | ||
|
||
############################################################################################ | ||
|
||
echo ">>> Test 4: ${meta_functionality_name} including only the forward reads from read pairs" | ||
|
||
"$meta_executable" \ | ||
--output "$temp_dir/a.forward" \ | ||
--excl_flags "0x80" \ | ||
--input "$test_dir/a.sam" | ||
|
||
echo ">>> Checking whether output exists" | ||
[ ! -f "$temp_dir/a.forward" ] && echo "File 'a.forward' does not exist!" && exit 1 | ||
|
||
echo ">>> Checking whether output is non-empty" | ||
[ ! -s "$temp_dir/a.forward" ] && echo "File 'a.forward' is empty!" && exit 1 | ||
|
||
echo ">>> Checking whether output is correct" | ||
diff "$temp_dir/a.forward" "$test_dir/a.forward" || \ | ||
{ echo "Output file a.forward does not match expected output"; exit 1; } | ||
|
||
############################################################################################ | ||
|
||
echo ">>> All tests passed successfully" | ||
rm -rf "${temp_dir}" | ||
exit 0 |
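
Each test above repeats the same existence and non-empty checks. One possible way to factor them out is sketched below. The helper name `assert_file_not_empty` is hypothetical and not part of this commit or of Viash.

```shell
#!/bin/bash

# Hypothetical helper: succeed only when the file exists and is non-empty.
# Returns 1 (rather than exiting) so the caller decides how to fail.
assert_file_not_empty() {
  [ -f "$1" ] || { echo "File '$1' does not exist!"; return 1; }
  [ -s "$1" ] || { echo "File '$1' is empty!"; return 1; }
}

# Example use with a throwaway file standing in for a test output.
example_file="$(mktemp)"
echo 6 > "$example_file"
assert_file_not_empty "$example_file" && echo "check passed"
rm -f "$example_file"
```

In a test script, a call such as `assert_file_not_empty "$temp_dir/a.bam" || exit 1` would replace the two separate checks.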
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
6 |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
a1 99 xx 1 1 10M = 11 20 AAAAAAAAAA ********** | ||
b1 99 xx 1 1 10M = 11 20 AAAAAAAAAA ********** | ||
c1 99 xx 1 1 10M = 11 20 AAAAAAAAAA ********** |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
@SQ SN:xx LN:20 | ||
a1 99 xx 1 1 10M = 11 20 AAAAAAAAAA ********** | ||
b1 99 xx 1 1 10M = 11 20 AAAAAAAAAA ********** | ||
c1 99 xx 1 1 10M = 11 20 AAAAAAAAAA ********** | ||
a1 147 xx 11 1 10M = 1 -20 TTTTTTTTTT ********** | ||
b1 147 xx 11 1 10M = 1 -20 TTTTTTTTTT ********** | ||
c1 147 xx 11 1 10M = 1 -20 TTTTTTTTTT ********** |
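
The records above carry the FLAG values 99 and 147, and Test 4 filters them with `--excl_flags "0x80"`. As a hedged illustration of why only the 99 records survive, the sketch below checks the 0x80 bit (second read in a pair, per the SAM specification) with Bash arithmetic. The function `has_flags` is hypothetical; samtools performs this check internally.

```shell
#!/bin/bash

# Hypothetical helper: succeed when every bit of the mask is set in the flag.
# Both arguments may be decimal or 0x-prefixed hexadecimal.
has_flags() {
  local flag=$(( $1 )) mask=$(( $2 ))
  (( (flag & mask) == mask ))
}

# 99  = 0x1 + 0x2 + 0x20 + 0x40 (paired, proper pair, mate reverse, first in pair)
# 147 = 0x1 + 0x2 + 0x10 + 0x80 (paired, proper pair, reverse, second in pair)
for flag in 99 147; do
  if has_flags "$flag" 0x80; then
    echo "$flag: second in pair, excluded by 0x80"
  else
    echo "$flag: kept"
  fi
done
```

This matches the expected `a.forward` file, which contains only the three records with FLAG 99.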
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
#!/bin/bash | ||
|
||
# download test data from snakemake wrapper | ||
if [ ! -d /tmp/view_source ]; then | ||
git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers.git /tmp/view_source | ||
fi | ||
|
||
cp -r /tmp/view_source/bio/samtools/view/test/*.sam src/samtools/samtools_view/test_data |