picocli.CommandLine$ExecutionException: Error while running command refineTagsAndSort java.lang.IllegalArgumentException: input file has no tags mixcr/4.7.0 #1855
-
Replies: 5 comments 1 reply
-
[singhd6@cc-dclrilog62 G1_SUBJ065_IgM_S7]$ ls -lart
8-66BAE,9CF@-C979<CFF,<C77+6+8@E<<EE9,,CC++++++>>FEG,,,,8?FCCCC8?,<,,<CFCF8,,8<<,?,?>F,,?FF:FEF,,2,?,,,?F,,;;@8;;:A?,,,,,,+3@22;CF++2;<<++<+2<++<9+3<1?/+2+<+<C7C:23<+<@++++++0<:///+;::CDEC**)2))277*)1+2::9:09@:19(05::1:2,349(809B0BB9??<DDFFAA49::(-(,303:<<
IMPORTANT: MiXCR will use at most 12000MB of RAM,
-
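The correct preset for data with UMIs is:
mixcr analyze generic-amplicon-with-umi \
--species hsa \
--rna \
--tag-pattern "^(R1:*) \ ^(UMI:N{12})GTAC(R2:*)" \
--rigid-left-alignment-boundary \
--floating-right-alignment-boundary C \
input_R1.fastq.gz \
input_R2.fastq.gz \
result
You need to specify the correct tag pattern based on your data to identify the UMI location accurately. What whitelist are you using, and why (--set-whitelist UMI=file:umi_whitelist.txt)? UMIs usually do not have a fixed set of sequences.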
-
Hi Mark,
Thanks for your reply and valuable suggestions. I have a couple of questions, though. When I inspected my raw data, I found UMIs.
I also have questions about the clone IDs, as described in the attached Word document.
[singhd6@cc-dclrilog62 G1_SUBJ065_IgM_S7]$ zcat G1-SUBJ065-IgM_S7_L001_R1_001.fastq.gz | head -n 8
@M02288:126:000000000-JR9PP:1:1101:19449:1369 1:N:0:GCAATGCA+GGAACGTT
CAGGTGCAGCTGGTGGAGTCTGGGGGAGGCCTGGTCAAGCCGGGGGGGTCCCTGAGACTCTCCTGTGCAGCCTCTGGATTCACATTCAATTCCTTTGGCATGAACTGGGTCCGCCAGGCTCCAGGGAATGGACTGGCGTGGGTCTCAGCCATCAGTGCCAGTGTTCGCTACACATACTACGCAGACTCACTGAATGGCCGCCTCACCGTCTCCCGCGACAACCGCAATAACTTATTTCATCTTCAATTGAACAGCCTGAGAGCCGAGGACACGGCTGTTTATTTCTGCACGAAAGAAATT
+
***@***.******@***.***<<EE9,,CC++++++*>>FEG,,,,8?FCCCC8?,<,,<CFCF8,,8<<,?,?>F,,?FF:FEF,,2,?,,,?F,,*;;@*8;***;:A?,,,***,,,+3@****2***2;CF++2;*<<++<+2<++<9+3<*1?*/+2+<+<C7C*:***23<+<@++++++0<*:///+;:*:CDEC***)*2))27*7*)1+*2::*9:***09@*:***1***9*(05::1:2,349(809B0<F>BB9??<DDFFAA49::(-(,303:<<
@M02288:126:000000000-JR9PP:1:1101:19295:1381 1:N:0:GCAATGCA+GGAACGTT
CAGGTGCAGCTGGTGGAGTCTGGAGCTGAGGTGAAGAAGCCTGGGGCCTCAGTGAAGGTCTCCTGCAAGGCTTCTGGTTACACCTTTACCAGCTATGGTATCAGCTGGGTGCGACAGGCCCCTGGACAAGGGCTTGAGTGGATGGGATGGATCAGCGCTTACAATGGTAACACAAACTATGCACAGAAGCTCCAGGGCAGAGTCACCATGACCACAGACACCTCCACGAGCACAGCCTACATGGATCTGAGGAGCCTGAGATCTGACGACACGGCCGTGTATTACTGTGCGAGCGATAAC
+
***@***.******@***.***@***@***.******@***.******@***.******@***.***,AC8F,,A;D8+6DD,C99,3,,,,,@><D?8,+@+3+3@;3<9,4,*<@***@***.******@***.******@***.******@***.***>:>>BF;68:AFBFAAF9BF?;,11<1:2
[singhd6@cc-dclrilog62 G1_SUBJ065_IgM_S7]$ zcat G1-SUBJ065-IgM_S7_L001_R2_001.fastq.gz | head -n 8
@M02288:126:000000000-JR9PP:1:1101:19449:1369 2:N:0:GCAATGCA+GGAACGTT
CGTTGGGGCGGATGCACTCCCTGAGGAGACGGTGACTAAAGTCCCTTGGCCCCAGACATCAAGGGCGTCGCCCTTATCAGGTCCCACTGTGCCAATTTCTTTCGTGCAGAAATAAACAGCCGTGTCCTCGGCTCTCCGGCTGTTCCTTTTCAGATGCAATAAGTTATTGCGGTTGTCTCTGGAGACGGTGAGGCGGCCCTTCTGTGCGTCTTCGTAGTATGTGTACCGCCCACTTTCTCTTTTTTCTTCTTCCCTCTCCTTCCCCTTCCCTCTTCCCTTTCCTCCCCTTTTCTTCCCTCC
+
***@***.***:FDFGGGGGFCFGCGCFDDGGFF,CCDGFDFGGGGCEFGGE7DDFFG<9@@***@***.***,EEE,@,,,@@9,,7,3,7DADF,53*@***@***.***)71::BE))*.-9))--)-(,),)).).-))-)(.-))))-).),((()))((,(((,()(())),)))6)))((,((,.-))))()((
@M02288:126:000000000-JR9PP:1:1101:19295:1381 2:N:0:GCAATGCA+GGAACGTT
CGTTGGGGCGGATGCACTCCCTGAGGAGACGGTGACCAGGGTTCCCTGGCCCCAGTAGTCAAATTGAACCTCCGAGCTGCTATACCAGTTATCTCTCGCACAGTAATACACGGCCGTGTCGTCAGATCTCAGGCTCCTCAGCTCCATTTAGTCTGTGCTCGTGGATGTGTCTGTGGTCATGGTGACTCTGCCCTGGCTCTTCTGTGCATAGTTTGTGTTACCATTTTCTGCGCTTTTCCTTCCCCTCCCCTCCTCCCCTTTTCCCCCCCCCTCTCCCTCCCCTCTCTTTCCTTTTCTTTT
+
***@***.***+F+FE:AD:+<AACDA,,3@@bd>,3>=BB,6,@,,,57,@,7>;::8+*@,@***@***.***)9)+))(),((/1)))),/(/,((/(((((,,((,)-)))(((((-((..(.(,(-(((,())).))))))))))
I have UMIs in my raw data. My UMI sequence is 12 nucleotides long and is located within the first few bases of R1, specifically right after N{0:3} (which seems to be a flexible region allowing up to 3 bases).
My tag pattern is:
mixcr analyze generic-amplicon-with-umi \
--species hsa \
--rna \
--tag-pattern "^N{0:3}(UMI:N{12})attcGCCA(R1:*)\^N{18}(R2:*)" \
--rigid-left-alignment-boundary \
--floating-right-alignment-boundary C \
-Xmx12000m \
--verbose \
--assemble-clonotypes-by '{FR1Begin:FR4End}' \
G1-SUBJ065-IgM_S7_L001_R1_001.fastq.gz \
G1-SUBJ065-IgM_S7_L001_R2_001.fastq.gz \
result/G1-SUBJ065-IgM_S7_L001
When I ran the above command, the quality check in my results reported an absent barcode.
I am wondering whether mixcr/4.7.0 is compatible with the tag pattern above, because my raw data is not getting aligned with the reference, and the .vdjca, refined .vdjca, and .clns files I am receiving are small in size.
The attached Word document contains all the information about the alignment.
-> To answer the question about the whitelist: I created a whitelist document based on the MiXCR tag pattern recommendations, to include UMIs separately, as in the attached image about the presets.
[attached image]
I am interested in UMIs because I want to analyze mutation frequencies, where each clone would be tagged with unique UMIs. Right now, though, the clone IDs could be identified as unique clones.
With UMIs in the data, I will process it for downstream analysis, e.g. as in the attached image:
[attached image]
Happy antibody discovery,
Have a great day,
Sincerely
Divya
From: mizraelson ***@***.***>
Sent: Thursday, November 14, 2024 5:49 PM
To: milaboratory/mixcr ***@***.***>
Cc: Singh, Divya Jyoti ***@***.***>; Author ***@***.***>
Subject: [EXT] Re: [milaboratory/mixcr] picocli.CommandLine$ExecutionException: Error while running command refineTagsAndSort java.lang.IllegalArgumentException: input file has no tags mixcr/4.7.0 (Issue #1854)
The correct preset for data with UMIs is:
mixcr analyze generic-amplicon-with-umi \
--species hsa \
--rna \
--tag-pattern "^(R1:*) \ ^(UMI:N{12})GTAC(R2:*)" \
--rigid-left-alignment-boundary \
--floating-right-alignment-boundary C \
input_R1.fastq.gz \
input_R2.fastq.gz \
result
You need to specify the correct tag pattern based on your data to identify the UMI location accurately. What whitelist are you using, and why (--set-whitelist UMI=file:umi_whitelist.txt)? UMIs usually do not have a fixed set of sequences.
-
The pattern must match the structure of your data; otherwise, it will not work.
It does not appear there is any unaligned sequence that could represent a UMI.
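A quick way to test whether a read can fit the layout `^N{0:3}(UMI:N{12})attcGCCA(R1:*)` is to look for the fixed `attcGCCA` anchor at the expected offset. The Python sketch below is a rough regex approximation, not MiXCR's own matcher: the helper `find_umi`, the exact case-insensitive matching of the anchor, and the synthetic read in the last line are assumptions made for illustration. The two real reads are the first 40 bases of the R1 records posted earlier in this thread.

```python
import re

# Regex stand-in for the R1 half of the tag pattern
#   ^N{0:3}(UMI:N{12})attcGCCA(R1:*)
# Assumption: the anchor must match exactly (ignoring case). This is only a
# sanity check and does not reproduce MiXCR's matching rules.
R1_PATTERN = re.compile(
    r"^[ACGTN]{0,3}(?P<umi>[ACGTN]{12})ATTCGCCA", re.IGNORECASE
)


def find_umi(read):
    """Return the 12-nt candidate UMI if the read fits the layout, else None."""
    match = R1_PATTERN.match(read)
    return match.group("umi") if match else None


# First 40 bases of the two R1 reads shown in the thread.
r1_reads = [
    "CAGGTGCAGCTGGTGGAGTCTGGGGGAGGCCTGGTCAAGC",
    "CAGGTGCAGCTGGTGGAGTCTGGAGCTGAGGTGAAGAAGC",
]
for read in r1_reads:
    print(find_umi(read))  # None for both: no anchor at the expected offset

# Hypothetical read built to fit the layout: 2 spacer bases, 12-nt UMI, anchor.
print(find_umi("ACAAACCCGGGTTTATTCGCCACAGGTG"))  # AAACCCGGGTTT
```

Neither posted read carries the anchor at that offset, which is consistent with the reply above: the pattern has nothing to match, so no UMI tag can be extracted.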
-
Thank you for your reply, Mark. I thought these sequences could be the UMIs/barcodes: N:0:GCAATGCA+GGAACGTT, so I added tag patterns. I double-checked: UMIs were not used for sequencing, so we do not have UMIs.
These sequences are Illumina barcodes that distinguish different samples, not UMIs. For non-UMI data, you should use the following command:
You can’t use '{FR1Begin:FR4End}' because your reads do not cover the beginning of FR1, at least based on the example in my previous message.
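On the point about Illumina sample barcodes: the index sits in the last colon-separated field of the FASTQ header comment, and it is the same for every read of a demultiplexed sample, whereas a UMI would differ from molecule to molecule. A minimal Python sketch, using the two R1 headers posted earlier in the thread (the helper name `sample_index` is made up for this example):

```python
from collections import Counter


def sample_index(header):
    """Return the index field from an Illumina FASTQ header line."""
    # Header comment layout: <read>:<is filtered>:<control number>:<index>
    comment = header.split(" ", 1)[1]
    return comment.split(":")[3]


headers = [
    "@M02288:126:000000000-JR9PP:1:1101:19449:1369 1:N:0:GCAATGCA+GGAACGTT",
    "@M02288:126:000000000-JR9PP:1:1101:19295:1381 1:N:0:GCAATGCA+GGAACGTT",
]

# Count how many distinct index values appear across the headers.
counts = Counter(sample_index(h) for h in headers)
print(counts)
```

Running the same count over a whole file should give one dominant value if the field is a sample index rather than a UMI.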
After running the command, you can proceed with generating SHM trees or other types of analyses. Whenever you need a count of clones, you can refer…