Guidance on bulk BCR-seq analysis to quantify clonal sharing across samples #1738
Hi, I am currently using MiXCR to re-analyze a large bulk BCR-seq dataset in the hope of quantifying clonal sharing across tumour sites for each patient. The dataset:
MiXCR 4.6.0-50-develop workflow used to date:
Here are my questions:
Thanks so much in advance for your help!
Replies: 1 comment
"For a fair amount of samples, mixcr qc reports 45-70 % of successfully aligned reads with 12-50% of off target reads. I believe I am using the correct preset so I’m thinking that the low immune infiltration of some tumours led to the amplification of spurious sequences. Does that sound plausible? Any advice on what to do with these samples?"
I suggest exporting the non-aligned reads by adding the following parameter to your analyze command: --output-not-used-reads1. Then you can manually inspect these reads to identify their origin.
"For most samples, I noticed that the % of reads used in clonotypes is <10%. This percentage increases if I set --assemble-clonotypes-by to CDR3 rather than VDJRegion. I’m thinking that this is caused by a low sequencing quality as the % of overlapping read-pairs (overlappedPercents) sits around 40-60% for most samples. Would you recommend focusing on CDR3 rather than VDJRegion or can I still trust the VDJRegion reconstructed here?"
Most likely, 250+250 sequencing is not enough to cover the full VDJRegion, considering it’s a 5’RACE protocol. Usually, you would want to use 300+300. You can proceed with assembling clones by CDR3 unless you are specifically interested in hypermutations.
"What are your thoughts around reconstructing lineage trees from libraries that do not include UMIs? Is it reliable enough to quantify clonal sharing across tumour sites for each patient or not? Would it be best to compute pairwise distance metrics between samples based on the CDR3 sequences (https://mixcr.com/mixcr/reference/mixcr-postanalysis/#overlap-postanalysis)?"
You would definitely need to assemble clones by a longer feature than CDR3 to reconstruct lineage trees. With this data, you can try
"I observed some public clonotypes, I.e. clonotypes present in a lot of samples. Is there a way to filter them out before reconstructing lineage trees or performing the overlap postanalysis?"
Currently, we do not have any filter for public clonotypes in MiXCR. You can do it manually by filtering the .clns files.
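As a rough illustration of what manual filtering could look like, here is a minimal Python sketch. It does not modify the .clns files themselves. Instead it assumes that each sample's clonotypes have already been exported to a tab-separated table and that the table has an nSeqCDR3 column; check both against your own export. The function names, the sample and patient IDs, and the max_patients threshold are all hypothetical. "Public" is taken here to mean "seen in more than max_patients distinct patients", so that clonotypes shared between tumour sites of the same patient are kept.

```python
import csv
from collections import defaultdict


def load_clonotype_keys(path, column="nSeqCDR3"):
    """Read one exported clone table (TSV) and return the set of values in `column`.

    The column name is an assumption about the export; adjust it as needed.
    """
    with open(path, newline="") as handle:
        return {row[column] for row in csv.DictReader(handle, delimiter="\t")}


def find_public_clonotypes(clones_by_sample, patient_of, max_patients=1):
    """Return clonotype keys observed in more than `max_patients` distinct patients."""
    patients_per_clone = defaultdict(set)
    for sample, clones in clones_by_sample.items():
        for key in clones:
            patients_per_clone[key].add(patient_of[sample])
    return {key for key, patients in patients_per_clone.items()
            if len(patients) > max_patients}


def drop_public(clones_by_sample, public):
    """Return a copy of the per-sample clonotype sets without the public keys."""
    return {sample: set(clones) - public
            for sample, clones in clones_by_sample.items()}


# Toy data standing in for real exports (hypothetical sample IDs and sequences).
clones_by_sample = {
    "P1_siteA": {"TGTGCGAGAGAT", "TGTGCGAAAGGG"},
    "P1_siteB": {"TGTGCGAGAGAT", "TGTGCGAGGTTT"},
    "P2_siteA": {"TGTGCGAAAGGG", "TGTGCGACCCCC"},
}
patient_of = {"P1_siteA": "P1", "P1_siteB": "P1", "P2_siteA": "P2"}

public = find_public_clonotypes(clones_by_sample, patient_of, max_patients=1)
filtered = drop_public(clones_by_sample, public)
```

With real data, clones_by_sample would be built by calling load_clonotype_keys once per exported table. Counting patients rather than samples is a design choice for this use case: a clonotype found at several sites of one patient is the signal of interest, not something to remove.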