Replies: 2 comments
-
Hi Rachel, you are right: MiXCR supports only substitutions in tag pattern fuzzy matching. For short patterns you can simulate indels with logical OR; however, in your case of a long linker sequence in ONT data, you can indeed lose some sequences due to indels. We plan to add indel support to pattern matching in the future, but this is very complicated from the algorithmic point of view, so we can't commit to any particular timeline. On the other hand, MiXCR handles indels as well as substitutions in the tag refinement step. Just to check whether there is room for improvement, could you share the tag pattern that you use?
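To make the two points above concrete, here is a minimal Python sketch. It is not MiXCR code and does not use MiXCR tag pattern syntax; the linker sequence, the function names, and the anchoring at the read start are all assumptions made for illustration. It shows why a substitution-only matcher rejects a read with a single deleted base, and how listing the one-deletion variants as alternatives (a logical OR) recovers it.

```python
def hamming_match(pattern, read, max_subs):
    """Substitution-only fuzzy match of `pattern` against the start of `read`."""
    if len(read) < len(pattern):
        return False
    mismatches = sum(p != r for p, r in zip(pattern, read))
    return mismatches <= max_subs


def single_deletion_variants(pattern):
    """All distinct patterns obtained by deleting exactly one base."""
    return {pattern[:i] + pattern[i + 1:] for i in range(len(pattern))}


def match_with_or(pattern, read, max_subs):
    """Try the exact-length pattern OR any one-deletion variant of it."""
    alternatives = [pattern, *sorted(single_deletion_variants(pattern))]
    return any(hamming_match(alt, read, max_subs) for alt in alternatives)


linker = "ACGTTGCA"            # hypothetical short linker, not a real one
read_ok = "ACGTTGCA" + "GGG"   # linker intact
read_del = "ACGTGCA" + "GGG"   # one T deleted from the linker

print(hamming_match(linker, read_ok, 1))    # intact linker
print(hamming_match(linker, read_del, 1))   # deletion shifts every later base
print(match_with_or(linker, read_del, 1))   # recovered via an alternative
```

The number of alternatives grows with the pattern length (and much faster if more than one indel is allowed), which is why this trick is only practical for short patterns.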
-
Thanks for the clarification! I can't share the exact linker sequences here so I will represent them as We used a V gene primer that overlaps with For the long linker sequences in between CELL barcodes, I am thinking of converting some part of them to I noticed that tag patterns work and even match a few more reads if I don't put
-
According to the Barcode pattern syntax documentation on fuzzy matching, indels are not supported. Could this cause over-filtering of reads during the alignment step when I include long linker sequences in the tag pattern? We expect an indel rate of 1-4% in nanopore sequencing data, so I imagine a lot of reads get removed. I see that refineTagAndSort has parameters to control the number of indels allowed when correcting capture groups, and I mistakenly assumed that fuzzy matching during the align step would also allow indels. Fuzzy matching methods based on edit distance usually allow deletions, insertions, and substitutions. What kind of fuzzy matching is done here?
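As a rough back-of-the-envelope check on the concern above, the sketch below estimates what fraction of reads would carry a linker with no indel at all. It assumes the 1-4% figure is a per-base indel rate and that errors are independent across positions; the linker lengths are hypothetical, since the real sequences are not given here. It says nothing about how MiXCR itself scores matches.

```python
def indel_free_fraction(length, indel_rate):
    """Probability that `length` consecutive bases contain no indel,
    assuming an independent per-base indel rate."""
    return (1.0 - indel_rate) ** length


# Hypothetical linker lengths and the indel rates quoted in the question.
for length in (10, 30, 60):
    for rate in (0.01, 0.04):
        frac = indel_free_fraction(length, rate)
        print(f"length={length:3d}  indel_rate={rate:.2f}  indel-free={frac:.2f}")
```

Under these assumptions the indel-free fraction drops quickly as the linker gets longer, which is consistent with the worry that long fixed sequences in the pattern cost reads when only substitutions are tolerated.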