Artificial gaps in reported amino acid sequences #1612
When looking at the output of exportClones from bulk IgG sequences, I have quite a few sequences that are productive (i.e., no stop codons, no frameshifts, etc.) and have complete, non-gapped nucleotide sequences, but whose amino acid translations contain "_" characters. I am using v4.6.0 with a custom reference, and I assemble contigs by VDJRegion. This is the code I ran:
An example of a concerning output is:
Specifically concerning are these feature translations:
When I translate the target sequence, I get the following:
And when I put the nucleotide sequence into IgBLAST, this is the output I get, which is what I would expect from MiXCR:
Replies: 2 comments
Hi, it's hard to tell without looking at the alignment of the specific clone, but I think I know what is going on. This is a tricky one. My guess is that there are some nucleotide deletions in the sequence. All FRs should end with a complete codon and thus have a strict AA border. If a region doesn't, it is not clear where to assign the last AA during translation. That's why we translate each FR and CDR from both ends, moving towards the center; in a normal scenario this returns an accurate translation. If one nucleotide was deleted or inserted, though, the AA sequence for that region will be wrong when it is exported separately. So this is done in order to correctly define the AA borders of segments that end with a complete codon.

The workaround here is to add an export column with the VRegion or the VDJRegion, for example. These regions are translated from one end, which eliminates the need to distinguish segment borders and returns the correct AA receptor sequence. Use
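To make the difference between the two translation strategies concrete, here is a minimal Python sketch. It is not MiXCR's actual implementation: the function names, the choice of split point, and the toy sequences are all assumptions made for illustration. It only shows why a region whose length is not a multiple of three can end up with a "_" in the middle when it is translated from both ends, while a longer in-frame region translated from one end has no gap.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_one_end(nt: str) -> str:
    """Translate codon by codon from the 5' end, ignoring trailing bases."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def translate_from_both_ends(nt: str) -> str:
    """Translate inward from both ends; mark leftover bases with '_'.

    Where exactly the leftover bases sit is a hypothetical choice here
    (roughly the middle of the region); the real tool may differ.
    """
    n_codons, leftover = divmod(len(nt), 3)
    if leftover == 0:
        return translate_from_one_end(nt)
    left_codons = (n_codons + 1) // 2
    right_codons = n_codons - left_codons
    left = nt[:3 * left_codons]                # read from the 5' border
    right = nt[len(nt) - 3 * right_codons:]    # read from the 3' border
    return translate_from_one_end(left) + "_" + translate_from_one_end(right)


# Toy example (made-up sequences): a 3-nt deletion straddles the border
# between two adjacent regions, removing 1 nt from the first region and
# 2 nt from the second one.
region_a = "GAGGTGCAGCTGGTGGA"  # 17 nt, not a multiple of 3
region_b = "TGGGGGAGGC"         # 10 nt, not a multiple of 3

print(translate_from_both_ends(region_a))           # EVQ_GG
print(translate_from_both_ends(region_b))           # WG_G
print(translate_from_one_end(region_a + region_b))  # EVQLVDGGG
```

In this toy case each region on its own shows a gap, but the combined 27-nt sequence is in frame and translates cleanly from one end, which is the same idea as exporting the AA sequence of the VRegion or VDJRegion instead of the individual FRs and CDRs.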
Ok, thank you!