This repository is a place to discuss Mpox virus sequences, in particular potential quality issues like:
- Sequencing errors: overdivergence, frame shifts, reference bias, ...
- Metadata errors: collection date, country, ...
Dozens of labs around the world have generously shared thousands of Mpox virus sequences via GenBank.
At such a scale, sequencing and metadata errors inevitably occur. Those issues are often first spotted by scientists who use others' sequences for secondary analyses.
Each analyst currently needs to maintain their own list of outliers, as there is no central place to collect them. Discussions often happen in non-public spaces, e.g. Slack workspaces, meaning they are not discoverable.
It would be ideal if analysts could share their observations and create a community resource to deduplicate work.
In addition, once a potential error has been identified, the original submitter might be able to rectify it.
Issues in this repo serve as public spaces to discuss potential sequence quality issues.
If you've noticed something odd in a sequence or a set of sequences, open an issue explaining what you've noticed and why you think it might be an artefact.
Other community members can then share their thoughts and analyses.