TBProfiler

This is the experimental command-line version of TBProfiler. The method is described here: https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-015-0164-0 and the web version is available here: http://tbdr.lshtm.ac.uk

This repository contains a complete rewrite of the web version of TBProfiler. It provides profiling through a command-line interface and adds functionality such as the ability to process MinION data.

The pipeline aligns reads to the H37Rv reference using BWA or minimap2 and then examines the coverage across a number of candidate regions. We also report the number of reads supporting drug resistance variants as an insight into hetero-resistance (not applicable to MinION data).
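As a rough illustration of the hetero-resistance idea, the share of reads supporting a resistance variant can be computed from per-allele read counts. This is only a sketch, not TBProfiler's own code: the function name `variant_read_fraction` and the example counts are hypothetical.

```python
def variant_read_fraction(ref_reads: int, alt_reads: int) -> float:
    """Return the fraction of reads at a site that support the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


# Hypothetical counts: 30 reads carry the reference base, 10 carry the variant.
fraction = variant_read_fraction(ref_reads=30, alt_reads=10)
print(f"{fraction:.2f} of reads support the variant")
```

A fraction well below 1 at a resistance site suggests a mixed population; any cut-off for calling it is a choice left to the user.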

Important changes as of v0.3.0

Version 0.3.0 features many changes to the code, including modularisation into dedicated classes and functions. This will make the code easier to maintain and extend. Support for MinION has also been added. In the process, some old functionality has not yet been ported and some required arguments have changed. Please download v0.2.1 from the releases if you require the old code.

Installation

git clone --recursive https://github.com/jodyphelan/TBProfiler.git
cd TBProfiler
bash install_prerequisites.sh
echo "export PATH=$PWD:\$PATH" >> ~/.bashrc

For OSX use:

bash osX_install_prerequisites.sh

Usage

The first argument indicates the type of analysis to perform. At the moment we only support calling small variants and detecting large deletions.

Quick start example

Run whole pipeline:

tb-profiler profile -1 /path/to/reads_1.fastq.gz -2 /path/to/reads_2.fastq.gz -p prefix

The prefix is useful when you need to run more than one sample. This will store the BAM files, summary pileup and result files in their respective directories. Results are output in text and JSON formats.
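When several samples need profiling, the command can be generated once per prefix. The sketch below only builds and prints the commands, using the -1, -2, -t and -p options shown in this README. The helper name `build_profile_command`, the sample prefixes and the `<prefix>_1.fastq.gz` / `<prefix>_2.fastq.gz` file naming are assumptions made for illustration.

```python
import shlex


def build_profile_command(prefix: str, reads_dir: str, threads: int = 1) -> list[str]:
    """Build the argument list for one `tb-profiler profile` run."""
    return [
        "tb-profiler", "profile",
        # Assumed naming scheme: <prefix>_1.fastq.gz and <prefix>_2.fastq.gz.
        "-1", f"{reads_dir}/{prefix}_1.fastq.gz",
        "-2", f"{reads_dir}/{prefix}_2.fastq.gz",
        "-t", str(threads),
        "-p", prefix,
    ]


# Hypothetical sample prefixes; replace with your own.
for prefix in ["sample_A", "sample_B"]:
    command = build_profile_command(prefix, reads_dir="/path/to", threads=4)
    # Only print here; subprocess.run(command, check=True) would execute it.
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.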

Example run:

mkdir test_run; cd test_run
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR166/009/ERR1664619/ERR1664619_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR166/009/ERR1664619/ERR1664619_2.fastq.gz
../tb-profiler profile -1 ERR1664619_1.fastq.gz -2 ERR1664619_2.fastq.gz -t 4 -p ERR1664619
cat results/ERR1664619.results.txt

Running with an existing BAM file:

With the -a option you can supply an existing BAM file instead of fastq files. Warning: the BAM file must have been created using the Ensembl version of the genome, which can be downloaded here:

ftp://ftp.ensemblgenomes.org/pub/release-32/bacteria//fasta/bacteria_0_collection/mycobacterium_tuberculosis_h37rv/dna/Mycobacterium_tuberculosis_h37rv.ASM19595v2.dna.toplevel.fa.gz

The results from numerous runs can be collated into one table using the following command:

tb-profiler collate samples_file out_file

Where samples_file is a list of prefixes of previously run samples and out_file is the name of the output file.
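A samples file can be written by hand or generated. The sketch below writes one prefix per line; the one-per-line layout, the helper name `write_samples_file` and the file name `samples.txt` are assumptions, since this README only states that the file is a list of prefixes.

```python
from pathlib import Path


def write_samples_file(prefixes, path="samples.txt"):
    """Write one sample prefix per line (the one-per-line layout is an assumption)."""
    Path(path).write_text("".join(f"{prefix}\n" for prefix in prefixes))
    return path


# ERR1664619 is the prefix used in the example run; add further prefixes as needed.
samples_file = write_samples_file(["ERR1664619"])
print(Path(samples_file).read_text(), end="")
```

The resulting file would then be passed as the first argument, for example tb-profiler collate samples.txt out_file.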

Under the hood

The pipeline searches for small variants and large deletions associated with drug resistance. It also reports the lineage.

Adding new genes/mutations

To add new mutations navigate to the db directory and edit the drdb.txt file. Add a new line corresponding to the desired variant with the following columns:

Drug - Drug name with no spaces
Genomic position - If more than one position is affected, separate with "/".
Reference nucleotides - String of nucleotides with length equal to the number of bases affected.
Alternate nucleotides - String of nucleotides with length equal to the number of bases affected.
Gene name
Mutation - String with the mutation

After editing the file, run the parse_drdb.py script with a prefix (parse_drdb.py <prefix>) to generate a new database. To use the new database, pass the --db <prefix> option to tb-profiler.

Examples:

A non-synonymous variant:

ETHIONAMIDE 1674484/1674485 AT CC inhA Ile95Pro

A promoter mutation:

ISONIAZID 2156118 C T katG_promoter C-7T

An indel:

PYRAZINAMIDE 2288953 CC C pncA CC289C
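Before regenerating the database it can help to check that a new line splits into the six expected columns. The sketch below is not part of TBProfiler: `DrdbEntry` and `parse_drdb_line` are hypothetical names, and it assumes the columns are separated by whitespace (spaces or tabs) as in the examples above and that no column contains spaces.

```python
from typing import NamedTuple


class DrdbEntry(NamedTuple):
    drug: str
    positions: list
    ref: str
    alt: str
    gene: str
    mutation: str


def parse_drdb_line(line: str) -> DrdbEntry:
    """Split one drdb.txt line into its six columns."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, found {len(fields)}: {line!r}")
    drug, position, ref, alt, gene, mutation = fields
    # Multiple affected positions are separated with "/".
    positions = [int(p) for p in position.split("/")]
    return DrdbEntry(drug, positions, ref, alt, gene, mutation)


entry = parse_drdb_line("ETHIONAMIDE 1674484/1674485 AT CC inhA Ile95Pro")
print(entry.drug, entry.positions, entry.gene, entry.mutation)
```

A line with a missing or extra column raises a ValueError, which catches the most common editing mistake before parse_drdb.py is run.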

iTOL files

Several files are produced by the tb-profiler collate function. Among these are config files that can be used with iTOL (http://itol.embl.de/) to annotate phylogenetic trees. A small tree and config files have been placed in the example_data directory. To use them, navigate to the iTOL website and upload the tbprofiler.tree file using the upload button on the navigation bar. Once the tree has been uploaded you will be taken to a visualisation of it. To add the annotation, click on the '+' button in the lower right-hand corner and select the iTOL config files. You should now see a figure similar to the one below. The following annotations are included:

Lineage
Drug resistance classes (Sensitive, drug-resistant, MDR, XDR)
Drug resistance calls for individual drugs, where filled circles represent resistance.

Change Log

v0.3

Modularisation of code into classes and functions
Support for MinION

v0.2.1

Collate data fix
Generation of iTOL data files for visualisation

v0.2

Allow for the choice of BWA or SNAP for Linux users
Calling of deletions before small variant calling to avoid low quality variants around deletion breakpoints

v0.1

Allow users to provide BAM file as input
Ability to print out version

Citation

If you would like to cite this work please use:

Coll, F. et al. Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences. Genome Med. 7:51 (2015).
