BCCDC-PHL/amplicone

Amplicone - Amplicon rEad simulator

A Nextflow pipeline that runs ART with modifications to support amplicon read simulation, with the option of user-supplied depths for each amplicon.

Introduction

This pipeline uses ART to generate simulated reads from an input FASTA file. Given a FASTA file and a primer BED file, the pipeline generates amplicon-specific reads. Optionally, the user can provide a CSV file specifying the depth of each individual amplicon. If no CSV file is provided, reads are generated for all amplicons in equal proportions based on the supplied depth parameter.

flowchart TD
    fasta[fasta directory] --> ampToFasta[convertFastaToAmplicons]
    primer_bed[primer.bed] --> ampToFasta[convertFastaToAmplicons]
    ampToFasta --> ART[runART] --> fastq[fastq]
    amplicon_depths[amplicon_depths.csv] --> VariableART[runArtVariableDepths]
    ampToFasta --> VariableART[runArtVariableDepths] --> VariableFastq[fastq with user specified individual amplicon depths]
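
The amplicon-splitting step in the flowchart can be pictured with a minimal Python sketch. This is not the pipeline's own code: it assumes primer names of the form `<scheme>_<number>_LEFT` / `<scheme>_<number>_RIGHT`, assumes each amplicon spans from the start of its LEFT primer to the end of its RIGHT primer (primers included), and uses hypothetical function names and toy data.

```python
def amplicon_intervals(bed_lines):
    """Pair LEFT/RIGHT primers and return {amplicon_name: (start, end)}.

    Assumption: the fourth BED column holds names such as
    'demo_1_LEFT' or 'demo_1_RIGHT' (optionally followed by '_alt...').
    """
    lefts, rights = {}, {}
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        parts = name.split("_")
        if "LEFT" in parts:
            amp = "_".join(parts[:parts.index("LEFT")])
            # keep the outermost start if alternate primers exist
            lefts[amp] = min(start, lefts.get(amp, start))
        elif "RIGHT" in parts:
            amp = "_".join(parts[:parts.index("RIGHT")])
            # keep the outermost end if alternate primers exist
            rights[amp] = max(end, rights.get(amp, end))
    return {amp: (lefts[amp], rights[amp]) for amp in lefts if amp in rights}


def extract_amplicons(sequence, intervals):
    """Slice one sequence into amplicons (BED is 0-based, half-open)."""
    return {amp: sequence[start:end] for amp, (start, end) in intervals.items()}


# Toy data, purely illustrative
bed = [
    "ref\t2\t6\tdemo_1_LEFT",
    "ref\t14\t18\tdemo_1_RIGHT",
    "ref\t12\t16\tdemo_2_LEFT",
    "ref\t24\t28\tdemo_2_RIGHT",
]
genome = "ACGTACGTACGTACGTACGTACGTACGTAC"

intervals = amplicon_intervals(bed)
amplicons = extract_amplicons(genome, intervals)
```

Because BED coordinates are 0-based and half-open, they map directly onto Python slices, so no off-by-one adjustment is needed in this sketch.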

Quick-start

nextflow run BCCDC-PHL/amplicone -profile conda \
  --bed /path/to/primers.bed \
  --fasta_dir /path/to/fasta_directory \
  --model_R1 /path/to/error_model_R1 \
  --model_R2 /path/to/error_model_R2 \
  --outdir /path/to/outputs 

Installation

An up-to-date version of Nextflow is required because the pipeline is written in DSL2. Follow the instructions at https://www.nextflow.io/ to download and install Nextflow.

Conda

The repo contains an environment.yml file, which is used to automatically build the correct conda environment if -profile conda is specified in the command.

--cache /some/dir can be specified to have a fixed, shared location to store the conda build for use by multiple runs of the workflow.

Config

Important config options are:

Option                 Default   Description
vary_amplicon_depths   false     Set to true if the user is supplying individual amplicon depths
amplicon_depths        NO_FILE   A CSV file containing "amplicon" and "depth" for each amplicon in the primer.bed file
depth                  50        Desired depth for reads if not supplying individual amplicon depths
fragment_mean          600       Mean genomic fragment size
fragment_sd            75        Standard deviation of genomic fragment size
read_length            150       Simulated read length
model_R1               NO_FILE   Error profile of R1 reads
model_R2               NO_FILE   Error profile of R2 reads
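
As an illustration of the amplicon_depths option, the following Python sketch writes and re-reads a depths file. It assumes the CSV has a literal header row `amplicon,depth` and that the amplicon names match those derived from the primer BED file; the description above only states that the file contains "amplicon" and "depth", so the exact layout the pipeline expects should be checked. The function names and the amplicon names `demo_1` to `demo_3` are hypothetical.

```python
import csv
import io


def write_amplicon_depths(depths, handle):
    """Write {amplicon_name: depth} as CSV with an assumed 'amplicon,depth' header."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["amplicon", "depth"])
    for amplicon, depth in depths.items():
        writer.writerow([amplicon, depth])


def read_amplicon_depths(handle):
    """Read the same layout back into a dict of integer depths."""
    return {row["amplicon"]: int(row["depth"]) for row in csv.DictReader(handle)}


# Hypothetical amplicon names and depths
depths = {"demo_1": 100, "demo_2": 25, "demo_3": 50}

buffer = io.StringIO()  # stand-in for open("amplicon_depths.csv", "w", newline="")
write_amplicon_depths(depths, buffer)
csv_text = buffer.getvalue()

round_trip = read_amplicon_depths(io.StringIO(csv_text))
```

A file like this would presumably be passed with --amplicon_depths together with --vary_amplicon_depths true, following the same parameter style as the Quick-start command.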

Output

A subdirectory for each process in the workflow is created in --outdir.