ncov2019-artic-nf

A Nextflow pipeline for running the ARTIC network's fieldbioinformatics tools (https://github.com/artic-network/fieldbioinformatics), with a focus on ncov2019

WARNING - THIS REPO IS UNDER ACTIVE DEVELOPMENT AND ITS BEHAVIOUR MAY CHANGE AT ANY TIME.

PLEASE ENSURE THAT YOU READ BOTH THE README AND THE CONFIG FILE AND UNDERSTAND THE EFFECT OF THE OPTIONS ON YOUR DATA!

Introduction

This Nextflow pipeline automates the ARTIC network nCoV-2019 novel coronavirus bioinformatics protocol. It is being developed to aid the harmonisation of the analysis of sequencing data generated by the COG-UK project. It turns SARS-CoV-2 sequencing data (Illumina or Nanopore) into consensus sequences and provides other helpful outputs to assist the project's sequencing centres with submitting data.

Quick-start

Illumina

nextflow run connor-lab/ncov2019-artic-nf [-profile conda,singularity,docker,slurm] --illumina --prefix "output_file_prefix" --directory /path/to/reads

Nanopore

Nanopolish

nextflow run connor-lab/ncov2019-artic-nf [-profile conda,singularity,docker,slurm] --nanopolish --prefix "output_file_prefix" --basecalled_fastq /path/to/directory --fast5_pass /path/to/directory --sequencing_summary /path/to/sequencing_summary.txt

Medaka

nextflow run connor-lab/ncov2019-artic-nf [-profile conda,singularity,docker,slurm] --medaka --prefix "output_file_prefix" --basecalled_fastq /path/to/directory --fast5_pass /path/to/directory --sequencing_summary /path/to/sequencing_summary.txt

Installation

An up-to-date version of Nextflow is required because the pipeline is written in DSL2. Following the instructions at https://www.nextflow.io/ to download and install Nextflow should get you a recent-enough version.
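
Before launching the pipeline, it can help to confirm that Nextflow is actually on your PATH. The following is a minimal sketch: check_nextflow is a hypothetical helper name, not part of this repo, and it only reports the installed version rather than checking that it is recent enough for DSL2.

```shell
# check_nextflow is a hypothetical helper (not part of this repo).
# It prints the Nextflow version if the launcher is on PATH,
# otherwise it points at the install instructions and returns 1.
check_nextflow() {
    if command -v nextflow >/dev/null 2>&1; then
        nextflow -version
    else
        echo "nextflow not found on PATH; see https://www.nextflow.io/" >&2
        return 1
    fi
}
```

Calling check_nextflow before nextflow run makes a missing install fail early with a clear message.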

Containers

This repo contains both Singularity recipes and a Dockerfile. You can build the Singularity containers locally by running scripts/build_singularity_containers.sh and use them with -profile singularity. The containers will be available from Docker Hub and Singularity Hub shortly.

Conda

The repo contains environment.yml files that automatically build the correct conda environment if -profile conda is specified in the command. Although you'll need conda installed, this is probably the easiest way to run this pipeline.

Executors

By default, the pipeline just runs on the local machine. You can specify -profile slurm to use a SLURM cluster.

Profiles

You can use multiple profiles at once, separating them with a comma. This is described in the Nextflow documentation.
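
As a concrete illustration of combining profiles, the sketch below joins profile names with commas and prints the resulting command instead of running it. join_profiles is a hypothetical helper and run1 is a placeholder prefix; the profile names are the ones listed in the quick-start.

```shell
# join_profiles is a hypothetical helper: it joins its arguments with commas,
# the form used with -profile when several profiles are combined.
join_profiles() (
    IFS=,
    printf '%s\n' "$*"
)

profiles=$(join_profiles conda slurm)   # -> conda,slurm

# Print the command rather than running it (dry run).
echo nextflow run connor-lab/ncov2019-artic-nf -profile "$profiles" \
    --illumina --prefix "run1" --directory /path/to/reads
```

Here conda supplies the software environment and slurm selects the executor, so the two profiles cover different concerns and can be combined freely.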

Config

Common configuration options are set in conf/base.config. Workflow-specific configuration options are set in conf/nanopore.config and conf/illumina.config. They are described there and set to sensible defaults (as suggested in the nCoV-2019 novel coronavirus bioinformatics protocol).

Options

--outdir sets the output directory.
--bwa switches to bwa for mapping (nanopore only).

Workflows

Nanopore

Use --nanopolish or --medaka to run these workflows. --basecalled_fastq should point to a directory created by guppy_basecaller (if you ran with no barcodes), or guppy_barcoder (if you ran with barcodes). It is imperative that the following guppy_barcoder command be used for demultiplexing:

guppy_barcoder --require_barcodes_both_ends -i run_name -s output_directory --arrangements_files "barcode_arrs_nb12.cfg barcode_arrs_nb24.cfg"

Illumina

The Illumina workflow leans heavily on the excellent ivar for primer trimming and consensus making. Because ivar is itself in very active development, this workflow will be updated to keep pace with it. Use --illumina to run the Illumina workflow. Use --directory to point to an Illumina output directory, usually named something like <date>_<machine_id>_<run_no>_<some_zeros>_<flowcell>. The workflow recursively collects all fastq files under this directory, so make sure it contains everything you want and nothing you don't.
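
Because every fastq file under --directory is picked up, it is worth previewing what a recursive search finds before starting a run. This is a sketch only: list_fastq is a hypothetical helper, and the *.fastq and *.fastq.gz patterns are an assumption about file naming, not the pipeline's actual matching rule.

```shell
# list_fastq is a hypothetical helper: recursively list files that look like FASTQ.
# The name patterns are an assumption, not taken from the pipeline's code.
list_fastq() {
    find "$1" -type f \( -name '*.fastq' -o -name '*.fastq.gz' \) | sort
}

# Example: list_fastq /path/to/reads
```

If the listing includes reads from other runs or samples, move them out of the directory before launching the workflow.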

Important config options are:

Option	Description
allowNoprimer	Allow reads that don't have a primer sequence? Ligation prep = false, Nextera = true
illuminaKeepLen	Length of Illumina reads to keep after primer trimming
illuminaQualThreshold	Sliding window quality threshold for keeping reads after primer trimming (Illumina)
mpileupDepth	Mpileup depth for ivar
ivarFreqThreshold	ivar frequency threshold for variant
ivarMinDepth	Minimum coverage depth to call variant
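
One way to change these options without editing the repo's config files is a small override file passed with Nextflow's -c option. The sketch below assumes the options are defined in the params scope; the file name my_illumina.config, the prefix run1, and all values are placeholders for illustration, not the pipeline's defaults or recommendations.

```shell
# Write a hypothetical override file. Option names come from the table above;
# the values are illustrative placeholders, NOT the pipeline's defaults.
cat > my_illumina.config <<'EOF'
params {
    allowNoprimer = true      // e.g. a Nextera prep
    ivarMinDepth = 10         // placeholder value
    ivarFreqThreshold = 0.75  // placeholder value
}
EOF

# Print the command that would use it (dry run); -c adds a config file.
echo nextflow run connor-lab/ncov2019-artic-nf -profile conda -c my_illumina.config \
    --illumina --prefix "run1" --directory /path/to/reads
```

Keeping overrides in a separate file leaves the versioned config untouched and records exactly which settings were used for a run.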

Output

A subdirectory for each process in the workflow is created in --outdir. Additionally, a climb_upload subdirectory containing files important for COG-UK is created.
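
After a run, a quick check that the upload directory exists can catch a failed or misdirected run. check_outdir is a hypothetical helper; it assumes only what is stated above, namely that climb_upload is created under the --outdir directory.

```shell
# check_outdir is a hypothetical helper: confirm that the climb_upload
# subdirectory exists under the given output directory, then list the
# subdirectories created there.
check_outdir() {
    if [ -d "$1/climb_upload" ]; then
        ls "$1"
    else
        echo "no climb_upload directory under $1" >&2
        return 1
    fi
}

# Example: check_outdir /path/to/outdir
```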
