PMERGE

(Compatible with Stacks up to version 1.48)

PMERGE is software that implements a new method for identifying candidate paralogous sequence variants (PSVs) by building networks of loci that share high levels of nucleotide similarity. PMERGE is embedded in the analysis pipeline of the widely used Stacks software, and it is straightforward to apply as an additional filter in population-genomic studies that use RAD-seq data.
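The idea of a similarity network can be illustrated with a small sketch. This is not PMERGE's implementation: the similarity measure (percent identity over equal-length sequences), the function names, and the use of connected components are all assumptions made for illustration. Loci are linked when their similarity meets a threshold, and any connected group of two or more loci is treated as a candidate PSV cluster.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def similarity_clusters(loci: dict, min_identity: float) -> list:
    """Return groups of two or more loci connected in the similarity network.

    `loci` maps a (hypothetical) locus name to its consensus sequence.
    Two loci are linked when their percent identity is >= min_identity.
    """
    parent = {name: name for name in loci}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(loci, 2):
        if percent_identity(loci[a], loci[b]) >= min_identity:
            parent[find(a)] = find(b)

    groups = {}
    for name in loci:
        groups.setdefault(find(name), set()).add(name)
    # Singletons have no similar partner, so they are not candidates.
    return [group for group in groups.values() if len(group) > 1]
```

In this sketch, loci that end up in a returned group would be excluded from the whitelist, while singleton loci would be kept.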

![PMERGE workflow](https://github.com/beiko-lab/PMERGE/blob/master/pmerge_flow%20(1).png)

PMERGE is run after cstacks and before populations to generate a “whitelist” of loci from the catalog, based on population-level filtering conditions and our new paralog-detection method. The populations program then uses only the whitelisted loci to generate population-genetic statistics. Apart from the paralog filter, PMERGE includes the following filters, which are also used by the populations program: percent samples limit per population (r), which requires that a locus be present in at least the specified percentage of individuals in a population; locus population limit (p), the minimum number of populations in which a locus must be present; minor allele frequency cutoff (a), which sets a minimum threshold for the frequency of the minor allele (the second-most-frequent allele at a given locus); maximum observed heterozygosity (q); and minimum stack depth (m) at a given locus.
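A minimal sketch of how the population-level filters described above could combine. The `LocusSummary` record, its field names, and the use of fractions (0 to 1) for r are hypothetical and are not taken from the PMERGE source. Treating stack depth as a single per-locus value is also a simplification.

```python
from dataclasses import dataclass


@dataclass
class LocusSummary:
    # Hypothetical per-locus summary; PMERGE's own data structures may differ.
    samples_present: dict     # population name -> individuals carrying the locus
    minor_allele_freq: float  # frequency of the second-most-frequent allele
    observed_het: float       # observed heterozygosity
    min_depth: int            # lowest stack depth among individuals (simplified)


def passes_filters(locus, pop_sizes, r, p, a, q, m) -> bool:
    """Return True if the locus satisfies all five thresholds.

    r: minimum fraction of individuals per population (assumed 0..1 here)
    p: minimum number of populations meeting r
    a: minimum minor allele frequency
    q: maximum observed heterozygosity
    m: minimum stack depth
    """
    populations_ok = sum(
        1
        for pop, size in pop_sizes.items()
        if size > 0 and locus.samples_present.get(pop, 0) / size >= r
    )
    return (
        populations_ok >= p
        and locus.minor_allele_freq >= a
        and locus.observed_het <= q
        and locus.min_depth >= m
    )
```

A locus that passes these checks and is not flagged by the paralog filter would be a candidate for the whitelist.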

Implementation

PMERGE is implemented in C++ and parallelized using OpenMP. It compiles on GNU-based Linux systems and BSD-based OS X systems. It is released under the GNU GPL license.

Installation

Download the source files from https://github.com/beiko-lab/PMERGE

In the Terminal:

• Unzip the downloaded zip file.

• Change to the folder "Install".

• Configure and build:

$ ./configure
$ make

• Install (requires root access or sudo):

$ make install

PMERGE will be installed in /usr/local/bin.

Usage

pmerge -b batch_id -P path -M path [-r min] [-m min] [-C cluster] [-t threads]

   b: Batch ID to examine when exporting from the catalog.
   
   P: path to the Stacks output files.
   
   M: path to the population map, a tab-separated file describing which individuals belong to which population.
   
   t: number of threads to run in parallel sections of code. 
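
To make the population map format concrete, here is a small sketch of a parser. It assumes the usual Stacks layout, with the individual's name in the first column and the population in the second. The function name and the handling of blank or `#` lines are illustrative choices, not behavior documented for PMERGE.

```python
from collections import defaultdict


def parse_population_map(lines):
    """Group individuals by population from tab-separated lines.

    Assumes column 1 is the individual and column 2 is the population.
    Blank lines and lines starting with '#' are skipped (an assumption).
    """
    populations = defaultdict(list)
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(
                f"line {lineno}: expected at least two tab-separated fields"
            )
        individual, population = fields[0].strip(), fields[1].strip()
        populations[population].append(individual)
    return dict(populations)
```

For example, passing an open file handle for the population map returns a dictionary such as `{"popA": ["ind_01", "ind_02"], "popB": ["ind_03"]}`, where the sample and population names are made up.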
   
Data Filtering: 

   q: maximum observed heterozygosity. 
   
   r: minimum percentage of individuals in a population required to process a locus for that population. 
   
   p: minimum number of populations a locus must be present in to process a locus. 
   
   m: specify a minimum stack depth required for individuals at a locus. 
   
   a: specify a minimum minor allele frequency required to process a nucleotide site at a locus (0 < a < 0.5). 
   
   c: filter loci with log likelihood values below this threshold. 
   
   C: minimum percentage of similarity between loci to cluster.
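
As an illustration of how the options fit together, the following sketch assembles a pmerge command line from Python. It uses only the flags shown in the usage synopsis above. The helper name, the paths, and the numeric values are placeholders, and the expected scale for -r and -C (fraction or percentage) is not stated here, so check it before running.

```python
import shlex


def build_pmerge_command(batch_id, stacks_dir, popmap,
                         r=None, m=None, cluster=None, threads=None):
    """Build an argument list for pmerge from the documented flags.

    Only -b, -P, -M, -r, -m, -C and -t are used. Optional flags are
    added only when a value is supplied.
    """
    cmd = ["pmerge", "-b", str(batch_id), "-P", str(stacks_dir), "-M", str(popmap)]
    optional = (("-r", r), ("-m", m), ("-C", cluster), ("-t", threads))
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths and values; this prints the command without running it.
    command = build_pmerge_command(1, "./stacks_out", "./popmap.tsv", m=3, threads=4)
    print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` once the paths and thresholds have been replaced with real values.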
