BenjaminDoran/unidip

UniDip Python Port

See reference paper: http://www.kdd.org/kdd2016/subtopic/view/skinny-dip-clustering-in-a-sea-of-noise

UniDip is a noise-robust clustering algorithm for one-dimensional numeric data. It recursively extracts peaks of density from the data using the Hartigan dip test of unimodality.

Install

From the command line, in the installation directory:

cd <path to UniDip installation directory> 
python setup.py install

Installation via pip (this will not work with newer versions of Python and NumPy):

pip3.6 install unidip

Examples

Basic Usage

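import numpy as np
from unidip import UniDip

# create a bimodal distribution: two normal clusters centered at -3 and +3
dat = np.concatenate([np.random.randn(200) - 3, np.random.randn(200) + 3])

# sort data so the returned indices are meaningful
# (np.msort was removed in NumPy 2.0; np.sort is equivalent for 1-D arrays)
dat = np.sort(dat)

# get start and stop indices of peaks
intervals = UniDip(dat).run()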
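
Because the returned indices refer to positions in the sorted data, each interval can be turned back into a cluster of values by slicing. The following is a minimal sketch of that step, not part of the library: the `intervals` values are hypothetical placeholders rather than real UniDip output, `slice_clusters` is an illustrative helper, and treating the stop index as inclusive is an assumption. It uses only the standard library so it runs without NumPy or unidip installed.

```python
import random

random.seed(0)

# Bimodal sample, sorted so that index ranges map to contiguous value ranges
dat = sorted(
    [random.gauss(-3, 1) for _ in range(200)]
    + [random.gauss(3, 1) for _ in range(200)]
)

# Hypothetical stand-in for the result of UniDip(dat).run(); real values will differ
intervals = [(0, 199), (200, 399)]


def slice_clusters(data, intervals):
    """Return one list of values per (start, stop) pair.

    Assumption: stop is treated as an inclusive index.
    """
    return [data[start:stop + 1] for start, stop in intervals]


clusters = slice_clusters(dat, intervals)
for (start, stop), values in zip(intervals, clusters):
    print(f"indices {start}-{stop}: {len(values)} points, "
          f"values from {values[0]:.2f} to {values[-1]:.2f}")
```

If the data were left unsorted, the same index pairs would select arbitrary points rather than a contiguous range of values, which is why the sort step matters.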

Advanced Options

  • alpha: significance threshold (p-value) that controls sensitivity. Default is 0.05. Increase it to isolate more peaks with less confidence, or decrease it to isolate only the peaks that are least likely to be noise.
  • mrg_dst: Defines how close intervals must be before they are merged.
  • ntrials: number of trials run in the Hartigan dip test. More trials add confidence but take longer.
intervals = UniDip(dat, alpha=0.001, ntrials=1000, mrg_dst=5).run()
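
To give a feel for what a merge distance does, here is a rough sketch of merging nearby index intervals. It is an illustration only and not the library's implementation: `merge_close_intervals` is a hypothetical helper, and it assumes the gap is measured in index positions and that two intervals merge when the gap is at most `mrg_dst`.

```python
def merge_close_intervals(intervals, mrg_dst):
    """Merge (start, stop) index intervals whose gap is at most mrg_dst.

    Hypothetical helper for illustration; UniDip's own merging rule may differ.
    """
    merged = []
    for start, stop in sorted(intervals):
        if merged and start - merged[-1][1] <= mrg_dst:
            # Close enough to the previous interval: extend it
            prev_start, prev_stop = merged[-1]
            merged[-1] = (prev_start, max(prev_stop, stop))
        else:
            # Too far away: start a new interval
            merged.append((start, stop))
    return merged


example = [(0, 90), (93, 180), (300, 399)]
print(merge_close_intervals(example, mrg_dst=5))  # [(0, 180), (300, 399)]
print(merge_close_intervals(example, mrg_dst=1))  # [(0, 90), (93, 180), (300, 399)]
```

Under these assumptions, a larger merge distance yields fewer, wider intervals, while a smaller one keeps closely spaced peaks separate.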