MHAP v1.5b1
Major updates:
- Eliminated repetitive k-mer filtering in index lookup; repetitive k-mers are now down-weighted instead of being filtered out.
- Increased performance of the ordered k-mer second-stage filter.
Changelog:
- Implemented weighted (discretized tf-idf) MinHashing in first-stage filter.
- Random subsampling in second-stage filter.
- k-mer size is now unlimited.
- Reduced memory footprint and disk footprint of binary sketch representation, allowing a larger set of sequences to fit in memory.
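
The weighted MinHashing entry above can be illustrated with a minimal sketch. This is not MHAP's implementation: the function names, the weight formula, the `max_weight` cap, and the hash choice are all assumptions made for illustration. It shows the general idea of integer-weighted MinHash by replication, where a k-mer with discretized tf-idf weight w contributes w hashed copies, so repetitive k-mers (low weight) are less likely to supply a sketch minimum.

```python
import hashlib
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hash64(text, seed):
    """Seeded 64-bit hash (hypothetical choice; any good hash family works)."""
    digest = hashlib.blake2b(
        text.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def discretized_weight(tf, doc_freq, total_docs, max_weight=3):
    """Assumed weighting: round tf * idf to an integer in [1, max_weight]."""
    idf = math.log(1 + total_docs / doc_freq)
    return max(1, min(max_weight, round(tf * idf)))


def weighted_minhash(seq, k, num_hashes, doc_freq, total_docs, max_weight=3):
    """Sketch seq with one minimum per hash function.

    doc_freq maps a k-mer to how many sequences contain it (assumed input,
    e.g. derived from a repeat k-mer count file); unknown k-mers count as 1.
    """
    counts = Counter(kmers(seq, k))
    if not counts:
        raise ValueError("sequence is shorter than k")
    weights = {
        kmer: discretized_weight(tf, doc_freq.get(kmer, 1), total_docs, max_weight)
        for kmer, tf in counts.items()
    }
    sketch = []
    for seed in range(num_hashes):
        # Each k-mer is replicated `weight` times; take the overall minimum.
        sketch.append(
            min(
                hash64(f"{kmer}:{copy}", seed)
                for kmer, weight in weights.items()
                for copy in range(weight)
            )
        )
    return sketch


def sketch_similarity(a, b):
    """Fraction of sketch slots that agree (estimates weighted Jaccard)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Under these assumptions, a k-mer present in every sequence gets weight 1, while a rare k-mer is capped at `max_weight`, which is how down-weighting replaces outright filtering in this sketch.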
Known Issues:
- If no repeat k-mer filter is specified, MHAP will use an experimental implementation of a count-min sketch to identify repeat k-mers and down-weight them. This option has not been fully tested and may not always work. Users should always specify a filter file using the -f option.
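
For readers unfamiliar with the count-min sketch mentioned above, the following is a minimal, generic sketch of the data structure. It is not MHAP's experimental implementation; the class name, the `width` and `depth` parameters, and the hash function are assumptions for illustration. The structure keeps approximate k-mer counts in fixed memory and can overestimate a count but never underestimate it.

```python
import hashlib


class CountMinSketch:
    """Generic count-min sketch: depth rows of width counters each."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _column(self, item, row):
        # One independent-looking hash per row, selected by the salt.
        digest = hashlib.blake2b(
            item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        """Increment one counter per row for this item."""
        for row in range(self.depth):
            self.table[row][self._column(item, row)] += count

    def estimate(self, item):
        """Return the minimum counter across rows (an upper bound on the true count)."""
        return min(
            self.table[row][self._column(item, row)] for row in range(self.depth)
        )
```

A k-mer whose estimated count exceeds some chosen threshold could then be treated as a repeat and down-weighted; because collisions only inflate counts, such a scheme may occasionally flag a non-repeat k-mer.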
Please see documentation at http://mhap.readthedocs.org/en/