Releases · muellan/metacache

29 Oct 09:49

muellan

v0.8.0

0cb917a

MetaCache v0.8.0

New feature "Coverage Filter"

The option -cov-percentile <p> discards the hit targets (reference genomes) whose coverage falls within the lowest p-th percentile. A first pass maps the queries (reads) to the targets as usual. The actual classification is then done in a second pass using only the remaining hit targets.

This leads to a very small increase in runtime and memory consumption, but can improve accuracy by detecting and removing stray false-positive hits.

The coverage filter is deactivated by default.
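
The two-pass idea can be illustrated with a minimal Python sketch. It is not MetaCache's implementation: the function names, the data layout (a dict of per-target coverage values and a dict of per-read hit counts) and the rule for how many targets fall into the lowest p-th percentile (rounding down) are assumptions made for illustration only.

```python
def filter_targets_by_coverage(coverage, percentile):
    """Return the targets that survive the coverage filter.

    coverage:   dict mapping target name -> coverage value (hypothetical layout)
    percentile: share of targets (0-100) to drop, lowest coverage first
    """
    ranked = sorted(coverage, key=coverage.get)      # lowest coverage first
    n_drop = int(len(ranked) * percentile / 100)     # assumption: round down
    return set(ranked[n_drop:])


def classify(read_hits, kept_targets):
    """Second pass: pick the best remaining target for every read.

    read_hits: dict mapping read id -> {target name: hit count}
    Reads whose hits were all filtered out are left unclassified (None).
    """
    result = {}
    for read, hits in read_hits.items():
        remaining = {t: c for t, c in hits.items() if t in kept_targets}
        result[read] = max(remaining, key=remaining.get) if remaining else None
    return result


# First pass (assumed to have happened already): coverage and hits per read.
coverage = {"A": 0.9, "B": 0.8, "C": 0.02, "D": 0.5}
read_hits = {
    "r1": {"A": 5, "C": 1},
    "r2": {"C": 3},
    "r3": {"B": 2, "D": 4},
}

kept = filter_targets_by_coverage(coverage, percentile=25)
calls = classify(read_hits, kept)
```

With four hit targets and p = 25, the single lowest-coverage target ("C") is dropped, so read "r2", which only hit "C", ends up unclassified, while "r1" and "r3" are assigned to "A" and "D".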

Other Changes

improved multi-threading in query mode
improved database format (layout better suited for future loading on GPUs)
code cleanup

16 Oct 07:38

muellan

v0.6.2

8733f05

MetaCache v0.6.2

improved accession number / sequence id parsing
file reading improvements
code cleanup

25 Sep 12:30

muellan

v0.6.1

e749200

MetaCache v0.6.1

improved database building performance (~30-50% speedup)
improved taxonomic id assignment during build: global assembly_summary files can now also be used (default: "assembly_summary_refseq.txt", "assembly_summary_refseq_historical.txt", "assembly_summary_genbank.txt", "assembly_summary_genbank_historical.txt" in the taxonomy folder)
the download-ncbi-taxonomy script downloads "assembly_summary_refseq.txt" and "assembly_summary_refseq_historical.txt" by default now
code cleanup

09 May 08:44

muellan

v0.5.3

d8f1f87

MetaCache v0.5.3

improvements to abundances output
some code cleanup

03 May 07:52

muellan

v0.5.2

2253a7c

MetaCache v0.5.2

fixed "abundance estimation not working if lowest classification level is above sequence level"
some code reorganization

29 Apr 11:56

muellan

v0.5.1

ccffbd4

MetaCache v0.5.1

It is now possible to use the "root" level as the highest taxonomic classification level. This is needed for some abundance estimation postprocessing tasks. The default for the highest level remains "domain".
improved database I/O performance
database files on disk are up to 15% smaller now
small fixes

23 Oct 14:00

muellan

v0.5.0

6c98408

MetaCache v0.5.0

New merge mode for merging results of multiple, independent queries. This can be used to save memory by splitting up the set of reference genomes into several databases. These can then be queried in succession and the results can be merged to obtain a classification based on the whole set of reference genomes.

    ./metacache query 1.db reads.fa -tophits -queryids -lowest species -out res1.txt 
    ./metacache query 2.db reads.fa -tophits -queryids -lowest species -out res2.txt
    ./metacache query 3.db reads.fa -tophits -queryids -lowest species -out res3.txt
    ./metacache merge res1.txt res2.txt res3.txt -taxonomy ncbi_taxonomy -out res1+2+3.txt

tweaked classification algorithm in case of multiple equally good matches in several targets
small fixes
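
The idea behind merging results from several partial databases (v0.5.0 merge mode) can be pictured with a minimal Python sketch. It does not reproduce MetaCache's result file format or its taxonomy-aware classification rule: the function name, the in-memory layout (one dict per database, mapping a query id to a list of (taxon, hit count) candidates) and the "highest hit count wins" rule are all assumptions made for illustration.

```python
def merge_top_hits(result_sets):
    """Combine per-database candidate lists and pick one taxon per query.

    result_sets: list of dicts, one per database, each mapping
                 query id -> list of (taxon, hit_count) tuples (hypothetical layout)
    Returns a dict mapping query id -> best taxon, or None if no database hit.
    """
    pooled = {}
    for results in result_sets:
        for query_id, candidates in results.items():
            # Pool the candidates of the same query across all databases.
            pooled.setdefault(query_id, []).extend(candidates)

    merged = {}
    for query_id, candidates in pooled.items():
        if candidates:
            # Assumption: the candidate with the most hits wins outright.
            merged[query_id] = max(candidates, key=lambda c: c[1])[0]
        else:
            merged[query_id] = None
    return merged


# Results of querying the same reads against two partial databases.
res1 = {"q1": [("Escherichia coli", 12)], "q2": []}
res2 = {"q1": [("Salmonella enterica", 30)], "q2": [("Bacillus subtilis", 4)]}

merged = merge_top_hits([res1, res2])
```

Here "q1" is assigned to the candidate from the second database because it has more hits, and "q2" is classified even though the first database produced no candidate for it. This is why the queries in the example above are run with -tophits and -queryids: the merge needs the per-query candidates from every database, not only the final call.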

02 Oct 13:33

muellan

v0.4.0

7b0ae01

MetaCache v0.4.0

added per-taxon abundance summary (-abundances) and per-rank abundance estimation (-abundance-per <rank>)
Simplified internal classification scheme. Attention: with the default classification threshold, this now tends to favor precision a little more than before. You can lower the threshold (-hitsmin) to gain sensitivity at the expense of precision.
performance improvements

18 Sep 11:06

muellan

v0.3.4

bbba5ce

MetaCache v0.3.4

improved query performance

12 Sep 12:43

muellan

v0.3.3

50dbe3b

MetaCache v0.3.3

small performance improvements
code simplification
fixes
