JasperGeh 02-04-2020 05:56 AM

Human read filtering with Kraken 2
Hey everyone,

I was wondering something regarding the usage of Kraken 2 for read filtering.
I sequence/assemble viruses and inherited a pipeline in which human reads were removed by a simple Bowtie2 mapping to hg19.
While doing some metagenomic analysis of a sample with Kraken 2, I noticed how fast it was, so I built a Kraken database containing just the human genome and tried filtering my reads with that. I tested it on a sample of around 2 million reads that we had already assembled.
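
For anyone who wants to try the same thing, here is a rough sketch of what such a filtering step could look like. It is written as Python that only assembles the command lines and prints them; nothing is executed. The database name `human_db`, the file names, the thread count, and the assumption of paired-end reads are placeholders for illustration, not my actual setup. The flags are the ones I understand from the Kraken 2 manual, so please double-check them against your installed version.

```python
import shlex
from typing import List


def build_db_commands(db: str, threads: int = 8) -> List[List[str]]:
    """Commands to build a Kraken 2 database that holds only the human genome.

    `db` is a hypothetical directory name chosen for this example.
    """
    return [
        ["kraken2-build", "--download-taxonomy", "--db", db],
        ["kraken2-build", "--download-library", "human", "--db", db],
        ["kraken2-build", "--build", "--db", db, "--threads", str(threads)],
    ]


def filter_command(db: str, r1: str, r2: str, out_prefix: str,
                   threads: int = 8) -> List[str]:
    """Command that keeps only the reads Kraken 2 could NOT classify as human.

    Assumes paired-end FASTQ input. With --paired, the '#' in the output
    name is replaced by _1 and _2 for the two mates.
    """
    return [
        "kraken2",
        "--db", db,
        "--threads", str(threads),
        "--paired",
        "--unclassified-out", f"{out_prefix}#.fq",  # non-human reads end up here
        "--output", "-",  # suppress the per-read classification output
        r1, r2,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in build_db_commands("human_db"):
        print(shlex.join(cmd))
    print(shlex.join(
        filter_command("human_db", "sample_R1.fq", "sample_R2.fq", "nonhuman")
    ))
```

The unclassified output is what goes on to assembly; the taxonomic report is simply not requested here since it isn't needed for filtering.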

Filtering with Kraken was ~10x faster, and both methods removed roughly half of the reads. Kraken, however, removed ~6k more reads, ~2k of which seemed to be viral (985k of the remaining 1028k reads mapped to the assembled virus consensus, compared to 987k of the remaining 1034k reads with Bowtie).
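
To put those numbers side by side, here is a quick back-of-the-envelope calculation in Python. It only uses the rounded counts quoted above (in thousands of reads), so treat the resulting percentages as approximate.

```python
# Rounded read counts from the comparison above, in thousands of reads.
runs = {
    "kraken2": {"remaining": 1028, "viral": 985},
    "bowtie2": {"remaining": 1034, "viral": 987},
}


def viral_fraction(run: dict) -> float:
    """Fraction of the reads left after filtering that map to the virus consensus."""
    return run["viral"] / run["remaining"]


def non_viral_left(run: dict) -> int:
    """Reads (in thousands) left after filtering that do NOT map to the consensus."""
    return run["remaining"] - run["viral"]


# How many more reads Kraken 2 removed than Bowtie2, split into viral and other.
extra_removed = runs["bowtie2"]["remaining"] - runs["kraken2"]["remaining"]
extra_viral = runs["bowtie2"]["viral"] - runs["kraken2"]["viral"]
extra_other = extra_removed - extra_viral

if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name}: {viral_fraction(run):.1%} of remaining reads are viral, "
              f"{non_viral_left(run)}k are not")
    print(f"Kraken 2 removed {extra_removed}k more reads: "
          f"{extra_viral}k viral, {extra_other}k other")
    print(f"Viral reads lost relative to Bowtie2: "
          f"{extra_viral / runs['bowtie2']['viral']:.2%}")
```

So of the extra ~6k reads Kraken removed, about a third look viral and the rest do not, and the filtered set ends up slightly more enriched for the virus.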

I looked online but found no one who (mis)uses Kraken that way. The database is even smaller than the indexed hg19 for Bowtie. Am I missing something? :D At least, I feel a little bad for Kraken (if I'm allowed to anthropomorphise a bit) as it provides me with all this taxonomic data which I'm completely ignoring.

GenoMax 02-04-2020 06:00 AM

Have you looked at this:

JasperGeh 02-05-2020 11:36 PM

Yes, I have stumbled upon that but haven't tried it yet. Is that the "gold standard" for human read removal, if there is such a thing?

All times are GMT -8.
