Hello all!
I am analyzing paired-end shotgun metagenomic sequences from environmental samples, generated on an Illumina HiSeq 4000. I am new to shotgun metagenomic data, but I have experience analyzing 16S data.
The reads are 150 nt long, and most fragment sizes range from 280 to 700 bp. A few samples have fragment sizes ranging from 80 to 600 bp.
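As a rough sanity check on how many pairs could overlap at all with these numbers, here is a minimal Python sketch. It assumes the fragment sizes above are insert sizes without adapters, and the 15 nt minimum overlap is a placeholder I picked for illustration, not a default taken from any merging tool. The function names are hypothetical.

```python
def mate_overlap(read_len: int, fragment_len: int) -> int:
    """Bases covered by both mates of a pair (0 if the mates do not overlap)."""
    # A read cannot cover more of the fragment than the fragment's own length.
    covered_by_each = min(read_len, fragment_len)
    return max(0, 2 * covered_by_each - fragment_len)


def is_mergeable(read_len: int, fragment_len: int, min_overlap: int = 15) -> bool:
    """True if the mates overlap by at least min_overlap bases (assumed threshold)."""
    return mate_overlap(read_len, fragment_len) >= min_overlap


if __name__ == "__main__":
    # Fragment sizes taken from the ranges quoted above, with 150 nt reads.
    for fragment_len in (80, 280, 600, 700):
        print(fragment_len, mate_overlap(150, fragment_len), is_mergeable(150, fragment_len))
```

Under these assumptions, 2 x 150 nt reads only overlap when the fragment is shorter than 300 bp, so most of a 280-700 bp library would not be expected to merge.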
I am using illumina-utils to quality filter the reads before de novo assembly, specifically the iu-filter-quality-minoche script (see here for more info: https://github.com/merenlab/illumina-utils).
So far, approximately 68% of read pairs (both R1 and R2) pass the QC parameters, while 32% fail (94% of the failures are due to R2).
Here are my questions:
Is this failure rate, and the share of it caused by read 2, normal?
Should I quality filter the reads before merging them, given that only about 20% can be merged?
Can I use both merged reads and unmerged R1 and R2 for de novo assembly with MEGAHIT?
Thanks for the help!
Any guidance would be appreciated!