SEQanswers

yksikaksi 04-18-2012 05:55 AM

What is the difference between pre-filtering reads and removing PCR duplicates from mapped reads?
 
Hello,

Is it a must to perform pre-filtering on color space reads before mapping?

Would it make a big difference in downstream analysis if I mapped the SOLiD color space reads without pre-filtering, but removed the PCR duplicates from the mapped reads with the Picard tool?

Could anyone kindly share their opinion with me?

Thank you. Have a nice day.

kopi-o 04-18-2012 09:17 AM

I think you first have to clarify more precisely what you mean by pre-filtering, before we can answer.

yksikaksi 04-19-2012 12:12 AM

Sorry for the unclear question.

The pre-filtering I refer to here is inspecting the reads with tools such as FastQC or Fastx and then trimming out the so-called bad bases before mapping the reads.

The problem is that FastQC and Fastx were developed to handle reads generated on the Illumina and 454 platforms. Color space (csfasta) reads can't be imported directly into these tools.

Some Perl conversion scripts also have problems when converting csfasta to fastq.

Thanks.

kopi-o 04-19-2012 12:21 AM

OK. So, pre-filtering (I would call it quality filtering) is distinct from duplicate removal and you can think of them as independent filtering steps.
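
To make the distinction concrete, here is a rough Python sketch of what duplicate removal does on reads that are already mapped. It is only an illustration of the general idea, not Picard's actual implementation: the `MappedRead` record and the `remove_duplicates` function are made-up names, and the sketch assumes single-end reads for which the 5' mapping position and strand are already known. Note that it never rejects a read for low quality; it only collapses reads that start at the same place.

```python
from collections import namedtuple

# Hypothetical minimal record for one mapped single-end read.
MappedRead = namedtuple(
    "MappedRead", "name chrom five_prime_pos strand base_quals"
)


def remove_duplicates(reads):
    """Keep one read per (chrom, 5' position, strand).

    Among reads sharing a key, the one with the highest summed base
    quality is kept. Survivors are returned in the order in which
    their key was first seen.
    """
    best = {}
    for read in reads:
        key = (read.chrom, read.five_prime_pos, read.strand)
        current = best.get(key)
        if current is None or sum(read.base_quals) > sum(current.base_quals):
            best[key] = read
    return list(best.values())


# Toy input: r1 and r2 start at the same position on the same strand.
example_reads = [
    MappedRead("r1", "chr1", 100, "+", [20, 20, 20]),
    MappedRead("r2", "chr1", 100, "+", [30, 30, 30]),
    MappedRead("r3", "chr1", 100, "-", [10, 10, 10]),
    MappedRead("r4", "chr2", 100, "+", [25, 25, 25]),
]
unique_reads = remove_duplicates(example_reads)
```

In this toy input only r1 is dropped, because r2 maps to the same start and strand with higher summed quality. The low-quality read r3 survives, which is why duplicate removal cannot stand in for quality filtering.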

For SOLiD specific quality filtering, and looking at the data in a somewhat similar way to FastQC, I have used this toolkit: http://hts.rutgers.edu/filter/
Then you don't need to convert to FASTQ.
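
As a rough idea of what filtering directly in color space can look like, here is a minimal Python sketch. It is not how the toolkit above works. It assumes the usual layout of a .csfasta file (a `>` header line, then a primer base followed by color calls 0-3, with `.` for a missing call) and a matching .qual file (same headers in the same order, then space-separated quality values). The function names and the thresholds (`min_mean_qv=15`, `max_missing=0`) are arbitrary choices for illustration.

```python
from statistics import mean


def read_records(lines):
    """Yield (read_id, payload) pairs from csfasta- or qual-style lines.

    Lines starting with '#' are skipped as comments; a line starting
    with '>' names the record whose payload is on the next line.
    """
    read_id = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            read_id = line[1:]
        elif read_id is not None:
            yield read_id, line
            read_id = None


def keep_read(colors, quals, min_mean_qv=15, max_missing=0):
    """Return True if a read passes the illustrative thresholds."""
    if not quals:
        return False
    missing = colors[1:].count(".")  # colors[0] is the primer base
    return missing <= max_missing and mean(quals) >= min_mean_qv


def filter_reads(csfasta_lines, qual_lines, **thresholds):
    """Yield (read_id, colors, quals) for reads that pass keep_read."""
    pairs = zip(read_records(csfasta_lines), read_records(qual_lines))
    for (cs_id, colors), (qv_id, qv_text) in pairs:
        if cs_id != qv_id:
            raise ValueError(f"read order differs: {cs_id} vs {qv_id}")
        quals = [int(q) for q in qv_text.split()]
        if keep_read(colors, quals, **thresholds):
            yield cs_id, colors, quals
```

With real data you would pass two open file handles, for example `filter_reads(open("reads.csfasta"), open("reads_QV.qual"))` (file names are placeholders), and write the surviving reads back out as a csfasta/qual pair for the mapper.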

For some types of analysis (e.g. ChIP-seq, RNA-seq), you may not need to do quality filtering. The bad reads will, in general, simply fail to map. For de novo assembly, or resequencing where variant calling is important, you should do quality filtering.

yksikaksi 04-19-2012 12:47 AM

Thanks, kopi-o.

I'm doing RNA-seq analysis on the SOLiD platform. I have read that carrying out quality filtering (which some people also call pre-filtering) before mapping is recommended. However, most of that advice covers Illumina and 454 reads, and there is not much on how to deal with SOLiD csfasta reads.

Thanks for the information. It is useful!


All times are GMT -8.