**kopi-o** · 04-18-2012, 08:17 AM

I think you first have to clarify more precisely what you mean by pre-filtering, before we can answer.

**yksikaksi** · 04-18-2012, 11:12 PM

Sorry for the unclear question.

The pre-filtering I refer to here is inspecting the reads with tools such as FastQC or FASTX and then trimming the so-called bad bases before mapping the reads.

The problem is that FastQC and FASTX were developed to handle reads generated on the Illumina and 454 platforms. Color-space (csfasta) reads can't be imported directly into these tools.

Some Perl conversion scripts also have problems when converting csfasta to FASTQ.

Thanks.

**kopi-o** · 04-18-2012, 11:21 PM

OK. So, pre-filtering (I would call it quality filtering) is distinct from duplicate removal and you can think of them as independent filtering steps.

For SOLiD specific quality filtering, and looking at the data in a somewhat similar way to FastQC, I have used this toolkit: http://hts.rutgers.edu/filter/
Then you don't need to convert to FASTQ.

For some types of analysis (e.g. ChIP-seq, RNA-seq), you may not need to do quality filtering. The bad reads will (in general) simply fail to map. For de novo assembly, or resequencing where variant calling is important, you should do quality filtering.
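
To make the idea of filtering without converting to FASTQ concrete, here is a minimal Python sketch that works directly on a csfasta file and its matching qual file. It is not the toolkit linked above and does not reproduce its method. The function names, the mean-quality cutoff of 15, and the choice to count a quality value of -1 as 0 are all assumptions made for illustration; the sketch also assumes both files list the reads in the same order, with one sequence or quality line per read.

```python
from typing import Iterable, Iterator, List, Tuple


def read_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, body) pairs from csfasta- or qual-style lines.

    Lines starting with '#' are treated as header comments and skipped.
    Assumes each record is one '>' line followed by one body line.
    """
    read_id = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            read_id = line[1:]
        elif read_id is not None:
            yield read_id, line
            read_id = None


def mean_quality(qual_line: str) -> float:
    """Mean of the space-separated quality values on one qual line.

    Negative values (assumed here to mark missing color calls) count as 0.
    """
    values = [max(int(q), 0) for q in qual_line.split()]
    return sum(values) / len(values) if values else 0.0


def filter_reads(
    csfasta_lines: Iterable[str],
    qual_lines: Iterable[str],
    min_mean_q: float = 15.0,  # hypothetical cutoff, tune for your data
) -> List[Tuple[str, str, str]]:
    """Keep (read_id, color_sequence, qualities) with mean quality >= cutoff."""
    kept = []
    pairs = zip(read_records(csfasta_lines), read_records(qual_lines))
    for (seq_id, seq), (qual_id, qual) in pairs:
        if seq_id != qual_id:
            raise ValueError(f"read order mismatch: {seq_id} vs {qual_id}")
        if mean_quality(qual) >= min_mean_q:
            kept.append((seq_id, seq, qual))
    return kept
```

With real data you would pass open file handles, for example `filter_reads(open("reads.csfasta"), open("reads_QV.qual"))` (file names hypothetical), and write the kept reads back out as a csfasta/qual pair for the mapper. Duplicate removal would still be a separate step, typically done after mapping.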

**yksikaksi** · 04-18-2012, 11:47 PM

Thanks, kopi-o.

I'm doing RNA-seq analysis on the SOLiD platform. I have read that quality filtering (which some people also call pre-filtering) before mapping is recommended. However, there is not much information on how to deal with SOLiD csfasta reads, compared with Illumina and 454 reads.

Thanks for the information. It is useful!

What is the difference between pre-filtering and removing PCR duplicates from mapped reads?
