
**kmcarr** · 08-15-2012, 02:07 PM

This was a problem when Illumina first released CASAVA 1.8 (or was it 1.7, I can't remember). The default behavior, with no way to bypass it, was to mix passed (:N:) and failed (:Y:) reads in the output file. Here is the recommendation from Illumina on how to filter failed reads from the file:

Code:

grep -A3 '^@.* [^:]*:N:[^:]*:' your_input_file | grep -ve '^--$' > your_output_file

The first grep statement searches for headers with :N: in the appropriate place and prints that line plus the 3 following lines (sequence, qual header and qual). The second grep statement removes the '--' lines which the first grep inserts between blocks of matches.
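For comparison, the same filtering can be sketched with awk by treating the file as 4-line records instead of relying on grep context lines. This is only an illustration, not the Illumina recommendation: `filter_passed` is a made-up helper name, and it assumes every record is exactly four lines and that the header has the CASAVA 1.8 layout with a space before the read:filter:control:index part.

```shell
# Keep only reads whose CASAVA 1.8 header carries the "N" (not filtered) flag.
# filter_passed is a hypothetical helper name, not part of any toolkit.
# Assumes strict 4-line FASTQ records (no wrapped sequence lines).
filter_passed() {
    awk '
        NR % 4 == 1 {
            # Second space-separated field looks like 1:N:0:ACGT
            # (read number : filter flag : control bits : index).
            split($2, f, ":")
            keep = (f[2] == "N")
        }
        # Print the header and the three lines after it while keep is true.
        keep
    ' "$1"
}
```

It would be called as `filter_passed your_input_file > your_output_file`. Because the decision is made once per record, no '--' separator lines appear and no second pass is needed.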

**Chirag** · 08-16-2012, 01:22 AM

Thank you Kmcarr !!

It removed all the sequences that had the bad QC flag.

# First, I filtered:
grep -A3 '^@.* [^:]*:N:[^:]*:' Embryo_R1.fastq | grep -ve '^--$' > Emb_R1.fastq

# Check whether any reads still carry the bad flag
cat Emb_R1.fastq | grep :Y | wc -l
0 [None]
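A slightly stricter version of this check would look at header lines only, so that a ':Y' that happens to occur inside a quality string cannot be counted. The sketch below is one way to do that; `count_filter_flags` is a made-up name, and it assumes 4-line records with CASAVA 1.8 style headers.

```shell
# Tally the filter flag (N = passed, Y = failed) over header lines only.
# count_filter_flags is a hypothetical helper name.
# Assumes strict 4-line FASTQ records with CASAVA 1.8 headers.
count_filter_flags() {
    awk '
        NR % 4 == 1 {
            # $2 looks like 1:N:0:ACGT; the second piece is the filter flag.
            split($2, f, ":")
            count[f[2]]++
        }
        END {
            # Flags that never occurred print as 0.
            printf "N\t%d\nY\t%d\n", count["N"], count["Y"]
        }
    ' "$1"
}
```

Running `count_filter_flags Emb_R1.fastq` on the filtered file should report a Y count of 0 if the filtering worked.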

Then I tried to filter these raw reads using fastq_quality_filter:

fastq_quality_filter -i Emb_R1.fastq -o Test.fastq
fastq_quality_filter: Invalid quality score value (char '#' ord 35 quality value -29) on line 12

How can I solve this error?

Thank you !

**Krish_143** · 08-16-2012, 02:53 AM


http://code.google.com/p/condetri/

Check this out. It does quality filtering and PCR duplicate removal.

**kmcarr** · 08-16-2012, 03:09 AM

Originally posted by Chirag

Then I tried to filter these raw reads using fastq_quality_filter:

fastq_quality_filter -i Emb_R1.fastq -o Test.fastq
fastq_quality_filter: Invalid quality score value (char '#' ord 35 quality value -29) on line 12

How can I solve this error?

Thank you !

This has to do with the way the quality scores are encoded in your fastq file, that is, whether the character offset is phred+33 or phred+64. (Check out the Wikipedia article for a detailed explanation.) The error message shows the mismatch: '#' is ASCII 35, and 35 - 64 = -29, which is not a valid quality value, whereas 35 - 33 = 2. The Fastx toolkit programs default to the assumption that the encoding is phred+64, but Illumina now uses phred+33. You need to tell fastq_quality_filter to use 33.

Code:

fastq_quality_filter -Q33 -i Emb_R1.fastq -o Test.fastq
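If you are unsure which offset a file uses, a rough heuristic is to look at the lowest quality character in the first few thousand records: anything below ';' (ASCII 59) cannot occur in the +64 encodings. The sketch below does that with awk. `guess_phred_offset` is a made-up name, the 10,000-record sample size is arbitrary, and an answer of 64 only means the sample is consistent with phred+64, not that it is proven.

```shell
# Guess the quality offset from the lowest quality character seen.
# guess_phred_offset is a hypothetical helper name; this is a heuristic only.
# Assumes strict 4-line FASTQ records.
guess_phred_offset() {
    awk '
        BEGIN {
            # Build a character -> ASCII code lookup for printable characters.
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
            min = 127
        }
        NR % 4 == 0 {
            # Every fourth line is a quality string; track its lowest character.
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = ord[substr($0, i, 1)]
                if (c != "" && c < min) min = c
            }
        }
        NR >= 40000 { exit }   # sample the first 10,000 records only
        END {
            if (min == 127)     print "unknown"   # no quality characters seen
            else if (min < 59)  print 33          # too low for any +64 encoding
            else                print 64          # consistent with +64, not proof
        }
    ' "$1"
}
```

For the file in this thread, the '#' characters (ASCII 35) would make it print 33, which matches the -Q33 option shown above.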

**Chirag** · 08-16-2012, 06:11 AM

Thank you very much !!!

