**JackieBadger** · 08-27-2013, 07:01 PM

"TRAILING:30" will work.
However, Q20 is the most widely accepted threshold (99% base call accuracy, if I remember correctly), although this probably stems from the wide use of 454 in the early days of NGS. Q30 (99.9%) is a good minimum for Illumina, but you could justify keeping any bases >Q20.
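
The percentages above follow from the standard Phred definition, Q = -10 * log10(P_error). A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def phred_to_accuracy(q):
    """Convert a Phred quality score to the probability that the base call is correct."""
    error_prob = 10 ** (-q / 10)  # Q = -10 * log10(P_error), solved for P_error
    return 1.0 - error_prob


for q in (20, 30):
    print(f"Q{q}: {phred_to_accuracy(q):.1%} accuracy")
```

So each step of 10 in Q means a tenfold drop in the expected error rate.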

Take a look here: http://en.wikipedia.org/wiki/FASTQ_format
Were your samples run on a HiSeq?
If there is any confusion (your seq IDs may have been edited without you knowing, for example), you can figure out the Phred encoding from the characters used in your quality data.
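
A rough sketch of that check, assuming Illumina-style data where Phred+33 qualities top out at 'J' (Q41) and no +64 scheme goes below ';' (ASCII 59, the Solexa minimum). The function name and the cut-offs are illustrative assumptions, not part of any tool:

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from an iterable of FASTQ quality strings.

    Returns None when the characters seen are compatible with both encodings.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' cannot occur in any +64 scheme.
        return 33
    if highest > 74:
        # Characters above 'J' would exceed Q41 under +33 (assumed cap).
        return 64
    return None


print(guess_phred_offset(["IIII#5"]))   # contains '#', so Phred+33
print(guess_phred_offset(["hhhhBBB"]))  # contains 'h', so Phred+64
```

Feed it as many quality lines as practical; a handful of reads can easily fall entirely in the ambiguous range.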

**nicole_01** · 08-27-2013, 09:11 PM

Thanks Jackie.

Yes, my samples were run on a HiSeq 2000. The company just got back to me: supposedly the data was run through the Illumina pipeline v1.5, and the base quality values run from 2 to 41.

I tried to figure out the ASCII codes. With v1.5 (so Phred+64, if I found the right information) I'd have to start looking from 66, because 0 and 1 don't exist anymore and 2 is that weird "B", correct? And that's where I stop understanding it. With v1.5, B is supposed to only happen at the end of a read and means Q<15 without a specific quality value attached, yet looking at the first 100 sequences, about 90% start with a BP/ or BS/. How can that be? Do you think they told me the wrong version number?
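
For reference, a minimal sketch of the decoding described here, assuming the Phred+64 offset (the function name is just for illustration):

```python
def decode_quality(qual, offset=64):
    """Turn a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in qual]


# 'B' is ASCII 66, so under +64 it decodes to Q2.
print(decode_quality("hgfB"))  # [40, 39, 38, 2]

# The reported score range of 2 to 41 maps to the characters 'B' through 'i'.
print(chr(2 + 64), chr(41 + 64))  # B i
```

Under this offset any character below ASCII 64 would decode to a negative number (a "/" is ASCII 47, giving -17), so such characters cannot be valid Phred+64 quality values.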

Thanks

**ddb** · 08-28-2013, 05:39 AM

If it is just the 3rd line header that is the problem with PRINSEQ, then you can use the command line flag

-no_qual_header

Your output should then just show a + in the 3rd line. This seems to be quite a common cause of confusion with the default PRINSEQ output.

**HESmith** · 08-28-2013, 09:54 AM

Nicole,

The sequences are not listed randomly, and the first reads are usually low quality (i.e., lots of Bs). Check the quality scores of reads from the middle of the data set for a more accurate representation of the whole.
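
One way to do that check, sketched under the assumption of plain 4-line FASTQ records (no wrapped sequence lines). The function name and parameters are hypothetical:

```python
from itertools import islice


def fraction_b(lines, skip_reads, n_reads):
    """Fraction of 'B' quality characters in n_reads records, after skipping skip_reads.

    `lines` is any iterable of FASTQ lines, e.g. an open file handle.
    """
    start = skip_reads * 4
    stop = start + n_reads * 4
    total = b_count = 0
    for i, line in enumerate(islice(lines, start, stop)):
        if i % 4 == 3:  # 4th line of each record is the quality string
            qual = line.rstrip("\n")
            total += len(qual)
            b_count += qual.count("B")
    return b_count / total if total else 0.0
```

Comparing something like `fraction_b(open("reads.fastq"), 0, 100)` with the same call using a large `skip_reads` (the file name is a placeholder) shows whether the first reads are representative.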

**nicole_01** · 08-28-2013, 05:12 PM

Thanks HESmith and ddb!


Trimming Illumina PE sequences with Trimmomatic
