
**dpryan** · 02-01-2013, 02:01 AM

1. Maybe they quality controlled prior to uploading?
2. Look for digits in the quality score line. If it contains any, it's Phred+33.
3. As above, look for digits in the quality line. You could look for symbols too, but I always forget which ones come before the capital letters in ASCII and which ones sit between the capital and lowercase letters. Phred+64 and Solexa scores will look similar, except that the latter can also contain the characters just below @ (;, <, =, > and ?), which encode negative Solexa scores.
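The check described above can be sketched in a few lines of Python. This is only a rough heuristic, not a definitive detector: the function name `guess_quality_encoding` and the returned labels are made up for illustration, and the thresholds assume the usual ranges (offset 33 starting at `!`, Solexa+64 starting at `;`, Phred+64 starting at `@`).

```python
def guess_quality_encoding(quality_strings):
    """Guess the encoding from the lowest character seen in the quality lines.

    Hypothetical helper; raises ValueError if no characters are supplied.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    if lowest < 59:
        # Digits (ASCII 48-57) and most punctuation only occur with offset 33.
        return "Phred+33"
    if lowest < 64:
        # ';' to '?' (ASCII 59-63) encode Solexa scores -5 to -1.
        return "Solexa+64"
    # Nothing below '@' was seen. That fits Phred+64, but a Phred+33 file
    # containing only high scores could look the same, so treat as a guess.
    return "Phred+64 (or Solexa+64 with no negative scores)"


print(guess_quality_encoding(["IIII5III"]))
print(guess_quality_encoding(["hhhh=hh"]))
```

The more quality lines you feed it, the more likely a low-scoring base shows up and settles the question.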

**winsettz** · 02-01-2013, 08:16 AM

I'm sure you saw this on the SRA website, but for reference

Fastq acceptable to SRA

fastq has been narrowly defined with the goal of minimizing submission failures. However, this narrow definition may imply interpretation and conversion by the submitter.

Fastq produced by Illumina pipelines is supported.

The following terms are defined in general:

READNAME = Text string terminated by white space.

BASES ::= [ACGTNactgn.]+

QUALITIES ::= [0-9]+ | <quality>\s[0-9]+

or

[\!\"\#\$\%\&\'\*\+,\-\.\/0-9:;<=>\?\@A-I]+

or

[\@A-Z\[\\\]\^_`a-h]+

The permissible fastq format is simply:

@READNAME
BASES
+[READNAME]
QUALITIES
where each instance of READNAME, BASES, and QUALITIES is separated by a newline.
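As a rough illustration of the four-line layout quoted above, here is a minimal Python sketch that walks through records and checks each line. The function name `parse_fastq` is hypothetical, the sketch only handles ASCII-encoded quality strings (not whitespace-separated numeric scores), and the check that the quality string has the same length as the bases is an assumption not stated in the quoted text.

```python
import re

# BASES pattern as quoted in the SRA definition above.
BASES_RE = re.compile(r"[ACGTNactgn.]+")


def parse_fastq(lines):
    """Yield (readname, bases, qualities) tuples from an iterable of lines."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    for i in range(0, len(rows), 4):
        header, bases, plus, quals = rows[i:i + 4]
        if not header.startswith("@"):
            raise ValueError(f"record {i // 4}: header must start with '@'")
        if not BASES_RE.fullmatch(bases):
            raise ValueError(f"record {i // 4}: unexpected base characters")
        if not plus.startswith("+"):
            raise ValueError(f"record {i // 4}: third line must start with '+'")
        # Assumption: one ASCII quality character per base.
        if len(quals) != len(bases):
            raise ValueError(f"record {i // 4}: length mismatch")
        # READNAME is the text up to the first whitespace.
        parts = header[1:].split(maxsplit=1)
        yield (parts[0] if parts else "", bases, quals)


for record in parse_fastq(["@r1 extra\n", "ACGT\n", "+\n", "IIII\n"]):
    print(record)
```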

The QUALITIES string can be whitespace-separated numeric phred scores or an ASCII string of the phred scores, with the ASCII character value = phred score + an offset constant used to place the ASCII characters in the printable character range. There are 2 predominant offsets: 33 (0 = !) and 64 (0 = @).
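To make the offset arithmetic concrete, here is a small Python sketch of decoding and re-encoding a quality string. The helper names `decode_qualities` and `encode_qualities` are made up for this example, and it treats the values as plain phred scores (it does not rescale Solexa scores).

```python
def decode_qualities(quality_string, offset=33):
    """Turn an ASCII quality string into a list of integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def encode_qualities(scores, offset=33):
    """Turn integer scores back into an ASCII quality string."""
    return "".join(chr(score + offset) for score in scores)


# The same scores written with the two common offsets.
print(decode_qualities("!5I", offset=33))  # [0, 20, 40]
print(decode_qualities("@Th", offset=64))  # [0, 20, 40]

# Converting an offset-64 string to offset 33.
print(encode_qualities(decode_qualities("@Th", offset=64), offset=33))  # !5I
```

Decoding with the wrong offset shifts every score by 31, which is why guessing the encoding correctly matters before any quality trimming.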

A "second opinion" of sorts might be gained by feeding your fastqs into fastqc and seeing what it thinks.
(http://www.bioinformatics.babraham.a...ojects/fastqc/)

Edit:

And

FASTQ Format

http://maq.sourceforge.net/fastq.shtml

...Although Solexa/Illumina read file looks pretty much like FASTQ, they are different in that the qualities are scaled differently. In the quality string, if you can see a character with its ASCII code higher than 90, probably your file is in the Solexa/Illumina format.
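The "ASCII code higher than 90" rule of thumb from that page can be sketched as follows. This is only an illustration: the function name `has_offset_64_characters` is hypothetical, and the code assumes strict four-line records with no blank lines, so that every fourth line is a quality string.

```python
def has_offset_64_characters(fastq_lines):
    """Return True if any quality line contains a character above 'Z' (90)."""
    for index, line in enumerate(fastq_lines):
        if index % 4 == 3:  # the 4th line of each record holds the qualities
            if any(ord(ch) > 90 for ch in line.rstrip("\n")):
                return True
    return False


print(has_offset_64_characters(["@r1", "ACGT", "+", "hhhh"]))  # True
print(has_offset_64_characters(["@r1", "ACGT", "+", "IIII"]))  # False
```

A `False` result is weaker evidence than a `True` one, since an offset-64 file with only low scores would not contain any character above 90.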

**efoss** · 02-01-2013, 04:09 PM

Thanks very much for the help. These suggestions are really useful.

Best wishes,

Eric

quality scores in fastq files extracted from SRA files