**pbluescript** · 06-05-2012, 03:53 AM

Did you add the -Q 33 option to your fastx command? Illumina quality encoding changed from a Phred+64 to a Phred+33 offset, and fastx defaults to Phred+64.
Did FastQC make the right choice when it picked the 1.9 encoding? I've never seen it get that wrong, though.
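To make the offset point concrete, here is a minimal Python sketch (not code from fastx or FastQC) that decodes the same quality string under both offsets. The quality string and the helper name `decode_quality` are made up for illustration.

```python
def decode_quality(qual, offset):
    """Convert a FASTQ quality string to Phred scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in qual]


# Made-up quality string, typical of Phred+33 (Sanger / Illumina 1.8+) data.
qual = "IIIIHHGG##"

as_phred33 = decode_quality(qual, 33)
as_phred64 = decode_quality(qual, 64)

print(as_phred33)  # [40, 40, 40, 40, 39, 39, 38, 38, 2, 2]
print(as_phred64)  # [9, 9, 9, 9, 8, 8, 7, 7, -29, -29]
```

Reading Phred+33 data with a +64 offset lowers every score by the same 31 points, so the curve keeps its shape and only moves down (often into negative values).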

**arvid** · 06-05-2012, 04:27 AM

Originally posted by pbluescript:

> Did you add the -Q 33 option to your fastx command? Illumina quality encoding changed from a Phred+64 to a Phred+33 offset, and fastx defaults to Phred+64.
> Did FastQC make the right choice when it picked the 1.9 encoding? I've never seen it get that wrong, though.

That doesn't seem likely: with a wrong offset the two plots would be shifted relative to each other, not qualitatively different as they are here. Look at the quality strings of a few reads by hand; if FastQC is right, they should all show poor quality at position 20.
Have you checked whether you get the same result with a subset of your file? Is the file very large, so that some counters could overflow?
AFAIK FastQC should use all the reads for the quality plot.
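One way to do the manual check is a short Python sketch like the following. It assumes plain four-line FASTQ records (no wrapped sequences) and a Phred+33 offset; the function name `qualities_at` and the sample reads are hypothetical.

```python
import io
from itertools import islice


def qualities_at(handle, position, offset=33, max_reads=5):
    """Yield (read_id, Phred score) at a 1-based position for the first few reads."""
    n = 0
    while n < max_reads:
        record = list(islice(handle, 4))  # header, sequence, '+', quality
        if len(record) < 4:
            break
        header, _seq, _plus, qual = (line.rstrip("\n") for line in record)
        if position <= len(qual):
            yield header[1:], ord(qual[position - 1]) - offset
        n += 1


# Made-up reads; in practice pass open("reads.fastq") instead of StringIO.
sample = "@r1\nACGTA\n+\nIII#I\n@r2\nACGTA\n+\nIIH#I\n"

for read_id, score in qualities_at(io.StringIO(sample), position=4):
    print(read_id, score)
```

Because `max_reads` caps the number of records, the same function doubles as a quick way to look at a subset of a large file.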

**Heisman** · 06-05-2012, 05:31 AM

Are there actually quality scores above 40 in your file? That should indicate which one is more likely to be correct.
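A rough way to answer this is to scan the quality lines for the lowest and highest ASCII codes. The Python sketch below assumes four-line FASTQ records; `quality_ascii_range` and the sample data are hypothetical names and values used only for illustration.

```python
import io


def quality_ascii_range(handle):
    """Return (lowest, highest) ASCII code seen on the quality lines."""
    lo, hi = 255, 0
    for i, line in enumerate(handle):
        if i % 4 == 3:  # every fourth line holds the quality string
            codes = [ord(c) for c in line.rstrip("\n")]
            if codes:
                lo = min(lo, min(codes))
                hi = max(hi, max(codes))
    return lo, hi


# Made-up reads; in practice pass an open file handle.
sample = "@r1\nACGTA\n+\nIII#I\n@r2\nACGTA\n+\nJJH5I\n"
lo, hi = quality_ascii_range(io.StringIO(sample))

print(lo, hi)                             # 35 74
print("max score if Phred+33:", hi - 33)  # 41
print("can be a +64 encoding:", lo >= 59)  # False
```

Characters below ASCII 59 (`;`) cannot occur in the +64 encodings, so seeing them points to Phred+33; the highest code then tells you whether scores above 40 are really present.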

**simonandrews** · 06-06-2012, 12:20 AM

That does look odd, and I can't immediately think how you'd end up with biases that are so different. Were there any other oddities in the sample, such as lots of 'N' calls? Does FastX do any filtering of the bases or reads it uses for the quality plot? (FastQC doesn't, unless run with the --casava option.) As others have said, it's not going to be an offset detection problem; there's something else going on.

The very high qualities in the FastX results seem a bit odd, as most sequencers don't produce reads of Q40. The drop and recovery in the FastQC results doesn't look like a technical effect, but then I can't see how that wouldn't show up in the FastX result.

If you can put the FastQ file you were using somewhere I can access it, I'd be happy to take a look and see which of the results matches the data.

**TheStudent** · 06-08-2012, 04:43 AM

@pbluescript
FastX was run after converting the file to Sanger format.

@arvid
"AFAIK FastQC should use all the reads for the quality plot."
Thank you, that is very helpful.

@simonandrews
Thank you for your kind offer.

After a lot of checks I think I have figured out what is wrong: apparently, because of some bug or a malformed FASTQ file, FastQC only read the first ~5,000 reads. Those were really bad due to some technical issues.
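A check of this kind can be sketched in Python as follows: count the records and stop at the first one that is not well formed, then compare the count with the number of sequences the tool reports. This assumes four-line records; `check_fastq` and the sample data are hypothetical, and this is not how FastQC or FastX validate input.

```python
import io
from itertools import islice


def check_fastq(handle):
    """Return (number of good records, description of first problem or None)."""
    count = 0
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return count, None  # clean end of file
        label = f"record {count + 1}"
        if len(record) < 4:
            return count, f"{label}: truncated ({len(record)} of 4 lines)"
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return count, f"{label}: header does not start with '@'"
        if not plus.startswith("+"):
            return count, f"{label}: separator line does not start with '+'"
        if len(seq) != len(qual):
            return count, f"{label}: sequence and quality lengths differ"
        count += 1


good = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n"
bad = good + "@r3\nACGT\n+\nIII\n"  # quality string one character short

print(check_fastq(io.StringIO(good)))  # (2, None)
print(check_fastq(io.StringIO(bad)))
```

A trailing blank line at the end of the file would be reported as a truncated record by this sketch.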

Thank you all for your thoughts and help.

**simonandrews** · 06-08-2012, 05:13 AM

The only thing I know of which would allow FastQC to read only a subset of a file without throwing an error is if the file were composed of multiple gzipped sections. In this case the Java parser used in FastQC stops after the first compressed block. However, I included a workaround for this in the last release of the program (v0.10.1), so this shouldn't happen any more.
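The effect of a file made of several gzipped sections can be reproduced with Python's standard library. This sketch is only an illustration in Python, not FastQC's Java code; `count_gzip_members`, `first_member_only`, and the two tiny reads are hypothetical.

```python
import gzip
import zlib


def count_gzip_members(data):
    """Count concatenated gzip members in a bytes object."""
    count = 0
    while data:
        d = zlib.decompressobj(zlib.MAX_WBITS | 16)  # expect a gzip header
        d.decompress(data)
        data = d.unused_data  # bytes left over after this member ends
        count += 1
    return count


def first_member_only(data):
    """Decompress just the first member, like a reader that stops early."""
    d = zlib.decompressobj(zlib.MAX_WBITS | 16)
    return d.decompress(data)


part1 = b"@r1\nACGT\n+\nIIII\n"
part2 = b"@r2\nACGT\n+\n####\n"
combined = gzip.compress(part1) + gzip.compress(part2)  # two members, one file

print(count_gzip_members(combined))            # 2
print(first_member_only(combined) == part1)    # True
print(gzip.decompress(combined) == part1 + part2)  # True
```

A reader that stops after the first member sees only `part1` and raises no error, which matches the symptom of a silently truncated read set. The sketch loads the whole file into memory, so for large files a chunked variant would be needed.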

Can you let me know if this still happens with the latest release? If it does I'd like to look into this some more to try to figure out what went wrong.

Thanks

Simon.

**TheStudent** · 06-08-2012, 08:44 AM

Hi Simon,

Thank you very much for the very quick reply. I did try it with the recent version (v0.10.1) and it still behaved the same way. I will send you the file in a PM.

Thank you.

Why is the base quality score so different between FastQC and FastX?
