SEQanswers

Old 03-04-2013, 06:40 PM   #1
chongm
Member
 
Location: Canada

Join Date: Sep 2012
Posts: 21
Strange FASTQC Results - PerSequence GC Content

Hi Everyone:

I would be grateful if someone could take a quick look at these FastQC results. This is exome-seq data, 100 bp paired-end. For the most part the FastQC results seem normal, except for a strange distribution in the per sequence GC content graph. According to the FastQC manual, an unusually shaped distribution is suggestive of contamination, and a shift in the curve is suggestive of a systematic bias. Could this systematic bias be due to exome enrichment?

There also seem to be homopolymer AAAA and TTTT repeats, as indicated by the Kmer content warning. Can anyone speculate on the source of these homopolymers?

Thanks,

MC
Attached Images
File Type: png duplication_levels.png (18.3 KB, 39 views)
File Type: png kmer_profiles.png (18.6 KB, 58 views)
File Type: png per_base_quality.png (10.0 KB, 29 views)
File Type: png per_sequence_gc_content.png (27.7 KB, 76 views)
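
For anyone who wants to look at the per sequence GC distribution outside of FastQC, here is a minimal Python sketch that tallies GC percentage per read. The function names are made up for illustration, ignoring N calls is my own choice, and this is not necessarily how FastQC computes its plot; it only gives a rough histogram to compare against.

```python
from collections import Counter


def gc_percent(seq):
    """Return the GC percentage of one read, ignoring N calls (an assumption)."""
    seq = seq.upper()
    called = sum(seq.count(b) for b in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(reads):
    """Count reads per rounded GC percentage (0..100)."""
    return Counter(round(gc_percent(r)) for r in reads)


def read_fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record.

    Assumes plain, unwrapped 4-line records; `lines` can be an open file.
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()
```

With a real file this could be called as `gc_histogram(read_fastq_sequences(open("reads.fastq")))`, where `reads.fastq` is a placeholder name. A second hump or a shoulder in the resulting counts would match the kind of distortion discussed in this thread.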
Old 03-04-2013, 07:13 PM   #2
kcchan
Senior Member
 
Location: USA

Join Date: Jul 2012
Posts: 184

How long were your inserts? When reads go polyA or polyT, it may be a sign that the sequencer has read through the entire length of the fragment.
Old 03-04-2013, 07:24 PM   #3
chongm
Member
 
Location: Canada

Join Date: Sep 2012
Posts: 21

The average insert size was ~360 bp. Why do reads go polyA or polyT when they've read through the entire length of the sequence?
Old 03-04-2013, 10:20 PM   #4
kcchan
Senior Member
 
Location: USA

Join Date: Jul 2012
Posts: 184

If you're using an Illumina instrument (which I'm assuming right now), a poly-A stretch is caused by a zero-intensity signal, in which case the base will be called an A with 0 quality. Another way to tell if this is the problem is to see if your adapters are showing up a lot in your overabundant fragments. That would indicate the presence of adapter/primer dimers, which may also cause this problem of reading through the end of the sequence.
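
One way to test this idea on the raw reads is to look for reads that end in a long run of A with very low quality. The Python sketch below is only a rough check under stated assumptions: the function names are hypothetical, the qualities are assumed to be Phred+33 encoded, and the `min_run` and `max_q` thresholds are arbitrary starting points rather than recommended values.

```python
def trailing_run(seq, base="A"):
    """Length of the run of `base` at the 3' end of the read."""
    return len(seq) - len(seq.rstrip(base))


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def suspicious_tail(seq, qual, base="A", min_run=10, max_q=5.0):
    """True if the read ends in a long homopolymer run with low mean quality."""
    n = trailing_run(seq, base)
    if n == 0 or n < min_run:
        return False
    # Only the quality characters covering the trailing run are averaged.
    return mean_phred(qual[len(qual) - n:]) <= max_q
```

If a large share of reads is flagged, the poly-A signal is probably low-quality tail sequence; if the A runs carry normal qualities, they are more likely real sequence from the library.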
Old 03-04-2013, 11:03 PM   #5
chongm
Member
 
Location: Canada

Join Date: Sep 2012
Posts: 21

Hi Kcchan,

Thanks for the explanation, but I don't see any of my adapter sequences in the overrepresented sequences section of the FastQC results. I only see "AAAAA" and "TTTTT" in this section, with counts of 16919295 and 16616150 and overall Obs/Exp values of 3.44 and 3.09, respectively. FastQC does not give me a warning for overrepresented sequences and shows these levels to be normal, despite giving me a Kmer content warning for these same homopolymers.
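
To get a feel for what an observed/expected ratio like the ones above means, here is a small Python sketch. It takes the expected count from the overall base composition of the reads, treating positions as independent. That model is my assumption for illustration and may not match FastQC's exact calculation; `kmer_obs_exp` is a made-up name.

```python
from collections import Counter


def kmer_obs_exp(reads, kmer):
    """Observed/expected ratio for `kmer` over all reads.

    Expected count = number of k-length windows times the product of the
    overall base frequencies (independence assumed).
    """
    k = len(kmer)
    base_counts = Counter()
    windows = 0
    observed = 0
    for read in reads:
        base_counts.update(read)
        n = len(read) - k + 1
        if n <= 0:
            continue  # read shorter than the k-mer
        windows += n
        observed += sum(1 for i in range(n) if read[i:i + k] == kmer)
    total = sum(base_counts.values())
    if total == 0:
        return float("nan")
    p = 1.0
    for b in kmer:
        p *= base_counts[b] / total
    expected = windows * p
    return observed / expected if expected else float("nan")
```

Under this model, a ratio around 3 for AAAAA says the 5-mer shows up about three times more often than base composition alone would predict.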
Old 03-05-2013, 09:55 AM   #6
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390

I have to say I see this homopolymer signal from FastQC in HiSeq exome runs *a lot*, but I have never had a satisfactory explanation for it. It hasn't seemed to impact downstream applications.

Your GC content will almost always get flagged by FastQC, and I've always assumed part of this is down to looking only at the exome (rather than unbiased sequence from the genome). However, your peak does look unusually distorted compared with what I would normally expect.
Old 03-05-2013, 10:04 AM   #7
chongm
Member
 
Location: Canada

Join Date: Sep 2012
Posts: 21

Thanks for this insight, Bukowski. I'll consider the homopolymer signal benign then. But yes, the hump is very strange.