SEQanswers

Old 06-22-2015, 05:45 AM   #1
ileanadrt
Junior Member
 
Location: Argentina

Join Date: Jun 2015
Posts: 5
Help with FastQC results

Hi,
I have two sets of single-end, 50 bp Illumina RNA-Seq data (from two different days of mammalian cell culture). The kit used was the KAPA Stranded RNA-Seq Kit with RiboErase.

Unfortunately, the FastQC results are not as expected, and I am not sure how to interpret the data or what to say about the plots.

Both datasets show the same results. The per base sequence quality plot looks OK (I think), and so does the adapter content plot, but the GC content and Kmer content plots look very weird, as do the duplication levels.

I would be grateful for any advice about what is wrong with this data, or possible explanations for these results.

Thanks for any help

Ileana

First three results of Overrepresented sequences:

Sequence: CGACGGGGGGCCCCGCGGGGCCGAGAAGAAGAGGAGGGGGAGGCGAGGAGG
Count: 187325
Percentage: 1.0857026079582217
Possible Source: No Hit

Sequence: GGACAGGAGAGCGGTCGCGCCGTGGGAGGGGCGGCCCGGCCCCCACCGCGG
Count: 98598
Percentage: 0.571456590094567
Possible Source: No Hit

Sequence: CCCGAGACGAGTGGCTCTCCGCACCGGTCCCCGGTCCCGACGCGCGGCGGG
Count: 95732
Percentage: 0.5548457603899987
Possible Source: No Hit
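
As a quick sanity check on the sequences listed above, their GC fraction and the total read count implied by each count/percentage pair can be computed directly. This is only a minimal sketch: the sequences, counts, and percentages are copied from the FastQC output above, while the helper name `gc_fraction` is illustrative and not part of FastQC.

```python
# Overrepresented sequences copied from the FastQC report above:
# (sequence, count, percentage of all reads)
overrepresented = [
    ("CGACGGGGGGCCCCGCGGGGCCGAGAAGAAGAGGAGGGGGAGGCGAGGAGG", 187325, 1.0857026079582217),
    ("GGACAGGAGAGCGGTCGCGCCGTGGGAGGGGCGGCCCGGCCCCCACCGCGG", 98598, 0.571456590094567),
    ("CCCGAGACGAGTGGCTCTCCGCACCGGTCCCCGGTCCCGACGCGCGGCGGG", 95732, 0.5548457603899987),
]


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (illustrative helper)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for seq, count, pct in overrepresented:
    # FastQC reports the percentage of all reads, so count / (pct / 100)
    # recovers the approximate total number of reads in the file.
    implied_total = count / (pct / 100)
    print(f"{seq[:12]}...  GC={gc_fraction(seq):.1%}  implied total reads={implied_total:,.0f}")
```

GC fractions well above the genome-wide value would be consistent with a small number of GC-rich, highly duplicated sequences pulling the per-sequence GC distribution upward.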
Attached Images
per_base_quality.png
per_base_sequence_content.png
per_sequence_gc_content.png
duplication_levels.png
kmer_profiles.png
Old 06-22-2015, 05:52 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975

Of the graphs you attached, the GC content one looks rather strange. Is this known to be an extremely GC-rich organism?
Old 06-22-2015, 06:16 AM   #3
ileanadrt
Junior Member
 
Location: Argentina

Join Date: Jun 2015
Posts: 5

No, the GC content is around 40%
Old 06-22-2015, 06:17 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975

Have you checked a few sequences (e.g. with BLAST) to see if they are from the right organism (and are not some kind of contamination)?
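
One way to run such a check is to put the sequences of interest into FASTA format and paste or upload them to a BLAST search. The following is a minimal sketch, assuming the three overrepresented sequences from the first post; the function name `to_fasta` and the record names (`overrep_1`, ...) are made up for illustration.

```python
def to_fasta(sequences, prefix="overrep"):
    """Format sequences as FASTA text with numbered, made-up record names."""
    records = [f">{prefix}_{i}\n{seq}" for i, seq in enumerate(sequences, start=1)]
    return "\n".join(records) + "\n"


# Sequences copied from the Overrepresented sequences list in the first post.
sequences = [
    "CGACGGGGGGCCCCGCGGGGCCGAGAAGAAGAGGAGGGGGAGGCGAGGAGG",
    "GGACAGGAGAGCGGTCGCGCCGTGGGAGGGGCGGCCCGGCCCCCACCGCGG",
    "CCCGAGACGAGTGGCTCTCCGCACCGGTCCCCGGTCCCGACGCGCGGCGGG",
]

fasta_text = to_fasta(sequences)
print(fasta_text, end="")
```

Redirecting the printed output to a file gives something that can be submitted as a BLAST query against the expected organism or a general nucleotide database.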
Old 06-22-2015, 07:32 AM   #5
ileanadrt
Junior Member
 
Location: Argentina

Join Date: Jun 2015
Posts: 5

I did a quick search and didn't find any possible contamination. Could it be some rRNA and/or mitochondrial RNA? I did find some of these.
Old 06-22-2015, 08:17 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975

Neither of those should skew GC content that way. Perhaps someone else will have further suggestions.

You should go ahead and start analyzing the data.
Old 06-22-2015, 09:41 AM   #7
nucacidhunter
Jafar Jabbari
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,225

According to the plots, the GC content of the reads is 60%. This seems to be the result of some GC-rich reads with high duplication rates (20% have a duplication level over 10k). I would check the reads with a duplication level over 1k to see what they are and whether they make biological sense. If they are not rRNA or from repetitive regions, I would suspect a library prep issue.
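
The check suggested here can be sketched in a few lines of Python: count how often each exact read sequence occurs and list those above a threshold together with their GC fraction. This is a rough sketch under stated assumptions, not how FastQC computes its duplication plot: it assumes an uncompressed, 4-line-per-record FASTQ, keeps every distinct sequence in memory, and the function names are hypothetical.

```python
from collections import Counter


def read_sequences(lines):
    """Yield the sequence line of each record, assuming 4 lines per FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def highly_duplicated(lines, threshold=1000):
    """Return (sequence, count) pairs seen more than threshold times, most frequent first."""
    counts = Counter(read_sequences(lines))
    return [(seq, n) for seq, n in counts.most_common() if n > threshold]


# Tiny in-memory demo; for real data, pass an open file handle instead,
# e.g. with open("reads.fastq") as handle: highly_duplicated(handle)
# ("reads.fastq" is a placeholder file name).
demo = [
    "@r1", "ACGT", "+", "IIII",
    "@r2", "ACGT", "+", "IIII",
    "@r3", "GGCC", "+", "IIII",
]
for seq, n in highly_duplicated(demo, threshold=1):
    print(seq, n, f"GC={gc_fraction(seq):.0%}")
```

The sequences this reports on real data are the ones worth comparing against rRNA, mitochondrial, and repeat sequences.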
Old 06-26-2015, 12:49 PM   #8
AdrianaGeldart
Junior Member
 
Location: Boston, MA

Join Date: Feb 2014
Posts: 6

Dear Ileanadrt,

I am a member of the Kapa Biosystems Technical Support Team.

We would love to help you troubleshoot this further. Would you be willing to share the type of mammalian cell line you are using?

Thanks and best regards,
Adriana

Tags
duplication, fastqc, illumina, rna-seq
