SEQanswers

Old 11-25-2011, 02:09 PM   #1
analyst
Member
 
Location: US

Join Date: Jan 2011
Posts: 18
FastQC per base sequence content

Does this seem acceptable? (Global GC content for the species = 40%)

<original link removed>




Last edited by analyst; 11-28-2011 at 12:48 PM.
Old 11-27-2011, 11:38 PM   #2
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

Can you attach it to the forum? That site is blocked from here. Also it would be useful to see the per sequence GC plot as well as the per base plot.
Old 11-28-2011, 01:59 AM   #3
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390

Quote:
Originally Posted by analyst
Does this seem acceptable? (Global GC content for the species = 40%)

<link removed>
Please do not use that host for images - it has NSFW content displayed prominently, and I have no desire to get sacked for viewing SeqAnswers links in an open plan office.
Old 11-28-2011, 06:45 AM   #4
analyst
Member
 
Location: US

Join Date: Jan 2011
Posts: 18

I understand and apologize. Thanks to the moderator who posted the image correctly.

Can anyone please comment on the FastQC finding?
Old 11-28-2011, 06:49 AM   #5
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

I can't see an image reposted. The moderator only removed the original link. We still can't see your data.
Old 11-28-2011, 06:52 AM   #6
Orr Shomroni
Member
 
Location: Netherlands

Join Date: Oct 2011
Posts: 26

analyst, you may want to look at the following website:

http://www.bioinformatics.bbsrc.ac.u...s/fastqc/Help/

The Introduction gives some general information on FastQC. For specific information on the modules (what the developers consider a "pass", "warn" or "fail"), look under "Analysis modules", where there is a text file for each module describing its thresholds.

This is a quote from the section on GC content:

Quote:
Warning

This module issues a warning if the GC content of any base strays more than 5% from the mean GC content.

Failure

This module will fail if the GC content of any base strays more than 10% from the mean GC content.
Does this answer your question?
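
The two thresholds quoted above can be sketched in a few lines of Python. This is only an illustration, not FastQC's own code: the function names `per_base_gc` and `classify` are made up here, "mean GC content" is assumed to be the unweighted mean of the per-position values, and "strays more than 5%" is assumed to mean percentage points.

```python
def per_base_gc(reads):
    """Return the GC percentage at each read position (0-based list)."""
    max_len = max(len(r) for r in reads)
    gc = [0] * max_len
    total = [0] * max_len
    for read in reads:
        for i, base in enumerate(read.upper()):
            total[i] += 1
            if base in "GC":
                gc[i] += 1
    return [100.0 * g / t for g, t in zip(gc, total)]


def classify(gc_by_pos, warn=5.0, fail=10.0):
    """Apply the quoted warn/fail thresholds to per-position GC values.

    Assumption: the reference value is the plain mean of the
    per-position percentages, and deviations are in percentage points.
    """
    mean_gc = sum(gc_by_pos) / len(gc_by_pos)
    worst = max(abs(x - mean_gc) for x in gc_by_pos)
    if worst > fail:
        return "fail"
    if worst > warn:
        return "warn"
    return "pass"
```

A single position that sits 6 points away from the mean would therefore give "warn", which is roughly the situation where one base peaks at one cycle.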
__________________
"Though it may seem that all's been said and done, originality still lives on" - some unoriginal guy who had nothing better to write as his signature
Old 11-28-2011, 07:52 AM   #7
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

Looks acceptable to me.

What concerns you? If it is the low-level but periodic fluctuations in base levels across the read, that has been commented on before in another thread. There the period is three bases per cycle and more regular than what you see here.

--
Phillip
Old 11-28-2011, 12:51 PM   #8
analyst
Member
 
Location: US

Join Date: Jan 2011
Posts: 18

Thanks for the comments, Orr and Phillip. Consider me a beginner; I just wanted to make sure the peaks at certain positions are nothing to worry about. The peak is more pronounced in the second picture I just posted, at position 43. FastQC is warning me, but I am not reading too much into it.

Simon, can you see the new picture I just posted? If not, I am also attaching it here.
Attached Images
File Type: png Picture2.png (89.8 KB, 128 views)
Old 11-29-2011, 03:41 AM   #9
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

Okay, the new plot (the second one) is a different story. The G peak at 42 bases may indicate an issue of some sort.

--
Phillip
Old 04-12-2013, 11:54 AM   #10
stacy09
Junior Member
 
Location: san diego

Join Date: Apr 2013
Posts: 2
Help with FastQC per base sequence content

Hi,

I am a beginner in RNA-seq data analysis. I ran FastQC on my samples and found some variation at the beginning of the reads in the per base sequence content check. Could you please tell me whether it is OK?

Many thanks in advance!
Attached Images
File Type: png per_base_sequence_content.png (32.1 KB, 104 views)
Old 04-12-2013, 12:36 PM   #11
kmcarr
Senior Member
 
Location: USA, Midwest

Join Date: May 2008
Posts: 1,177

Quote:
Originally Posted by stacy09
Hi,

I am a beginner in RNA-seq data analysis. I ran FastQC on my samples and found some variation at the beginning of the reads in the per base sequence content check. Could you please tell me whether it is OK?

Many thanks in advance!
See this previous thread for a discussion of this phenomenon.
Old 04-12-2013, 12:54 PM   #12
stacy09
Junior Member
 
Location: san diego

Join Date: Apr 2013
Posts: 2

kmcarr,

Thanks a lot!
Old 02-14-2017, 07:20 PM   #13
Aditi Verma
Junior Member
 
Location: India

Join Date: Mar 2016
Posts: 3

Hello,

I ran FastQC on RNA-seq data from ribodepleted samples, and the per base sequence content and per base GC content show a very heavy bias (if I am interpreting the results correctly). Is this expected in data from ribodepleted RNA? Can this data be used at all?
Please find the images attached. Thanks in advance for your help!
Attached Images
File Type: png per_base_sequence_content.png (116.9 KB, 24 views)
File Type: png per_sequence_gc_content.png (25.4 KB, 10 views)

Last edited by Aditi Verma; 02-15-2017 at 12:52 AM.
Old 02-15-2017, 03:26 AM   #14
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,075

@Aditi: Have you scanned/trimmed this data for the presence of Illumina adapters? I wonder if you have a large percentage of adapter dimers (and no real inserts).

I recommend using bbduk.sh from BBMap for this purpose. Search for the thread on bbduk here.
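
One rough way to check for adapter dimers before trimming is to count how many reads begin directly with adapter sequence, since a dimer has no insert in front of the adapter. The sketch below is an illustration under stated assumptions, not what bbduk.sh does: the helper names are hypothetical, the default `adapter` is the common Illumina adapter prefix AGATCGGAAGAGC (swap in whatever your kit uses), only exact matches at the very start of the read are counted, and the input is assumed to be a plain FASTQ with four lines per record.

```python
from itertools import islice


def fastq_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for seq in islice(lines, 1, None, 4):
        yield seq.strip()


def adapter_start_fraction(seqs, adapter="AGATCGGAAGAGC"):
    """Fraction of reads that start with the adapter (exact match only).

    Assumption: reads starting with the adapter are likely adapter dimers.
    Sequencing errors in the adapter will make this an underestimate.
    """
    total = 0
    hits = 0
    for seq in seqs:
        total += 1
        if seq.upper().startswith(adapter):
            hits += 1
    return hits / total if total else 0.0
```

With a real file this could be called as `adapter_start_fraction(fastq_sequences(open("reads.fastq")))`, where `reads.fastq` is a placeholder name. A large fraction would support trimming or filtering before any downstream analysis.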
Old 02-15-2017, 06:25 AM   #15
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

Looks like about 20-25% of your reads are from adapter dimers.

--
Phillip