SEQanswers

Old 06-17-2013, 07:18 AM   #1
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17
Unmapped ratio very high on mouse genome

Hi,
My problem concerns RNA-Seq data. I've downloaded public data (SAGE libraries with 6 different samples from mouse liver) to analyse with ArrayStudio. When I try to map them to the B38 Mus musculus genome, about 95% of the reads are unmapped in all the samples! Quality scores look fine (around 40) and the read length is as expected (35 bp), but the base distribution in the QC is very heterogeneous and I don't understand why. This is the first time I have worked on mouse data. Has anybody had the same problem, or does anyone have an idea about the mapping and/or the base distribution?

Thanks, LN
__________________
Gene R' Us!
Old 06-17-2013, 07:40 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,091

You may want to do a FastQC run on the data first to check on the quality. The data you downloaded may be raw and you may need to trim/clean the data before doing analysis/alignments.
Old 06-17-2013, 07:56 AM   #3
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

They say there is a 16 bp adaptor on each read, but my reads are at the correct length 35 bp on the QC. Do I really need to trim them?
__________________
Gene R' Us!
Old 06-17-2013, 07:58 AM   #4
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

Quality score histograms look very good for each sample.
__________________
Gene R' Us!
Old 06-17-2013, 07:58 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,091

Quote:
Originally Posted by le.nono
They say there is a 16 bp adaptor on each read, but my reads are at the correct length 35 bp on the QC. Do I really need to trim them?
Is this supposed to be an "inline" adapter that is part of the actual sequence? Are you able to tell by looking at the reads?
Old 06-17-2013, 08:01 AM   #6
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

I don't think so; the reads are very short. What do you have in mind?
__________________
Gene R' Us!
Old 06-17-2013, 08:11 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,091

Can you post a FastQC (or whichever kind of QC you used) graph of the base distribution?

I was thinking that one way you would get 95% of reads unmapped is if the barcodes/adapter were still present in the reads (inline). Do you know if they have already been removed?
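One way to look for inline barcodes or adapters, as suggested above, is to tabulate the base composition at each read position and flag positions where a single base dominates. The following is only a minimal sketch, not something the posters ran: the function names, the 0.95 threshold and the toy reads are assumptions made up for illustration, and it assumes plain four-line FASTQ records.

```python
import io
from collections import Counter

# Toy FASTQ data, invented for this example (not from the thread's dataset).
TOY_FASTQ = (
    "@r1\nGCCATTGA\n+\nIIIIIIII\n"
    "@r2\nGCCAGGCT\n+\nIIIIIIII\n"
    "@r3\nGCCAACTG\n+\nIIIIIIII\n"
)

def read_sequences(handle):
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()

def base_composition(seqs):
    """Return a list holding one Counter of base counts per read position."""
    counts = []
    for seq in seqs:
        for pos, base in enumerate(seq):
            if pos == len(counts):
                counts.append(Counter())
            counts[pos][base] += 1
    return counts

def fixed_positions(counts, threshold=0.95):
    """Return (position, base, fraction) where one base reaches the threshold."""
    fixed = []
    for pos, counter in enumerate(counts):
        base, n = counter.most_common(1)[0]
        fraction = n / sum(counter.values())
        if fraction >= threshold:
            fixed.append((pos, base, fraction))
    return fixed

if __name__ == "__main__":
    counts = base_composition(read_sequences(io.StringIO(TOY_FASTQ)))
    for pos, base, fraction in fixed_positions(counts):
        print(f"position {pos}: {base} in {fraction:.0%} of reads")
```

On real data, `io.StringIO(TOY_FASTQ)` would be replaced by an opened FASTQ file. A run of fixed positions at the start of the reads would be consistent with a barcode, adapter or restriction site still being present.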
Old 06-17-2013, 08:18 AM   #8
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

No, I don't have this information.

__________________
Gene R' Us!
Old 06-17-2013, 08:20 AM   #9
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

Maybe this one is better in quality and size:

http://imageshack.us/photo/my-images/844/q6c.png/
__________________
Gene R' Us!

Last edited by le.nono; 06-17-2013 at 08:25 AM.
Old 06-17-2013, 08:35 AM   #10
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,091

All the sequences appear to be starting with exactly the same 4 nucleotides (GCCA). Is that a barcode?
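The observation that the reads all seem to start with the same 4 nucleotides can be checked directly, rather than by eye from the plot, by counting read prefixes. This is a rough sketch under stated assumptions: `prefix_counts` and `dominant_prefix` are hypothetical helper names, and the reads listed are invented toy sequences, not the actual data.

```python
from collections import Counter

def prefix_counts(seqs, k=4):
    """Count the first k bases of every read that is at least k bases long."""
    return Counter(seq[:k] for seq in seqs if len(seq) >= k)

def dominant_prefix(seqs, k=4):
    """Return (prefix, fraction of reads) for the most common k-base prefix."""
    counts = prefix_counts(seqs, k)
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    prefix, n = counts.most_common(1)[0]
    return prefix, n / total

if __name__ == "__main__":
    # Invented toy reads: three share the GCCA prefix, one does not.
    toy_reads = ["GCCATTGACCA", "GCCAGGCTAAT", "GCCAACTGTTC", "ATGCAACTGGA"]
    prefix, fraction = dominant_prefix(toy_reads)
    print(prefix, fraction)
```

A fraction close to 1 for a single prefix would support the idea that the leading bases are a barcode or cut site rather than transcript sequence.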
Old 06-17-2013, 08:54 AM   #11
SNPsaurus
Registered Vendor
 
Location: Eugene, OR

Join Date: May 2013
Posts: 523

Are you able to map other SAGE data with your pipeline? Maybe it is not set up for such short tags.

The 4bp starting sequence is the cut site, right?

Also, are these ditags of 16 bp? Those would not map unless you split them first.
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
Old 06-17-2013, 10:36 AM   #12
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

I'm going to try trimming those 4 bp first and then map the reads. I definitely need more information about the reads... I don't know much about the 16 bp adapter; it is just mentioned in the abstract that comes with the data. Are you saying that what is displayed in the base distribution histograms are ditags of 16 bp?
__________________
Gene R' Us!

Last edited by le.nono; 06-17-2013 at 10:44 AM.
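The trimming step mentioned in the post above (dropping the first 4 bases before mapping) could be sketched as follows. This is only an illustration, assuming four-line FASTQ records; `trim_fastq_prefix` is a hypothetical helper name, the record is invented, and whether trimming alone improves the mapping rate is not established in this thread.

```python
def trim_fastq_prefix(lines, n=4):
    """Yield FASTQ lines with the first n bases and n quality values removed.

    Assumes each record is exactly 4 lines: header, sequence, '+', qualities.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 in (1, 3):  # sequence line and quality line
            line = line[n:]
        yield line

if __name__ == "__main__":
    # Invented toy record for illustration.
    record = ["@r1\n", "GCCATTGA\n", "+\n", "IIIIFFFF\n"]
    for out_line in trim_fastq_prefix(record):
        print(out_line)
```

The sequence and quality lines are trimmed together so that they stay the same length, which aligners generally require.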
Old 06-17-2013, 10:47 AM   #13
SNPsaurus
Registered Vendor
 
Location: Eugene, OR

Join Date: May 2013
Posts: 523

If it is SAGE data, then you should look here for an overview of the method:

http://www.scq.ubc.ca/painless-gene-...ne-expression/

It is an older method meant to increase the sampling of transcripts with Sanger sequencing. There are some mouse mapping tools here:
http://www.webcitation.org/getfile?f...243e9258904b3a

But I suspect you'll want to find some newer RNA-Seq data that isn't SAGE based and you'll find it easier to go forward.
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
Old 06-17-2013, 11:59 AM   #14
le.nono
Member
 
Location: Montpellier- France

Join Date: Sep 2012
Posts: 17

OK, it's becoming clearer now. I really need these data, so I'm going to stick with them even if it's harder. I'm going to look in the literature for RNA-Seq done with SAGE preps; I think it's been done before. What do you think?
__________________
Gene R' Us!

All times are GMT -8.