  • Unmapped ratio very high on mouse genome

    Hi,
    My problem regards RNA-Seq data. I've downloaded public data (SAGE libs w/ 6 different samples from mouse liver ) to analyse using ArrayStudio. When I try to map them on the B38 mus musculus genom I have an unmapped read % of approximatly 95 % on all the samples!!! Quality scores are correct around 40 read length is correct (35 bp) but the base distrib QC is just very heterogenous, I don't understand why... this the first time I work on mouse data.Does anybody shared the same problem or have an idea please regarding the mapping and/or the base distrib?

    Thanks, LN
    Gene R' Us!

  • #2
    You may want to do a FastQC run on the data first to check on the quality. The data you downloaded may be raw and you may need to trim/clean the data before doing analysis/alignments.

    • #3
      They say there is a 16 bp adaptor on each read, but my reads are at the correct length 35 bp on the QC. Do I really need to trim them?
      Gene R' Us!

      • #4
        Quality score histograms look very good for each sample.
        Gene R' Us!

        • #5
          Originally posted by le.nono
          They say there is a 16 bp adaptor on each read, but my reads are at the correct length 35 bp on the QC. Do I really need to trim them?
          Is this supposed to be an "inline" adapter that is part of the actual sequence? Are you able to tell by looking at the reads?

          • #6
            I don't think so; the reads are very short. What do you have in mind?
            Gene R' Us!

            • #7
              Can you post a FastQC (or whichever kind of QC you used) graph of the base distribution?

              I was thinking that one way you would get 95% of reads unmapped is if the barcodes/adapter were still present in the reads (inline). Do you know if they have already been removed?

              • #8
                No, I don't have this information.

                Gene R' Us!

                • #9
                  Maybe this one is better in quality and size.

                  Last edited by le.nono; 06-17-2013, 08:25 AM.
                  Gene R' Us!

                  • #10
                    All the sequences appear to be starting with exactly the same 4 nucleotides (GCCA). Is that a barcode?
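The observation above (every read starting with the same 4 nucleotides) can be checked directly on the FASTQ file rather than read off a plot. What follows is only a minimal sketch, assuming plain 4-line FASTQ records; the function name `leading_kmer_counts` and the example reads are made up for illustration and are not taken from the thread's data.

```python
from collections import Counter
from itertools import islice


def leading_kmer_counts(fastq_lines, k=4):
    """Count the first k bases of every read in 4-line FASTQ records."""
    counts = Counter()
    # The sequence is the 2nd line of each 4-line record: indices 1, 5, 9, ...
    for seq in islice(fastq_lines, 1, None, 4):
        seq = seq.strip().upper()
        if len(seq) >= k:
            counts[seq[:k]] += 1
    return counts


# Made-up records for illustration only (not the poster's reads).
example = [
    "@read1", "GCCATTGACCTA", "+", "IIIIIIIIIIII",
    "@read2", "GCCAGGTTAACC", "+", "IIIIIIIIIIII",
    "@read3", "ATGCGGTTAACC", "+", "IIIIIIIIIIII",
]

counts = leading_kmer_counts(example)
total = sum(counts.values())
for kmer, n in counts.most_common(3):
    # Report each leading k-mer with its share of all reads.
    print(kmer, n, f"{n / total:.0%}")
```

The function accepts any iterable of lines, so an open file handle can be passed in place of the list. If one prefix accounts for nearly all reads, it is probably a fixed sequence such as a barcode or a cut site rather than transcript sequence.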

                    • #11
                      Are you able to map other SAGE data with your pipeline? Maybe it is not set up for such short tags.

                      The 4bp starting sequence is the cut site, right?

                      Also, are these ditags of 16 bp? Those would not map unless you split them first.
                      Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

                      • #12
                        I'm going to try trimming those 4 bp first and then map the reads. I definitely need more information on the reads... I don't know much about that 16 bp adapter; it's just mentioned in the abstract that comes with the data. Are you saying that what is displayed on the base distribution histograms are ditags of 16 bp?
                        Last edited by le.nono; 06-17-2013, 10:44 AM.
                        Gene R' Us!
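The trimming step mentioned in this post (dropping the first 4 bp of each read before mapping) could be sketched as follows. This is an illustration under assumptions, not the poster's actual procedure: it assumes 4-line FASTQ records, trims unconditionally without checking that the prefix is GCCA, and uses a hypothetical function name `trim_leading_bases`. Dedicated trimming tools would normally be used instead.

```python
def trim_leading_bases(fastq_lines, n=4):
    """Yield FASTQ lines with the first n bases and n quality values removed."""
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        # Within each 4-line record, line 1 is the sequence and line 3 the
        # quality string (0-based); both must be trimmed to stay in sync.
        if i % 4 in (1, 3):
            line = line[n:]
        yield line


# Made-up record for illustration only.
record = ["@read1\n", "GCCATTGA\n", "+\n", "IIIIHHHH\n"]
trimmed = list(trim_leading_bases(record, n=4))
print("\n".join(trimmed))
```

Trimming the quality string together with the sequence keeps the two the same length, which aligners require. Note that a 35 bp read becomes 31 bp after this step, so the mapper settings would need to tolerate short reads.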

                        • #13
                          If it is SAGE data, then you should look here for an overview of the method:

                          (August 2004) With the advent of the human genome project, a vast amount of information about genes and gene structure is suddenly at our fingertips. But this information is limited. Every cell within an organism has the same genetic composition (with the exception of its gametes), and yet, obviously skin tissue is very different from


                          It is an older method meant to increase the sampling of transcripts with Sanger sequencing. There are some mouse mapping tools here:


                          But I suspect you'll want to find some newer RNA-Seq data that isn't SAGE based and you'll find it easier to go forward.
                          Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

                          • #14
                            OK, it's becoming clearer now. I really need these data, so I'm going to stick with them even if it's harder. I'm going to look in the literature for RNA-Seq done with SAGE preps; I think it's been done before. What do you think?
                            Gene R' Us!
