  • Samtools flagstat - low % reads mapping

    Hi,

    I'm working with RNA-Seq and using bowtie and tophat to align 65bp PE reads to a reference genome. My reads were sequenced from X.laevis and I'm attempting to first map to X.tropicalis (X.laevis genome is still draft version).

    After trimming and filtering my reads I am left with 31M pairs (31M x 2 = 62M reads), but running samtools flagstat on my accepted_hits.bam file shows that only 12M reads have mapped in total. I'm completely confused about why the number of reads mapping is so low - I've tried fine-tuning the options in tophat (-r value, -N value) and using differently trimmed reads - but have seen little improvement on the ~20% mapping rate.

    In addition, almost none of my reads pair properly (samtools flagstat 'properly paired' = 0.01%).

    Any help would be hugely appreciated,

    Thanks
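
For reference, the ~20% figure quoted above can be reproduced with a minimal Python sketch. The read counts are the rounded numbers from the post; the function name `mapping_rate` is hypothetical and is not part of samtools or tophat.

```python
def mapping_rate(mapped_reads: int, total_reads: int) -> float:
    """Return the percentage of input reads that were mapped."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# Rounded counts taken from the post: 31M pairs after trimming/filtering,
# i.e. 62M individual reads, of which about 12M were reported as mapped.
pairs_after_filtering = 31_000_000
total_reads = 2 * pairs_after_filtering
mapped_reads = 12_000_000

rate = mapping_rate(mapped_reads, total_reads)
print(f"{rate:.1f}% of reads mapped")  # prints "19.4% of reads mapped"
```

The denominator here is the number of reads fed to the aligner, not the number of records in the BAM file.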

  • #2
    How have you trimmed your reads? Have you looked for adaptor sequence in your reads?

    • #3
      I've trimmed the reads using fastq_quality_trimmer, fastq_quality_filter and fastx_trimmer.

      One of the problems I've had is that the RNA fragment size is ~130 bp (post adapter removal) and my 100bp reads therefore overlap considerably. I've been using fastx_trimmer to cut the reads to 65bp to ensure no overlap - but they don't seem to be pairing properly in mapping.

      I haven't checked for adapters - I ran the .txt files through fastqc and there were no over-represented sequences.
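
The overlap arithmetic behind this can be sketched in a few lines of Python. This is only an illustration: it assumes the ~130 bp figure is the mean fragment length with adapters excluded, and the function names are hypothetical. The inner distance computed here is the quantity that tophat's -r (--mate-inner-dist) option expects.

```python
def mate_overlap(fragment_len: int, read_len: int) -> int:
    """Number of bases covered by both mates of a pair (0 if they do not overlap)."""
    return max(0, 2 * read_len - fragment_len)


def mate_inner_distance(fragment_len: int, read_len: int) -> int:
    """Gap between the two mates; negative when the mates overlap."""
    return fragment_len - 2 * read_len


fragment_len = 130  # approximate fragment size quoted in the post

for read_len in (100, 65):
    print(
        f"{read_len} bp reads: overlap = {mate_overlap(fragment_len, read_len)} bp, "
        f"inner distance = {mate_inner_distance(fragment_len, read_len)} bp"
    )
# 100 bp reads: overlap = 70 bp, inner distance = -70 bp
# 65 bp reads: overlap = 0 bp, inner distance = 0 bp
```

Because 130 bp is only an average, any fragment shorter than 130 bp would still give overlapping mates at 65 bp.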

      N

      • #4
        That's what I thought.

        Even at 65 bp you may still have overlap and/or adaptor sequence.

        Is it critical that you have paired end data? I had a similar situation with some paired end data. I simply dispensed with the second set of reads and treated the first set as single end reads. With that amount of overlap, it's probably going to be impossible for tophat to get the insert size right.

        Also try adaptor trimming with a trimmer that can handle variable lengths of adaptor sequence; I have used cutadapt with great success. Then try realigning without the second read of each pair and you should have better results.

        Otherwise... make a new library.
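
To illustrate what "variable lengths of adaptor sequence" means in practice, here is a minimal Python sketch that removes a 3' adapter whether it appears in full inside the read or only as a partial prefix at the read end. It is not cutadapt's algorithm (cutadapt also tolerates mismatches), the function name is hypothetical, and the adapter shown is an assumed example, since the thread does not say which adapter was used.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter, including a partial adapter at the very end of the read.

    Exact matching only; real trimmers usually allow a small error rate.
    """
    # Case 1: the whole adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends with a prefix of the adapter. Try the longest
    # possible prefix first and stop at min_overlap to limit chance matches.
    max_len = min(len(read), len(adapter) - 1)
    for k in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[: len(read) - k]

    # No adapter found: return the read unchanged.
    return read


ADAPTER = "AGATCGGAAGAGC"  # assumed example adapter, not taken from the thread

print(trim_3prime_adapter("ACGT" + ADAPTER + "TTTT", ADAPTER))  # full adapter inside the read
print(trim_3prime_adapter("ACGTACGTAC" + "AGATCG", ADAPTER))    # first 6 adapter bases at the end
print(trim_3prime_adapter("ACGTACGT", ADAPTER))                 # no adapter present
```

Trimming to a fixed length, as fastx_trimmer does, cannot handle the second case because the amount of adapter differs from read to read.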

        • #5
          If your reads are overlapping significantly you may want to try FLASH as an alternative, to stitch the two ends together.



          Updated citation:

          Tanja Magoč and Steven L. Salzberg

          FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics (2011) 27(21): 2957-2963
          Last edited by GenoMax; 11-01-2012, 08:04 AM.

          • #6
            I ran the 65bp trimmed reads through FLASH (http://genomics.jhu.edu/software/FLASH/index.shtml) to confirm that, post trim, there's no overlap.

            As I understand it, bowtie and tophat map the two reads of each pair independently, so I would expect that dispensing with half of my reads would result in the same % of mapped reads. Maybe I'm wrong though?

            My primary concern is that the % of reads mapped is so low, I'm less concerned about the pairing of the reads (I'm interested in differential expression rather than resolving isoforms etc) but can't help but feel that the two are linked...
