  • TopHat2 - reads removed from the analysis?

    Hello,
    We are analysing SOLiD 5500 RNA-seq reads and used TopHat2 for the alignment. As a quick first test we ran the analysis on ONLY 2 chromosomes, which is why the number of aligned reads (see below) is very low.
    We noticed that some reads are "missing" at the end of the pipeline, and we are wondering why.

    These are the .info files:
    ::::::::::::::
    left_kept_reads.info
    ::::::::::::::
    min_read_len=50
    max_read_len=50
    reads_in =25357934
    reads_out=25319876
    ::::::::::::::
    right_kept_reads.info
    ::::::::::::::
    min_read_len=35
    max_read_len=35
    reads_in =25357934
    reads_out=25283245

    And this is the output of samtools flagstat run on accepted_hits.bam:

    1552361 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    1552361 + 0 mapped (100.00%:-nan%)
    1552361 + 0 paired in sequencing
    769043 + 0 read1
    783318 + 0 read2
    549840 + 0 properly paired (35.42%:-nan%)
    661184 + 0 with itself and mate mapped
    891177 + 0 singletons (57.41%:-nan%)
    5522 + 0 with mate mapped to a different chr
    5522 + 0 with mate mapped to a different chr (mapQ>=5)

    In addition, 15,435,015 reads are in the unmapped.bam files.

    --> So we have 15,435,015 unmapped + 1,552,361 mapped = 16,987,376 (~17,000,000) reads accounted for.
    --> We had 25,357,934 + 25,357,934 = 50,715,868 (~50,000,000) reads_in available for the analysis.
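
To make the bookkeeping explicit, here is a minimal Python sketch that redoes the arithmetic from the numbers quoted above. The helper `parse_info` and the variable names are hypothetical, introduced only for illustration; the .info contents are pasted in as strings rather than read from disk. One assumption to keep in mind: samtools flagstat counts alignment records, so a read reported with several alignments may be counted more than once, and the "mapped" figure is treated here as an upper bound on distinct reads.

```python
def parse_info(text):
    """Parse 'key=value' lines (as in the .info files above) into a dict of ints.

    Hypothetical helper for illustration; it is not part of TopHat2.
    """
    stats = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            stats[key.strip()] = int(value.strip())
    return stats


# Contents of the two .info files, copied from the post.
LEFT_INFO = """min_read_len=50
max_read_len=50
reads_in =25357934
reads_out=25319876"""

RIGHT_INFO = """min_read_len=35
max_read_len=35
reads_in =25357934
reads_out=25283245"""

left = parse_info(LEFT_INFO)
right = parse_info(RIGHT_INFO)

# Counts quoted in the post.
mapped_records = 1552361   # flagstat "in total" on accepted_hits.bam (records, not necessarily distinct reads)
unmapped_reads = 15435015  # reads reported in unmapped.bam

reads_in_total = left["reads_in"] + right["reads_in"]
reads_out_total = left["reads_out"] + right["reads_out"]
dropped_in_prep = reads_in_total - reads_out_total   # reads_in minus reads_out
accounted_for = mapped_records + unmapped_reads
unaccounted = reads_out_total - accounted_for        # the "missing" reads

print(f"reads_in  (left + right): {reads_in_total:>12,}")
print(f"reads_out (left + right): {reads_out_total:>12,}")
print(f"reads_in - reads_out:     {dropped_in_prep:>12,}")
print(f"mapped + unmapped:        {accounted_for:>12,}")
print(f"unaccounted for:          {unaccounted:>12,}")
```

With these inputs, the difference between reads_in and reads_out explains only 112,747 reads, so 33,615,745 reads remain unaccounted for even when starting from reads_out.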

    We're wondering where the other reads are. We expected the read counts in the accepted and unmapped BAM files to sum to reads_in, but that is not the case. If you have any explanation, could you please help us?
    Thanks a lot for your time

  • #2
    Any ideas, please?
    Thanks
