
  • TopHat+BWT = reads that failed to align ~ 99%

    I ran TopHat and Cufflinks successfully!
    But I think some of the output is wrong (the "reads that failed to align" lines in the logs below), though I'm not sure.

    I ran 17246957 reads and my TopHat outputs are
    accepted_hits.sam = 152434
    junctions.bed = 12672

    logs
    $ cat file2lqYQ1.log
    Code:
    # reads processed: 17184877
    # reads with at least one reported alignment: 17194 (0.10%)
    # reads that failed to align: 17156835 (99.84%)
    # reads with alignments suppressed due to -m: 10848 (0.06%)
    Reported 127509 alignments to 1 output stream(s)
    $ cat fileIw0Tgh.log

    Code:
    # reads processed: 17184877
    # reads with at least one reported alignment: 69661 (0.41%)
    # reads that failed to align: 17098690 (99.50%)
    # reads with alignments suppressed due to -m: 16526 (0.10%)
    Reported 487470 alignments to 1 output stream(s)
    $ cat prep_reads.log
    Code:
    prep_reads v1.0.13
    ---------------------------
    62080 out of 17246957 reads have been filtered out
    Code:
    segment_juncs v1.0.13
    ---------------------------
    Loading reference sequences...
            Loading chr1...done
            Loading chr2...done
            Loading chr3...done
            Loading chr4...done
            Loading chr5...done
            Loading chr6...done
            Loading chr7...done
            Loading chr8...done
            Loading chr9...done
            Loading chr10...done
            Loading chr11...done
            Loading chr12...done
            Loading chr13...done
            Loading chr14...done
            Loading chr15...done
            Loading chr16...done
            Loading chr17...done
            Loading chr18...done
            Loading chr19...done
            Loading chr20...done
            Loading chr21...done
            Loading chr22...done
            Loading chrX...done
            Loading chrY...done
            Loading chrM...done
    Found 0 potential split-segment junctions
    Indexing extensions in Tophat_Brain/tmp//left_kept_reads_missing.fq
    Total extensions: 394607205
    Looking for junctions by island end pairings
    Adding hits from segment file 0 to coverage map
    Map covers 382631 bases
    Map covers 374273 bases in sufficiently long segments
    Map contains 8351 good islands
    417440 are left looking bases
    417332 are right looking bases
    Collecting potential splice sites in islands
    reporting synthetic splice junctions...
    Examining donor-acceptor pairings in chr20
    Examining donor-acceptor pairings in chr21
    Examining donor-acceptor pairings in chr22
    Examining donor-acceptor pairings in chr19
    Examining donor-acceptor pairings in chr18
    Examining donor-acceptor pairings in chr11
    Examining donor-acceptor pairings in chr10
    Examining donor-acceptor pairings in chr13
    Examining donor-acceptor pairings in chr12
    Examining donor-acceptor pairings in chr15
    Examining donor-acceptor pairings in chr14
    Examining donor-acceptor pairings in chr17
    Examining donor-acceptor pairings in chr16
    Examining donor-acceptor pairings in chrX
    Examining donor-acceptor pairings in chrY
    Examining donor-acceptor pairings in chr2
    Examining donor-acceptor pairings in chr3
    Examining donor-acceptor pairings in chr1
    Examining donor-acceptor pairings in chr6
    Examining donor-acceptor pairings in chr7
    Examining donor-acceptor pairings in chr4
    Examining donor-acceptor pairings in chr5
    Examining donor-acceptor pairings in chr8
    Examining donor-acceptor pairings in chr9
    Found 865 potential island-end pairing junctions
    done
    Looking for junctions between and within islands
    Adding hits from segment file 0 to coverage map
    Recording coverage islands
    Found 62807 islands covering 2120609 bases
    Collecting potential splice sites in islands
    reporting synthetic splice junctions...
    Examining donor-acceptor pairings in chr20
    Examining donor-acceptor pairings in chr21
    Examining donor-acceptor pairings in chr22
    Examining donor-acceptor pairings in chr19
    Examining donor-acceptor pairings in chr18
    Examining donor-acceptor pairings in chr11
    Examining donor-acceptor pairings in chr10
    Examining donor-acceptor pairings in chr13
    Examining donor-acceptor pairings in chr12
    Examining donor-acceptor pairings in chr15
    Examining donor-acceptor pairings in chr14
    Examining donor-acceptor pairings in chr17
    Examining donor-acceptor pairings in chr16
    Examining donor-acceptor pairings in chrX
    Examining donor-acceptor pairings in chrY
    Examining donor-acceptor pairings in chr2
    Examining donor-acceptor pairings in chr3
    Examining donor-acceptor pairings in chr1
    Examining donor-acceptor pairings in chr6
    Examining donor-acceptor pairings in chr7
    Examining donor-acceptor pairings in chr4
    Examining donor-acceptor pairings in chr5
    Examining donor-acceptor pairings in chr8
    Examining donor-acceptor pairings in chr9
    Found 173876 potential intra-island junctions
    done
    Reporting potential splice junctions...done
    Reported 174467 total possible splices
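
The counts in the two bowtie summaries above can be checked mechanically. Here is a minimal Python sketch, assuming the log consists of the four summary lines shown in this post (the function names `parse_bowtie_summary` and `percent` are made up for illustration, they are not part of TopHat or bowtie). It pulls out the counts, confirms that the three categories add up to the reads processed, and recomputes the aligned fraction.

```python
import re

# Patterns for the four summary lines, copied from the logs in this post.
PATTERNS = {
    "processed": r"# reads processed: (\d+)",
    "aligned": r"# reads with at least one reported alignment: (\d+)",
    "failed": r"# reads that failed to align: (\d+)",
    "suppressed": r"# reads with alignments suppressed due to -m: (\d+)",
}


def parse_bowtie_summary(log_text):
    """Return a dict of read counts found in the summary text (0 if a line is missing)."""
    counts = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        counts[key] = int(match.group(1)) if match else 0
    return counts


def percent(part, whole):
    """Percentage of `part` in `whole`, or 0.0 when `whole` is zero."""
    return 100.0 * part / whole if whole else 0.0


# First summary from the post (file2lqYQ1.log).
log = """\
# reads processed: 17184877
# reads with at least one reported alignment: 17194 (0.10%)
# reads that failed to align: 17156835 (99.84%)
# reads with alignments suppressed due to -m: 10848 (0.06%)
"""

counts = parse_bowtie_summary(log)

# The three categories should account for every processed read.
total = counts["aligned"] + counts["failed"] + counts["suppressed"]
print("categories sum to processed:", total == counts["processed"])

# prep_reads.log says 62080 of 17246957 reads were filtered out beforehand.
print("matches prep_reads:", 17246957 - 62080 == counts["processed"])

print("aligned: %.2f%%" % percent(counts["aligned"], counts["processed"]))
```

So the numbers are internally consistent: the logs are reporting a real, very low alignment rate rather than a bookkeeping error.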

  • #2
    What's your mapping rate if you just run them through bowtie instead of tophat?



    • #3
      It is the same with just bowtie.

      Why do ~90% fail?
      I used Eric Wang's public data, so it should be good data.



      • #4
        Did you remember to clip the adapters?

        From experience: I once aligned a FASTQ file that still contained adapters and got a similarly low alignment percentage.

        That's my guess.



        • #5
          Hey repinementer,

          Did you solve this problem by clipping adapters, and do you know what the different log files you posted actually mean? I get similar log files with different suffixes (fileXYZ.log) that contain different entries, and I cannot make much sense of them.

          Besides, does anyone know whether it is appropriate to simply divide the number of lines in TopHat's accepted_hits.sam file by the total number of reads to get the overall fraction of aligned reads?

          Best Moritz
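
On the line-count question: a SAM file has one line per reported alignment, so a read that maps to several places contributes several lines, and header lines (starting with `@`) are not alignments at all. Dividing the raw line count by the number of reads would therefore tend to overestimate the aligned fraction. A rough Python sketch of counting distinct read names instead, assuming single-end reads with unique names (the function name `count_aligned_reads` and the toy records are made up for illustration, and the records are cut down to a few columns):

```python
def count_aligned_reads(sam_lines):
    """Count distinct read names (QNAME) that have at least one mapped record."""
    names = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        # FLAG bit 0x4 marks a record as unmapped.
        if int(fields[1]) & 0x4:
            continue
        names.add(fields[0])
    return len(names)


# Toy records: read_1 is reported twice (multi-mapped), read_3 is unmapped.
toy_sam = [
    "@HD\tVN:1.0\n",
    "read_1\t0\tchr1\t100\n",
    "read_1\t16\tchr5\t2000\n",
    "read_2\t0\tchr2\t300\n",
    "read_3\t4\t*\t0\n",
]

total_reads = 3  # number of reads given to the aligner in this toy case
aligned = count_aligned_reads(toy_sam)
print("aligned reads:", aligned)
print("fraction aligned: %.2f" % (aligned / total_reads))
```

For a real file you could pass an open file handle, for example `count_aligned_reads(open("accepted_hits.sam"))`. With paired-end data both mates may share a name, so the count would need adjusting.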



          • #6
            Hi drd2009,
            could you tell me how to clip the adapters?



            • #7
              Originally posted by northbio
              hi,drd2009,
              could you tell me how to clip the adapters?
              I suggest you look for a read-processing ("seq-deal") pipeline.
              It is not difficult to get the "clean" sequence.
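
To give an idea of what clipping means in practice, here is a minimal Python sketch that removes a 3' adapter from a read by exact matching only. The adapter string below is a hypothetical placeholder, not the adapter of any particular kit, and the function names are made up for illustration. Dedicated tools such as cutadapt or the FASTX-Toolkit also tolerate mismatches, so treat this as an illustration of the idea rather than a replacement for them.

```python
def clip_adapter(seq, adapter, min_overlap=5):
    """Return `seq` with a 3' adapter removed (exact matches only)."""
    # Case 1: the whole adapter occurs somewhere in the read.
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx]
    # Case 2: the read ends with the first k bases of the adapter.
    # Try the longest possible overlap first, down to min_overlap.
    longest = min(len(adapter) - 1, len(seq))
    for k in range(longest, min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:len(seq) - k]
    # No adapter found: leave the read unchanged.
    return seq


def clip_record(seq, qual, adapter):
    """Clip a FASTQ sequence and trim its quality string to the same length."""
    clipped = clip_adapter(seq, adapter)
    return clipped, qual[:len(clipped)]


ADAPTER = "TCGTATGCCG"  # hypothetical adapter, for illustration only

full = clip_adapter("ACGTACGTAC" + ADAPTER + "AAAA", ADAPTER)
partial = clip_adapter("ACGTACGTAC" + ADAPTER[:7], ADAPTER)
untouched = clip_adapter("ACGTACGTACGGGG", ADAPTER)
print(full, partial, untouched)
```

After clipping, reads that end up very short are usually dropped before alignment, since they map ambiguously.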

