  • Tophat: how to define primary alignment

    In the SAM file generated by Tophat, when a read maps to multiple locations, one of the alignments is labeled as the primary alignment.
    Does anybody know how Tophat defines the primary alignment?
    Is there any reference that mentions it?
    Thanks!

    I checked one read that has four alignments.
    All of them have 2 mismatches, but only one is labeled as the primary alignment.
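
A minimal sketch of how one might list the alignments of a read and see which one carries the primary label, assuming single-end reads and the standard SAM flag bits (0x100 for secondary, 0x800 for supplementary). The function name `group_alignments` and the example records are hypothetical, not real Tophat output, and the sketch only reports which record is flagged as primary; it says nothing about how Tophat makes that choice.

```python
from collections import defaultdict

SECONDARY = 0x100      # SAM flag bit: secondary ("not primary") alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment

def group_alignments(sam_lines):
    """Group alignment records by read name; header lines are skipped."""
    by_read = defaultdict(list)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        chrom, pos = fields[2], int(fields[3])
        # Primary here means neither the secondary nor the supplementary bit is set.
        is_primary = not (flag & (SECONDARY | SUPPLEMENTARY))
        by_read[name].append((chrom, pos, flag, is_primary))
    return by_read

# Hypothetical records for illustration: one read with four hits, 2 mismatches each.
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t256\tchr1\t100\t1\t50M\t*\t0\t0\t*\t*\tNM:i:2",
    "read1\t0\tchr2\t200\t1\t50M\t*\t0\t0\t*\t*\tNM:i:2",
    "read1\t272\tchr3\t300\t1\t50M\t*\t0\t0\t*\t*\tNM:i:2",
    "read1\t256\tchr4\t400\t1\t50M\t*\t0\t0\t*\t*\tNM:i:2",
]

for chrom, pos, flag, is_primary in group_alignments(example)["read1"]:
    print(chrom, pos, flag, "primary" if is_primary else "non-primary")
```

For paired-end data both mates share a read name, so the grouping key would also need the first/second-in-pair bits.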

  • #2
    Did you ever figure this out? I would be interested in knowing the answer.

    Thanks,


    • #3
      I am also quite interested in this! Still...


      • #4
        Because some of my "non-primary" alignments look quite good in IGV...


        • #5
          There seems to be confusion in what's meant by "primary" in the field.

          E.g. SAM flag interpreter http://picard.sourceforge.net/explain-flags.html clearly uses "non-primary" as an equivalent for spliced


          • #6
            Originally posted by apredeus:
            There seems to be confusion in what's meant by "primary" in the field.

            E.g. SAM flag interpreter http://picard.sourceforge.net/explain-flags.html clearly uses "non-primary" as an equivalent for spliced
            I don't see the word 'spliced' anywhere on that page. The difference between primary and non-primary is that only one primary alignment is allowed per read. It was pretty clear, at least until 'supplementary alignments' came along and made things murky. The intent of supplementary appears to be representing chimeric alignments, and thus it could be used for splice events, while secondary shouldn't be used that way.
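
The distinction drawn above can be sketched as a small classifier over the FLAG field. This assumes the bit values from the SAM specification (0x4 unmapped, 0x100 secondary, 0x800 supplementary); the function name `alignment_kind` is made up for illustration.

```python
def alignment_kind(flag):
    """Classify a SAM record by its FLAG value (an integer)."""
    if flag & 0x4:
        return "unmapped"
    if flag & 0x800:
        return "supplementary"
    if flag & 0x100:
        return "secondary"
    # Neither 0x100 nor 0x800 is set: this is the read's one primary line.
    return "primary"

for flag in (0, 16, 256, 272, 2048):
    print(flag, alignment_kind(flag))
```

Note that strand (0x10) is independent of this classification, so 0 and 16 are both primary and 256 and 272 are both secondary.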


            • #7
              I don't see the word 'spliced' anywhere on that page.
              Flags of 256/272 in splice aligners' output usually mean a gapped (spliced) alignment to the + and - strand, respectively. That's what they call "not primary alignment" on that page.

              The difference between primary and non-primary is that only one primary alignment is allowed.
              Not sure what that means. I figured that if there's one best alignment score, the read would be considered as mapping to one genomic position, while if there are a few scores that are equally high, it would be a multimapper. Now, most RNA-seq aligners do allow multimappers up to a certain number of locations, e.g. Tophat's default is 15.

              If you call the alignment that has the best score primary, it's not clear what's the use of having a term like that, since the others would be discarded anyway?


              • #8
                Originally posted by apredeus:
                If you call the alignment that has the best score primary, it's not clear what's the use of having a term like that, since the others would be discarded anyway?
                Depends on the application; sometimes they are discarded, sometimes not. They can be used fractionally when calculating coverage, for example. The most important case is when there is no single best-scoring location. Then one location can be assigned 'primary' and the rest 'secondary'. This allows a tool (or person) aware of secondary alignments to use that information later, or, for simplicity (for example, when converting SAM to FASTQ), all the secondary sites can just be ignored.
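
The two uses described above can be sketched side by side. This is only an illustration under stated assumptions: alignments are given as hypothetical `(read_name, locus, flag)` tuples, each mapped location of a read gets an equal share of 1/N, and the function names are invented here rather than taken from any particular tool.

```python
from collections import defaultdict

def fractional_counts(alignments):
    """Spread each read evenly over all of its primary and secondary hits."""
    hits = defaultdict(list)
    for name, locus, flag in alignments:
        if flag & 0x4 or flag & 0x800:  # skip unmapped and supplementary records
            continue
        hits[name].append(locus)
    counts = defaultdict(float)
    for loci in hits.values():
        weight = 1.0 / len(loci)
        for locus in loci:
            counts[locus] += weight
    return dict(counts)

def primary_only_counts(alignments):
    """Count each read once, at its primary alignment, ignoring the rest."""
    counts = defaultdict(int)
    for name, locus, flag in alignments:
        if flag & (0x4 | 0x100 | 0x800):
            continue
        counts[locus] += 1
    return dict(counts)

# Hypothetical data: read1 has four hits, read2 maps uniquely.
alns = [
    ("read1", "geneA", 0),
    ("read1", "geneB", 256),
    ("read1", "geneC", 272),
    ("read1", "geneD", 256),
    ("read2", "geneA", 16),
]
print(fractional_counts(alns))
print(primary_only_counts(alns))
```

With the fractional scheme every location of read1 receives a quarter of a read; with the primary-only scheme the whole read lands on whichever location the aligner happened to mark as primary.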


                • #9
                  You're right, I was actually totally wrong to assume that flag means a spliced as opposed to a non-spliced alignment. It's "secondary" as defined by some other criteria.

                  Thank you


                  • #10
                    Originally posted by apredeus:
                    Flags of 256/272 in splice aligners' output usually mean a gapped (spliced) alignment to the + and - strand, respectively. That's what they call "not primary alignment" on that page.
                    That's not splicing, it's a chimeric/fusion/non-linear alignment (there are a bunch of terms for this). These days, those should have the 0x800 bit set in the flag.
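
For reference, the flag values discussed in this thread can be broken into their individual bits. The sketch below assumes the bit names and values listed in the SAM specification; `FLAG_BITS` and `explain_flag` are names chosen here for illustration. Under those definitions, 256 is the secondary bit alone, 272 is secondary plus reverse strand, and 0x800 (2048) is a separate supplementary bit.

```python
# Bit values as listed in the SAM specification; the short names are informal.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

def explain_flag(flag):
    """Return the names of all bits set in a SAM FLAG value, lowest bit first."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]

for flag in (256, 272, 2048, 2064):
    print(flag, explain_flag(flag))
```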
