
  • How to define XS tag

    Hi.

    Does anyone know how to define the XS tag? I'm trying to use Cufflinks to analyze GSNAP results, but it reports many errors in the SAM file: "SAM error on line ***: found spliced alignment without XS attribute". I tracked down these lines. They all have XS:A:?. So I guess the problem is that GSNAP couldn't determine the splice direction for these reads. What should I do?

    I'd appreciate your help! Thanks very much!
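
A minimal sketch for checking how widespread the problem is, assuming an uncompressed, tab-separated SAM file. It counts spliced alignments (CIGAR containing an N operation) by their XS:A value. The function names are made up for illustration and are not part of GSNAP or Cufflinks.

```python
def spliced_xs_status(sam_line):
    """Return None for unspliced or unmapped records,
    otherwise the XS:A value ('+', '-', '?') or 'missing'."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return None  # not a complete alignment record
    cigar = fields[5]
    if "N" not in cigar:
        return None  # no skipped region, so no splice to orient
    for tag in fields[11:]:
        if tag.startswith("XS:A:"):
            return tag[len("XS:A:"):]
    return "missing"


def count_spliced_by_xs(lines):
    """Tally spliced alignments by XS status, skipping header lines."""
    counts = {}
    for line in lines:
        if line.startswith("@"):
            continue
        status = spliced_xs_status(line)
        if status is not None:
            counts[status] = counts.get(status, 0) + 1
    return counts
```

Passing an open file handle to count_spliced_by_xs gives a quick picture of how many spliced reads carry "?" or no XS tag at all.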

  • #2
    Just encountered the same problem.

    I also looked at the output of Cufflinks and compared it to what I had got with the output of a previous version of GSNAP (old version 2011-03-28).

    The main issue is that I now get many more Ensembl annotated genes (around 14000) with a FAIL FPKM_status, compared to the 127 that I got when using the previous version of GSNAP.

    If anybody has come up with a solution to that or knows why GSNAP behaves in such a way, I'd love to hear your comments.



    • #3
      From http://research-pub.gene.com/gmap/src/README :

      To provide splice orientation, though,
      our SAM output includes information about splice orientation in an
      extra "XS" field, which has possible values "+" (meaning the expected
      GT-AG, GC-AG, or AT-AC dinucleotide pair is on the plus strand of the
      genome), "-" (the dinucleotides are on the minus strand), or "?" (the
      direction is unknown, because the dinucleotides do not match GT-AG,
      GC-AG, AT-AC, or their complements).

      Edit:
      I forgot to say: maybe you should not rely on this spliced alignment...
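
The rule quoted from the README can be sketched as follows. This is one reading of that paragraph, not GSNAP's actual code: it takes the plus-strand genomic sequence of an intron (the region skipped by the CIGAR N operation) and checks its first and last two bases. The function names are hypothetical.

```python
# Donor/acceptor pairs as they appear when the intron is read on the plus strand.
PLUS_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def splice_orientation(intron_seq):
    """Return '+', '-' or '?' for an intron given as plus-strand sequence."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        return "?"
    first, last = seq[:2], seq[-2:]
    if (first, last) in PLUS_PAIRS:
        return "+"
    # For a minus-strand intron, the donor is the reverse complement of the
    # last two plus-strand bases and the acceptor that of the first two.
    if (revcomp(last), revcomp(first)) in PLUS_PAIRS:
        return "-"
    return "?"
```

For example, an intron that reads CT...AC on the plus strand is a GT-AG intron on the minus strand, so the sketch returns "-".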



      • #4
        sam output error

        So did anyone find out how to solve the SAM output error (XS tag)?


        Thanks



        • #5
          I am in the same situation. The way I am trying to resolve the problem is by writing a program that determines the strand information for all reads (not only those located on a splice junction). There doesn't seem to be any other method.

          Brdido, why did you say "maybe you should not rely on this spliced alignment..."?



          • #6
            Zorph,

            I wrote that because, if the alignment program was not able to detect the "expected" splice-site bases, something may be wrong with the read.

            But recently I put this question to the developers of Cufflinks and they suggested adding this tag "manually".

            I did not answer here because my problem was that I was using two different aligners, and one of them didn't assign the XS:A tag.

            But I do believe now that adding the XS tag is not a problem at all and is a good solution for compatibility with Cufflinks. It is what I did.
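
A minimal sketch of what adding the tag "manually" could look like for one SAM record. The post does not say how the strand value was chosen, so the caller has to supply it here (for example from the library protocol). The helper name set_xs is hypothetical.

```python
def set_xs(sam_line, strand):
    """Return the SAM record with its XS:A tag set to the given strand.

    Any existing XS:A entry (including XS:A:?) is replaced; all other
    optional fields are left untouched.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    fields = sam_line.rstrip("\n").split("\t")
    mandatory, tags = fields[:11], fields[11:]
    tags = [t for t in tags if not t.startswith("XS:A:")]
    tags.append("XS:A:" + strand)
    return "\t".join(mandatory + tags)
```

Header lines (those starting with "@") should be passed through unchanged rather than given to this function.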



            • #7
              Hi Zorph, how did you find the strand information from a sam file? The first three lines of the sam file I'm using are

              HWI-EAS440:2:25:193:1474#0 0 2L 7532 25 75M * 0 0 CTCGCATGTAGAGATTTCCACTTATGTTTTCTCTACTTTCAGCAACCGAGAAGAGAACCCANGTTTGAACAAGTA abbaba`b^_`abaa`aa_aaaabaa_`aaaa_aa`aaa`aaba`]^`aa_aa``[[Z]XWDVR\\\YX^^^ZQU NM:i:1 X1:i:1 MD:Z:61N13
              HWI-EAS179:1:29:802:267#0 16 2L 7621 25 75M * 0 0 ACAGCTATCCCCGCTTCATAACGAATGAGGCTGCCGAGGACCTGATTTACAAGAAGTCCATGGGCGAGCGGGATC TQTYWRRQQXX]\\XOYZNZYZ]^YZ]^^^\X_^^a``a[\`^`_Z]^^aaaaaaa_aa`]aaa`a`aabbbaaa NM:i:0 X0:i:1 MD:Z:75
              HWUSI-EASXXX:2:65:1779:826#0 160 2L 7622 25 76M = 7711 165 CAGCTATCCCCGCTTCATAACGAATGAGGCTGCCGAGGACCTGATCTACAAGAAGTCCATGGGCGAGCGGGATCAG BC>CBCBACCBC@CCBCCCCBBCBCCCBBCCCCBBBBBA@ABBB@*<A;?BB>BABA@ABBABAB:BACB?7A=?= NM:i:1 X1:i:1 MD:Z:45C30

              I can't tell what their strand information is. Thanks!
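
As a hedged aside, the second column of each record above (0, 16, 160) is the SAM FLAG, whose bits say which genome strand the read aligned to. That is not the same thing as the strand the transcript came from. The sketch below decodes a few of the standard flag bits; the function name is made up for illustration.

```python
# A subset of the FLAG bits defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x10: "read_reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def describe_flag(flag):
    """Return (aligned_strand, list of set bit names) for a SAM FLAG."""
    names = [name for bit, name in FLAG_BITS.items() if flag & bit]
    aligned_strand = "-" if flag & 0x10 else "+"
    return aligned_strand, names
```

For the three records shown, this gives "+" for flag 0, "-" for flag 16, and "+" for flag 160 (second in pair, mate on the reverse strand). All three have unspliced CIGAR strings (75M, 76M), so the "spliced alignment without XS attribute" error should not concern them.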



              • #8
                hey lijy03,

                "Hi Zorph, how did you find the strand information from a sam file?"

                I didn't get the strand information from the SAM file. I knew it because of the way I prepped my reads.
                With my preparation using Epicentre's ScriptSeq kit, I knew that my PE-1 reads aligned to the 5' end of the sequence and that my PE-2 reads aligned to the 3' end. Given the way Illumina sequences these reads, it works out that all my PE-1 reads aligned to the sense strand of the RNA and my PE-2 reads aligned to the opposite strand of my transcribed RNA.

                Using BOTH this information and the SAM flag, I was able to tell which strand my RNA was generated from.

                If you have a stranded prep and you know which read has the adaptor indicating directionality or, in the case of single reads, which end of the read corresponds to the 5' or 3' direction, then you should be able to figure out the directionality of the reads in your library.
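
The combination of library knowledge and SAM flag described above can be sketched like this. It assumes, as in the post, that read 1 is sense to the RNA and read 2 is antisense. The parameter read1_is_sense is a hypothetical switch for protocols with the opposite orientation, so check your own kit before relying on it.

```python
def transcript_strand(flag, read1_is_sense=True):
    """Infer the transcript strand ('+' or '-') from a SAM FLAG.

    Single-end reads have no 0x80 bit set and are treated like read 1.
    """
    aligned = "-" if flag & 0x10 else "+"  # strand the read aligned to
    is_read2 = bool(flag & 0x80)           # second read of a pair
    read_is_sense = read1_is_sense != is_read2
    if read_is_sense:
        return aligned
    # An antisense read aligns to the strand opposite the transcript.
    return "+" if aligned == "-" else "-"
```

The returned value could then be written into the XS:A field of spliced reads that lack a usable one.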



                • #9
                  hey lijy03,

                  This thread might help you:

                  Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
