  • How to define XS tag

    Hi.

    Does anyone know how to define the XS tag? I'm trying to use Cufflinks to analyze GSNAP results, but it reports many errors in the SAM file: "SAM error on line ***: found spliced alignment without XS attribute". I tracked down these lines, and they all have XS:A:?. So I guess the problem is that GSNAP couldn't determine the splice direction for these reads. What should I do?

    Any help would be appreciated. Thanks very much!

  • #2
    Just encountered the same problem.

    I also looked at the output of Cufflinks and compared it to what I had got with the output of a previous version of GSNAP (old version 2011-03-28).

    The main issue is that I now get many more Ensembl-annotated genes with a FAIL FPKM_status: around 14000, compared to the 127 that I got when using the previous version of GSNAP.

    If anybody has come up with a solution to that or knows why GSNAP behaves in such a way, I'd love to hear your comments.



    • #3
      From http://research-pub.gene.com/gmap/src/README :

      To provide splice orientation, though,
      our SAM output includes information about splice orientation in an
      extra "XS" field, which has possible values "+" (meaning the expected
      GT-AG, GC-AG, or AT-AC dinucleotide pair is on the plus strand of the
      genome), "-" (the dinucleotides are on the minus strand), or "?" (the
      direction is unknown, because the dinucleotides do not match GT-AG,
      GC-AG, AT-AC, or their complements).

      Edit:
      I forgot to say: maybe you should not rely on this spliced alignment...
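The rule quoted from the README can be sketched in a few lines. This is only an illustration of the described mapping from intron-boundary dinucleotides to "+", "-", or "?", not GSNAP's actual code; the function name `splice_orientation` and its two arguments (the first and last two intron bases as read on the plus strand of the genome) are assumptions made for the example.

```python
# Illustrative sketch of the XS rule described in the GMAP/GSNAP README.
# Not GSNAP's implementation; names here are hypothetical.

# Expected (donor, acceptor) pairs when the intron is on the plus strand.
PLUS_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# For a minus-strand intron, the plus-strand sequence starts with the
# reverse complement of the acceptor and ends with that of the donor.
MINUS_PAIRS = {(revcomp(acc), revcomp(don)) for don, acc in PLUS_PAIRS}


def splice_orientation(intron_first2, intron_last2):
    """Return '+', '-', or '?' for the intron boundary dinucleotides."""
    pair = (intron_first2.upper(), intron_last2.upper())
    if pair in PLUS_PAIRS:
        return "+"
    if pair in MINUS_PAIRS:
        return "-"
    return "?"
```

A read whose intron boundaries match none of the six patterns ends up with "?", which is the case that triggers the Cufflinks error discussed in this thread.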



      • #4
        SAM output error

        So did anyone find out how to solve the SAM output error (XS tag) problem?


        Thanks



        • #5
          I am in the same situation. The way I am trying to resolve the problem is by creating a program that would find the strand information (in addition to those located on a splice junction). There doesn't seem to be any other method.

          Brdido--why did you say "maybe you should not rely on this spliced alignment...?"



          • #6
            Zorph,

            I wrote that because, since the alignment program was not able to detect the "expected" splice-site bases, something is probably wrong with the read.

            But recently I put this question to the developers of Cufflinks, and they suggested adding this tag "manually".

            I did not answer here because my problem was that I was using two different aligners, and one of them didn't assign the XS:A tag.

            But I do believe now that adding the XS tag is not a problem at all and is a good solution for compatibility with Cufflinks. It is what I did.
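As a rough sketch of what adding the tag "manually" could look like, the hypothetical helper below rewrites one tab-separated SAM line at a time. It assumes you already know the transcript strand for the read (for example from a stranded library prep); if no strand is supplied, spliced records with a missing or "?" XS value are dropped instead. The function name `fix_xs` and its behavior are assumptions for illustration, not something Cufflinks or GSNAP provides.

```python
# Hypothetical helper: give spliced SAM records a usable XS:A tag.
# Assumes tab-separated SAM text; 'strand' must come from outside knowledge.

def fix_xs(sam_line, strand=None):
    """Return the SAM line with XS:A:+/- set, or None to drop the record."""
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line  # header lines pass through unchanged
    fields = line.split("\t")
    if "N" not in fields[5]:
        return line  # unspliced alignment: the XS attribute is not required
    tags = fields[11:]
    if any(tag in ("XS:A:+", "XS:A:-") for tag in tags):
        return line  # already has a definite splice strand
    if strand not in ("+", "-"):
        return None  # strand unknown: drop the spliced record
    # A SAM record may carry each tag only once, so remove any old XS tag.
    kept = [tag for tag in tags if not tag.startswith("XS:")]
    return "\t".join(fields[:11] + kept + ["XS:A:" + strand])
```

Whether to drop or relabel ambiguous records is a judgment call; dropping follows the caution in #3, while relabeling follows the suggestion reported in #6.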



            • #7
              Hi Zorph, how did you find the strand information from a sam file? The first three lines of the sam file I'm using are

              HWI-EAS440:2:25:193:1474#0 0 2L 7532 25 75M * 0 0 CTCGCATGTAGAGATTTCCACTTATGTTTTCTCTACTTTCAGCAACCGAGAAGAGAACCCANGTTTGAACAAGTA abbaba`b^_`abaa`aa_aaaabaa_`aaaa_aa`aaa`aaba`]^`aa_aa``[[Z]XWDVR\\\YX^^^ZQU NM:i:1 X1:i:1 MD:Z:61N13
              HWI-EAS179:1:29:802:267#0 16 2L 7621 25 75M * 0 0 ACAGCTATCCCCGCTTCATAACGAATGAGGCTGCCGAGGACCTGATTTACAAGAAGTCCATGGGCGAGCGGGATC TQTYWRRQQXX]\\XOYZNZYZ]^YZ]^^^\X_^^a``a[\`^`_Z]^^aaaaaaa_aa`]aaa`a`aabbbaaa NM:i:0 X0:i:1 MD:Z:75
              HWUSI-EASXXX:2:65:1779:826#0 160 2L 7622 25 76M = 7711 165 CAGCTATCCCCGCTTCATAACGAATGAGGCTGCCGAGGACCTGATCTACAAGAAGTCCATGGGCGAGCGGGATCAG BC>CBCBACCBC@CCBCCCCBBCBCCCBBCCCCBBBBBA@ABBB@*<A;?BB>BABA@ABBABAB:BACB?7A=?= NM:i:1 X1:i:1 MD:Z:45C30

              I can't tell what their strand information is. Thanks!
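For records like the three above, the second column (the FLAG) and the CIGAR string are the places to look. The sketch below decodes the strand-related FLAG bits as defined in the SAM specification and checks whether a CIGAR contains an intron gap (`N`). The helper names `describe_flag` and `is_spliced` are made up for this example. Note that the FLAG only gives the orientation of the aligned read, not the strand the RNA was transcribed from.

```python
# Sketch: decode strand-related SAM FLAG bits and detect spliced CIGARs.
# Bit values follow the SAM specification; function names are hypothetical.

def describe_flag(flag):
    """Return the FLAG bits that matter for read orientation."""
    return {
        "paired": bool(flag & 0x1),         # read is part of a pair
        "reverse": bool(flag & 0x10),       # read aligned to the reverse strand
        "mate_reverse": bool(flag & 0x20),  # mate aligned to the reverse strand
        "read1": bool(flag & 0x40),         # first read of the pair
        "read2": bool(flag & 0x80),         # second read of the pair
    }


def is_spliced(cigar):
    """A spliced alignment has at least one N (skipped region) operation."""
    return "N" in cigar


# The FLAG and CIGAR values from the three records quoted above.
for flag, cigar in [(0, "75M"), (16, "75M"), (160, "76M")]:
    print(flag, cigar, is_spliced(cigar), describe_flag(flag))
```

Since none of the three CIGAR strings contains an `N`, these particular alignments are unspliced, so they would not be the ones triggering the "spliced alignment without XS attribute" error.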



              • #8
                hey lijy03,

                "Hi Zorph, how did you find the strand information from a sam file?"

                I didn't get the strand information from the SAM file. I knew it because of the way I prepped my reads.
                With my preparation using Epicentre's ScriptSeq kit, I knew that my PE-1 reads aligned to the 5' end of the sequence and that my PE-2 reads aligned to the 3' end. Given the way Illumina sequences these reads, it works out that all my PE-1 reads aligned to the sense strand of the RNA and my PE-2 reads aligned to the opposite strand of my transcribed RNA.

                Using BOTH this information and the SAM flag, I was able to tell which strand my RNA was generated from.

                If you have a stranded prep and you know which read carries the adaptor indicating directionality (or, in the case of single reads, which end of the read corresponds to the 5' or 3' direction), then you should be able to figure out the directionality of the reads in your library.
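The reasoning above (library protocol plus SAM flag) can be sketched as a small function. It assumes a stranded library in which read 1 matches the sense strand of the RNA and read 2 the opposite strand, as described in this post, and treats single-end reads like read 1. The name `transcript_strand` and the `read1_is_sense` switch are hypothetical; other kits may use the opposite convention, so check your own protocol.

```python
# Sketch: infer the transcribed strand from the SAM FLAG for a stranded prep.
# Assumption: read 1 (or a single-end read) is sense to the RNA by default.

def transcript_strand(flag, read1_is_sense=True):
    """Return '+' or '-' for the strand the RNA was transcribed from."""
    is_reverse = bool(flag & 0x10)  # read aligned to the reverse strand
    is_read2 = bool(flag & 0x80)    # second read of a pair
    # Start from the orientation of the aligned read itself.
    on_plus = not is_reverse
    # Read 2 comes from the opposite strand of its fragment.
    if is_read2:
        on_plus = not on_plus
    # Flip everything for protocols where read 1 is antisense.
    if not read1_is_sense:
        on_plus = not on_plus
    return "+" if on_plus else "-"
```

Under these assumptions, both mates of a properly paired fragment (for example flags 99 and 147) resolve to the same strand, and that value could be written out as the XS:A tag for spliced reads.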



                • #9
                  hey lijy03,

                  This thread might help you:

                  Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
