  • New: updated SpliceMap supports SAM format/cufflinks

    Hi SEQanswers Community,

    SpliceMap has received significant updates, with numerous bugs fixed. If you were having trouble with the old version, please try this new one.

    New webpage: http://www-stat.stanford.edu/~kinfai/SpliceMap/

    The new webpage has a detailed tutorial/manual to help you along.

    In the next point release we will add Bowtie support for short-read mapping. I realize that not many of you have access to Eland, which is inconvenient because SeqMap is quite slow (but very accurate).
    In the next version you will be able to use the workflow: Bowtie -> SpliceMap -> Cufflinks.

    Let me know if you have trouble with using the SAM output with Cufflinks.

    The source code is now included so you can compile the code for your own platform. It should work on any version of Linux.

    ---

    Notable new features:
    • SAM format support (sorted for use in Cufflinks)
    • Many bugs fixed (sorry about that)
    • More accurate coverage calculation
    • Faster junction search
    • and more!


    ---

    Comparison to Tophat with 23 million (50bp) reads mapped to the human genome (hg18). EST validation was used to judge specificity.

    Junction counts for SpliceMap (two columns, depending on level of filtering) and Tophat:

                         SpliceMap    SpliceMap    Tophat
    Junctions found      169,098      121,200      133,722
    EST validated        143,840      114,661      117,113
    Novel junctions      31,155       7,521        19,777
    Specificity (EST)    85.06%       94.60%       87.58%

    I also tested randomly filtering out unreliable junctions until the total number of junctions found was about the same as for Tophat. The specificity in that case is about 90%, which is still an improvement over Tophat.

    Details here: http://www-stat.stanford.edu/~kinfai.../features.html
    Last edited by john_mu; 05-20-2010, 05:38 PM.
    SpliceMap: De novo detection of splice junctions from RNA-seq
    Download SpliceMap Comment here

  • #2
    Sounds great

    Are there any plans to support reads shorter than 50 bp?



    • #3
      Thanks Thomas!

      Sorry, no.. not at this stage

      Supporting reads shorter than 50 bp would need significant reworking of the algorithm. It could be said that the algorithm is "optimized" for 50 bp reads.

      Next on the agenda is Bowtie support, then some usability improvements and ideas on assessing junction reliability.

      Edit: oops.. I posted this from my old account.



      • #4
        Is the next version going to support color space reads? Tophat doesn't natively support color space reads, although the mapping algorithm it uses (Bowtie) can handle color space reads.

        thanks



        • #5
          Hi xguo,

          Thanks for your interest! We are investigating how to support colour-space reads.

          However, it currently seems that Bowtie (even with "try hard") is not as sensitive as Eland (or SeqMap) in the mapping, and this is affecting the performance of SpliceMap. We are investigating why this is the case.

          But, yes, once Bowtie support is implemented, colour-space should appear in the next version (or the one after that).

          John Mu


          • #6
            Just to update... Bowtie is fine. There was a typo in the code which gave the impression that there was poor mapping.

            Preliminary tests suggest it is performing well. The release should be out in a week or so! This will include native FASTQ and FASTA support (although there is still no support for read quality, which is overall not too important for junction search).


            • #7
              Originally posted by john_mu:
              The release should be in 1 week or so! This will include native FASTQ and FASTA support (Although, still no support for read quality, which is overall not too important to junction search).

              Why is read quality not important to junction search? If a bad quality read is aligned to the wrong place, will it not result in your program detecting spurious junctions? Unless your thought is that bad quality reads will not get aligned at all? Please tell us your logic.

              I am waiting to use your program when bowtie is supported. I would like to use the cufflinks/cuffcompare pipeline after I get the reads mapped to junctions using your program. I think all we need is SAM formatted alignment that is sorted for this purpose.

              Also, what percentage of your novel junctions included the canonical splice sites? I am assuming a big chunk? Finally what is your definition of specificity here?



              • #8
                Hi thinkRNA,

                My answers to your questions are below.

                Originally posted by thinkRNA:
                Why is read quality not important to junction search? If a bad quality read is aligned to the wrong place, will it not result in your program detecting spurious junctions? Unless your thought is that bad quality reads will not get aligned at all? Please tell us your logic.

                In general, when mapping, read strata (number of mismatches) trump read quality. For example, a read with one mismatch but low quality is preferred over a read with two mismatches but good quality.

                Read quality is more important for SNP calling, since each nt counts there. Some people are interested in SNPs from RNA-seq reads, so read quality is still useful. We will add it in future releases; this is simply a matter of writing the code, which takes some time.
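
The preference described above can be sketched as a sort key: compare mismatch counts first, and look at quality only when the counts tie. This is a minimal illustration, not SpliceMap's code; the `Candidate` fields and the tie-break on mean quality are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate alignment (all fields are illustrative)."""
    position: int        # start coordinate on the reference
    mismatches: int      # stratum: number of mismatched bases
    mean_quality: float  # assumed summary of the read's base qualities


def pick_alignment(candidates):
    """Prefer the fewest mismatches; use quality only to break ties (assumption)."""
    return min(candidates, key=lambda c: (c.mismatches, -c.mean_quality))


candidates = [
    Candidate(position=1000, mismatches=2, mean_quality=38.0),  # good quality, 2 mismatches
    Candidate(position=5000, mismatches=1, mean_quality=12.0),  # low quality, 1 mismatch
]
best = pick_alignment(candidates)
print(best.position)  # the 1-mismatch candidate wins despite its lower quality
```
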

                Originally posted by thinkRNA:
                I am waiting to use your program when bowtie is supported. I would like to use the cufflinks/cuffcompare pipeline after I get the reads mapped to junctions using your program. I think all we need is SAM formatted alignment that is sorted for this purpose.

                Yes, the SAM output from SpliceMap is sorted as required by Cufflinks. The other problem is that Cufflinks does not support clipping, and there will be some clipped alignments from SpliceMap. I also had to reformat the SAM file for this purpose.
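
One way to find the records that would trouble a tool that does not accept clipping is to check the CIGAR field (column 6 of a SAM record) for soft (`S`) or hard (`H`) clip operations. The sketch below only separates clipped from unclipped records. It is not the reformatting step mentioned above, and the example records and function names are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_clipped(cigar):
    """Return True if the CIGAR string contains a soft (S) or hard (H) clip."""
    return any(op in "SH" for _, op in CIGAR_OP.findall(cigar))


def split_by_clipping(sam_lines):
    """Split SAM lines into (unclipped, clipped); header lines stay with unclipped."""
    kept, clipped = [], []
    for line in sam_lines:
        if line.startswith("@"):          # header line, pass through unchanged
            kept.append(line)
            continue
        cigar = line.rstrip("\n").split("\t")[5]
        (clipped if is_clipped(cigar) else kept).append(line)
    return kept, clipped


seq = "A" * 50
example = [
    "@SQ\tSN:chr1\tLN:100000",
    f"r1\t0\tchr1\t100\t255\t25M200N25M\t*\t0\t0\t{seq}\t*",  # spliced, no clipping
    f"r2\t0\tchr1\t300\t255\t5S45M\t*\t0\t0\t{seq}\t*",       # soft-clipped
]
kept, clipped = split_by_clipping(example)
print(len(kept), len(clipped))
```
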

                Originally posted by thinkRNA:
                Also, what percentage of your novel junctions included the canonical splice sites? I am assuming a big chunk? Finally what is your definition of specificity here?

                All of the junctions found by SpliceMap use the canonical splice sites. I'll add a paragraph about how SpliceMap works to the updated webpage later. This is all described in the paper.

                Specificity is found by comparing the discovered junctions to known ESTs in the existing databases. If a junction is validated by an EST, we call that junction reliable. So specificity = (junctions validated by EST)/(all junctions found).
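
Applying this definition to the counts given in the first post reproduces the reported percentages. The labels for the two SpliceMap columns are placeholders; the original table only says that they differ by level of filtering.

```python
def specificity(est_validated, junctions_found):
    """Fraction of discovered junctions that are validated by an EST."""
    return est_validated / junctions_found


# (EST validated, junctions found), copied from the comparison in the first post.
counts = {
    "SpliceMap, first filtering level": (143_840, 169_098),
    "SpliceMap, second filtering level": (114_661, 121_200),
    "Tophat": (117_113, 133_722),
}

percentages = {
    name: f"{100 * specificity(validated, found):.2f}%"
    for name, (validated, found) in counts.items()
}
for name, pct in percentages.items():
    print(f"{name}: {pct}")
```
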

                "Novel junctions" has a different definition: a novel junction is one that is not found in existing gene annotations.
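
Because the two definitions use different references (ESTs versus gene annotations), a junction can be novel and still be EST validated. A small sketch with sets makes the distinction concrete. The coordinates are toy values, and representing a junction as a (chromosome, donor, acceptor) tuple is an assumption for the example, not SpliceMap's internal format.

```python
# Junctions as (chromosome, donor position, acceptor position); toy values only.
found = {("chr1", 1200, 3400), ("chr1", 5000, 7300), ("chr2", 880, 1990)}
annotated = {("chr1", 1200, 3400)}                           # from gene annotations
est_supported = {("chr1", 1200, 3400), ("chr1", 5000, 7300)}  # seen in ESTs


def novel_junctions(found, annotated):
    """Junctions that do not appear in the existing gene annotations."""
    return found - annotated


def est_specificity(found, est_supported):
    """Fraction of found junctions that are validated by an EST."""
    return len(found & est_supported) / len(found)


novel = novel_junctions(found, annotated)
novel_and_validated = novel & est_supported

print(sorted(novel))                # two junctions are absent from the annotation
print(sorted(novel_and_validated))  # one of them is nevertheless EST validated
print(est_specificity(found, est_supported))
```
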

                Thanks for your interest!


                • #9
                  Thanks, john. Do you have any plan to support BWA in the near future?



                  • #10
                    Yes, we will consider that. It is not hard to add support for a new mapper.

                    However, what do you consider to be the advantage of BWA over Bowtie?

                    SpliceMap does not support indels in seeds since they are so short.