  • Spliced aligner for 454 reads?

    Hi all,
    I am trying to assemble 454 reads by aligning them to a reference genome.

    I have tried the official gsMapper before, which gave me only exon sequences rather than transcript sequences. Now I am using BWA-SW to align the reads and Cufflinks/Scripture to reconstruct the transcripts. Still, I get only individual exon sequences instead of the expected transcript sequences.

    It seems the problem is that BWA-SW is not a spliced aligner, so reads spanning splice junctions are lost during alignment. A spliced aligner, TopHat, is used in the tutorials for both Cufflinks and Scripture. However, TopHat is based on Bowtie, an aligner designed for short reads only.

    Could anyone suggest a spliced aligner suitable for 454 reads? I think BLAT would be an option, but I would still like to test some more recently developed methods.

    Thanks,
    Shuli

  • #2
    For a read bridging a splice junction, bwasw should give two or more hits unless one of them is too short. Perhaps Cufflinks is expecting some tophat/bowtie specific information to group local hits to a transcript. I do not know.

    Nonetheless, I agree bwa-sw would not work well because it is a local aligner. For RNA-seq/ESTs, a dedicated splicing-aware glocal aligner is more appropriate. In addition to blat, you may also try gmap.
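
    As a rough illustration of what "grouping local hits" could look like, the sketch below takes the hits of one read that are adjacent on the read but separated on the reference and reports the reference gap as a candidate intron. The `Hit` record, the `implied_introns` function, and the `max_intron` and `max_query_slop` thresholds are hypothetical names and values chosen for this example; they are not part of bwa-sw or Cufflinks. It also assumes read coordinates are already expressed in reference orientation.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One local alignment of a read (hypothetical record for this sketch)."""
    read: str
    chrom: str
    strand: str
    q_start: int  # 0-based start on the read, in reference orientation
    q_end: int    # end on the read (exclusive)
    r_start: int  # 0-based start on the reference
    r_end: int    # end on the reference (exclusive)


def implied_introns(hits, max_intron=500_000, max_query_slop=10):
    """Return {read: [(chrom, intron_start, intron_end), ...]}.

    Two hits of the same read on the same chromosome and strand are treated
    as neighbouring exons when they nearly abut on the read (within
    max_query_slop bases) and are separated by a positive gap of at most
    max_intron bases on the reference. Both thresholds are arbitrary.
    """
    by_read = defaultdict(list)
    for h in hits:
        by_read[(h.read, h.chrom, h.strand)].append(h)

    introns = defaultdict(list)
    for (read, chrom, _strand), group in by_read.items():
        group.sort(key=lambda h: h.r_start)
        for left, right in zip(group, group[1:]):
            query_gap = abs(right.q_start - left.q_end)
            ref_gap = right.r_start - left.r_end
            if query_gap <= max_query_slop and 0 < ref_gap <= max_intron:
                introns[read].append((chrom, left.r_end, right.r_start))
    return dict(introns)


example_hits = [
    Hit("r1", "chr1", "+", 0, 100, 1000, 1100),    # first exon piece
    Hit("r1", "chr1", "+", 100, 250, 5000, 5150),  # second exon piece
    Hit("r2", "chr1", "+", 0, 200, 300, 500),      # unspliced read
]
example_introns = implied_introns(example_hits)
```

    A dedicated spliced aligner does this chaining internally and also checks splice-site signals, which this sketch ignores.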

    • #3
      gmap can align 454 ESTs against a genome taking into account the introns.

      • #4
        Hi,

        our mapper (Genomatix) has a local spliced alignment mode that can align complete transcripts to the genome. Attached is a screenshot of assembled (Velvet) RNA-Seq reads mapped to the reference genome.

        Depending on the organism and your objective, you could also consider mapping your reads against a transcriptome library (with no need to worry about spliced reads). Then, however, you wouldn't be able to discover novel transcript variants.

        • #5
          BLAT can also handle spliced 454 reads, but be aware of the long runtime.

          • #6
            Thanks Li Heng and Jose. I have tried GMAP and checked the output. It looks good. According to a document I found, GMAP outperforms BLAT for gene structure identification in both speed and accuracy. The latest version of GMAP can also generate output in SAM format directly, which makes the subsequent analysis easier.

            Finally, I got some transcripts constructed by running Cufflinks on GMAP output. However, it seems that Cufflinks has modified the original alignment and generated some artificial exons/splicing junctions I've never seen in the GMAP output...
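
            One way to check this is to compare the splice junctions in the two outputs. Below is a minimal sketch, assuming the GMAP output is in SAM format (where an `N` operation in the CIGAR string marks a skipped reference region, i.e. an intron) and that exon coordinates from the assembled transcripts have already been converted to 0-based, half-open intervals. All function names are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def junctions_from_sam_line(line):
    """Return [(chrom, intron_start, intron_end)] for one SAM alignment line.

    Coordinates are 0-based, half-open. Unmapped records yield no junctions.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
    if chrom == "*" or cigar == "*":
        return []
    ref_pos = pos - 1  # SAM POS is 1-based
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            found.append((chrom, ref_pos, ref_pos + length))
        if op in REF_CONSUMING:
            ref_pos += length
    return found


def junctions_from_exons(chrom, exons):
    """Return the introns implied by one transcript's exons.

    exons: list of (start, end) pairs, 0-based half-open. GTF exon records
    are 1-based inclusive, so convert them as (start - 1, end) first.
    """
    exons = sorted(exons)
    return [(chrom, a_end, b_start)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def unsupported_junctions(assembled, aligned):
    """Junctions present in the assembly but absent from the alignments."""
    return sorted(set(assembled) - set(aligned))


sam_line = "r1\t0\tchr1\t1001\t255\t100M3900N150M\t*\t0\t0\t*\t*"
aligned = junctions_from_sam_line(sam_line)
assembled = junctions_from_exons("chr1", [(1000, 1100), (5000, 5150), (8000, 8100)])
suspicious = unsupported_junctions(assembled, aligned)
```

            Any junction left in `suspicious` would be one that appears in the assembled transcript without a matching spliced alignment in the input.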

            • #7
              Originally posted by sulicon
              However, it seems that Cufflinks has modified the original alignment and generated some artificial exons/splicing junctions I've never seen in the GMAP output...
              Maybe this is why (from http://cufflinks.cbcb.umd.edu/manual.html):

              -g/--GTF-guide <reference_annotation.(gtf/gff)> : Tells Cufflinks to use the supplied reference annotation (GFF) to guide RABT assembly. Reference transcripts will be tiled with faux-reads to provide additional information in assembly. Output will include all reference transcripts as well as any novel genes and isoforms that are assembled.

              I'll try to use gmap+cufflinks for 454 data! Thanks.
