
  • Tuning TopHat parameters for SOLiD reads

    Hi folks

    The new TopHat 1.1.1 handles SOLiD reads nicely, and the results can easily be piped to Cufflinks to build an RNA-seq pipeline. However, the manual page says:

    In TopHat 1.1.0, we began supporting Applied Biosystems' Colorspace format. The software is optimized for reads 75bp or longer.
    and there are no further guidelines on how to tune the parameters if you have shorter reads, as in my case: 30bp single-end. Does anybody have any clue about this? Of course this is tunable by editing the TopHat scripts, but the question is how to set the parameters without screwing the whole thing up.

    I did a comparison: using TopHat with a GTF file and default parameters, I get about five times fewer aligned tags than I get from Bowtie with some sensible parameters.
    Last edited by Pejman; 10-15-2010, 08:59 AM.

  • #2
    Based on my limited understanding of how TopHat works, I would be very hesitant to use reads as short as 30bp. It seems to me you would have a pretty good chance of either the front or the back of the read (before or after the splice) mapping to random sections of the genome that way. For a small genome (like something microbial) that may not be too much of a problem. For anything comparable in size to human I would be extremely hesitant to trust any junction mapping without at least 40bp, and not very confident without 50+, as that would allow for a few mismatches before random matches became a serious problem.
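
The random-match concern above can be put into rough numbers. The sketch below is only a back-of-the-envelope estimate, not how TopHat scores anything: it assumes the genome is uniform random sequence (real genomes are repetitive, so true numbers are worse), counts both strands, and uses a hypothetical helper name, `expected_random_hits`. A 30bp read that spans a junction at its midpoint leaves 15bp on each side, which is the case worth comparing against longer pieces.

```python
from math import comb


def expected_random_hits(segment_len, genome_size, mismatches=0):
    """Expected number of chance matches of one read segment in a genome.

    Assumptions (illustrative only): bases are independent and uniform,
    both strands are searched, and up to `mismatches` substitutions
    are tolerated. Indels and repeats are ignored.
    """
    # Number of distinct sequences within `mismatches` substitutions
    # of the segment (each mismatched position has 3 alternatives).
    neighbourhood = sum(
        comb(segment_len, m) * 3 ** m for m in range(mismatches + 1)
    )
    positions = 2 * genome_size  # forward and reverse strand
    return positions * neighbourhood / 4 ** segment_len


if __name__ == "__main__":
    human = 3.1e9  # approximate human genome size in bp
    for seg in (15, 20, 25):
        for mm in (0, 1, 2):
            hits = expected_random_hits(seg, human, mm)
            print(f"{seg}bp segment, {mm} mismatches: {hits:.3g} expected chance hits")
```

Under these assumptions a 15bp piece is expected to match a human-sized genome several times by chance even with no mismatches, while a 25bp piece is not, which is consistent with the caution about 30bp reads.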



    • #3
      Well, I'm working with human data, so... But I'm not looking for new junctions, so I'm running it with predefined junctions from RefSeq and the --no-new-junctions option, both for TopHat and Cufflinks. So far I've run Bowtie -> Cufflinks, which is not recommended by the authors, and TopHat -> Cufflinks, which is recommended but fails heavily at the alignment step. I'm going to do some comparisons and will keep you updated if there are any conclusive observations. Suggestions are welcome!
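
For the Bowtie versus TopHat comparison mentioned above, one simple yardstick is the number of distinct reads with at least one reported alignment in each SAM output. The sketch below is an assumption about how such a count could be made, not the poster's method: it assumes single-end reads, plain-text SAM input, and a hypothetical function name `count_aligned_reads`. It relies only on the SAM convention that column 1 is the read name and bit 0x4 of the FLAG column marks an unmapped read. Counting names rather than lines avoids inflating the total when an aligner reports several hits for one read.

```python
def count_aligned_reads(sam_lines):
    """Count distinct read names with at least one mapped record.

    `sam_lines` is any iterable of SAM-formatted text lines.
    Header lines (starting with '@') and blank lines are skipped.
    Assumes single-end data, so one read name means one read.
    """
    aligned = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # malformed record, ignore
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:
            continue  # FLAG bit 0x4: read is unmapped
        aligned.add(qname)
    return len(aligned)


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders): python count_aligned.py a.sam b.sam
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, count_aligned_reads(handle))
```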



      • #4
        Hello Pejman,

        I am Shilp Purohit, and I am currently working with an ABI SOLiD 3 Plus sequencer. I am trying to install TopHat 1.1.2 (BETA) on my local machine. The "Getting started" manual says that I need to run the following commands:

        ./configure
        make
        make install

        However, the unpacked tar package doesn't contain the files needed to execute these commands, i.e. a configure script, a makefile and an install file.

        On the other hand, I unpacked the tar package of TopHat version 1.1.0, and it installed correctly and worked absolutely fine, because that package did contain these three files. But that version doesn't support the colorspace format of SOLiD.

        Can you please suggest how to install version 1.1.2? Any suggestions will be highly appreciated.

        I am looking forward to having a reply soon.

        Thanking you.



        • #5
          Hi
          The last version I used was 1.1.1, and that supports CS (colorspace) data. If you have problems compiling, I recommend just using the precompiled versions. I checked the source files for 1.1.2, and the configure file is there!



          • #6
            How risky do you consider 50 bp reads?
