  • Pejman
    Member
    • Jul 2010
    • 23

    Tuning TopHat parameters for SOLiD reads

    Hi folks

    The new TopHat 1.1.1 handles SOLiD reads nicely, and the results can easily be piped to Cufflinks to make up an RNA-seq pipeline. However, the manual page says:

    "In TopHat 1.1.0, we began supporting Applied Biosystems' Colorspace format. The software is optimized for reads 75bp or longer."
    and there are no further guidelines on how to tune the parameters if you have shorter reads, as in my case: 30bp single-end. Does anybody have any clue about this? Of course this is tunable by editing the TopHat scripts, but the question is how to set the parameters without screwing the whole thing up.

    I did a comparison: using TopHat with a GTF file and default parameters, I get about five times fewer aligned tags than I get from Bowtie with some sensible parameters.
    Last edited by Pejman; 10-15-2010, 08:59 AM.
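A comparison like the one described above is easier to interpret when both aligners are counted the same way. The following is a minimal sketch, assuming both outputs are available as SAM text (a BAM file would first need converting, for example with samtools view). It counts distinct read names that have at least one mapped alignment, so a read reported at several locations is counted once. The function name and the file paths passed on the command line are hypothetical and do not come from either tool.

```python
import sys


def count_aligned_reads(sam_lines):
    """Count distinct read names with at least one mapped alignment.

    sam_lines: any iterable of SAM text lines (an open file works).
    """
    aligned = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        read_name, flag = fields[0], int(fields[1])
        # Bit 0x4 of the SAM FLAG field marks an unmapped read.
        if flag & 0x4:
            continue
        aligned.add(read_name)
    return len(aligned)


if __name__ == "__main__":
    # Usage (hypothetical paths): python count_aligned.py tophat.sam bowtie.sam
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, count_aligned_reads(handle))
```

Counting distinct read names rather than alignment lines avoids inflating the total for whichever aligner reports more multi-mapping hits per read.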
  • mrawlins
    Member
    • Apr 2010
    • 63

    #2
    Based on my limited understanding of how TopHat works, I would be very hesitant to use reads as short as 30bp. It seems to me you would have a pretty good chance of either the front or the back of the read (before or after the splice) mapping to random sections of the genome that way. For a small genome (like something microbial) that may not be too much of a problem. For anything comparable in size to human I would be extremely hesitant to trust any junction mapping without at least 40bp, and not very confident without 50+, as that would allow for a few mismatches before random matches became a serious problem.
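The argument above can be put into rough numbers. This is a back-of-envelope sketch, not a description of how TopHat scores alignments. It assumes a genome of about 3.1 billion positions, uniformly random sequence over a four-letter alphabet (four bases, or four colors for SOLiD data), exact matches only, and a read that splits evenly across the junction. Real genomes are repetitive, mismatches are normally allowed, and junctions often fall nearer one end of the read, so the true chance-hit rate for the shorter side will be higher than these figures.

```python
def expected_random_hits(segment_length, genome_size=3.1e9, alphabet=4):
    """Expected number of exact chance matches for one read segment.

    Both strands are searched, so there are about 2 * genome_size
    candidate positions, each matching a random segment with
    probability alphabet ** -segment_length.
    """
    positions = 2 * genome_size
    return positions / alphabet ** segment_length


if __name__ == "__main__":
    for read_length in (30, 40, 50, 75):
        half = read_length // 2  # assume the junction sits mid-read
        hits = expected_random_hits(half)
        print(f"{read_length}bp read, {half}bp per side: "
              f"{hits:.3g} expected chance hits")
```

Under these assumptions, each 15bp half of a 30bp read is expected to match around six random places in a human-sized genome, while each 25bp half of a 50bp read is expected to match by chance only a few times per million segments. That gap is the headroom that lets longer reads tolerate a few mismatches.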


    • Pejman
      Member
      • Jul 2010
      • 23

      #3
      Well, I'm working with human data, so... But I'm not looking for new junctions, so I'm running it with predefined junctions from RefSeq and the --no-new-junctions option, both for TopHat and Cufflinks. So far I've run Bowtie -> Cufflinks, which is not recommended by the authors, and TopHat -> Cufflinks, which is recommended but fails heavily at the alignment step. I'm going to do some comparisons, and I'll keep you updated if there are any conclusive observations. Suggestions are welcome!


      • waterboy
        Member
        • Oct 2010
        • 14

        #4
        Hello Pejman,

        I am Shilp Purohit, and I am currently working with an ABI SOLiD 3 Plus sequencer. I am trying to install TopHat 1.1.2 (BETA) on my local machine. The "Getting started" manual says that I need to run the following commands:

        ./configure
        make
        make install

        However, the unpacked tar package doesn't contain the files needed to run these, i.e. configure, a makefile and an install file.

        On the other hand, I unpacked the tar package of TopHat version 1.1.0 and it worked absolutely fine and installed correctly, because that package contained these three files. But that version doesn't support the colorspace format of SOLiD.

        Can you please suggest how to install version 1.1.2? Any suggestions will be highly appreciated.

        I look forward to your reply.

        Thank you.


        • Pejman
          Member
          • Jul 2010
          • 23

          #5
          Hi
          The last version I used was 1.1.1, and that supports CS data. If you have problems compiling, I recommend just using the precompiled versions. I checked the source files for 1.1.2, and the configure file is there!


          • JohnK
            Senior Member
            • Feb 2010
            • 106

            #6
            How risky do you consider 50 bp reads?
