
  • Tophat2/prep_reads error: unrecognized option '--max-seg-multihits'

    I know there are several posts about 'prep_reads', but they concern the Sanger/Phred quality issue. I think my problem is different.

    I am trying to map single-end Illumina reads. They were produced with the CASAVA 1.8 (or later) pipeline, so I have kept the default quality-score settings.

    I have mapped such reads without problems before, but now I always get an error at the prep_reads stage:

    Code:
    [2013-06-02 12:53:23] Beginning TopHat run (v2.0.7)
    -----------------------------------------------
    [2013-06-02 12:53:23] Checking for Bowtie
                      Bowtie version:        2.0.6.0
    [2013-06-02 12:53:23] Checking for Samtools
                    Samtools version:        0.1.18.0
    [2013-06-02 12:53:23] Checking for Bowtie index files
    [2013-06-02 12:53:23] Checking for reference FASTA file
    [2013-06-02 12:53:23] Generating SAM header for sycon-genome
            format:          fastq
            quality scale:   phred33 (default)
    [2013-06-02 12:53:24] Preparing reads
            [FAILED]
    Error running 'prep_reads'
    Usage:   prep_reads <reads1.fa/fq,...,readsN.fa/fq>

    And inside the prep_reads log:

    Code:
    /cluster/home/jonbra/bin/prep_reads: unrecognized option '--max-seg-multihits'
    Any tips on what is wrong?

  • #2
    Tophat2/prep_reads error: unrecognized option '--max-seg-multihits'

    What was the command you used to run TopHat?


    • #3
      Code:
      tophat2 -p 8 --library-type fr-firststrand -o /output genome-file /seq-data.fastq
      And here's a sample of the sequence data:
      Code:
      @HWI-ST486:386:D1UMHACXX:3:1101:1440:2126 1:N:0:ACAGTG
      NTAACATTGTTTAAATGGAGAAAATAACCGTATGAAGAAGTTAATGAAGTTAATGCTGCTGGCAAGTGCCAGTTTAACCGTGGGTTGTGCAACATCTGATA
      +
      #11AA1B?BDA<BDB9ACFCFFIC4FFCF>)8CC:?D9?D:9BDDDDEEDDIA>D9?DEIC=CACDCEEC7=ACEEDDDA??@<35?81>>>A99::>AAA
      @HWI-ST486:386:D1UMHACXX:3:1101:1389:2165 1:N:0:ACAGTG
      AACTTTTAACGGTGGATCTCTTGGCTCGTGGATCGATGAAGAAAGCAGCAAACTGCGATACGTAGTGTGAATTGCAGAATTCAGTGAATCATCGAATTTTT
      +
      @@?DDBBDF<FFFIIBGEHAEHIE3A<?DAF=@0??B<DFD<D/9.)88CCF>@CCE'=<B?73;;3;@(;B@:5((5>@BA:>5:(;3:A>@99?A####
      @HWI-ST486:386:D1UMHACXX:3:1101:1363:2167 1:N:0:ACAGTG
      TTTATTTGGTTAGGGCTGAGGTAGTGACAAGTTCACTACCTCTTTTAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAATGGAAGGACAAAACGCTTCACC
      +
      @@@DDDDDHHHDHIIIIIIGIB3AABA4?CC?C??BD<?;B<?FGI<DCGGEEHIIFHFHEBBBBBBBB@@BBBB##########################
      Last edited by JonB; 06-02-2013, 11:30 AM.


      • #4
        I think the problem may be that I sequenced small RNAs, and my reads are all 100 bp long. I forgot to trim the poly-A stretches and low-quality bases...

        Also, I think I only need to run Bowtie and not TopHat, as I do not expect spliced small RNAs.
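
For anyone wondering what trimming the low-quality bases could look like, here is a minimal sketch. It assumes Phred+33 quality strings (which is what the TopHat log above reports) and only clips the 3' end. The function name `trim_low_quality_tail` and the threshold of 20 are made up for illustration and are not taken from any particular tool.

```python
def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    seq and qual are the sequence and quality lines of one FASTQ record.
    offset=33 assumes Phred+33 encoding; min_q=20 is an arbitrary cutoff.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # '#' encodes Q2 and 'I' encodes Q40 under Phred+33.
    print(trim_low_quality_tail("ACGTAC", "IIII##"))
```

This would clip runs of '#' at the end of a read, like the ones in the sample records above. It does not look at low-quality bases inside the read.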


        • #5
          I'm mostly talking to myself here... but I solved the problem (or at least it works now).

          I had forgotten that I had sequenced small RNAs, so I had to remove the poly-A stretches originating from the oligo-dT probe, along with other sequence on the 5'-end (sometimes the read ran all the way through the poly-A stretch).

          I also mapped with bowtie2 instead of TopHat2. There is no need to look for splice junctions with small RNA data, I reckon.
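
The poly-A removal described above could be sketched roughly as follows. This is not the exact procedure used in this thread: it simply cuts each read at the first long run of A's and discards everything downstream, on the assumption that whatever follows the run is read-through rather than small RNA. The name `trim_poly_a` and the `min_run=10` threshold are hypothetical.

```python
import re


def trim_poly_a(seq, qual, min_run=10):
    """Cut a read at the first run of at least min_run consecutive A's.

    Returns the sequence and quality strings truncated just before the run.
    Reads without such a run are returned unchanged.
    """
    match = re.search("A{%d,}" % min_run, seq)
    if match is None:
        return seq, qual
    cut = match.start()
    return seq[:cut], qual[:cut]


if __name__ == "__main__":
    read = "ACGTTGCC" + "A" * 12 + "TGG"
    quals = "I" * len(read)
    print(trim_poly_a(read, quals))
```

An exact-match search like this misses poly-A stretches interrupted by a sequencing error (the third sample read above has a C inside its A run), so a dedicated adapter trimmer that tolerates mismatches is likely the more robust choice for real data.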
