
  • Help with glimmer multi-extract

    I am having difficulty with the multi-extract program that comes with glimmer3. I used glimmer to predict CDSs in a couple of thousand contigs from a de novo assembly of Illumina reads from a bacterial genome. I now have coordinates for all of the predicted CDSs, but when I run multi-extract to pull the predicted CDS sequences out of the FASTA file of contigs, some of the output is wrong: translating the extracted nucleotide sequences to amino acids makes it clear that not all of them are open reading frames. I have checked the CDS coordinates and they are correct, so it is the extraction step, not the prediction, that is failing. Some of the extracted regions are what they should be and some are not. The bad extractions appear to happen because some contigs are being treated as circular DNA, even though I specified the -l (linear sequence) parameter. Does anybody have insight into the problem, a fix, or a suggestion for an alternative extraction tool?
    SBB

  • #2
    Got a solution: I needed to add the -w parameter to tell multi-extract not to WRAP around the ends of contigs when extracting CDS sequences.
    SBB
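
    For anyone looking for the "alternative extraction tool" asked about above, here is a minimal Python sketch of wrap-free extraction from linear contigs. It is not multi-extract's implementation. The function names (read_fasta, extract_linear) are hypothetical, and the coordinate convention is an assumption: 1-based inclusive positions, with start > stop meaning the reverse strand. Check both against your own coordinate file before relying on it.

```python
def read_fasta(lines):
    """Parse FASTA text lines into {contig_id: sequence}.

    The contig id is taken as the first whitespace-delimited word
    of each header line (an assumption about how ids are written).
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def extract_linear(seq, start, stop):
    """Extract a region from a LINEAR sequence, never wrapping around the ends.

    Assumed convention: 1-based inclusive coordinates; start > stop
    means the feature lies on the reverse strand.
    """
    lo, hi = min(start, stop), max(start, stop)
    if lo < 1 or hi > len(seq):
        raise ValueError(
            f"coordinates {start}..{stop} fall outside sequence of length {len(seq)}"
        )
    region = seq[lo - 1:hi]
    return region if start <= stop else reverse_complement(region)


if __name__ == "__main__":
    contigs = read_fasta([">contig1 example", "ATGAAATAGCC"])
    print(extract_linear(contigs["contig1"], 1, 9))  # ATGAAATAG
    print(extract_linear(contigs["contig1"], 9, 1))  # CTATTTCAT
```

    Raising an error on out-of-range coordinates, rather than wrapping, makes a mismatch between the coordinate file and the contig file visible immediately instead of producing sequences that silently fail to translate.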



    • #3
      glimmer3 with multi-fasta files

      I am wondering how to run glimmer3 to get coordinates for all ORFs in a multi-FASTA file of contigs.
      Should I edit g3-iterated.csh for such multiple-sequence input files?

      I got errors when running 'g3-iterated.csh genom.seq run3' as shown in the documentation (http://www.cbcb.umd.edu/software/gli...im302notes.pdf).
      Running 'g3-iterated.csh 454AllContigs.fna run3' printed 'Error allocating memory' to standard error (STDERR).
      Running 'g3-iterated.csh 454Scaffolds.fna run3' printed 'Motif length is greater then input sequence orf00685' to standard error (STDERR).
      Both runs also printed 'Segmentation fault' and 'Failed to create PWM' to standard output (STDOUT).
      Each file is about 5 MB.
      454AllContigs.fna is a FASTA file of all the consensus basecalled contigs longer than 100 bases.
      454Scaffolds.fna is a FASTA file of the concatenated contig sequences that were scaffolded as a result of Paired End analysis. The contigs are separated by a number of ‘N’ corresponding to the estimated size of the gap between them (but with a minimum of 20 N’s to ensure the separation of the contigs)
      (http://xyala.cap.ed.ac.uk/Gene_Pool/...ls_Oct2009.pdf).
