  • Obtaining ORFs from DNA sequence

    Hi Everyone,

    I have a small question regarding obtaining ORFs from DNA sequences. I know that EMBOSS has a collection of awesome tools for exactly this.

    I have used sixpack from EMBOSS. Sixpack finds the ORFs and translates them in all six frames. Strangely, it gives me fewer ORFs than the other tools.

    There is also transeq, but it translates the whole sequence, so it should be better suited to transcript sequences; I am working with eukaryotic organisms.

    Then there is getorf, which finds the ORFs and outputs them.

    My question is: what is the difference between getorf and sixpack in the EMBOSS tool set?

    Also I found this awesome Perl script: http://navjeet-ahalawat.blogspot.com...n-reading.html

    getorf, sixpack and the Perl script give me three different results. Why is that? I was expecting them to be the same.

    Also, is there a method to extract the nucleotide sequence of every ORF in FASTA format, with FASTA headers that match between the protein and nucleotide sequences?

    Many thanks,

    Zapages
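
One common reason ORF finders disagree is that they define an ORF differently (stop-to-stop versus start-to-stop, different minimum lengths, whether an ORF running off the end of the sequence counts). The sketch below is not how getorf, sixpack or the linked Perl script work internally; it is a minimal standard-library Python illustration, assuming the standard genetic code, in which the hypothetical `require_start` and `min_nt` parameters stand in for those choices. It also writes nucleotide and protein FASTA records that share the same header, which is what the last question above asks for. All function names here are made up for the example.

```python
from itertools import product

# Standard genetic code (NCBI table 1), built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of an upper-case DNA string (N and others pass through)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(nt):
    """Translate complete codons only; unknown codons become X."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def find_orfs(seq, min_nt=30, require_start=True):
    """Yield (strand, frame, start, end, nt, protein) for all six frames.

    start/end are 0-based, half-open, on the strand that was scanned.
    require_start=False gives stop-to-stop regions; True trims each region
    to its first M. ORFs that run off the sequence end are kept.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            pos = 0  # amino-acid offset within this frame
            for segment in translate(s[frame:]).split("*"):
                seg_start = pos
                pos += len(segment) + 1  # +1 skips the stop codon
                if require_start:
                    m = segment.find("M")
                    if m == -1:
                        continue
                    seg_start += m
                    segment = segment[m:]
                nt_start = frame + 3 * seg_start
                nt_end = nt_start + 3 * len(segment)
                if nt_end - nt_start >= min_nt:
                    yield strand, frame + 1, nt_start, nt_end, s[nt_start:nt_end], segment


def paired_fasta(name, seq, **kwargs):
    """Return (nucleotide FASTA, protein FASTA) with identical headers."""
    nt_lines, aa_lines = [], []
    for n, (strand, frame, start, end, nt, aa) in enumerate(find_orfs(seq, **kwargs), 1):
        header = f">{name}_orf{n} strand={strand} frame={frame} start={start + 1} end={end}"
        nt_lines += [header, nt]
        aa_lines += [header, aa]
    return "\n".join(nt_lines), "\n".join(aa_lines)


if __name__ == "__main__":
    toy = "CCATGAAATTTGGGTAACC"  # made-up sequence, for illustration only
    nt_fasta, aa_fasta = paired_fasta("toy", toy, min_nt=9)
    print(nt_fasta)
    print(aa_fasta)
```

Running the same sequence with `require_start=False`, or with a different `min_nt`, changes how many ORFs come back, which is the kind of setting worth comparing between tools before expecting identical output.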

  • #2
    Rather than a plain ORF finder, you should be looking at a gene prediction program if you are working with a eukaryotic genome (https://en.wikipedia.org/wiki/List_o...ction_software).


    • #3
      Hi Genomax,

      Thank you Genomax. I'm definitely using gene prediction programs as well. I'm trying to build confidence in newly de novo/NCBI assembled sequences for a specific group of genes of interest.

      I'm really interested in the ORFs and the protein sequences because of the variance in the protein, PSI, and nucleotide BLAST results.


      • #4
        Hi Everyone,

        Sorry for bumping the thread. I am just wondering what the difference is between getorf and sixpack from the EMBOSS tool set.

        Why am I getting different results between the two?

        Also, I could use EMBOSS backtranseq to get the transcript sequence from the translated protein sequence.

        Many thanks in advance.

        -Zapages


        • #5
          Originally posted by GenoMax
          Rather than a plain ORF finder, you should be looking at a gene prediction program if you are working with a eukaryotic genome (https://en.wikipedia.org/wiki/List_o...ction_software).
          Hi GenoMax,

          I am new to eukaryotic genomes and trying to get an idea of the "best" workflow for translating DNA sequences into predicted proteins. I came across this post and your recommendation to use a gene prediction program rather than a plain ORF finder. Is this because of the introns/exons and the large number of repeats?

          Thanks


          • #6
            More or less. Since you need to account for the possibility of splicing, you can't just use start/stops. That said, what are you trying to get the proteins from? Are you working with de novo DNA sequence or transcriptome sequence?
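
The splicing point can be shown with a toy example. The sequences below are invented for illustration (a two-exon "gene" with a GT...AG intron that happens to carry an in-frame stop), and the code is only a sketch assuming the standard genetic code: scanning the genomic sequence from start to first stop gives a truncated, partly wrong protein, while the spliced transcript gives the intended one.

```python
from itertools import product

# Standard genetic code (NCBI table 1), built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate complete codons only; unknown codons become X."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def first_orf_protein(nt):
    """Translate frame 1 up to the first stop codon, as a plain start/stop scan would."""
    return translate(nt).split("*")[0]


# Hypothetical toy gene: exon 1, an intron with an in-frame TAA, exon 2.
EXON1 = "ATGGCTGAA"
INTRON = "GTAAGTTTTTAACAG"
EXON2 = "CTGAAATGGTAA"

genomic = EXON1 + INTRON + EXON2   # what a genome assembly contains
spliced = EXON1 + EXON2            # what the mature transcript contains

print(first_orf_protein(genomic))  # MAEVSF  (reads into the intron, then hits its stop)
print(first_orf_protein(spliced))  # MAELKW  (the protein the exons actually encode)
```

A gene prediction program tries to model the exon/intron boundaries so that the second result is recovered from genomic DNA; a plain ORF finder only ever sees the first.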


            • #7
              Hi Genomax,

              Thanks. It is de novo DNA sequence: 2 x 150 reads, assembled using MEGAHIT.

              Arvind
