  • Obtaining ORFs from DNA sequence

    Hi Everyone,

    I have small question in regards to obtaining ORFs in DNA sequences. I know that EMBOSS tools have a collection of awesome tools to do this specifically.

    I have used sixpack from EMBOSS. Sixpack finds the ORFs and translates them in all six frames. Strangely, it gives me fewer ORFs than the other tools.

    There is also transeq, but it translates the whole sequence, so it should be better suited to transcript sequences, given that I am working with eukaryotic organisms.

    Then there is getorf, which finds the ORFs and outputs them.

    My question is: what is the difference between getorf and sixpack in the EMBOSS tool set?

    I also found this Perl script: http://navjeet-ahalawat.blogspot.com...n-reading.html

    getorf, sixpack and the Perl script give me three different results. Why is that? I was expecting them to be the same.
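One plausible reason for the disagreement is that ORF finders do not share a single definition of an ORF. Some report every stop-free stretch (stop to stop), others require an ATG start, and most apply a minimum length whose default differs between tools. In EMBOSS, getorf exposes these choices through `-find` and `-minsize`, and sixpack through `-orfminsize`; check the documentation for the defaults in your version. The sketch below is plain Python and is not a reimplementation of either tool. The function names and the demo sequence are made up for illustration; it only shows how the same sequence yields different ORF counts when the definition changes.

```python
STOPS = {"TAA", "TAG", "TGA"}


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orfs_in_frame(s, require_start, min_nt):
    """Return the ORFs of one reading frame as nucleotide strings.

    require_start=False: every stop-free run of codons is an ORF.
    require_start=True:  the run is trimmed to begin at its first ATG,
                         and dropped if it contains no ATG.
    min_nt: minimum ORF length in nucleotides (stop codon not counted).
    """
    codons = [s[i:i + 3] for i in range(0, len(s) - 2, 3)]
    found, run = [], []
    # A sentinel stop flushes a run that reaches the end of the sequence.
    for codon in codons + ["TAA"]:
        if codon in STOPS:
            if require_start:
                run = run[run.index("ATG"):] if "ATG" in run else []
            if run and len(run) * 3 >= min_nt:
                found.append("".join(run))
            run = []
        else:
            run.append(codon)
    return found


def count_orfs(seq, require_start, min_nt):
    """Count ORFs over all six frames under the chosen definition."""
    seq = seq.upper()
    total = 0
    for strand in (seq, revcomp(seq)):
        for offset in range(3):
            total += len(orfs_in_frame(strand[offset:], require_start, min_nt))
    return total


if __name__ == "__main__":
    demo = "CCCATGAAATAGGGGTTTTAA"  # made-up sequence, illustration only
    for require_start in (False, True):
        for min_nt in (0, 9):
            print(require_start, min_nt, count_orfs(demo, require_start, min_nt))
```

Before comparing outputs, it is worth running each tool with the same start requirement and the same minimum size; differences that remain are more likely to come from genetic code tables or how sequence ends are handled.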

    I was also wondering whether there is a way to extract the nucleotide sequence of every ORF in FASTA format, with headers that match between the protein and nucleotide records.
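For the paired FASTA output, a minimal sketch in plain Python follows. It assumes ATG-to-stop ORFs, the standard genetic code, and a made-up header layout (`name_orfN start-end(strand)`); none of these come from EMBOSS, and the function names are hypothetical. The point is only that building the header once and writing it to both records keeps the nucleotide and protein files in sync.

```python
from itertools import product

STOPS = {"TAA", "TAG", "TGA"}
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(nt):
    """Translate a nucleotide string; unknown codons become X."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X")
                   for i in range(0, len(nt) - 2, 3))


def find_orfs(seq, min_nt=30):
    """Yield (strand, start, end, nt) for complete ATG..stop ORFs.

    start and end are 1-based positions on the forward strand;
    nt includes the stop codon and is given in coding orientation.
    """
    seq = seq.upper()
    n = len(seq)
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    end = i + 3
                    if end - start >= min_nt:
                        if strand == "+":
                            yield strand, start + 1, end, s[start:end]
                        else:
                            yield strand, n - end + 1, n - start, s[start:end]
                    start = None


def orf_fasta(name, seq, min_nt=30):
    """Return (nucleotide_fasta, protein_fasta) with identical headers."""
    nt_lines, aa_lines = [], []
    for k, (strand, start, end, nt) in enumerate(find_orfs(seq, min_nt), 1):
        header = f">{name}_orf{k} {start}-{end}({strand})"
        nt_lines += [header, nt]
        aa_lines += [header, translate(nt).rstrip("*")]
    return ("".join(line + "\n" for line in nt_lines),
            "".join(line + "\n" for line in aa_lines))


if __name__ == "__main__":
    nt_fa, aa_fa = orf_fasta("demo", "CCATGAAATTTTAGCC", min_nt=9)
    print(nt_fa, aa_fa, sep="")
```

The two returned strings can be written to separate files. Because both records share one header, a protein BLAST hit can be traced back to its nucleotide ORF by ID.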

    Many thanks,

    Zapages

  • #2
    Rather than a plain ORF finder you should be looking at a gene prediction program if you are working with a eukaryotic genome (https://en.wikipedia.org/wiki/List_o...ction_software).


    • #3
      Hi Genomax,

      Thank you, Genomax. I'm definitely using gene prediction programs as well. I'm trying to build confidence in newly de novo/NCBI assembled sequences for a specific group of genes of interest.

      I'm really interested in the ORFs and the protein sequences because of the variation I see across protein, PSI-, and nucleotide BLAST results.


      • #4
        Hi Everyone,

        Sorry for bumping the thread. I am just wondering what the difference is between getorf and sixpack from the EMBOSS tool set.

        Why am I getting different results between the two?

        Also, I could use EMBOSS backtranseq to get the transcript sequence from the translated protein sequence.

        Many thanks in advance.

        -Zapages


        • #5
          Originally posted by GenoMax:
          Rather than a plain ORF finder you should be looking at a gene prediction program if you are working with a eukaryotic genome (https://en.wikipedia.org/wiki/List_o...ction_software).
          Hi GenoMax,

          I am new to eukaryotic genomes and trying to get an idea of the "best" workflow for translating DNA sequences into predicted proteins. I came across this post and your recommendation to use a gene prediction program rather than a plain ORF finder. Is this because of the introns/exons and the large number of repeats?

          Thanks


          • #6
            More or less. Since you need to account for the possibility of splicing, you can't just use start/stop codons. That said, what are you trying to get the proteins from? Are you working with de novo DNA sequence or transcriptome sequence?
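The splicing point can be illustrated with a toy example. In the sketch below, the exon and intron sequences are invented, the exon coordinates are assumed to be known, and the helper names are hypothetical. Real gene predictors have to infer exon boundaries, which is the hard part this sketch skips. It only shows that reading genomic DNA from start to first stop can give a truncated, partly wrong protein, because the reading frame runs into the intron.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_to_stop(nt):
    """Translate from position 0 until the first stop codon (not included)."""
    protein = []
    for i in range(0, len(nt) - 2, 3):
        aa = CODON_TABLE.get(nt[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def splice(genomic, exons):
    """Join exons given as 0-based, half-open (start, end) intervals."""
    return "".join(genomic[start:end] for start, end in exons)


# Invented toy gene: two exons separated by a GT...AG intron.
EXON1 = "ATGGCCAAGGA"
INTRON = "GTAAGTCTTTTTTTCAG"
EXON2 = "TTTCTGGTGA"
GENOMIC = EXON1 + INTRON + EXON2
EXONS = [
    (0, len(EXON1)),
    (len(EXON1) + len(INTRON), len(GENOMIC)),
]

if __name__ == "__main__":
    print("plain ORF on genomic DNA:", translate_to_stop(GENOMIC))
    print("after splicing:          ", translate_to_stop(splice(GENOMIC, EXONS)))
```

With transcriptome (already spliced) sequence, a plain ORF finder is a more reasonable starting point, which is why the distinction between de novo DNA and transcript input matters here.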


            • #7
              Hi Genomax,

              Thanks. It is de novo DNA sequence: 2 x 150 bp reads, assembled using MEGAHIT.

              Arvind
