  • Six reading frame question... why all contain ORF??

    Hi,

    So I have a conceptual question I'm trying to get my head around. I have some RNA-seq data and was trying to determine the ORF of each read. Of course a six reading frame translation of a given nucleotide sequence would be expected to have a significant ORF in at least one frame, as long as the sequence comes from a gene region.

    However, I find short sequences (my 150 bp RNA-seq reads) that appear to have continuous ORFs in all 6 frames of translation, without any stop codons at all... how can this be? Since 3 of the 64 possible codons are stops, we should statistically see a stop every 21 AA (63 bases) or so.
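The expectation in the paragraph above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes all four bases are equally likely and independent, and it treats the six frames as independent of one another, which is not strictly true because they overlap on the same read. The function names are made up for illustration.

```python
# Assumption: uniform, independent bases, so 3 of 64 codons are stops.
STOP_FRACTION = 3 / 64


def expected_codons_between_stops(p_stop=STOP_FRACTION):
    """Mean spacing between stop codons, in codons (geometric distribution)."""
    return 1 / p_stop


def prob_frame_open(n_codons, p_stop=STOP_FRACTION):
    """Probability that a frame of n_codons contains no stop codon."""
    return (1 - p_stop) ** n_codons


def prob_all_six_open(read_len=150, p_stop=STOP_FRACTION):
    """Approximate probability that all six frames of a read are stop-free.

    Assumption: frames are treated as independent, which is a simplification.
    """
    prob = 1.0
    for offset in range(3):
        n_codons = (read_len - offset) // 3
        # Same number of complete codons on the forward and reverse strand.
        prob *= prob_frame_open(n_codons, p_stop) ** 2
    return prob


if __name__ == "__main__":
    print(f"codons between stops: {expected_codons_between_stops():.1f}")
    print(f"P(one 50-codon frame open): {prob_frame_open(50):.3f}")
    print(f"P(all six frames open, 150 bp): {prob_all_six_open(150):.2e}")
```

Under these assumptions a single 50-codon frame is stop-free a bit under 10% of the time, but all six frames being open at once should be very rare, so seeing it in roughly 10% of reads is not what a uniform random model predicts.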

    I realize this could be an anomaly, but this seems to be the case with about 10% of all my RNA-seq reads. I realize this could happen with repetitive sequence, but I don't think that is the case, since it is RNA-seq data.

    Any thoughts or speculations are gladly welcomed!!

  • #2
    RNA-seq libraries are almost never full length; the strands are fragmented into shorter pieces before sequencing. Therefore the reads you get cover only a portion of the full mRNA. If you want the complete AA sequence of an RNA, you'll have to assemble your reads back together first.


    • #3
      Thanks for the reply. I understand this is just a small fragment of a whole mRNA, but for a span of 150 bases, I can't understand why we should find no stop codons on all 6 reading frames.
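One way to confirm which reads really are open in all six frames is to count stop codons per frame directly. The following is a minimal sketch in plain Python, not the tool used in this thread; the function names are hypothetical, and it assumes reads contain only A, C, G, T, or N.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (assumes only A, C, G, T, N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def stop_counts_six_frames(seq):
    """Count stop codons in each of the six frames.

    Keys are '+1', '+2', '+3' for the forward strand and '-1', '-2', '-3'
    for the reverse complement. Only complete codons are counted.
    """
    seq = seq.upper()
    counts = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = (s[i:i + 3] for i in range(offset, len(s) - 2, 3))
            counts[f"{strand}{offset + 1}"] = sum(c in STOPS for c in codons)
    return counts


def open_in_all_frames(seq):
    """True if no frame on either strand contains a stop codon."""
    return all(n == 0 for n in stop_counts_six_frames(seq).values())


if __name__ == "__main__":
    print(stop_counts_six_frames("ATGTAAGGG"))
    print(open_in_all_frames("GC" * 30))
```

Pulling out the reads for which `open_in_all_frames` returns true and looking at them by eye is a quick way to see whether they share an obvious pattern, such as a simple repeat.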


      • #4
        Originally posted by all_your_base
        Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.
        ...assuming the same frequency for each base, which is usually not the case. What is the GC% of this genome? Also, base distribution is not uniform and often differs between regions (gene/intergenic, exon/intron, etc.). You might find GC-rich repeats in 3' UTRs, for instance. Lastly, this 10% subset might come from the same genomic locus.
        Have you first tried fastqc on your reads?
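To illustrate the GC% point: all three stop codons (TAA, TAG, TGA) are AT-rich, so the per-codon stop probability drops as GC content rises. The sketch below assumes independent positions with A = T and G = C frequencies, which is a simplification of real sequence; the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def stop_codon_probability(gc):
    """Per-codon probability of TAA, TAG, or TGA at a given GC fraction.

    Assumption: bases are independent, with P(A) = P(T) = (1 - gc) / 2
    and P(G) = P(C) = gc / 2.
    """
    at = (1 - gc) / 2
    g = gc / 2
    # TAA contributes at**3; TAG and TGA each contribute at * at * g.
    return at ** 3 + 2 * at * at * g


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        p = stop_codon_probability(gc)
        print(f"GC={gc:.0%}: P(stop)={p:.4f}, ~{1 / p:.0f} codons between stops")
```

At 50% GC this reduces to the 3/64 figure from the original post; at higher GC the expected spacing between stops grows well beyond 21 codons, so stop-free 150 bp frames become much more common.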


        • #5
          Originally posted by all_your_base
          Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.
          Bases and codons aren't randomly distributed, nor should one expect them to be.
