SEQanswers

Old 04-03-2013, 05:56 PM   #1
all_your_base
Member
 
Location: USA

Join Date: Mar 2012
Posts: 40
Six reading frame question... why all contain ORF??

Hi,

So I have a conceptual question I'm trying to get my head around. I have some RNA-seq data and was trying to determine the ORF of each read. Of course, a six-frame translation of a given nucleotide sequence would be expected to have a significant ORF in at least one frame, as long as the sequence comes from a genic region.

However, I find short sequences (my 150bp RNA-seq reads) that appear to have continuous ORFs in all 6 frames of translation, without any stop codons at all... how can this be? Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.

I realize this could be an anomaly, but this seems to be the case with about 10% of all my RNA-seq reads. I realize this could happen with repetitive sequence, but I don't think that is the case, since it is RNA-seq data.
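
For anyone wanting to reproduce this check, here is a minimal Python sketch of a six-frame stop-codon scan. It is only an illustration, not the script used in this thread: the function names (`reverse_complement`, `stop_free_frames`) are made up for the example, and it assumes the standard genetic code (stops TAA, TAG, TGA) and reads made of A/C/G/T only (U is treated as T).

```python
# Stop codons of the standard genetic code (assumption: no alternative code).
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def stop_free_frames(read):
    """Return labels ("+1".."+3", "-1".."-3") of frames with no in-frame stop."""
    read = read.upper().replace("U", "T")
    open_frames = []
    for sign, strand in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            # Walk complete codons only; a trailing partial codon is ignored.
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            if not any(codon in STOP_CODONS for codon in codons):
                open_frames.append(f"{sign}{offset + 1}")
    return open_frames
```

A read for which `stop_free_frames` returns all six labels is one of the "open in every frame" reads described above. A GC-only repeat such as `"GCC" * 50` is an easy example, since every stop codon needs at least one T and one A.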

Any thoughts or speculations are gladly welcomed!!
Old 04-03-2013, 07:53 PM   #2
kcchan
Senior Member
 
Location: USA

Join Date: Jul 2012
Posts: 153

RNA-seq libraries are almost never full length; the strands are fragmented into shorter fragments before sequencing. Therefore the reads you get are only a portion of the full mRNA. If you want to get the complete AA sequence of an RNA, you'll have to assemble your reads back together first.
Old 04-03-2013, 08:00 PM   #3
all_your_base
Member
 
Location: USA

Join Date: Mar 2012
Posts: 40

Thanks for the reply. I understand this is just a small fragment of a whole mRNA, but for a span of 150 bases, I can't understand why we would find no stop codons in any of the 6 reading frames.
Old 04-04-2013, 04:11 AM   #4
syfo
Just a member
 
Location: Southern EU

Join Date: Nov 2012
Posts: 86

Quote:
Originally Posted by all_your_base
Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.

...assuming the same frequency for each base, which is usually not the case. What is the GC% of this genome? Also, base distribution is not uniform and often differs between regions (gene/intergenic, exon/intron, etc.). You might find GC-rich repeats in 3' UTRs, for instance. Lastly, this 10% subset might all come from the same genomic locus.
Have you tried running FastQC on your reads first?
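
To put rough numbers on the GC% point, here is a back-of-the-envelope sketch in Python. It assumes a deliberately simple model that is not from this thread: bases are independent, with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2. The function names are made up for the example.

```python
def stop_codon_probability(gc):
    """Chance that a random codon is TAA, TAG or TGA for a GC fraction `gc`.

    Assumes independent bases with P(A) = P(T) and P(G) = P(C). Under that
    symmetry, reverse-strand frames have the same per-codon stop chance.
    """
    p_at = (1.0 - gc) / 2.0  # probability of A, and of T
    p_gc = gc / 2.0          # probability of G, and of C
    # TAA contributes p_at**3; TAG and TGA each contribute p_at**2 * p_gc.
    return p_at ** 3 + 2.0 * p_at ** 2 * p_gc


def prob_frame_open(gc, read_length=150, offset=0):
    """Chance that one reading frame of a read contains no stop codon."""
    n_codons = (read_length - offset) // 3  # complete codons in this frame
    return (1.0 - stop_codon_probability(gc)) ** n_codons


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        print(f"GC={gc:.0%}: P(stop codon)={stop_codon_probability(gc):.4f}, "
              f"P(frame 1 open over 150 bp)={prob_frame_open(gc):.3f}")
```

At 50% GC the per-codon stop chance is exactly 3/64, and a single frame of a 150 bp read comes out stop-free roughly 9% of the time; at 70% GC that rises to roughly 38%. The six frames overlap, so multiplying the per-frame values gives only a rough idea of the chance for all six at once, and real transcripts are not random sequence in any case.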
Old 04-04-2013, 05:03 AM   #5
dpryan
Devon Ryan
 
Location: Bonn, Germany

Join Date: Jul 2011
Posts: 2,029

Quote:
Originally Posted by all_your_base
Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.

Bases and codons aren't randomly distributed, nor should one expect them to be.

Tags
blastx, ngs, orf, rna-seq

All times are GMT -8.

