Originally posted by dpryan
Now, with respect to other datasets, your advice to check a random gene from the dataset assumes that all entries of the same dataset behave the same way. That assumption rests on my example with the S. cerevisiae genome above. However, it could be (and I still suspect this is the case) that some entries in the mouse or human datasets give the start of the 5'-UTR as the "gene start", whereas other entries, for which 5'-UTR information is simply not available, give the ORF start as the "gene start". Checking a random gene would therefore not prove anything. I hope this is not the case, but I just want to make sure.
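To illustrate what I mean, here is a rough sketch of how one could check every entry instead of a single random gene. It assumes the gene and CDS coordinates have already been extracted from the annotation into plain tuples; the function name, the record layout, and the example values are all hypothetical and not taken from any real dataset.

```python
from collections import Counter

def classify_gene_starts(records):
    """Count how many genes start exactly at the ORF and how many start upstream of it.

    records: iterable of (gene_id, strand, gene_start, gene_end, cds_start, cds_end)
    with start <= end for both features (hypothetical layout, an assumption).
    """
    counts = Counter()
    for gene_id, strand, gene_start, gene_end, cds_start, cds_end in records:
        if cds_start is None or cds_end is None:
            counts["no_cds"] += 1
            continue
        # On the minus strand the 5' end of a feature is its higher coordinate.
        if strand == "+":
            offset = cds_start - gene_start
        else:
            offset = gene_end - cds_end
        if offset == 0:
            counts["starts_at_orf"] += 1
        else:
            counts["starts_upstream_of_orf"] += 1
    return counts

# Made-up records, for illustration only.
example = [
    ("geneA", "+", 100, 900, 150, 800),  # gene starts 50 bp upstream of the CDS
    ("geneB", "+", 200, 700, 200, 650),  # gene start coincides with the CDS start
    ("geneC", "-", 300, 950, 350, 900),  # minus strand, 50 bp upstream of the CDS
    ("geneD", "-", 400, 800, 450, 800),  # minus strand, coincides with the CDS start
]
counts = classify_gene_starts(example)
print(dict(counts))
```

If both categories turn up in substantial numbers for the same dataset, that would support my suspicion that the "gene start" is not defined consistently across entries. A gene that starts exactly at the ORF may of course also be one whose UTR is simply unannotated, which is exactly the ambiguity I am worried about.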