UCSC refSeq Gene and hg19 coordinate
Hello,
I have a list of mRNA NM_ numbers. In the UCSC hg19 refGene table, I can get exon and CDS coordinates for every NM_ accession. However, when I pull a subsequence out of hg19 based on the refGene coordinates, the result seems to be incorrect for the reverse strand. Taking the reverse complement of the extracted exons doesn't work either.
-------
Example: I have NM_012345.3. From UCSC I know that for NM_012345 the first CDS is between 50000:50100, strand: "-", chr1. Then I use:
Code:
samtools faidx /path/hg19.fa chr1:50000-50100
Where is the problem? I know that UCSC doesn't use the version (NM_012345 instead of NM_012345.3), but it should work. (hg19 is downloaded from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/) |
NM_012345 is on chromosome 13 (check the genome browser). I expect you're either reading something wrong or got the wrong refGene table.
|
Quote:
I tested ~3000 genes this way. About 1600 work fine; they are on the "+" strand. About 1400 are on the "-" strand, and when I use samtools faidx on those, I can't get the correct mRNA or CDS. |
Ah, in the future, always give working examples :)
Remember that anything on the "-" strand should end in ATG (actually CAT, its reverse complement, when read on the "+" strand), rather than start with it. |
ok, real example:
I have the gene IL10, NM_000572.2. Based on NM_000572 from UCSC I get:
name: NM_000572
chrom: chr1
strand: -
txStart: 206940947
txEnd: 206945839
cdsStart: 206941980
cdsEnd: 206945780
exonStarts: 206940947,206943173,206944251,206944700,206945615,
exonEnds: 206942073,206943239,206944404,206944760,206945839,
name2: IL10
So the first CDS is from 206941980 to 206942073. Then I use:
Code:
samtools faidx hg19.fa chr1:206941978-206942075
The output:
GTCTCAGTTTCGTATCTTCATTGTCATGTAGGCTTCTATGTAGTTGATGAAGATGTCAAACTCACTCATGGCTTTGTAGATGCCTTTCTCTTGGAGCT
no ATG, and TAC in here ;/ |
It's on the '-' strand, so you're grabbing the end, rather than the beginning :)
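To make the point above concrete, here is a minimal Python sketch that takes the sequence posted in the IL10 example, trims the padding, and reverse complements it. It relies on two conventions: UCSC table coordinates are 0-based with an exclusive end, while samtools faidx regions are 1-based and inclusive. The variable and function names are only illustrative.

```python
# Plus-strand sequence returned by
#   samtools faidx hg19.fa chr1:206941978-206942075
# as posted in the example above.
PLUS_STRAND = (
    "GTCTCAGTTTCGTATCTTCATTGTCATGTAGGCTTCTATGTAGTTGATGAAGATGTCAAACTCACT"
    "CATGGCTTTGTAGATGCCTTTCTCTTGGAGCT"
)

REGION_START = 206941978  # first base of the faidx region (1-based, inclusive)
CDS_START = 206941980     # refGene cdsStart (0-based)
EXON_END = 206942073      # refGene exonEnds[0] (exclusive end = last base, 1-based)

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Convert the 0-based cdsStart to the 1-based first CDS base,
# then cut the CDS part out of the padded region.
first_cds_base = CDS_START + 1
left = first_cds_base - REGION_START
right = left + (EXON_END - first_cds_base + 1)
cds_plus = PLUS_STRAND[left:right]

# On the "-" strand the transcript reads as the reverse complement.
cds_minus = reverse_complement(cds_plus)

print(len(cds_minus))   # length of this CDS segment
print(cds_minus[-12:])  # last bases of the coding sequence
```

Read this way, the segment ends with the stop codon TGA, so the region at cdsStart is the 3' end of the IL10 coding sequence, not its beginning.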
|
Quote:
If the strand is "-", the start codon is at cdsEnd and the stop codon is at cdsStart! Very confusing! +1 to experience :) |
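The same rule can be applied to a whole refGene row. The following is a rough sketch, not a definitive implementation: `cds_regions` is a hypothetical helper that clips each exon to the cdsStart/cdsEnd interval, converts the 0-based UCSC starts to 1-based samtools faidx regions, and lists the segments in transcript order (reversed for "-"). The IL10 values are the ones posted above.

```python
def cds_regions(chrom, strand, cds_start, cds_end, exon_starts, exon_ends):
    """Return samtools-style regions for the CDS segments, in transcript order.

    exon_starts and exon_ends are the comma-separated strings from refGene
    (0-based starts, exclusive ends), including the trailing comma.
    """
    starts = [int(x) for x in exon_starts.strip(",").split(",")]
    ends = [int(x) for x in exon_ends.strip(",").split(",")]

    segments = []
    for start, end in zip(starts, ends):
        # Clip the exon to the coding interval; skip purely UTR exons.
        start, end = max(start, cds_start), min(end, cds_end)
        if start < end:
            segments.append((start, end))

    # On the "-" strand the transcript begins at the highest coordinate.
    if strand == "-":
        segments.reverse()

    # 0-based half-open -> 1-based inclusive: add 1 to the start only.
    return [f"{chrom}:{start + 1}-{end}" for start, end in segments]


il10 = cds_regions(
    "chr1", "-", 206941980, 206945780,
    "206940947,206943173,206944251,206944700,206945615,",
    "206942073,206943239,206944404,206944760,206945839,",
)
for region in il10:
    print(region)
```

For a "-" strand gene, each fetched segment still has to be reverse complemented before the pieces are joined in this order; the first region listed then carries the start codon and the last one the stop codon.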