SEQanswers

Old 02-05-2013, 08:08 AM   #1
thedamian
Member
 
Location: Barcelona

Join Date: Feb 2012
Posts: 49
UCSC refSeq Gene and hg19 coordinate

Hello,
I have a list of mRNA NM_ numbers.
In UCSC, in the hg19 refGene table, I can get exon and CDS coordinates for every NM_ accession.

However, when I pull out a subsequence from hg19 based on the refGene coordinates, the result seems to be incorrect for the reverse strand. Taking the reverse complement of the pulled exons doesn't work either.

-------
example:
I have NM_012345.3.
From UCSC I know that for NM_012345 the first CDS is between 50000 and 50100, strand "-", chr1.
Then I use:
Code:
samtools faidx /path/hg19.fa chr1:50000-50100
The result doesn't start with ATG (and it should).


Where is the problem? I know that UCSC doesn't use the version (NM_012345 instead of NM_012345.3) but it should work.

(hg19 is downloaded from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/)
Old 02-05-2013, 09:37 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

NM_012345 is on chromosome 13 (check the genome browser). I expect you're either reading something wrong or got the wrong refGene table.
Old 02-05-2013, 11:17 AM   #3
thedamian
Member
 
Location: Barcelona

Join Date: Feb 2012
Posts: 49

Quote:
Originally Posted by dpryan
NM_012345 is on chromosome 13 (check the genome browser). I expect you're either reading something wrong or got the wrong refGene table.
Heh, it was an abstract example; 012345 is like abcdef.
I tested ~3000 genes this way. About 1600 work fine; they are on the "+" strand.
About 1400 are on the "-" strand, and when I use samtools faidx on them, I can't get the correct mRNA or CDS.
Old 02-05-2013, 11:22 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Ah, in the future, always give working examples

Remember that anything on the "-" strand should end in ATG (actually, CAT), rather than start with it.
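
To illustrate the point about ATG versus CAT, here is a minimal Python sketch of a reverse complement. The helper name `reverse_complement` is hypothetical (it is not part of samtools or any UCSC tool), and it assumes plain A/C/G/T/N sequence as found in hg19.fa.

```python
# Complement table for unambiguous DNA bases plus N, in both cases
# (hg19.fa uses lowercase for soft-masked repeats).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    return seq.translate(_COMPLEMENT)[::-1]


# A start codon of a "-" strand gene shows up as CAT in the forward-strand FASTA,
# and it sits at the END of the fetched CDS, not at the beginning.
print(reverse_complement("CAT"))  # ATG
```

So for a "-" strand gene, the forward-strand sequence has to be reverse complemented before it can be compared with the mRNA.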
Old 02-06-2013, 11:47 PM   #5
thedamian
Member
 
Location: Barcelona

Join Date: Feb 2012
Posts: 49

ok, real example:
I have gene IL10, NM_000572.2.
Based on NM_000572 from UCSC I get:

name: NM_000572
chrom: chr1
strand: -
txStart: 206940947
txEnd: 206945839
cdsStart: 206941980
cdsEnd: 206945780
exonStarts: 206940947,206943173,206944251,206944700,206945615,
exonEnds: 206942073,206943239,206944404,206944760,206945839,
name2: IL10

so first CDS is from 206941980 to 206942073

then I use:
Code:
samtools faidx hg19.fa chr1:206941978-206942075
(I extended the region by 2 on each side because UCSC coordinates are 0-based and samtools regions are 1-based)
The output:
GTCTCAGTTTCGTATCTTCATTGTCATGTAGGCTTCTATGTAGTTGATGAAGATGTCAAACTCACTCATGGCTTTGTAGATGCCTTTCTCTTGGAGCT

It does not start with ATG, and there is no TAC in it either ;/
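
A note on the coordinate arithmetic, as a hedged sketch: UCSC tables store intervals as 0-based, half-open, while samtools faidx regions are 1-based and inclusive, so the usual conversion is start + 1 with the end left unchanged, rather than 2 on each side. The function name `ucsc_to_samtools_region` below is hypothetical; the numbers and the sequence are the ones posted above.

```python
def ucsc_to_samtools_region(chrom: str, start0: int, end0: int) -> str:
    """Convert a 0-based, half-open UCSC interval to a 1-based, inclusive region string."""
    return f"{chrom}:{start0 + 1}-{end0}"


# Values from the post: cdsStart and the end of the first listed exon.
region = ucsc_to_samtools_region("chr1", 206941980, 206942073)
print(region)  # chr1:206941981-206942073

# Sequence from the post, fetched as chr1:206941978-206942075 (1-based, inclusive).
fetched_first = 206941978
fetched = "GTCTCAGTTTCGTATCTTCATTGTCATGTAGGCTTCTATGTAGTTGATGAAGATGTCAAACTCACTCATGGCTTTGTAGATGCCTTTCTCTTGGAGCT"

# Trim the padded sequence down to the converted interval with plain slicing.
lo = (206941980 + 1) - fetched_first      # index of the first wanted base
hi = 206942073 - fetched_first + 1        # one past the last wanted base
trimmed = fetched[lo:hi]
print(len(trimmed), trimmed[:3])  # 93 TCA
```

The trimmed block begins with TCA, which is the reverse complement of the stop codon TGA. That fits a "-" strand gene: this block is the end of the coding sequence, not its start.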
Old 02-07-2013, 12:13 AM   #6
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

It's on the '-' strand, so you're grabbing the end, rather than the beginning
Old 02-07-2013, 12:18 AM   #7
thedamian
Member
 
Location: Barcelona

Join Date: Feb 2012
Posts: 49

Quote:
Originally Posted by dpryan
It's on the '-' strand, so you're grabbing the end, rather than the beginning
Heh, yes, I've just realised it.
If the strand is "-", the start codon is at cdsEnd and the stop codon is at cdsStart! Very confusing!
+1 to experience
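
The conclusion of the thread can be sketched in Python using the IL10 record posted earlier. This is only an illustration under stated assumptions: the function name `cds_blocks_in_transcript_order` and the dict layout are hypothetical, and the intervals are treated as 0-based, half-open, as UCSC tables store them.

```python
def cds_blocks_in_transcript_order(rec):
    """Return coding blocks as 0-based, half-open (start, end) tuples,
    ordered from the start codon to the stop codon of the transcript."""
    starts = [int(x) for x in rec["exonStarts"].split(",") if x]
    ends = [int(x) for x in rec["exonEnds"].split(",") if x]
    blocks = []
    for s, e in zip(starts, ends):
        # Clip each exon to the coding region; skip exons that are entirely UTR.
        s, e = max(s, rec["cdsStart"]), min(e, rec["cdsEnd"])
        if s < e:
            blocks.append((s, e))
    if rec["strand"] == "-":
        # On the "-" strand the transcript runs from high to low coordinates.
        blocks.reverse()
    return blocks


# refGene record for NM_000572 as posted in this thread.
il10 = {
    "chrom": "chr1", "strand": "-",
    "cdsStart": 206941980, "cdsEnd": 206945780,
    "exonStarts": "206940947,206943173,206944251,206944700,206945615,",
    "exonEnds": "206942073,206943239,206944404,206944760,206945839,",
}

blocks = cds_blocks_in_transcript_order(il10)
first_start, first_end = blocks[0]
# 1-based, inclusive region holding the start codon (it reads CAT on the forward strand).
print(f'{il10["chrom"]}:{first_start + 1}-{first_end}')  # chr1:206945616-206945780
```

Each block fetched this way for a "-" strand gene still has to be reverse complemented, and the blocks concatenated in the returned order, to obtain the CDS as it appears in the mRNA.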