SEQanswers


zslee 05-18-2010 12:00 AM

about UCSC full length cDNA
 
In one paper, the authors downloaded full-length cDNA sequences from UCSC and aligned them to the hg16 reference to detect A-to-I RNA editing. When I tried to repeat their work by downloading human mRNA sequences (I think this is the data they used), I found no differences between the mRNA sequences and hg16. Why? Did I use the wrong data?

Can anyone give me a hint?
Thanks in advance.

jiaco 05-18-2010 03:10 AM

Is this really the first post in a thread or is my browser messed up? I hope that my reply here can actually be considered constructive, as intended.

Quote:

Originally Posted by zslee (Post 18721)
in one paper,

what paper?

Quote:

Originally Posted by zslee (Post 18721)
by downloading human mRNA sequences

From where? Which file(s)?

Quote:

Originally Posted by zslee (Post 18721)
(i think this is the data they used)

This appears to be the crux of your problem: you do not know what data they used, so you cannot download it or repeat their work.

But again, what paper?

Quote:

Originally Posted by zslee (Post 18721)
why? did i use the wrong data?

I honestly have no idea based on the info in your post.

zslee 05-18-2010 03:29 AM

Yes, I should have made that clear.

The paper I am looking at is
"Widespread RNA editing of embedded Alu elements in the human transcriptome".
In the Methods section, they say: "We obtained all human and mouse full-length cDNA sequences from the UCSC Genome Browser database and aligned them against their reference genome sequences using BLAT"

My problem is that I don't know where to download the full-length cDNA sequences from the UCSC database.

Thanks
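
For the BLAT step quoted from the Methods, the alignments usually come back in PSL format, where the second column is the number of mismatching bases. Below is a minimal Python sketch for pulling out alignments that contain mismatches. It assumes the standard 21-column tab-separated PSL layout; the function name and the fields kept are my own choices for illustration, not anything taken from the paper.

```python
def parse_psl(lines):
    """Yield a small dict per PSL alignment line, skipping header lines.

    Assumes the standard 21-column, tab-separated PSL layout.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header lines and malformed lines do not start with a match count.
        if len(fields) != 21 or not fields[0].isdigit():
            continue
        yield {
            "matches": int(fields[0]),
            "mismatches": int(fields[1]),
            "strand": fields[8],
            "q_name": fields[9],    # query (cDNA) name
            "q_size": int(fields[10]),
            "t_name": fields[13],   # target (chromosome) name
            "t_start": int(fields[15]),
            "t_end": int(fields[16]),
        }


def with_mismatches(lines):
    """Return only the alignments that report at least one mismatch."""
    return [hit for hit in parse_psl(lines) if hit["mismatches"] > 0]
```

Typical use would be `with_mismatches(open("out.psl"))`, where out.psl is a hypothetical BLAT output file. If this list is empty for every sequence, the query sequences were probably cut out of the genome rather than sequenced.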

jiaco 05-18-2010 03:44 AM

From http://genome.ucsc.edu/
click Tables,
then select the following form entries:

clade: Mammal, genome: Human, assembly: Mar. 2006 (or the assembly you want)
group: Genes and Gene Prediction Tracks, track: UCSC Genes
table: knownGene
region: genome

output format: sequence
output file: set a file name
file type returned: gzip compressed

Then click "get output".

On the next page, select mRNA, which splices out the introns, and start the download of all UCSC Known Genes mRNA sequences in FASTA format.
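
Once the download finishes, it is worth checking that the file really contains what you expect before aligning anything. Here is a small Python sketch that reads a plain or gzip-compressed FASTA file and reports the number of records and total bases. The function names are mine, and it only assumes ordinary FASTA with ">" header lines.

```python
import gzip


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain or gzip-compressed FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(path):
    """Return (number of records, total bases) for a FASTA file."""
    n_records = total_bases = 0
    for _, seq in read_fasta(path):
        n_records += 1
        total_bases += len(seq)
    return n_records, total_bases
```

For example, `summarize("knownGene_mrna.fa.gz")` (a made-up file name, use whatever you typed in the output file box) gives a quick count to compare against the number of entries you expected.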

(but this is really a question for UCSC genome browser, no?)

steven 05-18-2010 03:52 AM

Yep indeed. It looks like if you go to "Tables" and select group=mRNAs and ESTs, track=Human mRNAs, table=all_mrna, then output=sequence sends you to "Human mRNAs *Genomic* Sequence". As I understand it, that is a genomic extraction based on the positions of the alignments, so it is identical to the reference by construction, which is obviously not what you want.
To get the original transcript sequences, try the RefSeq Genes track instead; you will be given the choice between genomic, protein and mRNA.
I bet the last one works for you.
cheers,
s.
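
To illustrate the point about genomic extraction: A-to-I editing shows up as A-to-G when a cDNA is compared with the genome, because inosine is read as guanosine during sequencing. A sequence that was itself cut out of the genome can never show that. Here is a toy Python sketch with made-up sequences; it assumes the two strings are already aligned, ungapped, and on the same strand (for a minus-strand transcript the same edit would appear as T-to-C on the forward genomic strand).

```python
from collections import Counter


def substitution_counts(genomic, transcript):
    """Count (genomic_base, transcript_base) mismatches.

    Both sequences must be the same length, ungapped, and on the same strand.
    """
    if len(genomic) != len(transcript):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for g, t in zip(genomic.upper(), transcript.upper()):
        if g != t:
            counts[(g, t)] += 1
    return counts


# Made-up example data, not real sequences.
genomic = "ACGTAAGCTA"
sequenced_cdna = "ACGTAGGCTA"   # one A read as G, as an edited site would look
genomic_extract = genomic       # what a "genomic" sequence download amounts to

print(substitution_counts(genomic, sequenced_cdna))
print(substitution_counts(genomic, genomic_extract))
```

The first call reports a single A-to-G mismatch, and the second reports none, which matches what zslee saw with the all_mrna genomic output.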

zslee 05-18-2010 04:08 AM

Yes, I've tried it and it works. Thanks, jiaco!
And I'm not familiar with UCSC.

zslee 05-18-2010 04:11 AM

Yes, just as you said, I need the mRNA sequences, not the genomic ones.
In the paper I mentioned, the authors downloaded the same data that jiaco suggested.
Thanks, steven!

