12-15-2011, 04:31 AM   #1
Junior Member
Location: Boston, MA

Join Date: Dec 2011
Posts: 1
Exon-exon junctions

We need to create a custom human mRNA expression database containing reshuffled exon-exon junctions.

Can anyone point us to a data source where we can download human mRNA sequences and their exon-exon junctions, which we can then reshuffle using Perl or the like?
kentsis
12-15-2011, 08:53 AM   #2
Senior Member
Location: St. Louis, MO, USA

Join Date: Apr 2011
Posts: 124

You might try the refFlat.txt.gz file, available here (for me, the access time of this page is pretty slow).

A description of the file format can be found here

The same exon can be represented in several transcripts, so it would be good to do a sort | uniq on the list of exons. Also be aware that exons can overlap.
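To make the sort | uniq idea concrete, here is a minimal Python sketch of pulling unique exons and annotated junctions out of refFlat rows. It assumes the standard UCSC refFlat column order (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) with 0-based, half-open coordinates. The function names and the two example rows are made up for illustration.

```python
import io

def parse_refflat(handle):
    """Yield (gene, transcript, chrom, strand, exons) for each refFlat row."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # exonStarts and exonEnds are comma-separated and end with a trailing comma
        starts = [int(x) for x in fields[9].split(",") if x]
        ends = [int(x) for x in fields[10].split(",") if x]
        yield fields[0], fields[1], fields[2], fields[3], list(zip(starts, ends))

def unique_exons_and_junctions(handle):
    """Collect de-duplicated exons and junctions between consecutive exons."""
    exons = set()
    junctions = set()
    for gene, transcript, chrom, strand, tx_exons in parse_refflat(handle):
        for start, end in tx_exons:
            exons.add((chrom, strand, start, end))
        # A junction is stored as (end of left exon, start of right exon),
        # in genomic order regardless of strand.
        for (s1, e1), (s2, e2) in zip(tx_exons, tx_exons[1:]):
            junctions.add((chrom, strand, e1, s2))
    return sorted(exons), sorted(junctions)

# Hypothetical rows: two transcripts of one gene that share exons.
EXAMPLE = (
    "GENE1\tNM_0001\tchr1\t+\t100\t500\t120\t480\t3\t100,250,400,\t150,300,500,\n"
    "GENE1\tNM_0002\tchr1\t+\t100\t500\t120\t480\t2\t100,400,\t150,500,\n"
)
exons, junctions = unique_exons_and_junctions(io.StringIO(EXAMPLE))
for junction in junctions:
    print(junction)
```

For the real file, passing gzip.open("refFlat.txt.gz", "rt") instead of the StringIO object should work the same way. Overlapping exons are kept as separate entries here, since they only collapse when the coordinates are identical.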

BAMseek
12-18-2011, 07:32 AM   #3
NGS specialist
Location: Malaysia

Join Date: Apr 2008
Posts: 249

Try the Useq MakeTranscriptome tool.

Takes a UCSC ref flat table of transcripts and generates two multi fasta files of
transcripts and splices (known and theoretical). All possible unique splice junctions
are created given the exons from each gene's transcripts. In some cases this is
computationally intractable and theoretical splices from these are not complete.
Read through occurs with small exons to the next up or downstream so keep the sequence
length radius to a minimum to reduce the number of junctions. Overlapping exons are
assumed to be mutually exclusive.
You can then align all your reads against these files and use the SamTranscriptomeParser to convert the alignments back to genomic coordinates.

We have a full protocol on how to do this
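For anyone who wants to see the idea behind "all possible unique splice junctions", here is a rough Python sketch. It is not the USeq implementation, just one way to enumerate known and theoretical junctions for a single gene under some simplifying assumptions: plus strand only, exons given as 0-based half-open (start, end) pairs on one sequence string, flanks truncated at the exon boundary (no read-through into neighbouring exons), and overlapping exons treated as mutually exclusive. The function name, the radius parameter, and the toy sequence are all made up.

```python
from itertools import combinations

def theoretical_junctions(exons, sequence, radius):
    """Return {(left_exon, right_exon): junction_sequence} for every
    non-overlapping ordered pair of unique exons."""
    unique = sorted(set(exons))  # same exon from several transcripts counts once
    result = {}
    for (s1, e1), (s2, e2) in combinations(unique, 2):
        if s2 <= e1:
            # Overlapping (or directly abutting) exons: no junction is made.
            continue
        left = sequence[max(s1, e1 - radius):e1]    # last bases of the left exon
        right = sequence[s2:min(e2, s2 + radius)]   # first bases of the right exon
        result[((s1, e1), (s2, e2))] = left + right
    return result

# Toy sequence: exon bases in upper case, intron bases in lower case.
sequence = "A" * 10 + "c" * 10 + "G" * 10 + "c" * 10 + "T" * 10
exons = [(0, 10), (20, 30), (40, 50), (25, 35), (0, 10)]
junctions = theoretical_junctions(exons, sequence, radius=4)
for pair, seq in sorted(junctions.items()):
    print(pair, seq)
```

The number of pairs grows quadratically with the exon count, which is one reason to keep the radius small and to expect large genes to be expensive.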

Last edited by zee; 12-18-2011 at 07:35 AM.
zee
