SEQanswers

Old 03-24-2014, 10:56 AM   #1
biznatch
Senior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 124
Download DNA sequences for RepeatMasker track

1. I want to download the DNA sequences for the mouse mm9 RepeatMasker track from the UCSC Genome Browser. When I tried the Table Browser, the download reached 167 MB after ~10 minutes and then stopped, having completed only chromosome 1 and part of chromosome 2. The file I downloaded ended with:

Code:
procedures have exceeded timeout: 1200 seconds, function has ended.
My internet connection is very fast, but the download was pretty slow, so I assume it's being limited on UCSC's end. Should I be able to download this from the Table Browser, or is there a better way?

2. Here is what one sequence looks like:

Code:
>mm9_rmsk_L1_Mur2 range=chr1:3000002-3000156 5'pad=0 3'pad=0 strand=- repeatMasking=none
AAATGTTAAATCTAAAAAAATCCTAACAAGAAACAGCCAGGAAATCTGGG
ACACTATGAAAAGACCAAACCTAAGAAAAATAGGAATAAAAGAAGGACAA
AAGTTTCAGCTGAAACACCCAGAAAACATATTAAACTAAATCATAGAAAA
GAATT
The header includes the repeat name (L1_Mur2), but I would also like the repeat Family and Class. You get those if you download the RepeatMasker track itself, but not if you download the actual sequences as I'm trying to do. I'm pretty sure I could use Perl to add the correct Family and Class to each sequence header, but if there is some way to get the sequences with this information already included, it would save a bit of time.
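For anyone who goes the scripting route, here is a rough Python sketch (instead of Perl) of the header annotation described above. It assumes you have already built a lookup from the RepeatMasker track table, keyed by (chromosome, 0-based start, end, strand). The names `annotate_header` and `classes` are made up for illustration, and the 1-based reading of `range=` is my assumption about the Table Browser output.

```python
import re

# Parses Table Browser FASTA headers such as:
# >mm9_rmsk_L1_Mur2 range=chr1:3000002-3000156 5'pad=0 3'pad=0 strand=- repeatMasking=none
HEADER_RE = re.compile(
    r"^>(?P<db>[^_\s]+)_rmsk_(?P<name>\S+)\s+"
    r"range=(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)"
    r".*?\bstrand=(?P<strand>[+-])"
)


def annotate_header(header, classes):
    """Append repClass/repFamily to a header, or return it unchanged.

    `classes` is a hypothetical dict built from the track table:
    (chrom, 0-based start, end, strand) -> (repClass, repFamily).
    """
    m = HEADER_RE.match(header)
    if m is None:
        return header
    # Assumption: range= is 1-based, while the track table start is 0-based.
    key = (m["chrom"], int(m["start"]) - 1, int(m["end"]), m["strand"])
    hit = classes.get(key)
    if hit is None:
        return header
    rep_class, rep_family = hit
    return f"{header} repClass={rep_class} repFamily={rep_family}"
```

Keying on coordinates rather than on the repeat name alone avoids relying on every repeat name mapping to exactly one Class and Family.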

[Edit] Problem solved by using bedtools getfasta with the genome FASTA file and a BED file for each type of repeat.
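A minimal sketch of one way to prepare the BED input for that approach. This is not necessarily what was done here (a single BED file instead of one per repeat type). It assumes the RepeatMasker table was saved as tab-separated text with the usual UCSC rmsk columns, and it packs name, Class and Family into the BED name field. The function name `rmsk_to_bed` is hypothetical.

```python
# Column order of the UCSC rmsk table (assumed to match the downloaded file).
RMSK_COLUMNS = [
    "bin", "swScore", "milliDiv", "milliDel", "milliIns",
    "genoName", "genoStart", "genoEnd", "genoLeft", "strand",
    "repName", "repClass", "repFamily", "repStart", "repEnd",
    "repLeft", "id",
]


def rmsk_to_bed(lines):
    """Yield BED6 lines whose name field is repName|repClass|repFamily."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        row = dict(zip(RMSK_COLUMNS, line.split("\t")))
        name = "|".join((row["repName"], row["repClass"], row["repFamily"]))
        # The score column is set to 0; the strand is kept for getfasta -s.
        yield "\t".join(
            (row["genoName"], row["genoStart"], row["genoEnd"], name, "0", row["strand"])
        )
```

Writing these lines to a file (say rmsk.bed, name made up) and running something like `bedtools getfasta -fi mm9.fa -bed rmsk.bed -name -s -fo rmsk.fa` should put the Class and Family into each FASTA header. `-name` takes the header from the BED name field and `-s` reverse-complements minus-strand repeats. Newer bedtools versions may append the coordinates to the name, so check the output of your version.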

Last edited by biznatch; 03-24-2014 at 01:19 PM.