SEQanswers

02-02-2010, 05:18 AM   #1
kwebb
Member

Location: Washington, DC

Join Date: Jul 2008
Posts: 21
Ref Genome Repeat Masker

I am looking for a tool that will mask (replace with X's) regions of the reference genome that occur multiple times. Something along the lines of BLASTing the reference genome against itself, identifying regions of a given length or longer that are identical or nearly identical because they occur more than once in the reference genome, and then replacing those positions with X's. Does anyone know of such a tool?
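
To make the request concrete, the operation described above can be sketched in Python as exact k-mer self-matching. This is only an illustration under stated assumptions, not an existing tool: the function name `mask_repeated_kmers` and the parameter `k` are hypothetical, only exact forward-strand matches are counted (no reverse complement, no near-identical hits), and holding every k-mer in memory would not scale to a large genome.

```python
from collections import Counter


def mask_repeated_kmers(contigs, k=36, mask_char="X"):
    """Mask every position covered by a k-mer that occurs more than once.

    contigs: dict mapping contig name -> sequence string.
    Assumption: exact, forward-strand matches only (a simplification).
    """
    # Pass 1: count every k-mer across all contigs.
    counts = Counter()
    for seq in contigs.values():
        upper = seq.upper()
        for i in range(len(upper) - k + 1):
            counts[upper[i:i + k]] += 1

    # Pass 2: overwrite positions covered by any k-mer seen more than once.
    masked = {}
    for name, seq in contigs.items():
        upper = seq.upper()
        out = list(seq)
        for i in range(len(upper) - k + 1):
            if counts[upper[i:i + k]] > 1:
                out[i:i + k] = mask_char * k
        masked[name] = "".join(out)
    return masked
```

Setting `k` to the read length (36 or 40 here) would leave unmasked only those positions where a read of that length has a single exact placement on the forward strand.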

I've collected 36 and 40 bp short reads for a eukaryotic genome that is not yet published or annotated, but I do have access to contigs and supercontigs. Being able to assemble against only the unique regions will simplify our analyses.

Thanks for your help.
02-02-2010, 05:59 AM   #2
kmcarr
Senior Member

Location: USA, Midwest

Join Date: May 2008
Posts: 1,178

RepeatMasker

http://www.repeatmasker.org/
02-02-2010, 12:16 PM   #3
kwebb
Member

Location: Washington, DC

Join Date: Jul 2008
Posts: 21

Thanks.

I've looked into RepeatMasker, and it appears that I would have to pay for a license for Cross_Match, AB-BLAST/WU-BLAST, or DeCypher. I'm at a government lab. Is there a free workaround, or another tool that you know of?
02-03-2010, 08:25 AM   #4
whsqwghlm
Member

Location: Cambridge, UK

Join Date: Jun 2009
Posts: 14

vmatch can be used to find repetitive sequences in a genome, but does not classify them for you.
02-15-2010, 06:37 PM   #5
ewingad
Junior Member

Location: Philadelphia

Join Date: Aug 2008
Posts: 6

Give P-clouds a try: http://www.evolutionarygenomics.com/PClouds.html
02-16-2010, 08:03 AM   #6
Thomas Doktor
Senior Member

Location: University of Southern Denmark (SDU), Denmark

Join Date: Apr 2009
Posts: 105

If you supply the repeat regions in a BED file, BEDTools can mask the FASTA sequences for you.
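
For anyone wanting to see what BED-based masking amounts to, here is a minimal Python sketch. It is not the BEDTools implementation: the function name `mask_with_bed` is hypothetical, sequences are assumed to fit in memory as a dict, and BED coordinates are treated as 0-based, half-open intervals as in the BED format.

```python
def mask_with_bed(contigs, bed_lines, mask_char="X"):
    """Replace every base inside a BED interval with mask_char.

    contigs: dict mapping sequence name -> sequence string.
    bed_lines: iterable of BED text lines (name, start, end, ...).
    """
    out = {name: list(seq) for name, seq in contigs.items()}
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        if chrom not in out:
            continue  # interval refers to a sequence we do not have
        chars = out[chrom]
        # Clamp the interval to the sequence bounds.
        start = max(int(start), 0)
        end = min(int(end), len(chars))
        if start < end:
            chars[start:end] = mask_char * (end - start)
    return {name: "".join(chars) for name, chars in out.items()}
```

In practice the BEDTools utility for this is `maskFastaFromBed`, which takes the input FASTA (`-fi`), the BED file (`-bed`) and an output path (`-fo`); the sketch above only mirrors the idea.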
03-29-2010, 10:45 PM   #7
robsyme
Junior Member

Location: Perth, Western Australia

Join Date: Jan 2009
Posts: 6

You don't have to pay for cross_match. If you email the people at the University of Washington, they will send you an academic copy once you put your name to the licence agreement.
-r

All times are GMT -8.