SEQanswers

Old 09-22-2011, 05:36 PM   #1
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509
de novo assembly of repeat elements

As part of a de novo assembly project, I'd like to try to identify repeat elements - everything from single gene duplications (difficult) to transposons (less difficult). The data are Illumina PE-101 reads, ~50X coverage. My (admittedly unsophisticated) approach is to assemble contigs (I'll try both de Bruijn and overlap assemblers), then flag those with >2X average read depth.
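The depth-flagging step could be sketched roughly as follows. This is only an illustration: the function name, the input layout (contig name mapped to length and mean depth), and the example numbers are all assumptions, not output from any particular assembler. Note that high-copy contigs inflate the assembly-wide average, so a median may be a safer baseline in practice.

```python
def flag_high_depth_contigs(contigs, fold=2.0):
    """Return names of contigs whose mean depth exceeds fold x the assembly average.

    contigs: dict mapping contig name -> (length_bp, mean_depth).
    The average is length-weighted so that short contigs do not dominate it.
    """
    total_bases = sum(length for length, _ in contigs.values())
    if total_bases == 0:
        return []
    avg_depth = sum(length * depth for length, depth in contigs.values()) / total_bases
    return sorted(name for name, (_, depth) in contigs.items()
                  if depth > fold * avg_depth)


# Hypothetical example values, not real data.
example_contigs = {
    "contig_1": (12000, 48.0),
    "contig_2": (8000, 52.0),
    "contig_3": (1500, 160.0),
    "contig_4": (900, 101.0),
}
flagged = flag_high_depth_contigs(example_contigs)
print(flagged)
```

The per-contig depths would have to come from mapping the reads back to the assembly or from the assembler's own coverage report.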

Two questions:
1) are there any tools designed for this application?
2) any suggestions for alternative strategies (e.g., candidate identification by sequence conservation, branch counting of de Bruijn graphs, etc.)?
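To make the branch-counting idea concrete, here is a toy sketch, assuming error-free reads and ignoring reverse complements (a real de Bruijn assembler handles both). The function name and the reads are made up for illustration. Nodes are (k-1)-mers, and a node with more than one distinct successor is a branch point, which is what a repeat boundary tends to produce.

```python
from collections import defaultdict


def branching_nodes(reads, k):
    """Map each (k-1)-mer that has more than one distinct successor to its successors."""
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix node to its suffix node.
            successors[kmer[:-1]].add(kmer[1:])
    return {node: sorted(nxt) for node, nxt in successors.items() if len(nxt) > 1}


# Toy reads (hypothetical): the shared segment "ACGT" is followed by
# different sequence in each read, so the graph forks where it ends.
toy_reads = ["TTACGTGG", "CCACGTAA"]
branches = branching_nodes(toy_reads, k=4)
print(branches)
```

With real data, sequencing errors also create branches, so low-count k-mers would need to be filtered out first.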

Thanks,
Harold
Old 09-22-2011, 11:04 PM   #2
jimmybee
Senior Member
 
Location: Adelaide, Australia

Join Date: Sep 2010
Posts: 119

What's your species?
Old 09-23-2011, 07:00 AM   #3
Hobbe
Member
 
Location: Uppsala, Sweden

Join Date: Apr 2010
Posts: 29

RepeatMasker both identifies and masks repeat elements. Some assembly programs, like MIRA, also mark repeat regions with tags. In MIRA you can check the sequences identified as repeats in the projectname_info_readrepeats.lst file.
Old 09-23-2011, 09:03 AM   #4
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509

Quote:
Originally Posted by jimmybee
What's your species?
Nematodes for now, but there are likely to be others in the future.

Harold
Old 09-23-2011, 09:03 AM   #5
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509

Thanks for the recommendations, Hobbe. I'll look into them.

Harold
Old 10-26-2011, 03:48 PM   #6
saemi
Junior Member
 
Location: University of British Columbia, Vancouver

Join Date: Oct 2010
Posts: 5

Hi Harold

How is your project on the de novo assembly of transposons going? I'm interested in doing a similar project to compare transposons among closely related plant species using Illumina sequencing. What programs are you using?

Cheers,
Saemi
Old 10-27-2011, 06:19 AM   #7
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

For plants you will see lots of LTR retrotransposons. These might assemble with the LTRs in the middle, since the LTRs are, well, long (0.2-5 kb, for the most part) and are also repeats that flank the internal domains of these ubiquitous transposable elements.

There was even a program, designed by Jeremy DeBarry when he was at UGA in the Bennetzen lab, that took advantage of this to pull LTR retros from full genome assemblies and reconstruct their LTRs in the correct positions. He called it the "AAARF" algorithm. (Get it? UGA Bulldogs, aaarf?)

Ah, here it is. Also a publication. Looks like it is also usable for other sorts of elements as well. There you go: code from a maize lab -- you know maize, the organism where transposable elements were discovered? Worth a look.

--
Phillip
Old 10-27-2011, 07:45 AM   #8
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509

Hi Saemi,

I'm still waiting to obtain the sequence data for this project, so I don't have any results to report. I'll keep you posted regarding my progress.

Harold
Old 10-27-2011, 12:43 PM   #9
saemi
Junior Member
 
Location: University of British Columbia, Vancouver

Join Date: Oct 2010
Posts: 5

Hi

@Harold, OK great, I'm looking forward to hearing from you.

@Phillip, great, thank you very much for the information and the paper. I'll take a close look at it. One of the things I'm concerned about is that I plan to use an Illumina HiSeq in my project, on species which don't have a reference genome. Most of the available methods I've seen for looking at transposons in a shotgun library work on 454 sequences. I guess one way to go is to do a de novo assembly on the data first, but then I'm afraid of losing information from my dataset.

Thank you guys
Saemi
Old 11-09-2011, 03:03 AM   #10
Claudia34
Junior Member
 
Location: Geneva, Switzerland

Join Date: Sep 2010
Posts: 9

Hi,

We have a project to identify genome-wide transposition events in flies after knock-down of a protein of interest. We think sequencing and de novo assembly are the best way to do it. Do you agree?
I don't know if Illumina is the best technology for this kind of analysis because of the read length. Do you have any recommendations?

Thanks,
Claudia
Old 11-09-2011, 07:38 AM   #11
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509

Hi Claudia,

If you already have a genome assembly for your species and you know the sequences of your transposons, you can use the strategy described here. Briefly, use paired-end sequencing and map the different ends to genomic and transposon sequences to identify insertion sites. Let me know if you want more details.
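As a rough illustration of the paired-end idea (not the pipeline from the linked paper), the sketch below collects pairs where one mate maps to the genome and the other to a transposon, then clusters the genomic anchor positions. The function name, the input layout, the `window` and `min_pairs` defaults, and the example coordinates are all assumptions. In practice the mate positions would be parsed from alignment files.

```python
from collections import defaultdict


def candidate_insertions(pairs, transposon_names, window=500, min_pairs=2):
    """Cluster genomic anchors of pairs whose other mate maps to a transposon.

    pairs: iterable of ((ref1, pos1), (ref2, pos2)) mate alignments.
    Returns a list of (chrom, start, end, transposon, n_pairs) tuples.
    """
    anchors = defaultdict(list)  # (chrom, transposon) -> genomic positions
    for (ref1, pos1), (ref2, pos2) in pairs:
        if ref1 in transposon_names and ref2 not in transposon_names:
            anchors[(ref2, ref1)].append(pos2)
        elif ref2 in transposon_names and ref1 not in transposon_names:
            anchors[(ref1, ref2)].append(pos1)

    calls = []
    for (chrom, te), positions in sorted(anchors.items()):
        positions.sort()
        cluster = [positions[0]]
        # The trailing None is a sentinel that flushes the last cluster.
        for pos in positions[1:] + [None]:
            if pos is not None and pos - cluster[-1] <= window:
                cluster.append(pos)
                continue
            if len(cluster) >= min_pairs:
                calls.append((chrom, cluster[0], cluster[-1], te, len(cluster)))
            cluster = [pos]
    return calls


# Hypothetical mate alignments; "Tc1" is used only as a placeholder element name.
example_pairs = [
    (("chrX", 1000), ("Tc1", 50)),
    (("Tc1", 1500), ("chrX", 1200)),
    (("chrX", 1100), ("chrX", 1400)),   # both mates genomic: ignored
    (("chrII", 90000), ("Tc1", 20)),    # lone supporting pair: below min_pairs
]
calls = candidate_insertions(example_pairs, {"Tc1"})
print(calls)
```

Requiring at least two supporting pairs is an arbitrary choice here; the right threshold depends on coverage and on how many mismapped pairs you expect.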

Harold
Old 11-09-2011, 07:50 AM   #12
Claudia34
Junior Member
 
Location: Geneva, Switzerland

Join Date: Sep 2010
Posts: 9

Hi Harold,

Thanks a lot for your answer. I think we can use this strategy because we are working on Drosophila melanogaster. I will read this paper carefully.

Thanks again,
Claudia
Old 11-09-2011, 04:11 PM   #13
adaptivegenome
Super Moderator
 
Location: US

Join Date: Nov 2009
Posts: 437

Quote:
Originally Posted by Claudia34
Hi Harold,

Thanks a lot for your answer. I think we can use this strategy because we are working on Drosophila melanogaster. I will read this paper carefully.

Thanks again,
Claudia
What are you suppressing in the flies, Hsp90?
All times are GMT -8.