SEQanswers

03-26-2014, 11:19 AM   #1
genetics_jo
Member
 
Location: Corvallis, OR

Join Date: Feb 2014
Posts: 11
New User--Working on genome assembly

Hi All!
My name is John Henning and I'm working with the organism Humulus lupulus L., commonly known as hops. I work for USDA-ARS and am housed at Oregon State University.

I'm currently working on sequencing the hop genome and have been using Velvet for the first round of de novo assembly. A few runs have completed successfully, but I'm not happy with the results (N50 = 270-1000). So I've come here seeking answers from you experts!
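For readers comparing assemblies by N50, here is a minimal sketch of how the statistic is usually computed: sort contigs from longest to shortest and report the length at which the running total first reaches half of the assembly. The contig lengths below are made up for illustration and are not from any hop assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the contig length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical contig lengths, for illustration only.
example_contigs = [2000, 1500, 1000, 500, 300, 200]
print(n50(example_contigs))
```

Note that assemblers differ in which contigs they include in this calculation (for example, after a minimum-length filter), so N50 values are only comparable when computed the same way.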

03-26-2014, 11:42 AM   #2
nucleus
Junior Member
 
Location: Albany,ny

Join Date: May 2013
Posts: 7

Although I do not have much experience with eukaryotic genomes, I am fairly sure that Velvet would not be the best assembler for you. First off, what kind of libraries do you have (single-end, paired-end, mate-pair?) and from which platform?

You can take a look at the GAGE assembly competition page (http://gage.cbcb.umd.edu/); it lists a few of the popular assemblers with recipes. This may be a good starting point. Also, look at other plant genome papers and see what they used; that always helps. Finally, I was glancing at the loblolly pine genome paper. They used MaSuRCA for the assembly, which might work for you.

03-26-2014, 11:45 AM   #3
ctseto
Member
 
Location: SE MN

Join Date: Oct 2013
Posts: 44

There are some interesting new de novo assembly papers on the loblolly pine that might give you some ideas.

http://genomebiology.com/content/pdf...4-15-3-r59.pdf
http://www.genetics.org/content/196/3/875.full.pdf+html

As a bacteria guy, I have no experience with plants. You are free to browse the De novo discovery forum and repost there.

What libraries and platforms did you use? Have you tried other assemblers? Presumably the diploid form as well?

Edit: While I was pulling my links for loblolly I was beaten to the punch. I am a SPAdes person (having come from Velvet and ABySS, and working on Ray at the moment). SPAdes has a diploid mode, dipSPAdes, that is pretty interesting, but I haven't really put it through its paces yet.

Last edited by ctseto; 03-26-2014 at 11:59 AM.

03-26-2014, 11:48 AM   #4
nucleus
Junior Member
 
Location: Albany,ny

Join Date: May 2013
Posts: 7

Sorry ctseto

03-26-2014, 12:06 PM   #5
genetics_jo
Member
 
Location: Corvallis, OR

Join Date: Feb 2014
Posts: 11

Thanks for your replies!! Much quicker than I anticipated!

I have the following data: two genomes (a male and a female plant), each sequenced with Illumina 101 bp paired-end reads on a single HiSeq 2000 lane; two RNA-seq data sets of 101 bp paired-end reads (also Illumina HiSeq) that are available via NCBI; and finally a set of "long read" EST libraries downloaded from NCBI (~25,000 ESTs).
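For anyone wanting to turn a data description like this into an expected sequencing depth, a back-of-the-envelope sketch follows. The read length (101 bp) is from the post; the number of read pairs per lane and the genome size are placeholder assumptions, not figures from this thread, and should be replaced with real values.

```python
def expected_coverage(read_pairs, read_length, genome_size):
    """Mean depth = total sequenced bases / genome size.

    Each pair contributes two reads of `read_length` bases.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return 2 * read_pairs * read_length / genome_size


# ASSUMPTIONS for illustration only: neither the lane yield nor the
# genome size below comes from the thread.
read_pairs = 150_000_000       # hypothetical pairs from one lane
read_length = 101              # bp, as stated in the post
genome_size = 2_500_000_000    # hypothetical genome size in bp

depth = expected_coverage(read_pairs, read_length, genome_size)
print(f"expected mean depth: {depth:.1f}x")
```

Low expected depth on a large, repeat-rich genome is one common reason for short contigs, so this number is worth checking before tuning assembler parameters.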

My problem is that I have access to a machine with 1 TB of RAM, but only for one week per month because it is shared among four different groups. So I'm trying to see whether I can get my department's server up and running for assembly.

03-26-2014, 12:13 PM   #6
ctseto
Member
 
Location: SE MN

Join Date: Oct 2013
Posts: 44

I'm looking at NCBI's list of 25,712 ESTs for H. lupulus and they are somewhat short, in the <1 kb range. It's probable there are repeats in the hop genome that your assembly would stall on.
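A quick way to check the EST length distribution yourself is to tally sequence lengths from the downloaded FASTA file. The sketch below uses only the standard library; the file name in the usage comment is hypothetical.

```python
def fasta_lengths(lines):
    """Yield (record_id, sequence_length) for each record in FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            header = line[1:].split()
            name = header[0] if header else ""
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


# Tiny in-memory example; for a real file (name is hypothetical) use:
#     with open("hop_ests.fasta") as handle:
#         lengths = [n for _, n in fasta_lengths(handle)]
demo = [">est1 example", "ACGTAC", "GT", ">est2", "GGGCCC"]
lengths = [n for _, n in fasta_lengths(demo)]
print(min(lengths), max(lengths), sum(lengths) / len(lengths))
```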

As a sanity check, you could map your reads against your ESTs to assess coverage. That said, how powerful is your department's server? It may be sufficient with newer, more efficient assemblers and with techniques that reduce the size of the assembly problem (e.g., use diginorm to generate a subset of the data, assemble it, then use the diginorm contigs to extend the assembly with the un-normalized data).
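To make the diginorm idea concrete, here is a toy sketch of digital normalization: a read is kept only if the median count of its k-mers, among reads kept so far, is below a cutoff. This is an illustration under simplifying assumptions, not the khmer implementation: it uses an exact in-memory counter (real tools use a memory-bounded probabilistic counter), ignores reverse complements and read pairing, and the default `k` and `cutoff` values are arbitrary choices for the example.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def diginorm(reads, k=20, cutoff=20):
    """Keep a read only if the median count of its k-mers is below cutoff.

    Counts are updated only from reads that are kept, so highly
    covered regions stop accumulating reads once they reach the cutoff.
    """
    counts = Counter()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read shorter than k: nothing to count
        if median(counts[w] for w in words) < cutoff:
            kept.append(read)
            counts.update(words)
    return kept


# Toy data: five identical reads plus one unrelated read.
toy_reads = ["ACGTTGCA"] * 5 + ["GGATCCTA"]
print(diginorm(toy_reads, k=4, cutoff=2))
```

The normalized subset is smaller and more evenly covered, which is what lets the first-pass assembly fit on a machine with less RAM.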

Edit: http://seqanswers.com/forums/showthread.php?t=40024

03-27-2014, 07:54 AM   #7
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109

I wrote something up to answer your question -

http://www.homolog.us/blogs/blog/201...mbly-question/
__________________
http://homolog.us