SEQanswers

Old 04-19-2016, 08:06 AM   #1
XCL
Junior Member
 
Location: Germany

Join Date: Apr 2016
Posts: 4
Combined assembly analysis (short reads + long reads)

Dear NGS Experts,

I have a question about combined genome assembly.

We have 75X HiSeq coverage of an animal genome (about 3 Gb in size) together with 50X coverage from the PacBio Sequel system, and we would now like to run a combined assembly on these data, about 350 Gb in total. Does anybody know of any tools for this kind of analysis?

Many thanks.
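
For anyone sizing compute for a data set like this, here is a minimal sketch of the coverage arithmetic behind the numbers above (yield = coverage x genome size). The function names are made up for illustration, and the 3.0 Gb genome size is only the rounded figure from the post, so treat the output as a rough estimate.

```python
def expected_yield_gb(genome_size_gb, coverages):
    """Expected sequence yield in Gb per platform, plus a 'total' entry.

    Assumes yield = coverage * genome size, ignoring filtering losses.
    """
    yields = {name: cov * genome_size_gb for name, cov in coverages.items()}
    yields["total"] = sum(yields.values())
    return yields


def implied_genome_size_gb(total_yield_gb, coverages):
    """Genome size implied by a total yield and the summed coverage."""
    return total_yield_gb / sum(coverages.values())


# Coverage figures from the post; the genome size is the approximate 3 Gb.
coverages = {"HiSeq": 75.0, "PacBio Sequel": 50.0}

print(expected_yield_gb(3.0, coverages))
print(round(implied_genome_size_gb(350.0, coverages), 2))
```

With exactly 3 Gb this gives 225 Gb of HiSeq data and 150 Gb of PacBio data, 375 Gb in total. The quoted 350 Gb corresponds to a genome of about 2.8 Gb at the same coverages, which is consistent with "about 3 Gb".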
Old 04-19-2016, 08:10 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,696

For reference, cross-posted: https://www.biostars.org/p/187524/
Old 04-19-2016, 08:32 AM   #3
XCL
Junior Member
 
Location: Germany

Join Date: Apr 2016
Posts: 4

Quote:
Originally Posted by GenoMax
For reference, cross-posted: https://www.biostars.org/p/187524/
Hi GenoMax, thanks a lot. Exactly: a tool that can use the short reads to correct errors in the long reads would be great. Such tools should be very useful for combined assembly analysis.
Old 04-19-2016, 08:42 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,696

Did you see this link: https://github.com/PacificBioscience...Bio-Long-Reads pacBioToCA is the tool you want.
Old 04-19-2016, 11:18 AM   #5
XCL
Junior Member
 
Location: Germany

Join Date: Apr 2016
Posts: 4

Quote:
Originally Posted by GenoMax
Did you see this link: https://github.com/PacificBioscience...Bio-Long-Reads pacBioToCA is the tool you want.
Dear GenoMax, thank you very much, that link is quite useful. There is one more complication: the genome has very high heterozygosity. Can the five hybrid assemblers (pacBioToCA, ECTools, SPAdes, Cerulean, dbg2olc) deal with that?
Old 04-19-2016, 11:52 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,696

You are not going to know until you try. Sounds to me like someone is going to stay busy for a while. Hope you have access to some beefy compute resources since this is going to take a lot of RAM etc.
Old 04-19-2016, 08:54 PM   #7
wdecoster
Member
 
Location: Antwerp, Belgium

Join Date: Oct 2015
Posts: 95

The term you're looking for is "hybrid assembly".
Old 04-19-2016, 11:18 PM   #8
XCL
Junior Member
 
Location: Germany

Join Date: Apr 2016
Posts: 4

Quote:
Originally Posted by GenoMax
You are not going to know until you try. Sounds to me like someone is going to stay busy for a while. Hope you have access to some beefy compute resources since this is going to take a lot of RAM etc.
Right, we will try them all. Having enough RAM will always be important, at least until SSD transfer rates reach 5 GB/s or more. Thank you again.
Old 04-20-2016, 03:46 AM   #9
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,696

I would suggest engaging with PacBio tech support early for this challenging project. Dr. Hall (rhall) from PacBio participates on this forum and he may be a good resource for hints.

If you are getting PacBio data from a sequencing provider, be sure to ask for the raw data files (*.h5). These would be needed for some of the tools we have discussed.
Old 04-20-2016, 06:34 AM   #10
fanli
Senior Member
 
Location: California

Join Date: Jul 2014
Posts: 197

There was a thread floating around here somewhere suggesting that you use PacBio reads for assembly and Illumina reads to do error correction, etc. afterwards.

Tags
assembly, genome, illumina, pacbio

All times are GMT -8.