SEQanswers

12-02-2015, 01:19 PM   #1
nwfungi
Junior Member
Location: Fargo ND
Join Date: Nov 2015
Posts: 5
Fungal Genome Assembly... a painful experience

Hi all, and thanks in advance for any insight you have.

I am trying to assemble a fungal genome of approximately 42 Mb. I have reads from an Ion Torrent run and some old data our lab had from a PacBio sequencing run, in FASTQ format.

I was wondering what would be a good assembly program to start with. I have heard that SPAdes is an option, though the genome is larger than a bacterial one. Falcon has also been suggested, along with PBcR.

Any hints and help would be extremely appreciated.
12-02-2015, 02:19 PM   #2
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 7,127

Have you tried Falcon, as was suggested in a previous thread you had started? SPAdes is not going to work for this size genome.

BTW: How much PacBio data do you have, and are the reads clean and long? If you don't have much good data, this is going to be a difficult task.

If you have a relatively close genome available, then going the mapping route first may be useful. It will help you understand the quality of your data.
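
For a rough first answer to the "how much data" question, a sketch along these lines may help. It is only an illustration: it assumes a plain 4-line-per-record FASTQ (no wrapped sequence lines), takes the ~42 Mb genome size from the first post, and the function names `fastq_lengths` and `summarize` are made up for this example.

```python
import io

GENOME_SIZE = 42_000_000  # ~42 Mb, taken from the original post


def fastq_lengths(handle):
    """Yield the sequence length of each record.

    Assumes exactly four lines per record (header, sequence, '+', qualities).
    """
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            yield len(line.strip())


def summarize(lengths, genome_size):
    """Return read count, total bases, mean/longest read and fold coverage."""
    lengths = list(lengths)
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "longest": max(lengths, default=0),
        "coverage": total / genome_size,  # total bases / genome size
    }


# Tiny in-memory FASTQ standing in for a real file (hypothetical reads).
example = io.StringIO("@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGT\n+\nIIII\n")
stats = summarize(fastq_lengths(example), GENOME_SIZE)
print(stats)
```

For a real file, the `io.StringIO` object would be replaced by `open("reads.fastq")`, where the file name is a placeholder.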
12-03-2015, 03:33 AM   #3
akorobeynikov
Member
Location: Saint Petersburg, Russia
Join Date: Sep 2013
Posts: 25

Quote:
Originally Posted by GenoMax View Post
Have you tried Falcon, as was suggested in a previous thread you had started? SPAdes is not going to work for this size genome.
This is certainly not true.

Quote:
Originally Posted by nwfungi View Post
Any hints and help would be extremely appreciated.
I'd suggest giving SPAdes a try. If uncertain, just contact SPAdes support for advice.
12-03-2015, 04:14 AM   #4
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 7,127

Quote:
Originally Posted by akorobeynikov View Post
This is certainly not true.

I'd suggest giving SPAdes a try. If uncertain, just contact SPAdes support for advice.
Has something changed in the recent past?

We have seen the recommendation that SPAdes was designed for "standard isolates and single-cell MDA bacteria assemblies". There is no data on the SPAdes site that shows successful assemblies of larger genomes or genomes with multiple chromosomes.

Can you provide some guidance on the RAM requirements for a genome this size? (It would depend to some extent on how much data @nwfungi has.)

Last edited by GenoMax; 12-03-2015 at 06:22 AM.
12-03-2015, 07:30 AM   #5
pmiguel
Senior Member
Location: Purdue University, West Lafayette, Indiana
Join Date: Aug 2008
Posts: 2,315

ABySS-PE seems to love assembling compact fungal genomes. That said, we typically do PCR-free PE libraries plus a cheap mate-pair library, using only Illumina reads. What I remember getting is an N50 above 1 megabase and about 1000 scaffolds longer than 1 kb in total.
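
For anyone unfamiliar with the N50 figure quoted here, this is a minimal sketch of how it is usually computed from a list of scaffold lengths. The function name `n50` and the example lengths are made up for illustration and are not from any real assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Hypothetical scaffold lengths: total 300, so half is 150.
scaffolds = [100, 80, 60, 40, 20]
print(n50(scaffolds))  # prints 80 (100 + 80 = 180 >= 150)
```

Assembly reports usually compute this only over scaffolds above some minimum length (for example 1 kb), so numbers from different tools are comparable only when the same cutoff is used.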

I haven't even seen an Ion Torrent data set, though, so I have no idea how that will work.

--
Phillip
12-03-2015, 12:01 PM   #6
Bukowski
Senior Member
Location: UK
Join Date: Jan 2010
Posts: 390

Quote:
Originally Posted by GenoMax View Post
Have you tried Falcon, as was suggested in a previous thread you had started? SPAdes is not going to work for this size genome.

BTW: How much PacBio data do you have, and are the reads clean and long? If you don't have much good data, this is going to be a difficult task.
The PacBio coverage isn't the only critical question here (although the OP would do well to answer it!). The OP says the PacBio reads are in FASTQ files. I'm not sure about Falcon, but I'm pretty sure HGAP.3 requires bax/bas.h5 files as assembly input.
12-03-2015, 01:18 PM   #7
AntonioRFranco
Member
Location: Cordoba, Spain
Join Date: Feb 2013
Posts: 21

Take a look at the MaSuRCA assembler:
http://www.ncbi.nlm.nih.gov/pubmed/23990416
12-04-2015, 08:05 AM   #8
nwfungi
Junior Member
Location: Fargo ND
Join Date: Nov 2015
Posts: 5

Thanks for all the information, and apologies for the slow response.

As for the PacBio data I have, it is in FASTQ format. We are trying to get the original files, but the sequencing was done far enough in the past that this may not be possible. The quality is questionable and the coverage is low at best. So far I've only run FastQC on it, and nothing was flagged. We are having a more comprehensive PacBio sequencing effort done as we speak. In the meantime, I was interested in seeing whether the small amount of PacBio data I have could be used to create a slightly more useful assembly and get a process hammered out.
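
As a complement to the FastQC check, one could look at the error rate implied by each read's quality string. The sketch below is only an illustration under stated assumptions: it assumes Phred+33 encoding and 4-line FASTQ records, the helper names are made up, and it says nothing about whether the quality values in an old PacBio FASTQ are actually meaningful.

```python
import io


def mean_error_probability(quality_string, offset=33):
    """Average per-base error probability implied by a Phred quality string.

    Each character encodes Q = ord(char) - offset, and P(error) = 10^(-Q/10).
    """
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(c) - offset) / 10) for c in quality_string]
    return sum(probs) / len(probs)


def per_read_error(handle, offset=33):
    """Yield (read_name, mean error probability) for 4-line FASTQ records."""
    name = None
    for i, line in enumerate(handle):
        line = line.rstrip("\n")
        if i % 4 == 0:
            name = line[1:]  # drop the leading '@'
        elif i % 4 == 3:
            yield name, mean_error_probability(line, offset)


# Two hypothetical reads: one all Q40 ('I'), one all Q10 ('+').
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n++++\n")
results = list(per_read_error(example))
for read_name, err in results:
    print(read_name, err)
```

Averaging error probabilities rather than raw Q scores avoids overstating quality, since Q values are on a log scale.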
Tags
assembly, pacbio, spades
