SEQanswers


JonB 08-08-2017 11:38 PM

Assembly using Illumina + Nanopore 1D reads?
 
Hi,

I am trying to assemble a eukaryotic genome of about 300 Mb. I have Illumina data, and I am thinking of trying out the MinION Basic Starter Pack to use for scaffolding. However, it produces only 1D reads. Can those still be used for scaffolding in combination with Illumina data?

Thanks,

Jon

cstack 08-10-2017 02:18 PM

Maybe. What is your Illumina coverage? There are a few scaffolders that should work for that sort of thing. This manuscript has good comparisons between hybrid assemblers using MinION / PacBio data for yeast. Their results might not translate to your work, but it would be a decent place to start.

JonB 08-10-2017 10:48 PM

Thanks for the manuscript!
I'm not exactly sure about the Illumina coverage at the moment, but it's very high at least. Mostly I was concerned about using only 1D because of the error rate, but I don't think it'll be a problem together with the Illumina data. I guess I'll just test it and see.

colindaven 08-10-2017 11:21 PM

Good idea. If I were you I'd go for as much Nanopore as I could afford, e.g. 30X, then create an assembly from this alone using Canu. Then I'd correct the assembly using the nanopore data. In my experience, long reads are always far better than short reads for contiguous assemblies.
Hybrid - at least in 2016 - was still a bit of a nightmare.

JonB 08-10-2017 11:32 PM

Thanks!
Yes, I was not sure about which order to do things (assembly and correction). But your suggestion is very helpful.

cstack 08-11-2017 07:14 AM

Quote:

Originally Posted by colindaven (Post 210061)
Good idea. If I were you I'd go for as much Nanopore as I could afford, eg 30X, then create an assembly from this alone using Canu..

We work on a plant species that is difficult to get DNA from, and from our first few flow cells we were getting approx. 3-5 Gbp of 1D reads (using 9.4 chemistry with the standard ligation kit). If you get something similar, it would take only 2-3 flow cells to reach 30X coverage (in theory), since 30X of a 300 Mb genome is about 9 Gbp.
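The flow-cell estimate above is simple arithmetic, and a minimal Python sketch makes it easy to redo with your own numbers. The function names are made up for illustration, and the 3-5 Gbp yield per flow cell is just what we saw, so treat it as an assumption rather than a guarantee:

```python
import math

def coverage(total_bases: float, genome_size: float) -> float:
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size

def flow_cells_needed(target_cov: float, genome_size: float,
                      yield_per_cell: float) -> int:
    """Whole flow cells needed to reach target_cov, assuming a constant
    yield per flow cell (real runs vary a lot)."""
    return math.ceil(target_cov * genome_size / yield_per_cell)

GENOME_SIZE = 300e6  # ~300 Mb genome from the original post

# Yields per flow cell are assumptions taken from the 3-5 Gbp range above.
for yield_bp in (3e9, 5e9):
    n = flow_cells_needed(30, GENOME_SIZE, yield_bp)
    print(f"{yield_bp / 1e9:.0f} Gbp per flow cell: {n} flow cell(s) for 30X")
```

Swap in your own genome size and observed yield; the estimate says nothing about read length distribution, which matters just as much for contiguity.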

Also, I think the typical Canu pipeline has an overlap error correction step. I've never looked at the coverage needed for this to be really effective, but I bet 30X would be at the lower end. I agree with colindaven about hybrid assembly -- it can get really messy. If you can build some nice scaffolds with ONT data, then you might be able to simply map the Illumina reads and call a consensus from them. I'd be very interested to hear how you or others would approach this!

JonB 08-11-2017 09:41 AM

That's also the case for me. I work on an alga and I struggle to get a lot of DNA due to sub-optimal cultures. The DNA I do have is of really high quality, though.

Thanks for all the suggestions. I'll order the MinION kit and keep you updated on how the assembly goes.

apredeus 11-03-2017 04:39 AM

Quote:

Originally Posted by JonB (Post 209971)
Hi,

I am trying to assemble a eukaryotic genome of about 300 Mb. I have Illumina data, and I am thinking of trying out the MinION Basic Starter Pack to use for scaffolding. However, it produces only 1D reads. Can those still be used for scaffolding in combination with Illumina data?

Thanks,

Jon

If you've got plenty of computational resources, run several assemblies and then see which one does best. If you are moderately successful with your 1D runs, you'll generate about 10 Gb of data from two flow cells, which is roughly 30X for a 300 Mb genome - just about borderline for which assembler to choose. Should you have really high coverage of long reads, classical long-read assemblers like Canu or the miniasm+racon combo would give you the best results. After that you'd run nanopolish, and then Pilon with your Illumina data, and would probably have a 99.2-99.8% correct assembly.

If you end up with less nanopore (10-20X) but lots (100X+) of Illumina, do give MaSuRCA a shot; it should perform best.
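As a rough summary of the rule of thumb above, here is a small Python sketch that maps coverage to a suggested strategy. The function name and the exact cut-offs (30X, 10-20X, 100X) are assumptions lifted from the approximate figures in this post, not benchmarked thresholds, and the function only returns tool names; it does not run anything:

```python
def suggest_strategy(nanopore_cov: float, illumina_cov: float) -> list[str]:
    """Return an ordered list of suggested steps for the given coverage.

    Thresholds are rough guesses from the forum discussion, not hard rules.
    """
    if nanopore_cov >= 30:
        # High long-read coverage: long-read assembly first, then polishing.
        return [
            "long-read assembly (Canu or miniasm+racon)",
            "nanopolish",
            "Pilon with Illumina reads",
        ]
    if 10 <= nanopore_cov <= 20 and illumina_cov >= 100:
        # Modest long-read coverage plus deep Illumina: hybrid assembly.
        return ["hybrid assembly (MaSuRCA)"]
    # Anything in between is borderline: try several and compare.
    return ["run several assemblies and compare"]

# Example: ~10 Gb of nanopore data on a 300 Mb genome is about 33X.
print(suggest_strategy(nanopore_cov=33, illumina_cov=150))
```

In the borderline range it is probably worth running both routes and comparing contiguity and accuracy before settling on one.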

