  • JonB
    Member
    • Jan 2010
    • 85

    Assembly using Illumina + Nanopore 1D reads?

    Hi,

    I am trying to assemble a eukaryotic genome of about 300 Mb. I have Illumina data, and I am thinking of trying out the MinION Basic Starter Pack to use for scaffolding. But it produces only 1D reads. Can these still be used for scaffolding in combination with Illumina data?

    Thanks,

    Jon
  • cstack
    Member
    • May 2017
    • 16

    #2
    Maybe. What is your Illumina coverage? There are a few scaffolders that should work for that sort of thing. This manuscript has good comparisons between hybrid assemblers using MinION/PacBio data for yeast. Their results might not translate to your work, but it would be a decent place to start.


    • JonB
      Member
      • Jan 2010
      • 85

      #3
      Thanks for the manuscript!
      I'm not exactly sure about the Illumina coverage at the moment, but it's very high at least. Mostly I was concerned about using only 1D because of the error rate, but I don't think it'll be a problem together with the Illumina data. I guess I'll just test it and see.


      • colindaven
        Senior Member
        • Oct 2008
        • 417

        #4
        Good idea. If I were you I'd go for as much Nanopore as I could afford, e.g. 30X, then create an assembly from this alone using Canu. Then I'd correct the assembly using the Nanopore data. In my experience, long reads are always far better than short reads for contiguous assemblies.
        Hybrid - at least in 2016 - was still a bit of a nightmare.


        • JonB
          Member
          • Jan 2010
          • 85

          #5
          Thanks!
          Yes, I was not sure in which order to do things (assembly and correction). But your suggestion is very helpful.


          • cstack
            Member
            • May 2017
            • 16

            #6
            Originally posted by colindaven
            Good idea. If I were you I'd go for as much Nanopore as I could afford, eg 30X, then create an assembly from this alone using Canu..
            We work on a plant species that is difficult to get DNA from, and from our first few flow cells we were getting approx. 3-5 Gbp of 1D reads (using 9.4 chemistry with the standard ligation kit). If you get something similar, then for a 300 Mb genome it would only take 2-3 flow cells to (in theory) reach 30X coverage.
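The flow cell estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the function names are made up for illustration, and it assumes every sequenced base counts toward coverage, which real runs will not quite achieve.

```python
import math

def coverage(total_bases: int, genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size

def flow_cells_needed(target_cov: int, genome_size: int, yield_per_cell: int) -> int:
    """Smallest whole number of flow cells that reaches the target depth."""
    return math.ceil(target_cov * genome_size / yield_per_cell)

GENOME_SIZE = 300_000_000  # ~300 Mb, from the original question

# Yields of 3 and 5 Gbp per flow cell, the range quoted in the post
for yield_gbp in (3, 5):
    cells = flow_cells_needed(30, GENOME_SIZE, yield_gbp * 1_000_000_000)
    print(f"{yield_gbp} Gbp per flow cell: {cells} flow cell(s) for 30X")
```

At the low end of the yield range this gives 3 flow cells and at the high end 2, which agrees with the "2-3 flow cells" figure.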

            Also, I think the typical Canu pipeline has an overlap-based error correction step. I've never looked at the coverage needed for this to be really effective, but I bet 30X would be at the lower end. I agree with colindaven about hybrid assembly: it can get really messy. If you can build some nice scaffolds with ONT data, then you might be able to simply map the Illumina reads to them and call a consensus. I'd be very interested to hear how you or others would approach this!


            • JonB
              Member
              • Jan 2010
              • 85

              #7
              That's also the case for me. I work on an alga and I struggle to get a lot of DNA due to sub-optimal cultures. The DNA I do have is of really high quality, though.

              Thanks for all the suggestions. I'll order the MinION kit and keep you updated on how the assembly goes.


              • apredeus
                Senior Member
                • Jul 2012
                • 151

                #8
                Originally posted by JonB
                Hi,

                I am trying to assemble a eukaryotic genome of about 300 Mb. I have Illumina data, and I am thinking of trying out the MinION Basic Starter Pack to use for scaffolding. But it produces only 1D reads. Can these still be used for scaffolding in combination with Illumina data?

                Thanks,

                Jon
                If you have plenty of computational resources, run several assemblies and then see which one does best. If you are moderately successful with your 1D runs, you'll generate about 10 Gb of data from two flow cells, which is roughly 30X for a 300 Mb genome, just about borderline for which assembler to choose. Should you have really high coverage of long reads, classical long-read assemblers like Canu or the miniasm + racon combo would give you the best results. After that you'd run nanopolish, and then Pilon with your Illumina data, and would probably have a 99.2-99.8% correct assembly.

                If you end up with less Nanopore (10-20X) but lots (100X+) of Illumina, do give MaSuRCA a shot; it should perform best.
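The rule of thumb in this post can be written down as a small decision helper. This is a sketch only: the cut-offs (above 30X, 10-20X, 100X+) are one reading of the numbers mentioned above, not fixed thresholds, and the function name `suggest_strategy` is hypothetical.

```python
from typing import List

def suggest_strategy(nanopore_cov: float, illumina_cov: float) -> List[str]:
    """Return suggested steps, in order, for the given read depths.

    Thresholds are assumptions taken from the figures quoted in the post.
    """
    if nanopore_cov > 30:
        # Plenty of long reads: assemble them alone, then polish twice
        return [
            "assemble long reads alone (Canu, or miniasm + racon)",
            "polish with nanopolish (Nanopore data)",
            "polish with Pilon (Illumina data)",
        ]
    if 10 <= nanopore_cov <= 20 and illumina_cov >= 100:
        # Little Nanopore but deep Illumina: hybrid assembly
        return ["hybrid assembly with MaSuRCA"]
    # Around 30X, or anything else: no clear winner
    return ["borderline: run several assemblers and compare the results"]

for step in suggest_strategy(nanopore_cov=15, illumina_cov=120):
    print(step)
```

Whatever the helper suggests, the first piece of advice still applies: with enough compute, run more than one assembler and compare the results.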
