
  • Nanopore - circular Assembly

    Hi,

    we successfully sequenced a DNA sample and assembled the genome with Canu.
    We got one circular contig, which is perfect. But the ends of the contig overlap.

    Since nanopore reads are very long, some reads at the end of the contig have the same sequence as reads at the beginning, and vice versa.
    Canu is also reporting that the contig is circular.

    Now we want to fix those reads at the beginning and the end of the contig to get one linear contig without overlaps.

    I cannot find any software to help us there except "circlator". I guess it's the minimus2 function we have to use here, but this function depends on AMOS, which seems to be impossible to install on Ubuntu 18.04.

    Can anyone help us here? Maybe any alternative software to circlator?


    Of course we could always trim the contig manually by finding the end of the contig at the beginning and then trim at this position or use a script which does the same... but...the best code is still the one which has already been written by someone else :-)
    Last edited by iltisanni; 05-11-2018, 04:04 AM.

  • #2
    You can also use BLAST or MUMmer to detect the overlapping ends...

    First you need to determine by how much the ends overlap.
    Then you can save the non-overlapping portion plus a single copy of the overlapping sequence to a file.

    You can detect the overlapping ends of the contig(s) with standalone BLAST by blasting the sequence against itself and using the master-slave alignment output format. Lower the expect value to 1e-50 or less (-e 1e-50) and raise the word size to somewhere between 16 and 64 bp (e.g. -W 32).
    A dotplot or MUMmer alignment of the sequence against itself may also be useful.

    Using the above information you can decide which base range to keep, so that the ends no longer overlap.
    Then open your sequence in Artemis or a similar editor, do select->base range,
    and save the selected base range to a FASTA file: File->Write->Bases of selection->Fasta format.
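
As a rough cross-check on the BLAST or MUMmer result, the self-overlap can also be approximated with an exact seed search. The sketch below is only an illustration: the function name `find_end_overlap_start` and the default seed length are hypothetical, and it assumes the first `seed_len` bases of the contig reappear without any errors near the end, which a nanopore assembly may not satisfy. A real alignment remains the more reliable way to pick the trim point.

```python
def find_end_overlap_start(seq, seed_len=32):
    """Return the 0-based index where the start of the contig reappears
    further along the sequence, or None if the seed is not found again.

    Assumption: the first `seed_len` bases occur exactly (no mismatches
    or indels) at the start of the overlapping end region.
    """
    seq = seq.upper()
    if len(seq) < 2 * seed_len:
        return None
    seed = seq[:seed_len]
    # Search from index 1 so the seed's own position (0) is ignored;
    # rfind returns the right-most hit, i.e. the one closest to the end.
    pos = seq.rfind(seed, 1)
    return pos if pos != -1 else None


def trim_overlap(seq, seed_len=32):
    """Drop the duplicated end so only one copy of the overlap remains."""
    pos = find_end_overlap_start(seq, seed_len)
    return seq if pos is None else seq[:pos]
```

If the seed is found at index `pos`, keeping `seq[:pos]` leaves a single copy of the overlapping region at the start of the contig. It is worth confirming the junction with a dotplot before trusting the result.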



    • #3
      Thank you. You helped me a lot, and your suggestion to align the sequence against itself was right. I found the trimming point and trimmed the FASTA with a simple cat X.fasta | cut -c 1-XXX > trimmed.fasta, after first deleting the header line and then adding it back to trimmed.fasta afterwards.
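
The `cut -c` approach relies on the whole sequence sitting on a single line. A minimal Python sketch of the same trim that also copes with a line-wrapped, single-record FASTA might look like the following; the function name `trim_single_fasta` and the 60-column wrapping are assumptions made for illustration and are not part of any tool mentioned in this thread.

```python
def trim_single_fasta(fasta_text, keep_end, width=60):
    """Keep bases 1..keep_end (1-based, inclusive) of a single-record FASTA.

    The header line is preserved and the sequence is re-wrapped to
    `width` columns.
    """
    lines = fasta_text.strip().splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not start with a FASTA header")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("expected exactly one FASTA record")

    header = lines[0]
    # Join wrapped sequence lines into one string before slicing.
    seq = "".join(line.strip() for line in lines[1:])
    if not 1 <= keep_end <= len(seq):
        raise ValueError("keep_end is outside the sequence")

    trimmed = seq[:keep_end]
    wrapped = "\n".join(
        trimmed[i:i + width] for i in range(0, len(trimmed), width)
    )
    return f"{header}\n{wrapped}\n"


if __name__ == "__main__":
    # File names and the trim position are placeholders.
    with open("X.fasta") as handle:
        result = trim_single_fasta(handle.read(), keep_end=48502)
    with open("trimmed.fasta", "w") as handle:
        handle.write(result)
```

Doing the slice on the joined sequence avoids having to remove and re-add the header by hand.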

      I found this information directly in the canu documentation:


      --->

      An alternative is to run MUMmer to get self-alignments on the contig and use those trim points. For example, assuming the circular element is in tig00000099.fa. Run:

      nucmer -maxmatch -nosimplify tig00000099.fa tig00000099.fa
      show-coords -lrcTH out.delta


      to find the end overlaps in the tig. The output would be something like:

      1 1895 48502 50400 1895 1899 99.37 50400 50400 3.76 3.77 tig00000001 tig00000001
      48502 50400 1 1895 1899 1895 99.37 50400 50400 3.77 3.76 tig00000001 tig00000001

      means trim to 1 to 48502. There is also an alternate writeup.

      <---
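
The trim point can also be read off the `show-coords` output programmatically. The sketch below assumes the column order seen in the two example lines above (S1, E1, S2, E2, LEN1, LEN2, %IDY, LENR, LENQ, COVR, COVQ, reference tag, query tag); the function name `trim_point_from_show_coords` is hypothetical. It returns S2 of the alignment that runs from the first base of the contig to its last base, which matches the "trim to 1 to 48502" reading in the quoted documentation. A strictly non-overlapping cut would end one base earlier, so it may be worth inspecting the junction.

```python
def trim_point_from_show_coords(coords_text):
    """Return the 1-based trim end from `show-coords -lrcTH` self-alignment
    output, or None if no start-to-end overlap is reported.

    Assumed columns: S1 E1 S2 E2 LEN1 LEN2 %IDY LENR LENQ COVR COVQ REF QRY
    """
    for line in coords_text.splitlines():
        fields = line.split()
        if len(fields) < 13:
            continue  # skip blank or malformed lines
        s1, e1, s2, e2 = (int(value) for value in fields[:4])
        len_r, len_q = int(fields[7]), int(fields[8])
        ref_tag, qry_tag = fields[11], fields[12]

        if ref_tag != qry_tag:
            continue  # only self-alignments are of interest
        if s1 == 1 and e1 == len_r and s2 == 1 and e2 == len_q:
            continue  # trivial match of the whole contig against itself
        # Start of the contig aligned to a region reaching the very end.
        if s1 == 1 and e2 == len_q and s2 > e1:
            return s2
    return None
```

With the example output above, this picks the first line and reports 48502 as the last base to keep.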
      Last edited by iltisanni; 05-14-2018, 12:28 AM.



      • #4
        Originally posted by iltisanni View Post
        Hi,

        I cannot find any software to help us there except "circlator". I guess it's the minimus2 function we have to use here, but this function depends on AMOS, which seems to be impossible to install on Ubuntu 18.04.

        Hi, I use Circlator in similar scenarios. I think you can just use the 'normal' Circlator function and not specifically 'minimus2', which indeed is a hassle as far as I remember.

        Code:
        circlator all <assembly.fasta> <corrected_longreads_from_canu.fasta> <output_folder> --threads <nr_of_threads>
        Then I check the output of

        Code:
        04.merge.circularise_details.log
        in the output folder to hopefully see a line like

        Code:
        [merge circularise_details]	scaffold1|size4159270|arrow	Circularized: yes
        Then the file
        Code:
        06.fixstart.fasta
        is the final output file which should have fixed coordinates without overlaps etc. Let me know if this helps.
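
Checking the log by eye works for one contig; for several contigs a short script can summarise it. This is a sketch that assumes every relevant log line has the same three tab-separated fields as the single example line above (a bracketed prefix, the contig name, and a `Circularized: yes/no` message). The function name `circularized_contigs` is hypothetical, and the real log may contain other line types, which are simply skipped here.

```python
def circularized_contigs(log_text):
    """Map contig name -> True/False based on 'Circularized: ...' lines.

    Assumed line format (taken from the example line in this post):
    [merge circularise_details]<TAB>contig_name<TAB>Circularized: yes
    """
    status = {}
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a three-field line
        _prefix, name, message = parts
        if message.startswith("Circularized:"):
            answer = message.split(":", 1)[1].strip().lower()
            status[name] = (answer == "yes")
    return status


if __name__ == "__main__":
    # Path is a placeholder for the log inside the Circlator output folder.
    with open("04.merge.circularise_details.log") as handle:
        for contig, ok in circularized_contigs(handle.read()).items():
            print(contig, "circularized" if ok else "NOT circularized")
```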



        • #5
          Originally posted by Ali May View Post
          Hi, I use Circlator in similar scenarios. I think you can just use the 'normal' Circlator function and not specifically 'minimus2', which indeed is a hassle as far as I remember.
          I'm not sure about the "fixstart" option. We want exactly what is described for the "minimus2" option, but "fixstart" just sets a new starting point at the first dnaA gene it finds. It does not circularize contigs by merging overlapping contigs, if I'm not mistaken...



          • #6
            Originally posted by iltisanni View Post
            I'm not sure about the "fixstart" option. We want exactly what is described for the "minimus2" option, but "fixstart" just sets a new starting point at the first dnaA gene it finds. It does not circularize contigs by merging overlapping contigs, if I'm not mistaken...
            I see, although the option I suggested was 'all', which does include circularisation (https://github.com/sanger-pathogens/...wiki/Task:-all). However, it's true that it also includes the 'fixstart' step, so it may not be ideal in your case, as I understand it.



            • #7
              Originally posted by Ali May View Post
              I see, although the option I suggested was 'all', which does include circularisation (https://github.com/sanger-pathogens/...wiki/Task:-all). However, it's true that it also includes the 'fixstart' step, so it may not be ideal in your case, as I understand it.
              Oh hey.. I just noticed the "merge" function, which is included in the "all" option.

              I guess this does what I want... I will try it. All the other functions that come with the "all" option are not needed in my case.

              The only thing I don't get is whether the "merge" function uses SPAdes for anything. And if SPAdes is used, what for?
              My assembler is Canu because it seems to be the best right now for nanopore reads, so nothing with SPAdes...
              Last edited by iltisanni; 05-14-2018, 05:25 AM.

