  • iltisanni
    Member
    • Mar 2017
    • 21

    Nanopore - circular Assembly

    Hi,

    we successfully sequenced a DNA sample and assembled the genome with canu.
    We got a single circular contig, which is perfect, but the ends of the contig overlap.

    Since nanopore reads are very long, some reads at the end of the contig have the same sequence as reads at the beginning, and vice versa.
    Canu is also reporting that the contig is circular.

    Now we want to trim the overlapping sequence at the beginning and the end of the contig to get one linear contig without overlaps.

    I cannot find any software to help us there except "circlator". I guess it's the minimus2 function we have to use here, but this function depends on AMOS, which seems to be impossible to install on Ubuntu 18.04.

    Can anyone help us here? Is there any alternative software to circlator?


    Of course we could always trim the contig manually by finding where the sequence at the end of the contig matches the beginning and trimming at that position, or use a script that does the same... but the best code is still the code that someone else has already written :-)
    Last edited by iltisanni; 05-11-2018, 04:04 AM.
  • Markiyan
    Senior Member
    • Sep 2010
    • 126

    #2
    You can also use blast or mummer to detect the overlapping ends...

    First you need to detect by how much the ends overlap.
    Then you can save the non-overlapping portion plus a single copy of the overlapping sequence to a file.

    You can detect overlapping ends of the contig(s) with standalone blast, using the master-slave alignment output format (blast the sequence against itself). Lower the expect value to 1e-50 or less (-e 1e-50) and crank up the word size to 16 - 64 bp (e.g. -W 32).
    A dotplot/mummer alignment of the sequence against itself may also be useful.

    Using the above info you can decide which base range to keep, so that the ends no longer overlap.
    Then open your sequence in Artemis or a similar editor, do select->base range,
    and save the selected base range to a fasta file: File->Write->Bases of selection->Fasta format.
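
A rough sketch of the overlap-detection idea above, written in Python instead of blast or mummer. It takes short seeds from the start of the contig and looks for an exact copy in the second half of the sequence. The function name `find_overlap_start`, the seed spacing, and the exact-match shortcut are assumptions made for illustration. In a noisy nanopore assembly an exact seed may not match at all, so a real self-alignment (blast, nucmer) is likely the more reliable route.

```python
import random


def find_overlap_start(seq, k=32, max_offset=1000, step=50):
    """Return the 0-based index where a copy of the contig start begins
    near the contig end, or None if no seed matches.

    Hypothetical helper: it uses exact k-mer seeds, so it tolerates no
    mismatch inside a seed and assumes no indels ahead of the seed.
    """
    seq = seq.upper()
    half = len(seq) // 2
    for offset in range(0, max_offset, step):
        seed = seq[offset:offset + k]
        if len(seed) < k:
            break
        # Search only the second half so the seed cannot match itself.
        hit = seq.find(seed, max(half, offset + 1))
        if hit != -1:
            # Shift back by the seed offset to get the copy of base 0.
            return hit - offset
    return None


if __name__ == "__main__":
    # Toy contig: 500 random bases followed by a copy of its first 60 bases.
    rng = random.Random(0)
    core = "".join(rng.choice("ACGT") for _ in range(500))
    contig = core + core[:60]
    start = find_overlap_start(contig, k=16)
    print("overlap starts at index", start)
    print("non-overlapping part:", len(contig[:start]), "bases")
```

Slicing the sequence as `seq[:start]` keeps the non-overlapping portion plus one copy of the overlap region, which is the base range described above.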

    Comment

    • iltisanni
      Member
      • Mar 2017
      • 21

      #3
      Thank you. You helped me a lot, and your suggestion to align the sequence against itself was right. I found the trimming point and trimmed the fasta with a simple cat X.fasta | cut -c 1-XXX > trimmed.fasta, after first deleting the header line and then adding it back to trimmed.fasta afterwards.
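
For reference, the same trim can be sketched in Python. Note that `cut -c` works on each line separately, so the shell one-liner only gives the intended result when the whole sequence sits on a single line. The helper below is a minimal sketch under that concern: the name `trim_fasta` and its parameters are made up for illustration, it handles a single-record FASTA only, and it keeps the header so no manual delete/re-insert step is needed.

```python
def trim_fasta(text, keep, width=60):
    """Keep the first `keep` bases of a single-record FASTA string.

    Hypothetical helper: the header line is preserved and the trimmed
    sequence is re-wrapped to `width` characters per line.
    """
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("expected exactly one FASTA record")
    header = lines[0]
    # Join wrapped sequence lines so the cut position counts bases, not columns.
    sequence = "".join(line.strip() for line in lines[1:])
    trimmed = sequence[:keep]
    wrapped = "\n".join(
        trimmed[i:i + width] for i in range(0, len(trimmed), width)
    )
    return f"{header}\n{wrapped}\n"


if __name__ == "__main__":
    example = ">tig_example\nACGTACGTAC\nGGTTACGTAC\n"
    # Keep the first 12 bases and wrap at 10 columns.
    print(trim_fasta(example, keep=12, width=10), end="")
```

To use it on files, read X.fasta into a string, call the function with the trim position found from the self-alignment, and write the result to trimmed.fasta.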

      I found this information directly in the canu documentation:


      --->

      An alternative is to run MUMmer to get self-alignments on the contig and use those trim points. For example, assuming the circular element is in tig00000099.fa. Run:

      nucmer -maxmatch -nosimplify tig00000099.fa tig00000099.fa
      show-coords -lrcTH out.delta


      to find the end overlaps in the tig. The output would be something like:

      1 1895 48502 50400 1895 1899 99.37 50400 50400 3.76 3.77 tig00000001 tig00000001
      48502 50400 1 1895 1899 1895 99.37 50400 50400 3.77 3.76 tig00000001 tig00000001

      means trim to 1 to 48502. There is also an alternate writeup.

      <---
      Last edited by iltisanni; 05-14-2018, 12:28 AM.
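
The trim point can also be read from the show-coords rows programmatically. The sketch below assumes the 13-column layout seen in the quoted output (S1, E1, S2, E2, LEN1, LEN2, %IDY, LENR, LENQ, COVR, COVQ, reference tag, query tag). The function name `trim_point_from_coords` is hypothetical. It returns the S2 value of the row that links base 1 to the last base of the contig, which is the number the quoted documentation trims to.

```python
def trim_point_from_coords(rows):
    """Return S2 of the self-alignment joining contig start to contig end.

    Assumed column order (from `show-coords -lrcTH`, as quoted above):
    S1 E1 S2 E2 LEN1 LEN2 %IDY LENR LENQ COVR COVQ REF QRY
    Returns None when no such row exists.
    """
    for row in rows:
        fields = row.split()
        if len(fields) != 13:
            continue
        s1, e1, s2, e2 = (int(x) for x in fields[:4])
        len_q = int(fields[8])
        same_tig = fields[11] == fields[12]
        # The match must start at base 1 and its partner must run to the
        # last base; s2 > e1 skips a full-length match of the tig to itself.
        if same_tig and s1 == 1 and e2 == len_q and s2 > e1:
            return s2
    return None


if __name__ == "__main__":
    rows = [
        "1 1895 48502 50400 1895 1899 99.37 50400 50400 3.76 3.77 "
        "tig00000001 tig00000001",
        "48502 50400 1 1895 1899 1895 99.37 50400 50400 3.77 3.76 "
        "tig00000001 tig00000001",
    ]
    print(trim_point_from_coords(rows))
```

Whether to keep that last base or stop one base earlier is a detail worth checking against the alignment itself; the sketch simply reports the coordinate.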

      Comment

      • Ali May
        Member
        • Aug 2016
        • 13

        #4
        Originally posted by iltisanni View Post
        Hi,

        I cannot find any software to help us there except "circlator". I guess it's the minimus2 function we have to use here, but this function depends on AMOS, which seems to be impossible to install on Ubuntu 18.04.

        Hi, I use Circlator in similar scenarios. I think you can just use the 'normal' Circlator function and not specifically 'minimus2', which indeed is a hassle as far as I remember.

        Code:
        circlator all <assembly.fasta> <corrected_longreads_from_canu.fasta> <output_folder> --threads <nr_of_threads>
        Then I check the output of

        Code:
        04.merge.circularise_details.log
        in the output folder to hopefully see a line like

        Code:
        [merge circularise_details]	scaffold1|size4159270|arrow	Circularized: yes
        Then the file
        Code:
        06.fixstart.fasta
        is the final output file which should have fixed coordinates without overlaps etc. Let me know if this helps.
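
Checking the log by eye is fine for one contig; for several, a small script may help. The sketch below assumes the log lines are tab-separated and shaped like the one quoted above, with the contig name in the second field and the status in the last. The function name `circularized_contigs` is hypothetical, and the `Circularized: no` form used in the demo is an assumption, since only the `yes` form appears in this thread.

```python
def circularized_contigs(log_lines):
    """Map contig name -> True/False from 'Circularized: yes/no' lines.

    Assumes tab-separated lines such as:
    [merge circularise_details]<TAB>contig_name<TAB>Circularized: yes
    Lines that do not fit this shape are ignored.
    """
    status = {}
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[-1].startswith("Circularized:"):
            continue
        answer = fields[-1].split(":", 1)[1].strip()
        status[fields[1]] = answer == "yes"
    return status


if __name__ == "__main__":
    demo = [
        "[merge circularise_details]\tscaffold1|size4159270|arrow\tCircularized: yes\n",
        # Assumed form of a negative result, for illustration only.
        "[merge circularise_details]\tscaffold2\tCircularized: no\n",
    ]
    for name, ok in circularized_contigs(demo).items():
        print(name, "circularized" if ok else "not circularized")
```

In practice the lines would come from opening 04.merge.circularise_details.log in the output folder and passing the file object to the function.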

        Comment

        • iltisanni
          Member
          • Mar 2017
          • 21

          #5
          Originally posted by Ali May View Post
          Hi, I use Circlator in similar scenarios. I think you can just use the 'normal' Circlator function and not specifically 'minimus2', which indeed is a hassle as far as I remember.
          I'm not sure about the "fixstart" option. We want exactly what is described for the "minimus2" option, but "fixstart" just sets a new starting point at the first dnaA gene it finds. It does not circularize contigs by merging overlapping contigs, if I'm not mistaken...

          Comment

          • Ali May
            Member
            • Aug 2016
            • 13

            #6
            Originally posted by iltisanni View Post
            I'm not sure about the "fixstart" option. We want exactly what is described for the "minimus2" option, but "fixstart" just sets a new starting point at the first dnaA gene it finds. It does not circularize contigs by merging overlapping contigs, if I'm not mistaken...
            I see, although the option I suggested was 'all', which does include circularisation (https://github.com/sanger-pathogens/...wiki/Task:-all). However, it's true that it also includes the 'fixstart' option, so it is not ideal in your case, as I understand it.

            Comment

            • iltisanni
              Member
              • Mar 2017
              • 21

              #7
              Originally posted by Ali May View Post
              I see, although the option I suggested was 'all', which does include circularisation (https://github.com/sanger-pathogens/...wiki/Task:-all). However, it's true that it also includes the 'fixstart' option, so it is not ideal in your case, as I understand it.
              Oh hey... I just noticed the "merge" function, which is included in the "all" option.

              I guess this does what I want... I will try it. None of the other functions that come with the "all" option are needed in my case.

              The only thing I don't get is whether the "merge" function uses spades for anything and, if it does, for what?
              My assembler is canu, because it seems to be the best right now for nanopore reads, so nothing with spades...
              Last edited by iltisanni; 05-14-2018, 05:25 AM.

              Comment
