  • Assembly MiSeq Paired End

    Hi,

    I want to assemble a nematode genome of between 80 and 100 Mb.
    I have paired-end 2x250 bp reads. I have read a lot about assembly on the internet, but it confused me more than it helped me choose the right tool.
    I have read a lot about SPAdes, Velvet, SOAPdenovo, ABySS, ... The problem is that the maximum k-mer size they allow is really short. I read that I should use a k-mer size of more than half the mean read length, and preferably 2/3 of it, which normally comes to 146 here. But none of them lets me do that.

    Do you have any solutions for me, or suggestions for where to continue my research?

    Regards
    PacMan
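
The rule of thumb quoted in the post (k between one half and two thirds of the mean read length) is simple integer arithmetic. Here is a minimal sketch; the function name `k_range` is made up for illustration, and the value 219 is only an inference: it is the mean read length for which two thirds gives the 146 mentioned above (presumably after trimming), not something the post states.

```shell
# Sketch: lower and upper k suggested by the "half to two thirds of the
# mean read length" rule of thumb. k_range is a hypothetical helper name.
k_range() {
    mean_len=$1
    lo=$(( mean_len / 2 ))       # half the mean read length (rounded down)
    hi=$(( mean_len * 2 / 3 ))   # two thirds of the mean read length (rounded down)
    echo "$lo $hi"
}

k_range 250   # raw 2x250 bp reads
k_range 219   # assumed mean length after trimming (inferred, see above)
```

Some assemblers additionally require an odd k, so the bounds may need adjusting by one.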

  • #2
    We've had good results with ABySS on paired-end MiSeq data for fungal genomes about half that size. ABySS can use longer k-mers, but the maximum k-mer size is fixed at build time, so you need to specify it when you first configure ABySS. For example, to configure ABySS with a maximum k-mer size of 128:

    Code:
    ./configure --enable-maxk=128 && make
    It's all described in the ABySS README.


    • #3
      OK, thanks.
      I'm working on a supercomputer with many bioinformatics tools pre-installed. Would it be better to reinstall ABySS myself in my own source directory, with a better (and, above all, known) configuration?
      Otherwise, how can I find out which maximum k-mer size the installed version was configured with, and where is that recorded (for example, in a log file of the configuration)?
      Last edited by PacMan; 07-12-2014, 09:03 AM.


      • #4
        Yes, and ABySS is very easy to set up.

        On our national cluster, we have three separate modules of the latest ABySS, each with a different maxk value. Perhaps your cluster has something similar. Alternatively, you can ask the support team to install an additional ABySS module configured with this large maxk, if you think it could be useful to others as well.

        But yes, installing it for yourself is quickest and most directly under your own control.
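
For a personal install on a shared cluster, one common approach is to point the build at a directory you own. The sketch below only prints the command line so it can be reviewed first. `--enable-maxk` is the option quoted in post #2 and `--prefix` is the standard autoconf install-location option; the prefix path and the helper name `build_cmd` are illustrative assumptions. ABySS's build dependencies may also need to be available, so check the README for your version.

```shell
# Sketch: compose the build command for a user-owned ABySS install.
# PREFIX and MAXK are illustrative defaults, not values from the thread.
PREFIX="${PREFIX:-$HOME/opt/abyss-maxk128}"
MAXK="${MAXK:-128}"

build_cmd() {
    # $1 = install prefix, $2 = maximum k-mer size to compile in
    printf './configure --prefix=%s --enable-maxk=%s && make && make install\n' "$1" "$2"
}

# Print the command; run it from the unpacked ABySS source directory.
build_cmd "$PREFIX" "$MAXK"
```

After installing, putting `$PREFIX/bin` at the front of `PATH` makes sure the personal build is used instead of the pre-installed one.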


        • #5
          And in answer to your question about determining which maximum k-mer size is configured: the only way I know of is to supply an absurdly large value to the ABYSS executable directly (not the abyss-pe wrapper script).

          Code:
          $ ABYSS -k 1000 -o dummy.out dummy.in
          ABYSS: ../Common/Kmer.h:49: static void Kmer::setLength(unsigned int): Assertion `length <= 128' failed.
          Aborted (core dumped)
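
If you want to script the check above, the limit can be pulled out of the assertion text. This is a minimal sketch that assumes the message has the same `length <= N` form as the output shown; the wording may differ between ABySS versions or builds, and `parse_maxk` is a made-up helper name.

```shell
# Sketch: extract the compiled-in maximum k from the assertion message.
# Reads the ABYSS error output on stdin and prints N from "length <= N".
parse_maxk() {
    sed -n 's/.*length <= \([0-9][0-9]*\).*/\1/p'
}

# Demonstration on the message quoted in the post:
sample="ABYSS: ../Common/Kmer.h:49: static void Kmer::setLength(unsigned int): Assertion \`length <= 128' failed."
printf '%s\n' "$sample" | parse_maxk

# Against a real install, the message arrives on stderr, so something like:
#   ABYSS -k 1000 -o dummy.out dummy.in 2>&1 | parse_maxk
```

If nothing is printed, the message did not match the assumed form and the raw output should be inspected by hand.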


          • #6
            Thanks a lot! I'll try it.

            And I'm still open to suggestions for other assemblers.


            • #7
              Originally posted by Benjamin
              I guess that is the right method when its about comparing both the things and moving ahead into the right direction because finally that is the way for it.
              Till the the time its more like a way which doesn't really work on certain things and we can surely get to learn things from them.
              Could you explain a little more what you mean? I'm not sure I understood what you said.


              • #8
                PacMan, that is spam; I've already reported it. It's some sort of bot: check the rest of Benjamin's posts and you'll see they're nonsense.


                • #9
                  I looked at all his posts before replying and found them really weird, but I wasn't sure...


                  • #10
                    As for other assemblers, we also used SPAdes and MIRA to assemble the MiSeq data, but they both took a very long time (days instead of an hour or so). As far as we can tell so far (read mappings, protein alignments, etc.), the ABySS assembly looks very good.


                    • #11
                      I am not sure whether a 100 Mb genome would be too large for SPAdes, but if you are willing, you could certainly try it.

                      SOAPdenovo may be another option to try.


                      • #12
                        Thanks for the suggestions. I think I will run both ABySS and SOAPdenovo, plus Velvet if I have the time and can find a good configuration, and compare them all with QUAST.


                        • #13
                          You could also check Platanus. It uses a multi-k-mer approach and can handle k-mer sizes > 127. We obtain good results with MiSeq 2x300 bp reads.
