
  • genome assembly with only mate pair reads

    Hi,

    I am mostly comfortable with DNA resequencing, mRNA-seq, ChIP-seq, and similar data, but I always find de novo assembly work difficult. It has come my way anyway.

    I have a set of mate pair sequencing data from a ~1 Gb genome. Coverage is close to 30x after linker removal, and the insert size is about 8 kb. I don't think it is a good idea to use mate pairs only (I'd rather have libraries of various sizes). Without evidence, I feel that a single mate pair library is worse than paired-end reads at the same depth. Let me know if I am wrong.

    Now I am asked to get the best out of this data. Without diving in too deep (spending too much time), what are the best (practical) case scenario and the worst case scenario I should prepare the collaborator for?

    I have access to a 512 GB, 32-core machine, and can use Velvet, SOAPdenovo, and SPAdes. I also have a CLC bio license that can be moved to that computer. What are the recommended methods, programs, and parameters?

    Very much appreciate your thoughts and suggestions!

    By the way, I did recommend that they sequence (at least) another 50x of 2x100 to 2x150 reads. But I don't think it is going to fly.

    Thanks!!!
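
As a rough aid for the coverage figures mentioned above, here is a minimal sketch of the usual estimate C = N * L / G (coverage = number of reads times read length divided by genome size). The function name `pairs_needed` is made up for illustration, and the inputs are the approximate numbers from the post (~1 Gb genome, 50x extra coverage, 2x100 or 2x150 reads), not measured values.

```python
def pairs_needed(genome_size_bp, target_coverage, read_length_bp):
    """Read pairs needed so that total sequenced bases = coverage * genome size.

    Assumes every pair contributes 2 * read_length_bp usable bases, which
    ignores trimming, duplicates, and reads lost to quality filtering.
    """
    bases_needed = genome_size_bp * target_coverage
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division, so we never round down below the target.
    return -(-bases_needed // bases_per_pair)


genome_size = 1_000_000_000  # ~1 Gb, as stated in the post
for read_length in (100, 150):
    pairs = pairs_needed(genome_size, 50, read_length)
    print(f"2x{read_length}: about {pairs:,} read pairs for 50x")
```

In practice the real requirement would be somewhat higher than this estimate, because some fraction of reads is always lost during quality control.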

  • #2
    Hello, I'm a newcomer.



    • #3
      You need to consider several things.
      Is it a plant or animal genome? Do you have a reference?
      How complex is the genome, i.e. ploidy etc.?
      I don't think mate pair alone can do much. Also you just have one mate pair library.
      A starting point would be to sequence several paired end libraries with varying insert sizes, e.g. 180bp, 300bp, 600bp etc., for the contig level assembly, and later couple them with several mate pair libraries, e.g. 2kb, 5kb, 8kb etc., for scaffolding. Longer reads, e.g. PacBio, may also help you to resolve large repetitive regions.
      You need to carefully plan each stage of your project: sequencing, quality control and error correction of reads, preliminary contig assembly, scaffolding and gap closing. And of course there is no single best assembler/pipeline for all assembly problems. You need to evaluate multiple assemblers to find the one that gives you the best assembly.
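
One common (though by no means sufficient) way to compare the output of several assemblers is contig or scaffold N50. The sketch below is only an illustration of that metric, assuming you have already extracted a list of sequence lengths from each assembly; the function name `n50` and the example lengths are hypothetical, and the post above does not prescribe any particular metric.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    Returns 0 for an empty input.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths from two assemblies of the same data.
assembly_a = [10, 9, 8, 7, 6, 5, 4, 3, 2]
assembly_b = [20, 5, 5, 5, 5, 5, 5, 4]
print("assembly A N50:", n50(assembly_a))
print("assembly B N50:", n50(assembly_b))
```

N50 rewards contiguity only, so it is best read alongside total assembly size and some check of correctness, since a misassembled genome can still have a high N50.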



      • #4
        Thanks for the reply.

        This is exactly what I thought, and what I recommended to the researcher. Unfortunately I have no control over how the sequencing was designed. But I can refuse to perform the analysis without adequate data :-)



        • #5
          It sounds like a waste of your time. You'll end up with a bad assembly that they probably won't like.
