
  • genome assembly with only mate pair reads

    Hi,

    I am mostly comfortable with DNA resequencing, mRNA-seq, ChIP-seq, and similar data, but I always find de novo assembly difficult. It has come my way anyway.

    I have a mate pair sequencing data set for a ~1 Gb genome. Coverage is close to 30x after linker removal, and the insert size is about 8 kb. I don't think it is a good idea to use mate pairs only (I would rather have libraries of various insert sizes). Without evidence, I suspect a single mate pair library is worse than paired-end data at the same depth. Let me know if I am wrong.

    Now I am asked to get the best out of this data. Without diving in too deep (spending too much time), what are the best (practical) case and the worst case scenarios I should prepare the collaborator for?

    I have access to a 512 GB, 32-core machine with Velvet, SOAPdenovo, and SPAdes installed, plus a CLC bio license that can be moved to that computer. What are the recommended methods, programs, and parameters?

    Very much appreciate your thoughts and suggestions!

    By the way, I did recommend that they sequence (at least) another 50x of paired-end reads at 2x100 to 2x150 bp. But I don't think it is going to fly.

    Thanks!!!
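
For a sense of scale, the extra 50x of paired-end data mentioned above can be turned into a rough read-pair count. This is only a back-of-the-envelope sketch: it assumes the "~1 Gb" genome is exactly 1,000,000,000 bp and that every read is full length, and the function name `read_pairs_needed` is made up for illustration.

```python
import math


def read_pairs_needed(genome_size_bp: int, target_coverage: int, read_length_bp: int) -> int:
    """Read pairs required so that (sequenced bases / genome size) reaches the target coverage."""
    bases_needed = genome_size_bp * target_coverage
    # Each pair contributes two reads of read_length_bp bases.
    return math.ceil(bases_needed / (2 * read_length_bp))


GENOME_SIZE_BP = 1_000_000_000  # assumption: "~1 Gb" taken as exactly 1 Gbp

pairs_100 = read_pairs_needed(GENOME_SIZE_BP, 50, 100)  # 2x100 bp
pairs_150 = read_pairs_needed(GENOME_SIZE_BP, 50, 150)  # 2x150 bp

print(f"2x100: {pairs_100:,} pairs")
print(f"2x150: {pairs_150:,} pairs")
```

Under these assumptions the request works out to roughly 250 million pairs at 2x100 or about 167 million pairs at 2x150, before any quality trimming.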

  • #2
    Hello, I'm a newcomer.



    • #3
      You need to consider several things.
      Is it a plant or animal genome? Do you have a reference?
      How complex is the genome, i.e. ploidy etc.?
      I don't think mate pairs alone can do much. Also, you have just one mate pair library.
      A starting point would be to sequence several paired-end libraries with varying insert sizes (e.g. 180 bp, 300 bp, 600 bp) for the contig-level assembly and later couple them with several mate pair libraries (e.g. 2 kb, 5 kb, 8 kb) for scaffolding. Longer reads, e.g. PacBio, may also help you resolve large repetitive regions.
      You need to carefully plan each stage of your project: sequencing, quality control and error correction of reads, preliminary contig assembly, scaffolding, and gap closing. And of course there is no single best assembler/pipeline for all assembly problems. You need to evaluate multiple assemblers to find the one that gives you the best assembly.
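
One way to see why a mate pair library is suited to scaffolding rather than contig building is to compare sequence coverage (bases actually read) with span coverage (genome spanned by the inserts, sometimes called physical coverage). The sketch below uses the figures from the original post (about 1 Gb genome, about 30x, about 8 kb insert). The read length of 100 bp is an assumption, since the thread does not state it, and the function names are illustrative only.

```python
def sequence_coverage(n_pairs: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average number of times each base is read directly."""
    return 2 * n_pairs * read_length_bp / genome_size_bp


def span_coverage(n_pairs: int, insert_size_bp: int, genome_size_bp: int) -> float:
    """Average number of inserts (read pairs) spanning each position."""
    return n_pairs * insert_size_bp / genome_size_bp


GENOME_SIZE_BP = 1_000_000_000  # assumption: "~1 Gb" taken as exactly 1 Gbp
READ_LENGTH_BP = 100            # assumption: not given in the thread
INSERT_SIZE_BP = 8_000          # "about 8 kb" from the original post
N_PAIRS = 150_000_000           # chosen so sequence coverage comes out at 30x

seq_cov = sequence_coverage(N_PAIRS, READ_LENGTH_BP, GENOME_SIZE_BP)
span_cov = span_coverage(N_PAIRS, INSERT_SIZE_BP, GENOME_SIZE_BP)

print(f"sequence coverage: {seq_cov:.0f}x")
print(f"span coverage:     {span_cov:.0f}x")
```

With these assumed numbers the span coverage is far higher than the sequence coverage, which is why such a library carries plenty of long-range linking information while offering only modest depth for building the contigs themselves.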



      • #4
        Thanks for the reply.

        This is exactly what I thought and what I recommended to the researcher. Unfortunately, I have no control over how the sequencing was designed. But I can refuse to perform the analysis without adequate data :-)



        • #5
          It sounds like a waste of your time. You'll end up with a bad assembly that they probably won't like.
