  • Do we still need to assemble a genome?

    Hi,

    This may sound like a naive question, but I have been trying to come up with answers for a couple days and haven't yet been able to. Thank you in advance for any input.

    Now that we can detect human sequence variations (SNPs, indels, structural variants, etc.) from the paired-end reads alone, I wonder if there is still a need to assemble the original sequence. Wasn't the point to detect the variations?

    Nor do we need the assembled sequence to find gene positions in the new genome: its paired-end reads can be mapped back to the human reference genome, and the gene positions can be read from there.

    So, aside from saving space (100-something GB vs. 3 GB) and analysis time, do we really need to assemble every new human genome that has been resequenced?

    Thank you!

  • #2
    We know Beijing, New York, Paris, London, Tokyo... , but we still need a World map.


    • #3
      No no, please don't get me wrong here. Let me clarify a bit.

      I understand we still need to sequence personalized genomes. However, the question is once we get the reads, do we need to assemble them?

      My thinking is that the need for a fully assembled sequence arose because we need to know how this particular sequence varies from the human reference genome (hg18, for example). But programs like the SOAP package can already locate variations from the reads alone. Everything else is assumed to be identical to the reference genome we use.

      So why bother assembling once we have the reads? Can't we get all the information we need from the reads alone?


      • #4
        I would venture to say that most of the interest in assembly is in de novo assembly of novel organisms.

        I believe the number of organisms that have been fully sequenced is still in the low hundreds.
        --
        Jeremy Leipzig
        Bioinformatics Programmer


        • #5
          Take a look at the recent pan genome paper (not to be confused with the Pan genome paper :-). There may be significant portions of the human genome that are not yet represented in any genome database because they are structural variants restricted to populations not yet sampled.

          Full-scale de novo sequencing may not always be necessary; some sort of intelligent local reassembly may be enough, i.e. reassembling everything that doesn't map and then integrating it with what does.


          • #6
            For reliable detection of variations you need fairly high coverage (at least 10X, and more is better), so you need to assemble the multiple reads to determine the coverage. Low-coverage regions give less certainty that a variation is real; high coverage gives more confidence (obviously).
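
To make the coverage point concrete, here is a minimal sketch of counting per-position read depth and flagging positions below a threshold. It assumes the reads have already been placed on a region and are given as 0-based, half-open (start, end) intervals; the function names and the toy intervals are hypothetical, and the 10X default simply echoes the figure mentioned above.

```python
def per_base_coverage(alignments, region_length):
    """Return a list with the number of reads overlapping each position.

    alignments: iterable of (start, end) intervals, 0-based and half-open,
    i.e. a read covers positions start .. end - 1 (an assumed convention).
    """
    depth = [0] * region_length
    for start, end in alignments:
        # Clip each read to the region so out-of-range coordinates are ignored.
        for pos in range(max(start, 0), min(end, region_length)):
            depth[pos] += 1
    return depth


def low_coverage_positions(depth, min_depth=10):
    """Return the positions whose depth falls below min_depth."""
    return [pos for pos, d in enumerate(depth) if d < min_depth]


# Toy example: three overlapping reads on a 10-base region.
toy_alignments = [(0, 5), (2, 8), (4, 10)]
toy_depth = per_base_coverage(toy_alignments, region_length=10)
toy_low = low_coverage_positions(toy_depth, min_depth=2)
```

With real data the depth would come from an alignment file rather than a hand-written list, but the idea is the same: a variant call at a position with depth 1 is supported by a single read, while one at depth 30 is supported by many.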


            • #7
              I guess what you guys are saying is that, to detect the variations specific to the individual whose genome is being sequenced, we have to assemble the reads anyway. OK, I agree with that.

              But do we need the finished personalized human genome sequence? (Assuming that all the variations have already been detected during genome assembly.)

              Thanks for any input.
