  • Genome assembly - 2 similar samples - one good, one bad

    Hi, I'm currently doing genome assemblies on two very similar samples.

    - One assembled brilliantly and very quickly.
    - The other is highly fragmented.

    Is there any reason why one sample should behave so differently from the other? Both were sequenced on the same platform (Illumina HiSeq), have the same heterozygosity and repeat content, were screened for contaminants, were collected together, give very similar FastQC results, and went through the same assembly methodology with adapter trimming.

    Thoughts:
    - adapters in the middle of reads?
    - could a virus have inserted itself?

    Any comments welcomed.

  • #2
    Hmm.. interesting.

    Some thoughts:
    1. Was one more inbred than the other perhaps?
    2. Did you check the insert sizes of the libraries? Perhaps the mate-pair library for the poorly assembled sample wasn't as good as the other one.
    3. Also, I have seen adapters in the middle of reads. You can quickly check for this if you know the adapter sequence.
    4. If there were a virus, I'd expect its k-mer coverage to be high enough for the assembler (at least the de Bruijn graph-based ones) to screen it out.
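For the adapter check in point 3, a rough sketch of what I mean, in plain Python. It assumes the adapter is the common 13 bp TruSeq prefix (replace `ADAPTER` with whatever your library prep used) and only looks for exact matches, so it will miss adapters that contain sequencing errors. The function names are made up for illustration.

```python
from typing import Dict, Iterable, Tuple

# Assumption: the common 13 bp prefix of Illumina TruSeq adapters.
# Replace with the adapter actually used in your library prep.
ADAPTER = "AGATCGGAAGAGC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_adapter_hits(
    reads: Iterable[Tuple[str, str]], adapter: str = ADAPTER
) -> Dict[str, int]:
    """Map read ID -> leftmost exact match of the adapter or its
    reverse complement. Reads without a match are left out."""
    rc = reverse_complement(adapter)
    hits: Dict[str, int] = {}
    for read_id, seq in reads:
        seq = seq.upper()
        # str.find returns -1 when there is no match
        positions = [p for p in (seq.find(adapter), seq.find(rc)) if p >= 0]
        if positions:
            hits[read_id] = min(positions)
    return hits
```

Comparing the hit positions between the good and the bad sample should show whether adapter sequence is turning up well inside the reads of one sample only.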


    • #3
      Thanks Smurali.

      These are field samples, not inbred. Everything is pointing towards a virus being integrated into the chromosome. Will do some more work on this and if I find out anything useful, will add another post.


      • #4
        Originally posted by Elsie
        Thoughts:
        - adapters in the middle of reads?
        - could a virus have inserted itself?

        Any comments welcomed.

        Even if there were adapters in the middle of reads, they still would have been trimmed. And I don't see why a virus would cause a poor assembly, unless it randomly inserted itself into a different place in every cell. If it inserted itself once and the cell then replicated, you'd still get a good assembly.

        It sounds more like cancer to me (depending on the organism), or degraded DNA. Have you looked at the insert size distribution and the actual error rates of mapped reads (as opposed to just the quality scores)? Also, what are the read length, target insert size, specific Illumina platform (e.g. HiSeq 2500) and run mode, and what kind of organism is it? Diploid or haploid? Etc.
        Last edited by Brian Bushnell; 09-21-2015, 08:19 AM.
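To make the insert size and error rate check concrete, here is a minimal sketch that reads SAM-formatted alignment lines. It assumes the aligner writes the optional `NM` tag (edit distance, so indels are counted along with mismatches) and that properly paired reads carry flag bit 0x2. The function name `sam_summary` is made up for illustration, and the per-base rate is only a rough approximation because it divides by the full read length.

```python
import statistics
from typing import Iterable, Optional, Tuple


def sam_summary(sam_lines: Iterable[str]) -> Tuple[Optional[float], Optional[float]]:
    """Return (median insert size, edits per base) from SAM lines."""
    insert_sizes = []
    edits = 0
    bases = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4 or flag & 0x900:
            continue  # unmapped, secondary, or supplementary
        tlen = int(fields[8])
        # Count each proper pair once, via the mate with positive TLEN
        if flag & 0x2 and tlen > 0:
            insert_sizes.append(tlen)
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                edits += int(tag[5:])
                bases += len(fields[9])
                break
    median = statistics.median(insert_sizes) if insert_sizes else None
    rate = edits / bases if bases else None
    return median, rate
```

Running this on the alignments from both samples (for example, reads mapped back to the good assembly) and comparing the two results would show whether the fragmented sample has a different insert size or a higher observed error rate than its quality scores suggest.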


        • #5
          Thanks for the comments Brian.
          The reads are 100 bp PE, and the samples belong to the order Hymenoptera. Current evidence is pointing towards polydnaviruses.


          • #6
            Wow, this is certainly interesting.
            I can only think of something external that somehow passed through your contamination screening and made it into the sequencing, so a virus looks quite plausible here.
            When we sequenced and assembled a bunch of arthropods before, the final assembly sometimes did have a lot of contamination from Homo sapiens (on blood feeders), plants and viruses, so this is expected.
            However, I am still intrigued by the fact that it is causing the assembly to be so highly fragmented. Are you going to try and assemble after removing the reads belonging to the virus, Elsie?
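If you do try that, one way to drop the viral reads is sketched below. It assumes you already have a set of read IDs flagged as viral (for example, from mapping against a viral reference) and that the two FASTQ files list mates in the same order. The helper names are hypothetical, and records are plain `(header, sequence, quality)` tuples.

```python
from typing import Iterable, List, Set, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def base_id(header: str) -> str:
    """Strip the leading '@', any comment, and a trailing /1 or /2."""
    name = header.lstrip("@").split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def drop_flagged_pairs(
    r1_records: Iterable[Record],
    r2_records: Iterable[Record],
    flagged_ids: Set[str],
) -> Tuple[List[Record], List[Record]]:
    """Keep only pairs whose read ID is not in flagged_ids.

    A pair is removed as a whole so the two files stay in sync.
    """
    kept_r1: List[Record] = []
    kept_r2: List[Record] = []
    for rec1, rec2 in zip(r1_records, r2_records):
        id1, id2 = base_id(rec1[0]), base_id(rec2[0])
        if id1 != id2:
            raise ValueError(f"mate mismatch: {id1} vs {id2}")
        if id1 in flagged_ids:
            continue  # either mate hit the virus, so drop both
        kept_r1.append(rec1)
        kept_r2.append(rec2)
    return kept_r1, kept_r2
```

Reassembling from the kept pairs and comparing contiguity with the original assembly would show whether the viral reads are really what is fragmenting it.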
