  • Are chimeric reads a problem in de novo assembly?

    Hi, dear all!

    I want to perform de novo assembly with four libraries (insert sizes of 270 bp, 500 bp, 2 kb, and 5 kb). The reads are paired and 150 bp long. After mapping the reads to the reference with BWA, about one third of the reads in the two mate-pair libraries (2 kb and 5 kb) were chimeric. Should I filter these reads out? I found little information by searching Google. Since the short-insert libraries are used to construct contigs, and the reads from the long-insert libraries are then mapped to those contigs to link them, I think the assembly tool could still use chimeric reads to link the contigs. However, my colleague thinks that chimeric reads were rare in previous experiments because the reads were shorter, and that the assembly tool may not be able to deal with them. Furthermore, I think that if I filter out these reads, it would no longer be a true de novo assembly. So, should I filter out the chimeric reads in the mate-pair libraries?

    Also, I designed the 270 bp library because I want to test ALLPATHS-LG. However, according to the paper titled "Genome sequencing and comparison of two nonhuman primate animal models, the cynomolgus and Chinese rhesus macaques", overlapping reads were filtered out when using SOAPdenovo. Should I filter out these reads in order to use SOAPdenovo?

    Any suggestions would be appreciated!
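As a rough illustration of why the 270 bp library yields overlapping pairs, the sketch below computes the expected overlap between the two reads of a pair. It assumes that "insert size" means the full fragment length including both reads; the function name `expected_overlap` is hypothetical and not part of any assembler.

```python
def expected_overlap(fragment_len, read_len):
    """Expected overlap (bp) between the two reads of a pair.

    Assumption: fragment_len is the full fragment length, including both reads.
    Returns 0 when the reads are too short to meet in the middle.
    """
    if fragment_len <= 0 or read_len <= 0:
        raise ValueError("lengths must be positive")
    return max(0, 2 * read_len - fragment_len)


# Library sizes and read length taken from the post above.
for fragment_len in (270, 500):
    print(fragment_len, expected_overlap(fragment_len, 150))
```

Under this assumption, 150 bp reads from the 270 bp library overlap by about 30 bp, while reads from the 500 bp library do not overlap at all.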

    Best wishes!
    lamz138138

  • #2
    Are you sure the reads are chimeric? How did you make that judgement? MP library data requires special handling.

    • #3
      Hi, GenoMax!

      I think these reads are chimeric, based on mapping them to the reference (BWA mem). Take one read pair as an example: read 1 has two hits, with one part of the read mapped to position A and the other part mapped to position B, while read 2 maps to position B. I have also confirmed the mapping results in a genome browser.

      Considering that the fragments produced by re-fragmenting the circularized molecules are about 250-500 bp, and the reads are 150 bp long, chimeric reads are easily produced.
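The check described above (one read with two partial hits) can be approximated directly from BWA mem output. This is a minimal sketch, not the poster's actual procedure: it assumes plain-text SAM input, relies on the `SA:Z:` tag and the supplementary flag (0x800) that BWA mem writes for split alignments, and uses hypothetical helper names (`count_split_reads`, `make_record`) and toy records.

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary (split) alignment


def count_split_reads(sam_lines):
    """Return (split_reads, total_reads), counting each read once.

    A read is counted as split when its primary record carries an SA tag,
    meaning part of the read aligned elsewhere.
    """
    split = 0
    total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary records are counted
        total += 1
        # Optional tags start at the 12th column.
        if any(tag.startswith("SA:Z:") for tag in fields[11:]):
            split += 1
    return split, total


def make_record(name, flag, pos, cigar, tags=()):
    """Build a toy SAM record (hypothetical data, for illustration only)."""
    fields = [name, str(flag), "chr1", str(pos), "60", cigar,
              "=", "0", "0", "*", "*", *tags]
    return "\t".join(fields)


example_sam = [
    "@HD\tVN:1.6",
    # Read 1: primary hit at position A, with an SA tag pointing to position B.
    make_record("pair1", 97, 1000, "90M60S", ["SA:Z:chr1,5000,-,90S60M,60,0;"]),
    # Read 1: supplementary hit at position B.
    make_record("pair1", 2113, 5000, "90H60M", ["SA:Z:chr1,1000,+,90M60S,60,0;"]),
    # Read 2: single hit near position B.
    make_record("pair1", 145, 5100, "150M"),
]

print(count_split_reads(example_sam))  # (1, 2)
```

Dividing the first number by the second gives the split-read fraction for a library, which can be compared with the roughly one-third figure reported in the first post.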
      Last edited by lamz138138; 06-14-2016, 05:56 AM.
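The fragment-length argument in post #3 can be made concrete with a back-of-the-envelope model. The sketch below is a simplification under stated assumptions, not a description of any library protocol: the junction is assumed to fall uniformly at random along the re-fragmented molecule, reads are taken from both ends, and any read covering the junction counts as chimeric regardless of how many bases lie on each side. The function name `junction_crossing_fraction` is hypothetical.

```python
def junction_crossing_fraction(fragment_len, read_len):
    """Fraction of pairs in which at least one read covers the junction.

    Assumptions: junction position is uniform along the fragment, and the
    two reads cover the first and last read_len bases of the fragment.
    """
    if fragment_len <= 0 or read_len <= 0:
        raise ValueError("lengths must be positive")
    covered = min(fragment_len, 2 * read_len)  # bases covered by the pair
    return covered / fragment_len


# Fragment range (250-500 bp) and read length (150 bp) from the post above.
for fragment_len in (250, 400, 500):
    print(fragment_len, junction_crossing_fraction(fragment_len, 150))
```

Under these assumptions the fraction is 1.0 for 250 bp fragments and 0.6 for 500 bp fragments. Observed fractions can be lower, because an aligner only reports a split when enough bases fall on both sides of the junction, but the model agrees with the poster's point that 150 bp reads make junction-crossing reads common.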

      • #4
        Mate-pair libraries are designed to be "chimeric", in the sense that non-contiguous genomic sequences become contiguous (and inverted) during library prep. The link provided by GenoMax illustrates this point. ALLPATHS-LG actually expects the mate-pair reads to have this type of structure.

        But, given that you have a reference genome, what's the rationale for performing de novo vs. reference-guided assembly?
        Last edited by HESmith; 06-14-2016, 06:30 AM.

        • #5
          Hi, HESmith!

          In my opinion, compared with reference-guided assembly, de novo assembly may provide clues about structural variation.

          In fact, I mapped the reads to the reference to confirm that the company had given us the right data and that the mate-pair experiment was successful. Then I found many chimeric reads (the ratio is about 5% in the SOAPdenovo paper). Considering that our reads are longer than those used previously, my colleague thinks the assembly tool cannot deal with this type of read, while I think it would not be a problem.

          In the ALLPATHS-LG manual, I could only find that it needs overlapping reads in the short-insert library. Are you sure that it expects the mate-pair reads to be chimeric?

          Thanks for the reply!

          • #6
            From the ALLPATHS-LG manual:

            "Reads from jumping libraries may be chimeric, that is, they may cross the junction point between the two ends of the insert that occurs in libraries produced using the Illumina sheared library protocol."

            N.B.: jumping libraries = mate-pair libraries

            • #7
              Hi, HESmith!

              Thanks very much, I found it in the manual. I don't know how I missed it...

              Do you have experience with SOAPdenovo? Do you think it can deal with chimeric reads too? And should I use the 270 bp library to construct contigs?
