  • Paired End Merging

    We are going to undertake some sequencing where our insert size is less than the sum of our read-lengths (paired end overlap).
    We would like to merge our paired end reads to create single "pseudo-reads" with high quality all the way along.
    A quick Google has shown me there is plenty of software available for this:
    Introduction In very simple terms, current sequencing technology begins by breaking up long pieces of DNA into lots more short pieces of ...

    I've had a play with some dummy data using FLASH and PANDA-Seq, and the results seem slightly different between them (different numbers of final reads and different distributions) despite using the same parameters.
    Has anyone done a comparison of these packages before, or have a particular feel for what they think is the best performer?
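
For readers new to the idea, here is a minimal sketch of what "merging" means. It is not the algorithm used by FLASH, PANDA-Seq or fastq-join; the function names, the `min_overlap` and `max_mismatch_frac` parameters and the "keep the higher-quality base" rule are all assumptions made for illustration. It also ignores the case where the insert is shorter than a single read.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=10, max_mismatch_frac=0.05):
    """Naively merge one read pair (hypothetical helper, not a real tool's API).

    fwd/rev are read sequences as they come off the sequencer; fwd_qual and
    rev_qual are lists of integer Phred scores. Returns (sequence, qualities)
    for the merged pseudo-read, or None if no acceptable overlap is found.
    min_overlap is assumed to be at least 1.
    """
    # Bring the reverse read onto the same strand as the forward read.
    rev_rc = reverse_complement(rev)
    rev_rc_qual = rev_qual[::-1]

    # Try the longest possible overlap first, then shrink it.
    longest = min(len(fwd), len(rev_rc))
    for ov in range(longest, min_overlap - 1, -1):
        tail = fwd[-ov:]      # 3' end of the forward read
        head = rev_rc[:ov]    # start of the reverse-complemented reverse read
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if mismatches > max_mismatch_frac * ov:
            continue

        # Inside the overlap keep whichever base has the higher quality.
        seq, qual = [], []
        for i in range(ov):
            q_fwd = fwd_qual[len(fwd) - ov + i]
            q_rev = rev_rc_qual[i]
            if q_fwd >= q_rev:
                seq.append(tail[i])
                qual.append(q_fwd)
            else:
                seq.append(head[i])
                qual.append(q_rev)

        merged_seq = fwd[:-ov] + "".join(seq) + rev_rc[ov:]
        merged_qual = fwd_qual[:-ov] + qual + rev_rc_qual[ov:]
        return merged_seq, merged_qual

    return None
```

Differences in how each tool scores candidate overlaps and resolves mismatches would be one plausible source of the differing read counts, though that is a guess rather than something established in this thread.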

  • #2
    That blog is a great reference. Compared fastq-join, FLASH and read-linker from CD-HIT toolkit. Consistent results but fastq-join came out fastest testing with 5% and 20% error tolerance. Been using it since.



    • #3
      Utility of merging PE reads?

      Are there practical or bioinformatic reasons why merging overlapping reads isn't performed more often? As a 'beginner' to NGS interested in applying it to WG shotgun metagenomics, it would seem the obvious thing to do. I plan to use the EBI metagenomics-InterPro portal, which analyzes non-assembled sequences, but wouldn't it be useful to input combined overlapping PE 'virtual' reads?



      • #4
        One practical reason is that the paired reads never come out with all bases at the same quality, so after trimming poor-quality ends an overlapping 150x150 PE read can become non-overlapping (say 125x140). Therefore most tools are designed to accept the Fwd and Rev for a PE read separately, along with the expected insert size. The insert size, you will discover, also varies. So keeping all these uncertainties in mind, it's better to work with Fwds and Revs separately. If you can overlap some of the reads, more power to you.
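
The arithmetic behind this can be sketched as follows. The 280 bp fragment length is a made-up value chosen so that the untrimmed 150x150 pair overlaps; the function name is hypothetical, and the formula assumes the fragment is at least as long as each read.

```python
def expected_overlap(len_fwd, len_rev, fragment_len):
    """Bases shared by the two reads of a pair.

    Positive: the reads overlap by that many bases.
    Negative: there is a gap of that many unsequenced bases between them.
    Assumes fragment_len >= len_fwd and fragment_len >= len_rev.
    """
    return len_fwd + len_rev - fragment_len


# Hypothetical 280 bp fragment, before and after quality trimming.
untrimmed = expected_overlap(150, 150, 280)   # overlap of 20 bases
trimmed = expected_overlap(125, 140, 280)     # negative: 15-base gap
```

Because the fragment length itself varies from pair to pair, the same trimming can leave some pairs mergeable and others not.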



        • #5
          I use FLASH and it's spot on for >99% of sequences



          • #6
            Originally posted by suryasaha
            One practical reason is that the paired reads never come out with all bases at the same quality, so after trimming poor-quality ends an overlapping 150x150 PE read can become non-overlapping (say 125x140). Therefore most tools are designed to accept the Fwd and Rev for a PE read separately, along with the expected insert size. The insert size, you will discover, also varies. So keeping all these uncertainties in mind, it's better to work with Fwds and Revs separately. If you can overlap some of the reads, more power to you.
            I thought this might be it - a combination of 'poor' sequence quality and the lack of an experimental approach able to isolate fragments of DNA within a tight size range. Isolation from agarose gel slices (whether via a manual approach or something like the Pippin system) and SPRI approaches appear popular - has anyone tried isolating from acrylamide gels rather than agarose as these have greater resolving power?



            • #7
              Originally posted by Coltom
              I thought this might be it - a combination of 'poor' sequence quality and the lack of an experimental approach able to isolate fragments of DNA within a tight size range. Isolation from agarose gel slices (whether via a manual approach or something like the Pippin system) and SPRI approaches appear popular - has anyone tried isolating from acrylamide gels rather than agarose as these have greater resolving power?
              For at least some papers, the agarose method produces remarkably tight library distributions, though when I looked at this, much shorter insert sizes (~120) were common. With the longer reads available now it makes sense to go to longer inserts, and perhaps that extreme precision isn't achievable there.

              Anything that adds a step may be skipped by some. Also, there is always a risk of paired end merging getting things wrong when dealing with repeats in the same size range as the overlap.
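
The repeat problem can be illustrated with a toy example. The sequences below are invented, `candidate_overlaps` is a hypothetical helper rather than part of any merging tool, and only exact matches are considered.

```python
def candidate_overlaps(fwd, rev_rc, min_overlap):
    """List every overlap length at which the 3' end of fwd exactly matches
    the start of rev_rc (the reverse read, already reverse-complemented)."""
    longest = min(len(fwd), len(rev_rc))
    return [ov for ov in range(min_overlap, longest + 1)
            if fwd[-ov:] == rev_rc[:ov]]


# Invented reads whose overlapping ends both fall inside a CAG tandem repeat.
fwd = "ACGTAC" + "CAG" * 6
rev_rc = "CAG" * 6 + "TTGACC"

# Several overlap lengths fit equally well, one per repeat unit.
overlaps = candidate_overlaps(fwd, rev_rc, min_overlap=6)

# Each candidate implies a different merged fragment length.
fragment_lengths = [len(fwd) + len(rev_rc) - ov for ov in overlaps]
```

Here every multiple of the repeat unit from 6 to 18 bases is a perfect match, so a merger has no sequence-based way to tell which fragment length is the true one.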
