
  • Merging PE reads taking into account minimum positional quality score

    Hello everyone,
    I have an unusual issue with merging Illumina paired-end reads while controlling the merge using individual base qualities:
    With your preferred merger (say PEAR, FLASH...) you can of course control how the quality score difference between R1 and R2 at a given position is handled, and decide whether the difference is large enough to keep the base with the higher score. OK, that works fine in most cases. Here I have a different situation: I would like to do that and, in addition, if one of the two scores at a given position is below a threshold (say 20), keep the other strand, whatever the difference in score, AS LONG as its own score is > 20. I have not found any PE read merger that does this yet!
    Any (verified) ideas, anyone?

    Just to avoid off-topic comments: 1- Yes, I already thought of soft-masking low-quality bases before merging, but I could not find any merger that uses this information either. 2- No, I cannot just remove the reads with low-quality bases before merging, as I cannot use a strategy based on the % of low-quality reads or on sliding windows: I really want to use the individual per-position quality profile for rare variant calling. 3- Yes, using Illumina error-correction algorithms like DADA2 is an option I will also explore, but I would prefer to first explore the solution I describe, applied during merging.
    Thanks all!
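
    The per-position rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a feature of PEAR, FLASH, or any other merger: the function names are hypothetical, the overlap is assumed to be already aligned (R2 reverse-complemented, its qualities reversed), a score equal to the threshold is treated as passing, and positions where both scores fail (or where two passing bases disagree with equal scores) are emitted as N with quality 0.

```python
def resolve_position(b1, q1, b2, q2, min_q=20):
    """Pick a consensus (base, quality) for one overlapping position.

    Hypothetical helper. Assumptions: q >= min_q counts as passing,
    and unresolved positions become ("N", 0).
    """
    ok1, ok2 = q1 >= min_q, q2 >= min_q
    # Exactly one read passes: keep it, whatever the score difference.
    if ok1 and not ok2:
        return b1, q1
    if ok2 and not ok1:
        return b2, q2
    # Neither read passes: no trustworthy call at this position.
    if not ok1 and not ok2:
        return "N", 0
    # Both pass and agree: keep the base with the higher of the two scores.
    if b1 == b2:
        return b1, max(q1, q2)
    # Both pass but disagree: keep the higher-scoring base, N on a tie.
    if q1 == q2:
        return "N", 0
    return (b1, q1) if q1 > q2 else (b2, q2)


def merge_overlap(seq1, quals1, seq2_rc, quals2_rc, min_q=20):
    """Apply resolve_position across an already aligned overlap.

    seq2_rc / quals2_rc are R2 reverse-complemented and its qualities
    reversed, so index i refers to the same position in both reads.
    """
    if not (len(seq1) == len(quals1) == len(seq2_rc) == len(quals2_rc)):
        raise ValueError("overlap sequences and qualities must have equal length")
    calls = [
        resolve_position(b1, q1, b2, q2, min_q)
        for b1, q1, b2, q2 in zip(seq1, quals1, seq2_rc, quals2_rc)
    ]
    merged_seq = "".join(base for base, _ in calls)
    merged_quals = [qual for _, qual in calls]
    return merged_seq, merged_quals
```

    A real merger would also have to find the overlap and handle the non-overlapping flanks; only the base-selection step is shown here.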

  • #2
    You can do custom trimming based on the different error profiles of R1 and R2 (to take into account the higher chance of erroneous/random data at the end of the R2 read).

    One can use Perl for prototyping a read trimmer, or modify any of the open-source ones (if you have a bit of C/C++ knowledge).
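
    As a rough illustration of such a prototype (in Python rather than Perl), the sketch below trims the 3' end of each read with a separate quality threshold for R1 and R2. The function names and the default thresholds (20 for R1, 25 for R2) are hypothetical placeholders, not recommended values; reads are assumed to be given as a (sequence, list of Phred scores) pair.

```python
def trim_3prime(seq, quals, min_q):
    """Remove trailing bases whose Phred score is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def trim_pair(r1, r2, r1_min_q=20, r2_min_q=25):
    """Trim R1 and R2 independently, with a stricter cutoff for R2.

    r1 and r2 are (sequence, qualities) tuples. The default thresholds
    are illustrative assumptions only.
    """
    return trim_3prime(*r1, r1_min_q), trim_3prime(*r2, r2_min_q)
```

    The stricter R2 threshold is one simple way to encode the different error profiles; a position-dependent cutoff could be substituted in `trim_3prime` without changing the rest.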

    May I ask you a couple of questions:

    1. What read length and platform did you use for your sequencing data acquisition?
    Was it 4-channel or 2-channel imaging? 2-channel imaging has a 2x-10x higher error rate and is limited to 150 bp read length, so use a 4-channel platform.

    2. What was your cluster density? If you use a MiSeq or a HiSeq 2500, I would undercluster to get a lower error rate when looking for rare things: lower cluster density lowers the error rate by 3x-6x (while also lowering raw data yields).

    3. Did you use a PCR-free library prep protocol? What were your average insert size and size distribution?
    Last edited by Markiyan; 06-21-2018, 12:56 AM. Reason: Typo fix.
