  • Quality trimming & filtering Illumina reads

    Hi,

    I have an Illumina MiSeq data set, 32GB size genome, 300bp reads. Quality of the reads degrades towards the 3' end in both R1 & R2, more so in R2. I want to align the reads to their reference using BWA-MEM and then proceed to variant calling using the GATK pipeline.

    I decided to quality-trim the poor-quality bases. I used Trimmomatic with a window size of 5 and an average quality of 20, and discarded reads <70bp. Are these parameters too stringent?

    FastQC reports for raw reads and trimmed reads are attached.

    The paired output from Trimmomatic retained 82% of reads for both R1 & R2. The unpaired outputs were 8% and 1% for R1 & R2, respectively. In this case, is it OK to disregard the unpaired sets in the mapping step?

    Based on my raw data, is it advisable to skip trimming and move straight on to mapping?

    How could I verify that my mapping is satisfactory? Would you recommend any tool to check mapping quality?

    I would appreciate comments on these issues.

    Thanks
    Best Regards
    Rangika

  • #2
    Quality-trimming prior to mapping is not usually a good idea at a high level like Q20. If you want to do quality-trimming, something like Q6 is more appropriate. But that's not necessary unless you are using an aligner that is intolerant of errors. In general, every additional base will improve alignment accuracy. Variant-callers take base quality into consideration and should not make spurious calls from low-quality bases.
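To make the difference between a Q20 and a Q6 threshold concrete, here is a minimal Python sketch of sliding-window quality trimming. It assumes Phred+33 quality strings and scans from the 5' end, cutting at the first window whose average quality falls below the threshold. It is a simplified illustration of the general idea only; it is not Trimmomatic's or BBDuk's actual implementation, and the function names are hypothetical.

```python
def phred33_to_scores(quality_string):
    """Convert a Phred+33 encoded quality string to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_keep_length(scores, window=5, min_avg=20):
    """Return how many bases to keep from the 5' end.

    Scans left to right and cuts at the start of the first window whose
    mean quality is below min_avg (simplified; real tools differ in detail).
    """
    n = len(scores)
    if n < window:
        # Read shorter than one window: judge it as a single window.
        return n if n and sum(scores) / n >= min_avg else 0
    for start in range(n - window + 1):
        if sum(scores[start:start + window]) / window < min_avg:
            return start
    return n


def trim_read(seq, qual, window=5, min_avg=20, min_len=70):
    """Trim one read; return (seq, qual), or None if it ends up too short."""
    keep = sliding_window_keep_length(phred33_to_scores(qual), window, min_avg)
    if keep < min_len:
        return None
    return seq[:keep], qual[:keep]


# Hypothetical 100 bp read: 80 bases at Q40 ('I'), then 20 bases at Q2 ('#').
seq = "A" * 100
qual = "I" * 80 + "#" * 20
strict = trim_read(seq, qual, min_avg=20)   # threshold from the original post
lenient = trim_read(seq, qual, min_avg=6)   # threshold suggested in the reply
print(len(strict[0]), len(lenient[0]))
```

With the stricter threshold the cut happens before the last high-quality bases are reached, so a few good bases are lost along with the bad ones; the lenient threshold keeps all of them.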

    Also, I suggest avoiding Trimmomatic because it needlessly generates multiple output files. The process is much easier if reads are maintained in a paired configuration, which BBDuk will do.

    There's no easy way to check mapping quality for real data (it's easy for synthetic data, though). You can either rigorously test and manually verify mappings, or have faith in the tools you are using.
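For the synthetic-data case, the check can be as simple as comparing each read's known origin with the position the aligner reported. The sketch below assumes the true and mapped positions have already been parsed into dictionaries keyed by read name; the function name, the input layout, and the 5 bp tolerance are all assumptions made for illustration, not part of any particular tool.

```python
def mapping_accuracy(truth, mapped, tolerance=5):
    """Fraction of simulated reads placed at (or near) their true origin.

    truth:  {read_id: (chrom, pos)} from the read simulator
    mapped: {read_id: (chrom, pos)} from the aligner; unmapped reads absent
    A read counts as correct if the chromosome matches and the reported
    position is within `tolerance` bases of the true position.
    """
    if not truth:
        return 0.0
    correct = 0
    for read_id, (true_chrom, true_pos) in truth.items():
        hit = mapped.get(read_id)
        if hit is None:
            continue  # unmapped reads count against accuracy
        chrom, pos = hit
        if chrom == true_chrom and abs(pos - true_pos) <= tolerance:
            correct += 1
    return correct / len(truth)


# Hypothetical toy data: two correct, one misplaced, one unmapped.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 500),
         "r3": ("chr2", 42), "r4": ("chr2", 900)}
mapped = {"r1": ("chr1", 102), "r2": ("chr1", 900), "r3": ("chr2", 42)}
print(mapping_accuracy(truth, mapped))
```

The small tolerance allows for alignments that differ from the simulated origin only because of soft-clipping or indels near the read start.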
    Last edited by Brian Bushnell; 09-09-2016, 08:55 AM.



    • #3
      Thank you Brian. I will proceed & see as you have suggested.



      • #4
        If you are worried about your base qualities (tbh the ends of your R2 are a bit low), bootstrapped BQSR may be useful:


        I'm assuming there isn't a reference SNP database for your organism. Also, no real reason to discard unpaired reads IMO.
