  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    As GenoMax says, trimming to Q30 is not beneficial before merging reads. BBMerge has some internal quality-trimming options, so it can try to merge, then quality-trim if it is unsuccessful, then try to merge again, etc. That can slightly increase the merge rate. But typically I just use the whole untrimmed reads as input. The longer the input reads are, the less likely it is for BBMerge to make an accidental incorrect merge, and it does take quality scores into account, so I do not recommend quality-trimming prior to BBMerge. Adapter-trimming is fine though.
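
A minimal sketch of the approach described above, assuming the `qtrim2` and `trimq` options behave as listed in BBMerge's usage text (please check the help output of your installed version). The file names and the `run` helper are hypothetical; the helper prints each command and only executes it when the tool and the input file are actually present.

```shell
#!/bin/sh
# Hypothetical input names; replace with your own files.
R1=R1.fastq.gz
R2=R2.fastq.gz

# Print the command; execute it only if the tool is installed and the input exists.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1 && [ -f "$R1" ]; then "$@"; fi
}

# Merge the untrimmed pairs. qtrim2=r with an ascending trimq list is assumed to
# quality-trim and retry only those pairs that failed to merge.
run bbmerge.sh in1="$R1" in2="$R2" out=merged.fq \
  outu1=unmerged_R1.fq outu2=unmerged_R2.fq qtrim2=r trimq=10,15,20
```

Dropping `qtrim2` and `trimq` gives the plain untrimmed merge, which is the default recommendation in the post above.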

    Comment

    • finswimmer
      Member
      • Oct 2016
      • 60

      Hello Brian,

      Originally posted by Brian Bushnell View Post
      Adapter-trimming is fine though.
      Do you recommend adapter trimming before running BBMerge? I thought that if I provide the adapter sequence to BBMerge, it can more easily find the pairs that overlap completely.

      fin swimmer

      Comment

      • silask
        Junior Member
        • Oct 2017
        • 9

        Merge pairs before normalisation?

        Hello, I'm building a pipeline for metagenomics.

        I follow the bb tools user guide and do:
        - normalization with bbnorm
        - error correction with tadpole
        - merge (with extension) with bbmerge

        I want to increase the merge rate to get a better assembly.
        I suspect that many reads that could be merged are thrown away during normalisation.

        Wouldn't it be better to merge (without extension) first, then take primarily the merged reads, normalise, error-correct, and merge with extension?

        What is the best way to normalise paired-end reads together with merged pairs or singletons in bbnorm?
        For now I do two rounds of bbnorm and supply the other reads via the `extra` parameter. Is there a better way to do this?

        Comment

        • chloe1005
          Junior Member
          • Oct 2017
          • 7

          Hi,

          I have shotgun data: paired-end reads, 100 bp at each end. Next I want to run MetaPhlAn2 to get a general taxonomic profile.

          So I am considering merging them before running MetaPhlAn2. However, I do not know whether I should run bbmap first for quality control or run bbmerge first to merge the reads. Any suggestions?

          Thanks in advance

          Comment

          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            @chloe - It's normally simplest and most effective to do QC first on the raw data, then anything else (such as merging) later.
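
One way to read "QC first, then merging" is adapter-trimming with BBDuk followed by BBMerge. This is only a sketch under assumptions: the file names, the `adapters.fa` path, and the `run` helper are hypothetical, and the BBDuk settings are the commonly documented adapter-trimming ones rather than anything prescribed in this thread. Quality-trimming is left out on purpose, in line with the advice earlier in the thread.

```shell
#!/bin/sh
# Hypothetical input names and adapter file; replace with your own.
R1=R1.fastq.gz
R2=R2.fastq.gz
ADAPTERS=adapters.fa

# Print the command; execute it only if the tool is installed and the input exists.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1 && [ -f "$R1" ]; then "$@"; fi
}

# Step 1: adapter-trim both mates in one call so the pairs stay in sync.
run bbduk.sh in1="$R1" in2="$R2" out1=trimmed_R1.fq out2=trimmed_R2.fq \
  ref="$ADAPTERS" ktrim=r k=23 mink=11 hdist=1 tpe tbo

# Step 2: merge the adapter-trimmed pairs; unmerged pairs go to outu1/outu2.
run bbmerge.sh in1=trimmed_R1.fq in2=trimmed_R2.fq out=merged.fq \
  outu1=unmerged_R1.fq outu2=unmerged_R2.fq
```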

            @silask - the way you are doing it is currently the most effective way. It's a little bit annoying to have to run BBNorm twice, but that's the only way to process both paired and unpaired reads.
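
The two-round BBNorm setup discussed here could look roughly like the sketch below. Assumptions: the file names and the `run` helper are hypothetical, `target=100 min=5` are example values rather than a recommendation, and `extra` is assumed to accept a comma-separated list of files that contribute to the k-mer counts without being written to the output.

```shell
#!/bin/sh
# Hypothetical file names; replace with your own.
R1=paired_R1.fq
R2=paired_R2.fq
MERGED=merged.fq

# Print the command; execute it only if the tool is installed and the input exists.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1 && [ -f "$R1" ]; then "$@"; fi
}

# Round 1: normalise the paired reads; the merged reads only inform the depth estimate.
run bbnorm.sh in="$R1" in2="$R2" extra="$MERGED" \
  out=norm_R1.fq out2=norm_R2.fq target=100 min=5

# Round 2: normalise the merged (unpaired) reads; the pairs only inform the depth estimate.
run bbnorm.sh in="$MERGED" extra="$R1,$R2" \
  out=norm_merged.fq target=100 min=5
```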

            Comment

            • chloe1005
              Junior Member
              • Oct 2017
              • 7

              Hi, Brian,

              Thanks for the reply. However, I have tried the QC. I used
              bbduk.sh in=R1.fastq.gz out=filter_R1.fq maq=30
              bbduk.sh in=R2.fastq.gz out=filter_R2.fq maq=30
              (no reads in R1R2 will be trimmed)

              bbduk.sh in=R1.fastq.gz out=clean_R1.fq trimq=30
              bbduk.sh in=R2.fastq.gz out=clean_R2.fq trimq=30
              (it will trim 50% of reverse reads, but no forward reads)

              bbduk.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1_001.fq out2=R2.fq outm=fail.fq bhist=hist_base.txt qhist=hist_q.txt aqhist=hist_aq.txt bqhist=hist_bq.txt ecco=t
              (Also no reads will be trimmed)

              But when I run BBMerge, only 32.268% of the reads can be joined.

              Do you have any suggestions?

              Thanks in advance.

              Comment

              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                @chloe1005: It is possible that only 32% of your reads have inserts of a size that the reads can merge.

                `trimq=30` is too severe a bar for trimming. If you have a reference genome then not doing any trimming for quality works fine. If you are doing any de novo work then you may want to trim at Q20 or Q25.
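
As a rough illustration of the gentler threshold, the sketch below trims at Q20 with BBDuk and processes both mates in a single call so that the two output files stay in sync. The file names and the `run` helper are hypothetical, and the flags reflect my understanding of BBDuk's options: `qtrim=rl` with `trimq` removes low-quality bases from the read ends, whereas `maq` discards whole reads whose average quality falls below the threshold. Please confirm against the usage text of your installed version.

```shell
#!/bin/sh
# Hypothetical input names; replace with your own files.
R1=R1.fastq.gz
R2=R2.fastq.gz

# Print the command; execute it only if the tool is installed and the input exists.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1 && [ -f "$R1" ]; then "$@"; fi
}

# Quality trimming: clip bases below Q20 from both ends; reads are kept, only shortened.
run bbduk.sh in1="$R1" in2="$R2" out1=clean_R1.fq out2=clean_R2.fq qtrim=rl trimq=20

# Quality filtering (alternative): drop whole reads with average quality below Q20.
run bbduk.sh in1="$R1" in2="$R2" out1=filter_R1.fq out2=filter_R2.fq maq=20
```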

                Comment

                • chloe1005
                  Junior Member
                  • Oct 2017
                  • 7

                  Hi,
                  I am still confused about quality trimming versus quality filtering. What is the difference between them?
                  May I also ask how to get the reference genome? I ask because I also saw it mentioned in the first posts of this thread.
                  Looking forward to your answer.

                  Comment

                  • uloeber
                    Member
                    • Mar 2013
                    • 44

                    Hi Brian, somehow the t=x flag doesn't reduce the number of nodes in use. Any idea what is going wrong, or can I somehow pass Java flags?
                    Best,
                    Ulrike

                    Comment

                    • kokyriakidis
                      Member
                      • Jul 2018
                      • 12

                      RQCFilter Norm and EC

                      Hi Brian,

                      I am trying to trim and filter my data with RQCFilter but I cannot find an option for normalisation and error correction. Are there any parameters in this package? Also there is a parameter called -merge. Does it do merging? Should I set it to false and try normalising and error correcting first?

                      Comment

                      • GenoMax
                        Senior Member
                        • Feb 2008
                        • 7142

                        Can you clarify which program you are referring to? I don't think there is an RQCFilter program in the BBMap suite.

                        Comment

                        • kokyriakidis
                          Member
                          • Jul 2018
                          • 12

                          Source: https://jgi.doe.gov/data-and-tools/b...preprocessing/

                          "These steps replicate the QA protocol implemented at JGI for Illumina reads. There is a program “RQCFilter” which implements them as a pipeline, but that is not publically available because it has numerous hard-coded paths to reference datasets of contaminants."

                          It is in the bbtools files.

                          Never mind! 1) Is it a good plan to normalise and error-correct BEFORE merging? 2) Do I need a different approach for trimming and filtering short-insert vs. long mate-pair reads (Nextera)?
                          Last edited by kokyriakidis; 07-08-2018, 12:15 PM.

                          Comment

                          • GenoMax
                            Senior Member
                            • Feb 2008
                            • 7142

                            Since notes on the page you linked say this:
                            There is a program “RQCFilter” which implements them as a pipeline, but that is not publically available because it has numerous hard-coded paths to reference datasets of contaminants.
                            You should follow the steps that are denoted to replicate that functionality on the linked page.

                            In general @Brian has recommended merging reads before doing any additional manipulations.

                            Comment

                            • kokyriakidis
                              Member
                              • Jul 2018
                              • 12

                              Originally posted by GenoMax View Post
                              Since notes on the page you linked say this:


                              You should follow the steps that are denoted to replicate that functionality on the linked page.

                              In general @Brian has recommended merging reads before doing any additional manipulations.
                              For long mate-pair reads, do I just add the extra splitNextera step? Otherwise the pipeline remains the same?

                              Comment

                              • GenoMax
                                Senior Member
                                • Feb 2008
                                • 7142

                                I would think so. I don't have first hand experience with mate pair reads but I recall that you need to switch one of the reads around.

                                Comment
