
  • Originally posted by SylvainL View Post
    I am using BBDuk version 36.84, I guess that's the main difference...
    Well .. let us remove that difference then

    Code:
    $ bbmap/bbduk.sh in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT
    java -Djava.library.path=/path_to/bbmap/jni/ -ea -Xmx19498m -Xms19498m -cp /path_to/bbmap/current/ jgi.BBDukF in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT
    Executing jgi.BBDukF [in=nn.fq, out=stdout.fq, mm=f, hdist=0, edist=0, ktrim=l, rcomp=f, k=29, literal=CAACAGCAATATACCTTCTCGAGAGGTCT]
    
    BBDuk version 36.84
    Initial:
    Memory: max=19594m, free=19185m, used=409m
    
    Added 1 kmers; time:    0.043 seconds.
    Memory: max=19594m, free=18469m, used=1125m
    
    Input is being processed as unpaired
    Started output streams: 0.018 seconds.
    
    @test
    CTGTCCACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTC
    +
    FGGGCEGGGGIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIGIIGFIIFIIIIIII
    Processing time:                0.006 seconds.
    
    Input:                          1 reads                 100 bases.
    KTrimmed:                       1 reads (100.00%)       45 bases (45.00%)
    Total Removed:                  0 reads (0.00%)         45 bases (45.00%)
    Result:                         1 reads (100.00%)       55 bases (55.00%)
    
    Time:                           0.076 seconds.
    Reads Processed:           1    0.01k reads/sec
    Bases Processed:         100    0.00m bases/sec



    • Ok, thanks for all this effort. I really don't get it. Now it's running with maxlength=1 and I get the expected results.
      Quite weird.



      • Just for the record what OS/Java version are you using?

        I am not using maxlength=1 and still get the correct answer. Strange indeed.



        • Ubuntu 12.04 server...
          java version "1.7.0_121"
          OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-1ubuntu0.12.04.1)
          OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)



          • Sounds like, possibly, an intermittent filesystem problem / system problem? Please let me know if it happens again.



            • Originally posted by Brian Bushnell View Post
              Sounds like, possibly, an intermittent filesystem problem / system problem? Please let me know if it happens again.
              It appears to go away for @SylvainL when using the maxlength=1 option, which is odd.

              PS: I get the correct answer without the need of maxlength.
              Last edited by GenoMax; 01-12-2017, 09:49 AM.



              • Originally posted by GenoMax View Post
                It appears to go away by using maxlength=1 option, which is odd.
                Oh - my impression was that the problem occurred once, but then was not replicable either with or without "maxlength=1" (which, actually, should make it so that there is no output at all in this case).

                @SylvainL Sorry for the confusion, can you please clarify what output you are currently getting with and without "maxlength=1"? Currently, I got:

                Code:
                bbduk.sh in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT
                
                output:
                @test
                CTGTCCACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTC
                +
                FGGGCEGGGGIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIGIIGFIIFIIIIIII
                Code:
                bbduk.sh in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT maxlen=1
                
                output:
                ...which is what I expect.
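As a rough illustration of why "maxlength=1" would be expected to produce no output here, this is a minimal Python sketch of a post-trimming length filter. It is not BBDuk's code: the function name is hypothetical, and the default `minlength=10` is an assumption based on the "shorter than 10" threshold mentioned in this thread.

```python
from typing import Optional


def keep_after_trimming(trimmed_len: int,
                        minlength: int = 10,
                        maxlength: Optional[int] = None) -> bool:
    """Return True if a trimmed read would pass a simple length filter.

    Assumption: reads shorter than minlength, or longer than maxlength
    (when one is given), are discarded from the main output.
    """
    if trimmed_len < minlength:
        return False
    if maxlength is not None and trimmed_len > maxlength:
        return False
    return True


# The test read above is 55 bases long after trimming.
print(keep_after_trimming(55))               # no maxlength set
print(keep_after_trimming(55, maxlength=1))  # maxlength=1
```

Under these assumptions the 55-base read passes without a maxlength and fails with maxlength=1, which matches the empty output shown above.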



                • Hi Brian,

                  If I do NOT set maxlength=1, the output contains only reads shorter than 10 after trimming.
                  If I set maxlength=1, I get the normal output, i.e. reads of around 40-50 bp after trimming.

                  I also set trimq=0 to be sure that it wasn't a problem of quality trimming ...

                  Here is my command:

                  Code:
                  ~/Applications/bbmap/bbduk.sh in=./out/${MissMatch}_MM_${Indel}_Indel/${Samplename}.fastq outm=./out/${MissMatch}_MM_${Indel}_Indel/${Samplename}_left.fastq literal=$(cat ./out/Barcodes_with_adapters/${Samplename}) k=$(wc -m ./out/Barcodes_with_adapters/${Samplename} | awk '{print $1-1}') mm=f hdist=${MissMatch} edist=${Indel} ktrim=l rcomp=f maxlength=1 trimq=0;
                  Last edited by SylvainL; 01-12-2017, 10:14 PM.
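The command above derives k with `wc -m ... | awk '{print $1-1}'`, which is only correct if the barcode file ends in exactly one newline. A hedged alternative, sketched in Python with hypothetical helper names, is to strip all surrounding whitespace before counting:

```python
from pathlib import Path


def kmer_length(text: str) -> int:
    """Length of the sequence once surrounding whitespace is removed."""
    return len(text.strip())


def kmer_length_from_file(path: str) -> int:
    """Read a single-sequence text file and return the sequence length."""
    return kmer_length(Path(path).read_text())


# The literal used earlier in this thread, with a trailing newline
# as it would typically appear in a file.
print(kmer_length("CAACAGCAATATACCTTCTCGAGAGGTCT\n"))
```

This gives the same value as the `wc -m` approach for a file with one trailing newline, and stays correct if the file has none or has extra blank lines.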



                  • Hi Brian,

                    Do you think it is possible for bbduk2.sh to trim both the 5' end and the 3' end, matching a 22 nt reference and a 3 nt reference, respectively?
                    To be more specific, the 5' end primer is: GCGGAGATCTACCTACGTACTT and the 3' end primer is: TGT

                    Thanks for your great tool!



                    • Originally posted by netasha View Post
                      Hi Brian,

                      Do you think it is possible for bbduk2.sh to trim both the 5' end and the 3' end, matching a 22 nt reference and a 3 nt reference, respectively?
                      To be more specific, the 5' end primer is: GCGGAGATCTACCTACGTACTT and the 3' end primer is: TGT

                      Thanks for your great tool!
                      Pretty sure the answer is yes, but you may want to do the following instead. A similar question recently came up on Biostars and this was @Brian's recommendation.

                      Code:
                      bbduk.sh in=file.fq out=stdout.fq ktrim=r k=3 mm=f literal=TGT rcomp=f ktrimexclusive | bbduk.sh in=stdin.fq out=trimmed.fq ktrim=l k=22 mm=f literal=GCGGAGATCTACCTACGTACTT rcomp=f ktrimexclusive
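For anyone scripting this, the same two-pass pipe can be sketched from Python. This is only an illustration: the function names are hypothetical, `bbduk.sh` is assumed to be on the PATH, and the flags are copied unchanged from the command above, with k taken from the length of each literal.

```python
import subprocess


def two_pass_commands(infile, outfile, right_literal, left_literal):
    """Build the argument lists for the right-trim and left-trim passes."""
    first = ["bbduk.sh", f"in={infile}", "out=stdout.fq", "ktrim=r",
             f"k={len(right_literal)}", "mm=f", f"literal={right_literal}",
             "rcomp=f", "ktrimexclusive"]
    second = ["bbduk.sh", "in=stdin.fq", f"out={outfile}", "ktrim=l",
              f"k={len(left_literal)}", "mm=f", f"literal={left_literal}",
              "rcomp=f", "ktrimexclusive"]
    return first, second


def run_two_pass(first, second):
    """Pipe the first pass into the second; return both exit codes."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second, stdin=p1.stdout)
    p1.stdout.close()  # let p1 see a closed pipe if p2 exits early
    p2.wait()
    p1.wait()
    return p1.returncode, p2.returncode


first, second = two_pass_commands("file.fq", "trimmed.fq",
                                  "TGT", "GCGGAGATCTACCTACGTACTT")
print(" ".join(first), "|", " ".join(second))
```

Calling `run_two_pass(first, second)` would execute the pipe; the print statement only shows the command line that would be run.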



                      • Originally posted by GenoMax View Post
                        Pretty sure the answer is yes, but you may want to do the following instead. A similar question recently came up on Biostars and this was @Brian's recommendation.

                        Code:
                        bbduk.sh in=file.fq out=stdout.fq ktrim=r k=3 mm=f literal=TGT rcomp=f ktrimexclusive | bbduk.sh in=stdin.fq out=trimmed.fq ktrim=l k=22 mm=f literal=GCGGAGATCTACCTACGTACTT rcomp=f ktrimexclusive

                        Thanks for your quick reply!
                        I was thinking of doing it sequentially as well. But what is "ktrimexclusive"?



                        • Originally posted by netasha View Post
                          Thanks for your quick reply!
                          I was thinking of doing it sequentially as well. But what is "ktrimexclusive"?
                          Hi Netasha,

                          BBDuk2 can trim the left and right end at the same time, but it can only use a single kmer length, and as a result it won't work in your case. So, 2 passes with BBDuk using 2 different kmer lengths is better. "TGT" is super short, though, which will lead to overtrimming due to coincidental matches. Is there any more fixed sequence following that?
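To put a rough number on "coincidental matches", here is a back-of-the-envelope Python sketch. It assumes uniformly random bases, matching on one strand only (as with rcomp=f), and treats overlapping windows as independent, so it is an approximation and not a statement about real reads.

```python
def chance_of_coincidental_match(read_len: int, k: int) -> float:
    """Approximate probability that a random read contains a given k-mer."""
    windows = read_len - k + 1      # number of k-mer start positions
    if windows <= 0:
        return 0.0
    p_window = 0.25 ** k            # chance one window matches exactly
    return 1.0 - (1.0 - p_window) ** windows


# Compare the 3 nt and 22 nt primers from this thread on a 100-base read.
print(f"k=3:  {chance_of_coincidental_match(100, 3):.3f}")
print(f"k=22: {chance_of_coincidental_match(100, 22):.2e}")
```

Under these assumptions a 3-mer turns up by chance in most 100-base reads, while a 22-mer essentially never does.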

                          BBDuk's normal trimming behavior when matching a kmer is to trim the kmer itself and everything to the right/left of it. "ktrimexclusive" tells BBDuk to only trim to the right/left, but not to trim the matched kmer itself (so TGT or whatever would still remain in the read). Whether or not you should use that flag depends on whether the sequences you want to trim are genomic. For adapters, which are artificial, the ktrimexclusive flag should not be used, but in some cases it should.
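The difference can be sketched in a few lines of Python. This is a simplified model and not BBDuk's implementation: it handles a single exact k-mer only (no mismatches, no reverse complement), and which occurrence is used when the k-mer appears more than once is an assumption of this sketch.

```python
def ktrim(read: str, kmer: str, side: str, exclusive: bool = False) -> str:
    """Trim a read at an exact k-mer match.

    side="r": drop everything to the right of the match.
    side="l": drop everything to the left of the match.
    The matched k-mer itself is dropped too unless exclusive=True.
    """
    if side == "r":
        pos = read.find(kmer)       # assumption: leftmost match wins
        if pos < 0:
            return read
        cut = pos + len(kmer) if exclusive else pos
        return read[:cut]
    if side == "l":
        pos = read.rfind(kmer)      # assumption: rightmost match wins
        if pos < 0:
            return read
        cut = pos if exclusive else pos + len(kmer)
        return read[cut:]
    raise ValueError("side must be 'l' or 'r'")


print(ktrim("AAAATGTCC", "TGT", "r"))                  # AAAA
print(ktrim("AAAATGTCC", "TGT", "r", exclusive=True))  # AAAATGT
```

In the first call the TGT and everything after it are removed; in the second only the bases after TGT are removed.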



                          • That's why I was looking for a parameter which can restrict this short 3mer to be anchored at the rightmost position. I thought you provided such parameters already: restrictright=3. Am I right?

                            I ask because I was using cutadapt, which can handle this by appending a "$" to the end of the 3-mer: TGT$. If I'm wrong, would it take a lot of effort to add such a function to bbduk2.sh?

                            Thanks for the explanation of the "ktrimexclusive".



                            • Originally posted by netasha View Post
                              That's why I was looking for a parameter which can restrict this short 3mer to be anchored at the rightmost position. I thought you provided such parameters already: restrictright=3. Am I right?
                              Oh, yes, if you know the position then use restrictright=3. If you already know that 100% of the time you want to trim the last 3 bases, then you can just use "ftr2=3" instead of kmer-matching.
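The two options differ in whether the last bases are checked before they are removed. A minimal Python sketch of that distinction follows; the function names are hypothetical, and the first function is only a rough model of anchoring a 3-mer to the read end (in the spirit of cutadapt's TGT$), not BBDuk's actual logic.

```python
def trim_anchored_right(read: str, suffix: str) -> str:
    """Remove suffix only if the read ends with it exactly."""
    if suffix and read.endswith(suffix):
        return read[:-len(suffix)]
    return read


def force_trim_right(read: str, n: int) -> str:
    """Remove the last n bases unconditionally."""
    return read[:-n] if n > 0 else read


print(trim_anchored_right("ACGTTGT", "TGT"))  # ends in TGT: trimmed to ACGT
print(trim_anchored_right("ATGTACG", "TGT"))  # internal TGT only: unchanged
print(force_trim_right("ATGTACG", 3))         # always loses 3 bases: ATGT
```

The unconditional version is only safe when every read is known to end with the 3 bases to be removed.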



                              • Hi Brian,

                                I started using an HPC (36 CPUs @ 3.5 GHz each & 60 GB RAM) to process NGS data. I noticed that while using BBDuk, neither the RAM nor the processors are being challenged, yet BBDuk takes quite a while to process the reads (about 10 minutes for 90,000 150bp x 2 paired reads). BBDuk Parameters:

                                Adapters
                                Trim: Right End Only
                                Kmer Length: 27
                                Max Substitutions: 3
                                Max Substitutions + INDELs: 0
                                Trim partial adapters with kmer length: Yes, 7

                                Trim Low Quality - Yes
                                Both Ends
                                Minimum Quality: 20

                                Discard Short Reads - Yes
                                Minimum Length: 75 bp (changed to 75 because I found that primer dimers contributed to assembled reads when cutoff was set at 50)

                                Keep Original Order - Yes


                                I tried using the t=36 flag, but still don't get all of my processors utilized, and the RAM is set to 45 GB and only about 10 GB is utilized. BBMap on the other hand can and does cap out the processors during mapping on the HPC. I'm using BBDuk inside of Geneious, so if you think this is abnormal performance for BBDuk, I can reach out to them to inquire.

                                Thanks
                                Jake
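For reference, the figures quoted above work out to the following throughput. This small Python sketch assumes "90,000 150bp x 2 paired reads" means 90,000 pairs (180,000 reads) and that "about 10 minutes" is 600 seconds; both are interpretations, not measured values.

```python
def bases_per_second(pairs: int, read_len: int, seconds: float) -> float:
    """Throughput in bases per second for paired-end reads."""
    total_bases = pairs * 2 * read_len
    return total_bases / seconds


rate = bases_per_second(pairs=90_000, read_len=150, seconds=600)
print(f"{rate:,.0f} bases/sec")
```

That is 27 million bases in 600 seconds, or 45,000 bases per second under the stated assumptions.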
