  • Sisi_ieo
    Junior Member
    • Jan 2019
    • 4

    Problems running BBDuk with class path

    Hi all,

    I'm trying to run bbduk.sh on macOS High Sierra 10.13.1, but it fails with a classpath error. I've already added the directory containing bbduk.sh to both the PATH and the classpath, but it still doesn't work.

    This is the input in terminal:

    Sisi$ bbduk.sh -Xmx24g in=B1_sub_R1.fq in2=B1_sub_R2.fq \
    out=B1_sub_R1_trimmed.fq.gz out2=B1_sub_R2_trimmed.fq.gz \
    literal=GTGCCAGCMGCCGCGGTAA,GGACTACHVGGGTWTCTAAT k=10 ordered=t mink=2 \
    ktrim=l rcomp=f minlength=220 maxlength=280 tbo tpe

    And this is the output:

    java -ea -Xmx24g -Xms24g -cp /usr/local/bin/current/ jgi.BBDukF -Xmx24g in=B1_sub_R1.fq in2=B1_sub_R2.fq out=B1_sub_R1_trimmed.fq.gz out2=B1_sub_R2_trimmed.fq.gz literal=GTGCCAGCMGCCGCGGTAA,GGACTACHVGGGTWTCTAAT k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=220 maxlength=280 tbo tpe
    Error: Could not find or load main class jgi.BBDukF
    Caused by: java.lang.ClassNotFoundException: jgi.BBDukF

    By the way, I couldn't find any BBDukF file.

    Any idea??

    Thanks a lot,


    • SNPsaurus
      Registered Vendor
      • May 2013
      • 525

      What's in your /bbmap/current/ directory? I have /bbmap/current/jgi/BBDukF.class
      Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com


      • Sisi_ieo
        Junior Member
        • Jan 2019
        • 4

        Hi, sorry, I'm coming back to this. I have:
        /Users/owner/bbmap/current/jgi/BBDuk.class
        /Users/owner/bbmap/current/jgi/BBDuk2.class
        but no file named BBDukF.class.

        The bbduk.sh script, which is at /Users/owner/bbmap/bbduk.sh,
        has these lines at the end:
        fi
        local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z $z2 -cp $CP jgi.BBDuk $@"
        local CMD="java $EA $z $z2 -cp $CP jgi.BBDuk $@"
        if [[ $silent == 0 ]] && [[ $json == 0 ]]; then
        echo $CMD >&2
        I don't know which of the two CMD lines should be active and which should not.
        Perhaps that's the problem.
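On the question of the two CMD lines: in shell, assigning the same variable twice simply keeps the last value, so if both lines are active exactly as pasted, only the second one takes effect. A line starting with # would be disabled; whether the first line is commented out in the actual file cannot be told from the paste. A toy sketch with placeholder strings (NATIVE and CP are stand-ins, not real values from bbduk.sh):

```shell
#!/bin/sh
# Toy stand-in for the tail of a wrapper script; the strings are placeholders.
build_cmd() {
    CMD="java -Djava.library.path=NATIVE -cp CP jgi.BBDuk"   # first assignment
    CMD="java -cp CP jgi.BBDuk"                               # second assignment replaces the first
    echo "$CMD"
}

build_cmd
```

Either way, both lines use the same $CP, so this choice would not by itself explain a wrong classpath.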

        Any idea?

        Thanks


        • GenoMax
          Senior Member
          • Feb 2008
          • 7142

          I am not sure why you are having this problem. All you should need to do is download the BBMap software on your Mac, unarchive the tar-gzipped file, and then extend your path to include the "bbmap" directory (export PATH=$PATH:/path_to_bbmap_dir). Don't move the contents of the bbmap directory individually. Move the entire directory to whatever location you want and then amend $PATH.
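The PATH step described above can be sketched as follows. BBMAP_DIR and the helper name path_append are assumptions for illustration; point BBMAP_DIR at wherever the whole bbmap directory actually lives.

```shell
#!/bin/sh
# Assumed install location; override by setting BBMAP_DIR before running.
BBMAP_DIR="${BBMAP_DIR:-$HOME/bbmap}"

# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$PATH:$1" ;;
    esac
}

path_append "$BBMAP_DIR"
export PATH

# Confirm which copy of the wrapper script the shell will pick up.
command -v bbduk.sh || echo "bbduk.sh is not on PATH yet" >&2
```

If `command -v bbduk.sh` prints a location other than the bbmap directory, an older copy elsewhere on PATH is being found first.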


          • Sisi_ieo
            Junior Member
            • Jan 2019
            • 4

            Yes, that is what I've done.

            the path to bbmap is --> /Users/owner/bbmap and it is included in the $PATH

            I think the problem is related to the way the script searches for the classes. None of the .sh files manage to find the path. For example, executing bbduk.sh returns:

            java -ea -Xmx24g -Xms24g -cp /usr/local/bin/current/ jgi.BBDuk ......
            Error: Could not find or load main class jgi.BBDuk
            Caused by: java.lang.ClassNotFoundException: jgi.BBDuk

            I've checked, and I don't have any "current" directory under /usr/local/bin, so I don't know how to tell the script to go directly to the bbmap directory.

            Any ideas are greatly appreciated; I'm stuck on this.
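For a directory entry on the classpath, Java looks for the class jgi.BBDuk in a file named jgi/BBDuk.class below that directory. The echoed command uses -cp /usr/local/bin/current/, while the class files were reported under /Users/owner/bbmap/current/. A minimal sketch to check both locations and to see which bbduk.sh the shell actually runs; the helper name class_in_cp is made up for this example, and the idea that a copy or link of bbduk.sh sits in /usr/local/bin is only one possible explanation, not something confirmed in this thread.

```shell
#!/bin/sh
# Report whether DIR contains the .class file for a fully qualified class name.
# Usage: class_in_cp DIR package.ClassName
class_in_cp() {
    cp_dir="${1%/}"                                        # drop a trailing slash
    class_file="$(printf '%s' "$2" | tr '.' '/').class"    # jgi.BBDuk -> jgi/BBDuk.class
    [ -f "$cp_dir/$class_file" ]
}

# The two directories mentioned in this thread.
for dir in /usr/local/bin/current /Users/owner/bbmap/current; do
    if class_in_cp "$dir" jgi.BBDuk; then
        echo "jgi.BBDuk found under $dir"
    else
        echo "jgi.BBDuk NOT found under $dir"
    fi
done

# Show which bbduk.sh is first on PATH, and whether it is a symlink or a copy.
script_path="$(command -v bbduk.sh)" && ls -l "$script_path"
```

Running the script by its full path (/Users/owner/bbmap/bbduk.sh) and comparing the echoed -cp value is another quick check.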


            • GenoMax
              Senior Member
              • Feb 2008
              • 7142

              On the Mac I tested this on, nothing else needed to be done. What happens if you just run "bbmap.sh"? Does that produce the "in-line" bbmap help output?

              Which Java version are you using on your Mac?


              • Sisi_ieo
                Junior Member
                • Jan 2019
                • 4

                Thanks for your patience. To answer your questions:
                1. If I run "bbmap.sh" or "bbduk.sh" on its own, for instance, the help text from the file is shown (parameters, flags, etc.).

                If I run the command with the parameters, like:

                $ bbduk.sh in=sample14_S14_R1.fastq.gz in2=sample14_S14_R2.fastq.gz out=sample14_S14_R1_btrimmed.fastq.gz out2=sample14_S14_R1_btrimmed.fastq.gz literal=GTACACAMCGCCCGTCGC,TGATCCTTCTGCVGGTTCWCCTACG k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=50 maxlength=155 tbo tpe


                Max memory cannot be determined. Attempting to use 1400 MB.
                If this fails, please add the -Xmx flag (e.g. -Xmx24g) to your command,
                or run this program qsubbed or from a qlogin session on Genepool, or set ulimit to an appropriate value.
                java -ea -Xmx1400m -Xms1400m -cp /usr/local/bin/current/ jgi.BBDuk in=sample14_S14_R1.fastq.gz in2=sample14_S14_R2.fastq.gz out=sample14_S14_R1_btrimmed.fastq.gz out2=sample14_S14_R1_btrimmed.fastq.gz literal=GTACACAMCGCCCGTCGC,TGATCCTTCTGCVGGTTCWCCTACG k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=50 maxlength=155 tbo tpe
                Error: Could not find or load main class jgi.BBDuk
                Caused by: java.lang.ClassNotFoundException: jgi.BBDuk

                My current java version is Java 8 Update 191


                • GenoMax
                  Senior Member
                  • Feb 2008
                  • 7142

                  We are making progress!

                  How much memory do you have on this Mac? I suggest giving BBTools 85% of the maximum memory you have available. You also need to specify "in1=" and "out1=" to go with "in2=" and "out2=". Since you are using IUPAC bases in your literal sequences, you also need to turn this option on:

                  copyundefined=f (cu) Process non-AGCT IUPAC reference bases by making all
                  possible unambiguous copies.
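To illustrate what "all possible unambiguous copies" means, here is a small sketch that expands IUPAC codes into every plain A/C/G/T sequence they can stand for. It only demonstrates the concept using the standard IUPAC nucleotide table; it is not BBDuk's own code, and the function name expand_iupac is made up for this example.

```shell
#!/bin/sh
# Print every unambiguous A/C/G/T sequence encoded by an IUPAC sequence.
# Usage: expand_iupac SEQUENCE   (uppercase letters expected)
expand_iupac() {
    printf '%s\n' "$1" | awk '
    BEGIN {
        m["A"]="A"; m["C"]="C"; m["G"]="G"; m["T"]="T"
        m["R"]="AG"; m["Y"]="CT"; m["S"]="CG"; m["W"]="AT"
        m["K"]="GT"; m["M"]="AC"; m["B"]="CGT"; m["D"]="AGT"
        m["H"]="ACT"; m["V"]="ACG"; m["N"]="ACGT"
    }
    {
        n = 1; out[1] = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            opts = m[c]
            if (opts == "") opts = c          # unknown symbol: keep it as-is
            k = 0
            # extend every partial sequence with every base the code allows
            for (j = 1; j <= n; j++)
                for (p = 1; p <= length(opts); p++)
                    nxt[++k] = out[j] substr(opts, p, 1)
            n = k
            for (j = 1; j <= n; j++) out[j] = nxt[j]
        }
        for (j = 1; j <= n; j++) print out[j]
    }'
}

# First literal from the command in this thread; M stands for A or C.
expand_iupac GTACACAMCGCCCGTCGC
# prints GTACACAACGCCCGTCGC and GTACACACCGCCCGTCGC
```

The second literal, TGATCCTTCTGCVGGTTCWCCTACG, contains V (three bases) and W (two bases), so it expands to six sequences.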

                  You are also trimming to the left side of the read. Is that correct?

                  Can you try this command?

                  Code:
                  bbduk.sh -Xmx4g in1=sample14_S14_R1.fastq.gz in2=sample14_S14_R2.fastq.gz out1=sample14_S14_R1_btrimmed.fastq.gz out2=sample14_S14_R2_btrimmed.fastq.gz literal=GTACACAMCGCCCGTCGC,TGATCCTTCTGCVGGTTCWCCTACG k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=50 maxlength=155 copyundefined=t tbo tpe
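The -Xmx4g in the command above is a fixed guess. The 85% guideline can be turned into a flag with a little arithmetic; this is a sketch, with the function names xmx_flag and total_bytes invented for the example. It reads total physical memory via `sysctl -n hw.memsize` on macOS and /proc/meminfo on Linux.

```shell
#!/bin/sh
# Print a -Xmx flag set to 85% of the given number of bytes, in megabytes.
xmx_flag() {
    echo "-Xmx$(( $1 * 85 / 100 / 1048576 ))m"
}

# Total physical memory in bytes: sysctl on macOS, /proc/meminfo on Linux.
total_bytes() {
    if [ "$(uname)" = "Darwin" ]; then
        sysctl -n hw.memsize
    elif [ -r /proc/meminfo ]; then
        awk '/^MemTotal:/ { printf "%.0f\n", $2 * 1024 }' /proc/meminfo
    fi
}

bytes="$(total_bytes)"
if [ -n "$bytes" ]; then
    echo "Suggested flag: $(xmx_flag "$bytes")"
fi
```

For a machine with 16 GiB (17179869184 bytes), `xmx_flag 17179869184` prints -Xmx13926m.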


                  • aushev
                    Member
                    • Nov 2009
                    • 21

                    Hello,
                    I was trying to filter out rRNA reads using a command like this:
                    bbduk.sh -Xmx3g in=in.fastq out=nonribo.fastq outm=ribo.fastq ref=ribokmers.fa.gz k=31 minlen=3
                    where ribokmers.fa.gz is taken from Brian's Google Drive link posted at https://www.biostars.org/p/159959/
                    I noticed that my most abundant rRNA reads (CGCGACCTCAGATCAGACGTGGCGACCCGCTGAATTT) are not filtered. Can anyone explain how this ribokmers.fa was created? What will be the difference if I use, for example, "Human ribosomal DNA complete repeating unit" from GenBank (U13369)? Is there any other recommended source of rRNA sequences for this purpose?
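Since BBDuk matches reads to the reference through shared k-mers, a read can only be caught with k=31 if at least one of its 31-mers is present in the reference set. A minimal sketch that lists the 31-mers of the read in question, so they can be searched for in a reference file; it ignores reverse complements and any mismatch tolerance, and the function name kmers is made up for this example.

```shell
#!/bin/sh
# Print every k-mer (substring of length K) of a sequence, one per line.
# Usage: kmers SEQUENCE K
kmers() {
    printf '%s\n' "$1" | awk -v k="$2" '{
        for (i = 1; i + k - 1 <= length($0); i++)
            print substr($0, i, k)
    }'
}

# The unfiltered read from this post: 37 nt, so it has 7 k-mers of length 31.
read_seq=CGCGACCTCAGATCAGACGTGGCGACCCGCTGAATTT
kmers "$read_seq" 31
```

If none of these seven strings (or their reverse complements) occurs in the k-mer file, the read would be expected to pass through unfiltered.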
                    Last edited by aushev; 02-25-2019, 12:08 PM.


                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      @aushev: That k-mers file is likely for non-human genomes since it was made from SILVA database.

                      You could use U13369 fasta sequence and then bin the reads that map to it using bbsplit.sh.


                      • aushev
                        Member
                        • Nov 2009
                        • 21

                        Originally posted by GenoMax:
                        @aushev: That k-mers file is likely for non-human genomes since it was made from SILVA database.

                        You could use U13369 fasta sequence and then bin the reads that map to it using bbsplit.sh.
                        Thank you @GenoMax!
                        What would be the main advantage of using bbsplit instead of bbduk? As I understand it, BBSplit uses BBMap internally, unlike BBDuk, but what does that mean in practice? In my scenario, I want to filter out all rRNA reads before doing any further mapping.


                        • GenoMax
                          Senior Member
                          • Feb 2008
                          • 7142

                          Any reads that align to the ribosomal repeat will be identified and separated in a file. Isn't that what you are looking to do?


                          • aushev
                            Member
                            • Nov 2009
                            • 21

                            Originally posted by GenoMax:
                            Any reads that align to the ribosomal repeat will be identified and separated in a file. Isn't that what you are looking to do?
                            Yes, that's what I wanted, but I also wanted to understand the difference between bbduk and bbsplit for this purpose.


                            • aushev
                              Member
                              • Nov 2009
                              • 21

                              Sorry for another dumb question, but I really want to understand how bbduk works, and I'm currently having trouble with that. Below I list 11 example reads containing the adapter sequence, all of which I expected to be detected with the following parameters:
                              Code:
                              bbduk.sh in=falseneg.fastq literal=AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC ktrim=r mink=10 hdist=1 edist=1 hdist2=1 edist2=1
                              So the reference adapter sequence is `AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC`, and all of those reads have a match of at least 10 nt (mink=10) with no more than 1 mismatch:

                              ***_(ref)_______________________________AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
                              1_______________________________________AGATCGGAAGAG_ACACGTCTGAACTCCAGTCACTCGAAGATCTCGTATGC
                              2_______________AGCAGCATTGTACAGGGCTATGACAGATCGGAAGAGCACACGTC_GAACTC
                              3________AGCAGTTGAACATGGGTCAGTCGGTCCTGAGAGATCGGAAGAGCACACAT
                              4______________________________CCTGAGGCTAGATCGGAAGAGCACACGTCTGAAC_CCAGTCACTCGAAGAT
                              5__CGCGACCTCAGATCAGACGTGGCGACCCGCTGAATTTAGATCGGAAGAGT
                              6_________GCATGGGTGGTTCAGTGGTAGAATTCTCGCAGATCGGAAGAGCACACCGT
                              7_______GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTAGATCGGAAGAGCACTCG
                              8________________TAGCTTATCAGACTGATGTTGACAGATCGGAAGAGCACACGTCTGA_CTCC
                              9______TCCCTGGTGGTCTAGTGGTTAGGATTCGGCGCTAGATCGGAAGAGCACAG
                              10______TCCCTGTGGTCTAGTGGTTAGGATTCGGCGCTAGATCGGAAGAGCACGCG
                              11_______________TCGGATCCGTCTGAGCTTGGCTAAGATCGGAAGAGCACACGTCTGGACTC
                              ***_(ref)_______________________________AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC


                              Can you give me a hint as to why none of those reads are matched?
                              Thanks in advance!

                              P.S. Adding qhdist=1 produced the correct matches, but I still don't understand why edist=1 did not work...
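For reference, Hamming distance counts substitutions only, position by position, between two sequences of equal length. A small sketch of that count follows; the function name hamming is made up for this example, and it says nothing about how BBDuk applies hdist or edist internally.

```shell
#!/bin/sh
# Count mismatching positions between two equal-length sequences.
# Prints -1 when the lengths differ, since Hamming distance is undefined there.
hamming() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (length(a) != length(b)) { print -1; exit }
        d = 0
        for (i = 1; i <= length(a); i++)
            if (substr(a, i, 1) != substr(b, i, 1)) d++
        print d
    }'
}

# First 13 nt of the adapter vs. the last 13 nt of read 5 from the list above.
hamming AGATCGGAAGAGC AGATCGGAAGAGT
# prints 1
```

Assuming the underscore in read 1 marks a deleted base, comparing the first 17 nt of the adapter with the first 17 nt of that read (`hamming AGATCGGAAGAGCACAC AGATCGGAAGAGACACG`) gives 5, because everything after the deletion is shifted by one position. That is why a single indel cannot be treated as a single substitution.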
                              Last edited by aushev; 02-25-2019, 06:47 PM.


                              • GenoMax
                                Senior Member
                                • Feb 2008
                                • 7142

                                @aushev: Unfortunately, Brian no longer has time to participate on this forum. He would really be the only person who can authoritatively answer your questions. You could try opening a ticket on the SourceForge (SF) site to see if he responds.

                                The "edist" directive is for indels, so perhaps that is the reason it did not work. I have never needed to use that directive. Many options in BBTools programs apply only to very specific use cases, so unless you know for sure that you need an option, I would go with the defaults. That is all I can offer.

                                Use the smallest/core sequence you are trying to match if you intend to remove all sequence to the right of where that core sequence is found.
