  • Trimmomatic error

    Hi! I'm trying to run trimmomatic, and running into the following error:

    My input:
    java -classpath $BIN/trimmomatic/Trimmomatic-0.20/trimmomatic-0.20.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 9_1.fastq 9_2.fastq 9_1.fastq_PAIRED_1 9_1.fastq_UNPAIRED_1 9_2.fastq_PAIRED_2 9_2.fastq_UNPAIRED_2 ILLUMINACLIP:contaminants.fa:2:40:15 LEADING:3 TRAILING:3 SLIDINGWINDOW:5:15 MINLEN:36
    And the output:
    Exception in thread "main" java.lang.NegativeArraySizeException
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.packSeq(IlluminaClippingTrimmer.java:578)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:548)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:538)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.mapClippingSet(IlluminaClippingTrimmer.java:125)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:111)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.<init>(IlluminaClippingTrimmer.java:43)
    at org.usadellab.trimmomatic.fastq.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:26)
    at org.usadellab.trimmomatic.TrimmomaticPE.main(TrimmomaticPE.java:335)
    How do I fix this????
    Thanks in advance!

  • #2
    Have tried this on another server, with another datafile, from the trimmomatic directory itself. Still not working


    • #3
      Looks like a good command line. As a guess, I am going to say that your input files have a different quality encoding than 'phred33'. Part of this guess is the 'negative array' message. Try with 'phred64' (or nothing; it should default to phred64).


      • #4
        Exception in thread "main" java.lang.NegativeArraySizeException
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.packSeq(IlluminaClippingTrimmer.java:578)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:548)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:538)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.mapClippingSet(IlluminaClippingTrimmer.java:125)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:111)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.<init>(IlluminaClippingTrimmer.java:43)
        at org.usadellab.trimmomatic.fastq.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:26)
        at org.usadellab.trimmomatic.TrimmomaticPE.main(TrimmomaticPE.java:335)
        Same thing, I'm afraid. And the quality scores are from Casava 1.8.2, so I know for a fact they're Phred+33 (or, more accurately they've got J's in them as well, so they're in the (0,41) range). For example:
        @ILLUMINA:2430TC9ACXX:1:1101:1470:2047 1:N:0:ACAGTG
        NAGTTATTTGCCTCTTTGAAGCGTTTTCCAACAGTATAGATCTCATGAATCAAATCCTCCATGCAGATGATGCCGNATTNNCCAAGAGATCGAGCAATCAA
        +
        #1=DDFFFHHHHHJJJJJJIJIJIJJJIHIJJJJIJJJJIJJJJIJJJJJJJJJJJIIJHIJJFIIIJIJJJIJH#-;B##,,=ADDDDCDDDDBDDDCDD
        @ILLUMINA:2430TC9ACXX:1:1101:1500:2115 1:N:0:ACAGTG
        CGCCATACAGCAGGAATGGGAGCTGCCCCCCTGGGCACAGCTTCTGCACTGTCTCGGTCCGCCTTTTGGTGTCAACGGTGGTAACATTGAAGGTGACTCCC
        +
        CCCFFFFEHHFFFGBBHIJIEFHIIGGIJJGI?FHFGIJGIIJE@;FGHGIHGIHHG;B?>CCDBDC58,89>55:;<2295>C@>CC@@CC?3:44>4@?
        @ILLUMINA:2430TC9ACXX:1:1101:1401:2131 1:N:0:ACAGTG
        CTGGACTTGCTGGCTTCCCTGAAACGGAGAGAGCGAGAGGAGAAGGACGATGGGGAGGACAAGAAGAAGTCCAAAGTCTCCTCCTACAAGGACTGGGAAGA
        +
        @CCFDEFFHHHHHHIJGIJJJGIJIHJ?E@F@F9FAD@FGBGBCFGACG@CHHHF?BD;=?AB?AA>AA>>CCDC<:4>@C@CC?ACDC@ABBDDDD<>@8
        I wonder, actually, if it's the extra "J" that could be the problem...
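
One way to settle the encoding question without guessing is to look at the range of quality characters. The sketch below is only a heuristic, not part of Trimmomatic: `guess_phred_offset` is a hypothetical helper, and the cut-offs assume that Phred+64 data never contains characters below ';' (ASCII 59) and that Illumina Phred+33 data does not go above 'J' (ASCII 74).

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Returns 33, 64, or None when the characters fit both encodings.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters given")
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) cannot occur in Phred+64 data.
        return 33
    if highest > 74:
        # Characters above 'J' (ASCII 74) are not expected in Illumina Phred+33 data.
        return 64
    return None  # ambiguous: every character is valid in both encodings


# Start of the first quality line posted above.
print(guess_phred_offset(["#1=DDFFFHHHHHJJJJJJIJIJIJJJIHIJJJJIJJJJ"]))
```

Under Phred+33, 'J' is ASCII 74, i.e. Q41, so the J's by themselves are consistent with Casava 1.8 output.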


        • #5
          Phred scores should allow 0 to 93. In any case, following the stack trace, I would bet that it has problems with your contaminants.fa file. What does that file look like?


          • #6
            Originally posted by dvanic
            Same thing, I'm afraid. And the quality scores are from Casava 1.8.2, so I know for a fact they're Phred+33 (or, more accurately they've got J's in them as well, so they're in the (0,41) range). For example:

            I wonder, actually, if it's the extra "J" that could be the problem...
            Sorry, I haven't been around this forum for a while and missed this thread.

            Arvid is likely correct - the problem looks like it's caused by very short sequences in the adapter FASTA (<16 bases).

            While short adapter sequences are usually not a good idea anyway (since they will likely have a high false positive rate and match good data), the latest Trimmomatic version, 0.22, should resolve this problem. I've just uploaded it to the website.
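
For anyone who wants to check an adapter file for short sequences before running, here is a minimal sketch. It is not Trimmomatic's own parser; `read_fasta` and `short_records` are hypothetical helpers, and the 16-base threshold is simply the figure mentioned above. Note that a sequence line that accidentally starts with '>' is read as a header, which leaves the record before it empty.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, even if it had no sequence.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def short_records(lines, min_length=16):
    """Return (name, length) for every record shorter than min_length."""
    return [(name, len(seq)) for name, seq in read_fasta(lines) if len(seq) < min_length]


# Made-up sample, including a sequence line that wrongly starts with '>'.
sample = [
    ">ok_adapter",
    "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    ">too_short",
    "ACGTACGT",
    ">empty_record",
    ">ACGTACGTACGTACGTACGT",
    ">another_ok",
    "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG",
]
print(short_records(sample))
```

To run it on a real file, pass an open file handle instead of the list, for example `short_records(open("contaminants.fa"))`.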


            • #7
              Thanks. My contaminants.fasta has only the standard Illumina PE adapters, so nothing is shorter than 16 bases:
              >adapter1
              GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
              >adapter2
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PCRPrimer1
              AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PRCPrimer2
              CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
              >GenomicDNASequencingPrimer
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PEAdapter1
              >GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
              >PEAdapter2
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PEPCRPrimers1
              AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PEprimers2
              CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
              >PESequencingPrimer
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PESeqprimer2
              GGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
              However, thank you!!!

              I've tried running Trimmomatic 0.22, and it works, with the same contaminants.fa, no more errors, on our server.


              • #8
                Hi guys, I had a question about Trimmomatic and trimming in general. Is the standard to trim bases below Phred Score 20, more or less? I have 100 bp PE reads, and the average of each base quality is over 30, but the first 5 bases are around 31 while after that it rises to an average of 35. I am using an Illumina HiSeq 1000. So is the way to go about this to use Trimmomatic with Sliding Window size of 1 and Phred Scores of 20? When I ran it without any trimming, I got 90% mapped reads, but I'm worried that since I didn't trim, I had a lot of mismatched assemblies. What do you guys think? Any recommendations?


                • #9
                  More or less, yes. Some programs work very well without trimming and there are arguments for not doing so -- takes extra time and disk space which could be used for the mapping process instead. Your sequences sound like they have very good quality so there is probably no overriding need to trim them. Personally I just chop off the 5' and 3' ends that don't match the q20 limit. Internal poor quality is something that does not worry me.
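
The "chop off the 5' and 3' ends below q20" approach can be sketched as follows. This is only an illustration of the idea, assuming Phred+33 qualities; `trim_ends` is a hypothetical helper, and in Trimmomatic itself the roughly equivalent steps would be LEADING:20 TRAILING:20.

```python
def trim_ends(sequence, quality, min_quality=20, offset=33):
    """Remove bases below min_quality from both ends; internal bases are left alone."""
    scores = [ord(ch) - offset for ch in quality]
    start, end = 0, len(scores)
    # Walk in from the 5' end while the base quality is below the limit.
    while start < end and scores[start] < min_quality:
        start += 1
    # Walk in from the 3' end in the same way.
    while end > start and scores[end - 1] < min_quality:
        end -= 1
    return sequence[start:end], quality[start:end]


# '#' is Q2, '5' is Q20 and 'I' is Q40 under Phred+33.
print(trim_ends("NACGTACGTN", "#5IIIIII5#"))
```

A read whose bases are all below the limit comes back empty, which is why a minimum-length filter such as MINLEN is usually applied afterwards.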


                  • #10
                    Thanks Rick! I was thinking it was fine, and last time I trimmed the first 5 bases, it just took forever. So would you still recommend that I use Trimmomatic to trim the ends based on quality score lower than 20 for the few reads that do have that? Or since the averages are over 30, is it not really necessary? Cause like you mentioned, it really does take a lot of time...


                    • #11
                      It is your data but from my plant and animal background where 90% is something to die for ... I say no trimming is needed. You have good data!


                      • #12
                        Originally posted by billstevens
                        Thanks Rick! I was thinking it was fine, and last time I trimmed the first 5 bases, it just took forever. So would you still recommend that I use Trimmomatic to trim the ends based on quality score lower than 20 for the few reads that do have that? Or since the averages are over 30, is it not really necessary? Cause like you mentioned, it really does take a lot of time...
                        Quality trimming is not so critical for alignment, especially for quality-aware aligners. I'd still recommend adapter trimming though, since even a moderate read-through (say 10bp) can cause the read to be lost. And with 100bp, you can easily afford to cut off 30-40bp of adapter, and still map uniquely (most of the time). Then again, at 90% mapping, you're probably golden either way - I like mining data closer to the limit on principle though.

                        As for lack of performance, have you tried multi-threading? Trimmomatic throughput should scale until the bottleneck is either disk IO or (un-)zipping of the input/output on most machines.


                        • #13
                          I tried using the new Trimmomatic and found this error:
                          java -classpath trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE -phred64 ~/Oiko_otogenetics/index45_TCATTC_L001_R1_001.fastq ~/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq ~/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_PAIRED_1 ~/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_UNPAIRED_1 ~/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_PAIRED_2 ~/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_UNPAIRED_2 ILLUMINACLIP:~/Oiko_otogenetics/contaminants.txt:2:40:15 LEADING:15 TRAILING:15 SLIDINGWINDOW:5:18 MINLEN:36
                          TrimmomaticPE: Started with arguments: -phred64 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R1_001.fastq /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_PAIRED_1 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_UNPAIRED_1 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_PAIRED_2 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_UNPAIRED_2 ILLUMINACLIP:~/Oiko_otogenetics/contaminants.txt:2:40:15 LEADING:15 TRAILING:15 SLIDINGWINDOW:5:18 MINLEN:36
                          Exception in thread "main" java.io.FileNotFoundException: ~/Oiko_otogenetics/contaminants.txt (No such file or directory)
                          at java.io.FileInputStream.open(Native Method)
                          at java.io.FileInputStream.<init>(FileInputStream.java:106)
                          at org.usadellab.trimmomatic.fasta.FastaParser.parse(FastaParser.java:55)
                          at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:68)
                          at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.<init>(IlluminaClippingTrimmer.java:45)
                          at org.usadellab.trimmomatic.fastq.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:29)
                          at org.usadellab.trimmomatic.TrimmomaticPE.main(TrimmomaticPE.java:335)
                          This is what my contaminants file looks like:
                          >Illumina_Paired_End_Adapter_1
                          ACACTCTTTCCCTACACGACGCTCTTCCGATCT
                          >Illumina_Paired_End_Adapter_2
                          CTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
                          >Illumina_Paried_End_PCR_Primer_1
                          AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
                          >Illumina_Paired_End_PCR_Primer_2
                          CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
                          >Illumina_Paried_End_Sequencing_Primer_1
                          ACACTCTTTCCCTACACGACGCTCTTCCGATCT
                          >Illumina_Paired_End_Sequencing_Primer_2
                          CGGTCTCGGCATTCCTACTGAACCGCTCTTCCGATCT
                          What do I do???? Please help


                          • #14
                            I tried moving the contaminants file to the dir Trimmomatic-0.22/ and it is working. Thanks.


                            • #15
                              I think the problem is not specifying a complete path. Using '~' as a short-cut to your home directory does not work with all programs.
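
The "Started with arguments" line above shows the detail: the shell expanded '~' in the file arguments where it is the first character, but left it untouched inside ILLUMINACLIP:~/..., where it sits in the middle of the argument. A minimal sketch of building that step with a full path, for anyone launching Trimmomatic from a script; `illuminaclip_step` and its parameter names are hypothetical, and the default values are just the ones used in the commands in this thread.

```python
import os


def illuminaclip_step(adapter_path, seed_mismatches=2,
                      palindrome_threshold=40, simple_threshold=15):
    """Build an ILLUMINACLIP step string with '~' expanded to a full path."""
    # expanduser replaces a leading '~' with the home directory;
    # abspath then turns any relative path into an absolute one.
    full_path = os.path.abspath(os.path.expanduser(adapter_path))
    return (f"ILLUMINACLIP:{full_path}:{seed_mismatches}:"
            f"{palindrome_threshold}:{simple_threshold}")


print(illuminaclip_step("~/Oiko_otogenetics/contaminants.txt"))
```

On the command line, writing the path out in full or using $HOME in place of '~' should have the same effect, since the shell substitutes variables anywhere in an argument.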
