  • Trimmomatic error

    Hi! I'm trying to run trimmomatic, and running into the following error:

    My input:
    java -classpath $BIN/trimmomatic/Trimmomatic-0.20/trimmomatic-0.20.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 9_1.fastq 9_2.fastq 9_1.fastq_PAIRED_1 9_1.fastq_UNPAIRED_1 9_2.fastq_PAIRED_2 9_2.fastq_UNPAIRED_2 ILLUMINACLIP:contaminants.fa:2:40:15 LEADING:3 TRAILING:3 SLIDINGWINDOW:5:15 MINLEN:36
    And the output:
    Exception in thread "main" java.lang.NegativeArraySizeException
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.packSeq(IlluminaClippingTrimmer.java:578)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:548)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:538)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.mapClippingSet(IlluminaClippingTrimmer.java:125)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:111)
    at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.<init>(IlluminaClippingTrimmer.java:43)
    at org.usadellab.trimmomatic.fastq.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:26)
    at org.usadellab.trimmomatic.TrimmomaticPE.main(TrimmomaticPE.java:335)
    How do I fix this????
    Thanks in advance!

  • #2
    I have tried this on another server, with another data file, from the Trimmomatic directory itself. It is still not working.


    • #3
      Looks like a good command line. As a guess, I am going to say that your input files have a different quality encoding than 'phred33'. Part of this guess is the 'negative array' message. Try with 'phred64' (or nothing, which should default to phred64).


      • #4
        Exception in thread "main" java.lang.NegativeArraySizeException
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.packSeq(IlluminaClippingTrimmer.java:578)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:548)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer$IlluminaClippingSeq.<init>(IlluminaClippingTrimmer.java:538)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.mapClippingSet(IlluminaClippingTrimmer.java:125)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:111)
        at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.<init>(IlluminaClippingTrimmer.java:43)
        at org.usadellab.trimmomatic.fastq.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:26)
        at org.usadellab.trimmomatic.TrimmomaticPE.main(TrimmomaticPE.java:335)
        Same thing, I'm afraid. And the quality scores are from Casava 1.8.2, so I know for a fact they're Phred+33 (or, more accurately they've got J's in them as well, so they're in the (0,41) range). For example:
        @ILLUMINA:2430TC9ACXX:1:1101:1470:2047 1:N:0:ACAGTG
        NAGTTATTTGCCTCTTTGAAGCGTTTTCCAACAGTATAGATCTCATGAATCAAATCCTCCATGCAGATGATGCCGNATTNNCCAAGAGATCGAGCAATCAA
        +
        #1=DDFFFHHHHHJJJJJJIJIJIJJJIHIJJJJIJJJJIJJJJIJJJJJJJJJJJIIJHIJJFIIIJIJJJIJH#-;B##,,=ADDDDCDDDDBDDDCDD
        @ILLUMINA:2430TC9ACXX:1:1101:1500:2115 1:N:0:ACAGTG
        CGCCATACAGCAGGAATGGGAGCTGCCCCCCTGGGCACAGCTTCTGCACTGTCTCGGTCCGCCTTTTGGTGTCAACGGTGGTAACATTGAAGGTGACTCCC
        +
        CCCFFFFEHHFFFGBBHIJIEFHIIGGIJJGI?FHFGIJGIIJE@;FGHGIHGIHHG;B?>CCDBDC58,89>55:;<2295>C@>CC@@CC?3:44>4@?
        @ILLUMINA:2430TC9ACXX:1:1101:1401:2131 1:N:0:ACAGTG
        CTGGACTTGCTGGCTTCCCTGAAACGGAGAGAGCGAGAGGAGAAGGACGATGGGGAGGACAAGAAGAAGTCCAAAGTCTCCTCCTACAAGGACTGGGAAGA
        +
        @CCFDEFFHHHHHHIJGIJJJGIJIHJ?E@F@F9FAD@FGBGBCFGACG@CHHHF?BD;=?AB?AA>AA>>CCDC<:4>@C@CC?ACDC@ABBDDDD<>@8
        I wonder, actually, if it's the extra "J" that could be the problem...


        • #5
          Phred scores should allow the full 0 to 93 range. In any case, following the stack trace, I would bet that it has problems with your contaminants.fa file. What does that file look like?
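To rule the quality encoding in or out quickly, the offset can be guessed from a few quality lines. The following is only a minimal sketch: the function name `guess_phred_offset` and both thresholds are assumptions chosen for illustration, not anything taken from Trimmomatic.

```python
def guess_phred_offset(quality_lines):
    """Guess the FASTQ quality offset (33 or 64) from raw quality strings.

    Heuristic thresholds (assumptions, not taken from Trimmomatic):
    - a character below ';' (ASCII 59) cannot occur in +64 encoded data
    - a character above 'J' (ASCII 74) is unusual for Illumina Phred+33 data
    Returns None when the sample does not settle the question.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    if min(codes) < 59:
        return 33
    if max(codes) > 74:
        return 64
    return None


# Start of the first quality line posted in #4.
sample = ["#1=DDFFFHHHHHJJJJJJIJIJIJJJIHIJJJJIJJJJ"]
print(guess_phred_offset(sample))
```

The `#` characters in the posted reads fall below the +64 range, which agrees with the Casava 1.8.2 reads being Phred+33.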


          • #6
            Originally posted by dvanic
            Same thing, I'm afraid. And the quality scores are from Casava 1.8.2, so I know for a fact they're Phred+33 (or, more accurately they've got J's in them as well, so they're in the (0,41) range). For example:

            I wonder, actually, if it's the extra "J" that could be the problem...
            Sorry, I haven't been around this forum for a while and missed this thread.

            Arvid is likely correct: the problem looks like it's caused by very short sequences in the adapter FASTA (<16 bases).

            While short adapter sequences are usually not a good idea anyway (since they will likely have a high false positive rate and match good data), the latest Trimmomatic version, 0.22, should resolve this problem. I've just uploaded it to the website.


            • #7
              Thanks. My contaminants.fasta has only the standard Illumina PE adapters, so nothing is shorter than 16 bases:
              >adapter1
              GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
              >adapter2
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PCRPrimer1
              AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PRCPrimer2
              CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
              >GenomicDNASequencingPrimer
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PEAdapter1
              >GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
              >PEAdapter2
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PEPCRPrimers1
              AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PEprimers2
              CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
              >PESequencingPrimer
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PESeqprimer2
              GGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
              However, thank you!!!

              I've tried running Trimmomatic 0.22, and it works, with the same contaminants.fa, no more errors, on our server.
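For anyone hitting the same exception on an older version, it may be worth checking the adapter file mechanically rather than by eye. In the listing above, the line after `>PEAdapter1` also starts with `>`. If the file really looks like that (and it is not just a paste artifact), a FASTA parser would treat that line as a header and leave `PEAdapter1` with no sequence, which would fit the short-sequence explanation. The sketch below is a hypothetical helper (`short_adapter_records` is not part of Trimmomatic) that lists records shorter than a chosen minimum, using the 16-base figure mentioned in #6.

```python
def short_adapter_records(fasta_text, min_len=16):
    """Return (name, length) for every FASTA record shorter than min_len.

    Any line starting with '>' is treated as a header, so a sequence line
    that accidentally starts with '>' shows up as an empty record.
    """
    records = []  # each entry is [name, accumulated sequence length]
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], 0])
        elif records:
            records[-1][1] += len(line)
    return [(name, length) for name, length in records if length < min_len]


# Four lines copied from the adapter listing in this post.
example = """>PEAdapter1
>GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
>PEAdapter2
ACACTCTTTCCCTACACGACGCTCTTCCGATCT
"""
for name, length in short_adapter_records(example):
    print(name, length)
```

To check a real file, read it with `open(path).read()` and pass the text in. An empty result means every record has at least `min_len` bases.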


              • #8
                Hi guys, I had a question about Trimmomatic and trimming in general. Is the standard to trim bases below Phred Score 20, more or less? I have 100 bp PE reads, and the average of each base quality is over 30, but the first 5 bases are around 31 while after that it rises to an average of 35. I am using an Illumina HiSeq 1000. So is the way to go about this to use Trimmomatic with Sliding Window size of 1 and Phred Scores of 20? When I ran it without any trimming, I got 90% mapped reads, but I'm worried that since I didn't trim, I had a lot of mismatched assemblies. What do you guys think? Any recommendations?


                • #9
                  More or less, yes. Some programs work very well without trimming and there are arguments for not doing so -- takes extra time and disk space which could be used for the mapping process instead. Your sequences sound like they have very good quality so there is probably no overriding need to trim them. Personally I just chop off the 5' and 3' ends that don't match the q20 limit. Internal poor quality is something that does not worry me.
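To make "chop off the 5' and 3' ends that don't match the q20 limit" concrete, here is a minimal sketch of end trimming on a single read. It is an illustration of the idea behind steps such as LEADING:20 and TRAILING:20, not Trimmomatic's own code. The function name `trim_ends` and its defaults are assumptions.

```python
def trim_ends(seq, qual, min_q=20, offset=33):
    """Drop bases below min_q from both ends of a read; leave the middle alone."""
    scores = [ord(ch) - offset for ch in qual]
    start = 0
    # Walk in from the 5' end while quality is below the threshold.
    while start < len(scores) and scores[start] < min_q:
        start += 1
    end = len(scores)
    # Walk in from the 3' end, never crossing the 5' cut point.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


# First eight bases and qualities of the first read posted in #4.
print(trim_ends("NAGTTATT", "#1=DDFFF"))
```

Low-quality bases in the interior of the read are left in place, which matches the approach described in this post.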


                  • #10
                    Thanks Rick! I was thinking it was fine, and last time I trimmed the first 5 bases, it just took forever. So would you still recommend that I use Trimmomatic to trim the ends based on quality score lower than 20 for the few reads that do have that? Or since the averages are over 30, is not really necessary? Cause like you mentioned, it really does take a lot of time...


                    • #11
                      It is your data but from my plant and animal background where 90% is something to die for ... I say no trimming is needed. You have good data!


                      • #12
                        Originally posted by billstevens
                        Thanks Rick! I was thinking it was fine, and last time I trimmed the first 5 bases, it just took forever. So would you still recommend that I use Trimmomatic to trim the ends based on quality score lower than 20 for the few reads that do have that? Or since the averages are over 30, is not really necessary? Cause like you mentioned, it really does take a lot of time...
                        Quality trimming is not so critical for alignment, especially for quality-aware aligners. I'd still recommend adapter trimming though, since even a moderate read-through (say 10bp) can cause the read to be lost. And with 100bp, you can easily afford to cut off 30-40bp of adapter, and still map uniquely (most of the time). Then again, at 90% mapping, you're probably golden either way. I like mining data closer to the limit on principle though.

                        As for lack of performance, have you tried multi-threading? Trimmomatic throughput should scale until the bottleneck is either disk IO or (un-)zipping of the input/output on most machines.
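A sketch of how a multi-threaded paired-end call could be assembled, assuming your Trimmomatic version accepts a `-threads` option before the quality flag (check the usage message of your installed version). The helper `build_trimmomatic_pe_args` and the thread count are hypothetical. The class name, file names and steps are the ones from the first post.

```python
def build_trimmomatic_pe_args(jar, threads, inputs, outputs, steps, phred="-phred33"):
    """Build the argument list for a paired-end Trimmomatic run.

    inputs:  the 2 input FASTQ files (forward, reverse)
    outputs: the 4 output files (paired/unpaired for each input)
    steps:   trimming steps in the order they should run
    """
    if len(inputs) != 2 or len(outputs) != 4:
        raise ValueError("paired-end mode needs 2 input and 4 output files")
    return [
        "java", "-classpath", jar,
        "org.usadellab.trimmomatic.TrimmomaticPE",
        "-threads", str(threads),  # assumed flag, see the note above
        phred,
        *inputs, *outputs, *steps,
    ]


args = build_trimmomatic_pe_args(
    jar="trimmomatic-0.22.jar",
    threads=4,  # illustrative value
    inputs=["9_1.fastq", "9_2.fastq"],
    outputs=["9_1.fastq_PAIRED_1", "9_1.fastq_UNPAIRED_1",
             "9_2.fastq_PAIRED_2", "9_2.fastq_UNPAIRED_2"],
    steps=["ILLUMINACLIP:contaminants.fa:2:40:15", "MINLEN:36"],
)
print(" ".join(args))
```

The list can be passed to `subprocess.run(args, check=True)`. Keeping only ILLUMINACLIP and MINLEN in `steps` gives an adapter-only run, in line with the advice above.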


                        • #13
                          I tried using the new Trimmomatic and found this error:
                          java -classpath trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE -phred64 ~/Oiko_otogenetics/index45_TCATTC_L001_R1_001.fastq ~/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq ~/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_PAIRED_1 ~/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_UNPAIRED_1 ~/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_PAIRED_2 ~/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_UNPAIRED_2 ILLUMINACLIP:~/Oiko_otogenetics/contaminants.txt:2:40:15 LEADING:15 TRAILING:15 SLIDINGWINDOW:5:18 MINLEN:36
                          TrimmomaticPE: Started with arguments: -phred64 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R1_001.fastq /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_PAIRED_1 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R1.fastq_UNPAIRED_1 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_PAIRED_2 /Users/mparida/Oiko_otogenetics/index45_TCATTC_L001_R2_001.fastq_UNPAIRED_2 ILLUMINACLIP:~/Oiko_otogenetics/contaminants.txt:2:40:15 LEADING:15 TRAILING:15 SLIDINGWINDOW:5:18 MINLEN:36
                          Exception in thread "main" java.io.FileNotFoundException: ~/Oiko_otogenetics/contaminants.txt (No such file or directory)
                          at java.io.FileInputStream.open(Native Method)
                          at java.io.FileInputStream.<init>(FileInputStream.java:106)
                          at org.usadellab.trimmomatic.fasta.FastaParser.parse(FastaParser.java:55)
                          at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:68)
                          at org.usadellab.trimmomatic.fastq.trim.IlluminaClippingTrimmer.<init>(IlluminaClippingTrimmer.java:45)
                          at org.usadellab.trimmomatic.fastq.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:29)
                          at org.usadellab.trimmomatic.TrimmomaticPE.main(TrimmomaticPE.java:335)
                          This is what my contaminants file looks like:
                          >Illumina_Paired_End_Adapter_1
                          ACACTCTTTCCCTACACGACGCTCTTCCGATCT
                          >Illumina_Paired_End_Adapter_2
                          CTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
                          >Illumina_Paried_End_PCR_Primer_1
                          AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
                          >Illumina_Paired_End_PCR_Primer_2
                          CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
                          >Illumina_Paried_End_Sequencing_Primer_1
                          ACACTCTTTCCCTACACGACGCTCTTCCGATCT
                          >Illumina_Paired_End_Sequencing_Primer_2
                          CGGTCTCGGCATTCCTACTGAACCGCTCTTCCGATCT
                          What do I do???? Please help


                          • #14
                            I tried moving the contaminants file to the dir Trimmomatic-0.22/ and it is working. Thanks.


                            • #15
                              I think the problem is not specifying a complete path. Using '~' as a short-cut to your home directory does not work with all programs.
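The "Started with arguments" line in #13 shows what happened: the shell expanded `~` in the input and output paths to `/Users/mparida/...`, but left it untouched inside `ILLUMINACLIP:~/...`, because there the `~` is not at the start of the word. If the command is built from a script, the path can be expanded before it is passed on. A minimal sketch, where the helper name `illuminaclip_step` and its parameter names are assumptions for illustration:

```python
import os


def illuminaclip_step(adapter_path, seed_mismatches=2,
                      palindrome_threshold=40, simple_threshold=15):
    """Build an ILLUMINACLIP step with '~' expanded to a full path."""
    # expanduser replaces a leading '~' with the home directory;
    # abspath then resolves any remaining relative path.
    full_path = os.path.abspath(os.path.expanduser(adapter_path))
    return (f"ILLUMINACLIP:{full_path}:{seed_mismatches}:"
            f"{palindrome_threshold}:{simple_threshold}")


print(illuminaclip_step("~/Oiko_otogenetics/contaminants.txt"))
```

On the command line, writing the path out in full, or using `$HOME` in place of `~`, has the same effect.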
