
  • #16
    Originally posted by westerman View Post
    I think the problem is not specifying a complete path. Using '~' as a short-cut to your home directory does not work with all programs.
    Indeed the ~ is the problem.

    Expanding '~' is actually done by the shell and, simplifying somewhat, the shell only performs the expansion when the tilde is at the start of a whitespace-delimited 'word'. That is the case for the input/output file paths, but not for the adapter path (because of the ILLUMINACLIP: prefix).
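
To see the difference for yourself, here is a minimal sketch that can be pasted into a POSIX-style shell. The file name adapters.fa is only a placeholder, not a file from this thread. The tilde after the colon stays literal, while $HOME is expanded wherever it appears in the word.

```shell
# '~' is expanded only at the start of a word, so it survives untouched
# when it follows "ILLUMINACLIP:". adapters.fa is a placeholder name.
unexpanded=$(echo ILLUMINACLIP:~/adapters.fa:2:40:15)

# "$HOME" is a variable, so it is expanded anywhere in the word.
expanded=$(echo ILLUMINACLIP:"$HOME"/adapters.fa:2:40:15)

echo "$unexpanded"
echo "$expanded"
```

So writing $HOME (or the full path) inside the ILLUMINACLIP argument should avoid the problem.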

    Comment


    • #17
      Hi,

      I'm trying to run Trimmomatic and get this error:


      Exception in thread "main" java.io.FileNotFoundException: illuminaClipping.fa (No such file or directory)

      Here is my input:

      java -verbose -classpath /ichec/work/nglif015b/Trimmomatic-0.22/trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE -threads 12 -phred33 /ichec/work/nglif015b/rawdata/batch1/1.control_1.fastq /ichec/work/nglif015b/rawdata/batch1/1.control_2.fastq 1.control.pe_1 1.control.up_1 1.control.pe_2 1.control.up_2 ILLUMINACLIP:illuminaClipping.fa:2:40:15 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

      Any interpretations would be hugely appreciated!

      Best,

      N

      Comment


      • #18
        (Sorry if I'm hijacking this thread!)

        Comment


        • #19
          Originally posted by nr23 View Post
          Hi,

          I'm trying to run Trimmomatic and get this error:

          Exception in thread "main" java.io.FileNotFoundException: illuminaClipping.fa (No such file or directory)
          It seems it can't find the illuminaClipping.fa file, which should contain the adapters you would like trimmed - you will need to specify the full path to this file if it is not in the current directory when the command is run.
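
A small sketch of why the bare file name fails: a relative name is resolved against the directory the command runs in. The ref/ and run/ directories and the dummy adapter file below are made up for the demonstration, and check is a hypothetical helper, not part of Trimmomatic.

```shell
# Throwaway layout: the adapter file lives in ref/, the command runs in run/.
workdir=$(mktemp -d)
mkdir "$workdir/ref" "$workdir/run"
printf '>adapter\nACGT\n' > "$workdir/ref/illuminaClipping.fa"

# Hypothetical helper: report whether a path is readable from the current directory.
check() { if [ -r "$1" ]; then echo found; else echo missing; fi; }

relative=$(cd "$workdir/run" && check illuminaClipping.fa)
absolute=$(cd "$workdir/run" && check "$workdir/ref/illuminaClipping.fa")

echo "bare file name: $relative"
echo "full path:      $absolute"
rm -rf "$workdir"
```

The same test ([ -r /full/path/to/file ]) can be run on the real adapter file before starting a long job.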

          Comment


          • #20
            Ah OK. I don't actually need to clip any adapters so can get rid of ILLUMINACLIP:illuminaClipping.fa:2:40:15 entirely.

            Many thanks!

            N

            Comment


            • #21
              Hi all, I have tried Trimmomatic using the following command line:
              Code:
              java -classpath /path/to/Trimmomatic-0.22/trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE  -phred33   /path/to/Sample_H-xi.fq1 /path/to/Sample_H-xi.fq2 /path/to/Sample_H-xi.fq1.pair /path/to/Sample_H-xi.fq1.unpair  /path/to/Sample_H-xi.fq2.pair /path/to/Sample_H-xi.fq2.unpair  ILLUMINACLIP:/path/to/adaptor.PEdna.fa:2:6:6 SLIDINGWINDOW:4:15 MINLEN:40
              I ran this command on 10,000 test paired-end reads, and the output is:
              Code:
              TrimmomaticPE: Started with arguments: -phred33   /path/to/Sample_H-xi.fq1 /path/to/Sample_H-xi.fq2 /path/to/Sample_H-xi.fq1.pair /path/to/Sample_H-xi.fq1.unpair  /path/to/Sample_H-xi.fq2.pair /path/to/Sample_H-xi.fq2.unpair  ILLUMINACLIP:/path/to/adaptor.PEdna.fa:2:6:6 SLIDINGWINDOW:4:15 MINLEN:40
              Using Clipping Sequence: 'ACACTCTTTCCCTACACGACGCTCTTCCGATCT'
              Using Clipping Sequence: 'GATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
              Using Clipping Sequence: 'GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG'
              Using Clipping Sequence: 'CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT'
              Using Clipping Sequence: 'CGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT'
              Using Clipping Sequence: 'GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG'
              Using Clipping Sequence: 'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT'
              Using Clipping Sequence: 'ACACTCTTTCCCTACACGACGCTCTTCCGATCT'
              ILLUMINACLIP: Using 0 prefix pairs, 8 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
              Input Read Pairs: 10000 Both Surviving: 9271 (92.71%) Forward Only Surviving: 159 (1.59%) Reverse Only Surviving: 499 (4.99%) Dropped: 71 (0.71%)
              TrimmomaticPE: Completed successfully
              The adaptor sequence file is:
              Code:
              >Adapters1_5
              GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
              >adapters1_3
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >pcr_primer_1.01
              AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >pcr_primer_2.01
              CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
              >sequecing_primer_1
              ACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >sequencing_primer_2
              CGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
              >multi_index_read_seq_primer
              GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
              >TruSeqAdapter_7
              GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG
              Although I have set the "<palindrome clip threshold>:<simple clip threshold>" values to a very low 6, the output file still contains reads with adaptor sequence at the end, like this:
              Code:
              @HISEQ700708:102:C09WVACXX:6:1101:6041:2237#CAGATC/1
              AATCACTACGCATGTATTGTATTCAATTTTGATCATTCATGGAGAAATATTACGAATTGTTGTGTACAGAAAATTTCGACAACTTAGAGATCGGAAGAGCA
              +
              CCCFFFFFHGHHHJIJJJJIJJJJJIJJJJJJJJJJJJJHIJIIJJJJJJJIIIHIJJJHIJHHIJJIJJJJJJHEHHHFFDDEEEEDDDDDDDDDDDBDD
              another related thread is:
              Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


              Any suggestions are appreciated!

              -P
              Last edited by pengchy; 01-16-2013, 01:38 AM. Reason: add another related thread link

              Comment


              • #22
                Originally posted by pengchy View Post
                Although I have set the "<palindrome clip threshold>:<simple clip threshold>" values to a very low 6, the output file still contains reads with adaptor sequence at the end, like this:
                Code:
                @HISEQ700708:102:C09WVACXX:6:1101:6041:2237#CAGATC/1
                AATCACTACGCATGTATTGTATTCAATTTTGATCATTCATGGAGAAATATTACGAATTGTTGTGTACAGAAAATTTCGACAACTTAGAGATCGGAAGAGCA
                +
                CCCFFFFFHGHHHJIJJJJIJJJJJIJJJJJJJJJJJJJHIJIIJJJJJJJIIIHIJJJHIJHHIJJIJJJJJJHEHHHFFDDEEEEDDDDDDDDDDDBDD
                another related thread is:
                Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                Any suggestions are appreciated!
                There are a few 'features' of trimmomatic which might help explain what's happening here.

                Trimmomatic doesn't check for reverse-complement versions of the provided contaminant sequences. In general, the read and orientation in which a 'contaminant' can appear are known, and this information can be used to narrow the search and reduce the risk of false positives.

                Also, this looks like a read-through situation, which is specifically what palindrome mode was designed to find - but it is critical to use the correct sequences and naming in this mode. I've recently been given permission to distribute these sequences, so they will be included in future releases, hopefully reducing these kinds of problems.

                The recommended sequences for TruSeq3 are:
                Code:
                >PrefixPE/1
                TACACTCTTTCCCTACACGACGCTCTTCCGATCT
                >PrefixPE/2
                GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
                Hope this helps,

                Tony.
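
One way to check the read-through explanation by hand (a sketch only, not something Trimmomatic does for you) is to reverse-complement PrefixPE/2 and compare its start with the end of the forward read from post #21. The revcomp helper is a made-up name and uses only POSIX shell and tr.

```shell
# Hypothetical helper: reverse a DNA string one base at a time, then complement it.
revcomp() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}              # everything after the first base
        out=${s%"$rest"}$out     # prepend the first base to the result
        s=$rest
    done
    printf '%s\n' "$out" | tr ACGT TGCA
}

prefix_pe2=GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
# Forward read from post #21.
read1=AATCACTACGCATGTATTGTATTCAATTTTGATCATTCATGGAGAAATATTACGAATTGTTGTGTACAGAAAATTTCGACAACTTAGAGATCGGAAGAGCA
tail14=AGATCGGAAGAGCA            # last 14 bases of read1

rc=$(revcomp "$prefix_pe2")

# Does the read end with tail14, and does the reverse complement start with it?
case "$read1" in *"$tail14") read_ends=yes ;; *) read_ends=no ;; esac
case "$rc" in "$tail14"*) rc_starts=yes ;; *) rc_starts=no ;; esac

echo "revcomp(PrefixPE/2) = $rc"
echo "read ends with $tail14: $read_ends; reverse complement starts with it: $rc_starts"
```

The read ends with the first 14 bases of the reverse-complemented PrefixPE/2, which fits the read-through picture described above.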

                Comment


                • #23
                  Thanks Tony, it works. The reverse reads of the read-through pairs have been filtered out. Many thanks.

                  Best,
                  Pengcheng

                  Comment


                  • #24
                    I am trying to trim my amplicon-based sequencing data with Trimmomatic, and I am having a hard time figuring out how to input my primer sequences for trimming. I use HEADCROP:20 to remove the primers; however, in cases where there is read-through, I cannot get Trimmomatic to remove them. I do not want to use palindrome mode, since I want the reverse read preserved. I am concatenating my primers with adapter sequences to make sure that amplicon ends are trimmed (e.g. PE1+F/1 and PE2+R/2, where F and R are primer sequences in 5'-3' orientation; PE1 and PE2 are the same sequences as PrefixPE/1 and 2). I have overlapping amplicons, so the very same primer sequence may lie within another amplicon, where it should not be trimmed. Does anyone have a working example of how to input the primer sequences? Any thoughts, anyone?

                    Thanks a lot

                    Comment


                    • #25
                      Originally posted by Bioblue View Post
                      I am trying to trim my amplicon-based sequencing data with Trimmomatic, and I am having a hard time figuring out how to input my primer sequences for trimming. I use HEADCROP:20 to remove the primers; however, in cases where there is read-through, I cannot get Trimmomatic to remove them. I do not want to use palindrome mode, since I want the reverse read preserved. I am concatenating my primers with adapter sequences to make sure that amplicon ends are trimmed (e.g. PE1+F/1 and PE2+R/2, where F and R are primer sequences in 5'-3' orientation; PE1 and PE2 are the same sequences as PrefixPE/1 and 2). I have overlapping amplicons, so the very same primer sequence may lie within another amplicon, where it should not be trimmed. Does anyone have a working example of how to input the primer sequences?
                      Hi,

                      I'm not sure I understand the requirements fully, but if you can email me (address on the Trimmomatic website) an example of some reads you want trimmed, the primer sequences, etc., I should be able to tell you if and how you can achieve what you want.

                      BTW, the latest version of Trimmomatic (to be released soon) has the option to preserve both reads from a read-through pair.

                      Comment


                      • #26
                        Hi Folks:

                        I was wondering about the -phred option. I was told that a quality score above 20 should be OK for de novo assembly reads, but I could not get Trimmomatic to accept the option -phred20.
                        Does anyone know why?

                        Comment


                        • #27
                          From what I understand, the -phred option tells the program how your quality scores are encoded, either -phred33 or -phred64. If you want to trim using a quality threshold of 20, you need to use the SLIDINGWINDOW option.

                          Hope this was helpful.
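
To make the distinction concrete, here is a small sketch. With Phred+33 encoding, each quality character stores a score as its ASCII code minus 33, and -phred33 only tells Trimmomatic about that offset. The phred33 helper is a made-up name, and the file names in the commented command are placeholders.

```shell
# Hypothetical helper: decode one Phred+33 quality character into its numeric score.
phred33() { echo $(( $(printf '%d' "'$1") - 33 )); }

q_C=$(phred33 C)
q_J=$(phred33 J)
echo "C -> Q$q_C, J -> Q$q_J"

# The quality threshold of 20 belongs in a trimming step, not in -phred:
# java -jar trimmomatic-0.32.jar SE -phred33 in.fastq out.fastq SLIDINGWINDOW:4:20 MINLEN:36
```

In the commented command, SLIDINGWINDOW:4:20 cuts the read once the average quality in a 4-base window drops below 20.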

                          Comment


                          • #28
                            Thank you so much for your reply.
                            I think I finally figured it out.

                            Best wishes
                            Originally posted by mariruilo View Post
                            From what I understand, the -phred option tells the program how your quality scores are encoded, either -phred33 or -phred64. If you want to trim using a quality threshold of 20, you need to use the SLIDINGWINDOW option.

                            Hope this was helpful.

                            Comment


                            • #29
                              Some trouble using Trimmomatic

                              I am using Trimmomatic to clean my paired-end reads, which were prepared with the ScriptSeq-v2 RNA-Seq Library Preparation Kit.
                              My adaptors file is formatted as:

                              >PrefixPE/1
                              CAAGCAGAAGACGGCATACGAGAT
                              >PrefixPE/2
                              GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
                              >PE/1
                              CAAGCAGAAGACGGCATACGAGAT
                              >PE/1_rc
                              ATCTCGTATGCCGTCTTCTGCTTG
                              >PE/2
                              GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
                              >PE/2_rc
                              AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC

                              When I run:

                              java -jar trimmomatic-0.32.jar PE -threads 2 -phred33 -trimlog r.log 1.fastq 2.fastq Out_paired_1.fastq Out_unpaired_1.fastq Out_paired_2.fastq Out_unpaired_2.fastq ILLUMINACLIP:adapters.fa:2:30:27 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50

                              my sequences are not cleaned.

                              The input files I use:

                              1.fastq
                              @HWI-D00562:33:C67RBANXX:7:1101:12177:2410 1:N:0:ACAGTG
                              GACCGCGGTTCTATTTTGTTGGTTTTCGGAACTGAGGCCATGATTAAGAGGAACTAGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTGAAAAAA
                              +
                              BBBCCGGGGGGGGGGGGEGGGGGGGGGGFGGGGGGGGGGGGGGDEGGGGGGGGGGGGGEGGGGDGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG@DBFEGGGGGGGGGGGGGGGGGGGGGGGGG

                              2.fastq
                              @HWI-D00562:33:C67RBANXX:7:1101:12177:2410 2:N:0:ACAGTG
                              AGTTCCTCTTAATCATGGCCTCAGTTCCGAAAACCAACAAAATAGAACCGCGGTCAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAA
                              +
                              CCBCCGGGGDGGGGGGCEGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGDDGGGGGGGGGGGGGGGGGGG89


                              (In the 1.fastq read you can see the adaptor and index: AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC ACAGTG ATCTCGTATGCCGTCTTCTGCTTG)

                              I don't know what's wrong. Maybe the adaptors file is badly formatted?

                              Comment


                              • #30
                                Originally posted by mslider View Post
                                I am using Trimmomatic to clean my paired-end reads, which were prepared with the ScriptSeq-v2 RNA-Seq Library Preparation Kit.
                                My adaptors file is formatted as:

                                Code:
                                >PrefixPE/1
                                CAAGCAGAAGACGGCATACGAGAT
                                >PrefixPE/2
                                GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
                                >PE/1
                                CAAGCAGAAGACGGCATACGAGAT
                                >PE/1_rc
                                ATCTCGTATGCCGTCTTCTGCTTG
                                >PE/2
                                GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
                                >PE/2_rc
                                AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
                                
                                When I run:
                                
                                java -jar trimmomatic-0.32.jar PE -threads 2 -phred33 -trimlog r.log 1.fastq 2.fastq Out_paired_1.fastq Out_unpaired_1.fastq Out_paired_2.fastq Out_unpaired_2.fastq  ILLUMINACLIP:adapters.fa:2:30:27 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50
                                
                                my sequences are not cleaned.
                                
                                The input files I use:
                                
                                1.fastq
                                @HWI-D00562:33:C67RBANXX:7:1101:12177:2410 1:N:0:ACAGTG
                                GACCGCGGTTCTATTTTGTTGGTTTTCGGAACTGAGGCCATGATTAAGAGGAACTAGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTGAAAAAA
                                +
                                BBBCCGGGGGGGGGGGGEGGGGGGGGGGFGGGGGGGGGGGGGGDEGGGGGGGGGGGGGEGGGGDGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG@DBFEGGGGGGGGGGGGGGGGGGGGGGGGG
                                
                                2.fastq
                                @HWI-D00562:33:C67RBANXX:7:1101:12177:2410 2:N:0:ACAGTG
                                AGTTCCTCTTAATCATGGCCTCAGTTCCGAAAACCAACAAAATAGAACCGCGGTCAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAA
                                +
                                CCBCCGGGGDGGGGGGCEGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGDDGGGGGGGGGGGGGGGGGGG89
                                (In the 1.fastq read you can see the adaptor and index: AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC ACAGTG ATCTCGTATGCCGTCTTCTGCTTG)

                                I don't know what's wrong. Maybe the adaptors file is badly formatted?
                                Did you make this adapters file yourself? The sequences for PrefixPE/1, PE/1 and PE/1_rc are wrong. They represent the very ends of the library molecules, which attach to the flow cell, not the ends adjoining the insert, which is what you want.

                                In fact, ScriptSeq library adapter sequences are identical to those of Illumina TruSeq libraries. Use the TruSeq3-PE-2.fa adapter file which comes with Trimmomatic.
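
The point about which end adjoins the insert can be checked against the 1.fastq read quoted above. This is only a sketch using shell parameter expansion; offset is a made-up helper name, and the adapter directory in the commented line is an assumption about where Trimmomatic is installed.

```shell
# Read from 1.fastq as posted.
read1=GACCGCGGTTCTATTTTGTTGGTTTTCGGAACTGAGGCCATGATTAAGAGGAACTAGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTGAAAAAA

# Hypothetical helper: number of bases in $1 before the first occurrence of $2.
offset() { before=${1%%"$2"*}; echo "${#before}"; }

inner=$(offset "$read1" AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC)   # PE/2_rc from the posted file
outer=$(offset "$read1" ATCTCGTATGCCGTCTTCTGCTTG)             # PE/1_rc from the posted file

echo "PE/2_rc starts after $inner bases, PE/1_rc after $outer bases"

# Adapter step using the file shipped with Trimmomatic (path is an assumption):
# ILLUMINACLIP:/path/to/Trimmomatic-0.32/adapters/TruSeq3-PE-2.fa:2:30:10
```

In this read, PE/2_rc begins right where the insert ends, while PE/1_rc only begins 40 bases later (one 34-base adapter plus the 6-base index), so it marks the far end of the library molecule.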

                                Comment
