  • #91
    Originally posted by Lays Cruz View Post
    Yes. My data is from an Illumina MiSeq and my command line is as follows:
    #java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 ./jatoba/Hst_S2_L001_R1_001.fastq ./jatoba/Hst_S2_L001_R2_001.fastq ./jatoba/fq/Hst_S2_PE_1p.fq ./jatoba/fq/Hst_S2_SR_1p.fq ./jatoba/fq/Hst_S2_PE_2p.fq ./jatoba/fq/Hst_S2_SR_2p.fq LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20

    Thanks.
    How about just running:

    Code:
    #java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 -basein jatoba/Hst_S2_L001_R1_001.fastq -baseout ./jatoba/Hst_S2_trim LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20
    BTW: Are there 3 pairs of samples or are you just providing names of files for holding the output?

    Comment


    • #92
      Originally posted by GenoMax View Post
      How about just running:

      Code:
      #java -jar /usr/local/bin/trimmomatic-0.30.jar PE -threads 4 -phred33 -basein jatoba/Hst_S2_L001_R1_001.fastq -baseout ./jatoba/Hst_S2_trim LEADING:30 TRAILING:30 SLIDINGWINDOW:4:30 HEADCROP:18 MINLEN:20
      BTW: Are there 3 pairs of samples or are you just providing names of files for holding the output?
      No. It comes from a single sample. The other files are the outputs: two PE files with the paired sequences and two SR files with the single reads removed from the R1 and R2 files.

      I much appreciate your suggestion; I will try it now.

      Thanks.

      Comment


      • #93
        Originally posted by Brian Bushnell View Post
        Hi all,

        I'm testing Trimmomatic's performance on adapter removal, and the results are mysteriously bad. So I'd like to make sure I'm not doing anything wrong. This is my command line (modified from the website):

        java -Xmx8g -jar trimmomatic-0.32.jar SE -phred33 dirty.fq tclean.fq ILLUMINACLIP:gruseq.fa:2:30:10

        ...where dirty.fq is a file containing reads with adapter sequences and gruseq.fa is a file containing the adapter sequences. The adapters are inserted synthetically and the reads are tagged, so I know precisely what the correct results should be, and what I'm getting is not really close. Any suggestions?
        The command looks OK, except that you are not providing the minimum adapter length (which defaults to 8). In what way is the output "mysteriously bad"?

        Comment


        • #94
          Originally posted by GenoMax View Post
          The command looks OK, except that you are not providing the minimum adapter length (which defaults to 8). In what way is the output "mysteriously bad"?
          Most of the adapters didn't get removed.

          Comment


          • #95
            Hi,

            Sorry to hijack this post! I have paired-end 150bp RAD sequencing data that I am currently cleaning with Trimmomatic. I just have two quick questions to make sure I am not going wrong anywhere.

            1) By the looks of it, most of my adapter contamination occurs within the read, i.e. I have 'P1-sequence-P1-sequence'. In this case, the simple (single-alignment) adapter mode will trim the read up to the start of the second P1 adapter, leaving me with just 'P1-sequence', am I correct? I would actually prefer Trimmomatic to discard these reads entirely, as the reverse read of such a pair will in all likelihood not come from the same locus as the surviving part of the forward read. Is there an option to do this? The palindrome mode does not appear to pick up these within-read sequences.

            2) As a very general question, I also appear to have quite a bit of reverse-complement adapter contamination in my data set, and I was wondering if anybody has experience with this for RAD data? Is this something I need to worry about?

            Many thanks in advance for any feedback

            Sarah

            Comment


            • #96
              Hi,

              I've chosen Trimmomatic for my study on read processing.
              However, I have to justify why I chose this tool over other preprocessing tools.

              Do you know of any articles comparing Trimmomatic to other tools (other than the Trimmomatic authors' own article) from which I could get some information?

              thanks

              Comment


              • #97
                I have a comparison of adapter trimming here:

                Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


                ...though it's not published or peer-reviewed. I'll be doing another comparison of quality-trimming soon.

                Also, here's a paper comparing quality-trimming methods:

                Next Generation Sequencing is having an extremely strong impact in biological and medical research and diagnostics, with applications ranging from gene expression quantification to genotyping and genome reconstruction. Sequencing data is often provided as raw reads which are processed prior to analysis. One of the most used preprocessing procedures is read trimming, which aims at removing low quality portions while preserving the longest high quality part of a NGS read. In the current work, we evaluate nine different trimming algorithms in four datasets and three common NGS-based applications (RNA-Seq, SNP calling and genome assembly). Trimming is shown to increase the quality and reliability of the analysis, with concurrent gains in terms of execution time and computational resources needed.
                Last edited by Brian Bushnell; 06-01-2014, 08:15 AM.

                Comment


                • #98
                  Just in case anybody ever has a similar problem or is confused and stumbles across my post:

                  Looking closer at my files and where the reverse adapter contamination occurred, it became obvious that the reverse-complement sequences were simply adapter read-through, and everything that followed was nonsense, which Trimmomatic then removed perfectly. This means that in roughly 3% of cases my fragments were too short for the 150bp HiSeq reads and the RAD size selection did not work perfectly, but considering it is only 3% and it was my first set of libraries, I am fairly happy with that.

                  Above I stated that I was concerned the forward read would not match the reverse read. Now I believe this is only the case in 100-1000 fragments that consist of tiny fragments with adapter ligating to other tiny fragments of adapter. The majority of the contamination however presents itself in reverse complementary form.

                  This all obviously rests upon the understanding that when adapter read-through occurs, it will be reverse complementary of the P2 adapters in the forward reads, and reverse complementary of the P1 adapters in the reverse reads. Please feel free to point out if there is something wrong with my logic!
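The read-through logic in this post can be sketched with a toy model. This is only an illustration under stated assumptions: the adapter and insert sequences are made up, `simulate_pair` is a hypothetical helper, and the library molecule is modelled as P1 + insert + reverse complement of P2 on the top strand. Real reads would continue with junk bases after the adapter; here they are simply cut at the read length.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def simulate_pair(p1, insert, p2, read_len):
    """Toy model of one read pair (assumption: top strand = P1 + insert + revcomp(P2))."""
    top = p1 + insert + revcomp(p2)
    bottom = revcomp(top)  # equals P2 + revcomp(insert) + revcomp(P1)
    # Sequencing starts right after each adapter and runs for read_len bases.
    forward = top[len(p1):len(p1) + read_len]
    reverse = bottom[len(p2):len(p2) + read_len]
    return forward, reverse


# Made-up sequences: the insert (8 bp) is shorter than the read length (12 bp).
P1, P2, INSERT = "ACGTTGCA", "GGATCCTA", "TTTGACCA"
forward, reverse = simulate_pair(P1, INSERT, P2, read_len=12)

# The forward read runs into revcomp(P2), the reverse read into revcomp(P1).
print(forward[len(INSERT):] == revcomp(P2)[:4])
print(reverse[len(INSERT):] == revcomp(P1)[:4])
```

In this model, whenever the insert is shorter than the read length, the tail of the forward read is the start of the reverse complement of P2 and the tail of the reverse read is the start of the reverse complement of P1, which matches the understanding described above.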

                  Comment


                  • #99
                    Hi, I have found Trimmomatic very useful.
                    It works well in single-end mode with my HiSeq data, using the Nextera PE adapter.
                    But the output looks quite strange in paired-end mode:

                    My output after trimming:
                    Input Read Pairs: 12484647 Both Surviving: 4943420 (39.60%) Forward Only Surviving: 7297375 (58.45%) Reverse Only Surviving: 16245 (0.13%) Dropped: 227607 (1.82%)

                    It seems that the numbers of forward and reverse reads surviving after trimming are very unbalanced.
                    What could cause this?
                    Thanks.
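As a quick sanity check on a summary line like the one above, the counts can be parsed and the percentages recomputed. This is a minimal sketch: `parse_pe_summary` and `percentages` are hypothetical helper names, and the regular expression only assumes the labels that appear in the line quoted in this post.

```python
import re

# Labels exactly as they appear in the summary line quoted in this post.
SUMMARY_FIELD = re.compile(
    r"(Input Read Pairs|Both Surviving|Forward Only Surviving|"
    r"Reverse Only Surviving|Dropped): (\d+)"
)


def parse_pe_summary(line):
    """Extract the read-pair counts from a paired-end summary line."""
    return {label: int(value) for label, value in SUMMARY_FIELD.findall(line)}


def percentages(counts):
    """Express every category as a percentage of the input read pairs."""
    total = counts["Input Read Pairs"]
    return {
        label: round(100.0 * value / total, 2)
        for label, value in counts.items()
        if label != "Input Read Pairs"
    }


line = (
    "Input Read Pairs: 12484647 Both Surviving: 4943420 (39.60%) "
    "Forward Only Surviving: 7297375 (58.45%) "
    "Reverse Only Surviving: 16245 (0.13%) Dropped: 227607 (1.82%)"
)
counts = parse_pe_summary(line)
for label, pct in percentages(counts).items():
    print(f"{label}: {counts[label]} ({pct:.2f}%)")
```

The four categories add up to the number of input pairs, so the imbalance is real rather than a reporting glitch: most pairs lose only their reverse read.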

                    Comment


                    • In Trimmomatic's default mode, when it detects adapter read-through in a read pair and trims the adapters, it drops the second (reverse) read of the pair, because after trimming the two reads are reverse complements of each other, so the second read doesn't add any extra information.

                      In newer versions of trimmomatic this can be turned off, so that it keeps both reads after trimming adapters from paired reads. You need to specify 'TRUE' for the <keepBothReads> parameter of the ILLUMINACLIP command.

                      ILLUMINACLIP:<fastaWithAdaptersEtc>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold>:<minAdapterLength>:<keepBothReads>
                      Last edited by mastal; 06-06-2014, 01:12 AM.
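To make the field order concrete, here is a small sketch that assembles an ILLUMINACLIP step string in the order shown above. The helper name `illuminaclip_step` is hypothetical and not part of Trimmomatic. The sketch assumes the last two fields are optional and positional, so `keepBothReads` can only be given when `minAdapterLength` is given too; it falls back to 8, the default mentioned earlier in this thread.

```python
def illuminaclip_step(fasta, seed_mismatches, palindrome_threshold,
                      simple_threshold, min_adapter_length=None,
                      keep_both_reads=None):
    """Build an ILLUMINACLIP step string (hypothetical helper, not a Trimmomatic API)."""
    fields = [fasta, seed_mismatches, palindrome_threshold, simple_threshold]
    if keep_both_reads is not None and min_adapter_length is None:
        # The fields are positional, so minAdapterLength must be filled in
        # before keepBothReads; 8 is the default quoted earlier in the thread.
        min_adapter_length = 8
    if min_adapter_length is not None:
        fields.append(min_adapter_length)
    if keep_both_reads is not None:
        fields.append("TRUE" if keep_both_reads else "FALSE")
    return "ILLUMINACLIP:" + ":".join(str(field) for field in fields)


# Short form, as used in the commands earlier in the thread.
print(illuminaclip_step("TruSeq3-PE.fa", 2, 30, 10))
# Long form that keeps both reads after adapter trimming.
print(illuminaclip_step("TruSeq3-PE.fa", 2, 30, 10, keep_both_reads=True))
```

The resulting string is passed as one trimming step on the command line, in the same position as the ILLUMINACLIP steps in the commands quoted above.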

                      Comment


                      • Hi, I have successfully trimmed the transposase sequence from my HiSeq data using Trimmomatic, but I would also like to trim the primer. It is only 15 bases long, and Trimmomatic fails to trim it. Do I need to change the seed length of 16 bases? How can that be done? Thanks

                        Comment


                        • What was the trimmomatic command that you used, and what sequences did you use in the adapters fasta file?

                          Comment


                          • Hi! I think I'm entering the Trimmomatic command incorrectly. I'm using the binary download of Trimmomatic, I'm running it on Windows 8 (that might be a problem), and I'm using fastq files that need paired-end adapter trimming.

                            Here is the script I'm using:

                            java -jar C:\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\trimmomatic-0.32.jar PE -phred33 C:\Users\kevluv93\Desktop\L11of2_S1_L001_R1_001.fastq.fq C:\Users\kevluv93\Desktop\L11of2_S1_L001_R2_001.fastq.fq C:\Users\kevluv93\Desktop\output_forward_paired.fq C:\Users\kevluv93\Desktop\output_forward_unpaired.fq C:\Users\kevluv93\Desktop\output_reverse_paired.fq C:\Users\kevluv93\Desktop\output_reverse_unpaired.fq ILLUMINACLIP:C:\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

                            To make things simple, all the files are on my desktop, so if there is an error in the paths I gave trimmomatic to get the files, keep that in mind.

                            Here is the error message:

                            Multiple cores found: Using 4 threads
                            Trimmomatic PE: Started with arguments: -phred33 C:\Users...
                            Multiple cores found: Using 4 threads
                            Exception in thread "main" java.lang.NumberFormatException: For input string: "\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa"
                            at java.lang.NumberFormatException.forInputString(Unknown Source)
                            at java.lang.Integer.parseInt(Unknown Source)
                            at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:53)
                            at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:27)

                            I don't know what is wrong with the script I put in. I used the example on the trimmomatic website as the base for this one, so I thought it would work. Did I input the script correctly? Do the parameters make sense?

                            New to this, undergrad, greatly appreciative of any input!
                            Last edited by kevluv93; 06-15-2014, 11:39 AM.

                            Comment


                            • @kevluv93

                              I have not used trimmomatic on Windows, but it looks as if it's expecting what comes after 'C:' to be the second parameter for the ILLUMINACLIP command.

                              Try putting quotes around the path to the fasta file and see if that helps.

                              Code:
                              ILLUMINACLIP:'C:\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa':2:30:10
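The diagnosis above can be illustrated with a simplified model of the parsing. This is not Trimmomatic's source code: `parse_illuminaclip` is a hypothetical function that only assumes the step string is split on ':' and that the second field is parsed as an integer, which is what the stack trace suggests.

```python
def parse_illuminaclip(step):
    """Simplified model: split the step on ':' and parse the numeric fields."""
    name, _, arg_string = step.partition(":")
    if name != "ILLUMINACLIP":
        raise ValueError("not an ILLUMINACLIP step: " + step)
    fields = arg_string.split(":")
    return {
        "fasta": fields[0],
        "seed_mismatches": int(fields[1]),  # fails if this is part of a path
        "palindrome_threshold": int(fields[2]),
        "simple_threshold": int(fields[3]),
    }


# A path without a colon parses as intended.
print(parse_illuminaclip("ILLUMINACLIP:TruSeq3-PE.fa:2:30:10"))

# A drive letter adds an extra ':', so the rest of the path lands in the
# field where an integer is expected.
windows_step = r"ILLUMINACLIP:C:\Users\kevluv93\Desktop\TruSeq3-PE.fa:2:30:10"
try:
    parse_illuminaclip(windows_step)
except ValueError as error:
    print("parse failed:", error)
```

Under this model the split happens on every ':' in the string, so one thing to try is a path that contains no drive-letter colon, for example copying the adapter file into the working directory and referring to it by file name only.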

                              Comment


                              • You probably have the parameters in the wrong order. Java is parsing "\Users\kevluv93\Desktop\Trimmomatic-0.32\Trimmomatic-0.32\TruSeq3-PE.fa" and expecting an integer, which it is not. I'm not a Trimmomatic expert and I don't know the correct command line, but Trimmomatic should work fine on Windows with the correct command.

                                If you continue to have problems, send me a pm.

                                Comment
