
  • Trimmomatic Sliding Window vs. Removing Adapters

    Hello.

    I have a quick question. Will a sliding window suffice to remove the adapters, or must I use the ILLUMINACLIP step?

  • #2
    Hi
    A sliding window will only remove low-quality bases. To remove adapters you should also use ILLUMINACLIP.
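To illustrate why a quality window alone cannot be relied on for adapters, here is a minimal Python sketch of a sliding-window quality trim. It is a simplification for illustration, not Trimmomatic's actual implementation; the function name `sliding_window_trim` and the default window size and threshold are assumptions. The point is that the cut depends only on quality scores, so a high-quality adapter at the 3' end passes through untouched.

```python
def sliding_window_trim(seq, quals, window=4, min_avg=15):
    """Cut the read at the start of the first window whose mean quality
    falls below min_avg. Simplified sketch, not Trimmomatic's own code."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have the same length")
    if not seq:
        return seq
    size = min(window, len(seq))  # short reads are treated as a single window
    for start in range(len(seq) - size + 1):
        win = quals[start:start + size]
        if sum(win) / size < min_avg:
            return seq[:start]
    return seq


# Hypothetical read: 8 bases of insert followed by the start of an adapter.
read = "ACGTACGT" + "AGATCGGAAGAGC"

# High quality everywhere: nothing is trimmed, so the adapter is still there.
print(sliding_window_trim(read, [35] * len(read)))

# Low quality over the adapter region: the read is cut because of quality,
# not because the sequence was recognised as an adapter.
print(sliding_window_trim(read, [35] * 8 + [5] * 13))
```

So the window only helps with adapters when they happen to coincide with poor quality, which is why a sequence-based step is still needed.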

    best
    bjrön


    • #3
      Finding the fasta file with the adapters

      Hello.

      I have the sequences of the adapters, but how do I create a FASTA file in the correct format to use with ILLUMINACLIP?


      • #4
        Originally posted by arcolombo698
        Hello.

        I have the sequences of the adapters, but how do I create a FASTA file in the correct format to use with ILLUMINACLIP?
        What library preps are you using? Adapter files for the typical Illumina preps (TruSeq and Nextera) are already included with recent versions of the tool.

        If you have something unusual, I can help you create appropriate adapter files.

        Thanks,

        Tony.


        • #5
          Hi
          We have an internal file with many adapters, in the following format:

          Code:
          >[Oligonucleotide sequences for Genomic DNA 1]
          GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
          >[Oligonucleotide sequences for Genomic DNA 2]
          ACACTCTTTCCCTACACGACGCTCTTCCGATCT
          >[PCR Primers 1]
          AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
          >[PCR Primers 2]
          CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
          >[Genomic DNA Sequencing Primer]
          ACACTCTTTCCCTACACGACGCTCTTCCGATCT
          >[Paired End DNA oligonucleotide sequences]
          GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
          So, just the usual plain FASTA.
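If the adapter sequences are only available as a list, a small Python sketch like the one below can turn them into plain FASTA text. The helper name `to_fasta` is hypothetical, and replacing whitespace in names with underscores is my own assumption to keep headers simple, not a documented Trimmomatic requirement.

```python
import re


def to_fasta(adapters):
    """Render a {name: sequence} mapping as plain FASTA text.
    Hypothetical helper for illustration only."""
    lines = []
    for name, seq in adapters.items():
        clean_name = re.sub(r"\s+", "_", name.strip())  # assumption: no spaces in headers
        clean_seq = seq.strip().upper()
        if not clean_name:
            raise ValueError("adapter name must not be empty")
        if re.fullmatch(r"[ACGTN]+", clean_seq) is None:
            raise ValueError(f"unexpected characters in sequence for {name!r}")
        lines.append(f">{clean_name}")
        lines.append(clean_seq)
    return "\n".join(lines) + "\n"


# Two of the sequences from the listing above.
adapters = {
    "PCR Primers 2": "CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT",
    "Paired End DNA oligonucleotide sequences": "GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG",
}
print(to_fasta(adapters), end="")
```

The resulting text can be saved to a file and passed to ILLUMINACLIP in place of the bundled adapter files.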


          • #6
            Thank you. I have already emailed you the details of my problem; please check your email from A.colombo. I used the TruSeq RNA Sample Prep kit, and all the indices are listed in that email. I can re-post this as a new thread if you would like. I hope my emails have made sense.

            To publicly restate the problem:

            1) I noticed that my adapters matched those in TruSeq2-PE.fa, and I also added all the indices from the Illumina adapter sequences PDF, which is available on their website.

            However, my original FastQC results, without using Trimmomatic, are:

            Sequence Count Percentage Possible Source
            GCAGATAGTGAGGAAAGTTGAGCCAATAATGACGTGAAGTCCGTGGAAGC 55443 0.1791778373403407 No Hit
            AGTAGTATAGTGATGCCAGCAGCTAGGACTGGGAGAGATAGGAGAAGTAG 53934 0.17430112871081896 No Hit
            GCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCATTATGCAGAAG 52865 0.17084638946300004 No Hit
            TTTGATGGTAAGGGAGGGATCGTTGACCTCGTCTGTTATGTAAAGGATGC 47976 0.15504637058312476 No Hit
            GCCATATCGGGGGCACCGATTATTAGGGGAACTAGTCAGTTGCCAAAGCC 35179 0.11368968381573591 No Hit
            AGCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCATTATGCAGAA 32875 0.10624373505336474 No Hit
            GTCAGTTCAGTGTTTTAATCTGACGCAGGCTTATGCGGAGGAGAATGTTT 32490 0.10499951184437475 No Hit
            TTGTCAGTTCAGTGTTTTAATCTGACGCAGGCTTATGCGGAGGAGAATGT 32329 0.1044792003206153 No Hit
            AGTTAGATTTACGCCGATGAATATGATAGTGAAATGGATTTTGGCGTAGG 31317 0.1012086707426988 No Hit
            [FAIL] Kmer Content




            After using a custom adapter.fa, trimming did not get rid of the adapters completely, but it reduced them greatly. Which parameters would best remove the adapters while still leaving a sufficient read length?

            Trimmed (improved quality results)

            GCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCATTATGCAGAAG 32355 0.15813829366068524 No Hit
            TTTGATGGTAAGGGAGGGATCGTTGACCTCGTCTGTTATGTAAAGGATGC 31148 0.15223896062256292 No Hit
            GCAGATAGTGAGGAAAGTTGAGCCAATAATGACGTGAAGTCCGTGGAAGC 30047 0.14685771317022436 No Hit
            AGTAGTATAGTGATGCCAGCAGCTAGGACTGGGAGAGATAGGAGAAGTAG 24201 0.11828480435426497 No Hit
            GCCATATCGGGGGCACCGATTATTAGGGGAACTAGTCAGTTGCCAAAGCC 23374 0.1142427592651787 No Hit
            TTGTCAGTTCAGTGTTTTAATCTGACGCAGGCTTATGCGGAGGAGAATGT 23325 0.11400326687175034 No Hit
            AGTTAGATTTACGCCGATGAATATGATAGTGAAATGGATTTTGGCGTAGG 22057 0.10780579024180911 No Hit
            AGCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCATTATGCAGAA 20926 0.10227791479349403 No Hit
            GTCAGTTCAGTGTTTTAATCTGACGCAGGCTTATGCGGAGGAGAATGTTT 20488


            Thank you very much. best


            • #7
              Since you removed almost half of them, might it be that you are only removing one orientation?

              From the manual:

              If you want to check for the reverse-complement of a specific sequence, you need to specifically include the reverse-complemented form of the sequence as well, with another name. As an example, have a look at the TruSeq2-PE.fa file:

              >PCR_Primer1
              AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
              >PCR_Primer1_rc
              AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT

              Otherwise, I would just take a small fraction of your reads and test different settings.
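For a custom adapter file, the reverse-complemented entries can be generated rather than typed by hand. A minimal Python sketch follows; the function names and the `_rc` suffix handling are my own choices, mirroring the naming in the TruSeq2-PE.fa excerpt above.

```python
# Translation table for DNA complements (upper and lower case, N kept as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_reverse_complements(adapters):
    """Return a new {name: sequence} mapping that also contains an
    '<name>_rc' entry for every input adapter."""
    out = {}
    for name, seq in adapters.items():
        out[name] = seq
        out[f"{name}_rc"] = reverse_complement(seq)
    return out


primer = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
for name, seq in with_reverse_complements({"PCR_Primer1": primer}).items():
    print(f">{name}")
    print(seq)
```

Run on PCR_Primer1, this reproduces the PCR_Primer1_rc sequence quoted from the manual.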

              cheers


              • #8
                Just quickly looking at the FastQC results by eye, I don't see any sequences that match the Illumina adapters.

                Most of the adapter sequences occur towards the 3' end of the reads, whereas the over-represented sequences reported by FastQC are from the first (5' end) 50 bases of the reads.
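Rather than checking by eye, the comparison can be sketched in Python by looking for shared k-mers between each over-represented sequence and each adapter (and its reverse complement). This is only an exact-match heuristic under my own assumptions: the function names are hypothetical and `k=12` is an arbitrary choice, so it says nothing about how Trimmomatic or FastQC do their matching.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def adapter_hits(sequences, adapters, k=12):
    """List (sequence, adapter_label) pairs that share at least one exact
    k-mer. Both orientations of each adapter are tested."""
    hits = []
    for s in sequences:
        s_kmers = kmers(s.upper(), k)
        for name, adapter in adapters.items():
            adapter = adapter.upper()
            probes = ((name, adapter), (f"{name}_rc", reverse_complement(adapter)))
            for label, probe in probes:
                if s_kmers & kmers(probe, k):
                    hits.append((s, label))
    return hits


# Made-up reads for illustration: the first carries an adapter prefix.
adapter = "GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG"
reads = ["ACGTACGTACGT" + adapter[:15], "T" * 50]
for seq, label in adapter_hits(reads, {"PE_adapter": adapter}):
    print(label, seq)
```

Feeding in the FastQC over-represented sequences together with the adapter file contents would show whether any of them contain adapter fragments at all; an empty result would suggest they are something else, such as highly abundant transcripts.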
