  • Trimming Haloplex adapters

    Hi everyone!
    I am trying to trim amplicon reads from a HaloPlex library sequenced on a MiSeq.
    I found some information here: http://seqanswers.com/forums/showthr...light=haloplex. In that thread, the user swNGS says that he has a well-defined protocol for trimming these sequences. However, I have just found a technical note from Illumina regarding the NextGENe software (Mutation Detection and CNV Analysis for Illumina Sequencing data from HaloPlex Target Enrichment Panels using NextGENe Software for Clinical Research), which says that there are no 5' adapter sequences, just a pair of 3' sequences.
    As far as I know, the amplicons produced by HaloPlex are constructs like:
    PCR_primer-----Illumina_adaptor-----target-----Illumina_adaptor-----Barcode-----PCR_primer
    Should I trim only the Illumina adaptors, or should the PCR primers be removed too?
    Any advice will be appreciated!

  • #2
    I used cutadapt (https://code.google.com/p/cutadapt/) for trimming. I needed to include the barcode and primers in the sequence to be trimmed because of how poorly my HaloPlex experiment worked. Hope this helps.

    python cutadapt -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGCCTAACTCGTATGCCGTCTTCTGCTTG infile_1.fastq > outfile_1.fastq

    python cutadapt -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT infile_2.fastq > outfile_2.fastq

    Forward Sequence:
    GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
    [barcode]
    CTCGTATGCCGTCTTCTGCTTG

    Reverse Sequence:
    AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
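The read 1 adapter in the first command above appears to be the forward sequence with the barcode spliced into the `[barcode]` slot. A minimal Python sketch of that construction follows. The barcode `ATGCCTAA` is inferred from the command in this post, and `build_read1_adapter` is a hypothetical helper name, not part of cutadapt.

```python
# Constant parts of the forward (read 1) sequence quoted in the post above.
FORWARD_PREFIX = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
FORWARD_SUFFIX = "CTCGTATGCCGTCTTCTGCTTG"


def build_read1_adapter(barcode: str) -> str:
    """Return prefix + barcode + suffix, i.e. the string passed to `cutadapt -a`."""
    barcode = barcode.upper()
    # Reject anything that is not a plain DNA string.
    if not barcode or set(barcode) - set("ACGTN"):
        raise ValueError(f"not a DNA barcode: {barcode!r}")
    return FORWARD_PREFIX + barcode + FORWARD_SUFFIX


# Barcode inferred from the command in the post (an assumption).
print(build_read1_adapter("ATGCCTAA"))
```

With a different index, only the barcode argument changes; the flanking sequences stay the same.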

    • #3
      Thank you so much, adamdeluca. That is exactly what I was looking for: a pair of sequences to be trimmed. I will try both Trimmomatic and cutadapt to compare the results!

      • #4
        According to the official Agilent instructions:
        Example using HaloPlex Illumina Data:
        cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC mate1.fastq \
        -o mate1.trimmed.fastq;
        cutadapt -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT \
        mate2.fastq -o mate2.trimmed.fastq;
        I am not sure why the R1 sequence differs from the one in adamdeluca's answer.

        • #5
          My sequence differed because I was seeing reads into the barcodes.
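To make the difference between the two read 1 sequences concrete, here is a small Python sketch comparing the strings quoted in this thread. `compare_adapters` is a hypothetical helper written only for this illustration.

```python
# Read 1 adapter from the Agilent example in post #4.
AGILENT_R1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
# Read 1 adapter from the cutadapt command in post #2.
FORUM_R1 = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGCCTAACTCGTATGCCGTCTTCTGCTTG"


def compare_adapters(agilent: str, forum: str) -> dict:
    """Report how the two read 1 adapter strings overlap."""
    core = agilent[1:]  # Agilent's sequence without its leading base
    shares_core = forum.startswith(core)
    return {
        "forum_starts_with_core": shares_core,
        "extra_leading_base": agilent[0],
        # Whatever post #2 appends after the shared core (barcode + tail).
        "extra_tail": forum[len(core):] if shares_core else None,
    }


print(compare_adapters(AGILENT_R1, FORUM_R1))
```

On these two strings, the comparison shows that the Agilent sequence has one extra leading `A`, while the sequence from post #2 continues past the shared core through the barcode and the remaining primer bases.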

          • #6
            Hi,

            After trimming the FASTQ reads according to:
            "Example using HaloPlex Illumina Data:
            cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC mate1.fastq \
            -o mate1.trimmed.fastq;
            cutadapt -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT \
            mate2.fastq -o mate2.trimmed.fastq;"

            Do you see occurrences where the whole read sequence has been trimmed? I am seeing this, and it causes issues during mapping because there is no read left in mate1.fastq while its mate in mate2.fastq remains.

            Any help would be greatly appreciated.

            • #7
              Originally posted by rsanghvi
              Hi,
              Do you see occurrences where the whole read sequence has been trimmed?
              Yes, I saw a great number of fragments with no insert at all.
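One way to check how many reads were trimmed to nothing is to count zero-length records in the trimmed FASTQ. The sketch below is a minimal Python illustration that assumes plain four-line FASTQ records; `read_fastq` and `count_empty_reads` are hypothetical helper names.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        if not header.strip():  # tolerate blank lines between records
            continue
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def count_empty_reads(handle: TextIO) -> Tuple[int, int]:
    """Return (number of zero-length reads, total number of reads)."""
    total = empty = 0
    for _, seq, _ in read_fastq(handle):
        total += 1
        if not seq:
            empty += 1
    return empty, total


# Tiny in-memory example; the second record was trimmed to nothing.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\n\n+\n\n@r3\nAC\n+\nII\n")
print(count_empty_reads(example))
```

For a real file, the same function can be called on `open("mate1.trimmed.fastq")`.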

              • #8
                Oh, OK. (At least I am not doing anything wrong.)

                So how did you overcome that issue in order to map?

                Is it better to specify a minimum length for reads to be kept in the output files, and also to use the --paired-end option in cutadapt to keep the entries in sync between both read files?

                Please let me know if there is any other way to overcome this issue.
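The idea of a minimum read length applied to both mates at once can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not what cutadapt does internally. It assumes both inputs list the same reads in the same order, and the default threshold of 20 bases is an arbitrary placeholder.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality)
Record = Tuple[str, str, str]


def filter_pairs(
    reads1: Iterable[Record],
    reads2: Iterable[Record],
    min_length: int = 20,
) -> Iterator[Tuple[Record, Record]]:
    """Keep a pair only if BOTH mates are at least `min_length` bases long.

    Dropping the pair as a unit keeps the two output files in sync.
    Assumes reads1 and reads2 are in the same order.
    """
    for rec1, rec2 in zip(reads1, reads2):
        if len(rec1[1]) >= min_length and len(rec2[1]) >= min_length:
            yield rec1, rec2


# Usage: the second pair is dropped because its read 1 is empty.
r1 = [("@a/1", "ACGTACGT", "IIIIIIII"), ("@b/1", "", "")]
r2 = [("@a/2", "TTGGCCAA", "IIIIIIII"), ("@b/2", "GGGGCCCC", "IIIIIIII")]
print(list(filter_pairs(r1, r2, min_length=5)))
```

Because a pair is either kept or discarded as a whole, the mapper never sees a read whose mate is missing.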

                • #9
                  Originally posted by rsanghvi
                  So how did you overcome that issue in order to map?
                  I didn't have that issue. If there is no insert, both reads in the pair should trim to nothing. Mine did, at least.

                  Good luck

                  • #10
                    Hi,
                    I ran into the same issue working with this kind of data.
                    In my opinion, setting a minimum read length should be part of any NGS data processing. However, I fixed the problem using the Trim Galore software (http://www.bioinformatics.babraham.a...s/trim_galore/), which also uses cutadapt.
                    A certain customer service advised me to run cutadapt first and then Trim Galore to resynchronize the R1 and R2 FASTQ files, since some reads might drop out during cutadapt. In fact, using just Trim Galore with the option "do not output unpaired reads" is a good way to proceed.
                    I hope this helps.
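The resynchronization step described above can be illustrated independently of any particular tool. The Python sketch below matches mates by read name after reads have dropped out of one file or the other. It is not Trim Galore's implementation. It assumes read names end in `/1` and `/2` or are followed by a space-separated comment, and it holds all of read 2 in memory, so it is only suitable for small files. `read_id` and `resync` are hypothetical helper names.

```python
from typing import Iterable, List, Tuple

# (header, sequence, quality)
Record = Tuple[str, str, str]


def read_id(header: str) -> str:
    """Strip the leading '@', any comment, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def resync(
    reads1: Iterable[Record], reads2: Iterable[Record]
) -> List[Tuple[Record, Record]]:
    """Return only the pairs whose name is present in both inputs."""
    mates = {read_id(rec[0]): rec for rec in reads2}
    pairs = []
    for rec in reads1:
        mate = mates.get(read_id(rec[0]))
        if mate is not None:
            pairs.append((rec, mate))
    return pairs


# Usage: 'b' is missing from read 2 and 'd' from read 1, so both are dropped.
r1 = [("@a/1", "ACGT", "IIII"), ("@b/1", "AC", "II"), ("@c/1", "GG", "II")]
r2 = [("@a/2", "TTTT", "IIII"), ("@c/2", "CC", "II"), ("@d/2", "AA", "II")]
for first, second in resync(r1, r2):
    print(first[0], second[0])
```

The output keeps the order of the read 1 file, so writing the two halves of each pair to separate files yields matching R1 and R2 outputs.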

                    • #11
                      Hi Jordi,

                      Thank you for the information. This definitely helps.
                      This sounds like the --paired-end option in cutadapt, but I will give both a try.

                      Thanks.
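For reference, here is a hedged Python sketch that assembles a single paired-end cutadapt command from the Agilent adapters quoted earlier in the thread. It assumes a cutadapt release that can trim both mates in one run, where `-A` gives the read 2 adapter, `-p` names the second output file, and `-m` sets the minimum length. Paired-end mode there is selected with `-p`/`--paired-output` rather than an option named `--paired-end`. The minimum length of 20 is an arbitrary placeholder, `paired_cutadapt_command` is a hypothetical helper, and the sketch only builds the argument list without running anything.

```python
import shlex
from typing import List

# Adapters from the Agilent example quoted in this thread.
R1_ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
R2_ADAPTER = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT"


def paired_cutadapt_command(
    in1: str, in2: str, out1: str, out2: str, min_length: int = 20
) -> List[str]:
    """Build (but do not run) a paired-end cutadapt argument list."""
    return [
        "cutadapt",
        "-a", R1_ADAPTER,        # 3' adapter for read 1
        "-A", R2_ADAPTER,        # 3' adapter for read 2
        "-m", str(min_length),   # minimum length after trimming (placeholder)
        "-o", out1,              # trimmed read 1 output
        "-p", out2,              # trimmed read 2 output
        in1, in2,
    ]


cmd = paired_cutadapt_command(
    "mate1.fastq", "mate2.fastq", "mate1.trimmed.fastq", "mate2.trimmed.fastq"
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where cutadapt is installed.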
