  • fastq-mcf paired end adapter trimming

    Hi!

    I'm working on a de novo transcriptome assembly, and I'm having some difficulty removing adapter contaminants from my 100 bp PE reads. According to FastQC, no more than ~1% of my reads contain overrepresented adapter sequences. For example:

    90743 reads 0.4% of reads TruSeq Adapter, Index 6 (100% over 50bp)
    AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATG
    42478 reads 0.10% of reads TruSeq Adapter, Index 6 (100% over 50bp)
    GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGC

    I've been trying to remove these sequences using fastq-mcf, since it seems to work well for PE reads.

    However, I keep getting far more reads trimmed than FastQC tells me are contaminated. I've been playing around with the parameters, but without much improvement. I'm realizing now that the program trims partial adapter sequences from the ends of reads, possibly even when just a few bases match the adapter sequence. Is this generally what adapter trimming does? What if I'm only interested in trimming out the overrepresented sequences reported by FastQC (the full 50-65 bp adapter contaminants)?

    There are so many parameters for this program, and I'm not sure how to set them to remove only what I need... and now I'm not sure what exactly I'm "supposed" to be removing (full or partial adapter sequence matches...)




    Cheers
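
On the question above of whether a few matching bases are enough to trigger trimming: one rough way to see why short 3' matches inflate the trimmed-read count is to estimate how many reads would end in the first k adapter bases purely by chance. The sketch below is only back-of-the-envelope arithmetic. It assumes equal base frequencies, independent positions and exact matching; the function name and the read count are hypothetical, and this is not a description of how fastq-mcf scores matches.

```python
def expected_chance_matches(num_reads: int, overlap: int) -> float:
    """Expected number of reads whose last `overlap` bases equal the first
    `overlap` bases of an adapter by chance alone.

    Assumption: each base is A, C, G or T with probability 1/4, independently.
    """
    return num_reads * 0.25 ** overlap


if __name__ == "__main__":
    total_reads = 20_000_000  # hypothetical library size, not taken from the thread
    for k in (1, 3, 6, 10):
        print(f"overlap {k:>2} bp: ~{expected_chance_matches(total_reads, k):,.0f} reads")
```

Under these assumptions, requiring only one matching base would flag about a quarter of all reads, while each extra required base cuts the chance matches by a factor of four. That is why trimmers usually expose some kind of minimum-overlap or stringency setting.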

  • #2
    Are the adapter sequences below OK for trimming Illumina 250 bp PE DNA reads?
    >NexteraUniversalAdapter
    CTGTCTCTTATACACATCT
    >TruSeq_Read1
    AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
    >TruSeq_Read2
    AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
    >Nextera_mate_pair_Read1
    CTGTCTCTTATACACATCT
    >Nextera_mate_pair_Read2
    AGATGTGTATAAGAGACAG
    >PolyA
    AAAAAAAAAAAAAAAAAAAAAAAAAAA
    >sv1
    AATGATACGGCGACCACCGAGATCTACACGCCTCCCTCGCGCCATCAG
    >sv2
    CAAGCAGAAGACGGCATACGAGAT
    >sv3_barcode
    CGGTCTGCCTTGCCAGCCCGCTCAG



    • #3
      @mmmm
      It depends whether the library prep was done using a TruSeq kit or a Nextera kit.



      • #4
        @gevieir
        The sequences that FastQC lists as over-represented are based on the first 50 bases (5' end) in the reads.

        Adapter sequences are usually found towards the ends of the reads (3' end), when the insert is shorter than the read length and so you read into the adapter sequence.

        So it's not surprising that you would get many more reads trimmed.
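
The read-through effect described in this reply can be sketched in a few lines of Python. This is a toy model, not fastq-mcf's algorithm: it uses exact matching only, the helper names `simulate_read` and `find_adapter_start` are made up for illustration, and the insert is an artificial repeat. The adapter is the TruSeq_Read1 sequence quoted in post #2.

```python
ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"  # TruSeq_Read1 from post #2


def simulate_read(insert: str, adapter: str, read_length: int) -> str:
    """Sequence the insert; if it is shorter than the read length,
    the read continues into the adapter."""
    return (insert + adapter)[:read_length]


def find_adapter_start(read: str, adapter: str, min_overlap: int = 1) -> int:
    """Return the position where the adapter begins in the read, or
    len(read) if no exact match of at least `min_overlap` bases is found.

    A match is either the whole adapter inside the read, or a prefix of
    the adapter running up to the 3' end of the read.
    """
    for start in range(len(read)):
        tail = read[start:]
        n = min(len(tail), len(adapter))
        if n >= min_overlap and tail[:n] == adapter[:n]:
            return start
    return len(read)


if __name__ == "__main__":
    # 80 bp insert, 100 bp read: the last 20 bases are adapter.
    read = simulate_read("ACGT" * 20, ADAPTER, 100)
    cut = find_adapter_start(read, ADAPTER)
    print("adapter starts at", cut, "-> keep", read[:cut])
    # The first 50 bases contain no adapter, so a screen limited to the
    # start of the read would not report this read.
    print("adapter in first 50 bases:", ADAPTER[:20] in read[:50])
```

With a 99 bp insert, only a single adapter base ends up in the read. Calling `find_adapter_start(..., min_overlap=1)` would clip it, while `min_overlap=6` would leave the read untouched, which is the trade-off between trimming every partial match and trimming only confident ones.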



        • #5
          Library prep was done using Nextera XT, so is there any harm in including the TruSeq adapters?

          Also, after using fastq-mcf to trim adapters and remove low-quality bases (<Q20), FastQC still shows lower-quality bases (Q score ~10-20) at the 3' end. Why is that?
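
One possible contributor to the second observation, offered as a sketch rather than a diagnosis: a simple end trimmer removes bases from the 3' end only until it reaches the first base at or above the threshold, so lower-quality bases further inside the read are kept. The code below illustrates that generic behaviour, assuming Phred+33 quality encoding; the function names are hypothetical, and fastq-mcf's own quality-trimming logic may differ.

```python
def phred33(qual: str) -> list:
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(c) - 33 for c in qual]


def trim_3prime(seq: str, qual: str, threshold: int = 20):
    """Drop trailing bases whose quality is below `threshold`.

    Trimming stops at the first base (from the 3' end) that meets the
    threshold; anything upstream of it is kept regardless of quality.
    """
    scores = phred33(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    seq, qual = "ACGTACGT", "IIII-I)&"  # scores: 40 40 40 40 12 40 8 5
    kept_seq, kept_qual = trim_3prime(seq, qual)
    print(kept_seq, phred33(kept_qual))
```

In this toy read the two trailing bases (Q8 and Q5) are removed, but the Q12 base two positions upstream survives because a Q40 base follows it.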
