  • dimo
    Member
    • Mar 2012
    • 10

    WXS kmer over-representation post trimming.

    Greetings and salutations,
    I am attempting to analyse some human Illumina WXS data (SureSelect) and am seeing some unfamiliar kmer over-representation. Trimming (Trimmomatic) usually removes over-represented kmers quite well, but with this latest data I can't seem to get rid of the kmers at the 3' end of the reads. My question is in two parts.

    1) Should I worry about this, or just continue with alignment/variant calling?

    2) If I should worry about it, how can I trim these successfully?

    My trimmomatic parameters and pre/post trimming images are below, but I am probably missing something super obvious.

    Thanks in advance.



    java -jar trimmomatic-0.36.jar PE -phred33 1_ATGCCTAA_L001_R1.fastq.gz 1_ATGCCTAA_L001_R2.fastq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:75




  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    Maybe you are using the wrong adapter sequences. With the BBMap package, you can discover the adapter sequences directly from the reads like this:

    Code:
    bbmerge.sh in1=1_ATGCCTAA_L001_R1.fastq.gz in2=1_ATGCCTAA_L001_R2.fastq.gz outa=adapters.fa reads=8m
    Alternatively, you can use the set of adapter sequences included with the package in bbmap/resources/adapters.fa, which is pretty complete. I recommend trimming with BBDuk, which in my tests is more sensitive than other trimming tools:

    Code:
    bbduk.sh in1=1_ATGCCTAA_L001_R1.fastq.gz in2=1_ATGCCTAA_L001_R2.fastq.gz out1=trimmed1.fq.gz out2=trimmed2.fq.gz ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tbo tpe minlen=75
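
One quick way to check whether trimming actually removed the adapter is to count the reads that still contain an adapter prefix. The following is a minimal sketch and not part of BBTools or Trimmomatic: the function name `count_adapter_reads` is hypothetical, and `AGATCGGAAGAGC` (the common prefix of Illumina TruSeq adapters) is an assumption about this library. If the adapters discovered by bbmerge.sh differ, substitute a sequence from adapters.fa. It also assumes standard 4-line FASTQ records.

```shell
# count_adapter_reads ADAPTER
# Reads uncompressed FASTQ on stdin and prints "<hits> <total>":
# the number of reads whose sequence contains ADAPTER, and the total read count.
# (Hypothetical helper for illustration; assumes 4-line FASTQ records.)
count_adapter_reads() {
    adapter=$1
    awk -v a="$adapter" '
        NR % 4 == 2 {                      # line 2 of every 4-line record is the sequence
            total++
            if (index($0, a) > 0) hits++   # exact substring match, no mismatches allowed
        }
        END { printf "%d %d\n", hits, total }
    '
}
```

It could be run on the files before and after trimming, for example `gzip -cd trimmed1.fq.gz | count_adapter_reads AGATCGGAAGAGC`. Because it only matches the exact string, it will miss partial adapters at the very end of a read and adapters with sequencing errors, so treat it as a rough check alongside the FastQC plots rather than a replacement for them.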


    • dimo
      Member
      • Mar 2012
      • 10

      #3
      Appreciate the response Brian, will run through BBMap and report back.


      • dimo
        Member
        • Mar 2012
        • 10

        #4
        Worked an absolute treat. You are a gentleman and a scholar, thanking you.
