#1 |
Member
Location: Sweden Join Date: Nov 2014
Posts: 18
Hi folks,
I am trying to do adapter and low-quality trimming on reads from a fungal genome (library prepared with the Illumina DNA nano kit and sequenced on a HiSeq 2000, 100 bp paired-end). I used BBDuk to trim adapters and low-quality bases as follows:
./bbduk.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1_q25.fastq.gz out2=R2_q25.fastq.gz ktrim=r k=21 mink=11 hdist=2 tpe tbo ref=resources/adapters.fa qtrim=rl trimq=25
FastQC still showed a K-mer content warning for both R1 and R2 reads [ https://goo.gl/photos/Lsyt7YJeQnjB8HQq5 ].
How should I handle my data? Should I just remove the first 20 bases to be on the safe side? Or is this normal behavior for a library prepared with the nano kit?
Thanks in advance and have a great day!
Last edited by Vinn; 04-21-2017 at 07:47 AM.
#2 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
What kind of analysis are you trying to do? In general I have never worried about k-mer warnings from FastQC.
#3 |
Member
Location: Sweden Join Date: Nov 2014
Posts: 18
#4 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,082
Take a look at @Brian's suggestions in this thread. I have provided a link for a specific post but take a look at the whole thread. He should be along with more later.
#5 |
Member
Location: Sweden Join Date: Nov 2014
Posts: 18
Thank you, I will read the thread through.
#6 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
Kmer-content spikiness at the beginning of the read is normal for many fragmentation methodologies and should not be removed. I'm not sure what's going on at the end, though...
#7 |
Member
Location: Sweden Join Date: Nov 2014
Posts: 18
Thanks for your reply, Brian. Just to be on the safe side, do you think it is better to trim the end off?
#8 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
Excessive trimming reduces accuracy, and will degrade the results of any experiment. If you want to be confident that bases are genomic rather than artificial, I suggest you follow this methodology:
1) Map the reads to the reference (if you don't have a reference, you can make a quick assembly with Tadpole) with BBMap like this:
Code:
bbmap.sh in=reads.fq ref=ref.fa mhist=mhist.txt qhist=qhist.txt
If there is not an increased error rate in a region of the read, there is no reason to trim it. Conversely, it is prudent to trim if there is a high error rate at one end or the other.
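For anyone who wants to check the match histogram programmatically rather than by eye, here is a minimal Python sketch. It assumes mhist.txt is tab-separated with a header line starting with `#BaseNum` and per-read columns named `Match1`, `Sub1`, `Del1`, `Ins1` (and `Match2`, ... for read 2); verify that against your own file before relying on it. The function names and the "3x the median" cutoff are hypothetical choices for illustration, not part of BBMap.

```python
import statistics


def error_rates(mhist_text, read="1"):
    """Return [(base_number, error_rate)] for one read of the pair.

    Assumes a tab-separated table whose header line starts with '#'
    and contains BaseNum, Match<read>, Sub<read>, Del<read>, Ins<read>.
    """
    header = None
    rows = []
    for line in mhist_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: strip the leading '#' and remember column names.
            header = line.lstrip("#").split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before data")
        fields = dict(zip(header, line.split("\t")))
        match = float(fields["Match" + read])
        errors = sum(float(fields[name + read]) for name in ("Sub", "Del", "Ins"))
        total = match + errors
        # Works whether the columns hold counts or fractions.
        rows.append((int(fields["BaseNum"]), errors / total if total else 0.0))
    return rows


def flag_positions(rates, factor=3.0):
    """Return base numbers whose error rate exceeds factor * median rate."""
    median = statistics.median(rate for _, rate in rates)
    return [pos for pos, rate in rates if rate > factor * median]
```

To use it, read the file yourself, for example `rates = error_rates(open("mhist.txt").read())`, then look at `flag_positions(rates)`. If the flagged positions cluster at one end of the read, that end is a candidate for trimming; if nothing stands out, the advice above applies and there is no reason to trim.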
#9 |
Member
Location: Sweden Join Date: Nov 2014
Posts: 18
Thanks so much Brian for your advice. I will try as you suggested.
Tags |
fastqc, genome assembly, illumina, quality control |