SEQanswers

Old 07-29-2014, 09:22 AM   #1
biotechkk
Junior Member
 
Location: germany

Join Date: Jul 2014
Posts: 5
Help needed on FastQC adapter content

Dear all,

I have reads from a NextSeq 500. I removed the adapters using fastq-mcf and then ran FastQC on the trimmed reads. I am getting a fail result for adapter content, k-mer content and per base sequence content.
Am I missing any adapters that should be removed? I can see the problem in the first 30 bp. I have attached the adapter sequences and screenshots of the FastQC results. Any help would be very useful to me.

Thanks

Regards
Manoj
Attached Images
File Type: png adapter_content.png (13.5 KB, 79 views)
File Type: png kmer_profiles.png (52.5 KB, 48 views)
File Type: png per_base_sequence_content.png (27.7 KB, 41 views)
Attached Files
File Type: zip adaptors_list.zip (1.8 KB, 11 views)
Old 07-29-2014, 09:35 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,080

Is this RNAseq data? The bias seen at the beginning of reads is a common observation because of the "random" primers not being so random (http://seqanswers.com/forums/showthr...t=4846&page=14) and does not need trimming. A similar bias is also seen with tagmentation reactions and is normal.
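
To look at this start-of-read bias outside of FastQC, the per-position base fractions can be tabulated directly. The following is only a rough sketch of that idea, not FastQC's own implementation; the function name `per_base_content` is made up for illustration, and it assumes the read sequences have already been loaded as plain strings.

```python
from collections import Counter

def per_base_content(reads, length):
    """For each of the first `length` positions, return the fraction of
    A, C, G and T observed across all reads (hypothetical helper)."""
    counts = [Counter() for _ in range(length)]
    for seq in reads:
        for i, base in enumerate(seq[:length]):
            counts[i][base] += 1
    table = []
    for c in counts:
        total = sum(c.values())  # includes N or other symbols, if present
        table.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return table

# Toy input; real data would come from the FASTQ file.
reads = ["ACGT", "AGGT", "ACTT", "ATGA"]
content = per_base_content(reads, 2)
```

With unbiased priming the four fractions should be roughly constant along the read; the bias described above shows up as uneven fractions in the first positions only.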
Old 07-29-2014, 10:04 AM   #3
biotechkk
Junior Member
 
Location: germany

Join Date: Jul 2014
Posts: 5

Thanks. It is exon data. Can I remove the first 20 bp, or is there another way to remove that bias? I mapped the reads with bwa and got a mapping quality of 60%, so I need to remove this bias at the beginning of the reads.
Old 07-29-2014, 10:29 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,080

Does the mapping quality (60%) refer to the percentage of reads that are mapping? I doubt the low % is because of the first 20 bp, but if you want to give it a try you could remove them and see if it improves your mapping.

What kind of reference genome are you mapping to? Is it a reasonably complete one (e.g. human)?

More than likely you have some other issue (primer dimers etc.) that is preventing reads from mapping.
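
If you do want to try dropping the first 20 bp as an experiment, the operation itself is simple. Here is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record; `read_fastq` and `crop_5prime` are hypothetical helper names, not part of any existing tool.

```python
import io

def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ."""
    while True:
        lines = [handle.readline().rstrip("\n") for _ in range(4)]
        if not lines[0]:
            return  # end of file
        yield tuple(lines)

def crop_5prime(records, n=20):
    """Remove the first n bases, and the matching quality values, from each read."""
    for header, seq, plus, qual in records:
        yield header, seq[n:], plus, qual[n:]

# Tiny in-memory example; a real run would open the FASTQ file instead.
example = io.StringIO("@read1\n" + "A" * 20 + "CGTACGTACG\n+\n" + "I" * 30 + "\n")
cropped = list(crop_5prime(read_fastq(example)))
```

Note that reads shorter than n come out empty, and with paired-end data both files have to be processed the same way so the mates stay in sync.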
Old 07-29-2014, 11:00 AM   #5
biotechkk
Junior Member
 
Location: germany

Join Date: Jul 2014
Posts: 5

Yes, only 60% of paired reads mapped to the human genome (hg19). How can I identify primer dimers in my reads and remove them?
Old 07-30-2014, 12:47 AM   #6
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

There are two problems here. Firstly, you have the typical RNA-Seq kmer bias problem at the start of your reads showing up in your Kmer and base composition plots. This isn't something you can fix, or that would be improved by trimming, so don't worry about this.

Secondly, you have read-through adapter contamination of your reads, as shown by the adapter content plot. However, this is not produced by the normal common Illumina adapter, but comes from the transposase which Illumina uses to fragment their libraries in some of their kits. This means that your data would benefit from being trimmed to remove this, and this will help with the mapping efficiency. You'll need to modify the default options for the trimming program you use to specify that it's the transposase sequence (CTGTCTCTTATA) which you want to remove. Pretty much all trimming programs will allow you to specify this sequence. After trimming you should see the adapter content plot be flat in the trimmed data, and hopefully your mapping efficiency will go up too.
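
To make concrete what trimming this sequence means, here is a simplified sketch. It is not how any particular trimming program works: it only handles exact matches (real trimmers tolerate mismatches), and the function name `trim_adapter` and the `min_overlap` default are assumptions chosen for illustration.

```python
# Transposase read-through sequence quoted in the post above.
TRANSPOSASE = "CTGTCTCTTATA"

def trim_adapter(seq, qual, adapter=TRANSPOSASE, min_overlap=5):
    """Cut the read at the first full adapter match, or at a partial
    match of at least min_overlap bases hanging off the 3' end."""
    pos = seq.find(adapter)
    if pos == -1:
        # No full match: look for a prefix of the adapter at the read end,
        # trying the longest possible prefix first.
        for k in range(len(adapter) - 1, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual  # nothing to trim
    return seq[:pos], qual[:pos]

full = trim_adapter("ACGTACGT" + TRANSPOSASE + "CACATCT", "I" * 27)
partial = trim_adapter("ACGTACGTAA" + "CTGTCTC", "I" * 17)
clean = trim_adapter("ACGTACGTAA", "I" * 10)
```

Everything from the start of the match onwards is discarded, since whatever follows the transposase sequence is also adapter rather than insert.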
Old 07-30-2014, 02:11 AM   #7
biotechkk
Junior Member
 
Location: germany

Join Date: Jul 2014
Posts: 5

Thanks for your help. I have removed the transposase sequence and now I can see that the adapter content plot is flat. I will map the reads and check the mapping quality.