
Old 03-15-2012, 05:02 PM   #1
Junior Member
Location: San Diego

Join Date: Feb 2012
Posts: 9
weird kmer-content peak in RNA-seq data

Hi all,

I have attached a picture of the k-mer content of my RNA-seq experiment.
The input was a FASTQ file with over 30 million 51 bp reads, from RNA-seq on an Illumina HiSeq.
At about position 21 in the reads, I see the AAAAA 5-mer suddenly rising. Does anyone have a clue what might be causing that? I see it in all my samples. The RNA samples were collected after arresting translation with cycloheximide.
Could something be wrong with the fragment size? If so, how do I check that?

Thanks a million!

Attached Images
File Type: png kmer_profile.png (74.2 KB, 127 views)
kareldegendt
Old 03-16-2012, 05:34 AM   #2
Location: Boston

Join Date: Sep 2010
Posts: 14

Hi Karel,
It's possible that the AAAAA k-mer you're seeing at around position 21 comes from arrested transcripts caused by the cycloheximide treatment. You can check this by looking at the sequence length distribution plot in FastQC. However, even if you see uniform 51 bp reads, that doesn't mean the AAAAA k-mer is not due to terminated transcripts, because you may have sequenced the 3' UTR. I would probably try to address this computationally by separating out the reads with the AAAAA k-mer at around position 21. I would then trim them and run the shorter (i.e. 19 bp) and longer (i.e. 50 bp) reads through my analysis pipeline separately. I'm assuming the goal of your experiment is to compare genes that are transcribed rapidly, or in response to whatever stimulus you gave your cells before cycloheximide (CHX) treatment, with genes that are transcribed slowly or not at all in response to the stimulus. This should tell you something about these genes.
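
For what it's worth, here is a rough Python sketch of the "separate and trim" step described above. It is only an illustration under some assumptions: the input is plain (uncompressed) 4-line FASTQ, the function names `parse_fastq` and `split_on_kmer` are made up for this example, and the `min_start=18` default is an arbitrary guess at "around position 21" that you would want to tune against your own k-mer plot.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from plain 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip stray blank lines between or after records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # '+' separator line, ignored
        qual = next(it, "").rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def split_on_kmer(
    records: Iterable[Record],
    kmer: str = "AAAAA",
    min_start: int = 18,  # 0-based; hypothetical default, adjust to your data
) -> Tuple[List[Record], List[Record]]:
    """Separate reads that contain `kmer` at or after `min_start`.

    Returns (trimmed, untouched). Reads with the k-mer are cut just before
    its first occurrence at or after `min_start`; all other reads are
    returned unchanged.
    """
    trimmed: List[Record] = []
    untouched: List[Record] = []
    for header, seq, qual in records:
        pos = seq.find(kmer, min_start)
        if pos == -1:
            untouched.append((header, seq, qual))
        else:
            # Trim sequence and quality string to the same length.
            trimmed.append((header, seq[:pos], qual[:pos]))
    return trimmed, untouched
```

You would call it with something like `split_on_kmer(parse_fastq(open("reads.fastq")))` (the file name is a placeholder) and then write the two lists to separate FASTQ files for the downstream pipeline. Note that very short trimmed reads may not align uniquely, so a minimum-length filter afterwards is probably sensible.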
kalidaemon
Old 08-22-2012, 04:45 PM   #3
Location: Coralville IA

Join Date: Mar 2012
Posts: 15

I think the k-mer content of my data is pretty darn bad. Is there any way to filter or trim these k-mers from my data and analyze them separately?
Any useful comment is highly appreciated.
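
Before filtering or trimming, it can help to know at which read positions a given k-mer actually starts. The following is a minimal Python sketch of such a tally, not a replacement for FastQC; the function name `kmer_start_profile` is invented for this example, and it assumes the read sequences have already been pulled out of the FASTQ file as plain strings.

```python
from collections import Counter
from typing import Iterable


def kmer_start_profile(sequences: Iterable[str], kmer: str = "AAAAA") -> Counter:
    """Count, per 0-based read position, how often `kmer` starts there.

    Overlapping occurrences are counted, so a run of six A's contributes
    two start positions for the 5-mer AAAAA.
    """
    profile: Counter = Counter()
    for seq in sequences:
        start = seq.find(kmer)
        while start != -1:
            profile[start] += 1
            # Advance by one base so overlapping matches are found too.
            start = seq.find(kmer, start + 1)
    return profile
```

Sorting the result with `sorted(profile.items())` gives a position-by-position table; a sharp jump at one position suggests a trimming point, while a flat profile suggests the k-mer is simply common in the library.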
Attached Images
File Type: jpg Capture.JPG (88.2 KB, 57 views)
mparida
