**GenoMax** · 08-13-2013, 11:55 AM

Is this RNA-seq data?

The following thread may be useful (it refers to a MiSeq run, but the issue applies to Illumina sequencing in general): http://seqanswers.com/forums/showthread.php?t=30448

One more: http://seqanswers.com/forums/showthread.php?t=17219

**mattanswers** · 08-13-2013, 12:18 PM

Thank you very much, GenoMax, for the links. They are very informative.

This is RNA-Seq. It seems the bias in the first 12 or so bases is due to 'random' priming, but I was also wondering why the lines on the graph stay up at ~50% for the whole length of the graph. Random priming would explain the first 12 or so bases, but why the steady % for the rest of the sequence?

**GenoMax** · 08-13-2013, 02:48 PM

http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/11%20Overrepresented%20Kmers.html

**mattanswers** · 08-14-2013, 11:20 AM

Thanks again for your help, GenoMax.

My sequence length is only 50 bases and the quality is very good.

From what I read on the linked site, it seems that I have 6 kmers that are 50-fold enriched throughout the length of my sequence. But what does this mean in terms of sample quality?

If I have 25-30 million reads and there is a 50-fold enrichment of these kmers (most likely from the adaptor, I would guess), then how many sequences does that affect? For example, if there were 100,000 sequences that had adaptor sequence at various positions other than the end of the read, what would the fold-enrichment be? 100,000 affected sequences may be enough to make the fold-enrichment high, but they are only a small percentage of the total. On the other hand, if I had a much smaller total number of sequences, then the same fold-enrichment may be a problem. So I guess I want to know how to relate fold-enrichment to the total number of sequences, in order to tell whether the enrichment is a real problem or just comes from an insignificant part of the total.
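
One rough way to put numbers on this question is a back-of-the-envelope observed/expected calculation. The sketch below is not FastQC's actual code. It assumes a uniform base composition (each base at 25%), a kmer length of 5, and that every affected read contributes exactly one extra copy of the kmer. The function names are made up for illustration.

```python
def expected_kmer_count(total_reads, read_length, k):
    """Expected occurrences of one specific kmer in the whole library,
    assuming uniform base composition (an assumption, not FastQC's model)."""
    positions_per_read = read_length - k + 1
    return total_reads * positions_per_read * 0.25 ** k


def fold_enrichment(total_reads, read_length, k, affected_reads):
    """Observed/expected ratio if each affected read adds one extra copy."""
    expected = expected_kmer_count(total_reads, read_length, k)
    observed = expected + affected_reads
    return observed / expected


def extra_copies_for_fold(total_reads, read_length, k, fold):
    """Extra kmer occurrences needed to reach a given overall fold-enrichment."""
    return (fold - 1) * expected_kmer_count(total_reads, read_length, k)


if __name__ == "__main__":
    # 25 million reads of 50 bases, 100,000 of them carrying the kmer once.
    print(fold_enrichment(25_000_000, 50, 5, 100_000))
    # Same affected fraction (0.4%) in a library ten times smaller.
    print(fold_enrichment(2_500_000, 50, 5, 10_000))
    # Extra occurrences needed for a 50-fold overall enrichment.
    print(extra_copies_for_fold(25_000_000, 50, 5, 50))
```

Under these assumptions the ratio depends on the fraction of reads affected, not on the absolute library size: the first two calls give the same value. The third call shows how many extra copies an overall 50-fold figure would need in this simple model, which can be compared against the total read count to judge whether "one adaptor copy in a small subset of reads" is a plausible explanation.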

fastqc kmer relative enrichment
