Hi,
As part of my new faculty appointment, I am the faculty adviser for my department and am helping to get an Illumina sequencing core off the ground for the university.
While trying to put together some guidelines on sequencing coverage for RNA-seq, I became quite confused as to what is right.
Can anyone refer me to the most recent best practices or good papers dealing with this issue?
The original ENCODE recommendations do not agree much with my experience.
Apart from the fact that you need at least 3, not 2, biological replicates to do good stats, the 30M PE reads do not seem to be enough according to my calculation below:
Given a human genome size of 3 billion bp, assuming that 80% of the reads will be mapped with high accuracy, and estimating that 10% of the genome is transcribed into polyA RNA (this is the proportion of the genome I usually end up mapping to),
the average coverage of 30M 100 bp reads (0.03 billion reads) is: (0.03 x 100 x 0.8) / (3 x 0.1) = 8X
This seems really low. Is my calculation correct?
Or is my mistake in assuming that 10% of the genome gets mapped? (If we assume 2% instead, you get 40X coverage, but that is not my experience.)
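
For anyone who wants to check the arithmetic or try other assumptions, here is a minimal Python sketch of the same back-of-the-envelope estimate. The function and parameter names are my own (hypothetical), and it assumes reads are spread uniformly over the expressed fraction of the genome, which real RNA-seq data are not, so treat the result as an average only.

```python
def mean_coverage(n_reads, read_len_bp, mapped_frac, genome_bp, expressed_frac):
    """Average depth = mapped bases / size of the region they map to.

    Assumption: uniform coverage over the expressed fraction of the genome.
    """
    mapped_bases = n_reads * read_len_bp * mapped_frac  # bases that align
    target_bp = genome_bp * expressed_frac              # bp covered by polyA RNA
    return mapped_bases / target_bp


# Numbers from the post: 30M reads of 100 bp, 80% mapped, 3 Gbp genome.
cov_10pct = mean_coverage(30e6, 100, 0.8, 3e9, 0.10)  # 10% of genome expressed
cov_2pct = mean_coverage(30e6, 100, 0.8, 3e9, 0.02)   # 2% of genome expressed

print(f"10% expressed: {cov_10pct:.1f}X")
print(f" 2% expressed: {cov_2pct:.1f}X")
```

Changing expressed_frac is the quickest way to see how sensitive the estimate is to that one assumption.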
Thanks in advance for the feedback.