Hi guys,
Ever since I started doing RNA-seq analysis, I have counted the total number of reads that map to a given genomic region (with coverageBed from BEDTools or my own scripts) and then normalized the counts with DESeq.
Recently a friend asked me why I sum the reads rather than, for example, taking the maximum coverage as an estimate of the number of RNA molecules that were in the sample.
We both know there are technical biases, such as PCR bias and sequencing bias, and more complicated biological mechanisms, such as alternative splicing and alternative promoter usage.
But who says the sum is more robust to these biases? My answer was that this is the method everybody uses, but as you can see, that is not a good justification.
Has anyone ever compared the two? I guess the sum is more robust, but I'm not sure. What do you think?
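
To make the two options concrete, here is a minimal Python sketch of both summaries on a toy region. Everything in it is an assumption made for illustration: the function names `read_count` and `max_coverage` are hypothetical, the read coordinates are made up, and reads are treated as simple half-open (start, end) intervals with no splicing. It is not what coverageBed or DESeq does internally.

```python
def read_count(reads, region_start, region_end):
    """Count reads that overlap the half-open region [region_start, region_end)."""
    return sum(1 for start, end in reads
               if start < region_end and end > region_start)


def max_coverage(reads, region_start, region_end):
    """Return the highest per-base depth inside the region."""
    depth = [0] * (region_end - region_start)
    for start, end in reads:
        # Clip each read to the region before adding it to the depth vector.
        for pos in range(max(start, region_start), min(end, region_end)):
            depth[pos - region_start] += 1
    return max(depth, default=0)


# Made-up reads on a 100 bp region (hypothetical data, half-open intervals).
reads = [(0, 10), (5, 15), (8, 18), (50, 60), (90, 100)]
print(read_count(reads, 0, 100), max_coverage(reads, 0, 100))

# The same reads plus four extra copies of one read, mimicking PCR duplicates.
with_duplicates = reads + [(50, 60)] * 4
print(read_count(with_duplicates, 0, 100), max_coverage(with_duplicates, 0, 100))
```

In this toy case the duplicated read changes both numbers, so the sketch only shows how the two summaries are computed. It does not settle which one is more robust.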
Thanks in advance,
Oz Solomon
Israel