SEQanswers

03-20-2017, 10:24 AM   #1
SamCurt
Member
 
Location: Iowa

Join Date: May 2010
Posts: 36
The necessity of between-sample normalization?

I'm currently performing some analyses involving cross-project expression data. Because the algorithm involves linear equations, we have decided to use log(TPM) as its input.

While older workflows built on Rsubread or htseq-count always required us to perform between-sample normalization, newer transcript quantification tools such as RSEM, kallisto and Salmon give read counts and (at a minimum) TPM as their raw output.

But even in that case, should I take the read counts, normalize them with DESeq2/edgeR, and calculate the TPMs from those instead? I'm not particularly comfortable skipping between-sample normalization, but I have a feeling that it's the norm these days.
03-20-2017, 10:34 AM   #2
cmbetts
Member
 
Location: Bay Area

Join Date: Jun 2012
Posts: 83

Somebody more knowledgeable may correct me, but the statistical methods used by DESeq2 rely on having the raw read counts to calculate power and significance, and therefore can't use normalized values like TPM. (There's a huge statistical difference between seeing one count in a million reads and a thousand counts in a billion, and that difference is lost in normalization.)
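
To put rough numbers on that parenthetical, here is a minimal sketch. It assumes read counts behave roughly like Poisson draws, which is a simplification for illustration only (DESeq2 itself fits a negative binomial model), and the helper name `poisson_cv` is made up for this example. Both cases normalize to the same value per million reads, but the relative uncertainty of the underlying count is very different.

```python
import math

def poisson_cv(count):
    """Relative standard deviation (sd / mean) of a Poisson count: 1 / sqrt(count)."""
    return 1.0 / math.sqrt(count)

# (observed count, total reads in the library) for the two cases in the post
cases = {
    "1 count in a million reads": (1, 1_000_000),
    "1000 counts in a billion reads": (1000, 1_000_000_000),
}

for label, (count, depth) in cases.items():
    per_million = count / depth * 1e6   # the normalized value is the same in both cases
    cv = poisson_cv(count)              # the relative noise is not
    print(f"{label}: {per_million:.3f} per million, relative sd ~ {cv:.3f}")
```

Once only the per-million value is kept, the two cases are indistinguishable, which is why count-based tools want the raw counts.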
03-20-2017, 10:37 AM   #3
SamCurt
Member
 
Location: Iowa

Join Date: May 2010
Posts: 36

Quote:
Originally Posted by cmbetts View Post
Somebody more knowledgeable may correct me, but the statistical methods used by DESeq2 rely on having the raw read counts to calculate power and significance, and therefore can't use normalized values like TPM. (There's a huge statistical difference between seeing one count in a million reads and a thousand counts in a billion, and that difference is lost in normalization.)
Of course I know that tools like DESeq2 can't take TPM. The options I mentioned above are:
  1. Take the TPM from the quantifier directly downstream.
  2. Take the expected read count from the quantifier, normalize it with DESeq2 (or similar), and then calculate TPM from the normalized counts.
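
A small sketch of how the two options compare, under stated assumptions: the between-sample normalization is taken to be a single scaling factor per sample (as a DESeq2-style size factor is by default), the same effective lengths are used in both options, and all counts, lengths, and the size factor below are invented for illustration. The `tpm` function is a plain textbook TPM calculation, not any quantifier's actual code.

```python
def tpm(counts, eff_lengths):
    """TPM for one sample: length-normalized rates rescaled to sum to one million."""
    rates = [c / l for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

# Hypothetical expected counts and effective lengths for three transcripts in one sample
counts = [100.0, 300.0, 50.0]
eff_lengths = [1000.0, 1500.0, 500.0]

# Hypothetical per-sample size factor, standing in for what DESeq2 (etc.) would estimate
size_factor = 1.7

option1 = tpm(counts, eff_lengths)                              # TPM straight from the counts
option2 = tpm([c / size_factor for c in counts], eff_lengths)   # scale counts first, then TPM

print(option1)
print(option2)
```

Under these assumptions the two results agree up to floating-point error, because TPM rescales each sample to sum to one million and a per-sample constant cancels out. Option 2 would only differ from option 1 if the normalization factor were applied to the TPM values themselves, or if it varied per gene.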