I tried the option to emit count tables in cuffdiff. I expected the counts to be integers, but I get fractions. Are these normalized counts?
about --emit-count-tables
Hi Cole and PFS,
I have some questions about the --emit-count-tables option, which is documented as:
"Cuffdiff will output a file for each condition (called <sample>_counts.txt)".
I ran Cuffdiff on the BAM files of 4 samples together, but I got only locus_var.txt and pooled_counts.txt.
Q1. What does pooled_counts.txt indicate?
Q2. In locus_var.txt, do the numbers in the condition column refer to the samples in the same order as they appear on the cuffdiff command line?
*Here is my command:
cuffdiff -p 2 -u --emit-count-tables ./../compare/cuffcmp.combined.gtf ./../Nipponbare/tophat_out/accepted_hits.bam ./../Kasalath/tophat_out/accepted_hits.bam ./../Nip_Kas/tophat_out/accepted_hits.bam ./../Kas_Nip/tophat_out/accepted_hits.bam
*The content of locus_var.txt:
0 TCONS_00005163 1348.57 15275.1 132648 50019.6 30 3 0 0 0.219325 0 0 0 0 -1.0565
0 TCONS_00005164 1348.57 15275.1 132648 50019.6 30 3 0 0 0.0778509 0 0 0 0 0.0389116
1 TCONS_00000001 1348.57 15275.1 132648 96366.6 30 3 0 0 0.751663 0 0 0 0 0.676941
1 TCONS_00005163 1348.57 15275.1 132648 96366.6 30 3 0 0 0.00888725 0 0 0 0 -5.68775
1 TCONS_00005164 1348.57 15275.1 132648 96366.6 30 3 0 0 0.240978 0 0 0 0 1.74781
2 TCONS_00000001 1348.57 15275.1 132648 94122.3 30 3 0 0 0.699411 0 0 0 0 0.572296
2 TCONS_00005163 1348.57 15275.1 132648 94122.3 30 3 0 0 0.223565 0 0 0 0 -1.03559
2 TCONS_00005164 1348.57 15275.1 132648 94122.3 30 3 0 0 0.0784949 0 0 0 0 0.138381
0 TCONS_00005166 55.532 1915.14 986.219 19.6345 30 4 0 0 0 0 0 0 0 -inf
0 TCONS_00006797 55.532 1915.14 986.219 19.6345 30 4 0 0 0.996369 0 0 0 0 2.04356
0 TCONS_00006796 55.532 1915.14 986.219 19.6345 30 4 0 0 0.503341 0 0 0 0 1.0567
0 TCONS_00005165 55.532 1915.14 986.219 19.6345 30 4 0 0 0.00028983 0 0 0 0 -9.7241
3 TCONS_00000001 1348.57 15275.1 132648 158408 30 3 0 0 0.676484 0 0 0 0 0.521408
The numbers in the condition column correspond to...
0 -> Nipponbare
1 -> Kasalath
2 -> Nip_Kas
3 -> Kas_Nip
Is that correct?
One more question: after splitting this file into one text file per sample by condition number and sorting by description, I got pairs of identical records like this:
0 TCONS_00000001 1348.57 15275.1 132648 50019.6 30 3 0 0 0.703948 0 0 0 0 0.588801
0 TCONS_00000001 1348.57 15275.1 132648 50019.6 30 3 0 0 0.703948 0 0 0 0 0.588801
0 TCONS_00000002 3202.34 180770 597601 89875.9 30 8 0 0 0.00282914 0 0 0 0 -4.4797
0 TCONS_00000002 3202.34 180770 597601 89875.9 30 8 0 0 0.00282914 0 0 0 0 -4.4797
0 TCONS_00000005 885.268 8618.77 68982.9 14506.8 30 3 0 0 0.440866 0 0 0 0 0.181828
0 TCONS_00000005 885.268 8618.77 68982.9 14506.8 30 3 0 0 0.440866 0 0 0 0 0.181828
Why are there duplicate records?
If possible, could you please describe each column of locus_var.txt?
I would really appreciate any help,
zun
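The grouping step described in the post above can be sketched in Python as follows. This is only an illustration under stated assumptions: the mapping of condition numbers to sample names follows the command-line order (the poster's own guess, not confirmed in this thread), the first two whitespace-separated fields are the condition number and the transcript ID, and the function name `group_locus_var` is hypothetical. It collapses only rows that are identical in every field; it does not explain why Cuffdiff writes them twice.

```python
# Assumption (unconfirmed in the thread): condition numbers follow the
# order of the BAM files on the cuffdiff command line.
SAMPLE_NAMES = ["Nipponbare", "Kasalath", "Nip_Kas", "Kas_Nip"]


def group_locus_var(lines, sample_names=SAMPLE_NAMES):
    """Group locus_var.txt rows by condition, dropping exact duplicates.

    Returns a dict mapping sample name -> list of rows (each a list of
    string fields), sorted by the transcript ID in the second column.
    """
    grouped = {name: [] for name in sample_names}
    seen = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything that does not start with a
        # numeric condition index (for example a header line, if present).
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        key = tuple(fields)
        if key in seen:
            continue  # identical in every field to a row already kept
        seen.add(key)
        grouped[sample_names[int(fields[0])]].append(fields)
    for rows in grouped.values():
        rows.sort(key=lambda f: f[1])
    return grouped
```

Calling `group_locus_var(open("locus_var.txt"))` would return one sorted, de-duplicated list of rows per sample, which can then be written out to separate files.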
Originally posted by Cole Trapnell:
Yes, these are the result of "common scale transformation" of counts. See the DESeq paper for how these are computed - we do the same thing to transform all counts across the replicates for a condition to a common scale for that replicate before fitting the dispersion model.
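To see why a common-scale transformation turns integer counts into fractions, here is a minimal Python sketch of the median-of-ratios size-factor scaling described in the DESeq paper. It is not Cuffdiff's actual code; the function names `size_factors` and `to_common_scale` and the toy counts are made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per library.

    counts: list of rows, one row per gene, one raw count per library.
    """
    n_libs = len(counts[0])
    log_ratios = [[] for _ in range(n_libs)]
    for row in counts:
        # Genes with a zero count in any library are skipped, because
        # the log of their geometric mean is undefined.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_libs
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def to_common_scale(counts):
    """Divide each raw count by the size factor of its library."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Toy data: the second library was sequenced twice as deeply as the first.
raw = [[10, 20], [30, 60], [5, 10]]
scaled = to_common_scale(raw)
```

Because each raw count is divided by a non-integer size factor, the scaled values are generally fractional even though the inputs are whole numbers, which matches the fractions seen in the emitted count tables.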