I am trying to understand what the columns mean in a sequencing dataset that I downloaded from NCBI. The page is http://www.ncbi.nlm.nih.gov/geo/quer...i?acc=GSE49372, and I downloaded the "GSE49372_RAW.tar" file; after extracting it, each sample is summarized in a .txt file. A subset of the first sample is shown below:
gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi status
YAL069W 12733 I 334 649 0 0 0 OK
YAL068W-A 12733 I 537 792 0 0 0 OK
YAL068C 12734 I 1806 2169 2.22061 0 5.20096 OK
YAL067W-A 12735 I 2479 2707 0 0 0 OK
YAL067C 12736 I 7234 9016 44.7682 31.3864 58.15 OK
YAL066W 12737 I 10090 10399 0 0 0 OK
It looks to me as if the "left" and "right" columns correspond to read counts (is that correct?). I want to build a read count matrix with genes as rows and samples as columns (the kind of input that edgeR or DESeq analyze), but I don't know how to summarize the "left" and "right" columns for this sample. I would appreciate any suggestions.
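To make the goal concrete, here is a minimal sketch of collecting one value per gene from each per-sample file into a gene-by-sample matrix. It assumes the columns mean what their names suggest, i.e. that "left" and "right" are start and end positions on the chromosome and "FPKM" is the normalized expression value; that reading is an assumption, not something confirmed by the GEO record. The names `read_fpkm`, `build_matrix`, and `sample_1` are hypothetical, and the sample text is just the excerpt above.

```python
import io

# The excerpt from the first sample, used here as stand-in input.
SAMPLE_TEXT = """\
gene_id bundle_id chr left right FPKM FPKM_conf_lo FPKM_conf_hi status
YAL069W 12733 I 334 649 0 0 0 OK
YAL068W-A 12733 I 537 792 0 0 0 OK
YAL068C 12734 I 1806 2169 2.22061 0 5.20096 OK
YAL067W-A 12735 I 2479 2707 0 0 0 OK
YAL067C 12736 I 7234 9016 44.7682 31.3864 58.15 OK
YAL066W 12737 I 10090 10399 0 0 0 OK
"""


def read_fpkm(handle):
    """Return {gene_id: FPKM} from one whitespace-delimited per-sample table."""
    header = handle.readline().split()
    gene_col = header.index("gene_id")
    fpkm_col = header.index("FPKM")
    values = {}
    for line in handle:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        values[fields[gene_col]] = float(fields[fpkm_col])
    return values


def build_matrix(samples):
    """Turn {sample_name: {gene_id: value}} into (gene_ids, sample_names, rows).

    Rows follow the sorted gene ids; a gene missing from a sample becomes NaN.
    """
    sample_names = list(samples)
    gene_ids = sorted(set().union(*(s.keys() for s in samples.values())))
    rows = [
        [samples[name].get(gene, float("nan")) for name in sample_names]
        for gene in gene_ids
    ]
    return gene_ids, sample_names, rows


# With real data, each entry would come from open("<sample file>.txt") instead.
samples = {"sample_1": read_fpkm(io.StringIO(SAMPLE_TEXT))}
gene_ids, sample_names, matrix = build_matrix(samples)

for gene, row in zip(gene_ids, matrix):
    print(gene, *row, sep="\t")
```

Note that the result is a matrix of FPKM values, not raw read counts. edgeR and DESeq are designed for raw counts, so if the files contain only FPKM, a count matrix would probably have to come from the original alignments or reads rather than from these tables.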