![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Bug? duplicated genes in cufflinks output genes.expr | silin284 | Bioinformatics | 3 | 05-18-2014 12:19 AM |
Common elements within two set of miRNAs | Giorgio C | Bioinformatics | 1 | 01-05-2012 10:35 AM |
Fastq sorting and common part? | stoker | Bioinformatics | 0 | 07-07-2011 01:53 AM |
very common mutation not in SNP databases | fpepin | Bioinformatics | 17 | 02-11-2011 06:14 PM |
edgeR with no replication (Common disp or poisson) | sergio | Bioinformatics | 3 | 05-24-2010 09:53 PM |
![]() |
|
Thread Tools |
![]() |
#1 |
Member
Location: Maryland Join Date: Aug 2011
Posts: 52
|
![]()
Now I have two Excel spreadsheets, parts of them are as below:
Spreadsheet1 Gene Change Padj 1 212 0.0006 2 99 0.09 3 15 0.7 4 59 0.456 5 418 0.00001 6 566 0 Spreadsheet2 Gene Statistic FDR 1 28 0.0005 2 17 0.007 3 3 0.81 4 8 0.28 5 48 0.002 6 111 0 Using Excel, how can I create one new spreadsheet that only includes the Genes with both Padj and FDR are less than 0.05? And how can I do this in any other statsitical software, Like R, SPSS? Thanks very much for any idea for any question! |
![]() |
![]() |
![]() |
#2 |
--Site Admin--
Location: SF Bay Area, CA, USA Join Date: Oct 2007
Posts: 1,358
|
![]()
This is cake in R...
Depending on how large the datasets are, read them into R. The example below (using read.table) is just for using the clipboard in MacOSX (highlight in excel, copy, then run that line). The equiv in windows is read.table("clipboard",sep='\t',header=TRUE). Code:
> df1 <- read.table(pipe("pbpaste"),header=TRUE) > df1 Gene Change Padj 1 1 212 0.00060 2 2 99 0.09000 3 3 15 0.70000 4 4 59 0.45600 5 5 418 0.00001 6 6 566 0.00000 > df2 <- read.table(pipe("pbpaste"),header=TRUE) > df2 Gene Statistic FDR 1 1 28 0.0005 2 2 17 0.0070 3 3 3 0.8100 4 4 8 0.2800 5 5 48 0.0020 6 6 111 0.0000 > df3 <- merge(df1,df2) > df3 Gene Change Padj Statistic FDR 1 1 212 0.00060 28 0.0005 2 2 99 0.09000 17 0.0070 3 3 15 0.70000 3 0.8100 4 4 59 0.45600 8 0.2800 5 5 418 0.00001 48 0.0020 6 6 566 0.00000 111 0.0000 > df3[df3$Padj< 0.05 & df3$FDR < 0.05,] Gene Change Padj Statistic FDR 1 1 212 6e-04 28 5e-04 5 5 418 1e-05 48 2e-03 6 6 566 0e+00 111 0e+00 > write.csv(df3[df3$Padj< 0.05 & df3$FDR < 0.05,]) "","Gene","Change","Padj","Statistic","FDR" "1",1,212,6e-04,28,5e-04 "5",5,418,1e-05,48,0.002 "6",6,566,0,111,0 |
![]() |
![]() |
![]() |
#3 | |
Member
Location: Maryland Join Date: Aug 2011
Posts: 52
|
![]()
Thanks ECO, I run the similar codes in R, too. And I can get the result like yours Just based on the Spreadsheets showed in this thread. Sorry for not telling more info about my data:But I have more Genes(37970) in Spreadsheet2 than Spreadsheet1 (17980). It supposed to have 17980 Genes left after merging [the 19990(37970-17980) Genes which only in Spreadsheet2 will not appear in the merged sheets]. Unfortunately, it messed up after merging. Any idea will be greatly appreciated.
Quote:
Last edited by byou678; 10-03-2011 at 01:08 PM. |
|
![]() |
![]() |
![]() |
#4 |
--Site Admin--
Location: SF Bay Area, CA, USA Join Date: Oct 2007
Posts: 1,358
|
![]()
Just need to tell merge what to do...(note I added two extra genes into df1)
Code:
> df1 Gene Change Padj 1 1 212 0.00060 2 2 99 0.09000 3 3 15 0.70000 4 4 59 0.45600 5 5 418 0.00001 6 6 566 0.00000 7 7 123 0.00000 8 8 125 0.04000 > df2 Gene Statistic FDR 1 1 28 0.0005 2 2 17 0.0070 3 3 3 0.8100 4 4 8 0.2800 5 5 48 0.0020 6 6 111 0.0000 > merge(df1,df2,on=Gene) Gene Statistic FDR Change Padj 1 1 28 0.0005 212 0.00060 2 2 17 0.0070 99 0.09000 3 3 3 0.8100 15 0.70000 4 4 8 0.2800 59 0.45600 5 5 48 0.0020 418 0.00001 6 6 111 0.0000 566 0.00000 |
![]() |
![]() |
![]() |
#5 | |
Member
Location: Maryland Join Date: Aug 2011
Posts: 52
|
![]()
Thanks again ECO. Yes, I like using R too and I have figured it out.
Quote:
|
|
![]() |
![]() |
![]() |
Tags |
common genes extract |
Thread Tools | |
|
|