Hi,
I have two different RNA-Seq datasets, both sequenced on Illumina HiSeq machines, and I received them as FPKM tables. They were sequenced at different places and at different times. I don't have much information on one dataset, as it was processed two years ago. The other was processed recently using TopHat and Cufflinks. I need to combine both sets of sequencing results, focus on the list of differentially expressed (DE) genes, and then apply some bioinformatics approaches for module detection.
From one dataset I have a list of 20,000 DE genes, and from the second dataset a list of 38,000 genes. I am wondering whether it is reasonable to use these data together for further analysis, since the library protocols may have been different, which could also explain the differing number of genes in the two datasets.
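To make the mismatch concrete, this is roughly how the gene IDs in the two tables could be compared. It is only a minimal sketch: the function name `compare_gene_ids` and the toy gene IDs are hypothetical stand-ins for the first column of each FPKM table, and it assumes both tables use the same kind of gene identifier.

```python
def compare_gene_ids(ids_a, ids_b):
    """Summarise how two collections of gene IDs overlap."""
    set_a, set_b = set(ids_a), set(ids_b)
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "only_a": len(set_a - set_b),   # IDs found only in the first table
        "only_b": len(set_b - set_a),   # IDs found only in the second table
        "shared": len(shared),          # IDs present in both tables
        # Jaccard index: shared IDs as a fraction of all distinct IDs
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy IDs standing in for the gene column of each FPKM table (hypothetical).
dataset_old = ["geneA", "geneB", "geneC"]
dataset_new = ["geneB", "geneC", "geneD", "geneE"]

summary = compare_gene_ids(dataset_old, dataset_new)
print(summary)
```

If only a small share of the IDs overlaps, the two tables were probably built against different annotations, which would be one more argument for reprocessing both datasets the same way.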
Would it be a good idea to start from the FASTA sequences and redo the whole analysis (mapping, Cufflinks) from scratch?
Any ideas are warmly welcome.