Hello,
I am relatively new to RNA-seq analysis. I am currently analyzing human blood data from 22 biological samples using the TopHat and Cufflinks pipeline. The cufflinks command I used to analyze the *.bam files (generated by TopHat using the hg19 reference) is
cufflinks -p 16 -o S01 -G Homo_sapiensPlusChr.GRCh37.63.gtf -b hg19.fa -u -N --compatible-hits-norm ./Sample1_accepted_hits.bam
Both output files, genes.fpkm_tracking and isoforms.fpkm_tracking, seem to have a relatively large proportion of "LOWDATA" and "FAIL" calls in the FPKM_status column.
The proportion of these calls is similar (~30%) across the samples, and the genes receiving them are almost the same from sample to sample.
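For reference, the per-sample tally of status calls can be reproduced with something like the following. This is only a minimal sketch: it assumes the usual tab-delimited fpkm_tracking layout with a header row containing an FPKM_status column, and the function names are hypothetical helpers, not part of Cufflinks.

```python
import csv
from collections import Counter


def status_counts(lines):
    """Count each FPKM_status value in a Cufflinks *.fpkm_tracking table.

    `lines` is any iterable of text lines (an open file handle or a list),
    assumed to be tab-delimited with a header row naming FPKM_status.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter(row["FPKM_status"] for row in reader)


def status_fractions(lines):
    """Return the fraction of rows carrying each FPKM_status value."""
    counts = status_counts(lines)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: n / total for status, n in counts.items()}


def summarize(path):
    """Open one tracking file (hypothetical path) and return its status fractions."""
    with open(path, newline="") as handle:
        return status_fractions(handle)
```

Calling summarize("S01/genes.fpkm_tracking") for each sample directory (S01 being the -o directory from the command above) gives the share of OK, LOWDATA and FAIL rows, which makes it easy to compare samples side by side.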
I am not sure whether I am doing something wrong or whether this is the expected behavior of the algorithms. I am hoping that either I or the algorithms are doing something incorrect.
We have matching microarray data generated from these samples. Some of the genes that are highly expressed in the microarray data get this "FAIL" status even though their FPKM values seem relatively high.
Any help would be appreciated.
Thanks,
-Reuben