Dear all,
I am using tophat/cufflinks on a set of single-end sequencing data from 4 different biological samples (without replicates). My first goal is to compare the RPKM values across the different conditions.
Here is how I use tophat:
tophat -p 4 -G /path/to/gff_file /path/to/genome_file /path/to_input_file
And here is how I use cufflinks:
cufflinks -G /path/to_gff_file /path/to/sam_file
When I look at the transcript.expr file, I am surprised to see that the same Ensembl transcript sometimes appears on several lines with different RPKM values, for example:
ENSDART00000000198 1367346 Zv8_scaffold3091 65836 148200 0.03 1 1 0 0.1 0.02 2139
ENSDART00000000198 1367332 Zv8_scaffold3091 65836 148200 1.47 1 1 1.03 1.9 0.77 2139
The same is true in the gene.expr file.
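To get an idea of how many IDs are affected, the repeated entries can be listed with a short script. This is only a minimal sketch: the function name `find_repeated_ids` is made up for illustration, and it assumes whitespace-separated rows with the transcript ID in the first column and the RPKM value in the sixth, as in the two rows shown above.

```python
from collections import defaultdict


def find_repeated_ids(lines, id_col=0, value_col=5):
    """Group whitespace-separated rows by ID and return the IDs seen more than once.

    The column positions are assumptions based on the two example rows,
    not on any documented file layout.
    """
    values_by_id = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) <= max(id_col, value_col):
            continue  # skip blank or truncated lines
        try:
            value = float(fields[value_col])
        except ValueError:
            continue  # skip a header line, if there is one
        values_by_id[fields[id_col]].append(value)
    # Keep only the IDs that occur on more than one row
    return {tid: vals for tid, vals in values_by_id.items() if len(vals) > 1}


rows = [
    "ENSDART00000000198 1367346 Zv8_scaffold3091 65836 148200 0.03 1 1 0 0.1 0.02 2139",
    "ENSDART00000000198 1367332 Zv8_scaffold3091 65836 148200 1.47 1 1 1.03 1.9 0.77 2139",
]
repeated = find_repeated_ids(rows)
for transcript_id, values in repeated.items():
    print(transcript_id, values)
```

For a real file, the `rows` list could be replaced by an open file handle, since the function only iterates over lines.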
Is it because I didn't use the -g 1 option in tophat to restrict reads to a single hit in the genome?
Could someone help me tune the options in tophat and cufflinks to avoid this splitting of the RPKM for the same location?
Cheers
Oliviera