Hi guys,
I have a simple question regarding Cufflinks, BAM files and GFF. It is all connected to FPKM values of 0 for some transcripts that nevertheless have reads when I inspect the region in the browser.
I found a post from 2010 stating that the order of the chromosomes in the GFF file must follow the order of the chromosomes in the BAM file. Here is the post:
"I had this same issue and discussed it with Cole Trapnell. This issue arises because Cufflinks requires a tab delimited header in the SAM file you are using. Without the header, the GTF and SAM files are processed in the same order. So, if annotations for chromosome 4 appear in the GTF file before the annotations for chromosome 2, but the chromosome 2 reads appear before the chromosome 4 reads in the SAM file, all of the genes/transcripts for one of those chromosomes will have FPKMs of 0 because by the time Cufflinks starts calculating FPKMs for one of those chromosomes, the corresponding reads have already been passed by in the SAM file - thus FPKM=0. A tab-delimited header file solves this. This also explains why it works fine without a GTF."
So I would like to ask: is that true? If it is, then one must always check and rearrange the GFF before using it at all. And that comes on top of all the other problems with GFF files. Damn tricky.
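To make the question concrete, here is a rough Python sketch of how the two chromosome orders could be compared. The function names are my own (hypothetical), and it assumes tab-delimited input with the chromosome in column 1 of the GFF/GTF and in column 3 (RNAME) of the SAM records. It only reports whether the orders differ; it says nothing about whether Cufflinks really behaves as the post describes.

```python
def chrom_order_gtf(lines):
    """Chromosome names from GFF/GTF lines, in order of first appearance."""
    seen = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/pragma lines
        chrom = line.split("\t", 1)[0]
        if chrom not in seen:
            seen.append(chrom)
    return seen


def chrom_order_sam(lines):
    """Chromosome names from SAM alignment records, in order of first appearance."""
    seen = []
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        chrom = line.split("\t")[2]  # RNAME is the third SAM column
        if chrom != "*" and chrom not in seen:  # "*" marks unmapped reads
            seen.append(chrom)
    return seen


def same_relative_order(gtf_order, sam_order):
    """True if the chromosomes present in both lists appear in the same order."""
    shared = set(gtf_order) & set(sam_order)
    return ([c for c in gtf_order if c in shared]
            == [c for c in sam_order if c in shared])


# Toy input mirroring the quoted example: chr4 before chr2 in the GTF,
# chr2 before chr4 in the SAM file.
gtf_lines = [
    'chr4\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "a"; transcript_id "a.1";',
    'chr2\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "b"; transcript_id "b.1";',
]
sam_lines = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr2\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr4\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
print(chrom_order_gtf(gtf_lines))
print(chrom_order_sam(sam_lines))
print(same_relative_order(chrom_order_gtf(gtf_lines), chrom_order_sam(sam_lines)))
```

For a real BAM file the alignment lines would first have to be converted to SAM text (for example with `samtools view`), and the GFF lines read from the file on disk.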
Thank you for your time and help