05-13-2011, 05:23 AM   #1
Junior Member
Location: pittsburgh, pa

Join Date: May 2011
Posts: 2
Modification of reference genome annotation by cufflinks/cuffdiff?


I am totally new to NGS technologies. We are just now starting an RNA-seq project, and I am trying to figure out how to find differentially expressed genes with my Illumina (100bp) reads in Drosophila.

We aligned our reads to the dm3 genome in Galaxy, and I imported a GTF file from UCSC. One thing that was really weird is that some genes I knew to be expressed in the tissue we isolated were listed with a value of 0 FPKM in cuffdiff. However, when I looked at the TopHat-aligned reads in the UCSC browser, there were plenty of reads over those genes.

Looking more closely at the data, I noticed that the chromosomal position reported by cuffdiff for the gene I was looking at had changed. Its previous range was 264127-264127 on the chromosome, but cuffdiff had set it to 264730-364670. So the gene is reported by cuffdiff to be a 100kb gene instead of a 2kb gene. The other weird thing is that many genes in that region of the genome had been changed, relative to the reference GTF file, to have identical boundaries.
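
For anyone wanting to check this kind of discrepancy directly, here is a minimal Python sketch that computes each gene's span from the reference GTF and lists genes that end up with identical boundaries. It assumes a standard 9-column, tab-separated GTF with a gene_id attribute. The function names and the example records are hypothetical, made up for illustration, and are not part of cufflinks or cuffdiff.

```python
import re
from collections import defaultdict


def gene_spans(gtf_lines):
    """Return {gene_id: (chrom, min_start, max_end)} from GTF lines.

    Assumes each gene_id lives on a single chromosome; the chromosome
    of the first record seen for a gene is the one that is kept.
    """
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match is None:
            continue  # record has no gene_id attribute
        gene = match.group(1)
        if gene in spans:
            _, lo, hi = spans[gene]
            spans[gene] = (chrom, min(lo, start), max(hi, end))
        else:
            spans[gene] = (chrom, start, end)
    return spans


def shared_boundaries(spans):
    """Group gene ids whose (chrom, start, end) are exactly the same."""
    groups = defaultdict(list)
    for gene, span in spans.items():
        groups[span].append(gene)
    return {span: sorted(genes) for span, genes in groups.items() if len(genes) > 1}


# Hypothetical records, only to show the input shape.
example = [
    'chr2L\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "geneA"; transcript_id "tA";',
    'chr2L\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "geneA"; transcript_id "tA";',
    'chr2L\tdemo\texon\t100\t400\t.\t+\t.\tgene_id "geneB"; transcript_id "tB";',
    'chr2L\tdemo\texon\t1000\t1200\t.\t-\t.\tgene_id "geneC"; transcript_id "tC";',
]

spans = gene_spans(example)
for span, genes in shared_boundaries(spans).items():
    print(span, genes)
```

Running the same functions on the real GTF (for example with `gene_spans(open(path))`, where `path` is wherever the file was saved) gives the annotated span per gene_id, which can then be compared by eye against the locus coordinates that cuffdiff reports.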

Is this some kind of artifact that causes genes to become associated with a different gene's annotation?

Thanks for any help you can provide!

markr