07-04-2014, 08:30 AM
Question: What do I do with the output files from TopHat/Cufflinks?

Hi, I am a beginner with RNA-seq. I used TopHat to align RNA-seq reads from GEUVADIS to the UCSC hg19 genome. In TopHat, I provided the reference transcript annotation, and then used the accepted_hits.bam file from the output as the input for Cufflinks.

I ran Cufflinks both with and without the reference transcripts and have the outputs from both runs. Now I am stuck. What exactly can I do next? I have the isoform and gene FPKM tracking files with their values, but how should I approach analyzing them in general? I am not working on a specific project; I just want to learn what kinds of analyses I can do with these files, as well as with the transcripts.gtf file.

Also, what does an FPKM value of 0 mean? Some other forums mentioned that it means none of the reads mapped to the reference, so I wrote a simple script to filter all of these entries out of the isoforms.fpkm_tracking file. Is this OK?
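
To make the question concrete, a filter of this kind can be sketched roughly as follows. This is only an illustration, not necessarily the script used here. It assumes the file is tab-separated with a header row containing a column named FPKM; the function names and the output file name are made up for the example.

```python
import csv


def filter_zero_fpkm(lines, fpkm_column="FPKM"):
    """Yield the header row, then every row whose FPKM is greater than zero.

    `lines` is any iterable of tab-separated text lines (for example an
    open file). The column name "FPKM" is an assumption about the header.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    fpkm_index = header.index(fpkm_column)  # locate the column by name
    yield header
    for row in reader:
        if float(row[fpkm_index]) > 0.0:
            yield row


def filter_file(in_path, out_path):
    """Write the non-zero rows of in_path to out_path (hypothetical helper)."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerows(filter_zero_fpkm(src))
```

Calling something like `filter_file("isoforms.fpkm_tracking", "isoforms.nonzero.fpkm_tracking")` would write the filtered rows to a separate file, so the original Cufflinks output stays untouched.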

Lastly, how can I compare the isoform/transcript files from the two Cufflinks runs, with and without the reference annotation?

Thank you so much in advance for the help!

