Hello,
I am fairly new to the world of bioinformatics. I have data from an Illumina RNA-seq run we did with Arabidopsis RNA. I ran the reads through Cufflinks/Cuffdiff to get transcript levels for genes. I've noticed that for some genes, the program groups a few (2-4) genes/loci together and gives them all the same FPKM. When I look these genes up, though, they are clearly separate genes and not isoforms of one gene. I believe the grouping happens because these genes are highly similar in sequence. When I look at my aligned reads in a genome browser, most of the reads often align to only one of the genes. In some instances, however, two or all of the genes show roughly equal numbers of aligned reads. So my question is: what exactly is the FPKM for these "grouped" genes? Is it for only one of the genes, or is it distributed among all the genes in the group? How do I find out what the FPKM is for each individual gene? Thanks in advance to anyone with any ideas!