Hello,
So, I'm doing some downstream analysis on a published RNA-seq data set for yeast: http://downloads.yeastgenome.org/pub...3390610/fastq/
However, after mapping the reads to the yeast genome, I noticed using SAMtools that, oddly enough, far more reads map to the complement of genes than to the genes themselves. That is, if a gene is on the (+) strand between nucleotides 500 and 1000 (for example), then for most genes far more RNA-seq reads map to that location on the (-) strand than on the (+) strand. Only ~800 genes map in a 'canonical' fashion, that is, with more reads on the gene's own strand than on the complementary strand, while ~5800 map in a non-canonical way, with more reads complementary to the gene than on the gene's own strand.
I tested the script I wrote to make these measurements on other RNA-seq datasets and did not find the same thing. What could be wrong with my yeast dataset?
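For illustration, the per-gene strand comparison described above can be sketched roughly as follows. This is a minimal sketch, not the script used for the numbers in this post: the function names and the example records are hypothetical, and it assumes single-end reads in plain SAM text, with "sense" meaning that a read aligns to the same strand as the gene (some stranded library protocols produce reads in the opposite orientation, in which case the two counts swap).

```python
import re

# CIGAR operations; only M, D, N, = and X consume reference bases.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def read_strand_and_span(sam_line):
    """Return (chrom, start, end, strand) for a mapped SAM record, else None.

    Coordinates are 1-based and inclusive, as in the SAM POS field.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # 0x4 = read is unmapped
        return None
    chrom, start, cigar = fields[2], int(fields[3]), fields[5]
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                  if op in REF_CONSUMING)
    strand = "-" if flag & 0x10 else "+"  # 0x10 = aligned to reverse strand
    return chrom, start, start + ref_len - 1, strand


def count_sense_antisense(sam_lines, gene):
    """Count reads overlapping `gene` on its own strand vs. the other strand.

    `gene` is a hypothetical (chrom, start, end, strand) tuple, 1-based inclusive.
    """
    g_chrom, g_start, g_end, g_strand = gene
    sense = antisense = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        rec = read_strand_and_span(line)
        if rec is None:
            continue
        chrom, start, end, strand = rec
        if chrom != g_chrom or end < g_start or start > g_end:
            continue  # no overlap with the gene interval
        if strand == g_strand:
            sense += 1
        else:
            antisense += 1
    return sense, antisense


def make_record(name, flag, chrom, pos, cigar):
    """Build a minimal SAM line for the toy example (sequence/quality omitted)."""
    return "\t".join([name, str(flag), chrom, str(pos), "255", cigar,
                      "*", "0", "0", "*", "*"])


example_reads = [
    make_record("r1", 0, "chrI", 500, "50M"),    # (+) strand, overlaps gene
    make_record("r2", 16, "chrI", 600, "50M"),   # (-) strand, overlaps gene
    make_record("r3", 16, "chrI", 990, "20M"),   # (-) strand, overlaps gene end
    make_record("r4", 4, "*", 0, "*"),           # unmapped, ignored
    make_record("r5", 0, "chrII", 500, "50M"),   # different chromosome
    make_record("r6", 16, "chrI", 1001, "50M"),  # starts just past the gene
]
example_gene = ("chrI", 500, 1000, "+")
print(count_sense_antisense(example_reads, example_gene))
```

A gene would count as 'canonical' in the sense used above when the first number is larger than the second.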
I have performed the alignment with both SHRiMP and TopHat; both programs gave the same numbers. Changing the library type in TopHat did not affect the outcome.
Thanks for any help!