Hi,
I am new to the forum and this is my first post :-). I have a few questions regarding RNA-seq results. We recently performed an RNA-seq experiment and analysed the data on the galaxy.psu.edu server.
1. In some cases we got strange mapping results using TopHat with hg19 full as the reference genome. Have a look at the figure "Strange mapping results". As you can see, a lot of reads are mapped to an intron. Does this also happen in your data?
2. In my results there are a lot of small RNAs (MIRs and SNORDs). In my opinion this is because their sequences are usually embedded within other genes, for example:
gene name: SNORD24
locus: chr9:136215068-136218280
gene name: RPL7A
locus: chr9:136215068-136218280
Please have a look at the "Snord24 and Rpl7a" figure for details. Do you also get small RNAs in your results?
3. The SNORD24 gene is highly upregulated in control conditions, with an FPKM value of 5089 (and fewer than 20 reads mapping to it). On the other hand, the FPKM value for RPL7A is much lower, only about 440. I know that RPL7A is longer, but I still have some doubts about the way FPKM is calculated. This is why I would like to ask the following questions:
(a) Do Cufflinks and Cuffdiff use the actual gene length or the locus length in FPKM calculations?
In the case of SNORD24, the actual gene length is about 70 bp, while the locus length is about 3200 bp.
(b) Which reads are taken into account in FPKM calculations: only those mapping to the actual gene, or all those mapping to the locus?
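To make question (a) concrete, here is a minimal Python sketch of the textbook FPKM definition (fragments per kilobase of feature per million mapped fragments) as I understand it. The library size of 20 million mapped fragments and the count of 20 fragments are made-up values for illustration, not numbers from our data, and I am not claiming this is exactly what Cufflinks does internally (it may use an effective length and assign reads probabilistically).

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of feature per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical values, for illustration only (not from our experiment).
TOTAL_MAPPED = 20_000_000  # assumed number of mapped fragments in the library
FRAGMENTS = 20             # assumed fragment count on the feature

GENE_LENGTH_BP = 70        # approximate SNORD24 gene length
LOCUS_LENGTH_BP = 3200     # approximate length of the reported locus

fpkm_gene = fpkm(FRAGMENTS, GENE_LENGTH_BP, TOTAL_MAPPED)
fpkm_locus = fpkm(FRAGMENTS, LOCUS_LENGTH_BP, TOTAL_MAPPED)

print(f"FPKM using gene length:  {fpkm_gene:.2f}")
print(f"FPKM using locus length: {fpkm_locus:.2f}")
print(f"Ratio (gene / locus):    {fpkm_gene / fpkm_locus:.1f}")
```

Whatever the library size is, the same fragment count gives an FPKM about 46 times higher (3200 / 70) when the 70 bp gene length is used instead of the 3200 bp locus length, which is why the choice of length matters so much for such short genes.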
Thanks in advance for your help.