Hi,
I am just starting to learn how to use STAR for RNA-seq mapping. I made libraries from low-input, degraded samples that are likely enriched in short RNAs. Because I shared the sequencing run with someone else in the lab, my samples were sequenced as paired-end 2x150 bp reads (kind of overkill, I know).
Before running STAR, I only removed low-quality sequences and did no adapter trimming, since I read that STAR handles this itself. Is that correct?
I am not too surprised by the low percentage of uniquely mapped reads, but I am confused about why the percentage of "unmapped: too short" reads increases from about 10% to 70% when I map both reads together (R1 and R2) as pairs, while the multi-mapped percentage drops from about 50% to about 12%. Any help, ideas, or suggestions? The Log.progress.out output for each run is below.
R2
           Time    Speed     Read     Read   Mapped   Mapped   Mapped   Mapped Unmapped Unmapped Unmapped Unmapped
                    M/hr   number   length   unique   length   MMrate    multi   multi+       MM    short    other
Dec 20 10:43:15    636.6 10610538       95    34.0%    102.7     0.7%    50.9%     4.8%     0.0%    10.4%     0.0%
Dec 20 10:44:15    559.1 18636117       94    33.7%    101.5     0.7%    51.1%     4.8%     0.0%    10.4%     0.0%
Dec 20 10:45:30    552.9 29951091       94    33.7%    101.3     0.7%    51.2%     4.8%     0.0%    10.4%     0.0%
R1
           Time    Speed     Read     Read   Mapped   Mapped   Mapped   Mapped Unmapped Unmapped Unmapped Unmapped
                    M/hr   number   length   unique   length   MMrate    multi   multi+       MM    short    other
Dec 20 10:35:34    608.2 10473886       97    35.2%    105.2     0.6%    49.0%     4.6%     0.0%    11.2%     0.0%
Dec 20 10:36:34    679.8 23039300       96    34.9%    104.1     0.6%    49.2%     4.6%     0.0%    11.3%     0.0%
Dec 20 10:37:34    691.4 34952409       96    34.9%    104.4     0.6%    49.2%     4.6%     0.0%    11.3%     0.0%
R1 and R2
           Time    Speed     Read     Read   Mapped   Mapped   Mapped   Mapped Unmapped Unmapped Unmapped Unmapped
                    M/hr   number   length   unique   length   MMrate    multi   multi+       MM    short    other
Dec 20 09:23:11    346.1  5960173      194    13.8%    168.1     0.7%    11.7%     3.9%     0.0%    70.7%     0.0%
Dec 20 09:24:13    386.1 13297340      192    13.7%    167.1     0.7%    11.8%     3.9%     0.0%    70.6%     0.0%
Dec 20 09:25:15    400.8 20709447      191    13.6%    166.0     0.7%    11.9%     4.0%     0.0%    70.5%     0.0%
Dec 20 09:26:15    404.5 27640580      191    13.6%    166.2     0.7%    11.8%     4.0%     0.0%    70.6%     0.0%
Dec 20 09:27:15    402.5 34210699      191    13.6%    165.9     0.7%    11.8%     4.0%     0.0%    70.6%     0.0%
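In case it helps to see the three runs side by side, here is a minimal Python sketch that parses the last progress row of each run and prints the unique, multi, and "unmapped: short" percentages together. It is not part of STAR; the field names (uniq_pct, unmapped_short_pct, and so on) are labels I made up from the column headers, and the rows are simply pasted in as strings.

```python
# Last progress row of each run, copied verbatim from the tables above.
ROWS = {
    "R2": "Dec 20 10:45:30 552.9 29951091 94 33.7% 101.3 0.7% 51.2% 4.8% 0.0% 10.4% 0.0%",
    "R1": "Dec 20 10:37:34 691.4 34952409 96 34.9% 104.4 0.6% 49.2% 4.6% 0.0% 11.3% 0.0%",
    "R1+R2": "Dec 20 09:27:15 402.5 34210699 191 13.6% 165.9 0.7% 11.8% 4.0% 0.0% 70.6% 0.0%",
}

# Hypothetical field names, assumed from the two-line column header.
FIELDS = [
    "speed_m_per_hr", "reads", "read_len",
    "uniq_pct", "mapped_len", "mm_rate_pct",
    "multi_pct", "multi_plus_pct",
    "unmapped_mm_pct", "unmapped_short_pct", "unmapped_other_pct",
]

def parse_row(row):
    """Split one progress row into a dict of floats keyed by FIELDS."""
    tokens = row.split()
    values = tokens[3:]  # drop the month, day, and time tokens
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return {name: float(value.rstrip("%")) for name, value in zip(FIELDS, values)}

parsed = {run: parse_row(row) for run, row in ROWS.items()}

print(f"{'run':<6} {'read_len':>8} {'unique%':>8} {'multi%':>8} {'short%':>8}")
for run, rec in parsed.items():
    print(
        f"{run:<6} {rec['read_len']:>8.0f} {rec['uniq_pct']:>8.1f} "
        f"{rec['multi_pct']:>8.1f} {rec['unmapped_short_pct']:>8.1f}"
    )
```

Note that the read length reported for the paired run (191) is roughly the sum of the two single-end lengths (96 and 94), so the paired numbers are per read pair rather than per mate.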