SEQanswers

skly 08-28-2015 12:38 AM

masked or unmasked genome reference for RNAseq
 
Hi guys. I aligned RNAseq data to an unmasked genome and to a hard masked genome. The mapping rate was about 92% for the former and about 75% for the latter. It seems that about 20% of the RNAseq reads align to the repetitive sequences of the genome. So, which genome reference is better for RNAseq data, the unmasked genome or the hard masked one?
Thanks a lot!

dpryan 08-28-2015 01:47 AM

Use the unmasked (or soft masked) genome. Actually, use that for everything that doesn't explicitly state that it wants a hard masked genome. There are MANY genes that overlap repeat regions, at least partially, and you'll miss alignments to them if you hard mask the genome. Similarly, some repeats are often expressed (this is mostly just noise). With a hard masked genome, reads originating from those repeats can no longer align to their true source and may instead be placed on similar unmasked sequence elsewhere, which increases your false-positive alignment rate.
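
To make the distinction concrete, here is a minimal Python sketch, assuming the usual FASTA convention that soft masking writes repeats in lowercase and hard masking replaces them with N. The function names (masking_summary, to_hard_masked, to_unmasked) are made up for illustration and are not from any library. It only shows what each masking style does to the sequence; it is not a recommendation of any particular tool.

```python
from collections import Counter


def masking_summary(fasta_lines):
    """Count unmasked, soft masked (lowercase) and N bases in FASTA lines.

    Assumption: lowercase = soft masked repeat, N = hard masked base.
    Note that N also marks assembly gaps, so the "hard" count is an
    upper bound on hard masked repeat sequence.
    """
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and sequence headers
        for base in line:
            if base in "Nn":
                counts["hard"] += 1
            elif base.islower():
                counts["soft"] += 1
            else:
                counts["unmasked"] += 1
    return counts


def to_hard_masked(seq):
    """Turn a soft masked sequence into a hard masked one (lowercase -> N)."""
    return "".join("N" if base.islower() else base for base in seq)


def to_unmasked(seq):
    """Turn a soft masked sequence into an unmasked one (all uppercase)."""
    return seq.upper()


# Toy record, not real data: 8 unmasked bases, 4 soft masked, 4 N.
example = [">chr_toy", "ACGTacgtNNNN", "ACGT"]
summary = masking_summary(example)
```

The point of the toy example is that a soft masked genome keeps the repeat sequence available to the aligner, while to_hard_masked throws it away: any read coming from the lowercase stretch has nothing left to align to at its true location.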

skly 08-29-2015 07:47 PM

Thanks, dpryan.
That's clear now.


