SEQanswers

08-28-2015, 12:38 AM   #1
skly
Junior Member
 
Location: China

Join Date: Jul 2010
Posts: 7
masked or unmasked genome reference for RNAseq

Hi guys. I aligned RNAseq data to both the unmasked genome and the hard-masked genome. The mapping rate was about 92% for the former and about 75% for the latter, so it seems that about 20% of the RNAseq reads align to repetitive sequences in the genome. So, which genome reference is better for RNAseq data: the unmasked genome or the hard-masked one?
Thanks a lot!
08-28-2015, 01:47 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Use the unmasked (or soft-masked) genome. Actually, use that for everything that doesn't explicitly state that it wants a hard-masked genome. There are MANY genes that overlap repeat regions, at least partially, and you'll miss alignments to them if you hard mask the genome. Similarly, some repeats are often expressed (this is mostly just noise), and with a hard-masked genome the reads originating from those repeats can no longer align to their true source, so more of them end up aligned to the wrong place, which raises your false-positive alignment rate.
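
To make the point about genes overlapping repeats concrete, here is a minimal toy sketch (not from the original post). It assumes the usual convention that a soft-masked FASTA marks repeats in lowercase and a hard-masked one replaces them with N. The sequences, the helper names (hard_mask, unmask, find_exact) and the exact-match lookup are all made up for illustration; a real aligner tolerates mismatches and gaps and is far more sophisticated than str.find.

```python
def hard_mask(soft_masked: str) -> str:
    """Turn a soft-masked sequence into a hard-masked one (lowercase -> N)."""
    return "".join("N" if base.islower() else base for base in soft_masked)


def unmask(soft_masked: str) -> str:
    """Drop the masking information by upper-casing every base."""
    return soft_masked.upper()


def find_exact(read: str, reference: str) -> int:
    """Toy stand-in for an aligner: exact-match position, or -1 if absent."""
    return reference.find(read)


# Hypothetical soft-masked reference: the lowercase stretch is an annotated repeat.
soft = "ACGTTGCA" + "ggatccggatcc" + "TTAGGCAT"

# Hypothetical read from a transcript that spans the unique/repeat boundary.
read = "TGCAGGATCC"

masked_fraction = sum(base.islower() for base in soft) / len(soft)
pos_unmasked = find_exact(read, unmask(soft))
pos_hard_masked = find_exact(read, hard_mask(soft))

print(f"fraction of reference masked: {masked_fraction:.2f}")
print(f"position in unmasked reference: {pos_unmasked}")
print(f"position in hard-masked reference: {pos_hard_masked}")
```

In this toy case the read is placed in the unmasked reference but has nowhere to go once the repeat is replaced by N, even though part of it lies in unique sequence. With soft masking the bases are still there, so most aligners can treat the sequence the same way as the unmasked one.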
08-29-2015, 07:47 PM   #3
skly
Junior Member
 
Location: China

Join Date: Jul 2010
Posts: 7

Thanks, dpryan.
That's clear now.