SEQanswers

Old 07-24-2009, 04:23 AM   #1
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Question Mapping Bisulphite converted sequence

What software are people using to map bisulphite converted sequence?

We've been getting more of this kind of data recently and doing the mapping and QC has proved to have all sorts of odd quirks about it. Our mapping pipeline is just a set of scripts which sit on top of bowtie, but I wondered if anyone had done a more formal tool which took into account things such as:
  • Whether the sequence is expected to be fully converted or not
  • Eliminating preferential mapping of unconverted sequence
  • Working out overall conversion frequencies

Does anyone have any good recommendations or are we all building our own?
Old 07-24-2009, 08:56 AM   #2
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Thumbs down

Quote:
Originally Posted by simonandrews View Post
What software are people using to map bisulphite converted sequence?

We've been getting more of this kind of data recently and doing the mapping and QC has proved to have all sorts of odd quirks about it. Our mapping pipeline is just a set of scripts which sit on top of bowtie, but I wondered if anyone had done a more formal tool which took into account things such as:
  • Whether the sequence is expected to be fully converted or not
  • Eliminating preferential mapping of unconverted sequence
  • Working out overall conversion frequencies

Does anyone have any good recommendations or are we all building our own?
BFAST can easily be used to align bisulphite treated sequence (see the reference manual). I don't know of a tool for summarizing the conversion frequencies (beyond personal perl scripts), but if you find one let me know.
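
As a rough illustration of what such a summarizing script might compute, here is a minimal Python sketch. It assumes each read has already been aligned without gaps to the forward strand and is supplied as a (reference segment, read) pair of equal-length strings. The function names `conversion_counts` and `conversion_frequency` are hypothetical and are not part of BFAST or any other tool mentioned in this thread.

```python
def conversion_counts(pairs):
    """Count converted and unconverted cytosines.

    pairs: iterable of (reference_segment, read) strings of equal length,
    assumed to be gapless forward-strand alignments.
    Returns (converted, unconverted): reference C read as T, reference C read as C.
    """
    converted = 0
    unconverted = 0
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("reference segment and read must have equal length")
        for r, q in zip(ref.upper(), read.upper()):
            if r != "C":
                continue  # only reference cytosines are informative
            if q == "T":
                converted += 1
            elif q == "C":
                unconverted += 1
            # any other base is treated as a sequencing error or SNP and ignored
    return converted, unconverted


def conversion_frequency(pairs):
    """Fraction of informative reference cytosines that were read as T."""
    converted, unconverted = conversion_counts(pairs)
    total = converted + unconverted
    return converted / total if total else float("nan")
```

Restricting the count to cytosines in a context expected to be unmethylated (for example non-CpG positions in most mammalian samples) would turn this into an estimate of bisulphite conversion efficiency rather than of methylation.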
Old 07-28-2009, 01:13 PM   #3
MadraghRua
Junior Member
 
Location: San Diego, CA

Join Date: Mar 2008
Posts: 6
Default

Check out RMAPBS - it's RMAP modified for BS data. It featured in a recent Genome Research paper. Otherwise it's build your own from what I can see.
Old 07-29-2009, 12:38 AM   #4
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by MadraghRua View Post
Check out RMAPBS - it's RMAP modified for BS data. It featured in a recent Genome Research paper. Otherwise it's build your own from what I can see.
Thanks - RMAPBS was one I'd not seen before. However it seems to suffer the same problem as most mapping programs I've seen, which is that it will map unconverted sequence more efficiently than converted sequence. For some applications this won't matter, but for epigenomics work where you're looking at the ratio of converted to unconverted reads in a particular region then it's essential that both forms should map with equal efficiency, even if this means removing some unconverted reads which otherwise could have been mapped.

Maybe the applications of bisulphite conversion are too varied to be handled cleanly in a single program?
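
One widely used way to make converted and unconverted reads map with equal efficiency is to convert every C to T in silico in both the reads and the reference before alignment, so the aligner never sees the methylation state. The sketch below illustrates the idea with exact matching on the forward strand only. The function names are hypothetical, and a real pipeline would also handle the reverse strand (G to A) and pass the converted sequences to an actual aligner.

```python
def c_to_t(seq):
    """Fully convert a sequence in silico: every C becomes T."""
    return seq.upper().replace("C", "T")


def find_unbiased(read, reference):
    """Return all start positions where the converted read matches
    the converted reference exactly (forward strand only)."""
    conv_read = c_to_t(read)
    conv_ref = c_to_t(reference)
    hits = []
    start = conv_ref.find(conv_read)
    while start != -1:
        hits.append(start)
        start = conv_ref.find(conv_read, start + 1)
    return hits
```

Because a methylated read (`ACGTT`) and its unmethylated counterpart (`ATGTT`) become identical after conversion, they necessarily receive the same set of hits. The cost is that some reads which were unique in four-letter space become ambiguous and have to be discarded.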
Old 08-07-2009, 02:47 PM   #5
wei
Junior Member
 
Location: Houston, TX

Join Date: Aug 2009
Posts: 4
Default

check out bsmap
http://www.biomedcentral.com/1471-2105/10/232
Old 09-26-2009, 04:09 PM   #6
andrewdsusc
Junior Member
 
Location: California

Join Date: Sep 2009
Posts: 3
Default

Quote:
Originally Posted by simonandrews View Post
Thanks - RMAPBS was one I'd not seen before. However it seems to suffer the same problem as most mapping programs I've seen, which is that it will map unconverted sequence more efficiently than converted sequence. For some applications this won't matter, but for epigenomics work where you're looking at the ratio of converted to unconverted reads in a particular region then it's essential that both forms should map with equal efficiency, even if this means removing some unconverted reads which otherwise could have been mapped.

Maybe the applications of bisulphite conversion are too varied to be handled cleanly in a single program?
When the methylation of interest is CpG methylation, RMAPBS *WILL NOT* bias mapping towards a particular methylation state. It exploits unconverted Cs at non-CpG positions to gain specificity in mapping without using those at CpG positions to gain specificity.
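
The scoring rule described here can be sketched as follows. This is only one possible reading of the description, not RMAPBS's actual implementation: a read T may match a reference C or T, a read C outside CpG must match a reference C, and a read C followed by G is treated like a T so that CpG methylation state cannot change the score. The function name `count_mismatches` is hypothetical.

```python
def count_mismatches(read, ref):
    """Mismatches between a read and an equal-length forward-strand reference window."""
    read, ref = read.upper(), ref.upper()
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    mismatches = 0
    for i, (q, r) in enumerate(zip(read, ref)):
        # A read C followed by G sits in a CpG; its methylation state is ignored.
        in_cpg = q == "C" and i + 1 < len(read) and read[i + 1] == "G"
        if q == "T" or in_cpg:
            ok = r in "CT"   # wildcard: could be a converted or unconverted C
        else:
            ok = q == r      # includes unconverted non-CpG Cs, which must match C
        if not ok:
            mismatches += 1
    return mismatches
```

Under this rule the methylated read `ACGTTA` and the unmethylated read `ATGTTA` score identically against any window, while an unconverted C outside CpG still penalizes windows that have a T at that position.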
Old 09-28-2009, 11:26 AM   #7
What_Da_Seq
Member
 
Location: RTP

Join Date: Jul 2008
Posts: 28
Default

BS mode of novoalign, then Maq pileup, and then custom Perl scripts.
Old 08-29-2011, 11:00 PM   #8
sciencewu
Member
 
Location: china

Join Date: Dec 2010
Posts: 12
Default

BSMAP may work well for you, but it is very slow.
If you understand the mechanism of bisulfite alignment, many other aligners will also do.
Old 08-30-2011, 11:02 AM   #9
bioinfosm
Senior Member
 
Location: USA

Join Date: Jan 2008
Posts: 482
Default

adapter trimming would be an important pre-processing step for BS seq ... rrbsmap, bismark, bsseeker are some tools that work exclusively for bisulphite, but all face the challenge of missing out on a big proportion of alignments..
__________________
--
bioinfosm
Old 09-25-2011, 11:04 AM   #10
volks
Member
 
Location: hd.de

Join Date: Jun 2010
Posts: 81
Default

Quote:
Originally Posted by bioinfosm View Post
rrbsmap, bismark, bsseeker are some tools that work exclusively for bisulphite, but all face the challenge of missing out on a big proportion of alignments..
Why is that?
Old 09-25-2011, 11:44 PM   #11
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

Quote:
Originally Posted by bioinfosm View Post
adapter trimming would be an important pre-processing step for BS seq ... rrbsmap, bismark, bsseeker are some tools that work exclusively for bisulphite, but all face the challenge of missing out on a big proportion of alignments..
Really? Sure, the mapping efficiencies for bisulphite converted sequence are lower than for conventional sequencing, but nearly all of this is due to the loss of information in the conversion process meaning that the read can't be uniquely assigned to the original genome. In addition some aligners specifically choose to ignore unique alignments which couldn't have been found if the methylation state of the sequence was different to ensure that mapping is always fair and unbiased, but other than that I don't see that there's a problem affecting bisulphite aligners which is any worse than deficiencies in conventional aligners.

This isn't to say that there aren't still problems in bisulphite alignment. The issue of samples having a different genetic background to the reference genome leads to systematic methylation miscalls which are difficult to spot and lead to methylation change predictions which are actually genetic changes, but this is more a problem of calling than mapping.
Old 10-04-2011, 06:47 PM   #12
aniruddha.otago
Member
 
Location: New Zealand

Join Date: Jan 2010
Posts: 21
Default

Hi Simon,

We have done some methylation analysis on human RRBS samples and compared three aligners: RMAPBS, BSMAP and Bismark. RMAPBS and Bismark gave reasonable methylation percentages (around 40%, though the library is CGI-rich). However, BSMAP reported an extremely low level of methylation, 15-18%, on the same QC-checked, high-quality data (we did dynamic trimming and removed adaptor contamination as well). Do you have any comments on that? It looks as though one can tune the methylation level by choosing a particular aligner. I would really appreciate your views.

Regards,
Aniruddha.
Old 10-04-2011, 11:32 PM   #13
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871
Default

I'm surprised to hear that you're seeing such variable results from different programs. Were the mapping efficiencies wildly different between runs? You'd need quite a difference in mapping distribution to generate that kind of discrepancy. We've shown on simulated datasets that with bismark we can reliably extract the true methylation level regardless of the level of methylation in the library. The only factors which really influence this are the things you mentioned (adapters or poor quality sequence).

When mapping BS-Seq data it's more important that what you map is accurate than getting really good coverage. If in doubt you should make your mapping parameters more stringent. Mapping and adapter errors tend to drag the predicted methylation level towards 50% so this is especially problematic for low methylation libraries.
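
A toy calculation shows why this matters most at low methylation levels. It assumes, purely for illustration, that a fraction of the calls come from mismapped or adapter-derived bases which look methylated half of the time on average. The function name and the default of 0.5 are assumptions, not something taken from Bismark.

```python
def observed_methylation(true_level, error_fraction, error_level=0.5):
    """Methylation level reported when a fraction of calls are uninformative.

    true_level:     real methylation fraction of correctly placed calls (0..1)
    error_fraction: fraction of calls from mapping or adapter errors (0..1)
    error_level:    average apparent methylation of those erroneous calls
    """
    return (1 - error_fraction) * true_level + error_fraction * error_level
```

With 10% erroneous calls, a true level of 2% would be reported as 6.8%, more than three times the real value, whereas a true level of 50% would be reported unchanged.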

If you're seeing differences of 25% in your data then I suspect something more fundamental is going wrong in the way the programs are being run. The only thing which we've ever seen which makes this kind of difference is that some programs have an option to remove any reads containing more than 3 unconverted Cs, which can have a dramatic effect on the overall level, but normally this would only be applied in non-CpG context so this shouldn't be the problem in your case if your library is CpG rich.
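
For reference, a filter of this kind might look like the sketch below. It assumes forward-strand reads and judges context from the read itself; the function names and the treatment of a C at the very end of a read (counted as non-CpG because the next base is unknown) are illustrative choices, not the behaviour of any particular program.

```python
def unconverted_non_cpg(read):
    """Count Cs in the read that are not immediately followed by G."""
    read = read.upper()
    count = 0
    for i, base in enumerate(read):
        # The slice is empty at the last position, so a trailing C is counted.
        if base == "C" and read[i + 1 : i + 2] != "G":
            count += 1
    return count


def keep_read(read, max_unconverted=3):
    """Keep the read unless it has more than max_unconverted non-CpG Cs."""
    return unconverted_non_cpg(read) <= max_unconverted
```

Applying the same threshold to all Cs, including those in CpG context, would discard methylated reads from CpG-rich regions and lower the reported methylation level.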
Old 10-05-2011, 08:32 AM   #14
yxibcm
Junior Member
 
Location: houston

Join Date: Jun 2010
Posts: 6
Default

The new version of BSMAP (v2.2) has greatly improved the mapping speed
(28M 76 bp PE reads mapped to the hg19 genome in about 7 hours using 8 threads; RAM usage: ~9 GB).

It also includes RRBS mode.

Best,

Yuanxin

Quote:
Originally Posted by sciencewu View Post
BSMAP may work well for you, but it is very slow.
If you understand the mechanism of bisulfite alignment, many other aligners will also do.

Last edited by yxibcm; 10-05-2011 at 08:40 AM.
Old 10-05-2011, 08:36 AM   #15
yxibcm
Junior Member
 
Location: houston

Join Date: Jun 2010
Posts: 6
Default

Hi Aniruddha,

I'm the developer of BSMAP. Could you provide some details about the BSMAP command line and your input reads? I'm very interested in knowing why BSMAP reports such a low level of methylation.

BSMAP also supports an RRBS mode through the option "-D", which adds digestion-site specificity to the mapping, or you can run the separate program RRBSMAP. This mode is also much faster and more memory efficient.

Best,

Yuanxin

Quote:
Originally Posted by aniruddha.otago View Post
Hi Simon,

We have done some methylation analysis on human RRBS samples and compared three aligners: RMAPBS, BSMAP and Bismark. RMAPBS and Bismark gave reasonable methylation percentages (around 40%, though the library is CGI-rich). However, BSMAP reported an extremely low level of methylation, 15-18%, on the same QC-checked, high-quality data (we did dynamic trimming and removed adaptor contamination as well). Do you have any comments on that? It looks as though one can tune the methylation level by choosing a particular aligner. I would really appreciate your views.

Regards,
Aniruddha.

Last edited by yxibcm; 10-05-2011 at 08:43 AM.
Old 10-05-2011, 08:40 AM   #16
twu
Developer of GMAP and GSNAP
 
Location: South San Francisco, CA

Join Date: Oct 2011
Posts: 17
Default

GSNAP can align bisulfite sequences, and was used to analyze methylation in the study of twins with multiple sclerosis that appeared in Nature last year. The latest version can handle both stranded and non-stranded bisulfite sequencing protocols.

Regards,

Tom
Old 10-05-2011, 09:08 AM   #17
rskr
Senior Member
 
Location: Santa Fe, NM

Join Date: Oct 2010
Posts: 250
Default

Quote:
Originally Posted by twu View Post
GSNAP can align bisulfite sequences, and was used to analyze methylation in the study of twins with multiple sclerosis that appeared in Nature last year. The latest version can handle both stranded and non-stranded bisulfite sequencing protocols.

Regards,

Tom
Last I checked the benchmarking methodology in the GSNAP paper, it only generated reads that could be mapped, with variants only in the middle portion of the reads. I doubt it gets any better for bisulfite converted data.
Old 10-05-2011, 10:13 AM   #18
twu
Developer of GMAP and GSNAP
 
Location: South San Francisco, CA

Join Date: Oct 2011
Posts: 17
Default

I did the benchmarking that way in the original GSNAP paper only to compare it against other programs that couldn't handle more complex reads, not to provide a full assessment of the capabilities of GSNAP. For example, I didn't benchmark its ability to find distant fusions, such as translocations, or its ability to find multiple indels and splicing in a single read. If you want to see an independent evaluation of GSNAP, there was a comparison done by researchers at University of Pennsylvania that appeared in Bioinformatics recently.

Regards,

Tom
Old 10-05-2011, 11:02 AM   #19
rskr
Senior Member
 
Location: Santa Fe, NM

Join Date: Oct 2010
Posts: 250
Default

Quote:
Originally Posted by twu View Post
I did the benchmarking that way in the original GSNAP paper only to compare it against other programs that couldn't handle more complex reads, not to provide a full assessment of the capabilities of GSNAP. For example, I didn't benchmark its ability to find distant fusions, such as translocations, or its ability to find multiple indels and splicing in a single read. If you want to see an independent evaluation of GSNAP, there was a comparison done by researchers at University of Pennsylvania that appeared in Bioinformatics recently.

Regards,

Tom
Did they evaluate GSNAP with bisulfite sequencing?
Old 10-05-2011, 11:20 AM   #20
twu
Developer of GMAP and GSNAP
 
Location: South San Francisco, CA

Join Date: Oct 2011
Posts: 17
Default

No, it was a test of RNA-Seq simulated reads. And there are probably features of methylated reads that GSNAP does not assess, such as trying to evaluate various contexts, such as CG, CHG, and CHH. But it will align single-end or paired-end genomic reads having all possible substitutions of C to T with high speed and accuracy, including small and large indels and correctly accounting for T to C mismatches, and is not limited to variants in the middle of reads.

Regards,

Tom
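
For readers unfamiliar with the contexts mentioned above, here is a small sketch of how a cytosine can be classified as CG, CHG or CHH from the forward-strand reference, where H stands for A, C or T. The function name `cytosine_context` is hypothetical and is not part of GSNAP.

```python
def cytosine_context(ref, pos):
    """Return 'CG', 'CHG', 'CHH', or None for the base at forward-strand position pos.

    None is returned when the base is not a C or when too little known
    downstream sequence is available to decide.
    """
    ref = ref.upper()
    if ref[pos] != "C":
        return None
    downstream = ref[pos + 1 : pos + 3]  # up to two bases after the C
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2 or "N" in downstream:
        return None
    return "CHG" if downstream[1] == "G" else "CHH"
```

Cytosines on the reverse strand can be classified the same way after reverse-complementing the reference.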
Tags
bisulphite, epigenetics, mapping
