**krobison** · 06-08-2010, 06:13 AM

Please put an entry in the software Wiki for this tool! Otherwise, I'll have to do it :-)

**brentp** · 06-08-2010, 07:35 AM

done. thanks.

**brentp** · 07-13-2010, 01:46 PM

hi, i've updated MethylCoder with the following:

+ supports paired end reads
+ can use either bowtie or gsnap for the aligner
+ can take either fasta or fastq files as input
+ prints a nice per-chromosome summary along with the per-base text and binary formats and the SAM format.
+ better documented analysis scripts for finding differentially methylated regions between 2 runs of the pipeline. (fisher's exact test)
+ full tracking of the command used to generate each output file.
+ growing test suite.
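
For readers unfamiliar with the test mentioned in the list above, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of methylated and unmethylated read counts from two runs. It is not taken from MethylCoder's analysis scripts; the function name and the example counts are hypothetical and serve only as an illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout for a candidate region:
        a = methylated reads in run 1,   b = unmethylated reads in run 1
        c = methylated reads in run 2,   d = unmethylated reads in run 2
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated reads in run 1,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over every table that is at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Made-up counts: 8 of 10 reads methylated in run 1, 1 of 6 in run 2.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}")
```

A small p-value suggests the methylation level of the region differs between the two runs. Because many regions are usually tested at once, some form of multiple-testing correction would normally follow.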

please let me know if any questions, comments, or feature requests ( [email protected] )
code is available at github as before:

directly from the git repository as: git clone git://github.com/brentp/methylcode.git
and via tarball: http://github.com/brentp/methylcode/tarball/master

**brentp** · 07-06-2011, 06:53 AM

MethylCoder has been published as a bioinformatics applications note:

MethylCoder: Software Pipeline for Bisulfite-Treated Sequences
Brent Pedersen; Tzung-Fu Hsieh; Christian Ibarra; Robert L. Fischer
Bioinformatics 2011; doi: 10.1093/bioinformatics/btr394

Let me know of any questions.

**bisol** · 07-06-2011, 08:12 AM

Hi Brent,

what are the differences in the alignment between basespace and colorspace data i.e. how do you solve the problem that one can't apply the in-silico conversion of C's to T's in reads for colorspace?

**brentp** · 07-07-2011, 06:01 AM

Originally posted by bisol:

> Hi Brent,
>
> what are the differences in the alignment between basespace and colorspace data i.e. how do you solve the problem that one can't apply the in-silico conversion of C's to T's in reads for colorspace?

Hi Bisol,
I basically side-step the problem. I recommend that you do the following:
1) quality trim your reads
2) map with methylcoder (+bowtie) allowing 0 (you can also try 1) mismatches.
3) map the unmapped reads with solid's SOCS tool: http://solidsoftwaretools.com/gf/project/socs/

MethylCoder does a naive translation of C=>T by converting to base-space, then converting C=>T, then converting back to color-space. So it doesn't solve the problem, just tries to provide a solution to quickly map reads with no errors. I welcome suggestions for improvement in that regard.

-Brent
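
The round trip described in the post above can be sketched roughly as follows. This is an illustration only, assuming the standard SOLiD two-base encoding (color = XOR of the two base codes, with A=0, C=1, G=2, T=3) and a read written as a primer base followed by color digits. The function names are hypothetical and this is not MethylCoder's actual code.

```python
BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}


def decode_colors(cs_read):
    """Color-space read (primer base + color digits) -> base-space sequence."""
    prev = CODE[cs_read[0]]
    bases = []
    for color in cs_read[1:]:
        # Each color is the XOR of two neighbouring bases, so XOR-ing the
        # previous base with the color recovers the next base.
        prev ^= int(color)
        bases.append(BASES[prev])
    return "".join(bases)


def encode_colors(seq, primer="T"):
    """Base-space sequence -> color-space read starting with the primer base."""
    prev = CODE[primer]
    colors = []
    for base in seq:
        cur = CODE[base]
        colors.append(str(prev ^ cur))
        prev = cur
    return primer + "".join(colors)


def naive_c_to_t(cs_read):
    """Decode to bases, convert every C to T, and re-encode to colors."""
    bases = decode_colors(cs_read)
    converted = bases.replace("C", "T")
    return encode_colors(converted, primer=cs_read[0])


read = encode_colors("ACGT")   # "T3131"
print(read, "->", naive_c_to_t(read))
```

The sketch only works as intended when every color in the read is correct, which is why it pairs naturally with mapping at 0 mismatches.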

**dychiang** · 07-07-2011, 06:22 AM

Comparison with BisMark?

Brent,

Nice software and publication. Have you tried comparing MethylCoder and BisMark on the H1 ES cell line MethylC-seq dataset from Lister et al (2009)?

Thanks,
Derek

**brentp** · 07-07-2011, 06:36 AM

Originally posted by dychiang:

> Have you tried comparing MethylCoder and BisMark on the H1 ES cell line MethylC-seq dataset from Lister et al (2009)?

Hi Derek, there is a comparison to other BS-Seq software here:

https://github.com/brentp/methylcode/tree/master/bench

It uses some Arabidopsis thaliana data and shows time, (approximate) memory use, and reads mapped.

Felix Kreuger, one of the authors of BisMark suggested some changes to BisMark parameters that I could use to improve its performance, but I have not yet updated the benchmark with those changes.

**fkrueger** · 07-07-2011, 06:38 AM

As both MethylCoder and Bismark employ a very similar strategy, I would imagine that the results are very similar. By the way my last name is spelled Krueger :P.

**brentp** · 07-07-2011, 06:41 AM

Originally posted by fkrueger:

> By the way my last name is spelled Krueger :P.

As someone who repeatedly has their last name misspelled, I sincerely apologize.

And yes, the results between MethylCoder (with bowtie) and BisMark are quite similar.

**dychiang** · 07-07-2011, 07:36 AM

Originally posted by fkrueger:

> As both MethylCoder and Bismark employ a very similar strategy, I would imagine that the results are very similar. By the way my last name is spelled Krueger :P.

Brent and Felix -- thanks very much for your helpful replies. The epigenetics sequencing community needs some good benchmarks, such as RGASP, to test the plethora of algorithms being developed.

Will either or both of you be attending the HiTSeq SIG at ISMB next week? I would be delighted to meet up with you.

**fkrueger** · 07-07-2011, 07:52 AM

Hi dychiang,
I won't be attending the HiTSeq SIG but I'll be at ISMB from Sunday til Wednesday. I'm happy to meet up with you, either drop me an email ([email protected]) or find me at my poster (Poster U59: Analysing allele-specific NGS datasets using ASAP)

**brentp** · 07-07-2011, 09:58 AM

Hi Derek, I won't be at that conference, but feel free to send me an email.

**bisol** · 07-07-2011, 11:59 AM

So essentially MethylCoder can map only perfect color space reads which align
perfectly against the genome.

As it has been pointed out in a previous thread you started
(http://seqanswers.com/forums/showthread.php?t=7979), it is generally problematic to
do a naive translation of color to base space, then converting C=>T and translating
back to color space, as a single measurement error in the color space read will be
translated into a false nucleotide sequence. Depending on where in the read this
measurement error occurs, the sequence either can't be mapped anymore (which means a
low mapping efficiency) or it will map to a wrong position in the genome and thus
result in false methylation calls.
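
To make the error-propagation point concrete, here is a small sketch, again assuming the standard SOLiD two-base encoding (A=0, C=1, G=2, T=3, color = XOR of adjacent base codes). The helper names and the example read are hypothetical. It decodes a color-space read with and without a single miscalled color and counts how many decoded bases change.

```python
BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}


def decode_colors(cs_read):
    """Color-space read (primer base + color digits) -> base-space sequence."""
    prev = CODE[cs_read[0]]
    bases = []
    for color in cs_read[1:]:
        prev ^= int(color)
        bases.append(BASES[prev])
    return "".join(bases)


def flip_color(cs_read, pos, new_color):
    """Replace the color at 0-based color index `pos` (the primer is skipped)."""
    i = pos + 1
    return cs_read[:i] + str(new_color) + cs_read[i + 1:]


def count_mismatches(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


true_read = "T31313131"                  # decodes to ACGTACGT
bad_read = flip_color(true_read, 2, 0)   # one miscalled color at index 2

print(decode_colors(true_read))
print(decode_colors(bad_read))
print(count_mismatches(decode_colors(true_read), decode_colors(bad_read)))
```

In this example a single color error alters every decoded base from the error position onward, so the naively translated read no longer resembles the true sequence downstream of the error.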

You are now suggesting that all unmapped reads should instead be aligned with
SOCS-B, which - even though it is a good tool - is incredibly slow for complex
genomes and many reads.

Therefore, isn't it a quite bold statement to state in the paper that "MethylCoder
is a novel tool that allows ... mapping in both color and nucleotide-space -
something that no other BS-Seq software allows"?

MethylCoder: software for bisulfite treated reads
