
**PeteH** · 11-02-2011, 04:41 PM

If you are interested in identifying CpG islands I can recommend reading Wu et al. Biostatistics (2010) (http://www.ncbi.nlm.nih.gov/pubmed/20212320). The paper argues that some common definitions of CpG islands are too restrictive (such as the definition used by the UCSC genome browser). The authors develop a hidden Markov model to define CpG islands for arbitrary genomes.

The paper is accompanied by software that implements their method and tables of pre-computed CpG islands using their software for many popular genomes (see http://rafalab.jhsph.edu/CGI/index.html).
Pete
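For context on what the threshold-style definitions measure, here is a minimal Python sketch of the two quantities they are usually built on: GC content and the observed/expected CpG ratio. The function names and the default thresholds (length of at least 200 bp, GC content of at least 50%, observed/expected CpG above 0.6) are assumptions for illustration based on the commonly cited criteria. They are not taken from this thread, and this is not the hidden Markov model from the Wu et al. paper.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def passes_threshold_criteria(seq: str, min_len: int = 200,
                              min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Hypothetical helper: apply the three threshold-style criteria to one window."""
    return (len(seq) >= min_len
            and gc_fraction(seq) >= min_gc
            and cpg_obs_exp(seq) > min_oe)
```

A window scanner would call `passes_threshold_criteria` on successive slices of a chromosome; how windows are chosen and merged differs between tools and is not shown here.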

**HelenM** · 11-02-2011, 05:11 PM

Pete,

Great, I think this will be very useful indeed!
I had been trying to find an existing set of CpG Islands for Bos taurus as well.
Many thanks!

**jamal** · 12-05-2011, 05:16 AM

Hi Helen

I used "makeCGI" for Sus scrofa and got an .rda file in the result folder. Did you use this software for Bos taurus, and if so, how did you extract the results from the .rda file?
Thank you in advance.

Jamal

**cjp** · 12-05-2011, 05:36 AM

The GATK command worked for me (did you make the picard ".dict" file for your reference fasta file?):

% java -Xmx2g -Djava.io.tmpdir=/path/to/tmp -jar /path/to/GenomeAnalysisTK-1.1-23-g8072bd9/GenomeAnalysisTK.jar -T GCContentByInterval -R /path/to/human_g1k_v37.fasta -L 1:1-100000 -o chr1_1_100000_gc.txt

...

% cat chr1_1_100000_gc.txt
1:1-100000 0.38207

Chris
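If you only need the GC fraction of a few intervals and don't want to set up GATK, the same kind of number can be computed directly from the FASTA file. This is a rough Python sketch, not a reimplementation of GCContentByInterval: the function names are made up for illustration, coordinates are assumed to be 1-based and inclusive like the `-L 1:1-100000` argument above, and N bases are counted in the denominator here, which may not match how GATK treats them.

```python
def read_fasta(path):
    """Return {name: sequence}; name is the header up to the first whitespace."""
    seqs, name, chunks = {}, None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                parts = line[1:].split()
                name, chunks = (parts[0] if parts else ""), []
            elif line and name is not None:
                chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def interval_gc(seqs, contig, start, end):
    """GC fraction of contig[start..end], 1-based inclusive coordinates."""
    sub = seqs[contig][start - 1:end].upper()
    if not sub:
        return 0.0
    return (sub.count("G") + sub.count("C")) / len(sub)
```

Usage would look like `interval_gc(read_fasta("/path/to/ref.fasta"), "1", 1, 100000)`. Note that `read_fasta` holds the whole reference in memory, so for a large genome an indexed reader would be a better choice.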

**jamal** · 12-05-2011, 06:08 AM

Hi Chris

I didn't make the Picard file for my genome. Please tell me how I can do that,
and please tell me more about GATK.

Thanks a lot

Jamal

**cjp** · 12-05-2011, 06:36 AM

There is a link here about making the picard dict file for GATK:

http://www.broadinstitute.org/gsa/wiki/index.php/Preparing_the_essential_GATK_input_files:_the_reference_genome

Download the latest picard from here into a new directory (for me $HOME/src on a Linux machine) and unzip it:

http://sourceforge.net/projects/picard/files/latest/download?source=files

Something like this works for me:

java -jar /home/cjp64/src/picard-tools-1.53/CreateSequenceDictionary.jar R=/data/refs/archive/hg19/bowtie/hg19.fasta O=/data/refs/archive/hg19/bowtie/hg19.dict

GATK help starts here (it's on many pages though and is more for doing SNP calls):

http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit

Chris
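To see what the ".dict" file records, it helps to know that it is a SAM-style header with one `@SQ` line per reference sequence, giving the sequence name and its length. The Python sketch below only illustrates those two fields. The function names are hypothetical, and the output leaves out the other fields Picard writes (such as the checksum), so it is not a replacement for CreateSequenceDictionary.

```python
def sequence_lengths(fasta_path):
    """Yield (name, length) for each FASTA record, streaming line by line."""
    name, length = None, 0
    with open(fasta_path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, length
                parts = line[1:].split()
                # The sequence name is the header text up to the first whitespace.
                name, length = (parts[0] if parts else ""), 0
            elif name is not None:
                length += len(line)
    if name is not None:
        yield name, length


def sq_lines(fasta_path):
    """Format name/length pairs the way they appear on @SQ header lines."""
    return [f"@SQ\tSN:{name}\tLN:{length}"
            for name, length in sequence_lengths(fasta_path)]
```

Comparing these names with the ones in your interval list is a quick way to spot a mismatch such as `1` versus `chr1`.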

**oria34** · 05-21-2013, 03:07 AM

Hi all,

Did anyone try "makeCGI" recently?

I am having some problems with this package.

First, it has a lot of trouble reading the chromosome/scaffold headers from the FASTA files and crashes. I reduced the headers to just the chromosome/scaffold name (deleting everything else) and that seemed to work, but then it crashed with a new warning message:

Warning message:
In rm(pattern = "Ngc") : object 'Ngc' not found

Apparently, it doesn't much like finding "Ns" along the sequence.

It creates the result file, but the file appears to be empty.

Any suggestions? I am really new to all this, so any advice would be very welcome.

Thanks in advance

jamal, maybe it's a bit late, but I have found this to convert RDA to CSV. I thought it might be useful for other people.
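The header clean-up described above (keeping only the chromosome/scaffold name) can be scripted instead of done by hand. This is a minimal Python sketch under the assumption that the name is everything between `>` and the first whitespace. The function name and file paths are placeholders, and it does nothing about the 'Ngc' warning.

```python
def simplify_headers(in_path, out_path):
    """Copy a FASTA file, truncating each header to its first token."""
    n_headers = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                parts = line[1:].split()
                # Keep only the name; drop descriptions after the first whitespace.
                dst.write(">" + (parts[0] if parts else "") + "\n")
                n_headers += 1
            else:
                dst.write(line)
    return n_headers
```

For example, `simplify_headers("genome.fa", "genome.clean.fa")` writes a new file and returns the number of headers rewritten, leaving the original untouched.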

**jfeicheng** · 09-21-2014, 06:21 PM

makeCGI: object 'Ngc' not found

Hi,
I tried this program recently and ran into the same problem as you:

Warning message:
In rm(pattern = "Ngc") : object 'Ngc' not found

I would like to know whether you found a solution to this problem.
Thank you in advance.


Programs for GC content and CpG Islands
