SEQanswers

Old 04-20-2013, 03:11 PM   #1
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140
How to make an average conservation plot from ChIP-seq data

Hi all,

I want to know how to make an average conservation plot.
The website http://ceas.cbi.pku.edu.cn/ produces something close to what I want, but it can only plot
one set of regions at a time. I want to plot two sets of regions and compare them.

" 2. GC content and evolutionary conservation of each ChIP-region and their
average. CEAS uses PhastCons conservation scores from UCSC Genome
Bioinformatics, which is based on multiz alignment of human, chimp, mouse,
rat, dog, chicken, fugu, and zebrafish genomic DNA. CEAS generates thumbnail
conservation plot for each ChIP-region and the average conservation plot for
all the ChIP-regions, which can be directly used in ChIP-chip biologists'
manuscript.
"


Is there a Python script or Bioconductor package that can do this?

Thanks
Old 04-22-2013, 02:05 AM   #2
paolo.kunder
Member
 
Location: Milano, Italy

Join Date: Aug 2011
Posts: 93

Hi,

I ran into the same problem a few weeks ago.

It seems there are not many PhastCons experts on this forum.

Here is what I did.

I took my ChIP-seq regions in BED format and intersected them with the PhastCons scores from (mouse example):

Code:
http://hgdownload-test.cse.ucsc.edu/goldenPath/mm9/phastCons30way/

I then looked up the conservation score at each genomic position in the regions
and averaged the scores.
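The averaging step can be sketched in Python roughly as follows. This is a minimal illustration, not the exact procedure used above: the function names (`load_scores`, `average_profile`) and the `flank` parameter are hypothetical, per-base scores are assumed to arrive as (chrom, start, end, score) records with 0-based starts, and every region is aligned on its midpoint. Scores are held in a dictionary, so this only suits a score file already restricted to the regions of interest.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

ScoreRecord = Tuple[str, int, int, float]   # chrom, start, end, score (one base each)
Region = Tuple[str, int, int]               # chrom, start, end


def load_scores(records: Iterable[ScoreRecord]) -> Dict[str, Dict[int, float]]:
    """Index per-base scores as scores[chrom][start] (hypothetical helper)."""
    scores: Dict[str, Dict[int, float]] = defaultdict(dict)
    for chrom, start, _end, score in records:
        scores[chrom][start] = score
    return scores


def average_profile(regions: Iterable[Region],
                    scores: Dict[str, Dict[int, float]],
                    flank: int = 250) -> List[Optional[float]]:
    """Mean score at each offset from -flank to +flank around region midpoints.

    Positions without a score are skipped rather than counted as zero;
    an offset with no data at all is reported as None.
    """
    width = 2 * flank + 1
    sums = [0.0] * width
    counts = [0] * width
    for chrom, start, end in regions:
        center = (start + end) // 2
        chrom_scores = scores.get(chrom, {})
        for i in range(width):
            value = chrom_scores.get(center - flank + i)
            if value is not None:
                sums[i] += value
                counts[i] += 1
    return [sums[i] / counts[i] if counts[i] else None for i in range(width)]
```

To compare two sets of regions, call `average_profile` once per set with the same `flank` and draw both result lists against the offsets -flank to +flank on one set of axes.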

Note that you will need to reshape the PhastCons file format slightly.

For example:

Original format (I just stripped some words from the header lines):
Code:
chrom=chr1 start=3000306
0.006
0.010
0.014
chrom=chrX start=40000306
0.014
chrom=chr9 start=80000306 
0.1
0.2
processed format (BED-style coordinates: 0-based start, exclusive end, so the 1-based wigFix position p becomes p-1 to p)
Code:
chr1 3000305 3000306 0.006
chr1 3000306 3000307 0.010
chr1 3000307 3000308 0.014
chrX 40000305 40000306 0.014
chr9 80000305 80000306 0.1
chr9 80000306 80000307 0.2
with the following script:
Code:
awk '/^chrom/{split($1,a,"=");split($2,b,"=");pos=b[2];next} {printf "%s\t%d\t%d\t%s\n",a[2],pos-1,pos,$1;pos++}' filename
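For anyone who prefers Python, here is a rough sketch of the same kind of conversion. It is an illustration under stated assumptions rather than a drop-in replacement: the function name `wigfix_to_bed` is hypothetical, the header is located by its `chrom=` and `start=` fields (so it accepts both the stripped headers shown above and full `fixedStep ... step=1` lines), each score is assumed to cover one base, and wigFix positions are treated as 1-based, so position p is written as the interval p-1 to p.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches "chrom=chr1 start=3000306" with an optional "step=N" after it.
HEADER = re.compile(r"chrom=(\S+)\s+start=(\d+)(?:\s+step=(\d+))?")


def wigfix_to_bed(lines: Iterable[str]) -> Iterator[Tuple[str, int, int, float]]:
    """Yield (chrom, start, end, score) per score line (hypothetical helper).

    Assumes 1-based wigFix positions and a span of one base per score.
    """
    chrom = None
    pos = 0
    step = 1
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = HEADER.search(line)
        if match:
            # A header starts a new block: reset chromosome and position.
            chrom = match.group(1)
            pos = int(match.group(2))
            step = int(match.group(3)) if match.group(3) else 1
            continue
        if chrom is None:
            raise ValueError("score line found before any header line")
        yield chrom, pos - 1, pos, float(line)
        pos += step
```

Because it is a generator, the file can be streamed line by line (for example by passing an open file handle) instead of being read into memory.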


Another possibility I am investigating is Circos
HTML Code:
http://circos.ca/
but you will first need to learn how the software works.

Cheers,
Paolo
Old 04-22-2013, 06:16 PM   #3
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140

Thank you so much!!



Old 07-09-2015, 11:42 AM   #4
szaman
Junior Member
 
Location: Connecticut

Join Date: Jul 2015
Posts: 1
Making the step size 200 bp

Hello Paolo,

Your code is extraordinarily useful. However, I am trying to use a step size of 200 bp, averaging the phastCons score over each 200 bp region. I am not sure how to get started (or even which language to use; I only know rudimentary Unix, Perl, and R). As much as I love one-liners, I do not think one is possible in this situation. I am working with hg38.phastCons20way.wigFix (phastCons scores calculated genome-wide for human, based on alignments of 19 species), and it is a huge file. Any guidance would be much appreciated.
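One possible way to approach the 200 bp averaging is to stream the wigFix file and keep a running sum for the current bin only, so the file never has to fit in memory. The sketch below is a hedged illustration, not a tested pipeline for the real file: the function name `binned_means` and the `bin_size` parameter are hypothetical, bins are fixed 0-based windows (0-200, 200-400, ...), wigFix positions are assumed to be 1-based with one base per score, and the file is assumed to be sorted by position within each chromosome. Each mean is taken over the bases that actually have a score, so gaps are skipped rather than counted as zero.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Matches "chrom=chr1 start=3000306" with an optional "step=N" after it.
HEADER = re.compile(r"chrom=(\S+)\s+start=(\d+)(?:\s+step=(\d+))?")


def binned_means(lines: Iterable[str],
                 bin_size: int = 200) -> Iterator[Tuple[str, int, int, float]]:
    """Yield (chrom, bin_start, bin_end, mean_score) per bin (hypothetical helper)."""
    chrom = None
    pos = 0
    step = 1
    key: Optional[Tuple[str, int]] = None   # (chrom, bin index) being accumulated
    total = 0.0
    n = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = HEADER.search(line)
        if match:
            chrom = match.group(1)
            pos = int(match.group(2))
            step = int(match.group(3)) if match.group(3) else 1
            continue
        if chrom is None:
            raise ValueError("score line found before any header line")
        new_key = (chrom, (pos - 1) // bin_size)   # pos - 1: 1-based to 0-based
        if new_key != key:
            if n:
                # Entering a new bin: emit the finished one first.
                yield key[0], key[1] * bin_size, (key[1] + 1) * bin_size, total / n
            key, total, n = new_key, 0.0, 0
        total += float(line)
        n += 1
        pos += step
    if n:
        yield key[0], key[1] * bin_size, (key[1] + 1) * bin_size, total / n


# Example use (file name taken from the post above):
# with open("hg38.phastCons20way.wigFix") as handle:
#     for record in binned_means(handle, bin_size=200):
#         print(*record, sep="\t")
```

If the blocks in the file were not in positional order, the same bin could be emitted more than once, and the output would need to be merged afterwards.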