SEQanswers

Old 01-16-2012, 02:44 PM   #1
francy
Member
 
Location: London

Join Date: Jun 2011
Posts: 19
View and manipulate bigWig files

Dear experts,

I have downloaded the FAIRE-seq 'Signal' files, which are in bigWig format, from the UCSC Open Chromatin downloads to use them for further manipulation: I would like to extract the density values from these bigWig files for specific positions on the genome. Do you know how I can view these files, or convert them to a format that I can view and manipulate in, for example, R (CRAN)?

Thank you.
Old 01-16-2012, 03:13 PM   #2
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 836

See the 'Extracting Data from the bigWig Format' section of this page:

http://genome.ucsc.edu/goldenPath/help/bigWig.html
Old 01-18-2012, 12:54 AM   #3
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258

Quote:
Originally Posted by francy View Post
Dear experts,

I have downloaded the FAIRE-seq 'Signal' files, which are in bigWig format, from the UCSC Open Chromatin downloads to use them for further manipulation: I would like to extract the density values from these bigWig files for specific positions on the genome. Do you know how I can view these files, or convert them to a format that I can view and manipulate in, for example, R (CRAN)?

Thank you.
Hi there, you can use bx-python to read BigFile formats (i.e. bigWig and bigBed). If foo.bigwig is your file, you can do:

Code:
import bx.bbi.bigwig_file
bwh = bx.bbi.bigwig_file.BigWigFile(open("foo.bigwig", "rb"))
data = bwh.get_as_array(chrom, 0, csize)
where chrom is a string naming your chromosome and csize is an integer giving its size. Of course, you can request a smaller interval each time. If you need to get the chromosome sizes from the bigWig file itself, you can do it without bx-python like this:

Code:
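import os
import struct

def getChromosomeSizesFromBigWig(bwname):
  """Return {chromosome name: size} read from the bigWig chromosome B+ tree."""
  csize = {}
  with open(os.path.expanduser(bwname), "rb") as fh:
    # read magic number to guess endianness (bigWig magic is 0x888FFC26);
    # the file is opened in binary mode, so compare against bytes literals
    magic = fh.read(4)
    if magic == b'&\xfc\x8f\x88':
      endianness = '<'
    elif magic == b'\x88\x8f\xfc&':
      endianness = '>'
    else:
      raise IOError("The file is not in bigwig format")
    # read the rest of the header (60 bytes after the magic)
    (version, zoomLevels, chromosomeTreeOffset,
     fullDataOffset, fullIndexOffset, fieldCount, definedFieldCount,
     autoSqlOffset, totalSummaryOffset, uncompressBufSize, reserved) = struct.unpack(endianness + 'HHQQQHHQQIQ', fh.read(60))
    if version < 3:
      raise IOError("Bigwig files version <3 are not supported")
    # go to the chromosome B+ tree
    fh.seek(chromosomeTreeOffset)
    # read magic again (B+ tree magic is 0x78CA8C91)
    magic = fh.read(4)
    if magic == b'\x91\x8c\xcax':
      endianness = '<'
    elif magic == b'x\xca\x8c\x91':
      endianness = '>'
    else:
      raise ValueError("Wrong magic for this bigwig data file")
    (blockSize, keySize, valSize, itemCount, reserved) = struct.unpack(endianness + 'IIIQQ', fh.read(28))
    # walk the tree from the root; non-leaf nodes hold offsets of child nodes,
    # so files with many chromosomes (root is not a leaf) are handled too
    pending = [fh.tell()]
    while pending:
      fh.seek(pending.pop())
      (isLeaf, reserved, count) = struct.unpack(endianness + 'BBH', fh.read(4))
      for n in range(count):
        if isLeaf:
          (key, chromId, chromSize) = struct.unpack(endianness + str(keySize) + 'sII', fh.read(keySize + 2 * 4))
          # we have chrom and size; keys are NUL-padded to keySize bytes
          csize[key.rstrip(b'\x00').decode('ascii')] = chromSize
        else:
          (key, childOffset) = struct.unpack(endianness + str(keySize) + 'sQ', fh.read(keySize + 8))
          pending.append(childOffset)
  return csize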
This is based on the specs released along with bigwig paper.
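For anyone puzzled by the magic-number strings in the function above, here is a minimal sketch of the same endianness check written with the integer value instead of raw byte strings. The names BIGWIG_MAGIC and detect_endianness are made up for this illustration; the value 0x888FFC26 is simply what the four bytes compared above decode to.

```python
import struct

# The four bytes compared in the post above, read as one unsigned 32-bit integer.
BIGWIG_MAGIC = 0x888FFC26


def detect_endianness(first_four_bytes):
    """Return '<' or '>' (struct byte-order prefixes) for a bigWig file's first 4 bytes.

    Hypothetical helper for illustration: it decodes the bytes both ways and
    keeps the byte order under which they equal the bigWig magic number.
    """
    for prefix in ("<", ">"):
        (value,) = struct.unpack(prefix + "I", first_four_bytes)
        if value == BIGWIG_MAGIC:
            return prefix
    raise ValueError("not a bigWig file")
```

The returned prefix can then be put in front of the other struct format strings, exactly as the endianness variable is used above.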

HTH
Old 01-24-2012, 01:05 AM   #4
francy
Member
 
Location: London

Join Date: Jun 2011
Posts: 19

Quote:
Originally Posted by gringer View Post
See the 'Extracting Data from the bigWig Format' section of this page:

http://genome.ucsc.edu/goldenPath/help/bigWig.html
I have tried this, but when I run the UCSC 'bigWigToBedGraph' program as described (bigWigToBedGraph wgEncodeOpenChromFaireGm12878Sig.bigWig out.bedGraph), it runs for a very long time and then reports that it crashed. I have also tried limiting the output by chromosome or position, but it still freezes. Could it be because the bigWig file is too large? Do you know if there is a way to extract the density only for certain SNPs from a list?

Thank you.

Last edited by francy; 01-24-2012 at 01:09 AM.
Old 01-24-2012, 01:15 AM   #5
francy
Member
 
Location: London

Join Date: Jun 2011
Posts: 19

Quote:
Originally Posted by dawe View Post
Hi there, you can use bx-python to read BigFile (i.e. bigWig and bigBed). if foo.bigwig is your file you can
Thank you for this tip, I am trying bx-python now. In particular, I am trying to understand whether bx-python allows extracting the density at single SNP positions. If you have any idea, could you please let me know?
Thank you.
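One possible way to do this, as a rough sketch only: call get_as_array (the method used in the post above) on a one-base window around each SNP. The function name values_at_positions is made up for this example, and it assumes that SNP positions are 1-based, that get_as_array takes 0-based half-open coordinates, and that it returns None for an unknown chromosome and NaN where there is no data. Please check these assumptions against your bx-python version.

```python
import math

# With bx-python installed, bwh would be opened as in the post above:
#   import bx.bbi.bigwig_file
#   bwh = bx.bbi.bigwig_file.BigWigFile(open("foo.bigwig", "rb"))


def values_at_positions(bwh, snps):
    """Look up the signal at single positions.

    bwh  : any object with get_as_array(chrom, start, end), e.g. a BigWigFile
    snps : iterable of (chrom, pos) pairs, pos assumed to be 1-based
    Returns {(chrom, pos): value or None}; None means no data was found.
    """
    result = {}
    for chrom, pos in snps:
        # Assumed 0-based half-open window covering exactly one base.
        arr = bwh.get_as_array(chrom, pos - 1, pos)
        if arr is None or len(arr) == 0:
            result[(chrom, pos)] = None
            continue
        value = float(arr[0])
        result[(chrom, pos)] = None if math.isnan(value) else value
    return result
```

Calling values_at_positions(bwh, [("chr1", 10001)]) would then give a small dictionary keyed by SNP. Reading one base at a time avoids loading a whole chromosome into memory.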

Quote:
Originally Posted by dawe View Post
This is based on the specs released along with bigwig paper.
I am not very good at programming yet and I am having trouble understanding this script. Could you please point me to the bigWig paper you are referring to so I can read more about this?
Thank you again.
Old 01-24-2012, 01:20 AM   #6
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 836

If the output file size is a problem, then you can pipe to standard out by telling the program that '/dev/fd/1' is the output file:

Code:
$ bigWigToBedGraph -chrom=v31.005068 -start=11200 -end=20000 Irr_day3_B.bw /dev/fd/1 | head
v31.005068	11253	11286	1
v31.005068	11300	11328	1
v31.005068	11328	11348	2
v31.005068	11348	11376	1
v31.005068	11533	11566	1
v31.005068	11757	11805	2
v31.005068	11817	11833	1
v31.005068	11833	11839	3
v31.005068	11839	11846	4
v31.005068	11846	11872	6
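If you have a list of SNPs, the same command can be issued once per position from a script. This is a minimal sketch that only uses the options shown above (-chrom, -start, -end, and /dev/fd/1 as the output file). The function names are made up, and it assumes SNP positions are 1-based while -start is 0-based, so treat it as a starting point rather than a tested recipe.

```python
import subprocess


def bedgraph_command(bigwig_path, chrom, pos, out_path="/dev/fd/1"):
    """Build the bigWigToBedGraph argument list for a single position.

    pos is assumed to be 1-based; the window passed to the tool is assumed
    to be 0-based half-open, so it covers exactly that one base.
    """
    return [
        "bigWigToBedGraph",
        "-chrom=%s" % chrom,
        "-start=%d" % (pos - 1),
        "-end=%d" % pos,
        bigwig_path,
        out_path,
    ]


def query_snp(bigwig_path, chrom, pos):
    """Run the command and return its bedGraph output as text.

    Requires bigWigToBedGraph on the PATH and a system where /dev/fd/1 exists.
    """
    completed = subprocess.run(
        bedgraph_command(bigwig_path, chrom, pos),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Starting one process per SNP is slow for long lists; for thousands of positions it is probably better to dump one chromosome at a time and look the positions up in the output.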
Old 01-24-2012, 02:55 AM   #7
francy
Member
 
Location: London

Join Date: Jun 2011
Posts: 19

Quote:
Originally Posted by gringer View Post
If the output file size is a problem, then you can pipe to standard out by telling the program that '/dev/fd/1' is the output file:
Dear Gringer, thank you very much for your help.
The script still hangs when I try this as you suggested:

Code:
 bigWigToBedGraph -chrom=chr1 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head
The bigWig file that I have downloaded is 2.7 GB...
Old 01-24-2012, 03:22 AM   #8
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 836

Quote:
Originally Posted by francy View Post
The script still hangs when I try this as you suggested:

Code:
 bigWigToBedGraph -chrom=chr1 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head
The bigWig file that I have downloaded is 2.7 GB...
It may be faster if you specify both -start and -end. Assuming none of your chromosomes is longer than 1 Gbp, this should work:

Code:
$ time bigWigToBedGraph -chrom=chr1 -start=1 -end=1000000000 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head
chr1	9999	10000	0.0028
chr1	10000	10005	0.0029
chr1	10005	10009	0.003
chr1	10009	10013	0.0031
chr1	10013	10018	0.0032
chr1	10018	10022	0.0033
chr1	10022	10027	0.0034
chr1	10027	10031	0.0035
chr1	10031	10036	0.0036
chr1	10036	10041	0.0037

real	0m7.691s
user	0m5.708s
sys	0m1.776s
But then I re-ran this with no start/end points, and it took a similar length of time:
Code:
$ time bigWigToBedGraph -chrom=chr1 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head
chr1	9999	10000	0.0028
chr1	10000	10005	0.0029
chr1	10005	10009	0.003
chr1	10009	10013	0.0031
chr1	10013	10018	0.0032
chr1	10018	10022	0.0033
chr1	10022	10027	0.0034
chr1	10027	10031	0.0035
chr1	10031	10036	0.0036
chr1	10036	10041	0.0037

real	0m7.671s
user	0m5.860s
sys	0m1.596s
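Once the bedGraph lines are available, the density at individual SNPs can be looked up in a few lines of Python. The following is a sketch under some assumptions: the function names are made up, SNP positions are taken to be 1-based, bedGraph intervals are taken to be 0-based half-open, and intervals on a chromosome are assumed not to overlap.

```python
import bisect


def parse_bedgraph(lines):
    """Group bedGraph lines (chrom, start, end, value) into sorted lists per chromosome."""
    by_chrom = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        chrom, start, end, value = fields
        by_chrom.setdefault(chrom, []).append((int(start), int(end), float(value)))
    for intervals in by_chrom.values():
        intervals.sort()
    return by_chrom


def value_at(by_chrom, chrom, pos):
    """Return the value of the interval covering 1-based position pos, or None."""
    intervals = by_chrom.get(chrom, [])
    zero_based = pos - 1
    # Index of the last interval whose start is <= zero_based.
    i = bisect.bisect_right(intervals, (zero_based, float("inf"), float("inf"))) - 1
    if i >= 0 and intervals[i][0] <= zero_based < intervals[i][1]:
        return intervals[i][2]
    return None


# Example with three of the lines printed above.
example = parse_bedgraph([
    "chr1\t9999\t10000\t0.0028",
    "chr1\t10000\t10005\t0.0029",
    "chr1\t10005\t10009\t0.003",
])
print(value_at(example, "chr1", 10001))
```

The binary search keeps each lookup fast even when a chromosome has millions of intervals.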
Tags
bigwig, data format, ucsc