**gringer** · 01-16-2012, 04:13 PM

See the 'Extracting Data from the bigWig Format' section of this page:

http://genome.ucsc.edu/goldenPath/help/bigWig.html

**dawe** · 01-18-2012, 01:54 AM

Originally posted by francy

Dear experts,

I have downloaded the FAIRE-seq 'Signal' files, which are in bigWig format, from the UCSC Open Chromatin downloads for further manipulation: I would like to extract the density values from these bigWig files for specific positions on the genome. Do you know how I can view these files, or convert them to a format that I can view and manipulate in, for example, R?

Thank you.

Hi there, you can use bx-python to read BigFile formats (i.e. bigWig and bigBed). If foo.bigwig is your file, you can do:

Code:

import bx.bbi.bigwig_file
bwh = bx.bbi.bigwig_file.BigWigFile(open("foo.bigwig", "rb"))
data = bwh.get_as_array(chrom, 0, csize)

where chrom is a string naming your chromosome and csize is an integer giving its size. Of course, you can fetch a smaller interval each time. If you need to read the chromosome sizes from the bigWig file itself, something like this may work (without bx-python):

Code:

import os
import struct

def getChromosomeSizesFromBigWig(bwname):
  csize = {}
  with open(os.path.expanduser(bwname), "rb") as fh:
    # read magic number to guess endianness
    # (bytes literals, so the comparison also works on Python 3)
    magic = fh.read(4)
    if magic == b'&\xfc\x8f\x88':
      endianness = '<'
    elif magic == b'\x88\x8f\xfc&':
      endianness = '>'
    else:
      raise IOError("The file is not in bigwig format")
    # read the header (60 bytes after the magic number)
    (version, zoomLevels, chromosomeTreeOffset,
    fullDataOffset, fullIndexOffset, fieldCount, definedFieldCount,
    autoSqlOffset, totalSummaryOffset, uncompressBufSize, reserved) = struct.unpack(endianness + 'HHQQQHHQQIQ', fh.read(60))
    if version < 3:
      raise IOError("Bigwig files version <3 are not supported")
    # go to the chromosome B+ tree
    fh.seek(chromosomeTreeOffset)
    # read magic again
    magic = fh.read(4)
    if magic == b'\x91\x8c\xcax':
      endianness = '<'
    elif magic == b'x\xca\x8c\x91':
      endianness = '>'
    else:
      raise ValueError("Wrong magic for this bigwig data file")
    (blockSize, keySize, valSize, itemCount, reserved) = struct.unpack(endianness + 'IIIQQ', fh.read(28))
    (isLeaf, reserved, count) = struct.unpack(endianness + 'BBH', fh.read(4))
    if not isLeaf:
      # this function only reads a tree whose root is a single leaf node
      raise NotImplementedError("Multi-level chromosome trees are not handled")
    for n in range(count):
      (key, chromId, chromSize) = struct.unpack(endianness + str(keySize) + 'sII', fh.read(keySize + 2 * 4))
      # we have chrom and size; keys are padded with NUL bytes
      csize[key.replace(b'\x00', b'').decode('ascii')] = chromSize
  return csize

This is based on the specs released along with the bigWig paper.

HTH
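
For readers unfamiliar with struct format strings, here is a minimal sketch of how the 'HHQQQHHQQIQ' string above maps onto the header fields. The field names are taken from the function above; `HEADER_FIELDS` and `parse_header` are hypothetical names introduced for illustration, and the packed values are made up rather than read from a real file.

```python
import struct

# Fields that follow the 4-byte magic number in the bigWig header.
# Format codes: H = 2 bytes, I = 4 bytes, Q = 8 bytes (all unsigned).
HEADER_FIELDS = [
    ("version", "H"),
    ("zoomLevels", "H"),
    ("chromosomeTreeOffset", "Q"),
    ("fullDataOffset", "Q"),
    ("fullIndexOffset", "Q"),
    ("fieldCount", "H"),
    ("definedFieldCount", "H"),
    ("autoSqlOffset", "Q"),
    ("totalSummaryOffset", "Q"),
    ("uncompressBufSize", "I"),
    ("reserved", "Q"),
]
HEADER_FORMAT = "".join(code for _, code in HEADER_FIELDS)  # 'HHQQQHHQQIQ'
BIGWIG_MAGIC = 0x888FFC26


def parse_header(raw, endianness="<"):
    """Turn the 60 bytes after the magic number into a name -> value dict."""
    values = struct.unpack(endianness + HEADER_FORMAT, raw)
    return dict(zip((name for name, _ in HEADER_FIELDS), values))


# Round trip on made-up values (not taken from a real file).
fake = struct.pack("<" + HEADER_FORMAT, 4, 10, 344, 400, 5000, 0, 0, 0, 304, 32768, 0)
header = parse_header(fake)

print(struct.calcsize("<" + HEADER_FORMAT))                 # 60
print(struct.pack("<I", BIGWIG_MAGIC) == b"&\xfc\x8f\x88")  # True
print(header["chromosomeTreeOffset"])                       # 344
```

The explicit '<' or '>' prefix matters: it selects the byte order and also switches off alignment padding, which is why the fields add up to exactly 60 bytes.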

**francy** · 01-24-2012, 02:05 AM

Originally posted by gringer

See the 'Extracting Data from the bigWig Format' section of this page:

http://genome.ucsc.edu/goldenPath/help/bigWig.html

I have tried this, but when I run 'bigWigToBedGraph' as described in the UCSC downloads (bigWigToBedGraph wgEncodeOpenChromFaireGm12878Sig.bigWig out.bedGraph), it takes a very long time and then reports that it crashed. I have also tried limiting the output by chromosome or position, but it still freezes. Could it be because the bigWig file is too large? Do you know if there is a way to extract the density only for certain SNPs from a list?

Thank you.

**francy** · 01-24-2012, 02:15 AM

Originally posted by dawe

Hi there, you can use bx-python to read BigFile formats (i.e. bigWig and bigBed). If foo.bigwig is your file, you can do:

Thank you for this tip, I am trying bx-python now. In particular, I am trying to understand whether bx-python allows extracting the density at single SNPs. If you have any ideas, could you please let me know?
Thank you.

Originally posted by dawe

This is based on the specs released along with the bigWig paper.

I am not too good with programming yet and I am having trouble understanding this script. Could you please point me to the bigWig paper you are referring to, so I can read more about this?
Thank you again.

**gringer** · 01-24-2012, 02:20 AM

If the output file size is a problem, then you can pipe to standard out by telling the program that '/dev/fd/1' is the output file:

Code:

$ bigWigToBedGraph -chrom=v31.005068 -start=11200 -end=20000 Irr_day3_B.bw /dev/fd/1 | head
v31.005068	11253	11286	1
v31.005068	11300	11328	1
v31.005068	11328	11348	2
v31.005068	11348	11376	1
v31.005068	11533	11566	1
v31.005068	11757	11805	2
v31.005068	11817	11833	1
v31.005068	11833	11839	3
v31.005068	11839	11846	4
v31.005068	11846	11872	6
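
For a list of SNPs, the same command could be issued once per position. A minimal sketch, assuming the SNP positions are 1-based and that -start/-end use the same 0-based, half-open coordinates as the bedGraph output; `snp_command` is a hypothetical helper, not part of the UCSC tools, and the positions are made up.

```python
def snp_command(bigwig_path, chrom, pos, out="/dev/fd/1"):
    """Build the bigWigToBedGraph argument list for one 1-based SNP position."""
    # Assumption: -start/-end are 0-based and half-open, so the single
    # base at 1-based position `pos` is the interval [pos - 1, pos).
    return [
        "bigWigToBedGraph",
        "-chrom=%s" % chrom,
        "-start=%d" % (pos - 1),
        "-end=%d" % pos,
        bigwig_path,
        out,
    ]


snps = [("chr1", 10003), ("chr1", 10020)]  # made-up positions
for chrom, pos in snps:
    cmd = snp_command("wgEncodeOpenChromFaireGm12878Sig.bigWig", chrom, pos)
    print(" ".join(cmd))
    # To actually run it (needs the UCSC binary on PATH):
    #   import subprocess; output = subprocess.check_output(cmd)
```

Starting one process per SNP is simple but slow for long lists; for many positions on one chromosome it is probably better to extract the region once and look the positions up afterwards.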

**francy** · 01-24-2012, 03:55 AM

Originally posted by gringer

If the output file size is a problem, then you can pipe to standard out by telling the program that '/dev/fd/1' is the output file:

Dear Gringer, thank you very much for your help.
The script still hangs when I try this as you suggested:

Code:

 bigWigToBedGraph -chrom=chr1 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head

The bigWig file that I have downloaded is 2.7 GB...

**gringer** · 01-24-2012, 04:22 AM

Originally posted by francy

The script still hangs when I try this as you suggested:

Code:

 bigWigToBedGraph -chrom=chr1 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head

The bigWig file that I have downloaded is 2.7 GB...

It may be faster if you specify both -start and -end. Assuming none of your chromosomes is longer than 1 Gb (1,000,000,000 bases), this should work:

Code:

$ time bigWigToBedGraph -chrom=chr1 -start=1 -end=1000000000 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head
chr1	9999	10000	0.0028
chr1	10000	10005	0.0029
chr1	10005	10009	0.003
chr1	10009	10013	0.0031
chr1	10013	10018	0.0032
chr1	10018	10022	0.0033
chr1	10022	10027	0.0034
chr1	10027	10031	0.0035
chr1	10031	10036	0.0036
chr1	10036	10041	0.0037

real	0m7.691s
user	0m5.708s
sys	0m1.776s

But then I re-ran this with no start/end points, and it took a similar length of time:

Code:

$ time bigWigToBedGraph -chrom=chr1 wgEncodeOpenChromFaireGm12878Sig.bigWig /dev/fd/1 | head
chr1	9999	10000	0.0028
chr1	10000	10005	0.0029
chr1	10005	10009	0.003
chr1	10009	10013	0.0031
chr1	10013	10018	0.0032
chr1	10018	10022	0.0033
chr1	10022	10027	0.0034
chr1	10027	10031	0.0035
chr1	10031	10036	0.0036
chr1	10036	10041	0.0037

real	0m7.671s
user	0m5.860s
sys	0m1.596s
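
Once the bedGraph output is available, the signal at individual SNPs can be looked up from it. A minimal sketch, assuming 1-based SNP positions, 0-based half-open bedGraph intervals, and intervals that are sorted and non-overlapping within each chromosome (as in the output above); `load_bedgraph` and `value_at` are hypothetical helper names.

```python
import bisect
from collections import defaultdict


def load_bedgraph(lines):
    """Index bedGraph rows as chrom -> (starts, ends, values)."""
    index = defaultdict(lambda: ([], [], []))
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines and anything that is not a data row
        chrom, start, end, value = fields
        starts, ends, values = index[chrom]
        starts.append(int(start))
        ends.append(int(end))
        values.append(float(value))
    return index


def value_at(index, chrom, pos):
    """Signal at a 1-based position, or None if no interval covers it."""
    if chrom not in index:
        return None
    starts, ends, values = index[chrom]
    # Rightmost interval whose start is <= the 0-based position.
    i = bisect.bisect_right(starts, pos - 1) - 1
    if i >= 0 and pos - 1 < ends[i]:
        return values[i]
    return None


# The first rows of the output shown above.
sample = [
    "chr1\t9999\t10000\t0.0028",
    "chr1\t10000\t10005\t0.0029",
    "chr1\t10005\t10009\t0.003",
]
index = load_bedgraph(sample)
for pos in (10000, 10003, 5000):
    print(pos, value_at(index, "chr1", pos))
```

In practice `lines` could be an open file or the standard output of bigWigToBedGraph instead of a list. The binary search keeps each lookup fast even when a chromosome has millions of intervals.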


View and manipulate bigWig files