SEQanswers

Old 09-25-2012, 09:10 PM   #1
omiguele
Junior Member
 
Location: Mexico City

Join Date: Sep 2012
Posts: 7
Mappability File

Hi, I'm fairly new to Linux and I am trying to create a mappability file using the executables and instructions that are part of PeakSeq, which are here: http://archive.gersteinlab.org/proj/...lity_Map/Code/ When I finally run "python compile.py" I get this:
/bin/sh: 1: chr2hash: not found
/bin/sh: 1: oligoFindPLFFile: not found
/bin/sh: 1: mergeOligoCounts: not found

I know this has to do with putting them in my PATH, but I don't understand what that means. Could someone please explain it to me and tell me what I should do to solve this problem?

Thanks, I really appreciate your time.
Old 09-26-2012, 04:39 AM   #2
TiborNagy
Senior Member
 
Location: Budapest

Join Date: Mar 2010
Posts: 329

PATH is an environment variable. It contains a colon-separated list of directories where runnable programs can be found. When you run a program by name, the shell looks for it in those directories (the current directory is searched only if it is itself listed in PATH). You can see its value with the following command: echo $PATH.
You can temporarily add a new directory (for the current shell session only) with this:
export PATH=$PATH:/yourdirectory
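
To check whether the lookup described above would succeed before running compile.py, something like the following Python sketch may help. It uses the standard library's `shutil.which`, which searches a PATH string the same way the shell does. The helper names `find_missing` and `append_to_path` are made up for this illustration and are not part of PeakSeq.

```python
import os
import shutil


def find_missing(tools, path=None):
    """Return the tool names that cannot be found as executables.

    `path` is a PATH-style string; None means "use the current PATH".
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


def append_to_path(path, directory):
    """Return a PATH string with `directory` appended (like PATH=$PATH:/dir)."""
    return path + os.pathsep + directory if path else directory


if __name__ == "__main__":
    # The three programs named in the error messages from the first post.
    tools = ["chr2hash", "oligoFindPLFFile", "mergeOligoCounts"]
    print("not found on PATH:", find_missing(tools))
```

If the list printed is empty, all three programs are visible on the current PATH; otherwise the directory that holds them still needs to be added.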
Old 09-26-2012, 10:00 AM   #3
omiguele
Junior Member
 
Location: Mexico City

Join Date: Sep 2012
Posts: 7

Thanks, I tried what you kindly suggested, but I got the same message for the three executables... Do you have any idea why it isn't working?

Again, thank you.
Old 09-26-2012, 10:09 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,884

See this page for help with adding new directories to the PATH variable: http://www.cyberciti.biz/faq/unix-linux-adding-path/

You appear to be using the sh/bash shell.

Did you do the following, per TiborNagy's directions? First change to the directory where you issued the python command, then:

1. Type "pwd" (without quotes). This should print the full directory path.
2. Type "export PATH=$PATH:copy_and_paste_the_path_string_printed_after_step_1" (without quotes, substituting the actual path).
3. Try the python command again.

Are you still getting errors after doing the above?
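
The same three steps can also be sketched from Python, which makes it explicit that the changed PATH only applies to the process it is handed to (and to that process's children). The `/bin/sh:` prefix in the error messages suggests compile.py starts the tools through a shell, so the sketch runs its command through the shell as well. The function names `env_with_dir` and `run_in_shell` are hypothetical, and nothing here is taken from the PeakSeq code.

```python
import os
import subprocess


def env_with_dir(directory, base_env=None):
    """Copy an environment and append `directory` to its PATH entry."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    env["PATH"] = current + os.pathsep + directory if current else directory
    return env


def run_in_shell(command, directory):
    """Run `command` via the shell with `directory` appended to PATH.

    Returns the command's exit code.
    """
    result = subprocess.run(command, shell=True, env=env_with_dir(directory))
    return result.returncode


# Roughly equivalent to steps 1-3 when called from the directory that holds
# the executables (assumption: they sit next to compile.py):
#     run_in_shell("python compile.py", os.getcwd())
```

Because the environment is copied, the PATH of the calling shell is left untouched, just as an "export" typed in one terminal does not affect another terminal.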

Last edited by GenoMax; 09-26-2012 at 10:16 AM.
Old 09-26-2012, 01:33 PM   #5
omiguele
Junior Member
 
Location: Mexico City

Join Date: Sep 2012
Posts: 7
Creating Mappability File

Thank you, I'm not having problems with that now. But I wonder if you could help me with this. Once all the files were produced using python compile.py, which files does CountMap.py take as input? I tried all the different ones for a chromosome, but I get this: " SyntaxError: Non-ASCII character '\xac' ". Do I need to process these files before using CountMap.py? And if that's the case, which program should I use?

Thank you.
Old 09-27-2012, 03:42 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,884

It sounds like there is some kind of formatting issue with your data files.
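
One quick way to look into this is to check whether a given file contains bytes outside the ASCII range, such as the '\xac' in the error message. Python 2 reports "SyntaxError: Non-ASCII character" when it parses a file as Python source, so it may also be worth checking whether a binary data file is ending up where the interpreter expects a script. The sketch below is only a diagnostic aid; the function name `first_non_ascii` is made up for this example.

```python
def first_non_ascii(path, chunk_size=1 << 20):
    """Return (offset, byte_value) of the first byte >= 0x80, or None.

    The file is read in binary mode and in chunks, so large count files
    do not need to fit in memory.
    """
    offset = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return None
            for i, byte in enumerate(chunk):
                if byte >= 0x80:
                    return offset + i, byte
            offset += len(chunk)
```

A result of None means the file is plain ASCII; a tuple gives the position and value of the first offending byte.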

Have you tried downloading the test data set from the PeakSeq page to see if that works?

There is a pipeline that uses the PeakSeq programs. It shows example input strings for the various python programs: http://array.mbb.yale.edu/pipeline/scoring.html

Here is the example for CountMap.py input from the pipeline page. You will need to change file paths to match your own.

~/solexa/bin/countMaps.py [-c <count dir>] [-o <output dir>] [-w <window size>] [-h]

-c directory where you did the count
-o directory where you want the mappability file to be (if not given, ~/solexa/mappability)
-w window size (if not given, 1,000,000)
-h help
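
For scripting the call, the usage line above can be turned into an argument list as in the sketch below. It uses only the flags listed in the usage text (-c, -o, -w); the helper name `build_countmaps_command` and its parameter names are invented for this example, and the script path is just the one from the pipeline page, which you would replace with your own.

```python
import os


def build_countmaps_command(count_dir=None, output_dir=None, window_size=None,
                            script="~/solexa/bin/countMaps.py"):
    """Assemble an argument list from the options shown in the usage text.

    Options left as None are omitted, so the script's own defaults apply.
    """
    argv = [os.path.expanduser(script)]
    if count_dir is not None:
        argv += ["-c", count_dir]
    if output_dir is not None:
        argv += ["-o", output_dir]
    if window_size is not None:
        argv += ["-w", str(window_size)]
    return argv
```

The resulting list can be passed to `subprocess.run`. Whether the script is executable on its own or has to be started with "python" in front depends on your installation, so that part is left out here.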
Old 12-29-2012, 11:37 PM   #7
mjp
Member
 
Location: USA

Join Date: Mar 2011
Posts: 25

Does anybody know how the 'mappability' file was created for the paper by Hesselberth et al., "Global mapping of protein-DNA interactions in vivo by digital genomic footprinting"?

They came up with a BED file structured like this (fragment of the file):
chr1 5 6 . 100
chr1 6 7 . 100
chr1 7 8 . 100
chr1 8 9 . 100
chr1 9 10 . 100
chr1 10 11 . 100
chr1 11 12 . 10
chr1 12 13 . 10
chr1 13 14 . 10
chr1 14 15 . 10
chr1 15 16 . 10
chr1 16 17 . 100
chr1 17 18 . 100
chr1 18 19 . 100
chr1 19 20 . 100
chr1 20 21 . 100

I'm trying to understand how the 5th column was created, or in other words: what is the meaning of '10' and '100'?
They call this mappability data, but they seem to store only the unmapped bases (?)

http://noble.gs.washington.edu/proj/footprinting/

Any help on how to create this file would be appreciated.
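
To make the structure of the fragment easier to see, adjacent one-base records with the same value in the 5th column can be collapsed into runs, as in the Python sketch below. It only summarizes the file and says nothing about how the values were computed or what '10' and '100' mean. The function name `collapse_runs` is made up for this example, and the column layout (chrom, start, end, name, score) is assumed from the fragment shown.

```python
def collapse_runs(lines):
    """Merge adjacent BED records that share chromosome and 5th-column value.

    Returns a list of (chrom, start, end, score) tuples; the score is kept
    as a string because its meaning is not known.
    """
    runs = []
    for line in lines:
        if not line.strip():
            continue
        chrom, start, end, _name, score = line.split()[:5]
        start, end = int(start), int(end)
        last = runs[-1] if runs else None
        if last and last[0] == chrom and last[3] == score and last[2] == start:
            # Extend the current run up to this record's end coordinate.
            runs[-1] = (chrom, last[1], end, score)
        else:
            runs.append((chrom, start, end, score))
    return runs
```

Applied to the sixteen lines above, this gives three runs: chr1 5-11 with 100, chr1 11-16 with 10, and chr1 16-21 with 100.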
Tags
mappability, peakseq, ubuntu

All times are GMT -8.