**westerman** · 01-06-2010, 07:51 AM

You need to set the GOLDENPATH environment variable to point to the directory containing those files. See the documentation and experiment a bit.

**sulicon** · 02-28-2011, 03:06 PM

I have run into the same problem.

I have put the genomic sequences under a "chromosomes" directory, and the annotation files, refGene.txt and snp131.txt, in a neighboring folder called "database". According to the documentation from Roche, the annotation files should be found this way. But I could not get it to work.

Moreover, I have set "GOLDENPATH" to hg19, the parent directory of the above directories:
echo $GOLDENPATH
/home/sulicon/data/hg19
It didn't work either...

I have also tried setting "GOLDENPATH" to "/home/sulicon/data" and using "hg19" as the name of the reference genome. Unfortunately, gsMapper wasn't able to recognize the folder structure.

Any suggestion is appreciated.

**sklages** · 03-01-2011, 02:51 AM

The GOLDENPATH is pointing to "/home/sulicon/data" which contains a subfolder "hg19" which contains the subfolders "chromosomes" (single fa files) and "database" (containing snp131.txt, refLink.txt, refGene.txt and productName.txt), correct?

And how did you start the mapper?

Minimal example (assuming EST data as input):
$ runMapping -cdna -gref hg19 READS.sff

This should work ...

Sven

**sulicon** · 03-01-2011, 11:03 AM

Hi Sven,

Thanks very much! I have tried what you said:

$ echo $GOLDENPATH
/home/shuli/data
$ ls /home/sulicon/data/hg19
chromosomes database
$ runMapping -cdna -gref hg19 /path/to/reads/reads.sff
Error: Reference file/directory does not exist: hg19

I have noticed you mentioned "single fa files" should be put into the "chromosomes" folder, whereas I have put fasta files each corresponding to a chromosome. Maybe this is the problem? Will have a try later on...

**sklages** · 03-01-2011, 11:29 AM

1) This is what my (probably too fat) UCSC dir tree looks like:

https://ws.molgen.mpg.de/ws/672764/hg19.txt

2) You set GOLDENPATH to /home/shuli/data but stored the data under a different path, /home/sulicon/data. You are telling gsMapper to look in /home/shuli/data/hg19, which is probably not the correct dir.

hth, Sven
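
The mismatch in point 2 can be caught before starting the mapper. Below is a minimal sketch of such a check, assuming the layout described in this thread ("chromosomes" and "database" subfolders under "$GOLDENPATH/hg19"). The function name check_goldenpath and the temporary demo tree are made up for illustration; this is not part of the Roche software, and it does not verify which individual files gsMapper actually requires.

```shell
# Hypothetical helper: verify that "$GOLDENPATH/<genome>" contains the
# "chromosomes" and "database" subdirectories discussed in this thread.
check_goldenpath() {
    genome="$1"
    if [ -z "${GOLDENPATH:-}" ]; then
        echo "GOLDENPATH is not set" >&2
        return 1
    fi
    for sub in chromosomes database; do
        if [ ! -d "$GOLDENPATH/$genome/$sub" ]; then
            echo "missing directory: $GOLDENPATH/$genome/$sub" >&2
            return 1
        fi
    done
    echo "ok: $GOLDENPATH/$genome"
}

# Demo on a throwaway tree so the sketch runs anywhere.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/hg19/chromosomes" "$demo_root/hg19/database"

# Correct setting: GOLDENPATH is the parent of hg19.
GOLDENPATH="$demo_root"
good=$(check_goldenpath hg19)
echo "$good"

# Wrong setting (analogous to the shuli/sulicon mix-up): the check fails.
GOLDENPATH="$demo_root/wrong"
bad=$(check_goldenpath hg19 2>&1) || true
echo "$bad"
```

With a real installation you would set GOLDENPATH to your own data directory and call `check_goldenpath hg19` before running runMapping.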

**sulicon** · 03-01-2011, 11:44 AM

Thanks again.
The GOLDENPATH variable is corrected now but the reference seq still can't be recognized...

The following is the structure of my hg19 directory. It looks similar to yours.

Code:

$ tree hg19
hg19
|-- chromosomes
|   |-- chr1.fa
|   |-- chr10.fa
|   |-- chr11.fa
|   |-- chr11_gl000202_random.fa
|   |-- chr12.fa
|   |-- chr13.fa
|   |-- chr14.fa
|   |-- chr15.fa
|   |-- chr16.fa
|   |-- chr17.fa
|   |-- chr17_ctg5_hap1.fa
|   |-- chr17_gl000203_random.fa
|   |-- chr17_gl000204_random.fa
|   |-- chr17_gl000205_random.fa
|   |-- chr17_gl000206_random.fa
|   |-- chr18.fa
|   |-- chr18_gl000207_random.fa
|   |-- chr19.fa
|   |-- chr19_gl000208_random.fa
|   |-- chr19_gl000209_random.fa
|   |-- chr1_gl000191_random.fa
|   |-- chr1_gl000192_random.fa
|   |-- chr2.fa
|   |-- chr20.fa
|   |-- chr21.fa
|   |-- chr21_gl000210_random.fa
|   |-- chr22.fa
|   |-- chr3.fa
|   |-- chr4.fa
|   |-- chr4_ctg9_hap1.fa
|   |-- chr4_gl000193_random.fa
|   |-- chr4_gl000194_random.fa
|   |-- chr5.fa
|   |-- chr6.fa
|   |-- chr6_apd_hap1.fa
|   |-- chr6_cox_hap2.fa
|   |-- chr6_dbb_hap3.fa
|   |-- chr6_mann_hap4.fa
|   |-- chr6_mcf_hap5.fa
|   |-- chr6_qbl_hap6.fa
|   |-- chr6_ssto_hap7.fa
|   |-- chr7.fa
|   |-- chr7_gl000195_random.fa
|   |-- chr8.fa
|   |-- chr8_gl000196_random.fa
|   |-- chr8_gl000197_random.fa
|   |-- chr9.fa
|   |-- chr9_gl000198_random.fa
|   |-- chr9_gl000199_random.fa
|   |-- chr9_gl000200_random.fa
|   |-- chr9_gl000201_random.fa
|   |-- chrM.fa
|   |-- chrUn_gl000211.fa
|   |-- chrUn_gl000212.fa
|   |-- chrUn_gl000213.fa
|   |-- chrUn_gl000214.fa
|   |-- chrUn_gl000215.fa
|   |-- chrUn_gl000216.fa
|   |-- chrUn_gl000217.fa
|   |-- chrUn_gl000218.fa
|   |-- chrUn_gl000219.fa
|   |-- chrUn_gl000220.fa
|   |-- chrUn_gl000221.fa
|   |-- chrUn_gl000222.fa
|   |-- chrUn_gl000223.fa
|   |-- chrUn_gl000224.fa
|   |-- chrUn_gl000225.fa
|   |-- chrUn_gl000226.fa
|   |-- chrUn_gl000227.fa
|   |-- chrUn_gl000228.fa
|   |-- chrUn_gl000229.fa
|   |-- chrUn_gl000230.fa
|   |-- chrUn_gl000231.fa
|   |-- chrUn_gl000232.fa
|   |-- chrUn_gl000233.fa
|   |-- chrUn_gl000234.fa
|   |-- chrUn_gl000235.fa
|   |-- chrUn_gl000236.fa
|   |-- chrUn_gl000237.fa
|   |-- chrUn_gl000238.fa
|   |-- chrUn_gl000239.fa
|   |-- chrUn_gl000240.fa
|   |-- chrUn_gl000241.fa
|   |-- chrUn_gl000242.fa
|   |-- chrUn_gl000243.fa
|   |-- chrUn_gl000244.fa
|   |-- chrUn_gl000245.fa
|   |-- chrUn_gl000246.fa
|   |-- chrUn_gl000247.fa
|   |-- chrUn_gl000248.fa
|   |-- chrUn_gl000249.fa
|   |-- chrX.fa
|   |-- chrY.fa
|   `-- chromFa.tar.gz
`-- database
    |-- refGene.txt
    |-- refLink.txt
    `-- snp131.txt

**sklages** · 03-01-2011, 12:12 PM

What about "productName.txt"?

**sulicon** · 03-01-2011, 01:15 PM

I don't have this file. Is it required? And the problem is that even the reference genome can't be recognized:
"Error: Reference file/directory does not exist: hg19"

Maybe I need the "bigZips" folder as you did?

**sklages** · 03-01-2011, 01:47 PM

Since Roche states in their manual that the suite recognizes the UCSC directory structure, I went the lazy way and just used the whole tree. I have not really tested which files/folders can be omitted.

It is very strange, though, that you get an error message stating that the file has not been found. It sounds as if there is still a "mismatch" between the GOLDENPATH path and the actual data location.

Maybe it is best to try the whole tree and (if you are patient) remove the parts not necessary for your mapping (but probably it is not worth removing files).

**sulicon** · 03-03-2011, 03:38 PM

It turns out the reason was that I had forgotten to 'export' the GOLDENPATH variable... Everything is OK now.
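
For anyone hitting the same error: a variable that is assigned but not exported is visible to `echo $GOLDENPATH` in the current shell, yet it is not passed on to programs started from that shell. The sketch below illustrates this with a plain `sh -c` child process standing in for the mapper; the path is the one used earlier in this thread, and the names unexported and exported are only for the demo.

```shell
# Start from a clean state, then assign WITHOUT export.
unset GOLDENPATH
GOLDENPATH=/home/sulicon/data

# The current shell sees the value, but a child process does not.
unexported=$(sh -c 'echo "${GOLDENPATH:-<unset>}"')
echo "child sees: $unexported"

# After export, child processes inherit the variable.
export GOLDENPATH
exported=$(sh -c 'echo "${GOLDENPATH:-<unset>}"')
echo "child sees: $exported"
```

The first line printed is `child sees: <unset>` and the second is `child sees: /home/sulicon/data`. Putting `export GOLDENPATH=/path/to/data` in your shell startup file avoids having to remember this in each session.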

**G.Chevignon** · 04-18-2011, 01:19 AM

Hello everybody,

I want to map 454 reads to a genome with a threshold of 5 or 10 reads for consensus formation. This setting exists in the GUI version of GSMapper as "Minimum contig depth", but I cannot find it in the CLI version of GSMapper.

Is this parameter available in the CLI version?

Thank you for your help


Running 454 mapper by command line CLI
