SEQanswers

Old 08-25-2010, 03:44 AM   #1
Esther
Junior Member
 
Location: Netherlands

Join Date: Jul 2010
Posts: 5
Lastz help

Hi all,

I'm trying to get lastz to compare some sequences to the human reference genome. This is the command I'm using:

lastz /path/to/data/fastafile.fasta[multiple] /path/to/reference/reference_human.fa --format=difference --identity=90 --coverage=50 --output=/path/to/data/lastzout.fa

This gives the following error:

FAILURE: call to realloc failed to allocate 135087680 bytes, for add_segment

Does anyone know what this means and what I can do about it?
Old 10-12-2010, 01:23 AM   #2
david2
Member
 
Location: Switzerland

Join Date: May 2010
Posts: 19

Hi Esther,

I am getting the identical error message when trying to align a 454 run with ~8000 reads to an 80 kb reference sequence, using:

./lastz brca1.fasta 454run.fna --format=sam --step=10 --seed=match12 --notransition --exact=20 --noytrim --match=1,5 --ambiguous=n --coverage=90 --identity=95

Is there anyone out there who knows what we are doing wrong?

I would like to replicate the Lastz parameters used on the public Galaxy server. Does anyone know where I can find which parameters they use?

Cheers,

David
Old 01-27-2014, 02:50 AM   #3
amitbik
Member
 
Location: Italy

Join Date: May 2013
Posts: 50

I am using lastz for piler and my command is

$ lastz unigenes.fa[multiple][unmask] --format=maf > unilastz.maf
FAILURE: bad fasta character in unigenes.fa, >Contig10: Y

My Contig10 is:
>Contig10
TGAAAACTAATATATATATATGTAGCATACCTGGATTCAACTTTACTGGTGGTTCAACAT
GAATCGTTCAAACAATTCCGCCTGTGCTTTTAGTTTCTCCAATTCGCCATTAGTCTTCTT
ATTTTCCTCCTTTTGAGCCTCCAACTCCATCTTGAGGGTCTGAATTTCTTCTTTAGCACT
CTCAAGCTGAGTTTGAGTCTCATTTATCCCAAAATAGCTATAGCGACGGGGCACCTTCAC
CCCCCAACCCATACCTTTTTGATACCCTGATCTCACACCGAGAGTCTCAACCATAATCTC
CTCTTCAGTCTTCGAATTTTCTCCATCCATCATAGACGATTCCTCCATTTCAGCCATTTT
CTCCTGAAATTTATTATTAATATATGTGAAATTATGGTAAAAGAATAGTTCCAGAATTAT
GGTAAAAGAATTGTTGAGGTGTTGAAACACAAGAGGTTTGGTGCAAAATTCCCTATTTTG
TGTAGAGCAACTTACATATGTCATTTCAGCCTCTTCGTTCACGAATTTATTCACGAATTG
ACCCTTCCTGCGAACATGATTATCCTTCCACAACAGAATCCTACTACGATTAACCTATAA
GGAAAACATATACAATTACTAAGTTTCTTACCTCATCACAATATCCAGTATAGATATCTG
AATTCAAAAATAGAGCACAAATATAGAAATATTAGATTTCAAAAATTGAAAGTACATACC
ATGCTCCGGTAGCGAACAAGGTTTCTGGTCCTTCCTGATGTGTGGGGAAAAAGCATGCAT
TTTCTATTTGCTTTATTGATGGCAGACCTCATCTACATCATAAACAATGAACAACAAATC
AATTTTTTTTATTTAATAAGAACCTTTGGCCAAATATAATCCCAGCATCCATACAATTAT
CTTTTGTCGCAGAAAAAAAGGAGAAACAGAGAGAGTGTTAGAATAAAAAGGCATACCTGA
AAACGATCAGACTCAACGTGCTCACATAGTGCATGCCAACTCTCAACTGTGAGTTCTCTC
CGCGGAATTGAATTTCTAGGTTGAAGCCCCTTGTTCTTCATATTTGTGTGGATCTTGCCC
AACCTATATCTCCAATTTCTGTAATTTGTCTGGAGTGCTTGATCTATTATAGTCTGGATT
CGCCAACTTGACAGGTCTTCAATCTCGAACCATTGCTAGTAATCAAAATATCTCTATCAT
TACTTGTATATATAAAATAAAAAGAAAAACACTTGGCTTATTTGTTCGAGGAGAATAAAA
AACTCACAAACTAAGGTGATTAAAAAGTAAAAATTACCAATAAATTTTCTAGACAAGCTT
GCTTATTTGTACTAGAAACTTCCTTCCACTTTGACACGCCTTCAATATCCGCATGTGCTC
TTACAGTAACCCCAATAGCATCAGCAAACTCTCCGGTATATTGTGTTGGTATTGTCTAAC
AAGGATTAAAAACTAATTTCATCTTCCCATTCTGCGAAACAACATAATAATTAGAGAGTT
AAATTCATAAAACAAGATCTCCAATTTTCACAGATTAGAATATTGTACCCGTTTCACCAT
CTTATCGAGGTGCAGGCCTCTACCCGGACCGTGAGTATTCTGTTTAGGCTTTGATCCTAA
ATAAAAAGTTATTGAAATAGTGTTAATGTTCCCTAAAGATACTAAACTAAATAAATATGT
TCCGCAAAGAATAAAGAATAGAATTAAAAAAAAAAAATGAAACCAAACAACATACCAATG
GTTGTTCCTGCAGATCCTTCTAGAGTTGCTGAGACAGGAATAGATGGAGGCTGACGCACA
CTTAAATTAGGAGATGAAGCAGTGGACCTTAGAGTATTTCCAGCAGTGGTGGATGTGGAA
GTTGACATATTAGATGCTACATATTTGGATTTTGAAGCAGCTGGAATTGGTTGTTTATTA
ACCACTGAAGTGGAAGGCCTGGAAGTTGACGGCTGGAATTTCAAAGTTGGTGGAATTGAT
GCAACATTTCCAGACGTAACATGTACCTGGGGCTGTTTATTAACCACTGTAGTAGAAGGT
CTGGAAGTTGACGGTTGGAATTTCGAAGTTGGTGGAATTGATGCAACATTTCTAGTCGTA
ACATGTACTTGGGGTCGTACAAACCTCTTGAACTTTGGTCCTCCTGACATCTGAAAAAAT
AAAAGACAAAAAGATATCAGTATGCATGATATAAAGATGAAGAGTTAAAGTATATGCTAC
ATATTCAGTAAAGTGTATTACATTGAATAGGCCGAAAAACAGTTAAAAAAGAAAAACCAA
AGCTATAACAGTTCTTGCATATTCTCTTATAACTAATAAAGAATGAGTTAATAAAAATAA
ATCCTCCTCTTCTTCTTCTTCTTCTTGTTCATTTCTTCTTCTTCTTCTTCCTCCTCCTCC
TCTTCTTCTTCCTCCTCTTCTCCCTCCTCCTCATATGATTGTTCAGTATTTTGATTTCTG
GACTTTCTAATGCTTTCAAAGATCATCCTATCAAGCTCTTCAGGTTCTAAATCGTCCCTT
GCCAATTGCCGTCGACCATTCTGCTCACTTGCTTCTTCAATTGGATCAACACGAGTAGAT
TCATATTGCTGAACAACCTCTGCCACAAATTCTTGTTCATTAATGGTTGTTTTGTCCACA
ACATCATTTACAGCTGATTCACAAACATCCCACAAATGCCTATGTGAAACTCTCTGCACA
ACCTTCCAATTTTTACCCAGTTTTGTGTCATTCACATAAAACACTTGTTGTGCCTGGACA
GGTAATATAAATGGTTGCTCCTTGTACCACCTAGAGCTAACATCAACACTAATGAAATGT
TTATCCTCTTGGATAAATTTATTACTTCCAGAGTTGAACCACTCACAATCAAACAAAACC
ACTTTTCTACCACCAATATATGTTAACTCCACAACCTTTCTCAAGAAACCGAAAAACTCA
ACCGGCTTCCTCTTATGAAATCCTCCAACTACAAGGCCACTATTTTGAGTTTTCGTCGTC
TGTCACAATCCCTTGTATGAAACCGGACCTGATTCACCATGCATCCATTATAAAAATCTG
AGCTCAGCCTCGAAGGACCATTTGCTAAAGCCCACAGCTCAGTTGTAATTTTGTCTGCTT
TCTGAAAATGCAACAGACCCATCTACAAGATAATATAATAGTTAGAACAAAGTATTCTGC
TAAAATCATTTAAGTTCAGAGTATAAGTTATCAACTAAAGGAAATAAACTAACATGTTTT
TAAACCACTGTGGGAAGGTGTTTTTCTGCCTTGGGTCAAGTTCATCCATGTTACCCACTT
CTTCTACGGTAATTCCTTGCAATTCAGCTCTATGCATTCTGAAAAGGGCATAACCATTAG
ATTACAAATAAGTTTATGGTTGTCAAGTGTCAACAAAACACTATGTATTACCACAGTGAA
CGGGTACATCATACTGCCTCAATCCAAACATGGAATAACATCATAGTTACTTAAGTTTTA
CTATCAACAGTAACTATCTAAAATCAGCAGCAGGTATATGCCTAAAGATACTGGAACAAA
TCAGTAACTATCTAAAATCAGCAGCAGGTATATGCCCAAAGATACTGGTAGTCATAGTTC
AGACAAAACTCGTAGACATAAGATAACAATAGAGAGAGATAAGTATTTCTTACTTTAGAT
ATGGCTCAACTTCATCACAATTGCACATTATGTACCAATGTGCCTCATCTAATTCTCCTC
TAGTCAAAGAACACTTGGGGCTAACTTCCCCAAATGGACGAGTTCTCATTGAAAATACAG
AAATTCCTTCCTTTTGTTGACCATAGACACCATCAAAATTCCGTGGAAGGCAATTAAATT
TAGTTTCCATGTCCATAGGCAAATGCATAGAAAAAAATGTGATGTTCTCATGCACCAAGT
ATTGTTTAGCGATGGAAGCTTCAGGATGATTTCTATTTCGGATAAAGCCTGTGAGAGATC
CCAGAAACCTAACAATAAGAAGGAAAGTAAACACCGTAGAATCAGAATCATGATACAGTT
AAAATCAGAATCATTGCATCATATATATCTCAAAAATTTCACCTGGCAGAAGCCGCATAA
GATGTTCCTTAGACACAATATATATCTCAAAAATTAAGGCTTGAAATCAAGTAAAAATGC
AAAATAAATCAAGGCCAAATAAAATGTTACTCTGATTTCATACCACAAACAACACAACTT
GTATGAGTACAGATCAGCAGCCAACTTAATAGACCAGAACACAAACTTAATAGACCAGAA
CACAGAATCACAAACTTATTCGACAATCACACGAAGGGTCACAATAAAAACTAACCTTTC
AAAGGGATACATCCAGCGAAATTGCACGGGTCCTGCAATCTTAGCCTCTTGGGCCAAGTG
GACAGATAAATGTACCATAATTGTGAAAAAGTAAGGTGGAAATATCTTCTCCAACTTACA
TAAGATAATTGGTATATTTTCATCCAACCGATTCACATCTTCGTCATACAAGGTCTTGGA
ACATAATTGCTTAAAGTAATCTCCTAATTCACATAAAGCGTCCTTCACTTCCTTGGTCAA
GTAACCACGAATTCCAGCAGGAATTAATCTCTGGAGTAGAATATGACAGTCATGGGACTT
CAGTCGTGATAATTTATCCTTTTTCACACGTGCTGAGATGTTAGAAGCAAACCCATCAGG
AAACTTAACTAACTTCAGAAACTCCAAGACTTCTCCTCTTTCACTTGGGGATAAGGTAAA
ACATGCAGGTGGCCTGAATATTTTATTGCCATTATCTTGCAGGTGCAGCTCCTTCCTAAT
ACCCATATCGCGTAAATCTTCTCTTGCAACATTAGAGTCTTTTGTTTTCCCTTTAATATC
AAATAATGTACCAAGGACATTATCACATATATTCTTCTCAACGTGCATCGGATCGAGATT
GTGCCTTATCTTCAACATGTGCCAATATGGCAGCTGGTAGAAACCACTTATTTTATCCCA
ATTTACTTGATCCGAATCACGCTTTCTTTTACGATTTCCAGCTGCAATGAACACATCTTG
CAATGAATTAAGCCAATACAATATTTCTTCACCTGAAAGTTGTTTAGGTTTCTCTCTATG
CTCCTCTTTACCATTGAAAGTTAAACTTGTTCCTGCGGAATGGGTGTCTTTCGGGTAGAA
AACGACGATGTCCCATGTAACATATCTTTGTTCGTAATTGTGTAGAATCTGTCTCATCAA
GACAGCATGGGCATGCATAATAACCTGATGCTTGCCACCCTGACAAAAGACTATAACCAG
GCAAGTCGCTTATGGTCCACAACAACACAACACGCATTAGAAACTTCTCTTTTGTGAGAG
CATCATAAGTTTCGATCCCAACATCCCATAACTCCTTCAAATCTTCGATTAAGGGCTGCA
AATACACATCAATGTTTTTCCCAGGATAATTTGGGCCCGGTATAAGCAGCGACAACATAA
GAAATGGTTCACGCATACAATCATAAGAGGGCAGATTATAAACAACAAGAATCACGGGCC
AAATGGTATGCGAATTCCCGATGTCTGACCATGGAGTGAATCCATCACATGCTAAACCCA
ACCTAACATTCCGTGCATCCTTAGAAAACTCAGGAAATAGACTATCAAGGTGTTTCCAAG
CCAATGAATCGGCTGGATGTCTTGAGACTCCATCCTCAACTTTGCGTTTATCTTTGTGCC
AACGCATTTTTGCAGCCGTTTTGGTGAATGCAAAGAGTCTTTGCAACCGGGGCTTCAAGG
GGAAGTATCTCAATTTCTTATGAGCAACTCTCTTTATCTTACTACCAAGCGGCTGCCACC
TAGATTCTTTACAAACTGGACACTCGTTCATATTTTCATTCTCTTTCCGGAATAATACAC
AACCGTTCTTACATGCATCAATCTCTTCATACCCTAGTCCAAGTTCACGCAGCTCCTTCT
CAGCACCATATAATGACGATGGAAGAGTTTCACCTTCGGGTAAGATGCCTTTTATAAAGT
CCAGCAGCATACTGAAAGACTTGTCCGTCCATTTATCCTTAACTTTCAGATGTAGCAGTT
TTAGGGTAGCTAAAAGTGATGAAGTCTTACAGCCCGGATATAATTCATTTCTGGCATCAC
CCACCAACTTCTGGAATCTTTCTTCTTCTTCTTCTTCATAGTCGCTGTCCACATGTTGAC
CTTCAACCAATTCATCCAAGTACCCATTGATCGAACACATTAAATTCGACATCATCATTA
CCTTGATCATTTTCCTCATAATCATCAACACCCTCTGTTTCTGCTAGGAACTCATTTATC
TCGTCCTCGGATGGTAAACGAACATCATTATCAGTATCCTCATCATTTATATCAAGCTCT
TTCTCTCCATGATTGTTCCAAAGTACATATGTACGAGAAAATCCATTAACGTGTAAATCC
TTTTCAACCTCATCTATGGGCTTGTAGAATGAGTTGTTGCAAACCACACATGGACATTTA
ATCATGTTATCACTCCCCATCACATTATCAGCAGCAAACATAAGAAAACATTGCACACCA
TCCATATAAGAATTCGAAAACTTATTATGCTCTCTAATCCAACTTTTGTCCATTATCATT
ACGTTGTACCTAAGTAATAAATAGCTCAGGAAAAATATAACTCTGGCAAGTAGCTCAGGC
CTCAAATTAATAAATACCATATAAACATACACAAGTCAAATAATATCTACATAAGTAGAG
GAACAATCTGAAGCCAAATAATAAACATAGATACAAATATTCATCTAGAACTTATCAATA
TACACATACAGAAACGTGCCTAAATCATAAATTTTAGGAGAATTGAACTACATCAAAGTT
TGAAGTACATCATACCTTCTCTAACAAAGGACACAGCACAAACTTCGGATTTGGATAATG
ATTTGATCAAAGCTTGCTCCTCCAGATTCAAACGCAGCCACACTTCACAGTTGCTAGCCG
ATTCAAACGCGCCACCAGATTCAAACGCGCCACCAGATTGAAACACGATCCATCAGACGC
CTATCATTAACCTTCCTTTCCTGAATCTCCGATTGATAGTGTTGTACTGTTGTGTCGTGT
GATTAGATTTAGATCGATCGAAATTAGGGTTTTAGTAAAAGGGATTTTTGTTTGTGACTG
TGGGTTTCTGTCTAAGTAACGAAGAAGATGAAGTTATCTGATTCCCTTTTTAAAACTTAT
CCTTAATTGCAAAAAAGCCCAAAAGTGTTTCTTATTAACAAATATGCCGCCGAATTTTCT
TTCTTTCACAATTTTCAAATTTTATCATCTAAGAGCCTATTTTTCTATGATAATTTATTA
TATTTGTCATTCTAAAATATTTTTATTTGATAATTTAGTCTAAAAATGTCATCGTAAAGT
TTAATTGTGGTGACAAGTTTTAAGGTGCCACCTAATAGTTACTTTTATTTGATAATTTAY
CCTAATGTCATTTAAAACTAGTCATCTAACATCCTTTT

Can anyone tell me why it is showing this error and what is wrong?

Thanks....
Old 01-27-2014, 03:05 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

As the error says, you have the character "Y" (the IUPAC ambiguity code for C or T) in the second-to-last line of the contig you posted above.

You may have to specify the option "--ambiguous=iupac" for lastz. That comes with a caveat, as explained in the README, so be sure to check the explanation for that specific option here: http://www.bx.psu.edu/~rsharris/last...z-1.02.23.html
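
Before rerunning, it can help to find every sequence that contains such characters. The following is a minimal Python sketch, not part of lastz. The function name find_unexpected_chars, the ALLOWED set (A, C, G, T, N in either case) and the second record Contig11 are assumptions made for illustration; the first record uses the last two lines of Contig10 as posted above.

```python
from collections import Counter

# Assumption: only these characters count as "plain" nucleotides here.
# Adjust the set to whatever your lastz settings actually accept.
ALLOWED = set("ACGTNacgtn")


def find_unexpected_chars(fasta_lines):
    """Map each sequence name to a Counter of characters outside ALLOWED."""
    report = {}
    name = None
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            continue
        unexpected = [ch for ch in line if ch not in ALLOWED]
        if unexpected:
            report.setdefault(name, Counter()).update(unexpected)
    return report


example = [
    ">Contig10",
    "TTAATTGTGGTGACAAGTTTTAAGGTGCCACCTAATAGTTACTTTTATTTGATAATTTAY",
    "CCTAATGTCATTTAAAACTAGTCATCTAACATCCTTTT",
    ">Contig11",  # hypothetical record with no unexpected characters
    "ACGTNNacgt",
]
print(find_unexpected_chars(example))
```

For a real file, an open file handle can be passed in place of the list, since the function only iterates over lines.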
Old 01-27-2014, 05:39 AM   #5
amitbik
Member
 
Location: Italy

Join Date: May 2013
Posts: 50

Thank you, GenoMax.
Old 01-31-2014, 05:14 AM   #6
amitbik
Member
 
Location: Italy

Join Date: May 2013
Posts: 50

Hi,
I am running lastz with the command below, but it has been running for the last 4 days and it is at around 73.4 GB now. I don't understand why it is still running, or whether I did something wrong. Can anybody guide me?

lastz unigenes.fa[multiple] unigenes.fa --notrivial --format=maf --ambiguous=iupac > unilastz.maf

Thanks..
Old 01-31-2014, 05:27 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

Are you searching the "unigenes.fa" file against itself? If so, how big is that file?

As indicated in the help page for lastz, comparing entire chromosomes only takes a few hours, so 4 days seems long.

At this point you are too far into the computation to cancel it just to investigate other options such as --notransition or --step, which are supposed to lower sensitivity and run time. If you are using a cluster, you can always start another job with these options and see whether the second job finishes before the first one.
Old 01-31-2014, 07:33 PM   #8
amitbik
Member
 
Location: Italy

Join Date: May 2013
Posts: 50

Thank you, GenoMax.

Yes, I am searching the "unigenes.fa" file against itself, and the file size is 334 MB. I am running lastz on my own system, which has 16 GB of RAM and 8 CPUs.
Old 03-28-2014, 05:20 AM   #9
Bob-Harris
Member
 
Location: USA

Join Date: Mar 2014
Posts: 13

Quote:
Originally Posted by Esther View Post
Hi all,
FAILURE: call to realloc failed to allocate 135087680 bytes, for add_segment
Howdy, Esther,

Sorry for this late reply. I am new to seqanswers today.

Are you still having this problem? If so, please email me so I can figure out why this is happening (you can find my email address in the lastz readme).

That error report is a failsafe and probably means that memory fragmentation is happening. I've never actually seen that error manifest in my own use. My best guess as to the cause is very high repeat content. But that's just a guess.

Bob H
(lastz author)
Old 03-28-2014, 05:24 AM   #10
Bob-Harris
Member
 
Location: USA

Join Date: Mar 2014
Posts: 13

Quote:
Originally Posted by david2 View Post
I am getting the identical error message when trying to align 454 run with ~8000 reads on an 80kb reference sequence, using:

./lastz brca1.fasta 454run.fna --format=sam --step=10 --seed=match12 --notransition --exact=20 --noytrim --match=1,5 --ambiguous=n --coverage=90 --identity=95
Howdy, David,

I'm very surprised this could be happening with such a small problem. Could you email me directly (my address is in the lastz readme)? I'd like to run the same sequences here so I can figure out how that memory allocation condition is being triggered.

Bob H
(lastz author)
Old 04-10-2015, 08:43 AM   #11
Bob-Harris
Member
 
Location: USA

Join Date: Mar 2014
Posts: 13

Quote:
Originally Posted by Bob-Harris View Post
That error report is a failsafe and probably means that memory fragmentation is happening. I've never actually seen that error manifest in my own use. My best guess as to the cause is very high repeat content. But that's just a guess.
I'm expanding on my earlier answer, since other users may encounter this problem and Google may lead them to this page.

My best guess when this happens is that the input sequences contain a lot of unmasked repetitive sequence. The simplest (probable) solution is to split the target into several runs, for example a separate run for each of several subsets of the target sequences. The other possibility would be to soft-mask repeats in the sequences (assuming they aren't already masked), but this can be time-consuming as well.
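
Splitting the target into subsets can be sketched in a few lines of Python. This is only an illustration, not a lastz feature: the function name split_fasta, the records_per_chunk parameter and the tiny example records are all hypothetical.

```python
def split_fasta(lines, records_per_chunk):
    """Group FASTA lines into chunks holding at most records_per_chunk records."""
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    chunks, current, count = [], [], 0
    for line in lines:
        if line.startswith(">"):
            # A new record starts; close the current chunk if it is full.
            if count == records_per_chunk:
                chunks.append(current)
                current, count = [], 0
            count += 1
        current.append(line)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical three-record target, split into chunks of two records.
target = [">a", "ACGT", ">b", "GGTT", ">c", "TTAA"]
for number, chunk in enumerate(split_fasta(target, 2), start=1):
    print(number, chunk)
```

Each chunk would then be written to its own file and used as the target of a separate lastz run, with the results combined afterwards.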

In detail: internally, lastz processes one query at a time. It identifies HSPs (high-scoring ungapped alignments) between the query and all the target sequences, collects these into a table, and then proceeds to the gapped alignment stage. There is one segment in the table for each HSP. The amount of memory allocated for the table grows as more segments are needed. If allocation fails, a message is generated indicating a memory allocation failure in add_segment.

Even if such a large memory request could be satisfied, this wouldn't really solve the problem. Most likely the program wouldn't finish in a timely fashion.

The cause of such a large number of HSPs is usually something like an unmasked microsatellite, that is, a long sequence with a short repeat period, e.g. AGTAGTAGTAGTAGT. These are problematic because, in the HSP-finding stage, every shift by a multiple of the period looks as good as any other. If your lastz parameters allow for high divergence, then highly diverged microsatellites will have a similar effect.
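
To look for this kind of short-period repeat in your own sequences, a rough Python sketch follows. It is not how lastz finds HSPs and it is no substitute for a real repeat masker; the function name find_tandem_repeats and the thresholds max_period and min_length are assumptions. It reports exact runs in which every base equals the base one period earlier.

```python
def find_tandem_repeats(seq, max_period=6, min_length=12):
    """Return (start, end, period) for runs where seq[i] == seq[i - period].

    Coordinates are 0-based and end-exclusive; only exact repeats are found.
    """
    seq = seq.upper()
    hits = []
    for period in range(1, max_period + 1):
        run_start = None
        # Go one step past the end so that a run reaching the end is closed.
        for i in range(period, len(seq) + 1):
            matches = i < len(seq) and seq[i] == seq[i - period]
            if matches:
                if run_start is None:
                    run_start = i - period
            elif run_start is not None:
                if i - run_start >= min_length:
                    hits.append((run_start, i, period))
                run_start = None
    return hits


example = "C" + "AGT" * 5 + "C"
print(find_tandem_repeats(example, max_period=3))
print(find_tandem_repeats(example, max_period=6))
```

With max_period=6 the same AGT run is reported for period 3 and again for period 6, which is the point made above: a shift by any multiple of the period matches equally well.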

If you are running several queries, it is often just a few queries that contain the problematic element. You can find out which query is being processed at the point of failure by adding --progress to the command line.

Bob H
(lastz author)
Old 07-29-2015, 05:21 PM   #12
skothap
Junior Member
 
Location: indianapolis

Join Date: Jul 2015
Posts: 2
Lastz output problem

Hi all,

I am a graduate student in bioinformatics. I am working with Lastz, trying to compare two DNA sequence files from sparrow genomes.

My command is:
lastz GCF_000385455.1_Zonotrichia_albicollis-1.0.1_genomic.fa[multiple] Capensis_both_Out.txt.scafSeq --format=difference --output=/net/home/skothapalli/result.fa

The output was:
lastz: No match.
What does this mean? Can someone help me out?

thanks in advance,
Sam
Old 07-30-2015, 03:52 AM   #13
Bob-Harris
Member
 
Location: USA

Join Date: Mar 2014
Posts: 13
re: Lastz output problem

Quote:
Originally Posted by skothap View Post
output was
lastz: No match.
what does this mean?
This is apparently a message from your shell. The string "no match" isn't something lastz ever outputs.

I've seen this reported a couple times before, but I've never found a machine that does it. There is more information about that in this thread (search down the page for "no match"):
http://seqanswers.com/forums/archive...p/t-20113.html
My second post in that thread (05-15-2014, 07:28 AM) contains a workaround.

Quote:
Originally Posted by skothap View Post
my command is
lastz GCF_000385455.1_Zonotrichia_albicollis-1.0.1_genomic.fa[multiple] Capensis_both_Out.txt.scafSeq --format=difference --output=/net/home/skothapalli/result.fa
The command looks fine, but just FYI the output format won't be fasta.

Bob H (lastz author)
Old 07-30-2015, 06:05 AM   #14
skothap
Junior Member
 
Location: indianapolis

Join Date: Jul 2015
Posts: 2

Hi Bob,

I ran ls -al `which lastz` and my output was:
skothapalli@bioinformatics:/net/common/data/sparrow_project$ ls -al `which lastz`
-rwxr-xr-x. 1 root root 435735 Jun 17 14:33 /usr/local/bin/lastz

and I am working in the following path:
/net/common/data/sparrow_project

So, do I have to change the path somewhere?

thanks in advance,
Sam
Old 07-30-2015, 07:05 AM   #15
Topulaneus-Hattum
Member
 
Location: Planet Earth, approximately

Join Date: Feb 2014
Posts: 14

Well I just typed a longer reply but when I submitted it my browser failed. If this post is successful I will try again.
Old 07-30-2015, 07:13 AM   #16
Topulaneus-Hattum
Member
 
Location: Planet Earth, approximately

Join Date: Feb 2014
Posts: 14

Sam, I really don't have any idea why your shell is giving you that error.

From what you posted, it looks like the shell is finding the executable file, which ought to rule out path problems. That file has a reasonable size, and it's set to "x" for all users. That all looks correct. My only other guess is that the executable was built for a different machine, but that's truly a wild-ass guess (and I'd hope you'd get a more informative error message than "no match").

It's apparently a rare situation (3 reports in 5 years) and I've never been able to reproduce it. I've tried searching for what instances the shell would report "no match" but I've had no luck.

What shell are you using?

As another data point, if you try wrapping your command in an executable text file, does that work?

Bob H (lastz author)
Old 07-30-2015, 07:30 AM   #17
Topulaneus-Hattum
Member
 
Location: Planet Earth, approximately

Join Date: Feb 2014
Posts: 14

With a little more searching I've found a plausible answer. I think your shell is trying to interpret [multiple] as a pattern (a bracketed set of characters), as part of command-line filename expansion.

If that's the case, I don't know how to prevent it (my shell isn't doing that). You might try putting quotes or double quotes around [multiple], but I have my doubts that that will change anything.

There's an alternative way to get that multiple option into lastz without binding it to the file name: --action1=[multiple]. But I think the shell is still going to try to expand anything in square brackets. It's going to take someone who understands your shell to solve this. Sorry.
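
For readers unfamiliar with filename expansion, the effect can be illustrated in Python, whose fnmatch module follows rules similar to shell filename patterns. This is a sketch under that assumption, not a diagnosis of the shell in question; the file name unigenes.fam is hypothetical. Some csh-family shells are known to report "No match" when a pattern matches no file, and quoting is the usual way to make a shell pass such text through unchanged, but whether either applies here is unconfirmed.

```python
import fnmatch
import shlex

argument = "unigenes.fa[multiple]"

# As a filename pattern, [multiple] is a character class: it stands for
# exactly one character out of m, u, l, t, i, p, e.
matches_short = fnmatch.fnmatchcase("unigenes.fam", argument)

# The literal text typed on the command line does not match its own pattern,
# so a shell that insists on a match finds no such file.
matches_literal = fnmatch.fnmatchcase(argument, argument)

# shlex.quote shows one POSIX-style quoting that protects the brackets.
quoted = shlex.quote(argument)

print(matches_short, matches_literal, quoted)
```

If the command is launched from a script rather than typed at a prompt, passing the arguments as a list without going through a shell avoids filename expansion altogether.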

Bob H (lastz author)
Old 07-31-2015, 10:57 AM   #18
Topulaneus-Hattum
Member
 
Location: Planet Earth, approximately

Join Date: Feb 2014
Posts: 14

I know this doesn't help you right now, but the next release of lastz (something greater than 1.03.73) will have a workaround for this, so that you can set that kind of file modifier without having to use square brackets. That is based on the still-unconfirmed idea that the square brackets are the problem.

Bob H (lastz author)
Old 08-24-2015, 07:36 AM   #19
ereaye
Junior Member
 
Location: Switzerland

Join Date: Aug 2015
Posts: 3
multiple sequences in target fasta

Hello,

I am also having some problems with the option "[multiple]". I am trying to run the following command:

>lastz chicken.chrUn.fa[multiple] turkey.chr1.fa K=3000 L=2200 H=2000 M=50 format=axt > chicken.chrUn.turkey.chr1.axt

where the file chicken.chrUn.fa contains hundreds of sequences/scaffolds and the file turkey.chr1.fa contains a single sequence (always softmasked). I tried this with small test files and everything worked fine, but when running on the actual data I get this error:

searching for matches in turkey.chr1.fa
processing anchor #1 (of 63651) (8212754/78681682) 16394104/78681682
alignment block score=294106 at (8212350/78681278) 16393700/78681278 length 3543/3485
processing anchor #2 (of 63651) (15263154/71894445) 30514955/71894445
alignment block score=224990 at (15261448/71892751) 30513249/71892751 length 2933/3040
processing anchor #3 (of 63651) (19232279/25722160) 38458396/25722160
alignment block score=497244 at (19226538/25716168) 38452655/25716168 length 6286/6534
processing anchor #4 (of 63651) (12774217/202502915) 25532709/202502915
alignment block score=288391 at (12773292/202502037) 25531784/202502037 length 3719/3839
processing anchor #5 (of 63651) (17840002/111698882) 35673300/111698882
FAILURE: lookup_partition could not locate position 17833297 in chicken.chrUn.fa

This repeats consistently across all turkey chromosomes, each time pointing to a different location in chicken.chrUn.fa. I do want to take advantage of the possibility of having more than one sequence in the target FASTA, but the runs keep failing. Does anyone know how to fix this?

Thanks!
Old 08-24-2015, 09:36 AM   #20
Bob-Harris
Member
 
Location: USA

Join Date: Mar 2014
Posts: 13

Howdy, ereaye,

This indicates some sort of internal error in lastz. Internally it concatenates all the target sequences into one. Here it is trying to convert a position in that concatenated sequence back to the original sequence coordinates, and something has gone wrong with that.
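
The general idea of mapping a position in a concatenated sequence back to its source sequence can be sketched as follows. This is an illustration of the technique only, not lastz's actual lookup_partition code: the names build_index and locate, the scaffold names and lengths, and the choice to ignore any separators between sequences are all assumptions.

```python
from bisect import bisect_right


def build_index(records):
    """records: (name, length) pairs in concatenation order.

    Returns (names, starts, total), where starts[i] is the offset of
    sequence i within the concatenated sequence.
    """
    names, starts, offset = [], [], 0
    for name, length in records:
        names.append(name)
        starts.append(offset)
        offset += length
    return names, starts, offset


def locate(names, starts, total, pos):
    """Convert a 0-based concatenated position to (name, local position)."""
    if not 0 <= pos < total:
        raise ValueError(f"could not locate position {pos}")
    # Binary search for the last sequence starting at or before pos.
    i = bisect_right(starts, pos) - 1
    return names[i], pos - starts[i]


# Hypothetical scaffolds of length 100, 50 and 200.
names, starts, total = build_index([("scafA", 100), ("scafB", 50), ("scafC", 200)])
print(locate(names, starts, total, 120))
```

A failure of the kind reported above corresponds to the case where a position falls outside every recorded interval, which in this sketch raises ValueError.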

I'd like to reproduce this here. Are these standard chicken and turkey assemblies? If so could you tell me which release/assembly they are? Or if perhaps turkey isn't available online, would it be possible to make that chromosome available somewhere?

Also, what version of lastz are you running (lastz --version)?

Thanks,
Bob H (lastz author)

All times are GMT -8.