**GenoMax** · 02-25-2015, 05:09 AM

get longest orf from emboss getorf - SEQanswers

http://seqanswers.com/forums/showthread.php?t=39074

Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

**dena.dinesh** · 02-25-2015, 08:02 AM

Hi genomax,

I downloaded the Python script and the corresponding XML files from GitHub by right-clicking each link and saving it as is. Later I installed Biopython.
But when I ran the command, I got the following error:

Code:

File "get_orfs_or_cdss.py", line 4
    <!DOCTYPE html>
    ^
SyntaxError: invalid syntax

This is how I gave the command,

Code:

python get_orfs_or_cdss.py $input_fasta smed_dd_v4.fasta $input_format FASTA $table 1 $ftype CDS $ends open $min_len 30 $strand both $mode top $out_nuc_file dd_nucleotide.fasta $out_prot_file dd_prot.fasta

Kindly guide me

**maubp** · 02-25-2015, 12:17 PM

You didn't download the Python script, but an HTML file showing the Python script with nice colours etc. You need to use the "raw" link on GitHub, i.e.

https://raw.githubusercontent.com/peterjc/pico_galaxy/master/tools/get_orfs_or_cdss/get_orfs_or_cdss.py

The resulting get_orfs_or_cdss.py file should be plain text and start with:

Code:

#!/usr/bin/env python
"""Find ORFs in a nucleotide sequence file.

...
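A quick way to tell the two cases apart without opening the file in an editor is to look at its first bytes. The sketch below is only an illustration: the helper name `looks_like_html` and the markers it searches for are assumptions, not part of get_orfs_or_cdss.py, and the check is a heuristic rather than a real HTML parser.

```python
def looks_like_html(path):
    """Return True if the file at `path` appears to be a saved web page."""
    with open(path, "rb") as handle:
        start = handle.read(2048).lower()
    # A saved GitHub page has an HTML doctype or <html> tag near the top;
    # the raw Python script starts with a "#!" line instead.
    return b"<!doctype html" in start or b"<html" in start


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_download.py get_orfs_or_cdss.py
    for name in sys.argv[1:]:
        if looks_like_html(name):
            print("%s: HTML page, download the raw file instead" % name)
        else:
            print("%s: does not look like HTML" % name)
```

If this reports an HTML page, the "SyntaxError: invalid syntax" pointing at `<!DOCTYPE html>` is expected, because Python is being asked to run a web page.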

If it was unclear, in place of $input_fasta you would put the filename of your input FASTA file (and so on). i.e.

Code:

python get_orfs_or_cdss.py smed_dd_v4.fasta FASTA 1 CDS open 30 both top dd_nucleotide.fasta dd_prot.fasta

(And yes, I know this is not a very friendly command line interface - it was written primarily for use via Galaxy and I have not yet had reason/time to go back and make this more Unix-like. Sorry)
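Since the arguments are purely positional, it is easy to put a value in the wrong slot. One way to make that harder is a small wrapper that takes named options and emits them in a fixed order. This is a rough sketch, not part of the tool: `ARGUMENT_ORDER` and `build_command` are hypothetical names, and the order is copied from the example command above, so check the usage text at the top of your copy of the script before relying on it.

```python
# Positional order, taken from the example command in this thread
# (an assumption; verify against the usage text in your copy of the script).
ARGUMENT_ORDER = [
    "input_fasta", "input_format", "table", "ftype", "ends",
    "min_len", "strand", "mode", "out_nuc_file", "out_prot_file",
]


def build_command(script="get_orfs_or_cdss.py", **options):
    """Return the command as a list of strings, in positional order."""
    missing = [name for name in ARGUMENT_ORDER if name not in options]
    unknown = [name for name in options if name not in ARGUMENT_ORDER]
    if missing or unknown:
        raise ValueError("missing: %s; unknown: %s" % (missing, unknown))
    return ["python", script] + [str(options[name]) for name in ARGUMENT_ORDER]


if __name__ == "__main__":
    command = build_command(
        input_fasta="smed_dd_v4.fasta", input_format="FASTA", table=1,
        ftype="CDS", ends="open", min_len=30, strand="both", mode="top",
        out_nuc_file="dd_nucleotide.fasta", out_prot_file="dd_prot.fasta",
    )
    # Prints the same command line as the example above.
    print(" ".join(command))
```

The returned list can be passed to `subprocess.check_call` to actually run the script.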

**dena.dinesh** · 02-25-2015, 12:27 PM

Hi Maubp,

Do you mean that I have to copy the code from the plain text into an editor, save it as a Python script, and then run it as a Python program? Am I right?

**GenoMax** · 02-25-2015, 12:36 PM

Right click on the link Peter provided and then choose "save as" (or "save link as"). That will save the script file locally. You can then run it.

**dena.dinesh** · 02-25-2015, 12:38 PM

Hi Genomax,

I tried exactly what you said, but it threw the same error I described above.

**GenoMax** · 02-25-2015, 12:39 PM

Did you modify/try the command as Peter showed?

**dena.dinesh** · 02-25-2015, 12:44 PM

No, I didn't modify any command. I just ran it after saving the link. What has to be modified?

**GenoMax** · 02-25-2015, 12:47 PM

Code:

$ python get_orfs_or_cdss.py smed_dd_v4.fasta FASTA 1 CDS open 30 both top dd_nucleotide.fasta dd_prot.fasta

**dena.dinesh** · 02-25-2015, 12:55 PM

I tried the above command, but it still shows a syntax error.

**GenoMax** · 02-25-2015, 01:02 PM

We will have to wait for Peter to chime in then.

**maubp** · 02-25-2015, 01:51 PM

Originally posted by dena.dinesh

Hi Maubp,

Do you mean that I have to copy the code from the plain text into an editor, save it as a Python script, and then run it as a Python program? Am I right?

That should work but is unnecessarily complicated. As GenoMax suggested, right-clicking on the link https://raw.githubusercontent.com/pe...rfs_or_cdss.py in your browser should give you a save option. I'm puzzled about what went wrong; perhaps this depends on your web browser?

The simplest approach would be to download it at the command line with:

Code:

$ wget https://raw.githubusercontent.com/peterjc/pico_galaxy/master/tools/get_orfs_or_cdss/get_orfs_or_cdss.py

Check this worked with:

Code:

$ head get_orfs_or_cdss.py 
#!/usr/bin/env python
"""Find ORFs in a nucleotide sequence file.

get_orfs_or_cdss.py $input_fasta $input_format $table $ftype $ends $mode $min_len $strand $out_nuc_file $out_prot_file

Takes ten command line options, input sequence filename, format, genetic
code, CDS vs ORF, end type (open, closed), selection mode (all, top, one),
minimum length (in amino acids), strand (both, forward, reverse), output
nucleotide filename, and output protein filename.
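If wget is not available, the same download-and-check can be approximated from Python itself. The following is a minimal sketch assuming Python 3 and its standard library only; `save_raw_file` is a hypothetical helper name, and the "#!" test simply mirrors the head check above.

```python
from urllib.request import urlopen

# The raw link given earlier in this thread.
RAW_URL = (
    "https://raw.githubusercontent.com/peterjc/pico_galaxy/master/"
    "tools/get_orfs_or_cdss/get_orfs_or_cdss.py"
)


def save_raw_file(url, filename):
    """Download `url` to `filename` and return the first line of the file.

    Raises ValueError if the first line is not a "#!" line, which usually
    means an HTML page was saved instead of the plain text script.
    """
    with urlopen(url) as response:
        data = response.read()
    with open(filename, "wb") as handle:
        handle.write(data)
    lines = data.splitlines()
    first_line = lines[0].decode("utf-8", "replace") if lines else ""
    if not first_line.startswith("#!"):
        raise ValueError("%s does not start with '#!': %r" % (filename, first_line))
    return first_line
```

Calling `save_raw_file(RAW_URL, "get_orfs_or_cdss.py")` should return the `#!/usr/bin/env python` line shown in the head output if the download worked.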

Finding the Longest ORF for all sequences in EMBOSS
