08-08-2014, 01:00 AM   #2
ebioman
 
A one-liner for a simple genome

This thread is pretty old, but the problem might still persist.

I took the final GLIMMER result file, which has this structure:

Code:
GLIMMER (ver. 3.02; iterated)  predictions:
 orfID      start     end  frame  score
--------    -----    -----  --    -----
>FASTA_HEADER
orf00003      560       63  -3     3.04
orf00004      865      752  -2     5.42
orf00010     2199     4055  +3     3.14
orf00027     4028     9019  +2     3.06

I believe this might do it:

Code:
grep ^orf output.glimmer.raw | awk '{OFS="\t"}{strang = "+"}{if($4 < 0) strang="-"}{gsub(/[+-]/," ")}{print "FASTA_HEADER", "GLIMMER", "gene" , $2 , $3, $5, strang , $4, "ID="$1"; NOTE:GLIMMER ORF prediction;"}'
Result:
Code:
FASTA_HEADER        GLIMMER gene    560     63      3.04    -       3       ID=orf00003; NOTE:GLIMMER ORF prediction;
FASTA_HEADER        GLIMMER gene    865     752     5.42    -       2       ID=orf00004; NOTE:GLIMMER ORF prediction;
FASTA_HEADER        GLIMMER gene    2199    4055    3.14    +       3       ID=orf00010; NOTE:GLIMMER ORF prediction;
FASTA_HEADER        GLIMMER gene    4028    9019    3.06    +       2       ID=orf00027; NOTE:GLIMMER ORF prediction;
This has to be adapted, especially if you have multiple chromosomes/contigs, because the sequence name is hard-coded as FASTA_HEADER. I only had a single circular genome.
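
For several contigs, a minimal sketch could look like the one below. It assumes that every contig in the GLIMMER file starts with its own ">" header line, as in the structure shown above, and that the goal is GFF3-style output. The function name glimmer_to_gff is made up for illustration. Unlike the one-liner, it swaps the coordinates of reverse-strand ORFs so that start <= end, and it writes "." in column 8, because GFF3 defines that column as the phase (0, 1, 2 or "."), not the reading frame. ORFs that wrap around the origin of a circular sequence are not handled and would need separate treatment.

```shell
# Hypothetical helper (not part of GLIMMER): reads GLIMMER3 predictions on
# stdin and writes nine-column, GFF3-style lines to stdout.
glimmer_to_gff() {
  awk 'BEGIN { OFS = "\t" }
       # A ">" line starts a new contig: remember its first word as the sequence ID
       /^>/   { seqid = substr($1, 2); next }
       # ORF lines: orfID start end frame score
       /^orf/ {
         strand = ($4 ~ /^-/) ? "-" : "+"      # sign of the frame gives the strand
         lo = $2 + 0; hi = $3 + 0              # force numeric comparison
         if (lo > hi) { tmp = lo; lo = hi; hi = tmp }   # assumption: GFF3 wants start <= end
         print seqid, "GLIMMER", "gene", lo, hi, $5, strand, ".", \
               "ID=" $1 ";Note=GLIMMER ORF prediction"
       }'
}
```

With the file name used above, this would be called as glimmer_to_gff < output.glimmer.raw > output.gff, where output.gff is just an example name. Note that orf IDs can repeat between contigs, so the ID attribute may need the contig name prepended if it has to be unique.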