
Giorgio C 07-03-2013 02:32 PM

Add count numbers to headers in a fasta file
 
Hi all,

I have a fasta file in which every sequence has the same header, and I would like to add a running number to the end of each header line:

>OakDna
ACTCTAAATCAGTGCGAG...
>OakDna
AAAAACCCTTTACACTTT...
>OakDna
CTCTAAACCTTTAACCTT..
etc.

I want something like this:

>OakDna_1
ACTCTAAATCAGTGCGAG...
>OakDna_2
AAAAACCCTTTACACTTT...
>OakDna_3
CTCTAAACCTTTAACCTT..
etc.
>OakDna_n
ACTCATCCAAAACTTTTT..

where n is the number of the last sequence in the file.

Any quick suggestions?

Thanks in advance,
Giorgio

Heisman 07-03-2013 04:42 PM

Google is your friend in situations like these:

http://www.linuxquestions.org/questi...f-line-803625/

Giorgio C 07-03-2013 05:56 PM

I tried to google it, but couldn't find what I was looking for. Btw the link you posted seems to be good...THANKS a lot!

gsgs 07-04-2013 06:30 AM

I'd just write a simple program to do it

5 min ?

wieni 07-04-2013 07:00 AM

Here is a quick and dirty solution in Python, since none had been posted yet :-)


#!/usr/bin/env python

import re
import string
import sys


infile = open(sys.argv[1])
data = infile.readlines()
infile.close()

outfile = open(sys.argv[2], "w")
c = 1
l = 1
for i in data:
i = re.sub("\n|\r", "", i)
if c%2 != 0:
outfile.write(i+"_" +str(l) +"\n")
l+=1
else:
outfile.write(i +"\n")
c += 1
outfile.close()


Save the code above in a file called, for example, "numberFasta.py".
Then run it from a terminal with: python numberFasta.py <yourInfile> <outfilename>

wieni 07-04-2013 07:00 AM

Ah, and correct the indentation; it was lost when posting here...

Heisman 07-04-2013 10:19 AM

Quote:

Originally Posted by wieni (Post 109536)
Here a quick and dirty solution in python - was still missing :-)

You can use the "code" tags to make this work: surround the code with [code ] and [/code ] (but without the spaces):

Code:

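#!/usr/bin/env python
# Append "_<n>" to every header line of a FASTA file.
# Assumes each record is exactly two lines: a header, then the sequence.

import re
import sys


# Read the whole input file into memory.
infile = open(sys.argv[1])
data = infile.readlines()
infile.close()

outfile = open(sys.argv[2], "w")
c = 1  # current line number (1-based)
l = 1  # number to append to the next header
for i in data:
    i = re.sub("\n|\r", "", i)  # strip line endings
    if c%2 != 0:
        # odd line: header, gets the running number
        outfile.write(i+"_" +str(l) +"\n")
        l+=1
    else:
        # even line: sequence, written unchanged
        outfile.write(i +"\n")
    c += 1
outfile.close()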


syfo 07-08-2013 03:06 AM

A short one:

Code:

awk '/^>/{$0=$0"_"(++i)}1' infile
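
The awk one-liner above appends a running counter to every line that starts with ">" and prints everything else unchanged, so it does not depend on each sequence sitting on a single line the way a script that numbers every odd line does. A minimal Python sketch of the same idea follows; the names number_headers and number_fasta are hypothetical and chosen only for illustration, and the sketch assumes a plain-text FASTA file small enough to stream line by line.

```python
def number_headers(lines):
    """Yield FASTA lines, appending "_<n>" to each header line.

    A header is any line starting with ">"; all other lines
    (sequence data, possibly wrapped over several lines) pass
    through unchanged apart from having their line ending removed.
    """
    count = 0
    for line in lines:
        line = line.rstrip("\r\n")  # drop Unix or Windows line endings
        if line.startswith(">"):
            count += 1
            line = f"{line}_{count}"
        yield line


def number_fasta(in_path, out_path):
    """Hypothetical helper: read in_path, write numbered headers to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in number_headers(src):
            dst.write(line + "\n")
```

Calling number_fasta("infile", "outfile") would then play the same role as redirecting the awk output to a file. Keying on the ">" character rather than on line position is the design choice that lets wrapped sequences through untouched.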


All times are GMT -8.