SEQanswers

Old 07-03-2013, 03:32 PM   #1
Giorgio C
Member
 
Location: ITALY

Join Date: Oct 2010
Posts: 89
Add count numbers to headers in a fasta file

Hi all,

I have a fasta file in which every sequence has the same header. I would like to add a running number at the end of each header line:

>OakDna
ACTCTAAATCAGTGCGAG...
>OakDna
AAAAACCCTTTACACTTT...
>OakDna
CTCTAAACCTTTAACCTT..
etc.

I want something like this:

>OakDna_1
ACTCTAAATCAGTGCGAG...
>OakDna_2
AAAAACCCTTTACACTTT...
>OakDna_3
CTCTAAACCTTTAACCTT..
etc.
>OakDna_n
ACTCATCCAAAACTTTTT..

where n is the number of the last sequence in the file.

Any quick suggestion?

Thanks in advance,
Giorgio
Old 07-03-2013, 05:42 PM   #2
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

Google is your friend in situations like these:

http://www.linuxquestions.org/questi...f-line-803625/
Old 07-03-2013, 06:56 PM   #3
Giorgio C
Member
 
Location: ITALY

Join Date: Oct 2010
Posts: 89

I tried to google it, but couldn't find what I was looking for. Btw the link you posted seems to be good...THANKS a lot!
Old 07-04-2013, 07:30 AM   #4
gsgs
Senior Member
 
Location: germany

Join Date: Oct 2009
Posts: 140

I'd just write a simple program to do it

5 min ?
Old 07-04-2013, 08:00 AM   #5
wieni
Junior Member
 
Location: Germany

Join Date: Jun 2012
Posts: 6

Here is a quick and dirty solution in Python, which was still missing :-)


#!/usr/bin/env python

import re
import string
import sys


infile = open(sys.argv[1])
data = infile.readlines()
infile.close()

outfile = open(sys.argv[2], "w")
c = 1
l = 1
for i in data:
i = re.sub("\n|\r", "", i)
if c%2 != 0:
outfile.write(i+"_" +str(l) +"\n")
l+=1
else:
outfile.write(i +"\n")
c += 1
outfile.close()


Save the code above in a file called, for example, "numberFasta.py".
On a terminal, call the program with: python numberFasta.py <yourInfile> <outfilename>
Old 07-04-2013, 08:00 AM   #6
wieni
Junior Member
 
Location: Germany

Join Date: Jun 2012
Posts: 6

Ah, and correct the indentation; it was lost here...
Old 07-04-2013, 11:19 AM   #7
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

Quote:
Originally Posted by wieni View Post
Here is a quick and dirty solution in Python, which was still missing :-)
You can use the "code" tags to make this work: surround the code with [code ] and [/code ] (but without the spaces):

Code:
#!/usr/bin/env python

import re
import sys

# Read the whole input file into memory.
infile = open(sys.argv[1])
data = infile.readlines()
infile.close()

outfile = open(sys.argv[2], "w")
c = 1  # line counter
l = 1  # header counter
for i in data:
    i = re.sub("\n|\r", "", i)  # strip line endings
    # Assumes one sequence line per record, so odd lines are headers.
    if c%2 != 0:
        outfile.write(i+"_" +str(l) +"\n")
        l+=1
    else:
        outfile.write(i +"\n")
    c += 1
outfile.close()
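
The script above treats every odd-numbered line as a header, so it only works when each sequence sits on exactly one line. A minimal sketch that keys on the leading ">" instead, and so should also cope with sequences wrapped over several lines, might look like this. The function name number_headers and the sample records are made up for illustration; they are not from the original post.

```python
def number_headers(lines):
    """Yield FASTA lines, appending _1, _2, ... to each header line."""
    count = 0
    for line in lines:
        line = line.rstrip("\r\n")  # drop Unix or Windows line endings
        if line.startswith(">"):
            # Header line: number it, whatever its position in the file.
            count += 1
            yield "%s_%d" % (line, count)
        else:
            # Sequence line: pass through unchanged.
            yield line


# Hypothetical sample input with one wrapped sequence.
records = [">OakDna", "ACTC", "TAAA", ">OakDna", "AAAA"]
numbered = list(number_headers(records))
print("\n".join(numbered))
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, which avoids reading the whole file into memory first.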
Old 07-08-2013, 04:06 AM   #8
syfo
Just a member
 
Location: Southern EU

Join Date: Nov 2012
Posts: 103

A short one:

Code:
awk '/^>/{$0=$0"_"(++i)}1' infile
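
For readers less familiar with awk: the pattern /^>/ selects header lines, the action appends "_" plus a pre-incremented counter to the whole line, and the trailing 1 prints every line. Since the suffix goes at the very end of the line, a header that carries a description (for example ">OakDna some text") would get its number after the description. Below is a rough Python sketch that puts the counter right after the identifier instead. The name number_ids is hypothetical, and treating the first whitespace-delimited token as the identifier is my assumption, not something stated in the thread.

```python
def number_ids(lines):
    """Yield FASTA lines with _N inserted after the ID of each header."""
    count = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            count += 1
            # Split into the ID token and an optional description.
            parts = line.split(None, 1)
            header = "%s_%d" % (parts[0], count)
            if len(parts) == 2:
                # Note: the original separator is replaced by one space.
                header += " " + parts[1]
            yield header
        else:
            yield line


# Hypothetical sample input: one header with a description, one without.
sample = [">OakDna oak sample", "ACGT", ">OakDna", "TTTT"]
result = list(number_ids(sample))
print("\n".join(result))
```

For headers without a description, as in the original question, this should give the same result as the awk one-liner.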