Thread | Thread Starter | Forum | Replies | Last Post |
Any script to format headers in fasta files? | Shishir | Bioinformatics | 2 | 02-05-2013 07:52 AM |
what are the headers on the nucmer .snp file? | Kirstin | General | 1 | 09-24-2012 04:34 AM |
samtools sam file no headers | Irina Pulyakhina | Bioinformatics | 1 | 11-02-2011 07:34 AM |
finding exon numbers in fasta exon file | efoss | Bioinformatics | 1 | 10-20-2011 04:57 PM |
Replacing FASTA headers for TopHat & Cufflinks | brachysclereid | Bioinformatics | 2 | 02-16-2011 05:44 AM |
#1 |
Member
Location: ITALY Join Date: Oct 2010
Posts: 89
Hi all,
I have a FASTA file in which every sequence has the same header, and I would like to append a running number to the end of each header line:
>OakDna
ACTCTAAATCAGTGCGAG...
>OakDna
AAAAACCCTTTACACTTT...
>OakDna
CTCTAAACCTTTAACCTT..
etc.

I want something like this:
>OakDna_1
ACTCTAAATCAGTGCGAG...
>OakDna_2
AAAAACCCTTTACACTTT...
>OakDna_3
CTCTAAACCTTTAACCTT..
etc.
>OakDna_n
ACTCATCCAAAACTTTTT..

where n is the number of the last sequence in the file. Any quick suggestion?

Thanks in advance,
Giorgio
#2 |
Senior Member
Location: St. Louis Join Date: Dec 2010
Posts: 535
Google is your friend in situations like these:
http://www.linuxquestions.org/questi...f-line-803625/
#3 |
Member
Location: ITALY Join Date: Oct 2010
Posts: 89
I tried googling it but couldn't find what I was looking for. The link you posted looks good, though. Thanks a lot!
#4 |
Senior Member
Location: germany Join Date: Oct 2009
Posts: 140
I'd just write a simple program to do it.
Five minutes, maybe?
#5 |
Junior Member
Location: Germany Join Date: Jun 2012
Posts: 6
Here is a quick and dirty solution in Python, since one was still missing :-)
Code:
#!/usr/bin/env python
import re
import sys

# Read the whole input file into memory
infile = open(sys.argv[1])
data = infile.readlines()
infile.close()

outfile = open(sys.argv[2], "w")
c = 1  # line counter
l = 1  # header counter
for i in data:
    i = re.sub("\n|\r", "", i)  # strip line endings
    if c % 2 != 0:
        # Odd-numbered lines are treated as headers; this assumes every
        # record is exactly one header line plus one sequence line
        outfile.write(i + "_" + str(l) + "\n")
        l += 1
    else:
        outfile.write(i + "\n")
    c += 1
outfile.close()

Save the code above in a file called, for example, "numberFasta.py". On a terminal, call the program with:
python numberFasta.py <yourInfile> <outfilename>
#6 |
Junior Member
Location: Germany Join Date: Jun 2012
Posts: 6
Ah, and double-check the indentation; it got lost when I posted.
#7 | |
Senior Member
Location: St. Louis Join Date: Dec 2010
Posts: 535
Quote:
Code:
#!/usr/bin/env python
import re
import sys

infile = open(sys.argv[1])
data = infile.readlines()
infile.close()

outfile = open(sys.argv[2], "w")
c = 1
l = 1
for i in data:
    i = re.sub("\n|\r", "", i)
    if c % 2 != 0:
        outfile.write(i + "_" + str(l) + "\n")
        l += 1
    else:
        outfile.write(i + "\n")
    c += 1
outfile.close()
#8 |
Just a member
Location: Southern EU Join Date: Nov 2012
Posts: 103
A short one:
Code:
awk '/^>/{$0=$0"_"(++i)}1' infile
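
For readers less familiar with awk: the pattern `/^>/` selects header lines, the action appends "_" plus an incremented counter to the line, and the trailing `1` prints every line. Because it keys on the leading ">" rather than on line position, it also copes with sequences wrapped over several lines. A rough Python sketch of the same idea follows; the function name `number_headers` and the script name in the usage comment are made up for illustration, and unlike the awk command this version also strips carriage returns.

```python
import sys


def number_headers(lines):
    """Yield each line, appending "_<n>" to every FASTA header line."""
    count = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Header line: bump the counter and append it
            count += 1
            line = f"{line}_{count}"
        yield line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python number_headers.py infile > outfile
    with open(sys.argv[1]) as handle:
        for out_line in number_headers(handle):
            print(out_line)
```

Since the file is read line by line, this should also work on large FASTA files without loading them into memory.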