Old 05-09-2015, 02:18 PM   #1
Zapages
Adding a running Contig number at the start of the FASTA header

I am trying to add a numbered prefix to every header in a FASTA file (30K+ sequences, which I have de novo assembled) in a single pass.

I have previously been able to append a counter to the end of each header, but I am confused about how to do the same at the start of the header.

To append to the end of the header:
Code:
awk '/^>/{$0=$0"_Contig_"(++i)}1' input_file.fasta > output_file.fasta
This changes the headers in the FASTA file like this:

From:
>Sequence_header
ATAGCATA
To:
>Sequence_header_Contig_1
ATAGCATA
...
>Sequence_header_Contig_n
ATAGCATA


I hope to do something like this instead:

From:
>Sequence_header
ATAGCATA
To:
>Contig_1_Sequence_header
ATAGCATA
...
>Contig_n_Sequence_header
ATAGCATA


Code:
awk '/^>/{"Contig_"(++i)"_"$0=$0}1' input_file.fasta > output_file.fasta
Unfortunately, I receive a syntax error.

If someone could kindly show me what I am doing wrong, I would greatly appreciate it.

Many thanks.
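
A possible fix, offered as a sketch rather than the definitive answer: the second command most likely fails because the left-hand side of "=" has to be something assignable (such as $0), and a concatenated string is not. Keeping $0 on the left and building the new header on the right avoids that. The sample file contents below are made up for illustration; the file names are the ones used in this post.

```shell
# Create a tiny sample FASTA file (hypothetical data, for illustration only).
printf '>Sequence_header\nATAGCATA\n>Another_header\nGGCCTTAA\n' > input_file.fasta

# On header lines (those starting with ">"), rebuild the line as
# ">" + "Contig_<n>_" + the original header without its leading ">".
# substr($0, 2) returns the line from its second character onward.
# The trailing 1 is an always-true pattern, so every line is printed.
awk '/^>/{$0=">Contig_"(++i)"_"substr($0,2)}1' input_file.fasta > output_file.fasta

cat output_file.fasta
```

The ">" has to stay as the first character of the line for the file to remain valid FASTA, which is why the prefix goes after it rather than in front of the whole line. An equivalent form that should behave the same way is sub(/^>/, ">Contig_"(++i)"_") in place of the assignment.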

Last edited by Zapages; 05-09-2015 at 05:02 PM.