**atcghelix** · 09-26-2013, 09:47 PM

Here's one way using Perl. Save the text in a file named numbers.pl (or whatever). Usage would be:

perl numbers.pl --in file_to_change.fasta --out revised_file.fasta

Code:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my $inFile;
my $outFile;

GetOptions  ("in=s"      => \$inFile,
             "out=s"      => \$outFile);

if (!$inFile or !$outFile) {
    die "Must supply both infile and outfile as command line arguments.\n";
}

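# Open the input first so a bad path fails before anything is written.
open(my $inFH, "<", $inFile) or die "Couldn't open $inFile for reading: $!\n";
if (-e $outFile) {
    die "Output file $outFile already exists--aborting so you don't overwrite.\n";
}
open(my $outFH, ">", $outFile) or die "Couldn't open $outFile for writing: $!\n";

# Append a running number to every header line (lines starting with ">").
my $counter = 1;
while (my $line = <$inFH>) {
    chomp $line;
    if ($line =~ /^(>.*)/) {
        print $outFH $1 . "_$counter\n";
        $counter++;
    } else {
        print $outFH "$line\n";
    }
}

close($inFH);
close($outFH) or die "Couldn't close $outFile: $!\n";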

**Jeremy** · 09-26-2013, 10:06 PM

Here's another way, in R:

Code:

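library(seqinr)
fa <- read.fasta("fastafile.fa")
write.fasta(sequences = fa, names = paste(getName(fa), seq_along(fa), sep = "_"), file.out = "fa_new_name.fa")

Here seq_along(fa) numbers the sequences from 1 to n, n being the number of sequences you have, so there is no need to hard-code a range such as '1:5'.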

**garethboy** · 09-26-2013, 10:31 PM

Anyone know how to use AWK to do this task?

**garethboy** · 09-26-2013, 10:32 PM

Thanks. I am pretty weak in Perl. Do you have any idea how to do this using AWK?

**atcghelix** · 09-26-2013, 10:49 PM

What version of Awk are you running/what operating system?

**garethboy** · 09-26-2013, 11:03 PM

I'm running UNIX.

**atcghelix** · 09-26-2013, 11:26 PM

Does this work? (It assumes every sequence string is on a single line, so headers fall on odd-numbered lines and (NR+1)/2 gives the record number.)

Code:

awk '{if($0 ~ /^>/){print $0"_"(NR+1)/2}else{print $0}}' input.fasta > changed.fasta
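If some sequences are wrapped over several lines, the NR arithmetic above no longer lines up with the headers. A minimal sketch that counts headers directly instead; the function name number_headers and the sample records are made up for illustration, not taken from this thread:

```shell
# number_headers is a hypothetical helper name.
# It keeps a running header count (n) instead of deriving it from NR,
# so wrapped (multi-line) sequences do not throw the numbering off.
number_headers() {
    awk '/^>/ { print $0 "_" (++n); next } { print }' "$@"
}

# Made-up sample: seqA is wrapped over two lines.
printf '>seqA\nACGT\nTTGA\n>seqB\nGGCC\n' | number_headers
```

On that sample the headers should come out as >seqA_1 and >seqB_2 with the sequence lines unchanged. For a real file it would be used as: number_headers input.fasta > changed.fasta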

**Kennels** · 09-26-2013, 11:33 PM

Try this:

Code:

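paste - - < input.fa | awk -F'\t' '{ print $1"_"NR"\n"$2 }' > output.fa

Make sure to leave a space between the two hyphens for 'paste'. It joins each header with the line after it using a tab, so this also assumes every sequence is on a single line. Splitting on the tab with -F'\t' keeps headers that contain spaces in one piece.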
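Since this pipeline and the NR-based awk one-liner earlier in the thread both rely on a strict header/sequence two-line layout, it may be worth checking that first. A rough sketch of such a check; is_two_line_fasta is a hypothetical helper name and the sample records are made up:

```shell
# is_two_line_fasta is a hypothetical helper name.
# Exit status 0 only if every odd line is a header, every even line
# is not, and the total line count is even.
is_two_line_fasta() {
    awk 'NR % 2 == 1 && !/^>/ { bad = 1 }
         NR % 2 == 0 &&  /^>/ { bad = 1 }
         END { exit (bad || NR % 2) }' "$@"
}

# Made-up samples: the first is strictly two lines per record,
# the second has a wrapped sequence.
printf '>seqA\nACGT\n>seqB\nGGCC\n' | is_two_line_fasta && echo "two-line layout"
printf '>seqA\nACGT\nTTGA\n>seqB\nGGCC\n' | is_two_line_fasta || echo "wrapped sequences present"
```

If the check fails, a counter-based approach such as the Perl script at the top of the thread is the safer choice.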

**garethboy** · 09-27-2013, 01:27 AM

Thank you everybody. I have done my task. =)

problem with adding numerical sequence at the end of line

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News