SEQanswers

09-26-2013, 10:29 PM   #1
garethboy
problem with adding numerical sequence at the end of line

Hi,

Does anyone have any idea how to get this:

>no_name
TATGCATCGATGCACATATGCTAGTGCGCTAGTGTCGAGGCTAGCTACG
>no_name
GACGTACGTAGCATGCATGCATGCGTAGCTGTAGCTAGC
>no_name
GCTAGCTAGGTAGGTCATGTAGTAGGTGCACTGAGCTAGCTAGCTAGCTAGCAGC
>no_name
GCTAGCATGCTAGCTAGCTAGCACTAGCTAGCTAGCTAGCTAATGCATCATC
>no_name
GCTACGTAGCATGCTAGCGGATCATGCATGCATGCTAGCATCGATGCTAGCATGCAT

to become this:

>no_name_1
TATGCATCGATGCACATATGCTAGTGCGCTAGTGTCGAGGCTAGCTACG
>no_name_2
GACGTACGTAGCATGCATGCATGCGTAGCTGTAGCTAGC
>no_name_3
GCTAGCTAGGTAGGTCATGTAGTAGGTGCACTGAGCTAGCTAGCTAGCTAGCAGC
>no_name_4
GCTAGCATGCTAGCTAGCTAGCACTAGCTAGCTAGCTAGCTAATGCATCATC
>no_name_5
GCTACGTAGCATGCTAGCGGATCATGCATGCATGCTAGCATCGATGCTAGCATGCAT

09-26-2013, 10:47 PM   #2
atcghelix

Here's one way using Perl. Save the text in a file named numbers.pl (or whatever). Usage would be:

perl numbers.pl --in file_to_change.fasta --out revised_file.fasta


Code:
#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my $inFile;
my $outFile;

GetOptions  ("in=s"      => \$inFile,
             "out=s"      => \$outFile);

if (!$inFile or !$outFile) {
    die "Must supply both infile and outfile as command line arguments.\n";
}

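# Open the input first, and refuse to clobber an existing output file.
open(my $inFH, "<", $inFile) or die "Couldn't open $inFile for reading: $!\n";
if (-e $outFile) {
    die "Output file $outFile already exists--aborting so you don't overwrite.\n";
}
open(my $outFH, ">", $outFile) or die "Couldn't open $outFile for writing: $!\n";

# Append _1, _2, ... to each header line; copy sequence lines unchanged.
my $counter = 1;
while (my $line = <$inFH>) {
    chomp $line;
    if ($line =~ /^>/) {
        print $outFH "${line}_$counter\n";
        $counter++;
    } else {
        print $outFH "$line\n";
    }
}

close($inFH);
close($outFH) or die "Couldn't finish writing $outFile: $!\n";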

Last edited by atcghelix; 09-26-2013 at 10:57 PM. Reason: Edited to move $counter++ so that you didn't just get odd-numbered sequences

09-26-2013, 11:06 PM   #3
Jeremy

Here's another way, using R:

Code:
library(seqinr)
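fa <- read.fasta("fastafile.fa")
write.fasta(fa, names = paste(getName(fa), seq_along(fa), sep = "_"), file.out = "fa_new_name.fa")
seq_along(fa) numbers the sequences 1 to n, so the same lines work for any number of sequences. Note that read.fasta converts DNA to lower case by default; pass forceDNAtolower=FALSE to keep the original case.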

09-26-2013, 11:31 PM   #4
garethboy

Does anyone know how to do this with AWK?

09-26-2013, 11:32 PM   #5
garethboy

Thanks. I am pretty weak in Perl. Do you have any idea how to do this with AWK?

09-26-2013, 11:49 PM   #6
atcghelix

What version of Awk are you running, and on what operating system?

09-27-2013, 12:03 AM   #7
garethboy

I'm running it on UNIX.

09-27-2013, 12:26 AM   #8
atcghelix

Does this work? (It assumes each sequence is on a single line, so that headers always fall on odd-numbered lines.)

Code:
awk '{if($0 ~ /^>/){print $0"_"(NR+1)/2}else{print $0}}' input.fasta > changed.fasta

Last edited by atcghelix; 09-27-2013 at 12:33 AM. Reason: Less confusing regex
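
If a sequence is wrapped over several lines, headers no longer fall on every second line and (NR+1)/2 stops giving 1, 2, 3, ... Here is a minimal sketch that counts header lines directly instead. The function name number_fasta_headers and the awk variable n are made up for illustration and do not come from the posts above.

```shell
# Hypothetical helper: append _1, _2, ... to every FASTA header read from
# stdin, using a counter of header lines (n) rather than the line number NR.
number_fasta_headers() {
    awk '/^>/ { print $0 "_" (++n); next } { print }'
}
```

It could be used as: number_fasta_headers < input.fasta > changed.fasta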

09-27-2013, 12:33 AM   #9
Kennels

Try this:

Code:
paste - - < input.fa | awk ' { print $1"_"NR"\n"$2 } ' > output.fa
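Make sure to put a space between the two hyphens for 'paste'. This assumes each record is exactly two lines (a header and one sequence line) and that the header contains no whitespace, since awk splits fields on whitespace by default.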
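
A sketch of a variant that splits only on the tab that paste inserts between the two joined lines, so headers containing spaces stay intact. It still assumes two lines per record, and number_paired_fasta is a made-up name for illustration.

```shell
# Hypothetical helper: join each header/sequence pair with a tab, then let
# awk split on that tab only, so spaces inside the header are preserved.
number_paired_fasta() {
    paste - - | awk -F'\t' '{ print $1 "_" NR; print $2 }'
}
```

It could be used as: number_paired_fasta < input.fa > output.fa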

09-27-2013, 02:27 AM   #10
garethboy

Thank you everybody. I have done my task. =)

All times are GMT -8.