SEQanswers

Old 03-13-2012, 12:51 AM   #21
julianeishida
Junior Member
 
Location: Japan

Join Date: Mar 2012
Posts: 1

Thanks.

I didn't know about Biopieces. It is really useful. Highly recommended for those with limited programming experience.
Old 03-13-2012, 03:21 AM   #22
swaraj
Member
 
Location: Naples, Italy

Join Date: Feb 2012
Posts: 50

A quick way to do this in BioPerl:
http://biostar.stackexchange.com/que...3gb-fasta-file
Old 05-06-2012, 09:40 PM   #23
pjyoti
Junior Member
 
Location: india

Join Date: Nov 2011
Posts: 2

Hello everyone,

I am using the following Perl script to retrieve sequences in FASTA format:


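use strict;
use warnings;
use Bio::Perl;

my $database = "genbank";
my $format   = "fasta";
my $pipe     = "\\|";   # regex that matches a literal "|"
my $space    = " ";

# 1.txt holds one identifier per line.
open(my $in, "<", "1.txt") or die "Cannot open 1.txt: $!";
open(my $chk, ">>", "checking.txt") or die "Cannot open checking.txt: $!";
while (my $line = <$in>)
{
    chomp($line);
    # Each substitution changes only the first match on the line.
    $line =~ s/$space/:/;       # first space becomes ":"
    $line =~ s/$pipe/$space/;   # first "|" becomes a space
    $line =~ s/g/G/;            # first "g" becomes "G"
    $line =~ s/i/I/;            # first "i" becomes "I"
    my $id = $line;

    # Fetch the record over the network and append it to sequences_1.txt.
    my $sequence = get_sequence($database, $id);
    my $test = write_sequence(">>sequences_1.txt", $format, $sequence);

    # Log the return value of write_sequence for checking.
    print $chk "$test\n";
}
close($chk);
close($in);
exit;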



After retrieving some sequences, I get the following error message:

-----------Exception-------------
MSG: WebDBSeqI Request Error:
HTTP/1.1 502 Bad Gateway
connection: close
Date:
.
.
.
.
.
.
<?xml version="1.0" encoding="ISO-8859-1"?




The proxy server received an invalid response from an upstream server.


Please help me out.
Old 07-12-2012, 04:54 AM   #25
vivek
Junior Member
 
Location: India

Join Date: Sep 2010
Posts: 2

Dear ......,

I followed the same steps, but it is not working.

Vivek

Quote:
Originally Posted by apc2010
If you need sequences extracted from a multi-FASTA and are open to using a pre-existing tool, I would also suggest either the faSomeRecords or faOneRecord command line utilities from UCSC.

They have versions of this tool for OSX and Linux. Here is a link to the executable downloads:

http://hgdownload.cse.ucsc.edu/admin/exe/

The difference between the two: faOneRecord takes the sequence name to extract from the command line, faSomeRecords reads in a file of 1 or more sequence names to extract from the multi-FASTA.

Usage:
Code:
================================================================
========   faOneRecord   ====================================
================================================================
faOneRecord - Extract a single record from a .FA file
usage:
   faOneRecord in.fa recordName

================================================================
========   faSomeRecords   ====================================
================================================================
faSomeRecords - Extract multiple fa records
usage:
   faSomeRecords in.fa listFile out.fa
options:
   -exclude - output sequences not in the list file.
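
If the UCSC binaries are not available for your platform, the selection that faSomeRecords performs can be approximated with awk. This is only a rough sketch and not the UCSC implementation. The function name fa_some_records is made up for illustration, and the sketch assumes that a record's name is the first whitespace-delimited word after the ">" and that the list file holds one bare name per line.

```shell
# fa_some_records IN.FA LISTFILE   (hypothetical helper, not a UCSC tool)
# Prints every FASTA record whose name appears in LISTFILE.
fa_some_records() {
    awk '
        # First file (the list): remember each wanted name.
        NR == FNR { if ($1 != "") wanted[$1] = 1; next }
        # Header line: strip the ">" and decide whether to keep this record.
        /^>/ { name = substr($1, 2); keep = (name in wanted) }
        # Print the header and sequence lines of records being kept.
        keep
    ' "$2" "$1"
}

# Example (placeholder file names):
#   fa_some_records in.fa listFile > out.fa
```

The list file is passed to awk first so that all wanted names are known before the FASTA file is read. The -exclude behaviour is not covered by this sketch.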
__________________
Vivek Keshri
Old 01-31-2013, 01:27 PM   #26
yzzhang
Member
 
Location: florida

Join Date: Jan 2013
Posts: 66

If the list file does not contain the > character, faSomeRecords works well.
Quote:
Originally Posted by mghita
I have given up. I replaced the @ with > and it still didn't work. I have combined a little awk and R, and that does my job just fine. Thanks a lot for the effort!

Madalina
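
The point in the reply above is that the list file should hold bare sequence names. If the names were copied straight from FASTA or FASTQ headers, the leading marker can be stripped first. A small sketch, assuming one name per line; clean_name_list is an invented helper name, and headers.txt and listFile are placeholder file names:

```shell
# clean_name_list: read names on stdin, write bare names on stdout.
clean_name_list() {
    # Remove one leading ">" or "@", trim trailing whitespace,
    # and drop lines that end up empty.
    sed -e 's/^[>@]//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Example (placeholder file names):
#   clean_name_list < headers.txt > listFile
#   faSomeRecords in.fa listFile out.fa
```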
Old 12-05-2017, 10:51 PM   #27
ML1975
Junior Member
 
Location: Sydney, Australia

Join Date: Dec 2017
Posts: 3

Quote:
Originally Posted by boetsie
Hi,

I've attached a script which can do this. If I understand it correctly, you have a file like;

>chr1
AGCTGATGATAGT...
>chr2
ACAAAATAGTCGAT....
>chr3
....

And your perl script would be something like;

perl extractSequence.pl genomefile.fa chr1

where 'chr1' corresponds to a sequence named chr1 (indicated by >chr1)?

Say you have a more complicated file like;

>chr1_coverage1000_length100
AGATGTATGTTAGA

You can do something like;

perl extractSequence.pl genomefile.fa chr1_.

which will extract all the sequences containing the header chr1_

To store the results, do;

perl extractSequence.pl genomefile.fa chr1 > filename.txt

If this is what you want, you can use my script.

Boetsie
7 years later and I have used your script. Thanks for sharing. Works a treat!
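
The attachment itself is not reproduced in this thread. For readers who only need the behaviour described in the quoted post, here is a rough awk sketch. It is not boetsie's script. It prints every record whose header matches a pattern. The function name extract_matching is invented for illustration, and the pattern is treated as an awk regular expression matched against the header text after the ">".

```shell
# extract_matching IN.FA PATTERN   (hypothetical helper)
# Prints each FASTA record whose header matches PATTERN.
extract_matching() {
    awk -v pat="$2" '
        # Header line: test the text after ">" against the pattern.
        /^>/ { keep = (substr($0, 2) ~ pat) }
        # Print header and sequence lines of matching records.
        keep
    ' "$1"
}

# Example, mirroring the commands in the quoted post:
#   extract_matching genomefile.fa chr1_ > filename.txt
```

Because this is a substring match, a pattern such as chr1 would also select chr10 or chr11. Anchoring it, for example '^chr1$', restricts the match to a header that is exactly chr1.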
All times are GMT -8.