SEQanswers

Old 02-02-2015, 01:58 AM   #1
Anti
Junior Member
 
Location: Italy

Join Date: Sep 2014
Posts: 5
How to Extract Multiple Sequences from a Multi-FASTA File by ID List

Hi,
I have a list of IDs in a .txt file and a multi-FASTA file with sequences. I need to extract the sequences whose IDs are in the list.

Can you help me, please?
Old 02-02-2015, 03:16 AM   #2
mike.t
Member
 
Location: Spain

Join Date: Mar 2010
Posts: 36

I think you can do that using seqret, which is part of EMBOSS. According to the documentation, the parameter -iquery1 can be used to specify a list of IDs, although probably not a file with IDs...
Old 02-02-2015, 04:46 AM   #3
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Do you program? You can do that with a few lines using a library like Biopython.
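
As a rough illustration of what "a few lines" with Biopython might look like, here is a minimal sketch. It assumes the ID file holds one ID per line without the leading ">", and that each ID equals the first whitespace-delimited word of a FASTA header (which is what Biopython exposes as record.id). The function names and file names are made up for the example.

```python
def parse_ids(lines):
    """Return the set of non-empty, whitespace-stripped IDs, one per line."""
    return {line.strip() for line in lines if line.strip()}


def select_records(records, wanted):
    """Yield only the records whose .id is in the wanted set."""
    return (rec for rec in records if rec.id in wanted)


def extract_with_biopython(ids_path, fasta_in, fasta_out):
    """Write the selected records to fasta_out; return how many were written."""
    # Imported here so the helpers above stay usable without Biopython installed.
    from Bio import SeqIO

    with open(ids_path) as handle:
        wanted = parse_ids(handle)
    records = SeqIO.parse(fasta_in, "fasta")
    return SeqIO.write(select_records(records, wanted), fasta_out, "fasta")
```

Calling extract_with_biopython("id.txt", "seqFile.fasta", "output.fasta") would stream through the input once, so the whole FASTA file never has to fit in memory.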

Alternatively, if you have a local Galaxy you could ask your admin to install one of these tools: http://toolshed.g2.bx.psu.edu/view/p...q_filter_by_id or http://toolshed.g2.bx.psu.edu/view/p...q_select_by_id
Old 02-02-2015, 04:55 AM   #4
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

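If there are no line breaks within the sequences (i.e. each record is exactly one header line followed by one sequence line), then

Code:
grep -A1 -w --no-group-separator -f id.txt seqFile.fasta > output.fasta
should work. The IDs in id.txt have to be identical to the FASTA headers, including the greater-than sign. The --no-group-separator option (GNU grep) stops grep from writing "--" lines between non-adjacent matches into the output.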
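
If the sequences are wrapped over several lines, the one-line-after-the-header approach above no longer captures the whole record. A minimal standard-library Python sketch for that case follows. It assumes one ID per line in the ID file (with or without the leading ">") and matches each ID against the first word of a header. The function names are hypothetical.

```python
def load_ids(lines):
    """Collect IDs, one per line, ignoring blank lines and a leading '>'."""
    ids = set()
    for line in lines:
        token = line.strip().lstrip(">")
        if token:
            ids.add(token.split()[0])
    return ids


def filter_fasta(fasta_lines, wanted):
    """Yield every line of each record whose header ID is in wanted."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # The record ID is the first word after '>'.
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted
        if keep:
            yield line


def extract_to_file(ids_path, fasta_in, fasta_out):
    """Stream fasta_in and write only the wanted records to fasta_out."""
    with open(ids_path) as ids_handle:
        wanted = load_ids(ids_handle)
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        dst.writelines(filter_fasta(src, wanted))
```

For example, extract_to_file("id.txt", "seqFile.fasta", "output.fasta") keeps each selected header together with all of its sequence lines, however many there are.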
__________________
savetherhino.org
Old 02-02-2015, 05:02 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,992

faSomeRecords from Kent utilities is the simplest solution (http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/)

More here: http://seqanswers.com/forums/showpos...0&postcount=13