Unconfigured Ad

**SylvainL** · 10-02-2015, 03:51 AM

Using R, quite easy and fast...

library(Biostrings)

all_fasta <- read.DNAStringSet("combined.fasta") ## You have to give the path to your file combined.fasta

id_symb <- scan("id_symb", what="character", sep="\n")

symbFasta <- all_fasta[names(all_fasta) %in% id_symb]
hostFasta <- all_fasta[! names(all_fasta) %in% id_symb]

**GenoMax** · 10-02-2015, 04:11 AM

Another option is Jim Kent's faSomeRecords utility. I am linking the linux version but he has source/OS X executables available as well.

http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/faSomeRecords

faSomeRecords - Extract multiple fa records
usage:
faSomeRecords in.fa listFile out.fa
options:
-exclude - output sequences not in the list file.

**DrYak** · 10-02-2015, 04:39 AM

faSomeRecords = lifesaver

Thank you so much genomax - that took less than 2 seconds for each file. The other way was still chugging away after two days...

Topics	Statistics	Last Post
Large-Scale Protein Screen Uncovers Hidden Regulators of Alternative Polyadenylation by SEQadmin2 Started by SEQadmin2, 06-26-2026, 11:10 AM	0 responses 12 views 0 reactions	Last Post by SEQadmin2 06-26-2026, 11:10 AM
Whole-Genome Sequencing Traces Faroe Islands Ancestry to a North Atlantic Founder Population by SEQadmin2 Started by SEQadmin2, 06-17-2026, 06:09 AM	0 responses 48 views 0 reactions	Last Post by SEQadmin2 06-17-2026, 06:09 AM
Sequencing the Two-Toed Sloth Genome Reveals Jumping Genes Tied to Its Extreme Metabolism by SEQadmin2 Started by SEQadmin2, 06-09-2026, 11:58 AM	0 responses 107 views 0 reactions	Last Post by SEQadmin2 06-09-2026, 11:58 AM
A New Method Makes Hantavirus Genome Analysis Faster and More Accessible by SEQadmin2 Started by SEQadmin2, 06-05-2026, 10:09 AM	0 responses 125 views 0 reactions	Last Post by SEQadmin2 06-05-2026, 10:09 AM

Unconfigured Ad

Subsetting a fasta file based on a set of BLAST results

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News