SEQanswers

01-11-2016, 08:42 PM   #1
Niranjanks
Member
Location: India
Join Date: Aug 2015
Posts: 11

Parsing and extracting from 2 Fasta Files

Hello, I have 2 multi-FASTA files with different headers: a Reference file and a Test file.

Reference file example
>gi|536779208|gb|GANF01000001.1| TSA: Momordica charantia Locus_17026_Transcript_1/1_Confidence_1.000_Length_828 transcribed RNA sequence
CGGGCGTAGCGACGAACGGCGGCGAAGACGACGCTCCAATCGAGGAGGTACTGGTTTTCAATCGCTTCCG
TGAATTAGTTTCGGTCCCTGCGGAGGAAGAGGAATGTTTGGGAGGCAGAGCCGCAACGCCAGGAATGGCG
CTCAAATCGTACTCAGAATCATCGTTGTAGAAACGGGAAGGGGAAGATTGAATCTGGGAGTGAGAATTGG
...

Test file example
>gi|537289490|gb|GANG01000001.1| TSA: Momordica charantia Locus_12460_Transcript_2/3_Confidence_0.400_Length_1699 transcribed RNA sequence
TGTCTGTGTTTTAGAGATATGAAAAGTGTTGGCCTAGTGCCTGATAATGTAATTTATACTATACTTATAG
ATGGGTTTTGTCGAAATGGTGCTATTTCAGATGCTCTGAAAATGCGGGACGAGATGCTTGCTCAGGGCTG
TGTTATGGATGTGGTTGCGTACAATACTATTTTGAATGGGTTATGCAAGAAAAAGATGTATGTTGACGCA
...

These files contain ~51000 entries.

I want to separate out the entries that are similar between the Reference and Test files, preferably with a similarity threshold that I can set (e.g. 95% similar).
The output would ideally be 2 files: the similar entries and the excluded ones.

Can sequence alignment programs like MUMmer do that? Or BLAST? If any program exists that does this, please help me out.

Thank you.
01-11-2016, 11:57 PM   #2
SylvainL
Senior Member
Location: Geneva
Join Date: Feb 2012
Posts: 177

The easiest and fastest way would be to use BLAST locally, with one of your files as the database and the output set to tabular format, and then parse the result yourself. You can easily do that in R, for example.
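
To make the parsing step concrete, here is a minimal sketch in Python rather than R. It assumes the Reference file was turned into a nucleotide database with makeblastdb, the Test file was used as the blastn query with -outfmt 6 (default 12 columns), and that the query ID BLAST reports equals the first whitespace-delimited token of each FASTA header. The function names, the file names in the comments, and the min_length parameter are illustrative, not something from this thread. Note that the percent identity BLAST reports covers only the aligned region, so a minimum alignment length is worth setting as well.

```python
import csv

# Assumed BLAST+ commands (file names are placeholders):
#   makeblastdb -in reference.fasta -dbtype nucl -out refdb
#   blastn -query test.fasta -db refdb -outfmt 6 -out hits.tsv
# Default -outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
#                            qstart qend sstart send evalue bitscore


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open multi-FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def similar_query_ids(blast_handle, min_identity=95.0, min_length=0):
    """Return query IDs having at least one hit that meets both thresholds."""
    ids = set()
    for row in csv.reader(blast_handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, pident, length = row[0], float(row[2]), int(row[3])
        if pident >= min_identity and length >= min_length:
            ids.add(qseqid)
    return ids


def split_fasta(fasta_handle, keep_ids, similar_out, excluded_out):
    """Write each record to similar_out or excluded_out based on its ID."""
    for header, seq in read_fasta(fasta_handle):
        seq_id = header.split()[0]  # first token of the header line
        out = similar_out if seq_id in keep_ids else excluded_out
        out.write(f">{header}\n{seq}\n")  # sequence is written on one line


# Possible usage, with placeholder file names:
#   with open("hits.tsv") as hits:
#       ids = similar_query_ids(hits, min_identity=95.0, min_length=200)
#   with open("test.fasta") as fa, open("similar.fasta", "w") as sim, \
#           open("excluded.fasta", "w") as exc:
#       split_fasta(fa, ids, sim, exc)
```

Entries with no BLAST hit at all end up in the excluded file, since their IDs never appear in the tabular output. If BLAST rewrites the gi|...| style IDs on your setup, the ID matching in split_fasta would need adjusting.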
All times are GMT -8.

