SEQanswers

04-06-2012, 11:51 AM   #1
tusharbiot
Junior Member
 
Location: India

Join Date: Feb 2012
Posts: 7
Perl script for removing sequences with "X" in them

Hi all!

I have a multi-FASTA file that was masked to remove vector contamination. As a result, some sequences now contain "X" (probably the ones that were partially masked). I would like to remove all such sequences. Can someone help me out with a Perl script for this? I guess a Perl script originally written to remove sequences containing "N" (a much more common issue) would work just as well.

Thanks a ton in advance
04-06-2012, 11:57 AM   #2
twaddlac
Member
 
Location: Pittsburgh, PA

Join Date: Feb 2011
Posts: 49

Assuming there are no 'X' in your read names:

Code:
perl -pe 's/X//g' reads.fa > new.reads.fa
04-06-2012, 12:27 PM   #3
maasha
Senior Member
 
Location: Denmark

Join Date: Apr 2009
Posts: 153

Using Biopieces:

read_fasta -i in.fna | grab -ip X -k SEQ | write_fasta -o out.fna -x
04-07-2012, 11:44 PM   #4
tomc
Member
 
Location: Oregon

Join Date: Feb 2011
Posts: 29

Again, assuming there are no 'X' in your read names (that you care about):

tr -d X < reads.fa > filtered_reads.fa

Last edited by tomc; 04-08-2012 at 10:06 AM.
04-08-2012, 08:07 AM   #5
tusharbiot
Junior Member
 
Location: India

Join Date: Feb 2012
Posts: 7

Thanks all! The script worked.
04-08-2012, 03:38 PM   #6
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747

Quote:
Originally Posted by twaddlac
Assuming there are no 'X' in your read names:

Code:
perl -pe 's/X//g' reads.fa > new.reads.fa

Even better, remove the reliance on a dangerous assumption:

Code:
perl -pe 's/X//g unless (/^>/)' reads.fa > new.reads.fa