SEQanswers

Old 12-17-2010, 02:19 AM   #1
semna
Member
 
Location: holland

Join Date: Apr 2010
Posts: 55
question?

Hi,
I have a file with a single column, like this:

ENSSSCG00000000005|ENSSSCT00000000006
ENSSSCG00000000005|ENSSSCT00000000006
ATXN10|ENSSSCT00000000009
ENSSSCG00000019685|ENSSSCT00000021280
LDOC1L|ENSSSCT00000000023
-
TSPO|ENSSSCT00000000035
ENSSSCG00000000032|ENSSSCT00000000034
ENSSSCG00000000032|ENSSSCT00000000034
TTLL1|ENSSSCT00000000037
TTLL1|ENSSSCT00000000037
TTLL1|ENSSSCT00000000037
TTLL1|ENSSSCT00000000037
How can I get rid of the lines that start with ENS and, on the remaining lines, the |ENS... part that follows the gene name? I want to keep just the gene names, which in this example are ATXN10, LDOC1L, TSPO and TTLL1.
Does anyone know how I can do that? Thanks for your help.
Old 12-17-2010, 02:54 AM   #2
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78

Considering your other, related post as well, I think it would be worth your while to learn Perl and regular expressions. Since that won't help you today, here's some sed to play around with (I'm no guru). Hope that helps!

cheers

Code:
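sed -e '
  # delete lines that start with an Ensembl ID (ENS...)
  /^ENS/d
  # strip the trailing |ENS... transcript ID, keeping the gene name
  s/|ENS.*$//
' inputfile.txt > output.txt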
Code:
enssscg00000000005|ensssct00000000006
enssscg00000000005|ensssct00000000006
atxn10|ensssct00000000009
enssscg00000019685|ensssct00000021280
ldoc1l|ensssct00000000023
-
tspo|ensssct00000000035
enssscg00000000032|ensssct00000000034
enssscg00000000032|ensssct00000000034
ttll1|ensssct00000000037
ttll1|ensssct00000000037
ttll1|ensssct00000000037
ttll1|ensssct00000000037
pls use code tags
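
For comparison, here is a minimal awk sketch of the same cleanup. It assumes the input looks exactly like the example in the question: one column, `|` separating the gene name from the transcript ID, and upper-case ENS prefixes. The function name `gene_names` is hypothetical, and dropping the `-` line and the repeated names are assumptions about what is wanted, not something the question states outright.

```shell
# Hypothetical helper: reads the one-column file on stdin and prints
# the gene names, one per line.
# Usage: gene_names < inputfile.txt > output.txt
gene_names() {
  awk -F'|' '
    $1 ~ /^ENS/ { next }      # first field is an Ensembl ID, not a gene name
    $1 == "-"   { next }      # placeholder line (assumed unwanted)
    $1 == ""    { next }      # skip blank lines
    !seen[$1]++ { print $1 }  # print each gene name once, in input order
  '
}
```

Splitting on `|` and testing only the first field avoids having to write a pattern for the transcript ID at all. If repeated names should be kept, the last rule can be replaced by a plain `{ print $1 }`.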