SEQanswers

02-05-2013, 06:33 AM   #1
Shishir
Member
 
Location: Germany

Join Date: Nov 2012
Posts: 22
Any script to format headers in FASTA files?

I have a 47 GB file to parse. The sequences are in the following format:
>TSCS_00041 gene0EA_12345_rframe2_ORF
MLAATHYYKFAIRRLFPLLKDTICASYSISIKHHENFMALSNMPKIWEDVEVDGNNMQWTRFQTTPVMPVYFIAAGVFNLSFITNWNTKLLYRKDILPYMTFAYNVAKNIAWFLSHIRKTKITNHI
>TSCS_00044 gene0EA_12341_rframe2_ORF
MTICASYSISIKHHENFMAIKHHENFMALSNMPKIWEDV
I simply want to reformat the file so that each header keeps only the ID, like this:
>TSCS_00041
MLAATHYYKFAIRRLFPLLKDTICASYSISIKHHENFMALSNMPKIWEDVEVDGNNMQWTRFQTTPVMPVYFIAAGVFNLSFITNWNTKLLYRKDILPYMTFAYNVAKNIAWFLSHIRKTKITNHI
>TSCS_00044
MTICASYSISIKHHENFMAIKHHENFMALSNMPKIWEDV
Could anyone share a script?
02-05-2013, 06:45 AM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700
cut -f1 -d" " inputold.fa > outputnew.fa
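
The one-liner above keeps everything before the first space on every line, which works here because the sequence lines contain no spaces. If the file might have tabs in the headers, a variant that only touches header lines could look like the sketch below. This is an assumption about what the data might need, not something from the original post; the file names `sample.fa` and `sample.trimmed.fa` are made up, and the sequences are shortened for illustration.

```shell
# Build a tiny sample file (hypothetical name) modeled on the records in the question.
# The sequences are shortened placeholders, not the full originals.
printf '%s\n' \
  '>TSCS_00041 gene0EA_12345_rframe2_ORF' \
  'MLAATHYYKF' \
  '>TSCS_00044 gene0EA_12341_rframe2_ORF' \
  'MTICASYSIS' > sample.fa

# On header lines (those starting with ">"), print only the first
# whitespace-delimited field; print all other lines unchanged.
awk '/^>/ { print $1; next } { print }' sample.fa > sample.trimmed.fa

cat sample.trimmed.fa
```

Both approaches stream the file line by line, so memory use should not depend on the 47 GB input size; the output does need its own disk space, since it is written to a new file.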
02-05-2013, 06:52 AM   #3
Shishir
Member
 
Location: Germany

Join Date: Nov 2012
Posts: 22
Thanks Richard!