Dear bioinformatics community,
I have several deep-sequencing files in tab-delimited format, for example in this layout:
TAGGAACCATTAGCCAACAA 88889
GATTAGGCCCAAATGCAAAG 7799
....
or in this tabular format:
1 1 3233 223322 TAGGGCCTTAGGAAGCCTAA
1 1 3234 222334 AGGTAACCGATAGAGGTCCA
....
I would like to convert these files to FASTA format. It would be nice if I could use one or more of the non-sequence columns as the FASTA sequence title.
What tools or scripts can I use to achieve this?
These files are pretty big (around 400 MB), so I cannot use Excel to do the job.
(I don't have programming skills yet. I searched Google and found tab2fasta.pl in the HOMER package, and bioscripts.convert, but the latter doesn't work because it requires the first column to be the name and the second to be the sequence. I haven't tried the HOMER package yet; I thought I would get some insights from you first.)
Thanks a lot!
Jian
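A minimal Python sketch of the conversion asked about above. It reads the input line by line, so a file of around 400 MB never has to fit in memory. The function names, the 0-based column numbering, the underscore used to join title columns, and the `seqN` fallback title are all assumptions made for illustration; none of this comes from HOMER or bioscripts.convert. It also assumes that no field contains spaces, since fields are split on any whitespace (tabs or spaces).

```python
def tab_to_fasta(lines, seq_col, title_cols, sep="_"):
    """Yield FASTA records from tab- or space-delimited lines.

    seq_col    -- 0-based index of the column holding the sequence
    title_cols -- list of 0-based column indexes joined to form the title
    sep        -- string placed between title columns (an arbitrary choice)
    """
    needed = max([seq_col, *title_cols])  # highest column index we will read
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # splits on tabs and spaces alike
        if len(fields) <= needed:
            continue  # skip blank or short lines, e.g. a trailing "...."
        # Fall back to a line-number title when no title columns are given.
        title = sep.join(fields[i] for i in title_cols) or f"seq{line_no}"
        yield f">{title}\n{fields[seq_col]}\n"


def convert_file(in_path, out_path, seq_col, title_cols):
    """Stream in_path to out_path as FASTA, one line at a time."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(tab_to_fasta(src, seq_col, title_cols))


# Hypothetical usage (file names are placeholders):
#   first layout:  sequence in column 0, count in column 1
#       convert_file("reads.txt", "reads.fasta", seq_col=0, title_cols=[1])
#   second layout: sequence in column 4, title built from columns 2 and 3
#       convert_file("reads2.txt", "reads2.fasta", seq_col=4, title_cols=[2, 3])
```

One caveat: if the title is built only from a count column, as in the first layout, titles will repeat whenever two sequences share the same count, and some downstream tools expect unique FASTA titles. Leaving `title_cols` empty gives each record a line-number title instead.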