#1 |
Member
Location: ITALY Join Date: Oct 2010
Posts: 89
Hi all,
I have this file:
TGAGGTAGTAGATTGTATAGTT 424866
TAGCTTATCAGACTGATGTTGA 359141
TAGCTTATCAGACTGATGTTGAC 276052
TGAGGTAGTAGGTTGTATAGTT 268735
ACAGTAGTCTGCACATTGGTT 209280
ACAGTAGTCTGCACATTGGTTA 178652
TAGCTTATCAGACTGATGTTG 166159
TGAGGTAGTAGGTTGTGTGGTT 105275
TGAGGTAGTAGGTTGTATGGTT 102447
AGCAGCATTGTACAGGGCTATGA 91296
TGAGGTAGTAGGTTGTGTGGTTT 63300
TGAGGTAGTAGTTTGTACAGTT 61604
TGAGGTAGTAGATTGTATAGT 61492
TAGCACCATCTGAAATCGGTTA 60637
TTCAAGTAATCCAGGATAGGCT 52300
TGAGGTAGTAGATTGTATAGTTA 50905
TGAGGTAGTAGGTTGTATAGT 48150
TACAGTAGTCTGCACATTGGTT 47534
TCTACAGTCCGACGATC 45803
................
They are sequences, and the numbers are their respective occurrence counts. I would like to convert that file to FASTA format, decollapsing the sequences and naming them like this:
>Sample1_0
TGAGGTAGTAGATTGTATAGTT
>Sample1_1
TGAGGTAGTAGATTGTATAGTT
>Sample1_2
TGAGGTAGTAGATTGTATAGTT
.....
and so on, 424866 times in total, ending with
>Sample1_424865
TGAGGTAGTAGATTGTATAGTT
then
>Sample1_424866
TAGCTTATCAGACTGATGTTGA
(the second sequence), and likewise for the other sequences in series.
Is there a script for that purpose?
Thanks in advance,
Giorgio
#2 |
PhD Student
Location: Denmark Join Date: Jul 2012
Posts: 164
Code:
awk '{ for (i = 0; i < $2; i++) print ">Sample" NR "_" i "\n" $1 }' file.txt
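Note that the awk command above restarts the counter on every input line and puts the line number in the name (Sample1_0, ..., then Sample2_0, ...), whereas the question asked for one running index under a fixed Sample1_ prefix. A minimal Python sketch of that variant is below. The function name `decollapse`, its `prefix` parameter, and the two-line demo input are assumptions made for illustration and do not come from the thread.

```python
import io


def decollapse(lines, prefix="Sample1_"):
    """Yield one FASTA record per occurrence, numbered continuously.

    Each input line is expected to hold a sequence and its count,
    separated by whitespace (assumed from the example in the question).
    """
    index = 0  # running index shared across all sequences
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        seq, count = fields[0], int(fields[1])
        for _ in range(count):
            yield f">{prefix}{index}\n{seq}"
            index += 1


if __name__ == "__main__":
    # Hypothetical toy input; replace with open("file.txt") for a real file.
    demo = io.StringIO("ACGT 2\nTTGA 1\n")
    for record in decollapse(demo):
        print(record)
```

Because the records are generated lazily, the full decollapsed output never has to sit in memory, which matters when a single sequence expands to hundreds of thousands of records.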
#3 |
Member
Location: ITALY Join Date: Oct 2010
Posts: 89
Thanks vivek, it works great!
#4 |
Senior Member
Location: Denmark Join Date: Apr 2009
Posts: 153
This can also be done with Biopieces (www.biopieces.org):
Code:
read_tab -i in.tab -k SEQ,COUNT | duplicate_record -k COUNT | add_ident -k SEQ_NAME -p Sample1_ | write_fasta -x