Hi everyone,
I currently have 24 FASTA files (one per species), each containing thousands of contigs. The contigs were built against the same reference transcriptome, so the contig IDs are shared across files, although not every contig is present in every species. For example:
File 1 (SpeciesA):
>contig_1
AAAAAAAAAAAAAA
>contig_2
TTTTTTTTTTTTTTTT
>contig_3
CCCCCCCCCCCCCC
File 2 (SpeciesB):
>contig_2
TTTTTTTTTTTTTTTT
>contig_3
CCCCCCCCCCCCCC
File 3 (SpeciesC):
>contig_1
AAAAAAAAAAAAAA
>contig_2
TTTTTTTTTTTTTTTT
I want to regroup the sequences by contig, so that instead of 24 FASTA files (one per species) I have one file per contig ID, with each header labelled by species, like this:
File 1 (contig1):
>contig_1_SpeciesA
AAAAAAAAAAAAAA
>contig_1_SpeciesC
AAAAAAAAAAAAAA
File 2 (contig2):
etc.
My programming skills are pretty much non-existent, so any direction on how I could do this for many large files would be highly appreciated!
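For illustration, here is a minimal sketch of one way to do this in plain Python (standard library only). The function names `read_fasta` and `split_by_contig`, the `.fasta` output extension, and the idea of passing a species-name-to-path mapping are all assumptions made for this example, not an established tool. It also assumes every header line has an ID directly after the `>`, and it writes each sequence on a single unwrapped line.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (contig_id, sequence) pairs from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                # Keep only the ID: the text after '>' up to the first whitespace.
                header, chunks = line[1:].split()[0], []
            else:
                chunks.append(line)
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)


def split_by_contig(species_files, out_dir):
    """Write one FASTA file per contig ID, with headers labelled by species.

    species_files maps a species name (e.g. "SpeciesA") to the path of its
    FASTA file. Returns the sorted list of contig IDs that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    started = set()  # contig IDs whose output file was already created in this run
    for species, path in species_files.items():
        for contig_id, sequence in read_fasta(path):
            # Overwrite the first time a contig is seen, append afterwards,
            # so re-running the script does not duplicate records.
            mode = "a" if contig_id in started else "w"
            started.add(contig_id)
            with open(out_dir / f"{contig_id}.fasta", mode) as out:
                out.write(f">{contig_id}_{species}\n{sequence}\n")
    return sorted(started)
```

If the input files were named after the species (a hypothetical layout such as `input/SpeciesA.fasta`), the mapping could be built with `{p.stem: p for p in Path("input").glob("*.fasta")}` and passed to `split_by_contig(files, "by_contig")`. The sketch streams each input file and opens only one output file at a time, which avoids holding everything in memory or hitting the open-file limit with thousands of contigs, at the cost of some speed.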