I have a series of coordinates in an id.txt file, and the corresponding sequences are in a genome.fasta file. The coordinates in id.txt are shown below:
Contig3:15-7
Contig2:5-10
Contig1:12-3
The genome.fasta file is shown below:
>Contig1
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAG
>Contig2
ACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCC
>Contig3
GCTGCGGCGCTGATCCTGGCGGCCCGCGCCGAG
I used the following command to extract the sequences from genome.fasta based on the coordinates in <id.txt>: <xargs samtools faidx genome.fasta <id.txt > result.fasta>. It extracted only the <Contig2:5-10> sequence, because the coordinates of <Contig2:5-10> are in ascending order. The coordinates of <Contig3:15-7> and <Contig1:12-3> are in reverse order (start greater than end), so samtools could not fetch those sequences.
I also need to extract the sequences for the reversed coordinates, <Contig3:15-7> and <Contig1:12-3>, and I want to reverse complement them. I have a larger FASTA file and many more coordinates to extract, so I need to automate this process. Please help me automate it. Thank you in advance.
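To make the expected behaviour concrete, here is a minimal Python sketch of what I am after. It is only an illustration under my own assumptions: the helper names `read_fasta` and `fetch` are made up for this post, coordinates are treated as 1-based and inclusive (as samtools does), and a region whose start is greater than its end is swapped into ascending order and then reverse complemented. The example data is inlined from the files above; a real run would read genome.fasta and id.txt instead.

```python
# Hypothetical helpers for illustration; not part of samtools.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into a dict mapping record name to sequence."""
    parts, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {key: "".join(chunks) for key, chunks in parts.items()}


def fetch(seqs, region):
    """Return the subsequence for 'name:start-end' (1-based, inclusive).

    Assumption: if start > end, the span is taken in ascending order
    and the result is reverse complemented.
    """
    name, span = region.rsplit(":", 1)
    start, end = (int(x) for x in span.split("-"))
    lo, hi = min(start, end), max(start, end)
    sub = seqs[name][lo - 1:hi]
    if start > end:
        # Complement every base, then reverse the string.
        sub = sub.translate(COMPLEMENT)[::-1]
    return sub


# Example data copied from genome.fasta and id.txt in this post.
genome = """>Contig1
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAG
>Contig2
ACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCC
>Contig3
GCTGCGGCGCTGATCCTGGCGGCCCGCGCCGAG
""".splitlines()
regions = ["Contig3:15-7", "Contig2:5-10", "Contig1:12-3"]

seqs = read_fasta(genome)
for region in regions:
    print(f">{region}")
    print(fetch(seqs, region))
```

This loads the whole genome into memory, which may not suit a very large FASTA file; in that case an indexed lookup (for example, normalising the coordinates first and then passing them to samtools) would probably be a better fit.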