06-13-2019, 05:02 AM   #1
dineshkumarsrk
Junior Member
 
Location: India

Join Date: Mar 2019
Posts: 5
How to reverse complement the nucleotides if the coordinates are in reverse order?

I have a series of coordinates in the file id.txt, and the sequences they refer to are in the file genome.fasta. The coordinates in id.txt are shown below:
Contig3:15-7
Contig2:5-10
Contig1:12-3

The genome.fasta file is shown below,
>Contig1
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAG
>Contig2
ACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCC
>Contig3
GCTGCGGCGCTGATCCTGGCGGCCCGCGCCGAG

I used the following command to extract the sequences from genome.fasta based on the coordinates in <id.txt>: <xargs samtools faidx genome.fasta < id.txt > result.fasta>. It extracted only the <Contig2:5-10> sequence, because the coordinates of <Contig2:5-10> are in the proper order (start before end). The coordinates of <Contig3:15-7> and <Contig1:12-3> are in reverse order (start after end), so samtools could not fetch those sequences.

I need to extract the sequences for the reversed coordinates (<Contig3:15-7> and <Contig1:12-3>) as well, and I want to reverse complement them. I have a much larger fasta file and many coordinates to extract, so I need to automate this process. Please help me automate it. Thank you in advance.
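
To make the desired behaviour concrete, here is a minimal Python sketch of what I am after. It is only an illustration under some assumptions: coordinates are 1-based and inclusive (as samtools treats them), a region whose start is greater than its end should be fetched in forward order and then reverse complemented, the whole fasta file fits in memory, and only A, C, G, T and N are complemented. The function names (read_fasta, reverse_complement, fetch, extract) are made up for this sketch and are not part of samtools.

```python
# Sketch only: extract regions from a FASTA file and reverse complement
# any region whose start coordinate is greater than its end coordinate.
# Assumes 1-based inclusive coordinates and a FASTA small enough for memory.

# Translation table for complementing bases (other characters pass through).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(path):
    """Return a dict mapping each sequence name to its full sequence."""
    seqs, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first word of the header as the sequence name.
                name = line[1:].split()[0]
                seqs[name] = []
            elif name is not None:
                seqs[name].append(line)
    return {key: "".join(parts) for key, parts in seqs.items()}


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def fetch(seqs, region):
    """Return the sequence for a region written as 'name:start-end'."""
    name, span = region.rsplit(":", 1)
    start, end = (int(value) for value in span.split("-"))
    if start <= end:
        # Proper order: plain forward-strand slice.
        return seqs[name][start - 1:end]
    # Reverse order: take the forward slice end..start, then reverse complement.
    return reverse_complement(seqs[name][end - 1:start])


def extract(fasta_path, ids_path, out_path):
    """Write one FASTA record per region listed in ids_path."""
    seqs = read_fasta(fasta_path)
    with open(ids_path) as ids, open(out_path, "w") as out:
        for region in ids:
            region = region.strip()
            if region:
                out.write(f">{region}\n{fetch(seqs, region)}\n")
```

With the files above, this would be called as extract("genome.fasta", "id.txt", "result.fasta"). For a very large genome, reading everything into memory may not be practical, so an indexed approach would probably be better.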