Hello,
I'm working on a thesis in bioinformatics, and I need to accomplish the following task using the BLAST plant EST database.
Given 10 different plant species, I need to test the hypothesis that genes containing SSRs (simple sequence repeats) are one of the sources that disrupt existing genes or generate new ones, leading to pseudogenes, to non-translated expressed genes that could form microRNAs, or to new genes. The procedure is as follows:
1. First, match the left flanking region of a given sequence against the left flanking region of a different sequence, and also against the reverse complement of the right flanking region of that second sequence.
2. Then, compare the right flanking region of the given sequence against the right flanking region of the second sequence, and also against the reverse complement of the left flanking region of that second sequence.
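To make the two steps concrete, here is a rough Python sketch of the four comparisons for one pair of sequences. It assumes the flanking regions have already been extracted as plain DNA strings, and it uses a simple ungapped, position-by-position identity as a stand-in for a real BLAST alignment score. The function names (`reverse_complement`, `percent_identity`, `compare_flanks`) are made up for illustration.

```python
# Illustrative sketch only: helper names are hypothetical, and the ungapped
# identity below is a stand-in for a proper BLAST alignment.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Ungapped identity (0-100) over the shared length of two strings."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / n


def compare_flanks(left1, right1, left2, right2):
    """Run the four comparisons from steps 1 and 2 for one sequence pair."""
    return {
        # Step 1: left flank of sequence 1 against sequence 2
        "left_vs_left": percent_identity(left1, left2),
        "left_vs_rc_right": percent_identity(left1, reverse_complement(right2)),
        # Step 2: right flank of sequence 1 against sequence 2
        "right_vs_right": percent_identity(right1, right2),
        "right_vs_rc_left": percent_identity(right1, reverse_complement(left2)),
    }
```

In practice the identity values would presumably come from BLAST output rather than from this toy function; the sketch is only meant to show which flank is paired with which.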
The results of these comparisons are treated as candidate genes if they satisfy the hypothesis of a minimum 50% difference between the similarities of the comparison phases. The candidates will then be examined further to detect functional change along the evolutionary path of those 10 species, at both the nucleotide and protein sequence levels (for accuracy).
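One possible reading of the 50% criterion, written as a small Python sketch: a pair is kept as a candidate when the similarity from the first phase (left flank) and the similarity from the second phase (right flank) differ by at least 50 percentage points. This interpretation, the function name `is_candidate`, and the `min_difference` parameter are assumptions for illustration and may not match the intended definition.

```python
# Assumed interpretation: "difference" means the absolute gap, in percentage
# points, between the phase 1 and phase 2 similarity scores (each 0-100).
def is_candidate(phase1_similarity, phase2_similarity, min_difference=50.0):
    """Return True if the two phase similarities differ by at least min_difference."""
    return abs(phase1_similarity - phase2_similarity) >= min_difference


# Example: keep only the pairs whose two phases differ strongly.
pairs = [("pairA", 90.0, 30.0), ("pairB", 80.0, 60.0)]
candidates = [name for name, s1, s2 in pairs if is_candidate(s1, s2)]
```

If the threshold is meant as a relative difference or applies to each comparison separately, the test inside `is_candidate` would need to change accordingly.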
How can I perform this task and get the most accurate results?
I would be very grateful if you could guide me with some detailed steps, as I'm rather lost and don't know where to start.