02-02-2011, 10:36 AM   #1
seb567
One read has many alignments starting at the same position

Dear SEQanswers,

I am working with a particular gene. I use its sequence as a reference to build an atlas of variations induced by an enzyme whose target lies within that gene sequence.


The sequence of the region of interest follows.


Code:
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca
In the sequence, a 4-mer occurs twice: ctca, in ctcaggtacctca. The two copies were colored in red in the original post.


I also happen to have two short reads obtained by DNA sequencing.

FORWARD:

gatattgatattggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccac


REVERSE:

tggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccaccttgggactca


Aligning these reads onto the reference sequence above can produce at least 5 alignments, regardless of the alignment software or algorithm used.




Alignment 1


Code:
gatattgatattggtcttaatatgacttgttttcattgttctca---------gccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttctca---------gccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 2


Code:
gatattgatattggtcttaatatgacttgttttcattgttctc---------agccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttctc---------agccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 3


Code:
gatattgatattggtcttaatatgacttgttttcattgttct---------cagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttct---------cagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 4

Code:
gatattgatattggtcttaatatgacttgttttcattgttc---------tcagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttc---------tcagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE


Alignment 5


Code:
gatattgatattggtcttaatatgacttgttttcattgtt---------ctcagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgtt---------ctcagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 6

Code:
gatattgatattggtcttaatatgacttgttttcattgt--t-------ctcagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgt--t-------ctcagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE
(alignments omitted)


I can rule out alignment 6 and the similar omitted alignments as needlessly complex compared with the other 5.
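One way to make "too complex" concrete is to count the separate gap runs in each aligned read. The sketch below is only an illustration: the helper name `gap_runs` is hypothetical, and treating "more gap runs" as "more complex" is an assumption in the spirit of affine gap penalties, not a rule taken from any particular aligner.

```python
import re


def gap_runs(aligned_row: str) -> list[int]:
    """Return the length of each run of '-' in one aligned row (hypothetical helper)."""
    return [len(m.group()) for m in re.finditer(r"-+", aligned_row)]


# FORWARD rows copied from Alignment 5 and Alignment 6.
alignment_5 = "gatattgatattggtcttaatatgacttgttttcattgtt---------ctcagccagcatggcagcctctttcccacccac"
alignment_6 = "gatattgatattggtcttaatatgacttgttttcattgt--t-------ctcagccagcatggcagcctctttcccacccac"

for name, row in [("Alignment 5", alignment_5), ("Alignment 6", alignment_6)]:
    runs = gap_runs(row)
    # Both rows skip 9 reference bases in total; they differ in how many gaps are opened.
    print(name, "gap runs:", runs, "total gapped bases:", sum(runs))
```

Alignment 5 has a single run of 9, while Alignment 6 splits the same 9 bases into runs of 2 and 7, so any scoring that charges for opening a gap would prefer Alignment 5.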



Therefore, there is a deletion of 9 nucleotides.

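However, the deletion can start at any of 5 different positions, one per alignment above. Strictly speaking there is a sixth: the repeat extends by one base to ctcag, so deleting gtacctcag (one base to the right of Alignment 1) reproduces the reads too.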
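The equivalent placements can be enumerated by brute force: delete a 9-base window at every possible start and keep the starts that give the same sequence as Alignment 1. This is a minimal sketch; the function name `equivalent_deletion_starts` is hypothetical and the approach is plain string comparison, not what any aligner does internally.

```python
def equivalent_deletion_starts(reference: str, start: int, length: int) -> list[int]:
    """Return every 0-based start whose `length`-base deletion yields the same
    sequence as deleting reference[start:start + length] (hypothetical helper)."""
    target = reference[:start] + reference[start + length:]
    return [
        s
        for s in range(len(reference) - length + 1)
        if reference[:s] + reference[s + length:] == target
    ]


reference = (
    "gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcc"
    "tctttcccacccaccttgggactca"
)
forward = "gatattgatattggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccac"

# Alignment 1 deletes the 9-mer "ggtacctca"; locate it rather than hard-coding an index.
start = reference.find("ggtacctca")
starts = equivalent_deletion_starts(reference, start, 9)

for s in starts:
    # Print the 1-based start and the 9 bases removed by that placement.
    print(s + 1, reference[s:s + 9])

# Sanity check: the FORWARD read is a prefix of the reference once the deletion is applied.
deleted = reference[:start] + reference[start + 9:]
print(deleted.startswith(forward))
```

On this reference the loop lists six starts (1-based 41 to 46): the five shown in Alignments 1 to 5 plus the placement one base to the right of Alignment 1. All of them produce the identical deleted sequence, which is why the reads alone cannot tell them apart.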


Based solely on sequence information, I think there is just nothing that can be done to retrieve the true starting position of the deletion.



SEQanswers, what do you think?

-Seb