SEQanswers

02-02-2011, 10:36 AM   #1
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
One read has many alignments starting at the same position

Dear SEQanswers,

I have a particular gene. I use its sequence as a reference to gather an atlas of variations induced by some enzyme, whose target is located within the said gene sequence.


The sequence of the region of interest follows.


Code:
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca
In the sequence, two repeated 4-mers are colored in red.


I also happen to have two short reads obtained by DNA sequencing.

FORWARD:

gatattgatattggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccac


REVERSE:

tggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccaccttgggactca


Aligning these reads onto the reference sequence above can produce at least 5 alignments, regardless of the alignment software or algorithm used.




Alignment 1


Code:
gatattgatattggtcttaatatgacttgttttcattgttctca---------gccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttctca---------gccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 2


Code:
gatattgatattggtcttaatatgacttgttttcattgttctc---------agccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttctc---------agccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 3


Code:
gatattgatattggtcttaatatgacttgttttcattgttct---------cagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttct---------cagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 4

Code:
gatattgatattggtcttaatatgacttgttttcattgttc---------tcagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgttc---------tcagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE


Alignment 5


Code:
gatattgatattggtcttaatatgacttgttttcattgtt---------ctcagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgtt---------ctcagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE

Alignment 6

Code:
gatattgatattggtcttaatatgacttgttttcattgt--t-------ctcagccagcatggcagcctctttcccacccac FORWARD
           tggtcttaatatgacttgttttcattgt--t-------ctcagccagcatggcagcctctttcccacccaccttgggactca REVERSE
gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca GENE
(alignments omitted)


I can rule out alignment 6, and the similar omitted alignments, because they are more complex than the other 5: they split the gap into several pieces instead of one.
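One way to make "too complex" concrete is an affine gap score, which charges for every gap that is opened as well as for every gapped base. The sketch below is only an illustration: the function names and the penalty values (5 per opening, 1 per gapped base) are assumptions, not values taken from any particular aligner. It compares the FORWARD rows of alignments 1 and 6.

```python
import re

# FORWARD rows copied from alignments 1 and 6 above.
ALIGNMENT_1_FORWARD = "gatattgatattggtcttaatatgacttgttttcattgttctca---------gccagcatggcagcctctttcccacccac"
ALIGNMENT_6_FORWARD = "gatattgatattggtcttaatatgacttgttttcattgt--t-------ctcagccagcatggcagcctctttcccacccac"


def gap_runs(aligned_row):
    """Return the maximal runs of '-' in one row of a gapped alignment."""
    return re.findall(r"-+", aligned_row)


def affine_gap_cost(aligned_row, gap_open=5, gap_extend=1):
    """Affine gap cost: gap_open per run plus gap_extend per gapped base.

    The default penalties are hypothetical and chosen only for illustration.
    """
    return sum(gap_open + gap_extend * len(run) for run in gap_runs(aligned_row))


for name, row in [("alignment 1", ALIGNMENT_1_FORWARD), ("alignment 6", ALIGNMENT_6_FORWARD)]:
    print(name, "gapped bases:", row.count("-"),
          "gap openings:", len(gap_runs(row)),
          "affine cost:", affine_gap_cost(row))
```

Both rows contain 9 gapped bases, so a purely linear gap penalty cannot separate them. Alignment 6 opens two gaps instead of one, so any positive gap-opening penalty ranks it below alignments 1 to 5.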



Therefore, there is a deletion of 9 nucleotides.

However, the deletion can start at any of at least 5 different positions.
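The set of equivalent placements can be enumerated by brute force. The sketch below assumes a single contiguous deletion of known length and a known read offset on the reference; the function name and its parameters are hypothetical. It removes 9 bases at every possible position and keeps the positions that reproduce the read exactly.

```python
REFERENCE = "gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca"
FORWARD = "gatattgatattggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccac"
REVERSE = "tggtcttaatatgacttgttttcattgttctcagccagcatggcagcctctttcccacccaccttgggactca"


def equivalent_deletion_starts(reference, read, deletion_length, read_offset):
    """Return every 0-based reference position at which deleting
    `deletion_length` bases reproduces `read` exactly.

    Assumes the read starts at `read_offset` on the reference and differs
    from it by one contiguous deletion only (no mismatches, no insertions).
    """
    # Stretch of reference covered by the read plus the deleted bases.
    span = reference[read_offset:read_offset + len(read) + deletion_length]
    starts = []
    for i in range(1, len(read)):  # keep the deletion strictly inside the read
        if span[:i] + span[i + deletion_length:] == read:
            starts.append(read_offset + i)
    return starts


print(equivalent_deletion_starts(REFERENCE, FORWARD, 9, 0))   # prints [40, 41, 42, 43, 44, 45]
print(equivalent_deletion_starts(REFERENCE, REVERSE, 9, 11))  # prints [40, 41, 42, 43, 44, 45]
```

Starts 40 to 44 correspond to alignments 5 down to 1. The sketch also reports start 45, a sixth equivalent placement that is not drawn above: the base after the repeated 4-mer is g in both copies, so the repeated unit is effectively ctcag and the gap can slide one base further to the right.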


Based solely on sequence information, I think there is just nothing that can be done to retrieve the true starting position of the deletion.



SEQanswers, what do you think?

-Seb
02-03-2011, 10:09 AM   #2
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

There is no "true" start. But there's probably a guideline as to whether one reports alignment 1 or 5. Of the alignments shown, 5 puts the deletion as far left as possible and 1 puts it as far right as possible.

I'm not sure that SNP databases are currently consistent about this; I'm pretty sure I found a few places in the mouse genome where the same indel was reported under two IDs because of ambiguities like that.

So if you don't know which way is preferred, pick one, and make note of which way you picked.
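Picking one convention and applying it consistently can be sketched as a small normalization step. The helper names below are hypothetical and the sketch handles only a simple deletion on a single reference string. It slides the deleted block left (or right) for as long as the base entering the block equals the base leaving it, which keeps the resulting read sequence unchanged.

```python
REFERENCE = "gatattgatattggtcttaatatgacttgttttcattgttctcaggtacctcagccagcatggcagcctctttcccacccaccttgggactca"


def left_align_deletion(reference, start, length):
    """Shift a deletion (0-based start, given length) as far left as possible."""
    while start > 0 and reference[start - 1] == reference[start + length - 1]:
        start -= 1
    return start


def right_align_deletion(reference, start, length):
    """Shift a deletion (0-based start, given length) as far right as possible."""
    while start + length < len(reference) and reference[start] == reference[start + length]:
        start += 1
    return start


# Alignment 1 deletes the 9 bases starting at 0-based position 44.
left = left_align_deletion(REFERENCE, 44, 9)
right = right_align_deletion(REFERENCE, 44, 9)
print(left, REFERENCE[left:left + 9])     # prints 40 ctcaggtac
print(right, REFERENCE[right:right + 9])  # prints 45 gtacctcag
```

Any of the equivalent placements normalizes to the same start under a given convention, so two reports of the same indel compare equal once both are shifted the same way. For this reference the left-most start is that of alignment 5, and the right-most start lies one base beyond alignment 1.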