SEQanswers

Bioinformatics: Alignment to b37 with decoy sequences (http://seqanswers.com/forums/showthread.php?t=16197)

Michael.James.Clark 12-13-2011 03:06 PM

Alignment to b37 with decoy sequences
 
Hi all,

I've heard from 1kG that aligning to b37 with decoy sequences is superior to aligning without them. Obviously the decoy is there to improve our results, but I'm wondering whether others here are using it and, if so, how your results have improved (or, if they haven't, what the problem is).

MJ

Jon_Keats 12-14-2011 02:46 PM

I saw that version on the website. What are the "decoy" sequences? And more importantly, where are they? Are the standard chromosomes and super-contigs unchanged, so that existing annotations still match?

lh3 12-14-2011 03:03 PM

No changes to the primary assembly. Decoy sequences are sequences found in other assembled human genomes but absent from the primary assembly.

http://lh3lh3.users.sourceforge.net/...d/decoyseq.pdf
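For anyone who wants to sanity-check the "no changes to the primary assembly" point on their own files, here is a minimal sketch that compares two FASTA index (.fai) files by contig name and length. The function names are made up for illustration, and the contig name hs37d5 and the lengths in the usage note are toy assumptions, not values taken from this thread. Matching names and lengths is only a necessary condition; it does not prove the sequences are identical.

```python
def read_fai(text):
    """Parse the text of a FASTA index (.fai) into {contig_name: length}.

    A .fai line has tab-separated columns: name, length, offset,
    bases per line, bytes per line. Only the first two are used here.
    """
    lengths = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


def compare_references(old_fai_text, new_fai_text):
    """Return (added, missing, changed) contig names between two indexes."""
    old = read_fai(old_fai_text)
    new = read_fai(new_fai_text)
    added = [name for name in new if name not in old]      # e.g. decoy contigs
    missing = [name for name in old if name not in new]    # should be empty
    changed = [name for name in old
               if name in new and old[name] != new[name]]  # should be empty
    return added, missing, changed
```

With a toy pair of indexes such as `"1\t1000\t3\t60\t61\n"` and the same text plus a line for `hs37d5`, `compare_references` reports `hs37d5` as added and nothing as missing or changed, which is the pattern you would expect if existing annotations still apply.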

Michael.James.Clark 12-14-2011 03:24 PM

Quote:

Originally Posted by Jon_Keats (Post 59592)
I saw that version on the website. What are the "decoy" sequences? And more importantly, where are they? Are the standard chromosomes and super-contigs unchanged, so that existing annotations still match?

Basically, they are sequences that are not present in the reference genome but are present in humans, and that reads should be aligned against for the most accurate/sensitive results.

Thanks for that link, Heng Li. I feel like the community outside the 1kG mailing list is largely unaware of this whole "decoy sequence" issue.

lh3 12-14-2011 04:33 PM

The decoy sequence resembles the unplaced and unlocalized contigs in some sense. In my experience, including or excluding those GL* contigs does not have a big impact on variant accuracy. I believe the decoy sequence should not have a bigger effect in general, especially after filtering. Nonetheless, the decoy may be helpful when you investigate weird and rare events. I once saw two SNPs on different chromosomes that were tightly linked. It turned out that one of the loci has a much better match to the Venter genome - the reference genome is missing a piece. Other strange things may happen due to an imperfect reference genome.

I guess the decoy will have a bigger impact on those who work on SVs. I look forward to the results the 1000SV group will get from the phase 2 data.
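One rough way to see how much the decoy and the GL* contigs matter for a given sample is to count where reads land, before and after a mapping-quality filter. The sketch below tallies plain-text SAM records by contig class. It assumes b37-style names (1-22, X, Y, MT), that unplaced/unlocalized contigs start with "GL", and that the decoy contig is named hs37d5; those names and the function names are assumptions for illustration, so adjust them to your reference. It does not treat secondary or supplementary alignments specially.

```python
from collections import Counter

# Assumed b37-style primary contig names.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}


def contig_class(name):
    """Classify a reference name; the naming scheme is an assumption."""
    if name in PRIMARY:
        return "primary"
    if name == "hs37d5":          # assumed name of the decoy contig
        return "decoy"
    if name.startswith("GL"):     # unplaced/unlocalized contigs
        return "unplaced_or_unlocalized"
    return "other"


def tally_sam(lines, min_mapq=0):
    """Count SAM records per contig class, skipping headers.

    Mapped reads with MAPQ below min_mapq are dropped; unmapped reads
    (FLAG bit 0x4) are always counted under "unmapped".
    """
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:      # not a complete SAM record
            continue
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        if flag & 0x4:
            counts["unmapped"] += 1
            continue
        if mapq < min_mapq:
            continue
        counts[contig_class(rname)] += 1
    return counts
```

Comparing `tally_sam(lines)` with `tally_sam(lines, min_mapq=20)` gives a quick feel for how many decoy or GL* hits survive filtering; it is a read-count summary only and says nothing directly about variant accuracy.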

