GSNAP can use information about known splice sites if you give it an .iit file built (with iit_store) from a text file in this format:
>NM_004448.ERBB2.exon1 17:35110090..35110091 donor 6678
>NM_004448.ERBB2.exon2 17:35116768..35116769 acceptor 6678
>NM_004448.ERBB2.exon2 17:35116920..35116921 donor 1179
>NM_004448.ERBB2.exon3 17:35118099..35118100 acceptor 1179
>NM_004449.ERG.exon1 21:38955452..38955451 donor 783
>NM_004449.ERG.exon2 21:38878740..38878739 acceptor 783
>NM_004449.ERG.exon2 21:38878638..38878637 donor 360
>NM_004449.ERG.exon3 21:38869542..38869541 acceptor 360
The number at the end is the intron length, and it is optional. In this example (taken from the GMAP/GSNAP README file), each donor is spliced to the acceptor that follows it, but I wonder whether that pairing is a requirement. If one exon's donor site can be spliced to three different acceptors, do I have to list that donor three times, each time followed by a different acceptor? Or can I just list all the donor sites and all the acceptor sites without regard to how they are paired?
Thank you.
Eric
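
For anyone who wants to inspect a file like the one above before building the index, here is a minimal Python sketch that parses one line of this format into its fields. The names `SpliceSite`, `parse_splicesite_line`, and `strand` are made up for illustration and are not part of GMAP/GSNAP. The strand rule (ascending coordinates mean plus strand, descending mean minus) is an assumption based on how the ERBB2 and ERG lines differ in the example. The sketch only reads the fields; it does not say anything about how GSNAP pairs donors with acceptors.

```python
import re
from typing import NamedTuple, Optional


class SpliceSite(NamedTuple):
    """One line of the splice-site text format (hypothetical helper type)."""
    label: str                    # e.g. "NM_004448.ERBB2.exon1"
    chrom: str                    # e.g. "17"
    start: int                    # first coordinate of the pair
    end: int                      # second coordinate of the pair
    kind: str                     # "donor" or "acceptor"
    intron_length: Optional[int]  # trailing number, None when omitted


# >label chrom:start..end donor|acceptor [intron_length]
LINE_RE = re.compile(
    r"^>(\S+)\s+(\S+):(\d+)\.\.(\d+)\s+(donor|acceptor)(?:\s+(\d+))?\s*$"
)


def parse_splicesite_line(line: str) -> SpliceSite:
    """Split one splice-site line into its fields, or raise ValueError."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a splice-site line: {line!r}")
    label, chrom, start, end, kind, length = match.groups()
    return SpliceSite(
        label, chrom, int(start), int(end), kind,
        int(length) if length is not None else None,
    )


def strand(site: SpliceSite) -> str:
    """Assumption: ascending coordinates mean '+', descending mean '-'."""
    return "+" if site.start < site.end else "-"
```

Running each line of the file through `parse_splicesite_line` and grouping the results by `kind` gives separate donor and acceptor lists, which makes it easy to see which sites are present regardless of the order in which they were written.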