Hey all,
I want to count the number of unique bases covered by the exons in a GFF file. My problem is that the exons may overlap, so a base can be covered more than once. I tried one algorithm, but it failed.
For example:
Start Stop
8 11
9 14
10 18
15 20
1. I took each interval (Set A) and compared it with all the others.
2. If Set A overlapped with Set B, I took the minimum start point and the maximum end point and marked Set B as 'Done' (i.e. do not consider it again).
3. After comparing Set A to all the other coordinates, I counted the bases in Set A (End - Start, zero-based).
I repeated the process for all the coordinates.
However, this fails.
Can anyone please suggest another way to do this?
Thanks in advance,
K
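
For reference, here is a minimal sketch of a sort-and-merge approach to the problem described above. It is not the algorithm from the steps in this post. One possible reason the pairwise comparison fails is that once Set A has been extended, it may overlap an interval that was already checked and skipped; sorting by start first avoids that. The function name `count_unique_bases` is made up for illustration, and the sketch assumes 1-based, inclusive coordinates (as GFF uses), so a block's length is End - Start + 1 rather than End - Start.

```python
def count_unique_bases(intervals):
    """Count positions covered by at least one interval.

    Assumption: each interval is a (start, end) pair that is
    1-based and inclusive, as in GFF.
    """
    total = 0
    cur_start = cur_end = None
    # Sorting by start means any overlap must be with the current block.
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # No overlap: close the current block and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current block if this interval reaches further.
            cur_end = max(cur_end, end)
    # Close the last open block, if any.
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


exons = [(8, 11), (9, 14), (10, 18), (15, 20)]
print(count_unique_bases(exons))  # prints 13 (bases 8 through 20)
```

For the example coordinates, all four intervals collapse into a single block 8-20. If the coordinates were zero-based and half-open instead, the `+ 1` would be dropped.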