I recently found that my junctions.bed file contains scaffold names that are not found in my GFF reference. How can this happen?
Code:
[upendra_35@vm142-14 tophat_out_3_7_8_lanes]$ head junctions.bed
track name=junctions description="TopHat junctions"
Scaffold006725 29 277 JUNC00000001 34 + 29 277 255,0,0 2 90,90 0,158
Scaffold006725 31 254 JUNC00000002 2 + 31 254 255,0,0 2 88,55 0,168
Scaffold007604 1 292 JUNC00000003 27 - 1 292 255,0,0 2 79,66 0,225
Scaffold007614 50 255 JUNC00000004 54 + 50 255 255,0,0 2 90,35 0,170
Scaffold006711 38 322 JUNC00000005 39 - 38 322 255,0,0 2 89,82 0,202
Scaffold007629 81 293 JUNC00000006 8 - 81 293 255,0,0 2 70,56 0,156
Scaffold006763 96 316 JUNC00000007 7 - 96 316 255,0,0 2 90,52 0,168
Scaffold007639 84 292 JUNC00000008 7 - 84 292 255,0,0 2 82,56 0,152
Scaffold007736 14 230 JUNC00000009 6 - 14 230 255,0,0 2 44,86 0,130
Code:
[upendra_35@vm142-14 tophat_out_3_7_8_lanes]$ tail /mydata/B.rapa_gene_model_0830.gff
Scaffold004047 glean CDS 11 33 . + 0 Parent=Bra041170;
Scaffold004047 glean CDS 123 321 . + 2 Parent=Bra041170;
Scaffold004813 glean mRNA 190 414 0.998901 - . ID=Bra041171;
Scaffold004813 glean CDS 190 414 . - 0 Parent=Bra041171;
Scaffold004894 glean mRNA 3 410 1 + . ID=Bra041172;
Scaffold004894 glean CDS 3 410 . + 0 Parent=Bra041172;
Scaffold005112 blat mRNA 131 295 1.0000 + . ID=Bra041173;
Scaffold005112 blat CDS 131 295 100 + . Parent=Bra041173;
Scaffold008211 glean mRNA 18 251 0.970334 + . ID=Bra041174;
Scaffold008211 glean CDS 18 251 . + 0 Parent=Bra041174;
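The head and tail listings above only show ten lines of each file, so a full comparison of the names is probably more informative. Below is a minimal sketch of how that could be done, assuming both files are whitespace-delimited with the scaffold name in the first column, that BED header lines start with "track", and that GFF comment lines start with "#". The function names are hypothetical and not part of TopHat.

```python
import sys


def first_column_names(lines):
    """Collect the distinct first-column values (scaffold names) from BED or GFF lines."""
    names = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines, GFF comments/directives, and the BED "track" header.
        if not line or line.startswith("#") or line.startswith("track "):
            continue
        names.add(line.split()[0])
    return names


def names_missing_from_gff(bed_lines, gff_lines):
    """Return the sorted scaffold names that occur in the BED lines but not in the GFF lines."""
    return sorted(first_column_names(bed_lines) - first_column_names(gff_lines))


# Assumed usage: python compare_names.py junctions.bed B.rapa_gene_model_0830.gff
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as bed, open(sys.argv[2]) as gff:
        for name in names_missing_from_gff(bed, gff):
            print(name)
```

Any name this prints is a scaffold that has at least one junction in junctions.bed but no feature at all in the GFF.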