1. Use things like "cut" and "uniq" to determine this. This isn't something you need to look up; just determine it yourself.
2. How does one define a gene? Is it a location, a sequence, or something else? If you have essentially the same sequence on different chromosomes and both copies are expressed, are they the same gene or different ones? In such cases, GENCODE/Ensembl will give each instance a unique ID, whereas UCSC will give each instance the same ID, which is a good way to completely break a LOT of programs. This is why one should normally quantify by gene ID. You can add gene names after everything is analysed.
3. UCSC annotations are rather minimalistic.
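
To make points 1 and 2 concrete, here is a rough sketch of the sort of "cut"/"uniq" check I mean. It assumes a GTF-style file where column 9 starts with `gene_id "..."; gene_name "...";`. The three-line toy file and its IDs/names are made up for illustration, so adjust the field positions to whatever your annotation actually looks like.

```shell
# Toy GTF-like file (hypothetical data): tab-separated, attributes in column 9.
gtf=$(mktemp)
printf 'chr1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_name "ABC";\n' > "$gtf"
printf 'chr1\tsrc\tgene\t2000\t2900\t.\t-\t.\tgene_id "G2"; gene_name "XYZ";\n' >> "$gtf"
printf 'chrX\tsrc\tgene\t500\t1300\t.\t+\t.\tgene_id "G3"; gene_name "ABC";\n' >> "$gtf"

# Number of distinct gene IDs: take column 9, keep the first ';'-separated field.
n_ids=$(cut -f9 "$gtf" | cut -d';' -f1 | sort | uniq | wc -l | tr -d ' ')

# Gene names that occur under more than one gene ID:
# keep unique (ID, name) pairs, drop the ID, then report names seen twice or more.
dup_names=$(cut -f9 "$gtf" | cut -d';' -f1,2 | sort | uniq \
  | cut -d';' -f2 | sort | uniq -d | cut -d'"' -f2)

echo "distinct gene IDs: $n_ids"
echo "names shared by several IDs: $dup_names"
rm -f "$gtf"
```

Any name that shows up in the second list is one where quantifying by gene name would lump separate loci together, which is the reason to stick with gene IDs until the end.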