Hi, I have a few questions about training Augustus:
1) Is etraining something new that supersedes autoAug?
2) Are the training genes expected to be essentially highly curated, manually annotated genes?
3) Following on from 2: I know there is a method for using ESTs in the training. But how will the training know where the ORF actually starts in an EST vs. what is UTR, and is an EST-trained model likely to be worse than one trained from carefully curated data?
The main reason I am asking is that I am starting to suspect that using a pre-trained model from a somewhat related organism may actually be better than making my own custom model, partly because our organism has few highly curated genes and its transcriptome assemblies are not that great.
Thanks,
Shane