I have assembled a fungal genome de novo using MIRA, which resulted in a multi-entry FASTA file of over 4000 contigs. We have so far found all the genes we have been looking for and have no plans just now to close the genome further.
My problem is visualizing the annotations. I have mostly been using Artemis so far, but it has trouble with the coordinates from BLAST searches and gene finder programs: it tends to lump all annotations into the first contig. To get the coordinates to work in Artemis I need to either concatenate the contigs into a single-entry FASTA file or do some scripting to convert the coordinates. Neither solution is ideal. Also, gene finder programs have problems with the single-entry file, as they predict genes that span the border between two contigs and thus are not real.
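For reference, the coordinate-conversion scripting mentioned above might look roughly like this. It is only a minimal sketch, assuming the annotations are in tab-separated GFF with per-contig coordinates and that the contigs are concatenated in file order. The function names, the `spacer` parameter, and the `concatenated` sequence ID are made up for illustration and do not come from Artemis or any gene finder.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def contig_lengths(fasta_lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (contig_id, length) pairs in file order from FASTA lines."""
    lengths: List[Tuple[str, int]] = []
    name = None
    size = 0
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                lengths.append((name, size))
            name = line[1:].split()[0]  # contig ID = first word of the header
            size = 0
        else:
            size += len(line)
    if name is not None:
        lengths.append((name, size))
    return lengths


def contig_offsets(lengths: List[Tuple[str, int]], spacer: int = 0) -> Dict[str, int]:
    """Map each contig ID to the 0-based offset of its first base in the
    concatenated sequence. `spacer` (hypothetical) is the number of padding
    bases, e.g. Ns, assumed to be placed between consecutive contigs."""
    offsets: Dict[str, int] = {}
    position = 0
    for name, size in lengths:
        offsets[name] = position
        position += size + spacer
    return offsets


def shift_gff(gff_lines: Iterable[str], offsets: Dict[str, int],
              new_seqid: str = "concatenated") -> Iterator[str]:
    """Rewrite per-contig GFF coordinates as concatenated coordinates."""
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            yield line  # pass comments and blank lines through unchanged
            continue
        fields = line.split("\t")
        offset = offsets[fields[0]]  # column 1 holds the contig ID
        fields[3] = str(int(fields[3]) + offset)  # start (1-based)
        fields[4] = str(int(fields[4]) + offset)  # end (1-based, inclusive)
        fields[0] = new_seqid
        yield "\t".join(fields)
```

With `spacer` greater than zero, the concatenated FASTA file would have to be padded with the same number of Ns between contigs. That padding might also discourage gene finders from predicting genes across contig borders, though I have not verified that.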
I have not managed to install GBrowse on my Mac OS X 10.6 machine (there seem to be known problems), so I have been unable to try that one. I am currently playing around with Argo, but it doesn't seem that it would work for me.
What I would like to do is continue to use the multi-entry FASTA genome file and be able to load annotations/genes onto that. That is, after all, the file I will be using as input for gene finder programs, BLAST, and so on. I like to see the structure of the genes, such as the number of exons and introns, and also to be able to manually curate the annotations when needed. Artemis has the functions I need, but it doesn't deal with coordinates the way I want it to. I cannot be the only one out there who tries to find genes in a multi-contig file and then wants to visualize the results. Right?
If anyone has a completely different strategy for me, or just wants to tell me to stop whining, I'd like to hear that too!