Old 10-05-2010, 12:59 AM   #1
Location: Uppsala, Sweden

Join Date: Apr 2010
Posts: 29
Genome browser that works well with multiple contig files

I have assembled a fungal genome de novo using MIRA, which resulted in a multi-entry FASTA file of over 4000 contigs. So far we have found all the genes we have been looking for, and we have no plans just now to close the genome further.

The thing is that I have problems visualizing the annotations. I have mostly been using Artemis so far, but it has problems with the coordinates from BLAST searches and gene finder programs: it tends to lump all annotations into the first contig. To get the coordinates to work in Artemis I need to either concatenate the contigs into a single-entry FASTA file or do some scripting to convert the coordinates. Neither solution is ideal. Also, gene finder programs have problems with the single concatenated file, as they find genes that span the border between two contigs and thus are not real.
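For illustration, the coordinate conversion described above could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: contigs are concatenated in file order with no spacer between them, coordinates are 1-based and inclusive, and the contig name is the first word of each FASTA header. All function names are hypothetical and not part of Artemis or any other tool.

```python
def contig_lengths(fasta_text):
    """Return [(name, length), ...] in file order from multi-FASTA text."""
    lengths = []
    name, size = None, 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                lengths.append((name, size))
            # Assumption: the contig name is the first word of the header.
            name, size = line[1:].split()[0], 0
        elif line and name is not None:
            size += len(line)
    if name is not None:
        lengths.append((name, size))
    return lengths


def contig_offsets(lengths):
    """Map each contig name to the number of bases that precede it."""
    offsets = {}
    total = 0
    for name, size in lengths:
        offsets[name] = total
        total += size
    return offsets


def to_concatenated(offsets, contig, start, end):
    """Convert 1-based contig-local coordinates to concatenated coordinates."""
    return offsets[contig] + start, offsets[contig] + end


def locate(lengths, pos):
    """Map a 1-based concatenated position back to (contig, local position)."""
    total = 0
    for name, size in lengths:
        if pos <= total + size:
            return name, pos - total
        total += size
    raise ValueError("position beyond end of assembly")


def spans_boundary(lengths, start, end):
    """True if a feature on the concatenated sequence crosses a contig border."""
    return locate(lengths, start)[0] != locate(lengths, end)[0]
```

A check like `spans_boundary` could be used to flag gene predictions made on the concatenated file that straddle two contigs and are therefore suspect.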

I have not managed to install GBrowse on my Mac OS X 10.6 machine (there seem to be known problems), so I have been unable to try that one. I am currently playing around with Argo, but it doesn't seem like it will work for me.

What I would like to do is continue to use the multi-entry FASTA genome file and be able to load annotations/genes onto it. That is, after all, the file I will be using as input for gene finder programs, BLAST, and so on. I would like to see the structure of the genes, such as the number of exons and introns, and also be able to manually curate the annotations when needed. Artemis has the functions I need, but it doesn't deal with coordinates the way I want it to. I cannot be the only one out there who tries to find genes in a multiple contig file and then wants to visualize the results. Right?

If anyone has a completely different strategy for me, or just wants to tell me to stop whining, I'd like to hear that too!
Hobbe
Old 10-05-2010, 01:28 AM   #2
Location: France

Join Date: Dec 2009
Posts: 41

If I remember correctly, IGV can do this, i.e. correctly assign the features in a GFF file to their sequences (multiple sequences, one GFF file).
Alternatively, you can change the coordinates in your annotation/BLAST file using a simple Perl script. I believe that is what the Artemis developers recommended.
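As a rough idea of what such a coordinate-shifting script might look like (written here in Python rather than Perl), the sketch below rewrites per-contig GFF lines so they refer to a single concatenated sequence. It assumes tab-separated GFF with start and end in columns 4 and 5, and that the number of bases preceding each contig is already known. The function name, the `offsets` table and the `merged_name` value are hypothetical.

```python
def shift_gff_line(line, offsets, merged_name="concatenated"):
    """Shift one per-contig GFF line onto a single concatenated sequence.

    offsets maps contig name -> number of bases that precede the contig
    in the concatenated FASTA (assumed to be known in advance).
    """
    # Leave comment/directive lines and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    cols = line.rstrip("\n").split("\t")
    offset = offsets[cols[0]]            # column 1: sequence (contig) name
    cols[3] = str(int(cols[3]) + offset)  # column 4: start
    cols[4] = str(int(cols[4]) + offset)  # column 5: end
    cols[0] = merged_name
    return "\t".join(cols)


# Example with made-up contig names and lengths.
offsets = {"contig1": 0, "contig2": 1500}
line = "contig2\tgenefinder\texon\t10\t200\t.\t+\t.\tID=exon1"
shifted = shift_gff_line(line, offsets)
```

Applying the function to every line of a GFF file would give annotations that line up with the concatenated FASTA; the original per-contig file is left untouched for use with other tools.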

Francois Sabot, PhD

Be realistic. Demand the Impossible.
francois.sabot
Old 10-05-2010, 05:41 AM   #3
Location: Uppsala, Sweden

Join Date: Apr 2010
Posts: 29

Thanks Francois, that is a great suggestion. I am fiddling around with it now; it just might work.

Does anyone else have a suggestion?
Hobbe
