SEQanswers

02-08-2010, 12:27 PM   #1
Wei-HD
Member
 
Location: Germany

Join Date: Oct 2009
Posts: 59
GFF3 annotation file

Hi All,

I would like to ask how to use a GFF3 annotation file with TopHat. My Bowtie index names the chromosomes "1", "2", "3", ... instead of "chr1", "chr2", "chr3", ..., so I cannot upload the junctions to UCSC, where the chromosome names have to match exactly (they are case-sensitive).

I just read the part of the TopHat manual about providing TopHat with an annotation file, but I don't know how to use such a file. So far I have simply run tophat with "--solexa1.3-quals" and got a result. Should I use the annotation file before running this command?
Can some experienced SEQers give me some hints?

I really appreciate your help.
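
For the naming mismatch itself, one possible workaround is to rewrite the chromosome column of the junctions file before uploading it. The following is a minimal sketch, assuming the junctions are in a tab-separated BED file with the chromosome name in the first column. The function name add_chr_prefix is made up for illustration, and the mapping of "MT" to "chrM" is an assumption about Ensembl versus UCSC naming; unplaced scaffolds would need their own handling.

```shell
# Hypothetical helper: make BED chromosome names look like UCSC names.
# Usage: add_chr_prefix <tab-separated BED file>
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         # Header, comment and empty lines are passed through unchanged.
         /^(track|browser|#)/ || NF == 0 { print; next }
         # Assumption: Ensembl "MT" corresponds to UCSC "chrM".
         $1 == "MT" { $1 = "chrM"; print; next }
         # Prefix every other name that does not already start with "chr".
         $1 !~ /^chr/ { $1 = "chr" $1 }
         { print }' "$1"
}
```

It could be called as `add_chr_prefix junctions.bed > junctions.ucsc.bed`, where both file names are placeholders.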
02-08-2010, 12:55 PM   #2
shurjo
Senior Member
 
Location: Rockville, MD

Join Date: Jan 2009
Posts: 126

This depends on how you want to treat your data. Giving TopHat the annotation file will force it to look for the junctions contained therein, even if it would not have considered them otherwise. There is a gtf2gff3 script available online (google the term) that you can use to make a GFF3 file for hg18 from the hg18 knownGenes table (which is downloadable in GTF format).

HTH,

Shurjo
02-08-2010, 01:05 PM   #3
Wei-HD
Member
 
Location: Germany

Join Date: Oct 2009
Posts: 59

Hi shurjo,

Thanks for your reply. I already have the mouse GFF3 file, Mus_musculus.NCBIM37.56.gff3, but I still have no clue when I should use it: before or after running TopHat? Sorry, I am a bit confused.

Many thanks!
02-08-2010, 01:13 PM   #4
svl
Member
 
Location: Netherlands

Join Date: Sep 2009
Posts: 43

I am not sure what exactly you want, but if you:

1) want to use a GFF file to find out about gene-expression, then tophat since version 1.0.12 says: "TopHat no longer calculates gene expression. Users interested in expression calculations should consider using Cufflinks for gene- and isoform-level expression calculations."

or

2) want to provide your own junctions, then search the manual for "Supplying your own junctions" and you'll see the "-G/--GFF <GFF3 file>" flag explained

svl
02-08-2010, 01:15 PM   #5
shurjo
Senior Member
 
Location: Rockville, MD

Join Date: Jan 2009
Posts: 126

Neither before nor after, but during the TopHat run :-). Pass it to TopHat with the -G option, like so:

tophat --mate-inner-dist 240 --mate-std-dev 25 ~/bin/bowtie/bowtie-0.12.1/indexes/hg18_inclusive 108971.read1.fa 108971.read2.fa -m 2 -p 4 -G /home/sensh/pipeline_test/GFF3/UCSC_knowngenes_hg18_tweaked.gff3
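
A sanity check that may be worth doing before a -G run: as I understand the TopHat manual, the names in the first column of the annotation file have to match the reference names in the Bowtie index. The sketch below lists the distinct names used in a GFF3 file; the function name gff3_seqids is made up for illustration.

```shell
# Hypothetical helper: list the distinct sequence names (column 1) of a GFF3 file.
# Usage: gff3_seqids <GFF3 file>
gff3_seqids() {
    # Skip comment/directive lines and any line without tab-separated fields
    # (for example sequence lines in a trailing ##FASTA section).
    awk -F '\t' '!/^#/ && NF > 1 { print $1 }' "$1" | sort -u
}
```

The output can then be compared by eye with the names stored in the index, which `bowtie-inspect -n <index_base>` should print.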
02-08-2010, 01:26 PM   #6
Wei-HD
Member
 
Location: Germany

Join Date: Oct 2009
Posts: 59

Thanks Shurjo and svl!

I just want to provide my own junctions, so I think I should write the following (the data file bic.txt, the index files, and the GFF3 file are all in the same folder):

tophat --solexa1.3-quals Mus_musculus.NCBIM37.56 bic.txt -G mus_musculus.NCBIM37.56.gff3

But I got an error: Error: you must set the mean inner distance between mates with -r
My data is not paired-end, though.

Thanks in advance!
02-08-2010, 01:38 PM   #7
svl
Member
 
Location: Netherlands

Join Date: Sep 2009
Posts: 43

Quote:
Originally Posted by Wei-HD
tophat --solexa1.3-quals Mus_musculus.NCBIM37.56 bic.txt -G mus_musculus.NCBIM37.56.gff3

Maybe you have to put all options before the index base and the reads. The manual says:

Usage: tophat [options]* <index_base> <reads1_1[,...,readsN_1]> [reads1_2,...readsN_2]
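
Applied to the quoted command, that ordering would look roughly like the sketch below. It only prints the command string, so the order can be checked before anything is run. The function name build_tophat_cmd is hypothetical, the file names are the ones typed in the quoted post, and whether the reordering makes the -r error go away is not confirmed in this thread.

```shell
# Hypothetical helper: print a single-end TopHat call with all options
# placed before the positional arguments, as in the usage line.
# Usage: build_tophat_cmd <GFF3 file> <index_base> <reads file>
# Note: the arguments are printed unquoted, so paths with spaces are not handled.
build_tophat_cmd() {
    printf 'tophat --solexa1.3-quals -G %s %s %s\n' "$1" "$2" "$3"
}

# File names copied as typed in the quoted command.
build_tophat_cmd mus_musculus.NCBIM37.56.gff3 Mus_musculus.NCBIM37.56 bic.txt
```

If the printed line looks right, it can be pasted into the shell to run it.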