SEQanswers

08-23-2012, 12:02 PM   #1
Aholton
Member
Location: Mississippi, US
Join Date: Jul 2012
Posts: 23

TopHat building Bowtie index from GTF file

Not really a problem so much as an inquiry. When I run TopHat, it tells me that it is building a Bowtie index from the .gtf file I supplied. However, I already supplied a Bowtie2 index in the command.

I was wondering whether I am doing something wrong, or whether it needs both types of index?
08-23-2012, 12:38 PM   #2
Aholton
Member
Location: Mississippi, US
Join Date: Jul 2012
Posts: 23

OK, I just checked, and my Cuffdiff output file gene_exp.diff shows the same value in each of the first three columns instead of showing me the gene ID and gene name, so I'm assuming something is definitely wrong in my process.

I got the 22 chromosome files (including X and Y) from NCBI, and the genes.gtf file from them as well. I built a Bowtie2 index from the 22 chromosome files with X and Y. I checked that the chromosome files and the GTF have the same first-column identifiers.

I ran TopHat on the 2 samples (and their replicates), then took their accepted_hits.bam files and used Cuffdiff to find the differences. I tried many different ways but can't get the gene_exp.diff file to have distinct information in the first three columns.

tl;dr: Ran TopHat, then Cuffdiff; the first 3 columns of gene_exp.diff are all the same, which isn't supposed to happen. Help!
08-30-2012, 11:41 AM   #3
Aholton
Member
Location: Mississippi, US
Join Date: Jul 2012
Posts: 23

Bump, still having the recurring problem.
08-30-2012, 01:49 PM   #4
kmcarr
Senior Member
Location: USA, Midwest
Join Date: May 2008
Posts: 1,177

Quote:
Originally Posted by Aholton
Not really a problem so much as an inquiry. When I run TopHat, it tells me that it is building a Bowtie index from the .gtf file I supplied. However, I already supplied a Bowtie2 index in the command.

I was wondering whether I am doing something wrong, or whether it needs both types of index?
If you supply TopHat with a GTF annotation file, it will first extract transcript sequences from your genome based on the annotations and build a Bowtie index of the transcript FASTA file. I assume the index you supplied to TopHat is for the genome sequence. When using -G/--GTF, TopHat will first attempt to align reads directly to the transcripts, then align the remaining unaligned reads to the genome.

If you are planning on aligning several data sets to the same genome/annotation, it would be a waste of time for TopHat to rebuild the transcript index every time. For this reason TopHat also has the --transcriptome-index option. The first time you run TopHat, supply it together with the -G option; it tells TopHat where to store the index it builds. In subsequent runs you can omit the -G option and use the --transcriptome-index parameter alone to tell TopHat where to find the prebuilt transcript index. Check out the "Supplying your own transcript annotation data:" section in the TopHat Manual.
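As a rough sketch of the two-step usage described above, the snippet below only assembles the TopHat argument lists in Python; it does not run anything. The file names (genes.gtf, genome_index, sample1.fq, sample2.fq), the index prefix transcriptome_data/known and the helper tophat_args are hypothetical placeholders, and the flag spellings should be checked against the manual for your TopHat version.

```python
def tophat_args(genome_index, reads, out_dir, transcriptome_prefix, gtf=None):
    """Assemble a TopHat command line as a list of arguments (nothing is run).

    Pass gtf on the first run so TopHat builds the transcript index at
    transcriptome_prefix; leave it out on later runs to reuse that index.
    """
    args = ["tophat", "-o", out_dir]
    if gtf is not None:
        args += ["-G", gtf]
    args.append("--transcriptome-index=" + transcriptome_prefix)
    args += [genome_index] + list(reads)
    return args


# All paths below are hypothetical placeholders.
first_run = tophat_args("genome_index", ["sample1.fq"], "out_sample1",
                        "transcriptome_data/known", gtf="genes.gtf")
later_run = tophat_args("genome_index", ["sample2.fq"], "out_sample2",
                        "transcriptome_data/known")

print(" ".join(first_run))
print(" ".join(later_run))
```

The first command carries both -G and --transcriptome-index; the second carries only --transcriptome-index, so the transcript index built during the first run is reused.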
08-30-2012, 02:12 PM   #5
kmcarr
Senior Member
Location: USA, Midwest
Join Date: May 2008
Posts: 1,177

Quote:
Originally Posted by Aholton
OK, I just checked, and my Cuffdiff output file gene_exp.diff shows the same value in each of the first three columns instead of showing me the gene ID and gene name, so I'm assuming something is definitely wrong in my process.

I got the 22 chromosome files (including X and Y) from NCBI, and the genes.gtf file from them as well. I built a Bowtie2 index from the 22 chromosome files with X and Y. I checked that the chromosome files and the GTF have the same first-column identifiers.

I ran TopHat on the 2 samples (and their replicates), then took their accepted_hits.bam files and used Cuffdiff to find the differences. I tried many different ways but can't get the gene_exp.diff file to have distinct information in the first three columns.

tl;dr: Ran TopHat, then Cuffdiff; the first 3 columns of gene_exp.diff are all the same, which isn't supposed to happen. Help!
This is expected, depending on the annotation supplied.

First of all, there is a slight error in the Cuffdiff manual's description of the gene_exp.diff format: it says there are 13 columns, the first three being Tested id, gene, locus. The current output actually has 14 columns, with a 'gene_id' column added between test_id and gene. Here is the header and two lines from a recent output of mine (note that the headers don't line up directly over the corresponding data columns because of text formatting):

Code:
test_id	gene_id	gene	locus	sample_1	sample_2	status	value_1	value_2	log2(fold_change)	test_stat	p_value	q_value	significant
AT1G01080	AT1G01080	AT1G01080	1:45295-47019	fae1_7-8	fae1_9-10	OK	23.3894	15.1937	-0.622382	1.24939	0.211521	0.706688	no
AT1G01090	AT1G01090	PDH-E1 ALPHA	1:47484-49286	fae1_7-8	fae1_9-10	OK	609.513	569.592	-0.0977292	0.153759	0.8778	0.999999	no
You can see that in the first line the test_id, gene_id and gene are all the same, whereas in the second there is a common name in the gene column. This depends entirely on what information is present in your annotation (GTF) file for Cufflinks/Cuffdiff to parse out.
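To see which rows are affected, a short Python sketch along these lines can flag entries whose gene column merely repeats the gene_id. It assumes the 14-column tab-separated layout shown above; the function name rows_without_gene_name is made up for illustration, and the sample data is the header plus the two rows copied from the output above.

```python
import csv
import io


def rows_without_gene_name(handle):
    """Yield the test_id of each row whose 'gene' column just repeats 'gene_id'."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row["gene"] == row["gene_id"]:
            yield row["test_id"]


# Header and the two data rows from the output above (tab-separated).
HEADER = ("test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\t"
          "value_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\t"
          "q_value\tsignificant")
ROW_1 = ("AT1G01080\tAT1G01080\tAT1G01080\t1:45295-47019\tfae1_7-8\t"
         "fae1_9-10\tOK\t23.3894\t15.1937\t-0.622382\t1.24939\t0.211521\t"
         "0.706688\tno")
ROW_2 = ("AT1G01090\tAT1G01090\tPDH-E1 ALPHA\t1:47484-49286\tfae1_7-8\t"
         "fae1_9-10\tOK\t609.513\t569.592\t-0.0977292\t0.153759\t0.8778\t"
         "0.999999\tno")

sample = "\n".join([HEADER, ROW_1, ROW_2]) + "\n"
flagged = list(rows_without_gene_name(io.StringIO(sample)))
print(flagged)  # ['AT1G01080']
```

For a real run, the same function can be given an open file handle, for example open("gene_exp.diff", newline=""), in place of the in-memory sample.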
08-31-2012, 12:18 PM   #6
Aholton
Member
Location: Mississippi, US
Join Date: Jul 2012
Posts: 23

OK, thank you so much! That makes much more sense now.
Tags
building bowtie2 index, tophat

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 07:30 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO