SEQanswers

06-30-2015, 03:38 AM   #1
imsharmanitin
Postdoc Cancer Bioinformatics
Location: Oslo, Norway
Join Date: Dec 2014
Posts: 17
transcriptome_index Bowtie2 Index

Dear all,

As far as I understand, TopHat2 creates a transcriptome_index, which also includes Bowtie2 index files, using the GTF and the genome file. Why, then, do we need to supply a Bowtie2 index to create the transcriptome index for the first time?

I tried to create the transcriptome index without supplying Bowtie2 index files, and it gave the following error.

tophat2 -G ../data/Homo_sapiens.GRCh37.75.gtf --transcriptome-index ./transcriptome_index/ ../data/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa

[2015-06-30 12:25:22] Building transcriptome files with TopHat v2.0.13
-----------------------------------------------
[2015-06-30 12:25:22] Checking for Bowtie
Bowtie version: 2.2.4.0
[2015-06-30 12:25:24] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (../data/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.*.bt2)



However, after I created the Bowtie2 index files and ran the same command again, no error was produced.
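
For illustration, the check that fails in the log above can be approximated by hand. This is only a sketch: the function name has_bt2_index is made up for this example, and it assumes a "small" Bowtie2 index, which consists of six files ending in .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2 and .rev.2.bt2 (large indexes use the .bt2l extension and are not covered here).

```shell
#!/bin/sh
# Return 0 if all six small-index files exist for the given Bowtie2 basename.
# has_bt2_index is a hypothetical helper, not part of Bowtie2 or TopHat.
has_bt2_index() {
    base=$1
    for suffix in 1 2 3 4 rev.1 rev.2; do
        [ -f "$base.$suffix.bt2" ] || return 1
    done
    return 0
}

# Basename taken from the error message above (TopHat appended ".*.bt2" to it).
BASE="../data/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa"

if has_bt2_index "$BASE"; then
    echo "Bowtie2 index found for $BASE"
else
    echo "No Bowtie2 index for $BASE; build one with: bowtie2-build <genome.fa> $BASE"
fi
```

The basename passed to TopHat has to match the output basename given to bowtie2-build, which is why the error names the .fa path followed by .*.bt2.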
06-30-2015, 03:46 AM   #2
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,975

Tophat2 uses the entire human genome index files and then creates a "transcriptome-only" subset using information in the GTF file. You only need to do this once, as you discovered.
06-30-2015, 05:07 AM   #3
imsharmanitin
Postdoc Cancer Bioinformatics
Location: Oslo, Norway
Join Date: Dec 2014
Posts: 17

Quote:
Originally Posted by GenoMax
Tophat2 uses the entire human genome index files and then creates a "transcriptome-only" subset using information in the GTF file. You only need to do this once, as you discovered.

So this will be the workflow:

1) Generate the whole-genome index using bowtie2 (built from the FASTA file; the GTF file is used in step 2)
Homo_sapiens.GRCh37.75.dna.primary_assembly.fa
Homo_sapiens.GRCh37.75.gtf


2) Then use TopHat2 to create a "transcriptome-only" subset using the GTF file and the whole-genome index files created in step 1

3) Point TopHat2 to the directory containing the "transcriptome-only" subset using

--transcriptome-index <directory containing "transcriptome-only" subset>

for subsequent runs
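
The three steps above can be sketched as commands. This is a minimal sketch, not a tested pipeline: the reads file, the output directory, and the transcriptome_index/known prefix are placeholders chosen for this example (the TopHat manual's own example passes a directory plus basename prefix to --transcriptome-index), and the run helper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Input files from the thread; everything marked "hypothetical" is a placeholder.
GENOME_FA="../data/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa"
GTF="../data/Homo_sapiens.GRCh37.75.gtf"
BT2_BASE="../data/Homo_sapiens.GRCh37.75.dna.primary_assembly"  # index basename, no extension
TX_INDEX="./transcriptome_index/known"                           # hypothetical prefix
READS="reads.fastq"                                              # hypothetical
OUT_DIR="tophat_out"                                             # hypothetical

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

workflow() {
    # 1) Whole-genome Bowtie2 index, built once from the FASTA.
    run bowtie2-build "$GENOME_FA" "$BT2_BASE"
    # 2) Transcriptome index, built once from the GTF plus the genome index.
    run tophat2 -G "$GTF" --transcriptome-index="$TX_INDEX" "$BT2_BASE"
    # 3) Subsequent alignment runs reuse the transcriptome index.
    run tophat2 -o "$OUT_DIR" --transcriptome-index="$TX_INDEX" "$BT2_BASE" "$READS"
}

workflow
```

With the default DRY_RUN=1 the script prints three command lines, which makes it easy to check the paths before running anything expensive.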
06-30-2015, 05:11 AM   #4
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,975

That is correct. When you are using the --transcriptome-index option, consider additional options (e.g. -T/--transcriptome-only) that become relevant (scroll down to find the --transcriptome-index option section): https://ccb.jhu.edu/software/tophat/manual.shtml#toph.
06-30-2015, 04:43 PM   #5
imsharmanitin
Postdoc Cancer Bioinformatics
Location: Oslo, Norway
Join Date: Dec 2014
Posts: 17

Quote:
Originally Posted by GenoMax
That is correct. When you are using the --transcriptome-index option, consider additional options (e.g. -T/--transcriptome-only) that become relevant (scroll down to find the --transcriptome-index option section): https://ccb.jhu.edu/software/tophat/manual.shtml#toph.

If I understood correctly, when we use --transcriptome-index without -T, the reads are first mapped to the transcriptome index, and the reads that fail to map there are then mapped to the genome (just like using only the -G option).

On the other hand, if I use -T, the reads are mapped only to the transcriptome, and only those mappings are reported, as genomic mappings.

If I want to map to the genome only, then both -G and -T should be avoided.

Also, as far as I have read and understood, mapping to both the transcriptome and the genome is the ideal approach, i.e. -T should be avoided. Am I right?
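
To make the three cases concrete, here is a small sketch that prints one command line per mode. The mode names, the tophat_cmd helper, and all paths are made up for this example; the sketch also assumes the transcriptome index has already been built, so -G is left out of the later runs.

```shell
#!/bin/sh
BT2_BASE="genome"                       # hypothetical Bowtie2 index basename
TX_INDEX="transcriptome_index/known"    # hypothetical transcriptome index prefix
READS="reads.fastq"                     # hypothetical reads file

# Print the tophat2 command line for one of three mapping modes.
# tophat_cmd and the mode names are illustrative, not TopHat terminology.
tophat_cmd() {
    case "$1" in
        genome-only)
            # No -G, no --transcriptome-index, no -T: genome mapping only.
            echo "tophat2 $BT2_BASE $READS" ;;
        transcriptome-first)
            # Transcriptome first; reads that fail there go to the genome.
            echo "tophat2 --transcriptome-index=$TX_INDEX $BT2_BASE $READS" ;;
        transcriptome-only)
            # -T: align to the transcriptome only, reported as genomic mappings.
            echo "tophat2 -T --transcriptome-index=$TX_INDEX $BT2_BASE $READS" ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}

for mode in genome-only transcriptome-first transcriptome-only; do
    printf '%-20s %s\n' "$mode" "$(tophat_cmd "$mode")"
done
```

Nothing is executed here; the point is only that the choice between the modes comes down to whether --transcriptome-index (or -G) and -T appear on the command line.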

Tags
bioinformatics, bowtie 2 indexes, rna-seq, tophat 2, transcriptome index

All times are GMT -8.