SEQanswers

09-18-2014, 01:20 AM   #1
LeonDK
Member
Location: Denmark
Join Date: Sep 2014
Posts: 69

TopHat2 on multiple samples, avoid building Bowtie index from genes.fa each time?

Hi all,

Ultimately aiming at differential expression, I'm mapping human RNA-seq reads using tophat2 with the following command:
Code:
tophat2 --num-threads 12 --GTF /Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf /Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome myfastq_R1.fastq.gz myfastq_R2.fastq.gz
Is it really necessary to:
Code:
[2014-09-18 10:38:45] Building transcriptome data files /tmp/genes
[2014-09-18 10:39:21] Building Bowtie index from genes.fa
for each sample? After all, the different samples are all mapped using the same Bowtie2Index/genome files and the same Genes/genes.gtf file.

Cheers,
Leon
09-18-2014, 01:33 AM   #2
dpryan
Devon Ryan
Location: Freiburg, Germany
Join Date: Jul 2011
Posts: 3,480

Have a look at the --transcriptome-index option, which is what you're looking for.
09-18-2014, 02:41 AM   #3
LeonDK
Member
Location: Denmark
Join Date: Sep 2014
Posts: 69

Quote:
Originally Posted by dpryan View Post
Have a look at the --transcriptome-index option, which is what you're looking for.
Hi dpryan,

Thanks for the input regarding the --transcriptome-index option for tophat2. I looked it up in the TopHat2 manual. For other users who may encounter the same challenge: the trick is to first run tophat2 once without any reads, which builds the transcriptome index:
Code:
tophat2 -G iGenomes/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf --transcriptome-index=transcriptome_data/known iGenomes/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome
and then call tophat2 for each sample with this command:
Code:
tophat2 --num-threads 12 --transcriptome-index=transcriptome_data/known iGenomes/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome myfastq_R1.fastq.gz myfastq_R2.fastq.gz
When running the above command, you'll see:
Code:
[2014-09-18 12:12:04] Using pre-built transcriptome data..
This is significantly faster when running multiple samples, since the transcriptome index is no longer rebuilt for each one.
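
For several samples, the two steps above can be wrapped in a small script. This is only a sketch under assumptions: the sample prefixes (sampleA, sampleB, sampleC), the `tophat_out_<sample>` output directories, and the `DRY_RUN` switch are made up for illustration, and paired files are assumed to be named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`. By default it only prints the tophat2 commands; set `DRY_RUN=0` to actually run them.
Code:
```shell
#!/bin/sh
# Sketch: build the transcriptome index once, then reuse it for every sample.
set -e

GTF="iGenomes/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf"
GENOME="iGenomes/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome"
TX_INDEX="transcriptome_data/known"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_index() {
    # One-off step: no reads are given, only the GTF and the genome index.
    run tophat2 -G "$GTF" --transcriptome-index="$TX_INDEX" "$GENOME"
}

map_sample() {
    # Per-sample step: reuses the pre-built transcriptome index.
    # A separate --output-dir per sample keeps results from overwriting
    # each other (the directory naming is an assumption).
    run tophat2 --num-threads 12 --output-dir "tophat_out_$1" \
        --transcriptome-index="$TX_INDEX" "$GENOME" \
        "$1_R1.fastq.gz" "$1_R2.fastq.gz"
}

build_index
for s in sampleA sampleB sampleC; do   # hypothetical sample prefixes
    map_sample "$s"
done
```
Giving each sample its own output directory matters here because tophat2 otherwise writes every run to the same default output directory.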

The UCSC/hg19 data can be retrieved like so:
Code:
wget ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/UCSC/hg19/Homo_sapiens_UCSC_hg19.tar.gz
Cheers,
Leon
08-11-2015, 05:48 AM   #4
konika
Member
Location: Norway
Join Date: Sep 2010
Posts: 14

tophat not creating transcriptome indexes

Hi,
In my case the following command doesn't start tophat2. tophat2 just shows me the available options, as if I had used a wrong option somewhere. Does anyone have an idea what's wrong here?
The command I use:
Code:
tophat2 -G /home/chawla/rna_seq_pipeline/gff/mouse_ensembl.gff --transcriptome-index=tdata /home/chawla/rna_seq_pipeline/gff/mouse_ensembl
08-11-2015, 06:14 AM   #5
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 7,077

Quote:
Originally Posted by konika View Post
Hi,
In my case the following command doesn't start tophat2. tophat2 just shows me the available options, as if I had used a wrong option somewhere. Does anyone have an idea what's wrong here?
The command I use:
Code:
tophat2 -G /home/chawla/rna_seq_pipeline/gff/mouse_ensembl.gff --transcriptome-index=tdata /home/chawla/rna_seq_pipeline/gff/mouse_ensembl
You have to point tophat2 to the Bowtie2 index of the full genome. It appears that the last argument of your command is the path of a GFF file (minus its extension) rather than the basename of a Bowtie2 index. Refer to LeonDK's example in the posts above.
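
A quick way to check whether a path is usable as the last argument is to look for the index files next to it. Bowtie2 index files are named `<basename>.1.bt2`, `<basename>.2.bt2` and so on (`.bt2l` for large indexes). The helper names below are made up for this sketch, and it only tests for the first index file rather than the full set:
Code:
```shell
#!/bin/sh
# Sketch: does BASE look like a Bowtie2 index basename?
# Only BASE.1.bt2 (or BASE.1.bt2l for a large index) is checked.
is_bt2_index() {
    [ -f "$1.1.bt2" ] || [ -f "$1.1.bt2l" ]
}

check_index() {
    if is_bt2_index "$1"; then
        echo "OK: $1 is a Bowtie2 index basename"
    else
        echo "NOT an index: $1 (expected $1.1.bt2)"
        return 1
    fi
}

# Usage (path taken from the post above):
#   check_index /home/chawla/rna_seq_pipeline/gff/mouse_ensembl
```
If the check fails, the index either lives elsewhere or still has to be built from the genome FASTA with bowtie2-build.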
08-11-2015, 06:37 AM   #6
konika
Member
Location: Norway
Join Date: Sep 2010
Posts: 14

Thanks, the cause was actually an old version of tophat, which also needs input reads in order to create the transcriptome index.
All times are GMT -8.