SEQanswers

11-20-2015, 01:12 PM   #1
fk566938
Junior Member
 
Location: Northeastern USA

Join Date: Nov 2015
Posts: 9
TopHat2 command

Hello, I am not a bioinformatician; I am more of a bench scientist. I am trying to align the Illumina BodyMap 2.0 RNA-seq files from ENA to the hg19 .fa file. I cannot find the command to do this with TopHat anywhere. Is it similar to the bowtie2 command? I just started using Linux to get this data and I really need help.
11-20-2015, 01:51 PM   #2
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

First, you want to be sure you have a powerful enough computer, and enough disk space.

Second, you want to build a Bowtie 2 index for the hg19.fa file, or download a prebuilt one, for example from the iGenomes.

Third, you'll want to download the annotation file, as a GTF file, from UCSC or the iGenomes.

Fourth, you need to understand the Linux command line.
To see the arguments TopHat accepts, run the following command:

Quote:
tophat --help
Here is a sample command for TopHat. It uses mouse reference files, so substitute your own index and GTF file.
Quote:
tophat \
--library-type fr-firststrand \
-G Mus_musculus.GRCm38.77.gtf \
-o results/tophat/sample_1 \
Bowtie2Index/Mus_musculus.GRCm38.dna.primary_assembly \
sample_1_R1.fastq.gz \
sample_1_R2.fastq.gz
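For the hg19 case in the original question, the same steps could be sketched roughly as below. The FASTQ file names, the GTF name, the output directory, and the thread count are placeholders, not anything given in this thread, and the `run` helper with its `DRY_RUN` switch is only there so the commands can be printed and checked before anything is launched. No `--library-type` is set, so TopHat's default (fr-unstranded) would apply; whether that suits the BodyMap libraries is an assumption to check against the data description.

```shell
#!/bin/sh
# Sketch only: every file name below is a hypothetical placeholder.
GENOME_FA="hg19.fa"                     # reference FASTA named in the question
INDEX_PREFIX="hg19"                     # basename for the Bowtie 2 index files
GTF="hg19_genes.gtf"                    # hypothetical annotation file (UCSC or iGenomes)
READS_1="bodymap_tissue_1.fastq.gz"     # hypothetical mate 1 FASTQ from ENA
READS_2="bodymap_tissue_2.fastq.gz"     # hypothetical mate 2 FASTQ from ENA
OUT_DIR="results/tophat/bodymap_tissue" # hypothetical output directory
THREADS=4                               # assumed; adjust to the machine
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands, 0 = execute them

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: build the Bowtie 2 index from the genome FASTA (slow for a human genome).
run bowtie2-build "$GENOME_FA" "$INDEX_PREFIX"

# Step 2: align the paired-end reads, using the GTF as a guide for known junctions.
run tophat -p "$THREADS" -G "$GTF" -o "$OUT_DIR" "$INDEX_PREFIX" "$READS_1" "$READS_2"
```

Run as is, this only prints the two commands; setting `DRY_RUN=0` executes them once `bowtie2-build` and `tophat` are on the PATH.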
There is a classic article in Nature Protocols for beginners.
They make it sound a bit too simple, though, and don't cover any of the subtleties of RNA-Seq.
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks
http://www.nature.com/nprot/journal/....2012.016.html

Last edited by blancha; 11-21-2015 at 01:22 PM.
11-21-2015, 03:57 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,784

@fk566938: If you want to avoid the time-consuming step of aligning the files (it sounds like you are a new user and may not have adequate software/hardware available locally), you could use the aligned BAM files that Ensembl made available for this data.

The original announcement was here: http://www.ensembl.info/blog/2011/05...from-illumina/
The BAM files first appeared in Ensembl release 70 (and were available until at least Ensembl release 78): ftp://ftp.ensembl.org/pub/release-78...ens/genebuild/

If you just want to query/browse specific genes for tissues you can do that here: https://www.ebi.ac.uk/gxa/experiments/E-MTAB-513

This link has a tissue-specific gene expression analysis of the BodyMap data: http://www.cureffi.org/2013/07/11/ti...n-bodymap-2-0/
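If you do download one of these BAM files, it can be worth reading its header before doing anything else, since the header records the sort order and the reference sequences the reads were aligned to. A minimal sketch, assuming samtools is installed; the BAM file name and the `bam_sort_order` helper are made up for illustration:

```shell
#!/bin/sh
# Sketch only: BAM is a hypothetical file name, bam_sort_order a hypothetical helper.
BAM="bodymap_tissue.bam"

# Read SAM header text on stdin and print the SO: (sort order) value
# from the @HD line, or "unknown" if the header does not declare one.
bam_sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
        }
        END { if (!found) print "unknown" }
    '
}

# "samtools view -H" prints only the header; run it if the tool and file exist.
if command -v samtools >/dev/null 2>&1 && [ -f "$BAM" ]; then
    samtools view -H "$BAM" | bam_sort_order
fi
```

A value of `coordinate` means the file is sorted by position. The `@SQ` lines of the same header list every reference sequence in the alignment, which is one way to see what was included beyond the chromosomes.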
11-21-2015, 06:34 AM   #4
fk566938
Junior Member
 
Location: Northeastern USA

Join Date: Nov 2015
Posts: 9

Thank you, I've been looking for a way to get already-aligned BAM files. I have access to the university's Linux cluster, but I don't have the time anymore to map. My PI wants this data ASAP. I've tried using these expression sites to find the expression of my gene, but I am looking for a specific isoform and there isn't a lot of information on it in the databases. Thank you again for the link to the already-aligned BAM files. Would you happen to know if these alignments are already "sorted"? Also, they say they aligned them to the genome using the Epstein-Barr virus as a decoy; do you know what that means?

Last edited by fk566938; 11-21-2015 at 06:39 AM.
11-21-2015, 10:50 AM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by fk566938
I have access to the university's linux cluster but I don't have the time anymore to map. My PI wants this data asap.
Your PI is underestimating the amount of training and effort it takes to get correct and useful results using bioinformatics. It's not just something you can toss in at the end on a whim.
Tags
illumina, rnaseq, rnaseq alignment, tophat, tophat 2
