SEQanswers

01-23-2012, 10:23 AM   #1
Jon_Keats
Senior Member
Location: Phoenix, AZ
Join Date: Mar 2010
Posts: 279
Tophat v1.4.0 (Transcriptome Alignment)

Not sure how many people have tried this new version. I'm a big fan of the transcriptome alignment phase it includes (i.e., align to the transcriptome, then to the genome, then split the remaining reads and attempt to align the pieces to the genome), as most mRNA-seq reads (60+ percent) will align to known transcriptome references using bwa or your favorite aligner in a very short amount of time. So that is the good news...

The bad news, at least on my first test run: the sample I tested had 71% alignment using Tophat v1.3.2, and with the new version, which I expected to increase the percent alignment, it dropped to 62%. Oddly, 62% is almost exactly the alignment frequency (63%) I see when I align to a transcriptome reference using bwa. Has anyone else seen the percent aligned decrease with this version?
02-20-2012, 05:05 AM   #2
rnaseek
Member
Location: USA
Join Date: Nov 2011
Posts: 22

Hi

I am wondering whether you can share the tophat command that worked for transcriptome-only alignment using the "-T" option. I am trying to use TopHat 1.4.0 with the -T option, but could not make it work. I think I am not using the "-T" option correctly, as I get the error "IOError: [Errno 2] No such file or directory: '-T'".

Thanks
02-20-2012, 05:29 AM   #3
westerman
Rick Westerman
Location: Purdue University, Indiana, USA
Join Date: Jun 2008
Posts: 1,104

Quote:
Originally Posted by rnaseek
I think I am not using the "-T" option correctly, as I get the error "IOError: [Errno 2] No such file or directory: '-T'".

Posting your unsuccessful command line would be useful, since the rest of us could then debug it for you. I suspect that you are using an option that requires an additional argument (e.g., the --GTF option) and that option is 'swallowing up' the '-T' as its argument.
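
A minimal sketch of that 'swallowing' behaviour, using Python's standard getopt module. It illustrates option parsing in general and is not TopHat's actual parser; the option letters mirror this thread, but the file names (genes.gtf, hg19_index, reads.fastq) are hypothetical.

```python
import getopt

def parse(argv):
    """Parse argv where -G takes a value and -T is a plain flag."""
    # "G:" means -G requires an argument; "T" means -T takes none.
    opts, positional = getopt.getopt(argv, "G:T")
    return opts, positional

# -G is followed by its value, so -T is recognised as a flag.
good = parse(["-T", "-G", "genes.gtf", "hg19_index", "reads.fastq"])

# -G is missing its value, so it consumes the next token ("-T").
# The GTF path then falls through as a positional argument.
bad = parse(["-G", "-T", "genes.gtf", "hg19_index", "reads.fastq"])

print(good)
print(bad)
```

In the second call the parser never sees -T as an option at all, which is the kind of situation in which a tool might later try to open '-T' as if it were a file.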
02-20-2012, 05:44 AM   #4
rnaseek
Member
Location: USA
Join Date: Nov 2011
Posts: 22

Sorry about that, and I think you are right. Here is the command that I used:
tophat -p 8 -o top_out -T -G <path2GTF> <path2BowTieIndex> --transcriptome-index=transcriptome_data/known <path2FASTQfile>

Thanks
02-20-2012, 05:49 AM   #5
pzumbo
Member
Location: NY
Join Date: Mar 2009
Posts: 11

I observe a better alignment rate with tophat 1.4.1 than with 1.3.1.

I ran tophat-1.3.1 and tophat-1.4.1 on the same read set with identical options.

tophat-1.3.1
~/bin/tophat-1.3.1/tophat -g 1 --segment-length 25 --segment-mismatch 2 -G /home/paz2005/bin/ppbs/references/hg19/annotation/ref/gtf/refFlat.gtf -o /tmp/pz/test/tophat131 hg19 /tmp/pz/test/test.fastq.gz &

samtools flagstat ./tophat131/accepted_hits.bam
2504664 + 0 mapped (100.00%:nan%)


tophat-1.4.1
~/bin/tophat-1.4.1/tophat -g 1 --segment-length 25 --segment-mismatch 2 -G /home/paz2005/bin/ppbs/references/hg19/annotation/ref/gtf/refFlat.gtf -o /tmp/pz/test/tophat141 hg19 /tmp/pz/test/test.fastq.gz &

samtools flagstat ./tophat141/accepted_hits.bam
2873734 + 0 mapped (100.00%:nan%)


So, tophat-1.4.1 managed to map 369,070 more reads than tophat-1.3.1.
02-20-2012, 12:21 PM   #6
Jon_Keats
Senior Member
Location: Phoenix, AZ
Join Date: Mar 2010
Posts: 279

For transcriptome-only alignment you should upgrade to Tophat 1.4.1, as the new release fixes a bug affecting this mode.

Pzumbo,

Be careful with samtools flagstat, as you don't know how many of the 369,070 extra alignments are unique alignment events. If a read maps to two regions, tophat reports those two alignments as two independent events. Picard alignment metrics will give you the unique alignment events, as opposed to the total alignment events reported by samtools.
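
To illustrate the difference between alignment records and aligned reads, here is a minimal Python sketch that counts both from SAM text. It assumes the standard SAM columns (read name in column 1, bitwise FLAG in column 2); the function name and the sample records are made up for illustration, and it is not a replacement for Picard's metrics.

```python
def count_alignments_and_reads(sam_lines):
    """Return (mapped alignment records, distinct mapped reads)."""
    n_alignments = 0
    reads = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        n_alignments += 1
        # Mates of a pair share a name; tell them apart by the FLAG bits.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        reads.add((qname, mate))
    return n_alignments, len(reads)

# Hypothetical records: read_a is reported at two loci, read_b at one.
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read_a\t0\tchr1\t100\t1\t50M\t*\t0\t0\t*\t*",
    "read_a\t256\tchr2\t500\t1\t50M\t*\t0\t0\t*\t*",
    "read_b\t16\tchr1\t900\t255\t50M\t*\t0\t0\t*\t*",
]
n_aln, n_reads = count_alignments_and_reads(example)
print(n_aln, n_reads)
```

On real data the lines could come from the text output of samtools view on accepted_hits.bam; a read with several reported alignments adds to the first number each time but to the second only once.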
02-21-2012, 05:31 PM   #7
vyellapa
Member
Location: phoenix
Join Date: Oct 2011
Posts: 59

We were consistently getting a better alignment % for shorter read lengths on the same dataset using tophat1.4, i.e., the alignment % was worse for the 100bp reads than for the same reads clipped to 50bp. We believe that the -N/--initial-read-mismatches option, whose default is 2, should be increased for longer read lengths.

I'm trying tophat1.4.1 with -N/--initial-read-mismatches values of 2, 3, 4 and 5 to align 101bp paired-end reads.
Every run except N=4 aligned without errors.

I'm getting the following error, which looks like a Python error. Has anybody come across it?

"[Sat Feb 18 20:48:47 2012] Building Bowtie index from GRCh37_E64_1kg.fa
Traceback (most recent call last):
  File "/home/vyellapantula/local/bin/tophat", line 3063, in ?
    sys.exit(main())
  File "/home/vyellapantula/local/bin/tophat", line 3029, in main
    user_supplied_deletions)
  File "/home/vyellapantula/local/bin/tophat", line 2501, in spliced_alignment
    m2g_left_maps, m2g_right_maps = mapped_gtf_list
ValueError: need more than 1 value to unpack"
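
For what it's worth, the last line of the traceback is plain Python: a sequence holding a single element is being unpacked into two names. A minimal sketch that reproduces just that error follows; the variable names are copied from the traceback, the list contents are hypothetical, and it says nothing about why TopHat ended up with only one entry.

```python
def try_unpack(mapped_gtf_list):
    """Unpack a two-item sequence the way the traceback's line does."""
    try:
        m2g_left_maps, m2g_right_maps = mapped_gtf_list
    except ValueError as err:
        # Python 2 words this "need more than 1 value to unpack";
        # Python 3 words it "not enough values to unpack (expected 2, got 1)".
        return "ValueError: %s" % err
    return (m2g_left_maps, m2g_right_maps)

# Hypothetical file names, for illustration only.
ok = try_unpack(["left.m2g.bam", "right.m2g.bam"])
failed = try_unpack(["left.m2g.bam"])
print(ok)
print(failed)
```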

Last edited by vyellapa; 02-21-2012 at 05:34 PM.