SEQanswers

07-12-2012, 09:08 AM   #1
ajgentles
Junior Member
 
Location: Planet Earth

Join Date: Jun 2011
Posts: 4
tophat2/samtools

We have been trying to move from tophat1 to tophat2, using a transcriptome as well as a genome reference, and are having performance and output issues. RNA-seq data that previously took about 12 hours to map now takes more than 4 days (we stopped the runs at that point), with tophat_reports apparently being the culprit. Even a run with just a few hundred reads takes much longer than expected and ends with an error in tophat_reports. The reported error is typically something like:

Warning: mapped sequence without CIGAR (DJG84KN1:201:C0CKNACXX:7:1101:1815:2215)
Warning: mapped sequence without CIGAR (DJG84KN1:201:C0CKNACXX:7:1101:1815:2215)
Warning: mapped sequence without CIGAR (DJG84KN1:201:C0CKNACXX:7:1101:2101:2179)
Warning: mapped sequence without CIGAR (DJG84KN1:201:C0CKNACXX:7:1101:2101:2179)
Warning: mapped sequence without CIGAR (DJG84KN1:201:C0CKNACXX:7:1101:2307:2249)
Warning: mapped sequence without CIGAR (DJG84KN1:201:C0CKNACXX:7:1101:2307:2249)
Error: CIGAR and sequence length are inconsistent!(TGCTCTTCCGATCTGCCCCCTTAAACACCATTTTCCCTCCAGGACCACCTTGGTTTCTAGGCACTGTGGTTCTTGGCAGGGGCTGTCTTAGG)

This looks like it could be an incompatibility with samtools, but we have tried versions 0.1.11 through 0.1.18 with no change. Can anyone help narrow down the issue?
11-07-2012, 11:37 PM   #2
kristofit
Junior Member
 
Location: france

Join Date: Apr 2012
Posts: 3

Hi,

I got the same problem using tophat 2.0.6 (and bowtie 2.0.2) and cannot find a way to solve it.
The error log is:
Error running /PATH/tophat.dir/bin/tophat_reports
Error: CIGAR and sequence length are inconsistent!(TTCAAACAAAATCGAATCCTGAAAGAGTAGAAGGGGAGCGGTGAGAGGAGGAGGAGGAGGAAGAGGAGGAGGGGGGCAGTCCTCCCCGAGCTAAAAACCTC)
Has anybody solved this problem?
12-06-2012, 01:09 AM   #3
tschauer
Junior Member
 
Location: Munich

Join Date: Oct 2012
Posts: 6

Hi,

Same problem...

Did you guys solve it?

Last edited by tschauer; 12-06-2012 at 01:38 AM.
12-10-2012, 10:09 AM   #4
ajgentles
Junior Member
 
Location: Planet Earth

Join Date: Jun 2011
Posts: 4

I found that if you tell tophat2 to generate a transcriptome index by supplying a GTF file, but the index already exists, it bombs out. Once the index has been built, make sure you do not pass the GTF file again.
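
A minimal sketch of that workaround, for anyone scripting their runs: a small helper that prints the transcriptome options and includes -G only when no index is on disk yet. The function name tophat_index_args is made up for illustration, and testing for "<prefix>.1.bt2" as the sign of an existing index is an assumption (it fits a Bowtie2-built index; adjust the file name for your setup).

```shell
#!/bin/sh
# Print the transcriptome-related tophat options for one run.
# -G is included only when the transcriptome index does not exist yet.
# Assumption: an existing Bowtie2 index leaves "<prefix>.1.bt2" on disk.
tophat_index_args() {
    gtf=$1      # path to the annotation (GTF) file
    prefix=$2   # value to pass to --transcriptome-index
    if [ -e "${prefix}.1.bt2" ]; then
        # Index already present: reuse it and leave the GTF out.
        printf '%s\n' "--transcriptome-index $prefix"
    else
        # First run: supply the GTF so the index gets built.
        printf '%s\n' "-G $gtf --transcriptome-index $prefix"
    fi
}
```

It could be used as: tophat -p 8 -o tophat.out $(tophat_index_args genes.gtf transcriptome/known) genome test.fastq. The unquoted substitution relies on word splitting, so this only works for paths without spaces.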

We've basically abandoned tophat2 in favour of STAR these days.
12-13-2012, 01:07 AM   #5
tschauer
Junior Member
 
Location: Munich

Join Date: Oct 2012
Posts: 6

Thanks.

Samtools was not correctly installed.

Without the GTF file it works.
03-25-2013, 04:48 PM   #6
joseph.troy
Junior Member
 
Location: Urbana, IL

Join Date: Oct 2012
Posts: 4
Question about removing the GTF file

Thanks all for the information! I'm having the same problem. I tried removing the GTF file, but perhaps did it wrong (see with and without below). If possible can you share your command lines without the GTF file?

My command with GTF file...
tophat -p 8 -o tophat.out --library-type fr-firststrand -G ~/jmt/projects/run3_01/mm9gtf/genes.gtf --transcriptome-index ~/jmt/projects/run3_01/mm9transcripts/transcriptome ~/jmt/projects/run3_01/mm9genome/genome test.fastq


My command without the GTF file...
tophat -p 8 -o tophat.out --library-type fr-firststrand --transcriptome-index ~/jmt/projects/run3_01/mm9transcripts/transcriptome ~/jmt/projects/run3_01/mm9genome/genome test.fastq

-Thank you!!
10-20-2013, 07:02 PM   #7
pengchy
Senior Member
 
Location: China

Join Date: Feb 2009
Posts: 116

Hi all,

This problem still seems to be unresolved.
The TopHat2 paper recommends supplying a GTF file when one is available.
In my case, only one line in one BAM file has the problem:
Code:
        73      scaffold210     158009  50      30M26178N70M    *       0       0       GTACGAGTCGTTCTGCCGGCCGCCGTGCTCGGAGTCGCCGTTGACGATCCAGACGATGTGCGGCGCGGGCTTGGCGGAGCCGGAGCTGCAGTTGGCGTGC  $#$%&&&#%$"&"%("')&!&)''&!"%%(""!$!"!!"""#%%"$%%%%$"!"!!!!!"%!"%$"#%%%%""%%!!!!"$%"!!"%$$!!!!!!!!!!!  AS:i:-12        XM:i:2  XO:i:0  XG:i:0  MD:Z:78C18C2    NM:i:2  XS:A:-  NH:i:1
Here the read name was not printed, so the record starts with the flag value (73).

I filtered this line out with:
awk '$1!~/^[0-9]/'
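
A slightly stricter variant, as a sketch only: the one-liner above also drops any read whose name happens to start with a digit. The version below instead checks column 5, which holds an integer MAPQ in a well-formed record but holds the CIGAR string when the read name is missing and every column has shifted left. The function name filter_nameless_records is hypothetical, and the input is assumed to be tab-separated SAM text.

```shell
#!/bin/sh
# Drop SAM alignment records whose read name (QNAME) is missing.
# Header lines (starting with "@") pass through unchanged.
# A well-formed record has an integer MAPQ in column 5; without a QNAME
# the columns shift left and column 5 holds the CIGAR string instead.
filter_nameless_records() {
    awk -F '\t' '/^@/ || $5 ~ /^[0-9]+$/'
}
```

For example: samtools view -h accepted_hits.bam | filter_nameless_records | samtools view -bS - > filtered.bam, where the file names are placeholders.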

Last edited by pengchy; 10-20-2013 at 07:06 PM. Reason: add awk command