
**yueluo** · 01-08-2014, 06:05 PM

I don't think tophat_reports has the '--no-discordant' option.

**jake13** · 01-09-2014, 07:33 PM

Sorry, I ran tophat2 with the --no-discordant option, but it seems to fail when it has to call tophat_reports.

**andylemire** · 01-09-2014, 07:50 PM

I've had problems with tophat dying at various points near the end of the run, but a few times I was able to recover with the -R option. Made my day.

One thing I noticed was missing from the error message (which I think is the tophat.log file) was the --no-discordant option itself, in the full command string. Maybe it's a bug? Seems like it should be in there. Does it show up in the run.log file?

**jake13** · 01-09-2014, 07:57 PM

I tried resume and it didn't work, but I'm retrying with more memory. The --no-discordant option shows up for tophat in the run log, but not for tophat_reports.

**andylemire** · 01-09-2014, 08:06 PM

Looks like someone found a solution, if you want to edit the code to patch it yourself:

http://seqanswers.com/forums/showthr...t=24205&page=2

This is the thread that Alex Williams referenced in the google groups thread:

https://groups.google.com/forum/#!ms...8/UtLS0XcWKnUJ

**jake13** · 01-09-2014, 08:54 PM

Thanks, took a look at that, but I'm not sure where to make those edits.

**andylemire** · 01-09-2014, 09:52 PM

(I had some spare time waiting on some jobs to finish on the cluster…)

Try this. I had to rename it as tophat.txt to upload it. Having no idea what your experience level is with this kind of thing (I'm no expert…yet), I'll go verbose:

Unzip and copy this to your cluster (or have an admin do it).
You can compare it to the original to see what changes I've made:

Code:

diff /path/to/original/tophat tophat    #modified version is the 2nd one
1427c1427
<         bowtie_sam_header_filename = tmp_dir + idx_prefix.split('/')[-1]
---
>         bowtie_sam_header_filename = tmp_dir + str(idx_prefix).split('/')[-1]
2114c2114
<     bwt_idx_name = bwt_idx_prefix.split('/')[-1]
---
>     bwt_idx_name = str(bwt_idx_prefix).split('/')[-1]

Then replace the original with this one and make it executable:

Code:

chmod 777 tophat    #or whatever permissions are appropriate

You should be able to resume the run, according to the other thread.

caveat emptor: I haven't tried it myself…

Attached Files

tophat.gz (35.5 KB, 53 views)

**jake13** · 01-10-2014, 12:01 PM

So I tried that and it did not fix the problem. Right now my sample is split into multiple fastq.gz files so I tried running each file separately. All of them ran fine except for one pair, which failed at the same step, so I think this pair was holding up the entire run. However, I tried mapping this same pair with STAR and it mapped fine so I don't think the file is completely corrupted. Any ideas? Thanks

**andylemire** · 01-10-2014, 12:15 PM

Hmm...running out of ideas. How much data does this one file represent, if you were to omit it completely? Or, you could downsample this one file and maybe get lucky enough to remove the reads that are causing the problematic alignments. I hate losing data, but I also hate debugging and waiting on software to finish. Depends on your needs.

You could use split to break it down, run the pieces through, and omit the chunk(s) that are causing problems:

Code:

zcat wonky.fq.gz | split -l 4000000 -    #4M lines = 1M reads = 500k pairs if interleaved

or something like that. You could do this iteratively on the bad segment to minimize data loss.
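The same chunking idea can be sketched in Python for anyone who wants the pieces to stay gzipped and to break only on record boundaries. This is a minimal sketch, not something from the thread: the function name `split_fastq`, the chunk naming scheme, and the assumption that every record is exactly four lines (no wrapped sequence lines) are all illustrative.

```python
import gzip
from itertools import islice


def split_fastq(path, reads_per_chunk, prefix):
    """Split a gzipped FASTQ into gzipped chunks of whole 4-line records.

    Assumes plain 4-line FASTQ records. Returns the chunk file names written.
    For interleaved pairs, use an even reads_per_chunk so mates stay together.
    """
    lines_per_chunk = 4 * reads_per_chunk  # one FASTQ record = 4 lines
    chunk_names = []
    with gzip.open(path, "rt") as src:
        while True:
            # Pull the next block of lines; an empty block means end of file.
            block = list(islice(src, lines_per_chunk))
            if not block:
                break
            name = f"{prefix}{len(chunk_names):04d}.fq.gz"
            with gzip.open(name, "wt") as out:
                out.writelines(block)
            chunk_names.append(name)
    return chunk_names
```

Calling `split_fastq("wonky.fq.gz", 1000000, "chunk_")` would mirror the 4M-line split above. Re-running it on whichever chunk fails, with a smaller `reads_per_chunk`, is one way to narrow down the bad segment iteratively. Each block is held in memory while it is written, so very large chunk sizes may need a streaming variant.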

**jake13** · 01-10-2014, 03:52 PM

So someone else suggested I try an older version of tophat. I ran my files with 2.0.8b and 2.0.9 and everything ran fine so I guess I will just use that.

**dembot** · 04-05-2014, 05:21 PM

How to deal with discordant alignments?

I've had exactly the same problem as everyone else on this thread. Tophat 2.0.9 fails when running large, paired-end datasets with the --no-discordant option set (i.e., to not report discordant alignments). However, it runs fine when Tophat is set to report discordant alignments. After much troubleshooting and thread browsing, it seems like this can only be a bug in Tophat. Short of waiting for a bug fix, does anyone know of an easy way to filter out discordant alignments from the accepted_hits.bam file post hoc?
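One possible post hoc approach is to keep only alignments whose SAM FLAG has the "proper pair" bit (0x2) set. The sketch below works on SAM text lines and assumes that the proper-pair bit in TopHat's output corresponds to what `--no-discordant` would have kept; that equivalence is an assumption and worth checking on a small sample first. The function name `keep_concordant` is made up for illustration.

```python
PAIRED = 0x1       # SAM FLAG bit: read is part of a pair
PROPER_PAIR = 0x2  # SAM FLAG bit: aligner marked the pair as properly aligned


def keep_concordant(sam_lines):
    """Yield header lines plus alignments flagged as a proper pair.

    Assumption: the proper-pair bit is a reasonable stand-in for
    "concordant" in the aligner's output.
    """
    for line in sam_lines:
        if line.startswith("@"):  # header lines pass through unchanged
            yield line
            continue
        flag = int(line.split("\t", 2)[1])  # FLAG is the second SAM column
        if flag & PAIRED and flag & PROPER_PAIR:
            yield line

# To use as a stream filter, a script could end with:
#   import sys
#   sys.stdout.writelines(keep_concordant(sys.stdin))
```

The input would come from something like `samtools view -h accepted_hits.bam`, with the output converted back to BAM afterwards. If samtools is available anyway, its `-f 2` filter applies the same FLAG test without any Python.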

**Brian Bushnell** · 04-06-2014, 09:09 AM

You can use BBMap instead, which is faster and more accurate than Tophat anyway.

(index)
bbmap.sh ref=reference.fasta -Xmx29g

(map)
bbmap.sh in1=reads1.fq in2=reads2.fq outm=mapped.sam outu1=unmapped1.fq outu2=unmapped2.fq -Xmx29g po=t maxindel=100000 xstag=unstranded intronlen=10 ambig=random

The "-Xmx29g" indicates how much memory java is allowed to use; set that at ~85% of your physical memory.
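As a rough illustration of the ~85% rule, the flag can be computed from the machine's physical memory. This is only a sketch: the helper names are hypothetical, and the `os.sysconf` lookup is assumed to be running on Linux.

```python
import os


def physical_memory_bytes():
    """Total physical memory in bytes (assumes a Linux-style os.sysconf)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def xmx_flag(physical_bytes, fraction=0.85):
    """Build a -Xmx flag in whole gigabytes from a fraction of physical memory."""
    gib = int(physical_bytes * fraction / 2**30)  # round down to whole GiB
    if gib < 1:
        raise ValueError("less than 1 GiB available at this fraction")
    return f"-Xmx{gib}g"
```

For example, `xmx_flag(physical_memory_bytes())` on a node with 32 GiB of RAM would give a value a little below the 29g used above, so the number in the command should be adjusted per machine.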

The "outm" and "outu" flags indicate streams for mapped and unmapped reads, respectively. "po=t" means "paired only = true", which tells it to consider unpaired reads as unmapped, so they will go to the outu rather than outm streams. This should be analogous to Tophat's "--no-discordant" flag. The outu streams are optional and you can leave them off. "maxindel" tells it the longest expected intron length to look for; "intronlen" tells it the minimum intron length (deletions at least that length get the cigar symbol 'N' rather than 'D'). If your data is stranded, you can set the xstag to "firststrand" or "secondstrand" instead of unstranded.


tophat fails with --no-discordant
