**danielsbrewer** · 01-13-2014, 03:12 AM

On further examination, it appears that the FLAGS in unmapped.bam are inaccurate: even after filtering on the pairing flags, there are still reads whose mate is missing from the file. I assume this is because the other read of the pair was mapped.
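Since the flags cannot be trusted here, one way to find the orphaned reads is to ignore what the flags claim about the mate and check which read names actually have both mates present in the file. The following is only a minimal sketch under some assumptions: it reads SAM text lines (for example from `samtools view -h unmapped.bam`), it still relies on the first/second-in-pair bits (0x40/0x80) being correct, and it assumes read names may carry a trailing `/1` or `/2`. The function name `split_complete_pairs` is made up for illustration.

```python
from collections import defaultdict

# SAM flag bits as defined in the SAM specification
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def split_complete_pairs(sam_lines):
    """Split SAM records into complete pairs and orphans.

    Pairing is decided by which mates are really present in the input,
    not by the paired / mate-unmapped flag bits.
    Returns (pairs, orphans): pairs is a list of (read1, read2) line
    tuples, orphans is a list of lines without a mate in the input.
    """
    by_name = defaultdict(dict)
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        line = line.rstrip("\n")
        fields = line.split("\t")
        name, flag = fields[0], int(fields[1])
        # Assumption: some files keep a /1 or /2 suffix on the read name.
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        if flag & FIRST_IN_PAIR:
            mate = 1
        elif flag & SECOND_IN_PAIR:
            mate = 2
        else:
            mate = 0  # no mate information at all
        by_name[name][mate] = line

    pairs, orphans = [], []
    for mates in by_name.values():
        if 1 in mates and 2 in mates:
            pairs.append((mates[1], mates[2]))
            if 0 in mates:
                orphans.append(mates[0])
        else:
            orphans.extend(mates.values())
    return pairs, orphans
```

The pairs could then be written out as two FASTQ files and the orphans as a third, single-end file, so that nothing is thrown away before the second tophat run.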

**dpryan** · 01-13-2014, 03:47 AM

You might want to try the "--no-mixed" option for tophat2 next time.

**danielsbrewer** · 01-13-2014, 03:49 AM

Yes that would have done the trick. Still playing around with RNAseq data so I am definitely in the learning phase!

The script in the following thread looks like it will help:

cleaning partial PE sam data - SEQanswers
http://seqanswers.com/forums/showthread.php?t=34520

Just giving it a go now.

**bpb9** · 10-15-2014, 12:08 PM

bam2fastx libz error

I too am trying to make a fastq file out of the unmapped reads so that I can run tophat on an alternative genome, but I get a different error:

samtools sort -n unmapped.bam unmapped_sort.bam
bam2fastx -q -Q -A -o outfile unmapped_sort.bam.bam

I get this error:
bam2fastx: /lib64/libz.so.1: no version information available (required by bam2fastx)

Anyone come across this error before?

**GenoMax** · 10-15-2014, 04:53 PM

One possibility is that you are running older versions of libz/libxml2. Are you able to get the bam2fastx to complete (that "error" is likely a warning) otherwise?

**bpb9** · 10-16-2014, 04:43 AM

Warning can be ignored

Originally posted by GenoMax:

One possibility is that you are running older versions of libz/libxml2. Are you able to get the bam2fastx to complete (that "error" is likely a warning) otherwise?

Hm…sure enough, despite the warning, there is in fact a fastq file produced anyway.

But when I run the program from the cluster's login node (shame on me, I know) I don't get the error, and I still get the fastq file. Could that be due to different versions of the library being installed on the login vs. compute nodes? Any ideas?

**GenoMax** · 10-16-2014, 04:51 AM

Originally posted by bpb9:

But when I run the program from the cluster's login node (shame on me, I know) I don't get the error, and I still get the fast file. Could that be due to different versions of the program running on the login vs. compute nodes? Any idea?

That is certainly a possibility. On large clusters sometimes a few stray nodes don't get updated properly/fully. If you know which node gave you the error let the admins know. They should be able to manually update that node.
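If you want something concrete to send to the admins, a quick check is to print the zlib version that the dynamic loader picks up on each node and compare. This is just a rough sketch: it uses Python's `ctypes` to call zlib's own `zlibVersion()` function, it assumes the loader resolves the same libz that bam2fastx complains about, and the helper name `system_zlib_version` is made up.

```python
import ctypes
import ctypes.util
import platform


def system_zlib_version():
    """Return the version of the zlib found by the dynamic loader, or None."""
    path = ctypes.util.find_library("z")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # zlibVersion() returns a C string such as "1.2.x"
    lib.zlibVersion.restype = ctypes.c_char_p
    return lib.zlibVersion().decode("ascii")


if __name__ == "__main__":
    # Print hostname and zlib version so outputs from several nodes can be compared
    print(platform.node(), system_zlib_version())
```

Run it once on the login node and once inside a job on the compute node that printed the warning. If the two versions differ, that supports the stray-node explanation.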

**offspring** · 10-16-2014, 05:16 AM

Just a note on this general topic, the script fix_tophat_unmapped_reads.py in https://github.com/cbrueffer/misc_bioinf/ fixes various issues in unmapped.bam files that prevent them from being used in downstream tools.

**fchatonnet** · 05-20-2016, 12:57 AM

It might be a very late answer, but apparently tophat can even accept bam files as input. I tried it by accident and it works perfectly: no differences compared with an alignment from a fastq file obtained after bam2fastq conversion.
If anyone can confirm that I'm not doing anything wrong, it would be nice.

Tophat2: prepare unmapped.bam file for input into a tophat run on alternative genome
