**Carlos Borroto** · 11-07-2011, 10:17 AM

Hi,

I'm also trying to run tophat-fusion for mm9. I see that tophat-fusion-post uses hardcoded BLAST database files, but this is easy to change. What I haven't been able to figure out is how to generate the files you mention. I don't think 'mcl' is important, as it is just the Mitelman Database, used for easy checking of the results, but the rest are surely important.

Were you able to construct these files for mm9?

Thanks,
Carlos

**rcorbett** · 11-07-2011, 10:30 AM

Hi Carlos,
I did manage to reconstruct the files for mm9. It just required some reverse engineering.

I just downloaded ensGene.txt, refGene.txt, and knownGene.txt from UCSC, then made refGene_sorted.txt with this command (I don't remember the details, but this worked for me):
echo "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y" | tr ' ' '\n' | xargs -i echo "awk '\$3==\"chr{}\"' refGene.txt | sort +4n -6 " | bash > refGene_sorted.txt
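
For readers on a current system: the `sort +4n -6` key syntax in the command above is an obsolete form that recent GNU sort may not accept, and `xargs -i` is deprecated. A rough modern equivalent might look like the sketch below. It assumes refGene.txt is tab-separated with the chromosome in column 3 and txStart/txEnd in columns 5 and 6 (the UCSC dump layout with its leading bin column), and it loops over chr1 to chr19 plus X and Y, since mm9 has no chr20 to chr22. The sample rows and the `workdir` variable are made up so the snippet runs on its own; tie ordering may differ slightly from the original command.

```shell
# Made-up stand-in for refGene.txt (tab-separated; col 3 = chrom,
# col 5 = txStart, col 6 = txEnd). Replace with the real file.
workdir=$(mktemp -d)
printf '1\tNM_b\tchr2\t+\t500\t900\n'  > "$workdir/refGene.txt"
printf '2\tNM_a\tchr1\t-\t300\t700\n' >> "$workdir/refGene.txt"
printf '3\tNM_c\tchr1\t+\t100\t400\n' >> "$workdir/refGene.txt"
printf '4\tNM_d\tchrX\t+\t50\t60\n'   >> "$workdir/refGene.txt"

TAB=$(printf '\t')

# Emit one chromosome at a time (mm9: chr1-chr19, chrX, chrY), each block
# sorted numerically by txStart, then by txEnd.
for c in $(seq 1 19) X Y; do
    awk -F'\t' -v chrom="chr$c" '$3 == chrom' "$workdir/refGene.txt" \
        | sort -t "$TAB" -k5,5n -k6,6n
done > "$workdir/refGene_sorted.txt"

cat "$workdir/refGene_sorted.txt"
```

The same loop can be pointed at ensGene.txt if that file uses the same column layout.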

Also, the ensGtp.txt file was available from the UCSC website; I just had to do a little more digging.

**Carlos Borroto** · 11-07-2011, 01:04 PM

Great! I should have recognized these names. Let me just add some details for anyone else going through the same thing.

You can get these files from the table browser at UCSC:
http://genome.ucsc.edu/cgi-bin/hgTables

Just make sure you select:
output format: "all fields from selected table"
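
As an alternative to clicking through the table browser, the same tables are usually also published as gzipped dumps on the UCSC download server. The sketch below only builds the list of URLs; the base URL and the assumption that all four tables exist there for mm9 should be checked before relying on it, and `workdir` is just an illustrative variable name.

```shell
# Assumed location of the mm9 table dumps on the UCSC download server.
base="http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database"
workdir=$(mktemp -d)

# Build the list of files to fetch, one URL per line.
for table in ensGene refGene knownGene ensGtp; do
    printf '%s/%s.txt.gz\n' "$base" "$table"
done > "$workdir/urls.txt"

cat "$workdir/urls.txt"

# To actually download and unpack (needs network access), something like:
#   wget -P "$workdir" -i "$workdir/urls.txt" && gunzip "$workdir"/*.txt.gz
```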

Sorting as rcorbett described above seems to work fine. I did it to produce refGene_sorted.txt, and I also sorted ensGene.txt, since the human annotation file in the source package appears to be sorted.

I also filtered ensGtp.txt to keep lines containing ENSMUSP only:
grep ENSMUSP ensGtp.txt.tmp > ensGtp.txt

I did this because the run failed, complaining that each line in ensGtp.txt needs three elements. Filtering by ENSMUSP seems to work, since any line with a protein ID should also have a gene ID and a transcript ID.
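
A slightly stricter variant of the same filter is sketched below: instead of matching the ENSMUSP text, it keeps only rows that really have three non-empty fields. It assumes ensGtp.txt.tmp is tab-separated with gene, transcript, and protein columns; the sample rows and `workdir` are made up for illustration.

```shell
workdir=$(mktemp -d)

# Made-up rows in the assumed layout: gene, transcript, protein.
printf 'ENSMUSG0001\tENSMUST0001\tENSMUSP0001\n'  > "$workdir/ensGtp.txt.tmp"
printf 'ENSMUSG0002\tENSMUST0002\t\n'            >> "$workdir/ensGtp.txt.tmp"
printf 'ENSMUSG0003\tENSMUST0003\tENSMUSP0003\n' >> "$workdir/ensGtp.txt.tmp"

# Keep only rows with exactly three non-empty tab-separated fields.
awk -F'\t' 'NF == 3 && $1 != "" && $2 != "" && $3 != ""' \
    "$workdir/ensGtp.txt.tmp" > "$workdir/ensGtp.txt"

cat "$workdir/ensGtp.txt"
```

On the sample data this drops the row with the empty protein column, which is the kind of line the grep above also removes.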

Now you just need to edit tophat-fusion-post to look for the right BLAST db. I'll be blasting against "other_genomic*" and "nt*".

Thanks!

**himanshu04** · 09-11-2012, 08:35 AM

How do you edit tophat-fusion-post to look for the right BLAST db? I am running tophat-fusion on mouse using the mm9 reference, and I have finished the tophat-fusion step. I am now trying to run the tophat-fusion-post step. I have downloaded the other_genomic* and nt* databases as well as the mouse_genomic BLAST database. I detect fusions, but I am not getting the BLAST scores or the sequence alignments.
I get the following error : “no index or alias found for nucleotide database[blast/other_genomic] in search path [home/fusion(this is the top_dir)::]”.

My directory format is :
home/fusion(top_dir)/blast/nt, home/fusion/blast/other_genomic

What am I doing wrong?
Any help is much appreciated.
Thanks
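
One way to narrow down an error like this: BLAST reports "no index or alias found" when it cannot find an alias file (.nal) or an index file (.nin) for the given database path, looked up relative to the current working directory and any configured BLASTDB location. Possible causes (guesses, not confirmed in this thread) are database archives that were downloaded but not extracted, or tophat-fusion-post being started from a directory other than the top-level one. The helper name `blast_db_present` and the placeholder files below are hypothetical and only illustrate the check.

```shell
# Return 0 if a nucleotide BLAST database with the given path prefix looks
# present: either an alias file (.nal) or an index file (.nin) must exist.
blast_db_present() {
    prefix=$1
    [ -f "$prefix.nal" ] || [ -f "$prefix.nin" ]
}

# Demo with empty placeholder files (not real databases).
top_dir=$(mktemp -d)
mkdir -p "$top_dir/blast"
touch "$top_dir/blast/nt.nal" "$top_dir/blast/nt.00.nin"

# Run the check from the top-level directory, using relative paths as in
# the error message.
(
    cd "$top_dir" || exit 1
    for db in blast/nt blast/other_genomic; do
        if blast_db_present "$db"; then
            echo "$db: found"
        else
            echo "$db: no .nal or .nin file (not extracted, or wrong directory?)"
        fi
    done
)
```

Run against a real setup, replace `top_dir` with the directory from which tophat-fusion-post is launched.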

**bharati** · 12-11-2012, 03:26 AM

Confusion between Spanning reads and spanning mate pairs

Can anybody please explain the difference between spanning reads and spanning mate pairs? As far as I understand, spanning reads are reads that do not harbor the fusion point, whereas split reads do. Spanning mate pairs are spanning reads that are supported by their mate pairs, so the number of spanning mate pairs should be lower than the number of spanning reads. This is not the case in my results. Why?

Please guide me, it's urgent.

**jp.** · 08-03-2013, 06:19 AM

Hi,
Can someone tell me where I can find other_genomic* and nt* for mm9? I searched ftp://ftp.ncbi.nlm.nih.gov/blast/db/ and found only:
1. est_mouse.tar.gz
2. mouse_genomic_transcript.tar.gz

Where are these files (other_genomic* and nt*)?

When I run tophat-fusion, it says:
blast nt not found..???
I have downloaded BLAST, but there is no such thing as blastall??

Expecting a reply.
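
A note on the missing blastall: `blastall` (with `formatdb`) belongs to the legacy NCBI BLAST package, while the newer BLAST+ package ships `blastn` and `makeblastdb` instead, so a BLAST+ download is not expected to contain `blastall`. Which of the two a given tophat-fusion-post version calls is not stated in this thread and would need to be checked in the script itself. Also, nt and other_genomic appear to be general NCBI databases rather than mm9-specific ones, which may be why no mouse-named versions turn up. The sketch below just reports which of these programs are on PATH; `have_tool` is a hypothetical helper name.

```shell
# Report whether a program is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# blastall/formatdb: legacy BLAST; blastn/makeblastdb: BLAST+.
for tool in blastall formatdb blastn makeblastdb; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not on PATH"
    fi
done
```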


tophat-fusion on mouse
