03-17-2017, 08:11 AM   #1
komalsrathi
tophat-fusion-post: ValueError: invalid literal for int() with base 10: 'exonCount'

Hi everyone,

I am running tophat-fusion-post like this:

Code:
tophat-fusion-post -o ./fusion_results -p 8 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 /mnt/isilon/cbmi/variome/reference/bowtie_indexes/hg38_no_alt/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome

My root folder is tophat-fusion. It contains four TopHat output folders (CHP212, SHSY5Y, SKNAS and SKNSH), each of which contains a fusions.out file. I have also created symbolic links to refGene.txt, ensGene.txt and the BLAST database (blast) in the same folder.

This is the directory structure of the folder where I ran TopHat:

Code:
$ tree -L 2 ./tophat-fusion

./
|-- CHP212
|   |-- accepted_hits.bam
|   |-- align_summary.txt
|   |-- deletions.bed
|   |-- fusions.out
|   |-- insertions.bed
|   |-- junctions.bed
|   |-- logs
|   |-- prep_reads.info
|   `-- unmapped.bam
|-- CHP212.sh
|-- CHP212_R1.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/CHP212_R1.fastq.gz
|-- CHP212_R2.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/CHP212_R2.fastq.gz
|-- IMR32.sh
|-- IMR32_R1.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/IMR32_R1.fastq.gz
|-- IMR32_R2.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/IMR32_R2.fastq.gz
|-- SHSY5Y
|   |-- accepted_hits.bam
|   |-- align_summary.txt
|   |-- deletions.bed
|   |-- fusions.out
|   |-- insertions.bed
|   |-- junctions.bed
|   |-- logs
|   |-- prep_reads.info
|   `-- unmapped.bam
|-- SHSY5Y.sh
|-- SHSY5Y_R1.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/SHSY5Y_R1.fastq.gz
|-- SHSY5Y_R2.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/SHSY5Y_R2.fastq.gz
|-- SKNAS
|   |-- accepted_hits.bam
|   |-- align_summary.txt
|   |-- deletions.bed
|   |-- fusions.out
|   |-- insertions.bed
|   |-- junctions.bed
|   |-- logs
|   |-- prep_reads.info
|   `-- unmapped.bam
|-- SKNAS.sh
|-- SKNAS_R1.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/SKNAS_R1.fastq.gz
|-- SKNAS_R2.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/SKNAS_R2.fastq.gz
|-- SKNSH
|   |-- accepted_hits.bam
|   |-- align_summary.txt
|   |-- deletions.bed
|   |-- fusions.out
|   |-- insertions.bed
|   |-- junctions.bed
|   |-- logs
|   |-- prep_reads.info
|   `-- unmapped.bam
|-- SKNSH.sh
|-- SKNSH_R1.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/SKNSH_R1.fastq.gz
|-- SKNSH_R2.fastq.gz -> /mnt/isilon/maris_lab/target_nbl_ngs/CellLineRNASeq/rawfiles/cat_fastq/SKNSH_R2.fastq.gz
|-- blast -> /mnt/isilon/cbmi/variome/reference/blast_db/hg38
|-- ensGene.txt -> /mnt/isilon/cbmi/variome/reference/blast_db/hg38/ensGene.txt
|-- fusion_results
|   |-- fusion_seq.bwtout
|   |-- fusion_seq.fa
|   |-- fusion_seq.map
|   |-- logs
|   `-- tmp
|-- refGene.txt -> /mnt/isilon/cbmi/variome/reference/blast_db/hg38/refGene.txt
`-- tophat-fusion.sh

When I run tophat-fusion-post in this directory, I get the following error:

Code:
[Fri Mar 17 15:09:18 2017] Beginning TopHat-Fusion post-processing run (v2.1.0)
-----------------------------------------------
[Fri Mar 17 15:09:18 2017] Extracting 23-mer around fusions and mapping them using Bowtie
[Fri Mar 17 15:09:30 2017] Filtering fusions
Traceback (most recent call last):
  File "/home/rathik/tools/miniconda3/envs/fusion-env/bin/tophat-fusion-post", line 2924, in <module>
    sys.exit(main())
  File "/home/rathik/tools/miniconda3/envs/fusion-env/bin/tophat-fusion-post", line 2895, in main
    filter_fusion(bwt_idx_prefix, params)
  File "/home/rathik/tools/miniconda3/envs/fusion-env/bin/tophat-fusion-post", line 965, in filter_fusion
    ensGene_list = read_genes("ensGene.txt")
  File "/home/rathik/tools/miniconda3/envs/fusion-env/bin/tophat-fusion-post", line 917, in read_genes
    num_exons = int(line[7])
ValueError: invalid literal for int() with base 10: 'exonCount'
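
One possible reading of the traceback: 'exonCount' is a column name rather than a number, which suggests that ensGene.txt starts with a header row that read_genes tries to parse as a data row. The Python sketch below only illustrates that assumption. The helper names strip_header_rows and clean_gene_table are hypothetical and not part of TopHat, and the check relies on the literal column name seen in the error message.

```python
def strip_header_rows(lines, header_token="exonCount"):
    """Return the lines that do not look like a column-header row.

    Assumption: a header row contains the literal column name
    `header_token` as one of its tab-separated fields.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if header_token in fields:
            # Skip the header; data rows hold an integer in this column.
            continue
        kept.append(line)
    return kept


def clean_gene_table(src_path, dst_path):
    """Copy src_path to dst_path without header rows (hypothetical helper).

    Returns the number of lines written.
    """
    with open(src_path) as src:
        kept = strip_header_rows(src)
    with open(dst_path, "w") as dst:
        dst.writelines(kept)
    return len(kept)
```

If the assumption holds, the cleaned copies could be written under new names and the ensGene.txt and refGene.txt symbolic links pointed at them, leaving the shared reference files untouched.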