Hi, I'm currently trying to align RNA-seq reads to a reference genome using a corresponding .gff file.
I've built my genome index using bowtie-build, and bowtie-inspect -names returns chr1, chr2, chr3, etc. I've edited the .gff file so that each line has the format name start end strand. For example, a line in my .gff file looks like this:
chr9 10190 10248 +
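To be concrete about what I mean by that format, here is a minimal Python sketch of how I would read one of these lines and compare its sequence name against the names reported for the index. The `Exon` type and both function names are my own illustration and are not part of TopHat or Bowtie. I'm assuming whitespace-separated fields and inclusive start/end coordinates.

```python
from typing import Iterable, List, NamedTuple


class Exon(NamedTuple):
    """One line of my edited four-column file (hypothetical helper type)."""
    name: str
    start: int
    end: int
    strand: str


def parse_exon_line(line: str) -> Exon:
    """Parse a 'name start end strand' line, e.g. 'chr9 10190 10248 +'."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    name, start, end, strand = fields
    if strand not in {"+", "-"}:
        raise ValueError(f"unexpected strand {strand!r} in line: {line!r}")
    start_i, end_i = int(start), int(end)
    if start_i > end_i:
        raise ValueError(f"start is greater than end in line: {line!r}")
    return Exon(name, start_i, end_i, strand)


def unknown_names(exons: Iterable[Exon], index_names: Iterable[str]) -> List[str]:
    """Return annotation sequence names that are absent from the index names."""
    known = set(index_names)
    return sorted({e.name for e in exons if e.name not in known})


if __name__ == "__main__":
    exon = parse_exon_line("chr9 10190 10248 +")
    # Placeholder list; in practice this would be the bowtie-inspect output.
    print(unknown_names([exon], ["chr1", "chr2", "chr9"]))
```

An empty list from `unknown_names` would only mean the sequence names match the index; it says nothing about whether the file layout itself is what TopHat expects.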
My problem is that the alignment fails. The error output from TopHat is:
[Thu Mar 1 23:03:43 2012] Preparing output location ./tophat_out/
[Thu Mar 1 23:03:43 2012] Checking for Bowtie index files
[Thu Mar 1 23:03:43 2012] Checking for reference FASTA file
[Thu Mar 1 23:03:43 2012] Checking for Bowtie
Bowtie version: 0.12.7.0
[Thu Mar 1 23:03:43 2012] Checking for Samtools
Samtools Version: 0.1.18
[Thu Mar 1 23:03:43 2012] Generating SAM header for MYgenome
format: fasta
[Thu Mar 1 23:03:46 2012] Reading known junctions from GTF file
Warning: TopHat did not find any junctions in GTF file
[Thu Mar 1 23:03:47 2012] Preparing reads
left reads: min. length=50, count=8625200
[Thu Mar 1 23:06:47 2012] Creating transcriptome data files..
[Thu Mar 1 23:07:03 2012] Building Bowtie index from transcriptome_index.fa
[FAILED]
Error: Couldn't build bowtie index with err = 1
Does anyone know why this process is failing? I don't understand why TopHat says it did not find any junctions in the GTF file (in my case a .gff file). I'm using the -G option in the tophat command to supply the .gff file.
The manual says the .juncs format is the one I described above, except that it specifies an inclusive range for introns, with flanking exons. That's why I used -G instead of -j: my .gff file specifies inclusive ranges for exons, not introns.
Does anyone have any thoughts on this? Thanks for your input.
Is my format for the .gff file correct?