I used the TopHat plugin in Geneious to map my RNA-seq data, but the result is much worse than with command-line TopHat: many introns are not mapped well. You can see my Geneious settings in the picture. In the Assembly Report, I found that the commands Geneious used are:
"/home/bac/geneious9.0data/plugins/com.biomatters.plugins.tophat.TophatPlugin/com/biomatters/plugins/tophat/linux64/bowtie2-build input.fa input
/home/bac/geneious9.0data/plugins/com.biomatters.plugins.tophat.TophatPlugin/com/biomatters/plugins/tophat/linux64/tophat --b2-sensitive -N 2 --read-gap-length 2 --read-edit-dist 2 -a 8 -m 0 -i 70 -I 500000 -F 0.15 -g 40 -p 30 --segment-length 25 --library-type fr-unstranded input forwardReads.fastq"
I checked these settings and found two problems:
1. With the command-line version of TopHat, I used the reference (PH1_genome.fasta) and its annotation (gene.gtf) separately. I first indexed the reference FASTA file and then ran TopHat with the option -G gene.gtf (the CDS annotation). With this option, TopHat first extracts the transcript sequences and uses Bowtie to align the reads to this virtual transcriptome. In Geneious, however, I imported the reference (PH1_genome.fasta) and then imported gene.gtf onto it, so I used the combined, annotated document as the reference. In this case, can Geneious extract the transcript sequences first? I did not find "-G gene.gtf" in the command shown in the Assembly Report, so is this the root cause of the problem? Maybe I should import the sequence (PH1_genome.fasta) and the annotation (gene.gtf) separately, and add "-G gene.gtf" in the option "Additional Command Line Parameters". Is that right?
2. The TopHat manual says: "When running TopHat with paired reads it is critical that the *_1 files an the *_2 files appear in separate comma-delimited lists". In Geneious I selected the two read documents, but the command in the Assembly Report lists only one file, "forwardReads.fastq". Will this affect the result?
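For comparison, the command-line workflow described in the two points above can be sketched roughly as follows. This is only an illustration under assumptions: the index prefix PH1_genome and the read file names reads_1.fastq and reads_2.fastq are placeholders (the actual read file names are not given above), and the tuning options from the Geneious command are left out. The script only prints the two commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the command-line workflow; names marked "hypothetical" are placeholders.
GENOME=PH1_genome.fasta   # reference FASTA named in the post
GTF=gene.gtf              # annotation named in the post
INDEX=PH1_genome          # hypothetical Bowtie2 index prefix
READS_1=reads_1.fastq     # hypothetical forward (*_1) read file
READS_2=reads_2.fastq     # hypothetical reverse (*_2) read file

# Step 1: build the Bowtie2 index from the reference FASTA.
BUILD_CMD="bowtie2-build $GENOME $INDEX"

# Step 2: run TopHat with the annotation (-G) and the two mate files
# passed as two separate arguments after the index prefix.
TOPHAT_CMD="tophat -G $GTF --library-type fr-unstranded $INDEX $READS_1 $READS_2"

# Print the commands for inspection; they are executed only when RUN=1.
echo "$BUILD_CMD"
echo "$TOPHAT_CMD"
if [ "${RUN:-0}" = "1" ]; then
    $BUILD_CMD && $TOPHAT_CMD
fi
```

Compared with the command in the Assembly Report, the differences in this sketch are the -G option and the second read file; whether the Geneious plugin can pass either of them through is exactly what is being asked here.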