I used TopHat and Cufflinks to assemble my RNA-seq data, then used cuffmerge to merge the assemblies of three samples. However, I found that many transcripts overlap with each other. The gene shown in the attached file is one example: red is the reference annotation, and blue is the cuffmerge result.
Few genes in this genome overlap one another, but in my results many genes overlap other genes on the opposite strand. I don't know the reason.
Can Cufflinks or cuffmerge solve this problem? If so, which parameters should I use?
The parameters I used are shown below.
#Tophat
tophat -p 10 -r 20 --mate-std-dev 30 -i 20 -I 3000 --phred64-quals --read-mismatches 5 --read-gap-length 5 --read-edit-dist 5 genome fastq1.fq fastq2.fq
#cufflinks
cufflinks -p 10 --min-intron-length 20 --max-intron-length 3000 --label predictGene -o cufflink sample1.accepted_hits.bam
#cuffmerge
cuffmerge -p 10 -g ../ref.transcripts.gtf -s ../genome.fa assembly.txt
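To put a number on how many genes overlap on opposite strands, a rough Python sketch like the one below could be run on the merged GTF. It is only an illustration under a few assumptions: the GTF contains `exon` records with a `gene_id` attribute, each gene is reduced to a single start-to-end span (so a gene nested inside another gene's intron also counts as overlapping), and the function names `gene_spans` and `antisense_overlaps` are made up for this example rather than taken from any tool.

```python
from collections import defaultdict


def gene_spans(gtf_lines):
    """Collapse exon records into one (chrom, strand, start, end) span per gene_id."""
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        # Pull gene_id out of the attribute column (assumed GTF-style: key "value";)
        gene_id = None
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_id "):
                gene_id = attr.split(" ", 1)[1].strip('"')
                break
        if gene_id is None:
            continue
        if gene_id in spans:
            _, _, lo, hi = spans[gene_id]
            spans[gene_id] = (chrom, strand, min(lo, start), max(hi, end))
        else:
            spans[gene_id] = (chrom, strand, start, end)
    return spans


def antisense_overlaps(spans):
    """Return sorted pairs of gene IDs whose spans overlap on opposite strands."""
    by_chrom = defaultdict(list)
    for gene_id, (chrom, strand, start, end) in spans.items():
        if strand in ("+", "-"):  # skip genes with unknown strand (".")
            by_chrom[chrom].append((start, end, strand, gene_id))
    pairs = []
    for genes in by_chrom.values():
        genes.sort()  # sorted by start, so the inner scan can stop early
        for i in range(len(genes)):
            _, end1, strand1, id1 = genes[i]
            for j in range(i + 1, len(genes)):
                start2, _, strand2, id2 = genes[j]
                if start2 > end1:
                    break  # no later gene on this chromosome can overlap gene i
                if strand1 != strand2:
                    pairs.append(tuple(sorted((id1, id2))))
    return sorted(pairs)


if __name__ == "__main__":
    # Tiny made-up example; the IDs only mimic cuffmerge-style naming.
    demo = [
        'chr1\tCufflinks\texon\t100\t500\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000001";',
        'chr1\tCufflinks\texon\t400\t900\t.\t-\t.\tgene_id "XLOC_000002"; transcript_id "TCONS_00000002";',
        'chr1\tCufflinks\texon\t2000\t2500\t.\t+\t.\tgene_id "XLOC_000003"; transcript_id "TCONS_00000003";',
    ]
    for pair in antisense_overlaps(gene_spans(demo)):
        print("\t".join(pair))
```

For a real run, the `demo` list could be replaced with an open file handle, for example `open("merged_asm/merged.gtf")`, assuming cuffmerge wrote its output to its default location. Running the same script on the reference GTF would give a baseline to compare against.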
Thank you very much.
Best regards.