I am using paired-end reads with TopHat/Cufflinks. I get great mapping with TopHat and a decent number of novel transcripts with Cufflinks, but I think the software is not treating my reads as paired-end. I suspect I have some minor formatting problem with my tophat command, but maybe it's more complicated. My commands:
Code:
tophat -i 20 -I 100000 -m 1 -g 1 -o tophatrun_1 -p 2 -r 180 --transcriptome-index=known/known_transcripts Gen/Genome A01_1.fastq A01_2.fastq
I get >90% mapping of the reads here, but when I run Cufflinks with this command:
Code:
cufflinks -I 100000 --min-intron-length 20 -g transcripts.gff -p 2 -o cufflinksrun_1 tophatrun_1/accepted_hits.bam
I get the following warning:
“Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct parameters (--frag-len-mean and --frag-len-std-dev) be provided”
The run completes, but this implies to me that although TopHat is mapping my reads fine, it is treating them as two pools of single-end reads rather than as pairs. Otherwise, why would Cufflinks be unable to determine the fragment length? Is my formatting wrong somewhere? Thanks for any help you can provide!
<EDIT> I've inspected the accepted_hits.bam file using samtools, and it looks like >90% of the reads ARE in fact being mapped as pairs. I am really at a loss as to why Cufflinks is giving a warning about "insufficient paired end reads in open ranges".
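Since the warning recommends passing --frag-len-mean and --frag-len-std-dev explicitly, one rough way to get values for them is to summarise the TLEN column of the alignments. The sketch below is only an approximation under stated assumptions: it assumes TopHat has populated TLEN (column 9) for properly paired reads, the function name frag_len_stats and the MAX_TLEN cutoff are made up for illustration, and it skips spliced reads and very long spans because TLEN includes intron length in RNA-seq alignments.

```shell
#!/bin/sh
# frag_len_stats: read SAM alignment lines on stdin and print
# "pairs mean stddev" of TLEN (column 9) over properly paired reads.
# MAX_TLEN is a hypothetical cutoff, not a TopHat or Cufflinks setting.
frag_len_stats() {
    awk -F '\t' -v max="${MAX_TLEN:-1000}" '
        /^@/ { next }                      # skip SAM header lines
        {
            flag = $2; cigar = $6; tlen = $9
            proper = int(flag / 2) % 2     # bit 0x2: mapped in a proper pair
            # Count each pair once (positive TLEN only), and drop spliced
            # reads (N in CIGAR) and spans long enough to suggest an intron.
            if (proper == 1 && tlen > 0 && tlen <= max && cigar !~ /N/) {
                n++; sum += tlen; sumsq += tlen * tlen
            }
        }
        END {
            if (n == 0) { print "0 NA NA"; exit }
            mean = sum / n
            var = sumsq / n - mean * mean
            if (var < 0) var = 0           # guard against rounding error
            printf "%d %.1f %.1f\n", n, mean, sqrt(var)
        }'
}
```

It could be fed with something like `samtools view tophatrun_1/accepted_hits.bam | frag_len_stats`, and the second and third numbers then passed to cufflinks as --frag-len-mean and --frag-len-std-dev. Note that TopHat's -r is the inner distance between mates, whereas Cufflinks expects the full fragment length, so the two numbers are not interchangeable.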
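For anyone wanting to repeat the pairing check mentioned in the edit, the FLAG column of the SAM records says whether the aligner treated each read as part of a pair. What follows is a minimal sketch, not necessarily the exact check used above: the function name pair_flag_summary is hypothetical, and it counts primary alignments only, using plain arithmetic on the FLAG value so that it works in any POSIX awk.

```shell
#!/bin/sh
# pair_flag_summary: read SAM alignment lines on stdin and count how many
# primary alignments carry the paired (0x1), proper-pair (0x2) and
# mate-unmapped (0x8) bits in the FLAG column.
pair_flag_summary() {
    awk -F '\t' '
        /^@/ { next }                                   # skip SAM header lines
        {
            flag = $2
            if (int(flag / 256) % 2 == 1) next          # 0x100: secondary alignment
            total++
            is_paired = (flag % 2 == 1)                 # 0x1: read is paired
            if (is_paired) paired++
            if (int(flag / 2) % 2 == 1) proper++        # 0x2: proper pair
            if (is_paired && int(flag / 8) % 2 == 1) mate_unmapped++   # 0x8
        }
        END {
            printf "total=%d paired=%d proper=%d mate_unmapped=%d\n",
                   total, paired, proper, mate_unmapped
        }'
}
```

A possible invocation is `samtools view tophatrun_1/accepted_hits.bam | pair_flag_summary`. If paired came out as 0, the reads would indeed have been aligned as single-end; a high proper count, as reported in the edit, points away from the tophat command line as the cause. `samtools flagstat` reports comparable totals without any scripting.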