Hi,
I am running cuffdiff 2.1.1 and I want to perform a basic differential gene expression analysis on a number of paired-end sequenced samples. They were aligned using tophat 2 with annotations obtained from Illumina's iGenomes repository, namely the human NCBI build 37.2 data.
When I run cuffdiff with the GTF file found in the NCBI build 37.2 location, I get this warning:
Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct parameters (--frag-len-mean and --frag-len-std-dev) be provided.
... for each file :-( I did not set the suggested parameters, since the documentation says they are not recommended for paired-end data. I still got lots of output files, with what seems to be perfectly usable data.
OK, after some googling I see that the GTF file might be missing the tss_id and p_id attributes, so I ran cuffcompare as specified on the Cufflinks FAQ page and reran my analysis. I get the same warning...
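One way to confirm whether a GTF actually carries these attributes is to scan its ninth column. The following is only a minimal sketch: the function names and the sample record are made up for illustration, and it assumes the usual `key "value";` attribute layout. Note that p_id is normally expected only on coding transcripts, so its absence from some records is not an error by itself.

```python
import re

def gtf_attribute_keys(line):
    """Return the set of attribute keys in the 9th column of one GTF line."""
    fields = line.rstrip("\n").split("\t")
    # Skip comment lines and anything without the 9 tab-separated GTF columns.
    if line.startswith("#") or len(fields) < 9:
        return set()
    # Attributes are assumed to look like: key "value"; key "value";
    return set(re.findall(r'(\w+)\s+"[^"]*"', fields[8]))

def missing_attributes(lines, required=("tss_id", "p_id")):
    """Report which required attributes never appear in any record."""
    seen = set()
    for line in lines:
        seen |= gtf_attribute_keys(line)
    return [key for key in required if key not in seen]

# Hypothetical record, for illustration only (not taken from iGenomes).
sample = [
    'chr1\tunknown\texon\t100\t200\t.\t+\t.\t'
    'gene_id "GENE1"; transcript_id "TX1"; tss_id "TSS1";\n',
]
print(missing_attributes(sample))
```

To run it on a real annotation, pass an open file handle instead of `sample`, for example `missing_attributes(open("genes.gtf"))`, where the path is a placeholder for the GTF given to cuffdiff.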
So my questions:
- What is the right GTF file to use?
- What do I need to do to create the information that cuffdiff needs?
- Can I still trust the data that came from my first run?
Thanks in advance
Sylvain Foisy
Project manager - Bioinformatics
Montreal Heart Institute
Montreal, Qc