cuffdiff for differential gene testing

I am using tophat, cufflinks, cuffcompare and cuffdiff to analyze 5 MZ twin pairs discordant for diabetes.
After a preliminary file conversion with fastq groomer, the paired-end RNA-seq fastq files are aligned to the HG19 reference genome with tophat.
Cuffcompare is then run on the tophat files to compare twin pair transcripts.
Finally, cuffdiff is used, first to compare individual twin pairs (tophat files)
against the HG19 reference GTF, which generates unique NM IDs, FPKM values,
p- and q-values, and FDR significance calls.
Then the replicates function is used to compare the diabetic and non-diabetic twins, combined into 2 groups, against the HG19 reference GTF.
This also produces NM IDs, etc.
From my reading of the documentation, it appears the statistical test for replicates is similar to a group-comparison t-test. What I would like to run is a
combined paired t-test, as the samples are MZ twins. I think I am OK with the
initial individual-twin cuffdiff analysis, but I suspect the combined replicates
analysis is a group mean comparison rather than a combined paired comparison.
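
To make the distinction concrete, here is a minimal Python sketch of the two test statistics I am contrasting. The expression values are made up for illustration (they are not my data), the log2(FPKM + 1) scale is only an assumption, and the function names paired_t and welch_t are hypothetical. This is not a description of what cuffdiff does internally.

```python
import math
import statistics

# Hypothetical log2(FPKM + 1) values for one gene in 5 twin pairs.
# Position i in each list is the same twin pair (made-up numbers).
diabetic = [5.1, 6.3, 4.2, 7.0, 5.6]
non_diabetic = [4.8, 6.0, 3.9, 6.6, 5.3]


def paired_t(x, y):
    """Paired t statistic: mean within-pair difference over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def welch_t(x, y):
    """Unpaired (Welch) t statistic: difference of group means over its standard error."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se


print("paired t:  ", paired_t(diabetic, non_diabetic))
print("unpaired t:", welch_t(diabetic, non_diabetic))
```

With these made-up numbers the within-pair differences are small but very consistent, so the paired statistic is much larger than the unpaired one, which is swamped by the variation between twin pairs. That is the kind of disagreement I would expect if the replicates analysis ignores the pairing.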
The cufflinks documentation says not to use count-based differential gene
expression methods on the FPKM data. I am not sure this applies to the data after the cuffdiff analyses, as the splice variants are separated by that point. However, when I run a paired t-test on the FPKM values extracted for the 5 twin pairs, I get quite different results.
What is going on?

Paul W.
