Hi all,
I have two transcript files, a.gtf and b.gtf, and I am using cuffcompare to compare them. In the first run I treat a.gtf as the reference annotation by passing it to the -r option:
cuffcompare -r a.gtf b.gtf
In the second run I use b.gtf as the reference annotation instead:
cuffcompare -r b.gtf a.gtf
When I compare the two cuffcmp.stats files generated by these commands, I find inconsistent numbers of loci, exons, introns, and almost everything else. For example, b.gtf has 35930 loci when it is treated as the reference annotation, but 35932 loci when it is the query. The detailed output is listed below. Can anyone tell me what is happening here, or point me to material that explains it? I have searched Google extensively but have not found an explanation. Many thanks.
# Cuffcompare v1.0.3 | Command line was:
#cuffcompare -r a.gtf b.gtf
#
#= Summary for dataset: b.gtf :
# Query mRNAs : 89421 in 35932 loci (76334 multi-exon transcripts)
# (13738 multi-transcript loci, ~2.5 transcripts per locus)
# Reference mRNAs : 24299 in 17251 loci (24299 multi-exon)
# Corresponding super-loci: 11479
#--------------------| Sn | Sp | fSn | fSp
Base level: 84.0 37.7 - -
Exon level: 65.8 25.3 75.8 29.1
Intron level: 92.5 43.0 96.4 44.8
Intron chain level: 25.6 8.2 100.0 42.0
Transcript level: 0.0 0.0 0.2 0.0
Locus level: 32.6 15.6 60.6 29.1
Missed exons: 3380/130250 ( 2.6%)
Wrong exons: 160300/338929 ( 47.3%)
Missed introns: 1358/111559 ( 1.2%)
Wrong introns: 110851/240016 ( 46.2%)
Missed loci: 531/17251 ( 3.1%)
Wrong loci: 22853/35932 ( 63.6%)
---------------------------------------------------------
# Cuffcompare v1.0.3 | Command line was:
#cuffcompare -r b.gtf a.gtf
#
#= Summary for dataset: a.gtf :
# Query mRNAs : 37750 in 30116 loci (24288 multi-exon transcripts)
# (5474 multi-transcript loci, ~1.3 transcripts per locus)
# Reference mRNAs : 91369 in 35930 loci (78396 multi-exon)
# Corresponding super-loci: 12575
#--------------------| Sn | Sp | fSn | fSp
Base level: 41.0 65.0 - -
Exon level: 25.0 59.6 28.9 68.8
Intron level: 42.9 92.2 44.7 96.1
Intron chain level: 8.0 26.0 26.7 86.2
Transcript level: 0.0 0.0 0.1 0.1
Locus level: 15.9 19.0 22.7 27.1
Missed exons: 158434/342083 ( 46.3%)
Wrong exons: 13453/143700 ( 9.4%)
Missed introns: 111449/240016 ( 46.4%)
Wrong introns: 1328/111559 ( 1.2%)
Missed loci: 21690/35930 ( 60.4%)
Wrong loci: 10553/30116 ( 35.0%)
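To make the discrepancy easier to see, here is a minimal sketch of how the transcript and locus counts for each file could be compared across the two runs. It only parses the "Query mRNAs" and "Reference mRNAs" summary lines quoted above; the function name parse_counts and the regular expression are my own assumptions for illustration, not part of cuffcompare, and the line format is assumed to match the v1.0.3 output shown here.

```python
import re

# Matches summary lines such as:
# "# Query mRNAs : 89421 in 35932 loci (76334 multi-exon transcripts)"
# The pattern is an assumption based on the v1.0.3 output pasted above.
COUNT_LINE = re.compile(
    r"#\s*(Query|Reference) mRNAs\s*:\s*(\d+)\s+in\s+(\d+)\s+loci"
)


def parse_counts(stats_text):
    """Return {"Query": (mRNAs, loci), "Reference": (mRNAs, loci)}."""
    counts = {}
    for line in stats_text.splitlines():
        match = COUNT_LINE.search(line)
        if match:
            role = match.group(1)
            counts[role] = (int(match.group(2)), int(match.group(3)))
    return counts


# Summary lines copied from the run "cuffcompare -r a.gtf b.gtf".
run_a_ref = """\
# Query mRNAs : 89421 in 35932 loci (76334 multi-exon transcripts)
# Reference mRNAs : 24299 in 17251 loci (24299 multi-exon)
"""

# Summary lines copied from the run "cuffcompare -r b.gtf a.gtf".
run_b_ref = """\
# Query mRNAs : 37750 in 30116 loci (24288 multi-exon transcripts)
# Reference mRNAs : 91369 in 35930 loci (78396 multi-exon)
"""

first = parse_counts(run_a_ref)
second = parse_counts(run_b_ref)

# Each file appears once as query and once as reference.
b_as_query, b_as_ref = first["Query"], second["Reference"]
a_as_query, a_as_ref = second["Query"], first["Reference"]

for name, as_query, as_ref in (
    ("a.gtf", a_as_query, a_as_ref),
    ("b.gtf", b_as_query, b_as_ref),
):
    print(name, "mRNAs as query/reference:", as_query[0], as_ref[0])
    print(name, "loci  as query/reference:", as_query[1], as_ref[1])
```

Laid out this way, the same file reports different transcript counts as well as different locus counts depending on whether it is the query or the reference, which is the behaviour I am trying to understand.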