Measures of predictive accuracy
To calculate accuracy statistics at the nucleotide level,
each nucleotide of a test sequence is
classified as predicted positive (PP) if it
is in a predicted coding region,
predicted negative (PN) otherwise,
and also as actual positive (AP) or
actual negative (AN) according to the sequence annotation.
These assignments are then compared to calculate the number of
true positives (TP), false positives (FP),
true negatives (TN) and false negatives (FN).
Accuracy is then measured by:
Sensitivity, Sn = TP / AP
Specificity, Sp = TP / PP
and Approximate Correlation, AC, defined as:
AC = ((TP/(TP+FN)) + (TP/(TP+FP)) + (TN/(TN+FP)) + (TN/(TN+FN))) / 2 - 1
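The nucleotide-level definitions above can be sketched in Python as follows. This is only an illustrative sketch, not the code used to produce any published figures: the function names and the per-nucleotide boolean input format are assumptions, as is the choice to report a measure as None when one of its denominators is zero.

```python
from typing import Dict, Optional, Sequence


def ratio(num: int, den: int) -> Optional[float]:
    """Return num/den, or None when the ratio is undefined (den == 0)."""
    return num / den if den else None


def nucleotide_accuracy(predicted: Sequence[bool],
                        actual: Sequence[bool]) -> Dict[str, Optional[float]]:
    """Compute Sn, Sp and AC from per-nucleotide coding labels.

    predicted[i] is True if nucleotide i lies in a predicted coding region (PP);
    actual[i] is True if it is annotated as coding (AP).
    """
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")

    pairs = [(bool(p), bool(a)) for p, a in zip(predicted, actual)]
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    tn = sum(not p and not a for p, a in pairs)
    fn = sum(not p and a for p, a in pairs)

    sn = ratio(tp, tp + fn)  # TP / AP
    sp = ratio(tp, tp + fp)  # TP / PP
    terms = [sn, sp, ratio(tn, tn + fp), ratio(tn, tn + fn)]
    # Assumption: AC is treated as undefined if any of its four terms is.
    ac = None if any(t is None for t in terms) else sum(terms) / 2 - 1
    return {"Sn": sn, "Sp": sp, "AC": ac}
```

For example, with 8 nucleotides of which TP = 2, FP = 1, FN = 1 and TN = 4, this gives Sn = Sp = 2/3 and AC = (2/3 + 2/3 + 4/5 + 4/5) / 2 - 1 = 7/15.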
At the exon level, predicted exons (PE)
are compared to annotated exons (AE).
The number of true exons (TE) is the number of predicted exons
that exactly match an annotated exon (i.e. both endpoints correct).
Accuracy is again measured by:
Sensitivity, Sn = TE / AE
Specificity, Sp = TE / PE
The average of Sn and Sp is typically used
as an overall measure of accuracy at the exon level
in lieu of a correlation measure.
Two additional accuracy measures are also calculated at the exon level:
Missing Exons (ME), the
fraction of annotated exons not overlapped by any predicted exon;
and Wrong Exons (WE), the fraction of predicted exons
not overlapped by any annotated exon.
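The exon-level measures can be sketched in the same way. The representation of an exon as a (start, end) pair with inclusive coordinates, the function names, and the use of None for undefined ratios are all assumptions made for illustration; WE is computed here against annotated exons.

```python
from typing import Dict, List, Optional, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive coordinates (assumed)


def overlaps(a: Exon, b: Exon) -> bool:
    """True if exons a and b share at least one nucleotide."""
    return a[0] <= b[1] and b[0] <= a[1]


def ratio(num: int, den: int) -> Optional[float]:
    """Return num/den, or None when the ratio is undefined (den == 0)."""
    return num / den if den else None


def exon_accuracy(predicted: List[Exon],
                  annotated: List[Exon]) -> Dict[str, Optional[float]]:
    """Compute exon-level Sn, Sp, their average, ME and WE."""
    annotated_set = set(annotated)
    # TE: predicted exons with both endpoints identical to an annotated exon.
    te = sum(1 for e in predicted if e in annotated_set)

    sn = ratio(te, len(annotated))  # TE / AE
    sp = ratio(te, len(predicted))  # TE / PE
    mean = None if sn is None or sp is None else (sn + sp) / 2

    missing = sum(1 for a in annotated
                  if not any(overlaps(a, p) for p in predicted))
    wrong = sum(1 for p in predicted
                if not any(overlaps(p, a) for a in annotated))

    return {
        "Sn": sn,
        "Sp": sp,
        "mean": mean,
        "ME": ratio(missing, len(annotated)),
        "WE": ratio(wrong, len(predicted)),
    }
```

Note that a predicted exon which overlaps an annotated exon but has one endpoint wrong counts against both Sn and Sp, yet contributes to neither ME nor WE.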
Accuracy measures for a set of sequences are calculated by
averaging the values obtained for each sequence separately,
the average being taken over all sequences for which the
measure is defined.
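The averaging rule can be illustrated with a short sketch. It assumes, hypothetically, that each per-sequence value is either a number or None when the measure is undefined for that sequence (for example, Sp when nothing was predicted); the function name is likewise illustrative.

```python
from typing import Iterable, Optional


def mean_over_defined(values: Iterable[Optional[float]]) -> Optional[float]:
    """Average a measure over the sequences for which it is defined.

    Undefined values (None) are skipped; if the measure is undefined
    for every sequence, the average itself is undefined (None).
    """
    defined = [v for v in values if v is not None]
    return sum(defined) / len(defined) if defined else None
```

With per-sequence values 0.5, undefined and 1.0, the average is taken over two sequences and equals 0.75, not 0.5.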