# Measures of predictive accuracy

To calculate accuracy statistics at the nucleotide level,
each nucleotide of a test sequence is
classified as predicted positive (**PP**) if it
is in a predicted coding region,
predicted negative (**PN**) otherwise,
and also as actual positive (**AP**) or
actual negative (**AN**) according to the sequence
annotation.
These assignments are then compared to count the
true positives (**TP**), false positives (**FP**),
true negatives (**TN**) and false negatives (**FN**),
so that AP = TP + FN and PP = TP + FP.
Accuracy is then measured by:

- Sensitivity, **Sn** = TP / AP
- Specificity, **Sp** = TP / PP

and Approximate Correlation, **AC**, defined as:

**AC** = ((TP/(TP+FN)) + (TP/(TP+FP)) + (TN/(TN+FP)) + (TN/(TN+FN))) / 2 - 1
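
The nucleotide-level measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `nucleotide_accuracy`, the boolean-per-nucleotide input format and the choice to return `None` whenever a denominator is zero are all assumptions made for the example. Note that **Sp** is computed as TP / PP, following the definition used here.

```python
def nucleotide_accuracy(predicted, annotated):
    """Compute Sn, Sp and AC from per-nucleotide coding labels.

    predicted, annotated: equal-length sequences of booleans
    (True = coding). This input format is an assumption of the sketch.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")

    pairs = list(zip(predicted, annotated))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)

    def ratio(num, den):
        # Assumption: a measure with a zero denominator is reported as None.
        return num / den if den else None

    sn = ratio(tp, tp + fn)  # TP / AP
    sp = ratio(tp, tp + fp)  # TP / PP
    terms = [sn, sp, ratio(tn, tn + fp), ratio(tn, tn + fn)]
    # AC is only reported when all four terms of the formula are defined.
    ac = None if any(t is None for t in terms) else sum(terms) / 2 - 1

    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn, "Sn": sn, "Sp": sp, "AC": ac}


# Example: TP = 3, FP = 1, FN = 1, TN = 3, so Sn = Sp = 0.75 and AC = 0.5.
example = nucleotide_accuracy(
    [True, True, True, True, False, False, False, False],
    [True, True, True, False, True, False, False, False],
)
```

A perfect prediction gives AC = 1, and the example above, where each of the four ratios is 0.75, gives AC = 0.5.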

At the exon level, predicted exons (**PE**)
are compared to annotated exons (**AE**).
The number of true exons (**TE**) is the number of predicted exons
that exactly match an annotated exon (i.e. both endpoints are correct).
Accuracy is again measured by:

- Sensitivity, **Sn** = TE / AE
- Specificity, **Sp** = TE / PE

The average of **Sn** and **Sp** is typically used
as an overall measure of accuracy at the exon level
in lieu of a correlation measure.
Two additional accuracy measures are also calculated at the exon level:
Missing Exons (**ME**), the
fraction of annotated exons not overlapped by any predicted exon;
and Wrong Exons (**WE**), the fraction of predicted exons
not overlapped by any annotated exon.

Accuracy measures for a set of sequences are calculated by
averaging the values obtained for each sequence separately,
the average being taken over all sequences for which the
measure is defined (for example, exon-level **Sp** is undefined
for a sequence with no predicted exons).
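
As a rough sketch of the exon-level measures and the per-sequence averaging, the following Python treats each exon as a `(start, end)` tuple. The function names `exon_accuracy` and `average_over_sequences`, the inclusive coordinate convention and the use of `None` for an undefined measure are assumptions made for illustration. **WE** is computed against annotated exons.

```python
def exon_accuracy(predicted_exons, annotated_exons):
    """Exon-level Sn, Sp, their average, ME and WE for one sequence.

    Exons are (start, end) tuples with inclusive coordinates
    (an assumption of this sketch).
    """
    pe = set(predicted_exons)
    ae = set(annotated_exons)

    def overlaps(x, y):
        # Two inclusive intervals overlap if each starts before the other ends.
        return x[0] <= y[1] and y[0] <= x[1]

    def ratio(num, den):
        # Assumption: a measure with a zero denominator is undefined (None).
        return num / den if den else None

    te = len(pe & ae)  # predicted exons with both endpoints correct
    missing = sum(1 for a in ae if not any(overlaps(a, p) for p in pe))
    wrong = sum(1 for p in pe if not any(overlaps(p, a) for a in ae))

    sn = ratio(te, len(ae))
    sp = ratio(te, len(pe))
    avg = None if sn is None or sp is None else (sn + sp) / 2
    return {
        "Sn": sn,
        "Sp": sp,
        "Avg": avg,
        "ME": ratio(missing, len(ae)),
        "WE": ratio(wrong, len(pe)),
    }


def average_over_sequences(per_sequence, measure):
    """Average one measure over the sequences for which it is defined."""
    values = [r[measure] for r in per_sequence if r[measure] is not None]
    return sum(values) / len(values) if values else None


# Two hypothetical sequences; the second has no predicted exons,
# so its Sp and WE are undefined and are left out of those averages.
results = [
    exon_accuracy([(10, 50), (300, 350)], [(10, 50), (200, 250)]),
    exon_accuracy([], [(5, 20)]),
]
mean_sn = average_over_sequences(results, "Sn")
mean_sp = average_over_sequences(results, "Sp")
```

Because the average is taken only over sequences where a measure is defined, different measures may be averaged over different numbers of sequences, as with `mean_sn` (two sequences) and `mean_sp` (one sequence) here.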
