TL;DR: this note discusses a rarely mentioned property of the equal error rate (EER) metric and shows its graphical interpretation.
For the interested reader, refer to 1 and 2.
The Equal Error Rate (EER) is a performance metric commonly used to evaluate binary classifiers. The EER is defined as the point where the False Acceptance Rate (FAR) and False Rejection Rate (FRR) are equal, providing a single scalar value that balances the two types of errors.
The EER can be derived from the Receiver Operating Characteristic (ROC) or from the Detection Error Tradeoff (DET) curves. It is defined as the point on the curve for which (FAR, FRR) = (EER, EER).
Unlike accuracy, EER is less sensitive to class imbalance because it focuses on the trade-off between FAR and FRR rather than absolute counts.
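For a concrete feel of the definition, here is a minimal NumPy sketch (the toy Gaussian scores and all names are illustrative, not taken from the referenced papers) that estimates the EER from target and non-target scores by sweeping a threshold and locating the FAR/FRR crossing:

```python
import numpy as np

def eer_from_scores(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a decision threshold over the pooled scores."""
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # FAR(t): fraction of non-target trials accepted (score > t)
    far = np.array([(nontarget_scores > t).mean() for t in thresholds])
    # FRR(t): fraction of target trials rejected (score <= t)
    frr = np.array([(target_scores <= t).mean() for t in thresholds])
    # EER: operating point where the two error rates (approximately) coincide
    idx = np.argmin(np.abs(far - frr))
    return 0.5 * (far[idx] + frr[idx])

rng = np.random.default_rng(0)
targets = rng.normal(loc=+1.0, scale=1.0, size=5000)     # scores of class-1 trials
nontargets = rng.normal(loc=-1.0, scale=1.0, size=5000)  # scores of class-0 trials
print(f"EER ~ {eer_from_scores(targets, nontargets):.3f}")
```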
Let's start by defining a binary classification problem in the framework of Bayes decision theory:
- Class prior: $P(Y=1) = \pi$, $P(Y=0) = 1 - \pi$.
- Class-conditional densities: $p(x|Y=1)$, $p(x|Y=0)$.
- Loss function: 0-1 loss.
The Bayes-optimal decision rule classifies $x$ as class 1 whenever the likelihood ratio exceeds a prior-dependent threshold:

$$\Lambda(x) = \frac{p(x|Y=1)}{p(x|Y=0)} > t^*(\pi), \qquad t^*(\pi) = \frac{1-\pi}{\pi}.$$

More generally, for an arbitrary threshold $t$ on the likelihood-ratio score, the total probability of error for the given threshold is

$$P_\mathrm{error}(\pi, t) = (1-\pi)\,\mathrm{FAR}(t) + \pi\,\mathrm{FRR}(t),$$

where

- FAR (False Positive Rate): $\mathrm{FAR}(t) = P\big(\Lambda(X) > t \mid Y=0\big)$,
- FRR (False Rejection Rate): $\mathrm{FRR}(t) = P\big(\Lambda(X) \le t \mid Y=1\big)$.
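To make these definitions concrete, here is a small sketch under an assumed toy model with two unit-variance Gaussian class-conditionals (so the likelihood ratio is monotone in $x$ and thresholding $x$ is equivalent to thresholding $\Lambda(x)$); the distributions and names are illustrative:

```python
import numpy as np
from scipy.stats import norm

# Toy model (assumption): p(x|Y=0) = N(-1, 1), p(x|Y=1) = N(+1, 1).
p0 = norm(loc=-1.0, scale=1.0)
p1 = norm(loc=+1.0, scale=1.0)

def far(t):
    """FAR(t) = P(X > t | Y = 0)."""
    return p0.sf(t)

def frr(t):
    """FRR(t) = P(X <= t | Y = 1)."""
    return p1.cdf(t)

def p_error(pi, t):
    """Total error probability: (1 - pi) * FAR(t) + pi * FRR(t)."""
    return (1.0 - pi) * far(t) + pi * frr(t)

for t in (-1.0, 0.0, 1.0):
    print(f"t={t:+.1f}  FAR={far(t):.3f}  FRR={frr(t):.3f}  "
          f"P_error(pi=0.5)={p_error(0.5, t):.3f}")
```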
The Bayes error rate (BER) is obtained at the optimal threshold $t^*(\pi)$:

$$\mathrm{BER}(\pi) = \min_t P_\mathrm{error}(\pi, t) = P_\mathrm{error}\big(\pi, t^*(\pi)\big).$$

It is the minimum achievable classification error for a given prior $\pi$.

If the prior $\pi$ is unknown, we can ask for the worst-case (adversarial) prior and solve the maximin problem

$$\max_\pi \min_t P_\mathrm{error}(\pi, t).$$

The inner minimization over $t$ is exactly the BER, so the problem reduces to $\max_\pi \mathrm{BER}(\pi)$.

Let's express BER as a dot product of the prior vector and the vector of error rates:

$$\mathrm{BER}(\pi) = \min_t \begin{pmatrix} 1-\pi \\ \pi \end{pmatrix} \cdot \begin{pmatrix} \mathrm{FAR}(t) \\ \mathrm{FRR}(t) \end{pmatrix}.$$

To find the worst-case error, we maximize this quantity over the prior $\pi$.
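Continuing the same assumed two-Gaussian toy model, a short sketch that evaluates $\mathrm{BER}(\pi)$ on a threshold grid for a few priors:

```python
import numpy as np
from scipy.stats import norm

# Assumed toy score model: p(s|Y=0) = N(-1, 1), p(s|Y=1) = N(+1, 1).
p0, p1 = norm(-1.0, 1.0), norm(+1.0, 1.0)
thresholds = np.linspace(-6.0, 6.0, 2001)
far = p0.sf(thresholds)   # FAR(t) = P(S > t | Y = 0)
frr = p1.cdf(thresholds)  # FRR(t) = P(S <= t | Y = 1)

def ber(pi):
    """BER(pi): minimize the dot product (1-pi, pi) . (FAR(t), FRR(t)) over the grid."""
    return np.min((1.0 - pi) * far + pi * frr)

for pi in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"pi={pi:.1f}  BER(pi)={ber(pi):.4f}")
```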
Note that a point (vertex of the convex hull) on the empirical DET curve may correspond to multiple points on the right plot.
By Sion's minimax theorem, if:
- $P_\mathrm{error}(\pi, t)$ is quasi-convex in $t$,
- quasi-concave (or linear) in $\pi$,
then it is possible to swap maximization and minimization:

$$\max_\pi \min_t P_\mathrm{error}(\pi, t) = \min_t \max_\pi P_\mathrm{error}(\pi, t).$$
The right-hand side first takes, for each fixed threshold $t$, the worst prior. Since $P_\mathrm{error}(\pi, t)$ is linear in $\pi$, that maximum is attained at an endpoint of $[0, 1]$:

$$\max_\pi P_\mathrm{error}(\pi, t) = \max\big(\mathrm{FAR}(t), \mathrm{FRR}(t)\big).$$

Since $\mathrm{FAR}(t)$ is non-increasing and $\mathrm{FRR}(t)$ is non-decreasing in $t$, the maximum of the two is minimized where they intersect. Thus, the minimax solution is the threshold $t_\mathrm{EER}$ at which

$$\mathrm{FAR}(t_\mathrm{EER}) = \mathrm{FRR}(t_\mathrm{EER}),$$

which is precisely the EER operating point.

Given that the maximin equals the minimax, we have:

$$\max_\pi \mathrm{BER}(\pi) = \min_t \max\big(\mathrm{FAR}(t), \mathrm{FRR}(t)\big) = \mathrm{EER}.$$

Hence, EER is the worst-case Bayes error when the prior $\pi$ is unknown.
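A numerical sanity check of the identity, again on the assumed two-Gaussian model: the maximin, the minimax, and the EER read off the FAR/FRR crossing should agree up to grid resolution:

```python
import numpy as np
from scipy.stats import norm

# Assumed toy score model, as in the previous sketches.
p0, p1 = norm(-1.0, 1.0), norm(+1.0, 1.0)
t_grid = np.linspace(-6.0, 6.0, 4001)
pi_grid = np.linspace(0.001, 0.999, 999)

far = p0.sf(t_grid)
frr = p1.cdf(t_grid)
# P_error(pi, t) on a (prior, threshold) grid via broadcasting
p_err = (1.0 - pi_grid[:, None]) * far[None, :] + pi_grid[:, None] * frr[None, :]

maximin = p_err.min(axis=1).max()   # max over pi of min over t  (worst-case BER)
minimax = p_err.max(axis=0).min()   # min over t of max over pi
eer_idx = np.argmin(np.abs(far - frr))
eer = 0.5 * (far[eer_idx] + frr[eer_idx])

print(f"maximin = {maximin:.4f}, minimax = {minimax:.4f}, EER = {eer:.4f}")
```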
This means that if a binary classifier is trained by minimizing the EER (the worst-case BER), the concavity of BER ensures that the error rates at all operating points are pushed down.
Validity of the theorem's application
BER can be written as follows:

$$\mathrm{BER}(\pi) = (1-\pi)\,\mathrm{FAR}\big(t^*(\pi)\big) + \pi\,\mathrm{FRR}\big(t^*(\pi)\big).$$

We derive the Equal Error Rate (EER) as the worst-case Bayes error rate and seek the prior $\pi^*$ that maximizes $\mathrm{BER}(\pi)$.

Differentiating BER with respect to $\pi$, using the chain rule on FAR and FRR and the fact that $t^*$ depends on $\pi$:

$$\frac{d\,\mathrm{BER}}{d\pi} = \mathrm{FRR}(t^*) - \mathrm{FAR}(t^*) + \left[(1-\pi)\,\frac{d\,\mathrm{FAR}}{dt} + \pi\,\frac{d\,\mathrm{FRR}}{dt}\right]\frac{dt^*}{d\pi}.$$

Derivatives of FAR and FRR with respect to the threshold follow from their definitions as tail integrals of the score densities $f_0$ and $f_1$ (the densities of $\Lambda(X)$ under $Y=0$ and $Y=1$):

$$\frac{d\,\mathrm{FAR}}{dt} = -f_0(t), \qquad \frac{d\,\mathrm{FRR}}{dt} = f_1(t).$$

Derivative of the Bayes threshold $t^*(\pi) = \frac{1-\pi}{\pi}$ with respect to the prior:

$$\frac{dt^*}{d\pi} = -\frac{1}{\pi^2}.$$

Substituting back into the derivative of BER:

$$\frac{d\,\mathrm{BER}}{d\pi} = \mathrm{FRR}(t^*) - \mathrm{FAR}(t^*) + \big[\pi\, f_1(t^*) - (1-\pi)\, f_0(t^*)\big]\frac{dt^*}{d\pi}.$$

At the Bayes threshold the bracketed term vanishes: for a likelihood-ratio score, $f_1(t) = t\, f_0(t)$, so at $t^* = \frac{1-\pi}{\pi}$ we get $\pi\, f_1(t^*) = (1-\pi)\, f_0(t^*)$.

Substituting this into the derivative, we get:

$$\frac{d\,\mathrm{BER}}{d\pi} = \mathrm{FRR}\big(t^*(\pi)\big) - \mathrm{FAR}\big(t^*(\pi)\big).$$

The maximum of BER occurs where the derivative vanishes, i.e. where

$$\mathrm{FAR}\big(t^*(\pi)\big) = \mathrm{FRR}\big(t^*(\pi)\big).$$
This is precisely the Equal Error Rate (EER) condition.
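The envelope-theorem step can be checked numerically: under the assumed two-Gaussian model, a finite-difference estimate of $d\,\mathrm{BER}/d\pi$ should match $\mathrm{FRR}(t^*) - \mathrm{FAR}(t^*)$ at the optimal threshold:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

# Assumed toy model; the score is x itself (the likelihood ratio is monotone in x).
p0, p1 = norm(-1.0, 1.0), norm(+1.0, 1.0)

def far(t):
    return p0.sf(t)

def frr(t):
    return p1.cdf(t)

def ber(pi):
    """Return (BER(pi), argmin threshold), found numerically."""
    res = minimize_scalar(lambda t: (1.0 - pi) * far(t) + pi * frr(t),
                          bounds=(-10.0, 10.0), method="bounded")
    return res.fun, res.x

pi, eps = 0.3, 1e-4
(b_plus, _), (b_minus, _) = ber(pi + eps), ber(pi - eps)
numeric = (b_plus - b_minus) / (2.0 * eps)   # finite-difference dBER/dpi
_, t_star = ber(pi)
analytic = frr(t_star) - far(t_star)         # envelope-theorem prediction
print(f"numeric dBER/dpi = {numeric:.4f}, FRR - FAR at t* = {analytic:.4f}")
```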
Second derivative check (concavity of BER)
To confirm this is a **maximum**, we check the second derivative:
From earlier:

$$\frac{d\,\mathrm{BER}}{d\pi} = \mathrm{FRR}\big(t^*(\pi)\big) - \mathrm{FAR}\big(t^*(\pi)\big).$$

Thus:

$$\frac{d^2\,\mathrm{BER}}{d\pi^2} = \left[\frac{d\,\mathrm{FRR}}{dt} - \frac{d\,\mathrm{FAR}}{dt}\right]\frac{dt^*}{d\pi} = \big[f_1(t^*) + f_0(t^*)\big]\left(-\frac{1}{\pi^2}\right) \le 0.$$

This shows that BER is concave in $\pi$, so the stationary point where FAR equals FRR is indeed a maximum.
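Concavity can also be eyeballed numerically: on the same assumed model, the discrete second differences of $\mathrm{BER}(\pi)$ over a prior grid should be non-positive (up to grid error):

```python
import numpy as np
from scipy.stats import norm

# Assumed toy score model, as before.
p0, p1 = norm(-1.0, 1.0), norm(+1.0, 1.0)
t = np.linspace(-8.0, 8.0, 4001)
far, frr = p0.sf(t), p1.cdf(t)

pis = np.linspace(0.01, 0.99, 99)
ber = np.array([np.min((1.0 - pi) * far + pi * frr) for pi in pis])

second_diff = ber[:-2] - 2.0 * ber[1:-1] + ber[2:]       # discrete second derivative
print("largest second difference:", second_diff.max())  # expected to be <= ~0
```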
Minimizing a linear function over a convex set:
Consider a convex set $C \subset \mathbb{R}^2$ and a fixed weight vector $w \in \mathbb{R}^2$.

Let's start by expressing the objective as the dot product $w \cdot z$ for $z \in C$. Then, the problem can be formulated as follows:

$$\min_{z \in C} \; w \cdot z.$$

Since the objective is linear, its sublevel sets are half-planes, and the minimum over the convex set $C$ is attained on the boundary of $C$ (at a vertex if $C$ is polyhedral).

The objective function for the outer optimization can be rewritten as:

$$\max_{\pi \in [0,1]} \; \min_{z \in C} \; w(\pi) \cdot z, \qquad w(\pi) = \begin{pmatrix} 1-\pi \\ \pi \end{pmatrix}.$$

For each fixed $\pi$, the inner minimization is exactly the problem above with $w = w(\pi)$. In the 2D case, the condition for a boundary point to be the minimizer is that $w(\pi)$ is normal to a supporting line of $C$ at that point.
Let's express BER as a dot product over the achievable operating points: taking $C$ to be the region on or above the DET curve,

$$\mathrm{BER}(\pi) = \min_{z \in C} \; w(\pi) \cdot z, \qquad z = \begin{pmatrix} \mathrm{FAR} \\ \mathrm{FRR} \end{pmatrix}.$$

To find the worst-case error, we again maximize over the prior $\pi$. Let's recall that a theoretical DET (or ROC) curve is convex (respectively, concave). Hence, the inner minimization over its epigraph (a convex set) can be replaced by minimization over a scalar threshold $t$ that traces the curve itself.
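To illustrate the convex-set argument, here is a sketch (toy scores, illustrative names) that minimizes the linear objective $w(\pi) \cdot z$ over the convex hull of empirical (FAR, FRR) operating points via a small linear program, and compares it with simply taking the best vertex:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
tar = rng.normal(+1.0, 1.0, 2000)   # target scores (class 1), toy data
non = rng.normal(-1.0, 1.0, 2000)   # non-target scores (class 0), toy data

thresholds = np.sort(np.concatenate([tar, non]))
far = np.array([(non > t).mean() for t in thresholds])
frr = np.array([(tar <= t).mean() for t in thresholds])
points = np.column_stack([far, frr])          # empirical (FAR, FRR) operating points

pi = 0.3
w = np.array([1.0 - pi, pi])                  # prior weight vector

# Minimize w . z over the convex hull of the points:
# z = P^T lambda, lambda >= 0, sum(lambda) = 1  ->  an LP in lambda.
c = points @ w
res = linprog(c, A_eq=np.ones((1, len(c))), b_eq=[1.0],
              bounds=(0.0, None), method="highs")

print("min over the convex hull:", res.fun)
print("min over the vertices   :", c.min())   # identical: a linear objective attains its minimum at a vertex
```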
Footnotes
1. Brummer, N. (2010). Measuring, refining and calibrating speaker and language information extracted from speech.
2. Brummer, N., Ferrer, L., Swart, A. (2021). Out of a hundred trials, how many errors does your speaker verifier make?
3. Cali, C., Longobardi, M. (2015). Some mathematical properties of the ROC curve and their applications.
4. Gneiting, T., Vogel, P. (2022). Receiver operating characteristic (ROC) curves: equivalences, beta model, and minimum distance estimation.