Statistical decision theory provides a means of establishing discriminant functions for probabilistic patterns governed by known probability functions. By Bayes' rule we can write
$$
p(i \mid X) = \frac{p(X \mid i)\, p(i)}{p(X)},
$$
where $p(X \mid i)$ is the probability that $X$ occurs, given that it is a pattern belonging to category $i$; regarded as a function of $i$, $p(X \mid i)$ is often called the likelihood of $i$ with respect to $X$; $p(i)$ is the a priori probability of occurrence of category $i$; and $p(X)$ is the probability that $X$ occurs regardless of its category.
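As a numerical sketch (an illustration added here, not taken from the text), the posterior probabilities can be computed directly once the likelihoods and a priori probabilities are known; $p(X)$ is obtained by summing $p(X \mid i)\,p(i)$ over the categories. The likelihood and prior values below are made-up numbers.

```python
import numpy as np

# Hypothetical class-conditional densities p(X|i) evaluated at one observed
# pattern X, and the a priori probabilities p(i) of the three categories.
likelihoods = np.array([0.42, 0.10, 0.03])   # p(X|i), i = 1, 2, 3
priors      = np.array([0.50, 0.30, 0.20])   # p(i), summing to 1

# Probability of X regardless of its category: p(X) = sum_i p(X|i) p(i).
p_x = np.sum(likelihoods * priors)

# Bayes' rule: p(i|X) = p(X|i) p(i) / p(X).
posteriors = likelihoods * priors / p_x

print("p(X)   =", p_x)
print("p(i|X) =", posteriors)   # the posteriors sum to 1
```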
For patterns governed by a $d$-variate normal distribution, the probability $p(X)$ is calculated as follows:
$$
p(X) = \frac{1}{(2\pi)^{d/2}\,\lvert C \rvert^{1/2}}
\exp\!\Big[-\tfrac{1}{2}(X - M)^{T} C^{-1} (X - M)\Big]
\tag{2.12}
$$
The various terms used in Eq. (2.12) are defined below.
$X = (x_1, x_2, \ldots, x_d)^{T}$ is a column vector representing the pattern.
$M = (m_1, m_2, \ldots, m_d)^{T}$ is a column vector equal to the expected value of $X$, i.e., $M = E[X]$, and is therefore called the mean vector.
$C$ is a $d \times d$ symmetric, positive definite matrix, called the covariance matrix. The $(i, j)$ component of the covariance matrix is given by
$$
c_{ij} = E\big[(x_i - m_i)(x_j - m_j)\big]
$$
for all $i, j = 1, \ldots, d$; in particular, $c_{ii} = \sigma_i^{2}$ is the variance of $x_i$.
We can also write $C$ in the compact form
$$
C = E\big[(X - M)(X - M)^{T}\big].
$$
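To make these definitions concrete, the following sketch (my own illustration, using simulated data rather than anything from the text) estimates the mean vector $M = E[X]$ and the covariance matrix $C = E[(X - M)(X - M)^{T}]$ from a set of sample patterns; the diagonal entries $c_{ii}$ of the estimate are the variances of the individual components $x_i$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample patterns: N patterns, each a d-dimensional vector
# (stored here as the rows of an N x d array).
N, d = 500, 3
samples = rng.multivariate_normal(mean=[1.0, -2.0, 0.5],
                                  cov=np.diag([1.0, 4.0, 0.25]),
                                  size=N)

# Sample estimate of the mean vector M = E[X].
M = samples.mean(axis=0)

# Sample estimate of C = E[(X - M)(X - M)^T]: average of the outer products.
centered = samples - M
C = centered.T @ centered / N            # d x d, symmetric

print("mean vector M:\n", M)
print("covariance matrix C:\n", C)
print("variances c_ii:", np.diag(C))     # c_ii is the variance of x_i
```

NumPy's `np.cov(samples.T)` computes essentially the same matrix, differing only in its use of the unbiased divisor $N - 1$.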
The inverse of $C$ is $C^{-1}$, and the determinant of $C$ is $\lvert C \rvert$. Since the $d$-variate normal probability distribution is completely specified by the mean vector $M$ and the covariance matrix $C$, it is often written in the shorthand form $p(X) \sim N(M, C)$.
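The density of Eq. (2.12) is straightforward to evaluate once $M$ and $C$ are given. The sketch below (an added illustration with made-up parameters) implements the formula directly and, as a cross-check, compares the result with `scipy.stats.multivariate_normal`.

```python
import numpy as np
from scipy.stats import multivariate_normal

def normal_density(X, M, C):
    """Evaluate the d-variate normal density of Eq. (2.12) at the pattern X."""
    d = len(M)
    diff = X - M
    norm_const = 1.0 / ((2.0 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(C)))
    exponent = -0.5 * diff @ np.linalg.inv(C) @ diff
    return norm_const * np.exp(exponent)

# Made-up mean vector and covariance matrix for d = 2.
M = np.array([0.0, 1.0])
C = np.array([[2.0, 0.3],
              [0.3, 1.0]])
X = np.array([0.5, 0.5])

print(normal_density(X, M, C))                      # direct use of Eq. (2.12)
print(multivariate_normal.pdf(X, mean=M, cov=C))    # should agree
```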
The probability $p(X \mid i)$ is calculated in the same way if we define $R$ mean vectors $M_i$ and $R$ covariance matrices $C_i$, one pair for each category:
$$
p(X \mid i) = \frac{1}{(2\pi)^{d/2}\,\lvert C_i \rvert^{1/2}}
\exp\!\Big[-\tfrac{1}{2}(X - M_i)^{T} C_i^{-1} (X - M_i)\Big],
\qquad i = 1, \ldots, R.
$$
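Putting the pieces together, a minimal sketch of the resulting Bayes decision rule under normal class-conditional densities might look as follows; the mean vectors, covariance matrices, and priors are illustrative assumptions, not values from the text. Each pattern is assigned to the category $i$ that maximizes $p(X \mid i)\,p(i)$, which is equivalent to maximizing the posterior $p(i \mid X)$.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical parameters for R = 2 categories, each with d = 2 features.
means  = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]            # M_i
covs   = [np.eye(2),            np.array([[2.0, 0.5],
                                          [0.5, 1.0]])]          # C_i
priors = [0.6, 0.4]                                              # p(i)

def classify(X):
    """Assign X to the category i maximizing p(X|i) p(i)."""
    scores = [multivariate_normal.pdf(X, mean=M_i, cov=C_i) * p_i
              for M_i, C_i, p_i in zip(means, covs, priors)]
    posteriors = np.array(scores) / np.sum(scores)    # p(i|X) by Bayes' rule
    return int(np.argmax(scores)) + 1, posteriors     # categories numbered 1..R

category, post = classify(np.array([2.0, 2.5]))
print("assigned category:", category)
print("posterior p(i|X):", post)
```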