Naive Bayesian Classification
Bayesian classifiers can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class.
Naive Bayesian Classification is based on Bayes’ theorem, described below.
P(H) is the prior probability, or a priori probability, of H.
For example, this is the probability that any given customer will buy a computer, regardless of age, income, or any other information, for that matter.
The posterior probability, P(H|X), is based on more information (e.g., customer information) than the prior probability, P(H), which is independent of X.
Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.
Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes.
This assumption is called class conditional independence. It is made to simplify the computations involved and, in this sense, is considered “naïve”.
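The independence assumption can be sketched in a few lines of Python: under class conditional independence, P(X|C) is just the product of the per-attribute probabilities P(x_i|C). The numbers below are made up for illustration and do not come from the text.

```python
from math import prod

def naive_likelihood(attr_probs):
    """Class-conditional likelihood under the naive assumption:
    P(X|C) ≈ product of P(x_i|C) over all attributes x_i."""
    return prod(attr_probs)

# Hypothetical per-attribute probabilities P(x_i|C) for one class:
p = naive_likelihood([0.5, 0.4, 0.8])  # ≈ 0.16
```

In practice these per-attribute probabilities are estimated from training data as simple frequency counts, which is what makes the computation cheap.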
Bayesian belief networks are graphical models, which unlike naive Bayesian classifiers allow the representation of dependencies among subsets of attributes.
Bayesian belief networks can also be used for classification.
In Bayesian terms, X is considered “evidence.”
As usual, it is described by measurements made on a set of n attributes.
Let H be some hypothesis, such as that the data tuple X belongs to a specified class C.
For classification problems, we want to determine P(H|X), the probability that the hypothesis H holds given the “evidence” or observed data tuple X.
In other words, we are looking for the probability that tuple X belongs to class C, given that we know the attribute description of X.
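The decision rule this implies can be sketched as follows: predict the class whose posterior P(C|X) is largest. Since P(X) is the same for every class, it suffices to compare P(X|C)·P(C). The class labels and scores below are hypothetical, not taken from the text.

```python
def predict(class_scores):
    """class_scores: dict mapping class label -> P(X|C) * P(C).
    Returns the label with the largest (unnormalized) posterior."""
    return max(class_scores, key=class_scores.get)

# Hypothetical unnormalized posteriors for two classes:
label = predict({"buys_computer=yes": 0.028, "buys_computer=no": 0.007})
```

Because only the ordering of the scores matters, the common divisor P(X) never needs to be computed when classifying.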
Posterior probability: Naïve Bayesian Classification
P(H|X) is the posterior probability, or a posteriori probability, of H conditioned on X.
For example, suppose our world of data tuples is confined to customers described by the attributes age and income, and that X is a 35-year-old customer with an income of $40,000.
Suppose that H is the hypothesis that our customer will buy a computer.
Then P(H|X) reflects the probability that customer X will buy a computer given that we know the customer’s age and income.
Bayes’ theorem: Naïve Bayesian Classification
Bayes’ theorem is useful in that it provides a way of calculating the posterior probability, P(H|X), from P(H), P(X|H), and P(X).
Bayes’ theorem is:
P(H|X) = P(X|H) P(H) / P(X)
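As a minimal sketch, Bayes’ theorem is a one-line computation. The probability values here are invented for illustration; they are not from the running customer example.

```python
def posterior(p_h, p_x_given_h, p_x):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    return p_x_given_h * p_h / p_x

# Hypothetical values: P(H) = 0.5, P(X|H) = 0.3, P(X) = 0.2
print(posterior(0.5, 0.3, 0.2))  # ≈ 0.75
```

Note that P(X) only rescales the result; when comparing hypotheses on the same evidence X, it can be ignored.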
How effective are Bayesian classifiers?
- In theory, Bayesian classifiers have the minimum error rate in comparison to all other classifiers.
- However, in practice this is not always the case, owing to inaccuracies in the assumptions made for its use (such as class conditional independence) and the lack of available probability data.
- Bayesian classifiers provide a theoretical justification for other classifiers that do not explicitly use Bayes’ theorem.
For example, under certain assumptions, it can be shown that many neural network and curve-fitting algorithms output the maximum posteriori hypothesis, as does the naïve Bayesian classifier.