Naive Bayes is a statistical classifier. It is one of the oldest formal classification algorithms, yet thanks to its simplicity and efficiency it is still widely used, for example in anti-spam filters. This method performs supervised classification: given a set of objects with class labels already assigned, we want to derive rules that help us assign future objects to classes.

MAP (maximum a posteriori) classification is a popular estimation method in Bayesian statistics. MAP is said to be optimal in the sense that it minimizes the classification error. The problem is its computational complexity, which grows as c^n (c – number of classes, n – number of describing variables). Naive Bayes, however, assumes the describing variables (components) are conditionally independent given the class. The point is: if that assumption holds, naive Bayes also gives optimal results.
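The decision rule can be sketched in a few lines. This is a minimal illustration, not code from the post: the training data and word lists below are invented for a toy spam/ham example, and add-one smoothing is used so unseen words do not zero out the product. Thanks to the conditional-independence assumption, the posterior factorizes as P(class) × Π P(word | class).

```python
from collections import Counter, defaultdict
import math

# Toy labeled data (hypothetical example, not from the post).
train = [
    ("spam", ["buy", "cheap", "pills"]),
    ("spam", ["cheap", "offer", "now"]),
    ("ham",  ["meeting", "schedule", "now"]),
    ("ham",  ["project", "schedule", "update"]),
]

class_counts = Counter(label for label, _ in train)
word_counts = defaultdict(Counter)
vocab = set()
for label, words in train:
    for w in words:
        word_counts[label][w] += 1
        vocab.add(w)

def log_posterior(label, words):
    # log P(class) + sum of log P(word | class), with add-one smoothing.
    total = sum(word_counts[label].values())
    score = math.log(class_counts[label] / len(train))
    for w in words:
        score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
    return score

def classify(words):
    # MAP decision: pick the class with the highest posterior.
    return max(class_counts, key=lambda c: log_posterior(c, words))

print(classify(["cheap", "pills", "now"]))  # -> spam
```

Working in log space avoids numerical underflow when the product runs over many variables.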

It may seem that the independence assumption is too strict to apply in the real world. Nevertheless, the work done before classification makes the difference: selection and elimination of correlated variables is a standard part of the data-mining methodology.
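One simple form of that preprocessing can be sketched as follows. This is an illustrative example with invented feature columns and an assumed threshold of 0.9: for each pair of highly correlated variables, one is dropped before the classifier is trained.

```python
import math
import statistics

def pearson(x, y):
    # Pearson correlation coefficient between two equal-length sequences.
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented feature columns; f2 is nearly 2 * f1, so it is redundant.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.1, 4.0, 6.2, 7.9],
    "f3": [5.0, 1.0, 4.0, 2.0],
}

def drop_correlated(cols, threshold=0.9):
    # Greedily keep a variable only if it is not strongly correlated
    # with any variable already kept.
    kept = []
    for name in cols:
        if all(abs(pearson(cols[name], cols[k])) < threshold for k in kept):
            kept.append(name)
    return kept

print(drop_correlated(columns))  # -> ['f1', 'f3']
```

After this step, the variables handed to naive Bayes are much closer to satisfying its conditional-independence assumption.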

Sources: same as previous posts

November 24th, 2016

Posted In: naive Bayes, web content, web mining, YouTube
