
Generative Learning Algorithms: NAIVE BAYES

1. The input data x is discrete. For example, in text classification, x is a vector of 0s and 1s (a bag-of-words representation), where xi = 1 if the i-th vocabulary word occurs in the document. The length of the vector is the size of the vocabulary, V.
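As a rough illustration of this representation, here is a minimal sketch that maps a document to a 0/1 vector. The toy vocabulary, the whitespace tokenization, and the function name `to_binary_vector` are assumptions made for the example, not part of any particular library.

```python
def to_binary_vector(document, vocabulary):
    """Return a 0/1 list: entry i is 1 if vocabulary[i] occurs in the document."""
    # Naive tokenization (assumption): lowercase and split on whitespace.
    words = set(document.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


# Hypothetical toy vocabulary, so V = 5 here.
vocabulary = ["buy", "cheap", "hello", "meeting", "now"]

x = to_binary_vector("Buy now buy cheap", vocabulary)
print(x)  # [1, 1, 0, 0, 1]
```

Note that repeated words ("buy" appears twice) still give a 1, since the vector only records presence or absence.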

2. In generative algorithms, we build a model of the input data, p(x|y). For example, we can assume that the input data follows a multivariate Gaussian distribution.

Q. What is the input data model in NB?

It's a MULTINOMIAL DISTRIBUTION. A vector of size V can take 2^V possible values, so modeling x directly means defining a multinomial distribution over all 2^V outcomes. That requires a parameter vector of dimension 2^V - 1 (one probability per outcome, minus one because the probabilities must sum to 1), which is intractable for any realistic vocabulary.
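To get a feel for how fast this grows, the following sketch enumerates every possible binary vector for a few small values of V and counts the free parameters of the full multinomial. The function name `count_full_joint_parameters` is made up for the example, and the enumeration is only feasible for tiny V.

```python
from itertools import product


def count_full_joint_parameters(V):
    """Count free parameters of a multinomial over all binary vectors of length V."""
    # Enumerate every possible x in {0, 1}^V; there are 2^V of them.
    outcomes = list(product([0, 1], repeat=V))
    # One probability per outcome, minus one for the sum-to-1 constraint.
    return len(outcomes) - 1


for V in (2, 3, 10):
    print(V, count_full_joint_parameters(V))
# 2 3
# 3 7
# 10 1023
```

Even V = 50 would already need about 10^15 parameters, so explicit enumeration is hopeless for a real vocabulary.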

Q. How do we solve this dimensionality problem?

By making the INDEPENDENCE assumption, which is clearly not correct but works in practice.

What is independence? If x and y are independent, then

p(x) = p(x|y)
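A quick numerical check of this definition, as a sketch: the probabilities below are hypothetical, and the joint distribution is built as a product of marginals so that x and y are independent by construction.

```python
import math

# Hypothetical marginals for two binary variables.
p_x = {0: 0.7, 1: 0.3}
p_y = {0: 0.4, 1: 0.6}

# Independent by construction: p(x, y) = p(x) * p(y).
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}


def conditional(x, y):
    """Compute p(x | y) = p(x, y) / p(y) from the joint table."""
    p_y_marginal = sum(joint[(x_val, y)] for x_val in p_x)
    return joint[(x, y)] / p_y_marginal


# For independent variables, conditioning on y does not change p(x).
all_equal = all(
    math.isclose(conditional(x, y), p_x[x]) for x in p_x for y in p_y
)
print(all_equal)  # True
```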

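In NB, the assumption is applied to the features given the class y. By the chain rule,

p(x1,x2,…,xV|y) = p(x1|y) p(x2|y,x1) p(x3|y,x1,x2) … p(xV|y,x1,…,xV-1)

and, assuming each xi is independent of the other features given y, this becomes

= p(x1|y) p(x2|y) … p(xV|y)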

In other words, the words in a document are assumed to be independent of each other given the class y.
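Under this assumption, p(x|y) is just a product of V per-word terms, so each class needs only V parameters instead of 2^V - 1. Here is a minimal sketch for binary features; the name `naive_bayes_likelihood` and the values in `phi_spam` are hypothetical, with `phi[i]` standing for p(xi = 1 | y).

```python
import math
from itertools import product


def naive_bayes_likelihood(x, phi):
    """p(x | y) = product over i of p(x_i | y), where phi[i] = p(x_i = 1 | y)."""
    likelihood = 1.0
    for x_i, phi_i in zip(x, phi):
        # Bernoulli term: phi_i if the word is present, 1 - phi_i if absent.
        likelihood *= phi_i if x_i == 1 else (1.0 - phi_i)
    return likelihood


# Hypothetical per-word probabilities for one class (V = 3).
phi_spam = [0.8, 0.6, 0.1]

x = [1, 0, 1]
p = naive_bayes_likelihood(x, phi_spam)  # 0.8 * 0.4 * 0.1, roughly 0.032

# Sanity check: the factorized model is still a valid distribution over
# all 2^V binary vectors, so the likelihoods sum to 1.
total = sum(naive_bayes_likelihood(v, phi_spam) for v in product([0, 1], repeat=3))
print(math.isclose(total, 1.0))  # True
```

For a realistic vocabulary the product of many small numbers underflows, so in practice one would usually sum log-probabilities instead of multiplying.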
