
Pomdapi113 t1_iy612tc wrote

I am asked to develop a classifier that maps vectors to their classes. I was told we basically must implement this formula, and I will be using Python. I have watched many videos on Bayes classifiers, but I am still struggling with this formula. Can someone please explain it to me, along with the prior steps to implement it, given that I have a training data set and a test data set? The formula was titled "log likelihood". I believe it is for calculating the error rate of the classifier once implemented, so please let me know how I should actually implement the classifier from Bayes' theorem.

picture of the formula


I-am_Sleepy t1_iy7xfo0 wrote

The basic idea of log likelihood is:

  1. Assume the data is generated from a parameterized distribution, x ~ p(x|z)
  2. Let X = {x1, x2, …, xn} be a set of samples drawn from p(x|z). Because each item is generated independently, the probability of generating the whole dataset is p(X|z) = p(x1|z) * p(x2|z) * … * p(xn|z)
  3. The best-fit z maximizes the above formula, but multiplying many probabilities causes numerical underflow, so we apply a monotonically increasing function (the log), which doesn't move the optimum. This gives log p(X|z) = sum_i log p(xi|z)
  4. The traditional optimization paradigm minimizes a cost function, so we multiply the formula by -1 and arrive at the Negative Log Likelihood, i.e. we optimize -log p(X|z) (a small numeric sketch follows this list)
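As a minimal numeric sketch of steps 2-4, assuming a 1-D Gaussian likelihood (scipy is used here only for the density functions; the data and parameter values are made up):

    import numpy as np
    from scipy.stats import norm

    # Toy dataset, assumed drawn i.i.d. from some Gaussian
    X = np.array([1.2, 0.8, 1.5, 1.1, 0.9])

    mu, sigma = 1.0, 0.5  # one candidate setting of the parameters z

    # Step 2: product of individual likelihoods (underflows for large datasets)
    likelihood = np.prod(norm.pdf(X, loc=mu, scale=sigma))

    # Steps 3-4: sum of log-densities, negated -> negative log likelihood
    nll = -np.sum(norm.logpdf(X, loc=mu, scale=sigma))

    print(likelihood, np.exp(-nll))  # identical up to floating-point error

A lower nll means the parameters explain the data better, and the log form stays finite even when the raw product would underflow to zero.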

Your formula models the distribution p as a Gaussian, parameterized by mu (the mean) and sigma (the covariance), usually initialized to the zero vector and the identity matrix respectively.

Using standard autograd, you can then optimize those parameters iteratively, but other optimization methods are also possible depending on your preference, such as genetic algorithms or Bayesian optimization.
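A minimal autograd sketch of that idea, assuming PyTorch and a 1-D Gaussian (mu and log_sigma are names invented for this example; sigma is optimized through its log so it stays positive):

    import math
    import torch

    # Toy data, assumed drawn i.i.d. from some 1-D Gaussian
    X = torch.tensor([1.2, 0.8, 1.5, 1.1, 0.9])

    # Initialize parameters (cf. zero mean; log_sigma = 0 means sigma = 1)
    mu = torch.zeros(1, requires_grad=True)
    log_sigma = torch.zeros(1, requires_grad=True)

    opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

    for _ in range(500):
        opt.zero_grad()
        sigma = log_sigma.exp()
        # Negative log likelihood of a Gaussian, summed over the dataset
        nll = (log_sigma + 0.5 * ((X - mu) / sigma) ** 2
               + 0.5 * math.log(2 * math.pi)).sum()
        nll.backward()
        opt.step()

    print(mu.item(), log_sigma.exp().item())  # approaches sample mean and MLE std

For a Gaussian this is overkill, since the maximum likelihood estimates have closed forms (the sample mean and sample covariance), but the same loop works unchanged for likelihoods with no closed-form solution.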

For the Bayesian approach, if your likelihood is normal (with known variance), then its conjugate prior is also normal, so the posterior is normal too. The multivariate case is a bit trickier; depending on your setting (likelihood distribution), you can look it up here. You need to look at the "Interpretation of hyperparameters" column to understand it better, and/or maybe here too.
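For intuition, a minimal sketch of the 1-D normal-normal update (likelihood variance assumed known; all variable names here are just for illustration):

    import numpy as np

    X = np.array([1.2, 0.8, 1.5, 1.1, 0.9])  # observed data
    sigma2 = 0.25            # assumed known likelihood variance
    mu0, tau0_2 = 0.0, 1.0   # prior on the mean: mu ~ N(mu0, tau0_2)

    n, xbar = len(X), X.mean()

    # Conjugacy: the posterior over the mean is again normal
    post_prec = 1 / tau0_2 + n / sigma2  # precisions add
    post_mean = (mu0 / tau0_2 + n * xbar / sigma2) / post_prec
    post_var = 1 / post_prec

    print(post_mean, post_var)  # pulled from the prior mean toward the sample mean

The posterior precision is the prior precision plus n times the data precision, which is exactly the kind of relationship the "Interpretation of hyperparameters" column spells out.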
