Machine Learning - Gaussian Naive Bayes
Formula
According to Bayes’ theorem:

$$P(y \mid x_1, \dots, x_n) = \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}$$

Using the naive conditional independence assumption that

$$P(x_i \mid y, x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = P(x_i \mid y)$$

for all $i$, this relationship is simplified to

$$P(y \mid x_1, \dots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}$$

Since $P(x_1, \dots, x_n)$ is constant given the input, we can use the following classification rule:

$$\hat{y} = \arg\max_y P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

The different naive Bayes classifiers differ mainly by the assumptions they make regarding the distribution of $P(x_i \mid y)$.

For Gaussian Naive Bayes, the likelihood of the features is assumed to be Gaussian:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\!\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)$$

where $\mu_y$ and $\sigma_y^2$ are the mean and variance of feature $x_i$ over the training samples of class $y$.
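To make the classification rule concrete, here is a minimal numeric sketch. All values below (the per-class means, variances, priors, and the test point) are made up purely for illustration: it evaluates the Gaussian likelihood of each feature under each class, multiplies the per-feature likelihoods with the class prior, and picks the class with the largest product.

import numpy as np

# Hypothetical parameters for two classes and two features.
mu = np.array([[0.0, 1.0],     # class 0: feature means
               [3.0, 4.0]])    # class 1: feature means
var = np.array([[1.0, 2.0],    # class 0: feature variances
                [1.5, 1.0]])   # class 1: feature variances
priors = np.array([0.6, 0.4])  # P(y = 0), P(y = 1)

x = np.array([2.5, 3.0])       # a single test point

# P(x_i | y) for every class/feature pair, shape (n_classes, n_features).
likelihood = (1.0 / np.sqrt(2 * np.pi * var)) * np.exp(-((x - mu) ** 2) / (2 * var))

# P(y) * prod_i P(x_i | y), proportional to the posterior P(y | x).
posterior = priors * likelihood.prod(axis=1)

print(posterior)               # unnormalized posterior per class
print(np.argmax(posterior))    # index of the predicted class

The implementation below does exactly this, after first estimating $\mu_y$, $\sigma_y^2$, and $P(y)$ from the training data.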
Code
import numpy as np

class GaussianNB:
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.classes = np.unique(y)
        n_classes = self.classes.shape[0]
        # Per-class feature means, variances, and class prior probabilities.
        self.mu = np.zeros((n_classes, n_features))
        self.var = np.zeros((n_classes, n_features))
        self.priors = np.zeros(n_classes)
        for i, y_i in enumerate(self.classes):
            X_i = X[y == y_i, :]
            self.mu[i, :] = X_i.mean(axis=0)
            self.var[i, :] = X_i.var(axis=0)
            self.priors[i] = X_i.shape[0] / n_samples
        return self

    def predict(self, X):
        n_samples = X.shape[0]
        y_pred = np.empty(n_samples, dtype=self.classes.dtype)
        for i in range(n_samples):
            # Gaussian likelihood of each feature under every class,
            # shape (n_classes, n_features).
            density = (1.0 / np.sqrt(2 * np.pi * self.var)) * np.exp(-((X[i] - self.mu) ** 2) / (2 * self.var))
            # Prior times the product of per-feature likelihoods,
            # proportional to the posterior P(y | x).
            prob_density = density.prod(axis=1) * self.priors
            # Predict the class label with the highest (unnormalized) posterior.
            y_pred[i] = self.classes[np.argmax(prob_density)]
        return y_pred
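As a quick sanity check, the classifier can be fit on synthetic data. This is a sketch assuming NumPy is available; the two Gaussian blobs and the random seed are made up for illustration, and the optional cross-check against scikit-learn's sklearn.naive_bayes.GaussianNB is commented out in case scikit-learn is not installed.

import numpy as np

rng = np.random.default_rng(0)

# Two synthetic Gaussian blobs of 100 points each in 2 dimensions.
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
X1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

model = GaussianNB().fit(X, y)
y_pred = model.predict(X)
print("training accuracy:", np.mean(y_pred == y))

# Optional cross-check against scikit-learn's implementation:
# from sklearn.naive_bayes import GaussianNB as SkGaussianNB
# print(np.mean(SkGaussianNB().fit(X, y).predict(X) == y_pred))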