Model Building Approaches: Generative and Discriminative

Atin Singh
4 min read · Nov 29, 2020


Different machine learning algorithms are applied to build models for solving complex problems and achieving better results. Fundamentally, these can be classified into two broad approaches: Generative and Discriminative.

Consider a simple classification problem of predicting whether a breast tumor is malignant or benign, based on the tumor size.

In this problem, Y represents the class label (Malignant/Benign) to be predicted, whereas x represents the input feature (tumor size). In practice there would be many other features to consider in this scenario, but for simplicity let us train the model on tumor size alone and predict the tumor category.

From a probabilistic perspective, the goal is to find the conditional distribution P(Y | x). In simple terms, the model needs to find the probability of the tumor being malignant or benign given its size.

P(Y | x): the probability of Y (malignant/benign) given the tumor size (x).

Figure 1: Scenario - classifying a tumor as malignant or benign

Before we proceed to select an algorithm to train a model, it is important to understand the different ways to approach this problem. As mentioned above, we can approach any machine learning problem with two broad classes of models.

Generative Models explicitly model the actual distribution of each class label (Malignant/Benign) throughout the data space. The goal here is to find the joint probability P(x, Y) and then use this joint distribution to evaluate the conditional probability P(Y | x) in order to make predictions of Y for new values of x.

Steps:

· Assume some functional form for P(x | Y) and P(Y), and estimate their parameters directly from the training data

· Use Bayes' rule to calculate the posterior from the following quantities (the formula is spelled out after this list):

o Prior probability: the estimate of the probability of a class label Y, computed from the training data (the available labeled tumor sizes) before the current evidence x is observed.

o Evidence: the newly observed data point x (a tumor size that was not used in computing the prior); its marginal probability P(x) normalizes the posterior.

o Posterior probability: the probability of a class label Y given the observed evidence x.

o Likelihood: the probability of observing x (a tumor size in the training data) given a class label Y.
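
Putting these pieces together, Bayes' rule computes the posterior as:

P(Y | x) = P(x | Y) P(Y) / P(x), i.e., Posterior = (Likelihood × Prior) / Evidence

Below is a minimal Python sketch of this recipe. The tumor-size data is made up for illustration, and the class-conditionals P(x | Y) are assumed to be 1-D Gaussians (an assumption chosen for simplicity, not something fixed by the approach itself):

```python
# A minimal sketch of the generative recipe above.
# Assumptions: a single 1-D feature (tumor size), Gaussian class-conditionals
# P(x | Y), and made-up training data (0 = benign, 1 = malignant).
import numpy as np
from scipy.stats import norm

sizes = np.array([1.2, 1.5, 1.1, 2.0, 3.8, 4.1, 3.5, 4.4])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Step 1: estimate the prior P(Y) and the parameters of the likelihood P(x | Y)
priors = {y: np.mean(labels == y) for y in (0, 1)}
params = {y: (sizes[labels == y].mean(), sizes[labels == y].std(ddof=1))
          for y in (0, 1)}

# Step 2: apply Bayes' rule to get the posterior P(Y | x) for a new tumor size
def posterior(x):
    joint = {y: norm.pdf(x, *params[y]) * priors[y] for y in (0, 1)}  # P(x, Y)
    evidence = sum(joint.values())                                    # P(x)
    return {y: joint[y] / evidence for y in (0, 1)}                   # P(Y | x)

print(posterior(3.0))  # two probabilities that sum to 1
```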

Figure 2: Generative model - decision boundary determined by the distribution of class labels (Malignant/Benign)

Examples of Generative Models are Naive Bayes, Bayesian Networks, and Hidden Markov Models (HMMs).

Discriminative Models instead model the decision boundary between the class labels (Malignant/Benign) by directly assuming some functional form for the conditional distribution P(Y | x). The parameters of P(Y | x), and hence the decision boundary, are estimated directly from the training data based on the prediction error/loss.

Figure 3: Discriminative model - decision boundary determined by the error/loss (Malignant/Benign)

Examples of Discriminative Models are Logistic Regression, Decision Trees, and Support Vector Machines (SVMs).
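
For contrast, here is a similar sketch of the discriminative recipe on the same made-up data. A functional form is assumed for P(Y | x) directly (logistic regression: P(Y = 1 | x) = sigmoid(w·x + b)), and the parameters w, b are estimated by gradient descent on the prediction (log) loss; no distribution over x is modeled at all:

```python
# A minimal sketch of the discriminative recipe: model P(Y | x) directly and
# fit its parameters by minimizing the loss. Same made-up data as above.
import numpy as np

sizes = np.array([1.2, 1.5, 1.1, 2.0, 3.8, 4.1, 3.5, 4.4])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    p = sigmoid(w * sizes + b)               # current estimate of P(Y = 1 | x)
    w -= lr * np.mean((p - labels) * sizes)  # gradient of the log loss w.r.t. w
    b -= lr * np.mean(p - labels)            # gradient of the log loss w.r.t. b

print(sigmoid(w * 3.0 + b))  # P(malignant | tumor size = 3.0)
```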

Model Structure

Figure 4: Model structure - generative vs. discriminative

The graph above shows the difference in the structures of the two models. The circles represent variables, and the direction of the arrows indicates which probabilities we can infer: in the generative structure the arrow runs from Y to x (we model P(Y) and P(x | Y)), while in the discriminative structure it runs from x to Y (we model P(Y | x) directly).

Comparison of Generative and Discriminative Approaches

Briefly: generative models learn the full joint distribution P(x, Y), so they can also generate new data points and cope with missing features, but they rely on stronger assumptions about how the data is distributed. Discriminative models learn P(Y | x) (or the boundary itself) directly, make fewer assumptions, and typically achieve better classification accuracy when training data is plentiful, but they cannot generate new samples.

Getting the Best of Both Worlds

Can there be a principled approach to combining generative and discriminative methods, one that not only gives a more satisfying foundation for developing new models but also brings practical benefits? One example where both approaches are leveraged is the Generative Adversarial Network (GAN). We will explore GAN models in detail in another post, but for a high-level understanding of the model building approach, a GAN has two parts:

· The generator learns to generate plausible data. The generated instances become negative training examples for the discriminator.

· The discriminator learns to distinguish the generator’s fake data from real data. The discriminator penalizes the generator for producing implausible results.

Figure 5: GAN model structure for producing fake images (Source: developers.google.com)

When training begins, the generator produces obviously fake data, and the discriminator quickly learns to identify it as fake. As the generator improves with training, the discriminator's performance gets worse, because it can no longer easily tell the difference between real and fake data.
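
To make this loop concrete, here is a deliberately tiny sketch (assuming PyTorch is available) that trains a generator to mimic samples from a 1-D Gaussian; image GANs follow the same adversarial loop with much larger networks:

```python
# A toy GAN: the generator maps noise to 1-D samples, the discriminator
# scores samples as real or fake, and the two are trained adversarially.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 4.0  # "real" data drawn from N(4, 0.5)
    fake = G(torch.randn(64, 8))           # generator's samples from noise

    # Discriminator step: push real toward label 1, fakes toward label 0
    d_loss = (loss_fn(D(real), torch.ones(64, 1)) +
              loss_fn(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label fakes as real (1)
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

print(G(torch.randn(5, 8)).detach().squeeze())  # samples should drift toward 4
```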

A few applications of GANs are image generation, face frontal-view generation, and text-to-image synthesis.

Conclusion

Selecting an algorithm to build a model depends on several factors, such as the use case and the available training data, compute, and storage. It is necessary to understand the different model building approaches before trying to solve a problem. The best solution is usually the simplest one.
