Underfitting
July 30, 2023
Underfitting
Underfitting, overfitting, and generalized model are three important concepts in machine learning that describe how well a model performs on unseen data. Let's explain each of them in detail with examples:
Underfitting: Underfitting occurs when a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both the training data and unseen (test) data. In an underfitting scenario, the model fails to learn the complexities of the data, leading to low accuracy and high bias.
Example: Imagine we have a dataset with two features, "Hours of Study" and "Exam Score," and we want to predict students' exam scores based on their study hours. We fit a linear regression model to the data, and the resulting line is nearly flat, barely capturing the trend. This model is underfitting because it oversimplifies the relationship between study hours and exam scores, resulting in poor predictions for both the training data and new test data.
Overfitting: Overfitting occurs when a model is too complex and tries to memorize the noise and random fluctuations in the training data instead of capturing the underlying patterns. As a result, the model performs extremely well on the training data but fails to generalize to new, unseen data. In an overfitting scenario, the model has low bias but high variance.
Example: Continuing from the previous example, let's say we use a high-degree polynomial regression (e.g., 20th-degree polynomial) to fit the data. The resulting curve precisely fits all the training data points but exhibits wild oscillations. This model is overfitting because it "over-adapts" to the noise in the training data, making it unreliable for predicting exam scores for new students who were not part of the training set.
Generalized Model: A generalized model strikes the right balance between underfitting and overfitting. It captures the underlying patterns in the data without being too simplistic or overly complex, resulting in good performance on both the training data and unseen data.
Example: Instead of using a linear regression or a high-degree polynomial, we fit a second-degree polynomial regression to the data. This model provides a smooth curve that adequately captures the trend between study hours and exam scores. The second-degree polynomial model generalizes well to new students, providing reasonably accurate predictions for their exam scores based on their study hours.
In summary, underfitting occurs when a model is too simple, overfitting occurs when a model is too complex, and a generalized model finds the right balance between the two, capturing the underlying patterns without memorizing the noise. The goal in machine learning is to find a generalized model that performs well on unseen data, indicating a good trade-off between bias and variance. Techniques such as cross-validation and regularization are commonly used to achieve a generalized model.
Interview Questions :
1. What is underfitting?
2. What is Overfitting?
3. What is Generalized Model?
Relative Blogs
Aug 02, 2023
Aug 02, 2023
Feb 27, 2023