Bias-Variance Trade-Off
Today we'll be covering one of the most important topics in data science: the bias-variance trade-off. Before we dive deep into it, let's understand the basics first.
What is Bias?
Bias is basically the inability of a machine learning model to capture the true relationship in the training data. Look at Figure 2: the model there is not able to follow the data points properly, so we say it has high bias.
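To make this concrete, here is a minimal sketch (my own synthetic data with scikit-learn, purely for illustration, not the data behind the figure): a straight line fit to clearly curved data keeps a high training error no matter what, and that stubborn training error is the signature of high bias.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = np.linspace(0, 4, 50).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 50)   # clearly curved relationship

line = LinearRegression().fit(X, y)              # a straight line cannot bend
print("training MSE:", mean_squared_error(y, line.predict(X)))
# The training error stays high no matter how much data we add:
# the model is too simple for the relationship -- that gap is bias.
```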
Importance of Variance:
When we say variance, we mean how much the model's error varies between the training set and the test set. For example, if your model performs well on the training data but poorly on the test data, the gap between the training and test errors will be huge, which means high variance. Ideally, an ML model should have low variance (a small error gap between the training and test sets).
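Here is a small sketch of what that error gap looks like in practice. The synthetic dataset and the choice of an unpruned decision tree are my own illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 4, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
tree = DecisionTreeRegressor().fit(X_tr, y_tr)   # unconstrained: memorises noise

print("train MSE:", mean_squared_error(y_tr, tree.predict(X_tr)))  # near zero
print("test MSE :", mean_squared_error(y_te, tree.predict(X_te)))  # much larger
# A big gap between these two numbers is the "high variance" signal.
```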
Underfitting and Overfitting:
Underfitting (high bias and low variance) basically means that your model does not even perform well on the training set.
Overfitting is just the opposite (low bias and high variance): the model performs exceptionally well on the training set but poorly on the test set.
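We can watch both failure modes in one place by sweeping model complexity. Again, the data, the model (a decision tree), and the particular depths are illustrative assumptions, not a prescription:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(0, 4, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

for depth in (1, 4, 20):  # too simple, about right, too complex
    tree = DecisionTreeRegressor(max_depth=depth).fit(X_tr, y_tr)
    print(f"depth={depth:2d}"
          f"  train MSE={mean_squared_error(y_tr, tree.predict(X_tr)):.3f}"
          f"  test MSE={mean_squared_error(y_te, tree.predict(X_te)):.3f}")
# depth=1 : both errors high            -> underfitting (high bias)
# depth=20: train error tiny, test high -> overfitting (high variance)
```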
So what's the solution, and what techniques help us overcome this?
While developing any model, we should always aim for both low bias and low variance, but because of the variability in real-world data that is not so easy. To manage this trade-off we have a few techniques (a quick taste of the first one is sketched after the list):
1. Regularization
2. Bagging
3. Boosting
Stay tuned for a detailed walkthrough of these techniques!