The Right and the Wrong Way to Do Cross-validation

  You might wonder why we need cross-validation in the first place. Let’s explain that first. The generalization performance of a machine learning algorithm is judged by its prediction capability on independent test data. This assessment is of utmost importance to us. Cross-validation is one such model validation technique...

Parametric data analysis for A/B test in marketing analytics

  With the onset of the digital world, new websites are created every day to sell a wide range of products and services, and surviving in this highly competitive market is itself an art. Companies are hiring tech geeks to customize their websites and services according to the response rate of customers....

Feature selection using Decision Tree

  One of the key differentiators in any data science problem is the quality of feature selection and feature importance. When a lot of data is available to the model, feature selection becomes unavoidable, both because of computational constraints and because eliminating noisy variables improves prediction. Also,...

How to avoid overfitting while training?

  Overfitting happens mostly because the model becomes too complex. Such a model gives poor accuracy on unseen data, as it memorizes the noise in the training data. A model is usually fit by maximizing accuracy on the training data set. However, its effectiveness is judged by its performance on test data. Overfitting occurs...