Busigence

Decision Intelligence Company

Currently browsing: Data Science

Shrinkage Methods in Linear Regression

Apr 25, 2017 DATA SCIENCE, STATISTICAL INFERENCE

Ever wondered, “Why is Linear Regression giving me such good accuracy on the training set but such low accuracy on the test set, in spite of adding all the available independent features to the model?” This question seems inexplicable to many people, but it is answered by a concept called...

Harsh

Data Science Engineer – Who, What, & Why?

Oct 25, 2016 DATA SCIENCE, TALENT SCIENCE

Data science is the study of where information comes from, what it represents and how it can be turned into a valuable resource in the creation of business and IT strategies. Mining large amounts of structured and unstructured data to identify patterns can help an organization rein in costs, increase efficiencies, recognize new...

Pranav

Friend Follower Analysis using Apache Spark GraphX’s PageRank algorithm

Oct 25, 2016 ALGORITHMS, DATA SCIENCE

GraphX is Apache Spark’s API for graphs and graph-parallel computation. This includes transformation, exploration, and graph computation. Data can be viewed both as graphs and as collections. This use case discusses friend follower analysis using Apache Spark GraphX’s PageRank operator. PageRank measures the importance of each vertex in a graph by determining which vertices have the...

Pranav

Data Science – Let the Data Sing

Oct 21, 2016 DATA SCIENCE, DATA VISUALIZATION

The hype is real. But let’s get past it. What exactly is Data Science? And why is it the next big thing? Massive amounts of data are being generated every second. The total amount of data in the world is 4.4 zettabytes. And this is not just internet data. We are talking...

Sijoy

Feature selection using Decision Tree

Oct 21, 2016 DATA SCIENCE

One of the key differentiators in any data science problem is the quality of feature selection and feature importance ranking. When a lot of data is available to the model, feature selection becomes unavoidable, both because of computational constraints and because eliminating noisy variables leads to better prediction. Also,...

Akhil

Hyperparameter Optimization and Why is it important?

Oct 21, 2016 DATA SCIENCE, MACHINE LEARNING

A machine learning model consists of various parameters that need to be learned from the data. The crux of machine learning is fitting a model to the data. This process of using existing data to fit the model parameters is called model training. Hyperparameters refer to another kind of parameter...

Ananya

How to avoid overfitting while training?

Oct 21, 2016 DATA SCIENCE, STATISTICAL INFERENCE

Overfitting happens mostly because the model becomes too complex. Such a model gives poor accuracy on unseen data, as it memorizes the noise in the training data. A model is usually fit by achieving the highest accuracy on the training data set. However, its effectiveness is judged by its performance on test data. Overfitting occurs...