Understanding the Problem Statement and Importing the Dataset

Performing basic EDA to get insights into the data

Understanding balanced and unbalanced datasets

Understanding different types of models and their importance in different scenarios

Identifying similar features and constant features using the standard deviation method

Imputing missing values and removing redundant features

Plotting box plots to identify outliers

Using the min and max values of the training set to limit the number of variables

Splitting the dependent and independent columns

Applying Random Forest as the model for training and understanding its parameters

Using the summary function and understanding the output

Plotting and understanding the confusion matrix

Tuning the hyperparameters of the Random Forest model to get the best result

Calculating the probability score for the model
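
As a rough illustration of the preprocessing steps listed above, here is a minimal sketch in plain Python. It rests on several assumptions that are not taken from the project itself: the toy columns (`var_a` to `var_d`) and the helper names are hypothetical, "similar" features are treated as exact duplicates, and the min/max step is read as clipping test values to the range seen in the training set. The project may well handle these steps differently.

```python
from statistics import pstdev

# Hypothetical toy data: each key stands in for an anonymized feature column.
train = {
    "var_a": [1.0, 2.0, 3.0, 4.0],
    "var_b": [5.0, 5.0, 5.0, 5.0],  # constant: standard deviation is 0
    "var_c": [1.0, 2.0, 3.0, 4.0],  # exact duplicate of var_a
    "var_d": [10.0, 0.0, 7.0, 3.0],
}
test = {
    "var_a": [0.0, 2.5, 9.0],
    "var_d": [-4.0, 5.0, 12.0],
}


def constant_features(columns):
    """Return names of columns whose standard deviation is zero."""
    return [name for name, values in columns.items() if pstdev(values) == 0]


def duplicate_features(columns):
    """Return names of columns that exactly repeat an earlier column.

    Assumption: 'similar' is taken to mean identical values here.
    """
    seen = set()
    duplicates = []
    for name, values in columns.items():
        key = tuple(values)
        if key in seen:
            duplicates.append(name)
        else:
            seen.add(key)
    return duplicates


def clip_to_train_range(train_values, test_values):
    """Clip test values to the [min, max] range observed in training."""
    low, high = min(train_values), max(train_values)
    return [min(max(value, low), high) for value in test_values]


# Drop constant and duplicate columns, then clip the test columns that remain.
to_drop = set(constant_features(train)) | set(duplicate_features(train))
kept = {name: values for name, values in train.items() if name not in to_drop}
clipped_test = {
    name: clip_to_train_range(kept[name], test[name]) for name in kept
}

print(sorted(to_drop))
print(clipped_test)
```

On real data with hundreds of columns, the same checks would normally be done with a data-frame library rather than dictionaries of lists; the dictionary form is used here only to keep the sketch self-contained.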

Customer satisfaction is a key measure of success. Unhappy customers don't stick around. What's more, unhappy customers rarely voice their dissatisfaction before leaving.

Santander Bank is asking for help identifying dissatisfied customers early in their relationship. Doing so would allow Santander to take proactive steps to improve a customer's happiness before it's too late.

In this machine learning project, you'll work with hundreds of anonymized features to predict if a customer is satisfied or dissatisfied with their banking experience.

28-Aug-2016

05h 15m