Features are the inputs to a machine learning algorithm. The raw features we are given are rarely the best ones; they are simply whatever happens to be in the data. The goal of feature engineering is to convert the given features into more predictive ones so that the model can predict the label more precisely. Good feature engineering can make a simple algorithm give good results. At the same time, if you apply the best algorithm but do not perform feature engineering well, you are going to get poor results. Feature engineering is a broad subject, and people dedicate their entire careers to it. There are some steps in feature engineering that we need to follow, and usually repeat, to get the job done.

Steps---
1. Explore and understand data relationships
2. Transform features
3. Compute new features from others by applying some maths to them
4. Visualize to check results
5. Test with an ML model
6. Repeat the above steps as needed

Transforming features---
Why transform features?--
1. To improve distribu
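To make steps 2 and 3 above a little more concrete, here is a minimal sketch in plain Python. The numbers and the names `amount` and `n_items` are made up for illustration, and the log transform is just one common choice for a right-skewed feature, not the only possible transform.

```python
import math
import statistics

# Hypothetical right-skewed raw feature (purchase amounts); values are made up.
amount = [100, 200, 400, 800, 1600, 3200, 6400]
# Hypothetical second raw feature: number of items in each purchase.
n_items = [1, 2, 4, 4, 8, 8, 16]


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


# Step 2: transform the feature. log1p compresses the large values,
# which pulls a long right tail in towards the rest of the data.
log_amount = [math.log1p(v) for v in amount]

# Step 3: compute a new feature from existing ones with simple maths.
amount_per_item = [a / n for a, n in zip(amount, n_items)]

raw_skew = skewness(amount)
log_skew = skewness(log_amount)
print(f"skewness before: {raw_skew:.2f}, after log transform: {log_skew:.2f}")
```

A skewness close to zero after the transform suggests the distribution has become more symmetric. Whether that actually helps still has to be checked with a plot and a model, as in steps 4 and 5.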
In the previous post I said that duplicate values create bias while training a model, and I explained how to deal with duplicate values step by step. If you have not read that post, make sure you read it (link-- https://www.blogger.com/blog/post/edit/950003373928384423/4720144211787135428 ), as this post is going to be the practical implementation of the theory explained there. Or just continue reading, as I shall explain while showing how it actually works in practice.

Steps--
1. Finding duplicate values through data exploration---
Before we start finding duplicates we first need to import the dataset and set the columns. Once the data is imported we can start exploring it: use dtypes, describe and other methods to explore the data. See the image below and observe carefully: you will find that we have printed two shapes, the first being the shape of the entire data frame and the other being the shape of the unique customer_id values. What we found is that there are 12 observations more in
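The shape comparison described above can be sketched with pandas roughly as follows. This is only an illustration: the small data frame and the `age` column are made up as a stand-in for the real dataset, and only the `customer_id` column name comes from the post.

```python
import pandas as pd

# Hypothetical stand-in for the imported dataset; the values are made up.
df = pd.DataFrame({
    "customer_id": [101, 102, 102, 103, 104, 104],
    "age": [34, 45, 45, 29, 52, 53],
})

# Basic exploration: column types and summary statistics.
print(df.dtypes)
print(df.describe())

# Shape of the whole data frame versus shape of the unique customer_id values.
print(df.shape)
print(df["customer_id"].unique().shape)

# If the row count is larger than the unique count, some ids are repeated.
n_rows = df.shape[0]
n_unique = df["customer_id"].unique().shape[0]
extra = n_rows - n_unique
print(f"{extra} more rows than unique customer_id values")

# Show every row whose customer_id appears more than once.
dupes = df[df.duplicated(subset="customer_id", keep=False)]
print(dupes)
```

Passing `keep=False` marks all copies of a repeated id, not just the later ones, which makes it easier to look at the repeated rows side by side before deciding what to drop.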