Features are the inputs to a machine learning algorithm. The raw features we are given are rarely the best ones; they are simply whatever happens to be in the data. The goal of feature engineering is to convert the given features into more predictive ones so that the model can predict the label more precisely. Good feature engineering can make a simple algorithm give good results. At the same time, if you apply the best algorithm but do not perform feature engineering well, you are likely to get poor results. Feature engineering is a broad subject, and people dedicate entire careers to it. There are some steps that we need to follow, and usually repeat, to get the job done.
Steps---
1. Explore and understand data relationships
2. Transform features
3. Compute new features from other features by applying some math to them
4. Visualize to check results
5. Test with an ML model
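As a rough illustration of how these steps fit together, here is a minimal sketch in plain Python. The data is made up for the example, and the helper names `pearson` and `fit_line_mse` are hypothetical, not taken from any library. It explores the raw relationship (step 1), transforms the feature (step 2), and tests both versions with a very simple model (step 5). Visualization is left out to keep the sketch dependency-free.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fit_line_mse(xs, ys):
    """Fit y = a*x + b by ordinary least squares and return the mean squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

# Toy data, made up for illustration: the label grows with the log of the feature.
feature = [1.0, 10.0, 100.0, 1000.0, 10000.0]
label = [1.0, 3.0, 5.0, 7.0, 9.0]

# Step 1: explore the relationship between the raw feature and the label.
raw_corr = pearson(feature, label)

# Step 2: transform the feature.
log_feature = [math.log10(x) for x in feature]
log_corr = pearson(log_feature, label)

# Step 5: test both versions of the feature with a simple linear model.
raw_mse = fit_line_mse(feature, label)
log_mse = fit_line_mse(log_feature, label)

print(f"correlation: raw={raw_corr:.3f}, log={log_corr:.3f}")
print(f"MSE:         raw={raw_mse:.3f}, log={log_mse:.3f}")
```

On this toy data the log-transformed feature has a higher correlation with the label and a lower error than the raw feature. On real data you would compare on a held-out set and repeat the loop with other transformations.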
Transforming features---
Why transform a feature?--
1. To improve the distribution properties of the feature
2. To improve its correlation with the label
How to transform a feature--
1. Apply a mathematical operation (such as log, square, square root, variance, or exponential) to the feature.
2. Take the difference or cumulative sum of the values in the feature. This method applies mostly to time-ordered features, such as those in date and time format.
3. Note that a nonlinearly transformed feature is not perfectly collinear with the original feature.
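The transformations above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the numbers are made up, and `skewness` is a hypothetical helper written for this example. It shows a log transform pulling in a long right tail, followed by the difference and cumulative sum of a time-ordered series.

```python
import math
from itertools import accumulate

def skewness(xs):
    """Population skewness: 0 for a symmetric distribution, positive for a long right tail."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Made-up, right-skewed values (each one is double the previous one).
values = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
log_values = [math.log(v) for v in values]

print(skewness(values))      # positive: long right tail
print(skewness(log_values))  # approximately 0: the log values are evenly spaced

# Hypothetical rider counts for five consecutive days.
daily_riders = [100, 120, 150, 145, 170]

# Difference: day-over-day change (one element shorter than the input).
daily_change = [b - a for a, b in zip(daily_riders, daily_riders[1:])]

# Cumulative sum: running total of riders up to and including each day.
running_total = list(accumulate(daily_riders))

print(daily_change)   # [20, 30, -5, 25]
print(running_total)  # [100, 220, 370, 515, 685]
```

Whether the log is the right transformation depends on the feature. It only works for positive values, and checking the distribution before and after is the safest way to decide.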
Interaction terms--
Until now we have talked about transforming a single feature. Now consider the example of the number of people traveling on a bus at different times.
This can depend on different factors--
1. The time of day: in the evening people rush home, so the rush is toward the residential areas of the city, while in the morning people rush to their offices in the opposite direction of their evening route.
2. The day: on a holiday people rush out for shopping, movies, etc., but the timing differs from a working day and the volume of people is different.
3. In this scenario, time of day and holiday are interacting terms: the effect of the time of day on the number of passengers depends on whether the day is a holiday.
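To make the bus example concrete, here is a minimal sketch in plain Python. The rider counts are invented for illustration, and the helpers `add_interaction` and `evening_effect` are hypothetical names. The interaction term is simply the product of the two indicator features, and the second helper shows why it is needed: the evening effect is different on working days and on holidays.

```python
# Made-up observations: (is_evening, is_holiday, riders). Values are illustrative only.
observations = [
    (0, 0, 80),   # working-day morning
    (1, 0, 90),   # working-day evening
    (0, 1, 20),   # holiday morning
    (1, 1, 70),   # holiday evening
]

def add_interaction(rows):
    """Insert the product is_evening * is_holiday as an extra feature column."""
    return [(e, h, e * h, riders) for e, h, riders in rows]

def evening_effect(rows, holiday):
    """Mean evening riders minus mean morning riders for the given day type."""
    evening = [r for e, h, r in rows if h == holiday and e == 1]
    morning = [r for e, h, r in rows if h == holiday and e == 0]
    return sum(evening) / len(evening) - sum(morning) / len(morning)

with_interaction = add_interaction(observations)
workday_effect = evening_effect(observations, holiday=0)
holiday_effect = evening_effect(observations, holiday=1)

print(with_interaction)
print(workday_effect, holiday_effect)  # 10.0 50.0
```

Because the two effects differ, a model that only adds a separate contribution for "evening" and for "holiday" cannot fit these four points exactly. The extra product column gives it a way to represent the combination.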