6: Dummy Variables & One Hot Encoding

August 6, 2018

Code in tutorial: https://github.com/codebasics/py/tree/master/ML/5_one_hot_encoding
Exercise csv file: https://github.com/codebasics/py/blob/master/ML/5_one_hot_encoding/Exercise/carprices.csv
Machine learning models work well on datasets that contain only numbers. But how do we handle text information in a dataset? A simple approach is integer (label) encoding, but when the categorical variables are nominal, meaning their values have no natural order, label encoding can be problematic because a model may treat the integers as ordered. One hot encoding is a technique that helps in this situation. In this tutorial, we use the pandas get_dummies method to create dummy variables, which lets us perform one hot encoding on a given dataset. Alternatively, we can use OneHotEncoder from sklearn.preprocessing to create the dummy variables.
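As a rough illustration of the difference between label encoding and one hot encoding, here is a minimal sketch using pandas get_dummies. The town names, areas, and prices are made up for this example and are not taken from the tutorial's CSV files; the variable names are likewise only illustrative.

```python
import pandas as pd

# Hypothetical toy data (not the tutorial's CSV): "town" is a nominal column.
df = pd.DataFrame({
    "town": ["monroe", "windsor", "robbinsville", "windsor"],
    "area": [2600, 3000, 2900, 3200],
    "price": [550000, 565000, 610000, 600000],
})

# Label encoding: each town becomes an integer. The integers follow the
# alphabetical order of the names, an ordering that means nothing for towns.
label_encoded = df["town"].astype("category").cat.codes

# One hot encoding: one 0/1 dummy column per town, with no implied order.
# dtype=int gives 0/1 integers regardless of the pandas default dtype.
dummies = pd.get_dummies(df["town"], dtype=int)

# Replace the text column with its dummy columns.
encoded = pd.concat([df.drop(columns="town"), dummies], axis=1)
print(encoded)
```

Each row of the dummy columns has exactly one 1, marking that row's town. When the encoded frame feeds a linear model, a common practice is to drop one of the dummy columns (for example with `drop_first=True` in get_dummies), since its value can be inferred from the others.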
Website: http://codebasicshub.com/
Facebook: https://www.facebook.com/codebasicshub
Twitter: https://twitter.com/codebasicshub
Google +: https://plus.google.com/106698781833798756600