Hello, Space Voyager! Today, we're venturing through a fascinating territory: Categorical Data Encoding! Categorical data consists of groups or traits such as "gender", "marital status", or "hometown". We convert categories into numbers using Label and One-Hot Encoding techniques for our machine-learning mates.
Label Encoding maps categories to numbers ranging from 0 through N-1, where N represents the unique category count. It's beneficial for ordered data like "Small", "Medium", and "Large".
To illustrate, here is a Python list of shirt sizes:
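A minimal sketch of such a list; the variable name `shirt_sizes` and the sample values are illustrative assumptions rather than the lesson's exact data:

```python
# Hypothetical sample data: each entry is one shirt's size category.
shirt_sizes = ["Small", "Medium", "Large", "Medium", "Small"]

# Three unique categories, so Label Encoding would use the numbers 0 through 2.
unique_count = len(set(shirt_sizes))
print(unique_count)
```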
Python's Pandas library can be used to assign 0 to "Small", 1 to "Medium", and 2 to "Large":
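One way this could look, as a sketch: it assumes an ordered `pd.Categorical` with the size order spelled out explicitly, and the names `shirt_sizes`, `size_order`, and `encoded` are illustrative. Other Pandas approaches, such as mapping a dictionary over a Series, would work as well.

```python
import pandas as pd

# Hypothetical sample data for illustration.
shirt_sizes = ["Small", "Medium", "Large", "Medium", "Small"]

# State the order explicitly so the codes follow Small < Medium < Large
# instead of alphabetical order or order of appearance.
size_order = ["Small", "Medium", "Large"]
categorical = pd.Categorical(shirt_sizes, categories=size_order, ordered=True)

# .codes holds the integer label (0 through N-1) for each entry.
encoded = categorical.codes.tolist()
print(encoded)  # [0, 1, 2, 1, 0]
```

Spelling out the order matters here, Voyager: without it, the categories would be sorted alphabetically, and "Large" would receive 0 rather than 2.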
