Overview
OneHotEncoder transforms categorical features into a one-hot encoded representation. Each categorical feature is converted into multiple binary features, one for each category.
Constructor
How to handle unknown categories during transform:
- 'error': Raise an error if an unknown category is found
- 'ignore': Set all one-hot values to 0 for unknown categories
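The difference between the two modes can be illustrated with a small standalone sketch. It does not call the library; `encodeValue` and the `HandleUnknown` type are hypothetical names introduced here, and the category list is assumed to be the sorted unique values learned during fit.

```typescript
// Hypothetical helper for illustration only; not the library's implementation.
type HandleUnknown = "error" | "ignore";

function encodeValue(
  value: number,
  categories: number[], // assumed: sorted unique values learned during fit
  handleUnknown: HandleUnknown,
): number[] {
  const row = new Array<number>(categories.length).fill(0);
  const index = categories.indexOf(value);
  if (index === -1) {
    if (handleUnknown === "error") {
      throw new Error(`Unknown category: ${value}`);
    }
    // 'ignore': every one-hot value for this feature stays 0
    return row;
  }
  row[index] = 1;
  return row;
}

const knownCategories = [0, 1, 2];
const seenRow = encodeValue(1, knownCategories, "ignore");   // [0, 1, 0]
const unseenRow = encodeValue(5, knownCategories, "ignore"); // [0, 0, 0]
```

With 'ignore', an unseen value produces an all-zero block, which downstream code cannot distinguish from "no category"; with 'error', the same call throws instead.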
Properties
The categories for each feature. Each inner array contains the sorted unique values for that feature.
Number of features seen during fit.
Total number of features in the one-hot encoded output.
Starting column index for each input feature in the output matrix.
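How these four values relate can be shown with a rough standalone sketch. The property names are not visible in this text, so `categories`, `nFeaturesIn`, `nFeaturesOut`, and `featureOffsets` below are assumed names, and `fitSketch` is a hypothetical function working on plain arrays rather than the library's matrix type.

```typescript
// Assumed field names; the real property names may differ.
interface FittedState {
  categories: number[][];   // sorted unique values per feature
  nFeaturesIn: number;      // number of input columns seen during fit
  nFeaturesOut: number;     // total number of one-hot output columns
  featureOffsets: number[]; // starting output column for each input feature
}

function fitSketch(X: number[][]): FittedState {
  const nFeaturesIn = X.length > 0 ? X[0].length : 0;
  const categories: number[][] = [];
  const featureOffsets: number[] = [];
  let nFeaturesOut = 0;
  for (let j = 0; j < nFeaturesIn; j++) {
    // Collect the unique values of column j and sort them numerically.
    const unique = Array.from(new Set(X.map((row) => row[j]))).sort((a, b) => a - b);
    categories.push(unique);
    featureOffsets.push(nFeaturesOut); // this feature's block starts here
    nFeaturesOut += unique.length;
  }
  return { categories, nFeaturesIn, nFeaturesOut, featureOffsets };
}

const fitted = fitSketch([
  [0, 10],
  [1, 20],
  [0, 30],
]);
// fitted.categories     -> [[0, 1], [10, 20, 30]]
// fitted.nFeaturesIn    -> 2
// fitted.nFeaturesOut   -> 5
// fitted.featureOffsets -> [0, 2]
```

Each offset is the running sum of the category counts of the preceding features, so the total output width equals the last offset plus the last feature's category count.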
Methods
fit
Training data matrix where each row is a sample and each column is a categorical feature.
this - The fitted encoder instance.
transform
Data matrix to encode.
Matrix - One-hot encoded data.
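As a minimal sketch of what the encoding step produces, the following standalone function maps each value to one column within its feature's block. It uses plain `number[][]` arrays instead of the library's Matrix type, `transformSketch` is a hypothetical name, and it always throws on unknown values (the 'error' behavior).

```typescript
// Hypothetical illustration on plain arrays; not the library's transform().
function transformSketch(X: number[][], categories: number[][]): number[][] {
  // Starting column of each feature's block in the output.
  const offsets: number[] = [];
  let width = 0;
  for (const cats of categories) {
    offsets.push(width);
    width += cats.length;
  }
  return X.map((row) => {
    const out = new Array<number>(width).fill(0);
    row.forEach((value, j) => {
      const k = categories[j].indexOf(value);
      if (k === -1) {
        throw new Error(`Unknown category ${value} in feature ${j}`);
      }
      out[offsets[j] + k] = 1; // exactly one 1 per feature block
    });
    return out;
  });
}

const encodedRows = transformSketch(
  [
    [0, 20],
    [1, 10],
  ],
  [
    [0, 1],
    [10, 20, 30],
  ],
);
// encodedRows -> [[1, 0, 0, 1, 0], [0, 1, 1, 0, 0]]
```

The first two output columns belong to the first feature and the remaining three to the second, so every encoded row contains one 1 per input feature.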
fitTransform
Fits the encoder to X and returns the encoded data. Equivalent to calling fit(X).transform(X).
Training data matrix to fit and encode.
Matrix - One-hot encoded data.
Example
Example: Handling Unknown Categories
Example: Single Feature Encoding
Example: Feature Offsets
Notes
- Categories are automatically determined from the data during fit
- Each categorical value is mapped to a unique column in the output
- The encoder must be fitted before calling transform()
- Input data must be finite numeric values representing categories
- Use handleUnknown='ignore' if test data may contain unseen categories
- For label encoding (single column output), use LabelEncoder instead