Overview
SelectKBest selects the K highest-scoring features according to a scoring function. Commonly used scoring functions include chi2, f_classif, and mutual_info_classif.
Constructor
Parameters
Scoring function that takes (X, y) and returns scores and p-values. Common choices: chi2, f_classif, f_regression, mutualInfoClassif.
Number of top features to select. Use 'all' to keep all features.
Methods
fit()
Training data matrix.
Target values.
transform()
Data to transform.
fitTransform()
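To make the relationship between the three methods concrete, here is a minimal, self-contained sketch of the idea behind them: fitting ranks the per-feature scores and records the top K column indices, and transforming keeps only those columns. It does not call bun-scikit; `topKIndices` and `selectColumns` are hypothetical helpers, and the tie-breaking rule (lower column index first) is an assumption.

```typescript
// Hypothetical helpers for illustration; not part of the bun-scikit API.
type Matrix = number[][];

/** Return the indices of the k highest scores, in ascending column order. */
function topKIndices(scores: number[], k: number | "all"): number[] {
  if (k === "all") return scores.map((_, i) => i);
  if (!Number.isInteger(k) || k < 0 || k > scores.length) {
    throw new RangeError(`k must be an integer in [0, ${scores.length}] or 'all'`);
  }
  return scores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score || a.index - b.index) // ties: lower index first (assumption)
    .slice(0, k)
    .map((entry) => entry.index)
    .sort((a, b) => a - b);
}

/** Keep only the given columns of X. */
function selectColumns(X: Matrix, indices: number[]): Matrix {
  return X.map((row) => indices.map((j) => row[j]));
}

// "Fit": rank the scores produced by some scoring function.
const scores = [0.2, 5.1, 3.3, 0.9];
const selected = topKIndices(scores, 2);

// "Transform": drop every column that was not selected.
const X: Matrix = [
  [1, 2, 3, 4],
  [5, 6, 7, 8],
];
const reduced = selectColumns(X, selected);
console.log(selected, reduced);
```

Under this reading, fitTransform() is the same two steps applied to one dataset in sequence.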
Properties
Scores of all features.
P-values of feature scores (if supported by scoring function).
Indices of selected features.
Examples
Classification with chi2
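As a rough illustration of what a chi2 score measures, the sketch below computes the statistic by hand for a tiny count matrix. It does not call bun-scikit: `chi2Scores` is a hypothetical helper that assumes the scikit-learn-style definition (observed per-class feature sums compared with the sums expected under independence), and it omits p-values.

```typescript
// Hypothetical helper, not a bun-scikit export.
// Assumes the scikit-learn-style chi2 definition: per feature, compare the
// observed per-class sum of values with the sum expected if the feature were
// independent of the class.
function chi2Scores(X: number[][], y: number[]): number[] {
  const nSamples = X.length;
  const nFeatures = X[0].length;
  const classes = Array.from(new Set(y));

  return Array.from({ length: nFeatures }, (_, j) => {
    let featureTotal = 0;
    for (const row of X) {
      if (row[j] < 0) throw new RangeError("chi2 requires non-negative features");
      featureTotal += row[j];
    }

    let statistic = 0;
    for (const c of classes) {
      let observed = 0;
      let classCount = 0;
      X.forEach((row, i) => {
        if (y[i] === c) {
          observed += row[j];
          classCount += 1;
        }
      });
      const expected = (featureTotal * classCount) / nSamples;
      if (expected > 0) statistic += (observed - expected) ** 2 / expected;
    }
    return statistic;
  });
}

// Feature 0 is concentrated in class 1; feature 1 is identical for every sample.
const counts = [
  [3, 1],
  [4, 1],
  [0, 1],
  [1, 1],
];
const labels = [1, 1, 0, 0];
const chi2 = chi2Scores(counts, labels);
console.log(chi2); // feature 0 scores higher than feature 1
```

With K = 1, a selector driven by these scores would keep feature 0 and drop the constant feature 1.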
Regression with f_regression
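For regression targets, the score is typically an F-statistic derived from the correlation between each feature and y. The sketch below assumes the scikit-learn-style formula F = r² / (1 − r²) · (n − 2), where r is the Pearson correlation; `fRegressionScores` is a hypothetical helper, does not call bun-scikit, and omits p-values.

```typescript
// Hypothetical helper, not a bun-scikit export.
// Assumes F = r^2 / (1 - r^2) * (n - 2), with r the Pearson correlation
// between one feature column and the target.
function fRegressionScores(X: number[][], y: number[]): number[] {
  const n = y.length;
  const mean = (v: number[]) => v.reduce((a, b) => a + b, 0) / v.length;
  const yMean = mean(y);

  return X[0].map((_, j) => {
    const column = X.map((row) => row[j]);
    const xMean = mean(column);

    let sxy = 0;
    let sxx = 0;
    let syy = 0;
    for (let i = 0; i < n; i++) {
      const dx = column[i] - xMean;
      const dy = y[i] - yMean;
      sxy += dx * dy;
      sxx += dx * dx;
      syy += dy * dy;
    }

    // Constant columns give sxx = 0 and therefore NaN; a real implementation
    // would need to decide how to handle that case.
    const r2 = (sxy * sxy) / (sxx * syy);
    return (r2 / (1 - r2)) * (n - 2);
  });
}

// Feature 0 tracks the target almost linearly; feature 1 is mostly noise.
const features = [
  [1, 5],
  [2, 3],
  [3, 6],
  [4, 2],
  [5, 4],
];
const target = [2.1, 3.9, 6.2, 7.8, 10.1];
const fScores = fRegressionScores(features, target);
console.log(fScores); // feature 0 scores far higher than feature 1
```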
Use in pipeline
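The point of using a selector inside a pipeline is that the selected columns are learned once from the training data and then reused unchanged on new data. The sketch below illustrates that flow with hypothetical stand-ins: the `Step` interface, `SelectTopK`, `MinMaxScale`, `fitPipeline`, `applyPipeline`, and the `meanGap` scorer are all assumptions made for illustration, and bun-scikit's actual pipeline API may differ.

```typescript
// Hypothetical interfaces for illustration; bun-scikit's real pipeline API may differ.
type Matrix = number[][];
type ScoreFn = (X: Matrix, y: number[]) => number[];

interface Step {
  fit(X: Matrix, y: number[]): void;
  transform(X: Matrix): Matrix;
}

class SelectTopK implements Step {
  selectedIndices: number[] = [];
  private readonly score: ScoreFn;
  private readonly k: number;

  constructor(score: ScoreFn, k: number) {
    this.score = score;
    this.k = k;
  }

  fit(X: Matrix, y: number[]): void {
    this.selectedIndices = this.score(X, y)
      .map((s, j) => ({ s, j }))
      .sort((a, b) => b.s - a.s || a.j - b.j)
      .slice(0, this.k)
      .map(({ j }) => j)
      .sort((a, b) => a - b);
  }

  transform(X: Matrix): Matrix {
    return X.map((row) => this.selectedIndices.map((j) => row[j]));
  }
}

class MinMaxScale implements Step {
  private min: number[] = [];
  private range: number[] = [];

  fit(X: Matrix): void {
    const columns = X[0].map((_, j) => X.map((row) => row[j]));
    this.min = columns.map((col) => Math.min(...col));
    // Fall back to 1 for constant columns to avoid dividing by zero.
    this.range = columns.map((col, j) => (Math.max(...col) - this.min[j]) || 1);
  }

  transform(X: Matrix): Matrix {
    return X.map((row) => row.map((v, j) => (v - this.min[j]) / this.range[j]));
  }
}

// Fit each step on the output of the previous one, using training data only.
function fitPipeline(steps: Step[], X: Matrix, y: number[]): void {
  let current = X;
  for (const step of steps) {
    step.fit(current, y);
    current = step.transform(current);
  }
}

// Apply the already-fitted steps to new data without refitting.
function applyPipeline(steps: Step[], X: Matrix): Matrix {
  return steps.reduce((data, step) => step.transform(data), X);
}

// Stand-in scorer: absolute gap between class means for binary labels 0/1.
const meanGap: ScoreFn = (X, y) =>
  X[0].map((_, j) => {
    const mean = (label: number) => {
      const v = X.filter((_, i) => y[i] === label).map((row) => row[j]);
      return v.reduce((a, b) => a + b, 0) / v.length;
    };
    return Math.abs(mean(1) - mean(0));
  });

const steps: Step[] = [new SelectTopK(meanGap, 2), new MinMaxScale()];
const XTrain = [
  [0, 10, 5],
  [1, 20, 5],
  [0, 30, 5],
  [1, 40, 5],
];
const yTrain = [0, 1, 0, 1];
fitPipeline(steps, XTrain, yTrain);

const XTestOut = applyPipeline(steps, [[1, 25, 9]]);
console.log(XTestOut);
```

The third column is constant in the training data, so it is dropped during fitting and stays dropped for the test row, whatever value it holds there.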
Notes
- Univariate feature selection examines each feature independently
- Does not account for feature interactions
- Fast and scalable to high-dimensional datasets
- chi2 requires non-negative features (e.g., count data, TF-IDF)
- f_classif and f_regression are suitable for continuous features
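The first two notes can be illustrated with an XOR-style dataset, where the label depends on both features together but on neither one alone. The sketch below uses a hypothetical stand-in score (`classMeanGap`, the absolute gap between class means) instead of a bun-scikit scoring function; any per-feature score built on class means behaves the same way here.

```typescript
// Hypothetical stand-in score: absolute gap between the class-1 and class-0
// means of each feature. Not a bun-scikit function.
function classMeanGap(X: number[][], y: number[]): number[] {
  return X[0].map((_, j) => {
    const mean = (label: number) => {
      const values = X.filter((_, i) => y[i] === label).map((row) => row[j]);
      return values.reduce((a, b) => a + b, 0) / values.length;
    };
    return Math.abs(mean(1) - mean(0));
  });
}

// The label is the XOR of the two features: fully determined by the pair,
// but each feature on its own has the same mean in both classes.
const xorX = [
  [0, 0],
  [0, 1],
  [1, 0],
  [1, 1],
];
const xorY = xorX.map(([a, b]) => a ^ b);
const gaps = classMeanGap(xorX, xorY);
console.log(gaps); // both features look uninformative when scored one at a time
```

A univariate selector has no basis for ranking either feature here, even though a model given both could predict the label perfectly.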