Discretization
Discretization of numeric columns
DiscretizeKBins
DiscretizeKBins(self, inputs, *, n_bins=5, strategy='uniform')
A step for binning numeric data into intervals.
Parameters
Name | Type | Description | Default |
---|---|---|---|
inputs | SelectionType | A selection of columns to bin. | required |
n_bins | int | Number of bins to create. | 5 |
strategy | str, one of {'uniform', 'quantile'} | Strategy used to define the bin edges. 'uniform': evenly spaced bins between the minimum and maximum values. 'quantile': bins based on data quantiles. | 'uniform' |
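The difference between the two strategies is easiest to see on skewed data. The following is a minimal standard-library sketch, not the ibis-ml implementation: the helper names `uniform_edges`, `quantile_edges`, and `bin_index` are hypothetical, and the quantile interpolation method and the handling of values that fall exactly on an edge are assumptions that may differ from what the step does.

```python
import bisect
import statistics


def uniform_edges(values, n_bins):
    """Interior cut points for evenly spaced bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def quantile_edges(values, n_bins):
    """Interior cut points taken from the data quantiles."""
    # Assumption: "inclusive" interpolation; the library may use another method.
    return statistics.quantiles(values, n=n_bins, method="inclusive")


def bin_index(value, edges):
    """Bin number for a value, from 0 to len(edges)."""
    # Assumption: a value equal to an edge goes into the bin on its right.
    return bisect.bisect_right(edges, value)


# One large outlier stretches the uniform bins but not the quantile bins.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

u_edges = uniform_edges(data, n_bins=4)    # [25.75, 50.5, 75.25]
q_edges = quantile_edges(data, n_bins=4)   # [2.0, 3.0, 4.0]

u_bins = [bin_index(v, u_edges) for v in data]  # [0, 0, 0, 0, 3]
q_bins = [bin_index(v, q_edges) for v in data]  # [0, 1, 2, 3, 3]
```

With uniform edges, four of the five values fall into the first bin because the outlier widens every interval. Quantile edges follow the data, so the values spread across the bins.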
Raises
Type | Description |
---|---|
ValueError | If n_bins is less than or equal to 1, or if an unsupported strategy is provided. |
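The conditions described above can be sketched as a standalone check. This is an illustration only: `check_kbins_args` and `SUPPORTED_STRATEGIES` are hypothetical names, the error messages are invented, and the library's own validation may be structured differently.

```python
# Hypothetical helper mirroring the documented ValueError conditions.
SUPPORTED_STRATEGIES = {"uniform", "quantile"}


def check_kbins_args(n_bins, strategy):
    """Raise ValueError for arguments the documentation says are rejected."""
    if n_bins <= 1:
        raise ValueError(f"n_bins must be greater than 1, got {n_bins}")
    if strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(
            f"unsupported strategy {strategy!r}; "
            f"expected one of {sorted(SUPPORTED_STRATEGIES)}"
        )


def is_rejected(n_bins, strategy):
    """Return True if the arguments would raise ValueError."""
    try:
        check_kbins_args(n_bins, strategy)
    except ValueError:
        return True
    return False
```

Under these assumptions, `is_rejected(1, "uniform")` and `is_rejected(5, "kmeans")` both return `True`, while `is_rejected(5, "quantile")` returns `False`.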
Examples
>>> import ibis
>>> import ibis_ml as ml
>>> from ibis_ml.core import Metadata
>>> ibis.options.interactive = True
Load the penguins dataset.
>>> p = ibis.examples.penguins.fetch()
Bin all numeric columns.
>>> step = ml.DiscretizeKBins(ml.numeric(), n_bins=10)
>>> step.fit_table(p, Metadata())
>>> step.transform_table(p)
Bin specific numeric columns.
>>> step = ml.DiscretizeKBins(["bill_length_mm"], strategy="quantile")
>>> step.fit_table(p, Metadata())
>>> step.transform_table(p)