Discretization

Discretization of numeric columns

DiscretizeKBins

DiscretizeKBins(self, inputs, *, n_bins=5, strategy='uniform')

A step for binning numeric data into intervals.

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| inputs | SelectionType | A selection of columns to bin. | required |
| n_bins | int | Number of bins to create. | 5 |
| strategy | str, {'uniform', 'quantile'} | Strategy used to define the bin edges. 'uniform': evenly spaced bins between the minimum and maximum values. 'quantile': bins are created based on data quantiles. | 'uniform' |
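The difference between the two strategies is easiest to see on skewed data. The following is a minimal NumPy sketch of how bin edges could be derived under each strategy; it is not ibis-ml's implementation, and the helper names `uniform_edges` and `quantile_edges` are hypothetical. How the library treats boundary values (which side of an interval is closed) may differ from `np.digitize` as used here.

```python
import numpy as np


def uniform_edges(values, n_bins):
    """Hypothetical helper: evenly spaced edges between the min and max."""
    lo, hi = float(np.min(values)), float(np.max(values))
    return np.linspace(lo, hi, n_bins + 1)


def quantile_edges(values, n_bins):
    """Hypothetical helper: edges placed at evenly spaced quantiles."""
    probs = np.linspace(0.0, 1.0, n_bins + 1)
    return np.quantile(values, probs)


# A small sample with one large outlier.
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

u = uniform_edges(values, 4)
q = quantile_edges(values, 4)

# Assign each value a bin index in 0..n_bins-1 using the interior edges.
uniform_bins = np.digitize(values, u[1:-1])
quantile_bins = np.digitize(values, q[1:-1])

print("uniform edges: ", u)
print("quantile edges:", q)
print("uniform bins:  ", uniform_bins)
print("quantile bins: ", quantile_bins)
```

With uniform edges the outlier stretches the range, so the four small values all fall into the first bin. With quantile edges the values are spread across the bins, which is usually the reason to choose `strategy="quantile"` for skewed columns.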

Raises

| Type | Description |
|------|-------------|
| ValueError | If n_bins is less than or equal to 1 or if an unsupported strategy is provided. |

Examples

>>> import ibis
>>> import ibis_ml as ml
>>> from ibis_ml.core import Metadata
>>> ibis.options.interactive = True

Load the penguins dataset.

>>> p = ibis.examples.penguins.fetch()

Bin all numeric columns.

>>> step = ml.DiscretizeKBins(ml.numeric(), n_bins=10)
>>> step.fit_table(p, Metadata())
>>> step.transform_table(p)

Bin specific numeric columns.

>>> step = ml.DiscretizeKBins(["bill_length_mm"], strategy="quantile")
>>> step.fit_table(p, Metadata())
>>> step.transform_table(p)