Imputation

Imputation and handling of missing values

ImputeMean

ImputeMean(self, inputs)

A step for replacing NULL values in select columns with their respective means in the training set.

Parameters

Name | Type | Description | Default
inputs | SelectionType | A selection of columns to impute. All columns must be numeric. | required

Examples

>>> import ibis_ml as ml

Replace NULL values in all numeric columns with their respective means, computed from the training dataset.

>>> step = ml.ImputeMean(ml.numeric())
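The mean is learned from the training set and then reused unchanged on any later data. A minimal plain-Python sketch of that fit-then-apply behavior follows, with lists standing in for columns and None standing in for NULL. The helper names fit_mean and apply_fill are hypothetical and are not part of ibis_ml.

```python
from statistics import fmean


def fit_mean(train_values):
    """Learn the fill value: the mean of the non-NULL training values."""
    observed = [v for v in train_values if v is not None]
    return fmean(observed)


def apply_fill(values, fill):
    """Replace each NULL (None) with the previously learned fill value."""
    return [fill if v is None else v for v in values]


train = [1.0, None, 3.0, 5.0]
test = [None, 10.0]

# The statistic comes from the training data only: (1 + 3 + 5) / 3.
mean_from_train = fit_mean(train)

train_imputed = apply_fill(train, mean_from_train)
# New data reuses the training mean; it is not recomputed from `test`.
test_imputed = apply_fill(test, mean_from_train)
```

Keeping the fit and apply steps separate is what prevents information from later data leaking into the imputed values.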

ImputeMode

ImputeMode(self, inputs)

A step for replacing NULL values in select columns with their respective modes in the training set.

Parameters

Name | Type | Description | Default
inputs | SelectionType | A selection of columns to impute. | required

Examples

>>> import ibis_ml as ml

Replace NULL values in all numeric columns with their respective modes, computed from the training dataset.

>>> step = ml.ImputeMode(ml.numeric())
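Unlike the mean and median steps, the parameter description here does not require numeric columns, so a mode can in principle be learned for categorical data as well. The sketch below illustrates the idea in plain Python on a column of strings. The helper fit_mode is hypothetical, and its tie-breaking rule (first value seen wins) is an assumption of this sketch that may differ from what a given backend does.

```python
from collections import Counter


def fit_mode(train_values):
    """Learn the fill value: the most frequent non-NULL training value."""
    observed = [v for v in train_values if v is not None]
    # most_common(1) returns the value with the highest count; equal counts
    # are ordered by first occurrence in this sketch.
    return Counter(observed).most_common(1)[0][0]


train_colors = ["red", None, "blue", "red", None]
mode_from_train = fit_mode(train_colors)

# Later data is filled with the mode learned from the training column.
new_colors = [None, "green"]
new_imputed = [mode_from_train if v is None else v for v in new_colors]
```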

ImputeMedian

ImputeMedian(self, inputs)

A step for replacing NULL values in select columns with their respective medians in the training set.

Parameters

Name | Type | Description | Default
inputs | SelectionType | A selection of columns to impute. All columns must be numeric. | required

Examples

>>> import ibis_ml as ml

Replace NULL values in all numeric columns with their respective medians, computed from the training dataset.

>>> step = ml.ImputeMedian(ml.numeric())
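One common reason to prefer the median over the mean is that it is less sensitive to outliers. The plain-Python sketch below compares the two fill values on a small column with one extreme value, again using None for NULL. It uses the standard library's exact median, which averages the two middle values for an even count; whether a particular backend computes an exact or approximate median is not stated here and should be checked.

```python
from statistics import fmean, median

train = [1.0, 2.0, None, 3.0, 100.0]
observed = [v for v in train if v is not None]

# Sorted non-NULL values are [1, 2, 3, 100]; the median averages 2 and 3.
median_fill = median(observed)
# The mean is pulled toward the outlier: (1 + 2 + 3 + 100) / 4.
mean_fill = fmean(observed)

imputed = [median_fill if v is None else v for v in train]
```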

FillNA

FillNA(self, inputs, fill_value)

A step for filling NULL values in the input with a specific value.

Parameters

Name | Type | Description | Default
inputs | SelectionType | A selection of columns to fill. | required
fill_value | Any | The fill value to use. Must be castable to the dtype of all columns in inputs. | required

Examples

>>> import ibis_ml as ml

Fill all NULL values in numeric columns with 0.

>>> step = ml.FillNA(ml.numeric(), 0)

Fill all NULL values in specific columns with 1.

>>> step = ml.FillNA(["x", "y"], 1)
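The requirement that fill_value be castable to the dtype of every selected column can be illustrated with a small plain-Python sketch. The fill_na helper and its cast argument are hypothetical stand-ins for a column dtype, not part of ibis_ml, and None stands in for NULL.

```python
def fill_na(values, fill_value, cast):
    """Replace None with fill_value after casting it to the column's type."""
    typed_fill = cast(fill_value)  # raises if the value is not castable
    return [typed_fill if v is None else v for v in values]


# The same fill value, 0, is cast to each column's own type.
ints = fill_na([1, None, 3], 0, int)
floats = fill_na([0.5, None], 0, float)

# A fill value that cannot be cast to the column type is rejected.
try:
    fill_na([1, None], "unknown", int)
except ValueError:
    cast_failed = True
else:
    cast_failed = False
```

Unlike the imputation steps above, nothing is learned from the training set here: the fill value is fixed when the step is created.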