Imputation
Imputation and handling of missing values
ImputeMean
ImputeMean(self, inputs)
A step that replaces NULL values in the selected columns with each column's mean, computed from the training set.
Parameters
Name | Type | Description | Default |
---|---|---|---|
inputs | SelectionType | A selection of columns to impute. All columns must be numeric. | required |
Examples
>>> import ibis_ml as ml
Replace NULL values in all numeric columns with their respective means, computed from the training dataset.
>>> step = ml.ImputeMean(ml.numeric())
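What "computed from the training dataset" means can be sketched in plain Python, independent of ibis_ml: the mean is learned once from the training values and then reused to fill NULLs in any later data. The helpers `fit_mean` and `apply_fill` are hypothetical names used only for illustration; this is not the library's implementation.

```python
from statistics import fmean


def fit_mean(values):
    """Return the mean of the non-NULL (non-None) training values."""
    observed = [v for v in values if v is not None]
    return fmean(observed)


def apply_fill(values, fill):
    """Replace each None with a previously learned fill value."""
    return [fill if v is None else v for v in values]


train = [1.0, None, 3.0, 5.0]
test = [None, 10.0]

# The statistic is learned from the training data only...
train_mean = fit_mean(train)

# ...and reused unchanged when filling new data.
filled_test = apply_fill(test, train_mean)
```

The fill value for `test` comes from `train`, not from `test` itself, so no information from unseen data leaks into the transformation.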
ImputeMode
ImputeMode(self, inputs)
A step that replaces NULL values in the selected columns with each column's mode (most frequent value), computed from the training set.
Parameters
Name | Type | Description | Default |
---|---|---|---|
inputs | SelectionType | A selection of columns to impute. | required |
Examples
>>> import ibis_ml as ml
Replace NULL values in all numeric columns with their respective modes, computed from the training dataset.
>>> step = ml.ImputeMode(ml.numeric())
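Unlike the mean and median steps, the parameter table above does not require numeric columns. A rough plain-Python sketch of mode imputation on a string column follows; the data and variable names are made up for illustration, and how ibis_ml breaks ties between equally frequent values is not stated here, so the example avoids ties.

```python
from statistics import mode

# Hypothetical training column with one NULL (None) entry.
train = ["red", "blue", None, "red", "green"]
observed = [v for v in train if v is not None]

# Most frequent observed value in the training data.
train_mode = mode(observed)

# Reuse the learned mode to fill NULLs in new data.
new_data = [None, "blue", None]
filled = [train_mode if v is None else v for v in new_data]
```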
ImputeMedian
ImputeMedian(self, inputs)
A step that replaces NULL values in the selected columns with each column's median, computed from the training set.
Parameters
Name | Type | Description | Default |
---|---|---|---|
inputs | SelectionType | A selection of columns to impute. All columns must be numeric. | required |
Examples
>>> import ibis_ml as ml
Replace NULL values in all numeric columns with their respective medians, computed from the training dataset.
>>> step = ml.ImputeMedian(ml.numeric())
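One common reason to prefer the median over the mean is robustness to outliers. The sketch below, in plain Python with made-up data, compares the two fill values on a column containing one extreme value; it illustrates the statistics only and is not the library's implementation.

```python
from statistics import fmean, median

# Hypothetical training column; 100.0 is an outlier, None is a NULL.
train = [1.0, 2.0, None, 3.0, 100.0]
observed = [v for v in train if v is not None]

mean_fill = fmean(observed)      # pulled toward the outlier
median_fill = median(observed)   # middle of the sorted observed values

# Fill the NULL with the median learned from the observed values.
filled = [median_fill if v is None else v for v in train]
```

Here the mean-based fill would be 26.5, while the median-based fill is 2.5, which is much closer to the typical values in the column.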
FillNA
FillNA(self, inputs, fill_value)
A step for filling NULL values in the input with a specific value.
Parameters
Name | Type | Description | Default |
---|---|---|---|
inputs | SelectionType | A selection of columns in which to fill NULL values. | required |
fill_value | Any | The fill value to use. Must be castable to the dtype of all columns in inputs. | required |
Examples
>>> import ibis_ml as ml
Fill all NULL values in numeric columns with 0.
>>> step = ml.FillNA(ml.numeric(), 0)
Fill all NULL values in specific columns with 1.
>>> step = ml.FillNA(["x", "y"], 1)
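The effect of filling specific columns with a constant can be sketched in plain Python as follows. The rows, the extra column `z`, and the variable names are hypothetical; the point is that only the selected columns are touched, and that one fill value must make sense for every selected column.

```python
# Hypothetical rows; None stands in for NULL.
rows = [
    {"x": None, "y": 2.0, "z": None},
    {"x": 4, "y": None, "z": "a"},
]
columns = ["x", "y"]   # columns selected for filling
fill_value = 1         # must be usable in both the integer and float column

# Replace None only in the selected columns; leave other columns as-is.
filled = [
    {k: (fill_value if k in columns and v is None else v) for k, v in row.items()}
    for row in rows
]
```

Column `z` keeps its NULL because it was not selected. A fill value such as `"missing"` would not be castable to the numeric columns `x` and `y`, which is the situation the `fill_value` requirement above rules out.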