Generic Expression APIs¶

These expressions are available on scalars and columns of any element type.

`Value` ¶

Bases: Expr

Base class for a data-generating expression having a known type.

Functions¶

`asc()` ¶

Sort an expression ascending.

`between(lower, upper)` ¶

Check if this expression is between lower and upper, inclusive.

Parameters:

Name	Type	Description	Default
`lower`	`Value`	Lower bound	required
`upper`	`Value`	Upper bound	required

Returns:

Type	Description
`BooleanValue`	Expression indicating membership in the provided range
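
To make the inclusive bounds concrete, here is a minimal plain-Python sketch of the check for individual values. It only models the behaviour described above and is not ibis code; `between_inclusive` is a hypothetical helper name.

```python
def between_inclusive(value, lower, upper):
    """Model of `between`: True when lower <= value <= upper."""
    return lower <= value <= upper


# Both endpoints (1 and 3) count as inside the range.
checks = [between_inclusive(v, 1, 3) for v in (0, 1, 2, 3, 4)]
print(checks)
```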

`case()` ¶

Create a SimpleCaseBuilder to chain multiple if-else statements.

Add new search expressions with the .when() method. These must be comparable with this column expression. Conclude by calling .end().

Returns:

Type	Description
`SimpleCaseBuilder`	A case builder

Examples:

>>> import ibis
>>> t = ibis.table([('string_col', 'string')], name='t')
>>> expr = t.string_col
>>> case_expr = (expr.case()
...              .when('a', 'an a')
...              .when('b', 'a b')
...              .else_('null or (not a and not b)')
...              .end())
>>> case_expr
r0 := UnboundTable[t]
  string_col string
SimpleCase(base=r0.string_col, cases=[List(values=['a', 'b'])], results=[List(values=['an a', 'a b'])], default='null or (not a and not b)')
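
As a rough guide to what the built expression computes per value, the following plain-Python sketch mirrors the example above. It is a model of a simple case, not ibis code: `simple_case_model` is a hypothetical helper, and the assumption that a NULL input (modelled as `None`) falls through to the default follows the wording of the example's default string.

```python
def simple_case_model(value, whens, default=None):
    """Return the result paired with the first candidate equal to `value`."""
    for candidate, result in whens:
        if value == candidate:
            return result
    # No candidate matched (this includes None in this model).
    return default


whens = [("a", "an a"), ("b", "a b")]
default = "null or (not a and not b)"
labels = [simple_case_model(v, whens, default) for v in ("a", "b", "c", None)]
print(labels)
```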

`cases(case_result_pairs, default=None)` ¶

Create a case expression in one shot.

Parameters:

Name	Type	Description	Default
`case_result_pairs`	`Iterable[tuple[ir.BooleanValue, Value]]`	Conditional-result pairs	required
`default`	`Value \| None`	Value to return if none of the case conditions are true	`None`

Returns:

Type	Description
`Value`	Value expression
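
The difference from `case()` is that each pair carries its own boolean condition. The sketch below models that for a single row in plain Python; it is not ibis code, `cases_model` and `sign_label` are hypothetical names, and "the first true condition wins" is an assumption borrowed from SQL `CASE` semantics.

```python
def cases_model(case_result_pairs, default=None):
    """Return the result of the first pair whose condition is true."""
    for condition, result in case_result_pairs:
        if condition:
            return result
    return default


def sign_label(x):
    # Conditions are already evaluated for this one row.
    pairs = [(x < 0, "negative"), (x == 0, "zero")]
    return cases_model(pairs, default="positive")


signs = [sign_label(x) for x in (-2, 0, 5)]
print(signs)
```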

`cast(target_type)` ¶

Cast expression to indicated data type.

Parameters:

Name	Type	Description	Default
`target_type`	`dt.DataType`	Type to cast to	required

Returns:

Type	Description
`Value`	Casted expression

`coalesce(*args)` ¶

Return the first non-null value from args.

Parameters:

Name	Type	Description	Default
`args`	`Value`	Arguments from which to choose the first non-null value	`()`

Returns:

Type	Description
`Value`	Coalesced expression

Examples:

>>> import ibis
>>> ibis.coalesce(None, 4, 5)
Coalesce((None, 4, 5))
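
The expression above is unevaluated. A plain-Python sketch of the value it stands for, with `None` standing in for NULL, might look like this (`coalesce_model` is a hypothetical helper, not part of ibis):

```python
def coalesce_model(*args):
    """Return the first argument that is not None, or None if all are."""
    for arg in args:
        if arg is not None:
            return arg
    return None


first = coalesce_model(None, 4, 5)
zero_kept = coalesce_model(0, 1)        # 0 is a value, not a null
all_null = coalesce_model(None, None)
print(first, zero_kept, all_null)
```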

`collect(where=None)` ¶

Return an array of the elements of this expression.

`desc()` ¶

Sort an expression descending.

`fillna(fill_value)` ¶

Replace any null values with the indicated fill value.

Parameters:

Name	Type	Description	Default
`fill_value`	`Scalar`	Value with which to replace `NA` values in `self`	required

Examples:

>>> import ibis
>>> table = ibis.table(dict(col='int64', other_col='int64'))
>>> table.col.fillna(5)
r0 := UnboundTable: unbound_table_0
  col       int64
  other_col int64
IfNull(r0.col, ifnull_expr=5)
>>> table.col.fillna(table.other_col * 3)
r0 := UnboundTable: unbound_table_0
  col       int64
  other_col int64
IfNull(r0.col, ifnull_expr=r0.other_col * 3)

Returns:

Type	Description
`Value`	`self` filled with `fill_value` where it is `NA`
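
The two examples above fill with a constant and with another column expression. The following plain-Python sketch models both cases on lists, using `None` for `NA`; it is illustrative only and `fillna_model` is a hypothetical name.

```python
def fillna_model(column, fill_value):
    """Replace None entries with a scalar or with the aligned fill column."""
    if isinstance(fill_value, list):
        fills = fill_value
    else:
        fills = [fill_value] * len(column)
    return [f if v is None else v for v, f in zip(column, fills)]


col = [1, None, 3]
other_col = [10, 20, 30]

filled_scalar = fillna_model(col, 5)
filled_column = fillna_model(col, [x * 3 for x in other_col])
print(filled_scalar, filled_column)
```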

`greatest(*args)` ¶

Compute the largest value among the supplied arguments.

Parameters:

Name	Type	Description	Default
`args`	`ir.Value`	Arguments to choose from	`()`

Returns:

Type	Description
`Value`	Maximum of the passed arguments
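
It may help to contrast this with the `max` aggregate on a column: `greatest` compares its arguments with each other, so for columns the comparison is row by row. A small plain-Python sketch of that distinction follows (lists stand in for columns; this is not ibis code, and NULL handling is left out because it varies by backend):

```python
a = [1, 7, 3]
b = [4, 2, 9]

# greatest-style: one result per row, comparing the arguments.
row_greatest = [max(x, y) for x, y in zip(a, b)]

# max-style aggregate: one result for the whole column.
column_max = max(a)

print(row_greatest, column_max)
```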

`group_concat(sep=',', where=None)` ¶

Concatenate values using the indicated separator to produce a string.

Parameters:

Name	Type	Description	Default
`sep`	`str`	Separator used to join strings	`','`
`where`	`ir.BooleanValue \| None`	Filter expression	`None`

Returns:

Type	Description
`StringScalar`	Concatenated string expression
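
A minimal plain-Python sketch of the effect of `sep` and `where`, assuming `where` is an already evaluated boolean mask aligned with the values. `group_concat_model` is a hypothetical helper, and the order of the joined values is whatever order the input has, which a real backend may not guarantee.

```python
def group_concat_model(values, sep=",", where=None):
    """Join the values whose mask entry is true using `sep`."""
    if where is None:
        where = [True] * len(values)
    return sep.join(v for v, keep in zip(values, where) if keep)


joined_all = group_concat_model(["x", "y", "z"])
joined_some = group_concat_model(["x", "y", "z"], sep="|", where=[True, False, True])
print(joined_all, joined_some)
```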

`hash(how='fnv')` ¶

Compute an integer hash value.

Parameters:

Name	Type	Description	Default
`how`	`str`	Hash algorithm to use	`'fnv'`

Returns:

Type	Description
`IntegerValue`	The hash value of `self`

`identical_to(other)` ¶

Return whether this expression is identical to other.

Corresponds to IS NOT DISTINCT FROM in SQL.

Parameters:

Name	Type	Description	Default
`other`	`Value`	Expression to compare to	required

Returns:

Type	Description
`BooleanValue`	Whether this expression is not distinct from `other`
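
The practical difference from ordinary equality is how NULL is treated. The sketch below contrasts the two in plain Python, with `None` standing in for NULL; both helpers are hypothetical models of the SQL operators, not ibis functions.

```python
def sql_equals_model(a, b):
    """Model of SQL `=`: the result is NULL if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def identical_to_model(a, b):
    """Model of IS NOT DISTINCT FROM: NULLs compare equal to each other."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


pairs = [(1, 1), (1, 2), (None, 1), (None, None)]
equals = [sql_equals_model(a, b) for a, b in pairs]
identical = [identical_to_model(a, b) for a, b in pairs]
print(equals, identical)
```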

`isin(values)` ¶

Check whether this expression's values are in values.

Parameters:

Name	Type	Description	Default
`values`	`Value \| Sequence[Value]`	Values or expression to check for membership	required

Returns:

Type	Description
`BooleanValue`	Expression indicating membership

Examples:

Check whether a column's values are contained in a sequence

>>> import ibis
>>> table = ibis.table(dict(string_col='string'))
>>> table.string_col.isin(['foo', 'bar', 'baz'])
r0 := UnboundTable: unbound_table_1
  string_col string
Contains(value=r0.string_col, options=('foo', 'bar', 'baz'))

Check whether a column's values are contained in another table's column

>>> table2 = ibis.table(dict(other_string_col='string'))
>>> table.string_col.isin(table2.other_string_col)
r0 := UnboundTable: unbound_table_3
  other_string_col string
r1 := UnboundTable: unbound_table_1
  string_col string
Contains(value=r1.string_col, options=r0.other_string_col)

`isnull()` ¶

Return whether this expression is NULL.

`least(*args)` ¶

Compute the smallest value among the supplied arguments.

Parameters:

Name	Type	Description	Default
`args`	`ir.Value`	Arguments to choose from	`()`

Returns:

Type	Description
`Value`	Minimum of the passed arguments

`name(name)` ¶

Rename an expression to name.

Parameters:

Name	Type	Description	Default
`name`		The new name of the expression	required

Returns:

Type	Description
`Value`	`self` with name `name`

Examples:

>>> import ibis
>>> t = ibis.table(dict(a="int64"))
>>> t.a.name("b")
r0 := UnboundTable[unbound_table_...]
  a int64
b: r0.a

`notin(values)` ¶

Check whether this expression's values are not in values.

Parameters:

Name	Type	Description	Default
`values`	`Value \| Sequence[Value]`	Values or expression to check for lack of membership	required

Returns:

Type	Description
`BooleanValue`	Whether `self`'s values are not contained in `values`

`notnull()` ¶

Return whether this expression is not NULL.

`nullif(null_if_expr)` ¶

Set values to null if they equal the values of null_if_expr.

Commonly used to avoid divide-by-zero problems by replacing zero with NULL in the divisor.

Parameters:

Name	Type	Description	Default
`null_if_expr`	`Value`	Expression indicating what values should be NULL	required

Returns:

Type	Description
`Value`	Value expression
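
The divide-by-zero use case can be sketched in plain Python as follows. `None` stands in for NULL, and `nullif_model` and `safe_divide` are hypothetical helpers written for illustration; propagating NULL through the division is modelled explicitly here.

```python
def nullif_model(value, null_if_expr):
    """Return None when value equals null_if_expr, otherwise value."""
    return None if value == null_if_expr else value


def safe_divide(numerator, denominator):
    # Turn a zero divisor into NULL first, then let NULL propagate.
    divisor = nullif_model(denominator, 0)
    if divisor is None:
        return None
    return numerator / divisor


ok = safe_divide(10, 2)
guarded = safe_divide(10, 0)
print(ok, guarded)
```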

`over(window)` ¶

Construct a window expression.

Parameters:

Name	Type	Description	Default
`window`	`win.Window`	Window specification	required

Returns:

Type	Description
`Value`	A window function expression

`substitute(value, replacement=None, else_=None)` ¶

Replace one or more values in a value expression.

Parameters:

Name	Type	Description	Default
`value`	`Value`	Expression or mapping	required
`replacement`	`Value \| None`	Expression. If an expression is passed to value, this must be passed.	`None`
`else_`	`Value \| None`	Expression	`None`

Returns:

Type	Description
`Value`	Replaced values
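
The two calling styles (a mapping, or a single value plus `replacement`) can be modelled on a list in plain Python. This is a sketch, not ibis code: `substitute_model` is a hypothetical name, and leaving unmatched values unchanged when `else_` is `None` is an assumption about the intended behaviour.

```python
def substitute_model(column, value, replacement=None, else_=None):
    """Replace matched values; unmatched ones become else_ or stay as-is."""
    mapping = value if isinstance(value, dict) else {value: replacement}
    result = []
    for v in column:
        if v in mapping:
            result.append(mapping[v])
        elif else_ is not None:
            result.append(else_)
        else:
            result.append(v)  # assumption: keep the original value
    return result


column = ["a", "b", "c"]
by_mapping = substitute_model(column, {"a": "x", "b": "y"})
by_pair = substitute_model(column, "a", "x", else_="other")
print(by_mapping, by_pair)
```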

`to_projection()` ¶

Promote this value expression to a projection.

`typeof()` ¶

Return the data type of the expression.

The values of the returned strings are necessarily backend dependent.

Returns:

Type	Description
`StringValue`	A string indicating the type of the value

`Column` ¶

Bases: Value, JupyterMixin

Functions¶

`approx_median(where=None)` ¶

Return an approximation of the median of self.

The result may or may not be exact: whether it is an approximation depends on the backend. Do not depend on the results being exact.

Parameters:

Name	Type	Description	Default
`where`	`ir.BooleanValue \| None`	Filter in values when `where` is `True`	`None`

Returns:

Type	Description
`Scalar`	An approximation of the median of `self`

`approx_nunique(where=None)` ¶

Return the approximate number of distinct elements in self.

The result may or may not be exact: whether it is an approximation depends on the backend. Do not depend on the results being exact.

Parameters:

Name	Type	Description	Default
`where`	`ir.BooleanValue \| None`	Filter in values when `where` is `True`	`None`

Returns:

Type	Description
`Scalar`	An approximate count of the distinct elements of `self`

`arbitrary(where=None, how='first')` ¶

Select an arbitrary value in a column.

Parameters:

Name	Type	Description	Default
`where`	`ir.BooleanValue \| None`	A filter expression	`None`
`how`	`Literal['first', 'last', 'heavy']`	The method to use for selecting the element. `"first"`: select the first non-NULL element. `"last"`: select the last non-NULL element. `"heavy"`: select a frequently occurring value using the heavy hitters algorithm; only supported by the ClickHouse backend.	`'first'`

Returns:

Type	Description
`Scalar`	An expression
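
For the `"first"` and `"last"` methods, a plain-Python sketch on a list might look like the following. It is illustrative only: `arbitrary_model` is a hypothetical helper, `where` is assumed to be an evaluated boolean mask, row order is taken from the list, and `"heavy"` is deliberately not modelled.

```python
def arbitrary_model(column, where=None, how="first"):
    """Pick the first or last non-None element that passes the filter."""
    if where is None:
        where = [True] * len(column)
    candidates = [v for v, keep in zip(column, where) if keep and v is not None]
    if not candidates:
        return None
    if how == "first":
        return candidates[0]
    if how == "last":
        return candidates[-1]
    raise ValueError(f"how={how!r} is not modelled in this sketch")


column = [None, 2, 3, None]
first = arbitrary_model(column)
last = arbitrary_model(column, how="last")
filtered = arbitrary_model(column, where=[True, False, True, True])
print(first, last, filtered)
```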

`argmax(key, where=None)` ¶

Return the value of self that maximizes key.

`argmin(key, where=None)` ¶

Return the value of self that minimizes key.
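
In other words, `key` decides which row is selected and `self` supplies the value that is returned. A minimal plain-Python sketch with two aligned lists (hypothetical helper names, no `where` filter, not ibis code):

```python
def argmax_model(values, key):
    """Return the element of `values` at the position where `key` is largest."""
    best = max(range(len(key)), key=lambda i: key[i])
    return values[best]


def argmin_model(values, key):
    """Return the element of `values` at the position where `key` is smallest."""
    best = min(range(len(key)), key=lambda i: key[i])
    return values[best]


names = ["ann", "bob", "cy"]
scores = [3, 9, 1]
top_name = argmax_model(names, scores)
bottom_name = argmin_model(names, scores)
print(top_name, bottom_name)
```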

`count(where=None)` ¶

Compute the number of rows in an expression.

Parameters:

Name	Type	Description	Default
`where`	`ir.BooleanValue \| None`	Filter expression	`None`

Returns:

Type	Description
`IntegerScalar`	Number of elements in an expression

`max(where=None)` ¶

Return the maximum of a column.

`min(where=None)` ¶

Return the minimum of a column.

`mode(where=None)` ¶

Return the mode of a column.

`nth(n)` ¶

Return the nth value over a window.

Parameters:

Name	Type	Description	Default
`n`	`int \| ir.IntegerValue`	Desired rank value	required

Returns:

Type	Description
`Column`	The nth value over a window

`summary(exact_nunique=False, prefix='', suffix='')` ¶

Compute a set of summary metrics.

Parameters:

Name	Type	Description	Default
`exact_nunique`	`bool`	Compute the exact number of distinct values. Typically slower if `True`.	`False`
`prefix`	`str`	String prefix for metric names	`''`
`suffix`	`str`	String suffix for metric names	`''`

Returns:

Type	Description
`list[NumericScalar]`	Metrics list

`topk(k, by=None)` ¶

Return a "top k" expression.

Parameters:

Name	Type	Description	Default
`k`	`int`	Return this number of rows	required
`by`	`ir.Value \| None`	An expression. Defaults to `count`.	`None`

Returns:

Type	Description
`TableExpr`	A top-k expression
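
With the default `by` (a count), the result corresponds to the `k` most frequent values. A rough plain-Python equivalent using the standard library is sketched below; `topk_model` is a hypothetical helper and how ties are ordered is not modelled.

```python
from collections import Counter


def topk_model(column, k):
    """Return the k most frequent values with their counts."""
    return Counter(column).most_common(k)


column = ["x", "y", "x", "z", "x", "y"]
top2 = topk_model(column, 2)
print(top2)
```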

`value_counts(metric_name='count')` ¶

Compute a frequency table.

Parameters:

Name	Type	Description	Default
`metric_name`	`str`	Output column name of the `count()` metric	`'count'`

Returns:

Type	Description
`Table`	Frequency table expression
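
The shape of the resulting table can be sketched in plain Python as one row per distinct value plus its count, with `metric_name` naming the count column. `value_counts_model` and the `"value"` column name are hypothetical, and the row order is not something to rely on.

```python
from collections import Counter


def value_counts_model(column, metric_name="count"):
    """Return one row (dict) per distinct value with its frequency."""
    counts = Counter(column)
    return [{"value": v, metric_name: n} for v, n in counts.items()]


rows = value_counts_model(["x", "y", "x", "z", "x", "y"], metric_name="n")
rows_sorted = sorted(rows, key=lambda row: row["value"])
print(rows_sorted)
```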

`Scalar` ¶

Bases: Value

Last update: August 5, 2022
