
DataFusion


ibis.memtable Support

The DataFusion backend does not currently support in-memory tables.

Please file an issue if you'd like the DataFusion backend to support in-memory tables.
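Until in-memory tables are supported, one possible workaround is to stage the data in a temporary CSV file and register that file with read_csv (documented under File Support). The sketch below is not part of Ibis: rows_to_csv and register_rows are hypothetical helper names, and the sketch assumes the rows are plain sequences that the standard csv module can serialize.

```python
import csv
import tempfile
from pathlib import Path


def rows_to_csv(rows, columns, directory=None):
    """Write rows to a new temporary CSV file with a header row; return its path."""
    fd, name = tempfile.mkstemp(suffix=".csv", dir=directory)
    # newline="" lets the csv module control line endings itself.
    with open(fd, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(rows)
    return Path(name)


def register_rows(con, rows, columns, table_name=None):
    """Hypothetical helper: stage rows as CSV, then register them via con.read_csv."""
    path = rows_to_csv(rows, columns)
    return con.read_csv(path, table_name=table_name)
```

With a connection created by ibis.datafusion.connect(), a call such as register_rows(con, [(1, "a"), (2, "b")], ["id", "label"], table_name="staged") would register the staged file as a table. The temporary file is not deleted automatically, so clean it up once the table is no longer needed.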

Install

Install Ibis and the dependencies for the Apache DataFusion backend with one of the following commands:

pip install 'ibis-framework[datafusion]'
conda install -c conda-forge ibis-datafusion
mamba install -c conda-forge ibis-datafusion
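After installing, a quick way to confirm that the packages are importable is to look them up without importing them. This is a minimal sketch using only the standard library; missing_packages is a hypothetical helper name, and it assumes the import names are ibis and datafusion.

```python
import importlib.util


def missing_packages(names=("ibis", "datafusion")):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("ibis and datafusion are importable")
```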

Connect

ibis.datafusion.connect

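import ibis

# Connect without registering any tables
con = ibis.datafusion.connect()

# Or register tables from files at connection time
con = ibis.datafusion.connect(
    config={"table1": "path/to/file.parquet", "table2": "path/to/file.csv"}
)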

ibis.datafusion.connect is a thin wrapper around ibis.backends.datafusion.Backend.do_connect.

Connection Parameters

do_connect(config=None)

Create a DataFusion backend for use with Ibis.

Parameters:

Name: config
Type: Mapping[str, str | Path] | SessionContext | None
Description: Mapping of table names to files.
Default: None
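When many files live in one directory, the config mapping can be built programmatically instead of written by hand. The following is a sketch under stated assumptions: build_config and connect_directory are hypothetical helpers (not part of Ibis), table names are taken from file stems, and only .parquet and .csv files are picked up, matching the file types shown in the connect example above.

```python
from pathlib import Path


def build_config(directory, suffixes=(".parquet", ".csv")):
    """Map each matching file's stem to its path, e.g. {"trips": Path("data/trips.parquet")}.

    If two files share a stem (trips.csv and trips.parquet), the one that
    sorts later by path overwrites the earlier entry.
    """
    config = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in suffixes:
            config[path.stem] = path
    return config


def connect_directory(directory):
    """Hypothetical helper: connect with every supported file in directory registered."""
    import ibis  # imported lazily so build_config can be used without ibis installed

    return ibis.datafusion.connect(config=build_config(directory))
```

Building the mapping separately from the connection keeps the file discovery step easy to inspect before any table is registered.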

File Support

read_csv(path, table_name=None, **kwargs)

Register a CSV file as a table in the current database.

Parameters:

Name: path
Type: str | Path
Description: The data source. A string or Path to the CSV file.
Default: required

Name: table_name
Type: str | None
Description: An optional name to use for the created table. This defaults to a sequentially generated name.
Default: None

Name: **kwargs
Type: Any
Description: Additional keyword arguments passed to the DataFusion loading function.
Default: {}

read_parquet(path, table_name=None, **kwargs)

Register a Parquet file as a table in the current database.

Parameters:

Name: path
Type: str | Path
Description: The data source.
Default: required

Name: table_name
Type: str | None
Description: An optional name to use for the created table. This defaults to a sequentially generated name.
Default: None

Name: **kwargs
Type: Any
Description: Additional keyword arguments passed to the DataFusion loading function.
Default: {}

read_delta(source_table, table_name=None, **kwargs)

Register a Delta Lake table as a table in the current database.

Parameters:

Name: source_table
Type: str | Path
Description: The data source. Must be a directory containing a Delta Lake table.
Default: required

Name: table_name
Type: str | None
Description: An optional name to use for the created table. This defaults to a sequentially generated name.
Default: None

Name: **kwargs
Type: Any
Description: Additional keyword arguments passed to deltalake.DeltaTable.
Default: {}
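The three readers share the same shape (a source, an optional table_name, and extra keyword arguments), so a small dispatcher can choose between them. This is only a sketch: register_path is a hypothetical helper, and treating every existing directory as a Delta Lake table is an assumption made for illustration.

```python
from pathlib import Path


def register_path(con, path, table_name=None, **kwargs):
    """Pick read_delta, read_csv, or read_parquet based on what path looks like."""
    path = Path(path)
    if path.is_dir():
        # Assumption: an existing directory holds a Delta Lake table.
        return con.read_delta(path, table_name=table_name, **kwargs)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return con.read_csv(path, table_name=table_name, **kwargs)
    if suffix == ".parquet":
        return con.read_parquet(path, table_name=table_name, **kwargs)
    raise ValueError(f"unsupported file type: {path}")
```

Given a connection con, register_path(con, "path/to/file.csv", table_name="table2") would forward to con.read_csv. Any extra keyword arguments are passed through unchanged, so they must be valid for whichever reader is selected.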

Last update: August 1, 2023