DuckDB
Introduced in v3.0
duckdb >= 0.5.0 requires duckdb-engine >= 0.6.2
If you encounter problems when using duckdb >= 0.5.0, you may need to upgrade duckdb-engine to at least version 0.6.2.
See this issue for more details.
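The version constraint above can be checked against the local environment before debugging further. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `engine_upgrade_needed`, `installed_version`) are illustrative and not part of ibis, and the simple numeric parsing is an assumption that ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.6.2' into (0, 6, 2); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def engine_upgrade_needed(duckdb_version, engine_version):
    """Return True when duckdb >= 0.5.0 is paired with duckdb-engine < 0.6.2."""
    return (
        parse_version(duckdb_version) >= (0, 5, 0)
        and parse_version(engine_version) < (0, 6, 2)
    )


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


duckdb_version = installed_version("duckdb")
engine_version = installed_version("duckdb-engine")
if duckdb_version and engine_version:
    if engine_upgrade_needed(duckdb_version, engine_version):
        print("Upgrade duckdb-engine to at least 0.6.2")
    else:
        print("duckdb and duckdb-engine versions look compatible")
```

If either package is missing, the check prints nothing, which usually means the DuckDB backend dependencies have not been installed yet.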
Install
Install ibis and dependencies for the DuckDB backend:
pip install 'ibis-framework[duckdb]'
conda install -c conda-forge ibis-duckdb
mamba install -c conda-forge ibis-duckdb
Connect
API
Create a client by passing in a path to a DuckDB database to ibis.duckdb.connect.
See ibis.backends.duckdb.Backend.do_connect for connection parameter information.
ibis.duckdb.connect is a thin wrapper around ibis.backends.duckdb.Backend.do_connect.
Connection Parameters
do_connect(database=':memory:', path=None, read_only=False, temp_directory=None, **config)
Create an Ibis client connected to a DuckDB database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
database | str \| Path | Path to a duckdb database. | ':memory:' |
path | str \| Path | Deprecated, use | None |
read_only | bool | Whether the database is read-only. | False |
temp_directory | Path \| str \| None | Directory to use for spilling to disk. Only set by default for in-memory connections. | None |
config | Any | DuckDB configuration parameters. See the DuckDB configuration documentation for possible configuration values. | {} |
Examples:
>>> import ibis
>>> ibis.duckdb.connect("database.ddb", threads=4, memory_limit="1GB")
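The example above passes DuckDB configuration values as keyword arguments. As a complementary sketch, the wrapper below shows how `read_only` and `temp_directory` from the parameter table might be combined with the same configuration keywords. The function name `connect_read_only` is hypothetical and not part of ibis; it only forwards arguments to `ibis.duckdb.connect`.

```python
def connect_read_only(database, temp_directory=None, **config):
    """Open an existing DuckDB file without allowing writes."""
    # Imported inside the function so this sketch can be loaded
    # even where ibis is not installed.
    import ibis

    return ibis.duckdb.connect(
        database,
        read_only=True,
        temp_directory=temp_directory,
        **config,  # DuckDB configuration values, e.g. threads or memory_limit
    )


# Usage (assumes a file named database.ddb already exists):
# con = connect_read_only("database.ddb", threads=4, memory_limit="1GB")
```

A read-only connection is a reasonable choice when several processes need to query the same database file and none of them should modify it.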
Backend API
Backend
Bases: BaseAlchemyBackend
Functions
read_csv(source_list, table_name=None, **kwargs)
Register a CSV file as a table in the current database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_list | str \| list[str] \| tuple[str] | The data source(s). May be a path to a file or directory of CSV files, or an iterable of CSV files. | required |
table_name | str \| None | An optional name to use for the created table. This defaults to a sequentially generated name. | None |
**kwargs | Any | Additional keyword arguments passed to DuckDB loading function. See https://duckdb.org/docs/data/csv for more information. | {} |
Returns:
Type | Description |
---|---|
ir.Table | The just-registered table |
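A short sketch of how read_csv might be used with an explicit table name. The helpers `write_sample_csv` and `register_csv`, the file name, and the column names are all made up for illustration; only `read_csv(source_list, table_name=...)` comes from the signature documented above.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(directory):
    """Write a tiny CSV file with a header row and return its path."""
    path = Path(directory) / "measurements.csv"
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["station", "reading"])
        writer.writerows([["a", 1.5], ["b", 2.25]])
    return path


def register_csv(con, path, table_name="measurements"):
    """Register one CSV file under an explicit table name."""
    # `con` is expected to be the object returned by ibis.duckdb.connect().
    return con.read_csv(str(path), table_name=table_name)


# Usage, assuming ibis with the DuckDB backend is installed:
# import ibis
# con = ibis.duckdb.connect()
# with tempfile.TemporaryDirectory() as tmp:
#     table = register_csv(con, write_sample_csv(tmp))
#     print(table.execute())
```

Passing table_name avoids depending on the sequentially generated default, which makes later references to the table easier to read.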
read_in_memory(dataframe, table_name=None)
Register a Pandas DataFrame or pyarrow Table as a table in the current database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataframe | pd.DataFrame \| pa.Table | The data source. | required |
table_name | str \| None | An optional name to use for the created table. This defaults to a sequentially generated name. | None |
Returns:
Type | Description |
---|---|
ir.Table | The just-registered table |
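As an illustration of read_in_memory, the sketch below builds a pandas DataFrame from plain Python data and registers it. The helper name `register_frame`, the default table name, and the sample columns are hypothetical; pandas is assumed to be installed alongside ibis.

```python
def register_frame(con, columns, table_name="scores"):
    """Build a pandas DataFrame from a dict of columns and register it."""
    # Imported lazily so the sketch loads even where pandas is missing.
    import pandas as pd

    frame = pd.DataFrame(columns)
    # `con` is expected to be the object returned by ibis.duckdb.connect().
    return con.read_in_memory(frame, table_name=table_name)


# Usage, assuming ibis with the DuckDB backend is installed:
# import ibis
# con = ibis.duckdb.connect()
# table = register_frame(con, {"name": ["ada", "grace"], "score": [91, 88]})
# print(table.execute())
```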
read_parquet(source_list, table_name=None, **kwargs)
Register a parquet file as a table in the current database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_list | str \| Iterable[str] | The data source(s). May be a path to a file, an iterable of files, or directory of parquet files. | required |
table_name | str \| None | An optional name to use for the created table. This defaults to a sequentially generated name. | None |
**kwargs | Any | Additional keyword arguments passed to DuckDB loading function. See https://duckdb.org/docs/data/parquet for more information. | {} |
Returns:
Type | Description |
---|---|
ir.Table | The just-registered table |
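Since source_list accepts an iterable of files, one possible pattern is to collect the files explicitly and pass them as a list. This is a sketch only: `parquet_files` and `register_parquet_dir` are hypothetical helpers, and the decision to raise when no files are found is a design choice of this example rather than ibis behavior.

```python
from pathlib import Path


def parquet_files(directory):
    """Return the parquet files directly inside a directory, sorted by path."""
    return sorted(str(p) for p in Path(directory).glob("*.parquet"))


def register_parquet_dir(con, directory, table_name):
    """Register every parquet file in a directory as a single table."""
    files = parquet_files(directory)
    if not files:
        # Fail early with a clear message instead of passing an empty list on.
        raise FileNotFoundError(f"no parquet files in {directory}")
    # `con` is expected to be the object returned by ibis.duckdb.connect().
    return con.read_parquet(files, table_name=table_name)


# Usage, assuming ibis is installed and ./events holds parquet files:
# import ibis
# con = ibis.duckdb.connect()
# events = register_parquet_dir(con, "events", table_name="events")
```

Listing the files first makes it easy to log or filter them before they are handed to DuckDB.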
read_postgres(uri, table_name=None)
Register a table from a postgres instance into a DuckDB table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
uri | | The postgres URI in form 'postgres://user:password@host:port' | required |
table_name | | The table to read | None |
Returns:
Type | Description |
---|---|
ir.Table | The just-registered table. |
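The URI form above can be assembled from its parts. The sketch below is illustrative: `postgres_uri` and `register_postgres_table` are hypothetical helpers, the credentials and table name are placeholders, and percent-encoding the user and password is an assumption about how the URI is parsed downstream.

```python
from urllib.parse import quote


def postgres_uri(user, password, host, port=5432):
    """Build a URI in the form 'postgres://user:password@host:port'."""
    # Percent-encode credentials so characters such as '@' or spaces
    # do not break the URI structure (assumed to be decoded downstream).
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgres://{safe_user}:{safe_password}@{host}:{port}"


def register_postgres_table(con, uri, table_name):
    """Register one postgres table in the DuckDB connection."""
    # `con` is expected to be the object returned by ibis.duckdb.connect().
    return con.read_postgres(uri, table_name=table_name)


# Usage, assuming a reachable postgres instance with an "orders" table:
# import ibis
# con = ibis.duckdb.connect()
# uri = postgres_uri("ibis", "secret", "localhost")
# orders = register_postgres_table(con, uri, "orders")
```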
register(source, table_name=None, **kwargs)
Register a data source as a table in the current database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | str \| Path \| Any | The data source(s). May be a path to a file or directory of parquet/csv files, an iterable of parquet or CSV files, a pandas dataframe, a pyarrow table or dataset, or a postgres URI. | required |
table_name | str \| None | An optional name to use for the created table. This defaults to the filename if a path (with hyphens replaced with underscores), or sequentially generated name otherwise. | None |
**kwargs | Any | Additional keyword arguments passed to DuckDB loading functions for CSV or parquet. See https://duckdb.org/docs/data/csv and https://duckdb.org/docs/data/parquet for more information. | {} |
Returns:
Type | Description |
---|---|
ir.Table | The just-registered table |
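Because the default table name depends on the kind of source, it can be clearer to pass table_name explicitly. The sketch below shows one way to do that for file paths. `explicit_table_name` is a hypothetical helper with its own naming rule (lowercase, non-alphanumeric characters replaced by underscores); it is not the rule ibis applies internally.

```python
from pathlib import Path


def explicit_table_name(path):
    """Hypothetical helper: derive a predictable table name from a file path."""
    stem = Path(path).stem  # file name without its final extension
    return "".join(ch if ch.isalnum() else "_" for ch in stem).lower()


def register_file(con, path):
    """Register a CSV or parquet file under an explicitly chosen name."""
    # `con` is expected to be the object returned by ibis.duckdb.connect().
    return con.register(str(path), table_name=explicit_table_name(path))


# Usage, assuming ibis is installed and the file exists:
# import ibis
# con = ibis.duckdb.connect()
# sales = register_file(con, "data/daily-sales.parquet")
```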