Glossary

Key terms and concepts for understanding ADBC

ADBC

ADBC (Arrow Database Connectivity) is a modern alternative to ODBC and JDBC for analytics and AI applications. It is a multi-language API specification and driver standard that delivers data in Apache Arrow columnar format, enabling zero-copy operations for faster, more efficient data access.

Unlike Flight SQL, ADBC is not a client-server protocol. It requires only a client-side driver that adapts the protocol or API of a database, query engine, or data platform to the ADBC API.

ADBC API

The ADBC API is a set of standardized programming interfaces in multiple languages through which applications interact with ADBC drivers and driver managers.

ADBC client

An ADBC client is an application that uses the ADBC API to connect to data systems through one or more ADBC drivers. An ADBC client typically uses a driver manager to locate and dynamically load drivers at runtime. The drivers implement the database-specific protocols and APIs for the systems the client connects to.

ADBC connection profile

An ADBC connection profile is a TOML (.toml) configuration file that describes how to connect to a data system using an ADBC driver. It specifies which driver to use and what options to pass to it. Connection profiles centralize connection information outside of application code, making it easy to share settings across applications and manage different configurations for different environments. Connection profiles are loaded by the driver manager.

ADBC driver

An ADBC driver is a library that enables applications to interact with a data system through the ADBC API.

ADBC drivers can be implemented in different languages including C++, C#, Go, and Rust. A driver can be imported directly into an application that is implemented in the same language, but more often it is compiled into a shared library (a .so file on Linux, a .dylib file on macOS, or a .dll file on Windows) and dynamically loaded by a driver manager into an application written in any supported language.

Many ADBC drivers are available to install from driver registries. You can use dbc to search for and install them.

ADBC driver manager

An ADBC driver manager is a library that acts as an intermediary between a client application and ADBC drivers. Instead of linking directly to one or more drivers, the application links to the driver manager, which locates and dynamically loads drivers. ADBC driver managers are available for many languages including C++, C#, Go, Java, Node.js, R, Rust, and Python.
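
The locate-and-load pattern can be sketched in pure Python as an analogy. This is not the ADBC driver manager API: the `DRIVER_LOCATIONS` table and the `load_driver` function are hypothetical names invented for this example, and the standard-library `sqlite3` module stands in for a driver. A real driver manager resolves shared libraries or driver manifests rather than Python modules.

```python
import importlib

# Hypothetical lookup table from a short driver name to something loadable.
# A real driver manager would resolve a shared library or a driver manifest.
DRIVER_LOCATIONS = {"sqlite": "sqlite3"}


def load_driver(name):
    """Resolve a driver name and load it at runtime, not at import time."""
    try:
        module_name = DRIVER_LOCATIONS[name]
    except KeyError:
        raise LookupError(f"no driver registered under {name!r}") from None
    return importlib.import_module(module_name)


# The application names the driver it wants; it never imports it directly.
driver = load_driver("sqlite")
conn = driver.connect(":memory:")
row = conn.execute("SELECT 1 + 1").fetchone()
conn.close()

print(row)
```

The application code depends only on the loader, so supporting another data system means adding an entry to the lookup table rather than relinking the application.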

ADBC driver manifest

An ADBC driver manifest is a TOML (.toml) configuration file that describes an ADBC driver, its capabilities, and how to load it. Driver manifests simplify the installation, management, location, and loading of drivers.

ADBC driver registry

An ADBC driver registry is a repository of ADBC drivers available for discovery and installation. Drivers in a registry are pre-built, code-signed, and notarized. Columnar maintains the default public driver registry from which dbc installs drivers.

Apache Arrow

Apache Arrow is a universal columnar data format. The Arrow format can hold data in memory, send data over a network, or store data on disk. It supports zero-copy operations.

Apache Arrow is also the name of the open source project that created and maintains the Arrow columnar format, along with a set of tools built around it that enable fast data interchange and in-memory analytics across multiple languages. ADBC and Flight SQL are subprojects of the Arrow project.

Columnar format

A columnar (column-oriented) data format holds the values for each column in contiguous blocks of memory. This is in contrast to a row-oriented data format, which holds the values for each row in contiguous blocks of memory. Apache Arrow is a columnar format.

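Columnar formats are widely used in analytic data systems because analytic queries typically read a few columns across many rows. Storing each column contiguously lets the system scan only the columns a query needs and process their values in tight, cache-friendly loops.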
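
The difference between the two layouts can be shown with a small sketch using only the Python standard library. It does not use Arrow itself; the sample records and variable names are made up for illustration, and `array` buffers stand in for contiguous column storage.

```python
from array import array

# Three sample records with two fields each: (id, price).
rows = [(1, 9.5), (2, 12.0), (3, 7.25)]

# Row-oriented: the values of one record sit next to each other.
row_layout = []
for record_id, price in rows:
    row_layout.extend([record_id, price])

# Column-oriented: the values of one column sit next to each other,
# each column in its own contiguous typed buffer.
ids = array("q", (record[0] for record in rows))      # 64-bit integers
prices = array("d", (record[1] for record in rows))   # 64-bit floats

# An aggregate such as SUM(price) reads a single buffer and skips the ids.
total = sum(prices)

print(row_layout)  # [1, 9.5, 2, 12.0, 3, 7.25]
print(total)       # 28.75
```

In the row layout, summing prices means stepping over every id along the way. In the column layout, the same sum reads one buffer from start to end.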

dbc

dbc is the command-line tool for installing and managing ADBC drivers. It downloads drivers from driver registries and installs driver manifests to help driver managers find and load them. It works across macOS, Linux, and Windows.

Flight SQL

Arrow Flight SQL is a high-performance client-server protocol for databases, query engines, and data platforms. It exchanges data between the client and server in Apache Arrow columnar format.

There is an ADBC driver for Flight SQL that makes any Flight SQL-compatible system accessible through the ADBC API.

JDBC

JDBC (Java Database Connectivity) is a data access API specification and driver standard for Java that delivers data in a row-oriented format.

ODBC

ODBC (Open Database Connectivity) is a multi-language data access API specification and driver standard that delivers data in a row-oriented format.

Zero-copy

A zero-copy operation transfers data from one medium to another without creating any intermediate copies. When a data format supports zero-copy operations, its structure in memory is the same as its structure on disk or on the network. For example, data can be read off the network directly into a usable structure in memory without any intermediate copies or conversions. Apache Arrow supports zero-copy operations.
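
The distinction between copying data and sharing it can be sketched with Python's built-in `memoryview`. This is a general illustration of zero-copy access to a byte buffer, not Arrow code, and the buffer contents are made up for the example.

```python
# A buffer standing in for bytes received from the network or read from disk.
buffer = bytearray(b"header:payload-bytes")

# Slicing the bytearray and converting to bytes copies the selected range.
copied = bytes(buffer[7:])

# Slicing a memoryview creates a new view over the same memory, with no copy.
view = memoryview(buffer)[7:]

# A change to the underlying buffer is visible through the view
# but not through the independent copy.
buffer[7] = ord("P")

print(copied)          # b'payload-bytes'
print(view.tobytes())  # b'Payload-bytes'
```

The view reflects the change because it refers to the original memory, which is the property that lets a consumer use received data in place.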