Using ADBC in Java
The version 21 release of the ADBC libraries introduces something new for Java developers: Java Native Interface (JNI) bindings to the C++ ADBC driver manager. Now Java applications can load and use dbc-installed ADBC drivers.
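To give a feel for the shape of the API, here's a minimal sketch of opening a database and connection. The AdbcDriver, AdbcDatabase, and AdbcConnection interfaces are the core ADBC Java API (org.apache.arrow.adbc.core); how you obtain the driver instance from the JNI bindings, and the option key shown, are placeholders rather than the exact names, so check the quickstart examples for the real entry points.

import java.util.Map;

import org.apache.arrow.adbc.core.AdbcConnection;
import org.apache.arrow.adbc.core.AdbcDatabase;
import org.apache.arrow.adbc.core.AdbcDriver;

public class ConnectSketch {
    // `driver` stands for whatever AdbcDriver the JNI bindings hand you for a
    // dbc-installed driver; obtaining it is the part this sketch leaves out.
    static void connect(AdbcDriver driver) throws Exception {
        // Illustrative option key only; each driver defines its own options.
        Map<String, Object> options = Map.of("path", ":memory:");
        try (AdbcDatabase database = driver.open(options);
             AdbcConnection connection = database.connect()) {
            // The connection is backed by the C++ driver loaded through JNI.
        }
    }
}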
To make it easier to get started, we've added new Java examples to the ADBC Quickstarts repository, joining the existing examples for C++, Go, Python, R, and Rust. These ready-to-run examples cover all 10 drivers currently installable with dbc—BigQuery, DuckDB, Flight SQL, SQL Server, MySQL, PostgreSQL, Redshift, Snowflake, SQLite, and Trino—and we've tested each of them through the new JNI bindings. Although ADBC drivers can be written entirely in Java and avoid JNI, most ADBC drivers are implemented as shared libraries, and the drivers that dbc installs follow this shared-library model.
The current JNI layer supports querying and result fetching, but it does not yet bind every ADBC function. Work to fill in the remaining pieces is in progress. Even with the partial binding set, the existing functionality is enough for many read-oriented workflows.
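For a sense of what that supported surface looks like, here's a minimal query-and-fetch sketch against the core ADBC Java interfaces. It assumes you already have an AdbcConnection, such as the one opened in the sketch above, and simply counts the rows streamed back as Arrow batches.

import org.apache.arrow.adbc.core.AdbcConnection;
import org.apache.arrow.adbc.core.AdbcStatement;
import org.apache.arrow.vector.ipc.ArrowReader;

public class QuerySketch {
    static long countRows(AdbcConnection connection, String sql) throws Exception {
        try (AdbcStatement statement = connection.createStatement()) {
            statement.setSqlQuery(sql);
            AdbcStatement.QueryResult result = statement.executeQuery();
            try (ArrowReader reader = result.getReader()) {
                long rows = 0;
                // Each loadNextBatch() call fills the reader's VectorSchemaRoot
                // with the next Arrow record batch produced by the driver.
                while (reader.loadNextBatch()) {
                    rows += reader.getVectorSchemaRoot().getRowCount();
                }
                return rows;
            }
        }
    }
}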
How ADBC performance compares to JDBC
To get a basic sense of how ADBC performs in Java, we ran a benchmark comparing it with JDBC. For simplicity, we used DuckDB on both sides, since it has a mature JDBC driver and was one of the first databases to support ADBC. An earlier blog post about DuckDB's support for ADBC included a C++ benchmark showing ADBC running 38× faster than ODBC; building on that, we looked at the Java side and compared JDBC with ADBC directly.
We ran both drivers on the same task: a SELECT * query on the TPC-H SF-1 lineitem table, with the results materialized in memory in a row-oriented structure (a CachedRowSet) for JDBC and a column-oriented Arrow structure (a List<ArrowRecordBatch>) for ADBC. You can review the benchmarking code and reproduce the results yourself by downloading the examples from this GitHub repository.
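The benchmark code itself lives in that repository; as a rough illustration of the two materialization strategies rather than the exact code we ran, the sketch below fills a CachedRowSet on the JDBC side and collects ArrowRecordBatch objects on the ADBC side.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

import javax.sql.rowset.CachedRowSet;
import javax.sql.rowset.RowSetProvider;

import org.apache.arrow.adbc.core.AdbcStatement;
import org.apache.arrow.vector.VectorUnloader;
import org.apache.arrow.vector.ipc.ArrowReader;
import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;

public class Materialize {
    // JDBC: copy the entire row-oriented ResultSet into an in-memory CachedRowSet.
    static CachedRowSet fetchJdbc(Connection connection, String sql) throws Exception {
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(sql)) {
            CachedRowSet rowSet = RowSetProvider.newFactory().createCachedRowSet();
            rowSet.populate(resultSet);
            return rowSet;
        }
    }

    // ADBC: keep the data columnar as a list of Arrow record batches.
    // Callers must close each batch later to release the Arrow buffers.
    static List<ArrowRecordBatch> fetchAdbc(AdbcStatement statement, String sql) throws Exception {
        statement.setSqlQuery(sql);
        List<ArrowRecordBatch> batches = new ArrayList<>();
        try (ArrowReader reader = statement.executeQuery().getReader()) {
            VectorUnloader unloader = new VectorUnloader(reader.getVectorSchemaRoot());
            while (reader.loadNextBatch()) {
                // Capture the current batch so it outlives the next loadNextBatch().
                batches.add(unloader.getRecordBatch());
            }
        }
        return batches;
    }
}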
Results will vary depending on your system, data, and workload, but on an M3 MacBook Air with 16 GB of memory, our test produced the following:
$ ./run-jdbc.sh
Result materialization: CachedRowSet
Query execution and result transfer time: 16315.58 ms
Number of rows: 6001215
JVM heap memory used: 5196 MB
$ ./run-adbc.sh
Result materialization: List<ArrowRecordBatch>
Query execution and result transfer time: 1261.18 ms
Number of rows: 6001215
JVM heap memory used: 139 MB
Arrow allocator memory used: 1000 MB
Compared to JDBC, the results show that ADBC:
- Ran about 13× faster
- Used about 5× less memory (139 MB of JVM heap plus 1000 MB in the Arrow allocator, versus 5196 MB of JVM heap)
If your workflow requires converting JDBC's row-oriented data into a column-oriented structure after fetching it, the speed difference can be even larger. It's also worth noting that DuckDB's JDBC driver includes non-standard methods for receiving Arrow data directly; the JDBC benchmark code can exercise this path behind an optional flag, and even then the ADBC implementation is more efficient.
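For reference, DuckDB's Arrow export from JDBC looks roughly like the sketch below. The arrowExportStream method and its cast-to-ArrowReader pattern follow DuckDB's documented JDBC client API, but verify the exact signature against the driver version you're using; the query here is just a placeholder.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.ipc.ArrowReader;
import org.duckdb.DuckDBResultSet;

public class JdbcArrowExport {
    public static void main(String[] args) throws Exception {
        try (BufferAllocator allocator = new RootAllocator();
             Connection connection = DriverManager.getConnection("jdbc:duckdb:");
             Statement statement = connection.createStatement();
             DuckDBResultSet resultSet =
                 (DuckDBResultSet) statement.executeQuery("SELECT * FROM range(100000)");
             // arrowExportStream returns Object so the driver avoids a hard
             // Arrow dependency; the caller casts it to an ArrowReader.
             ArrowReader reader = (ArrowReader) resultSet.arrowExportStream(allocator, 65536)) {
            while (reader.loadNextBatch()) {
                // Consume reader.getVectorSchemaRoot() batch by batch.
            }
        }
    }
}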
Try it out
If you're a Java developer working with analytic query systems, we'd appreciate your feedback. The examples are available in the java directory of the ADBC Quickstarts repository, and we're continuing to improve the JNI bindings. Let us know what you think and what you'd like to see next.
If you have questions or want to share feedback, open an issue in the repo on GitHub or chat with us in the Columnar Community on Slack.