Enable multi-threaded execution for TableFunction #6

Draft
otegami wants to merge 2 commits into main from refactor/extract-executor-module

Conversation

@otegami
Owner

@otegami otegami commented Apr 3, 2026

Summary

Enable multi-threaded execution for DuckDB::TableFunction on DuckDB >= 1.5.0 by introducing per-worker proxy threads.

DuckDB invokes table function callbacks from its own worker threads, which are not Ruby threads. Since rb_thread_call_with_gvl crashes when called from non-Ruby threads, we previously forced single-threaded execution. This PR gives each DuckDB worker thread a dedicated Ruby proxy thread that acquires the GVL on its behalf, making table function callbacks safe under multi-threaded DuckDB execution.

otegami added 2 commits May 30, 2026 11:47
Add a per-worker proxy: one dedicated Ruby thread per DuckDB worker
thread, using the same mutex/condvar hand-off protocol as the global
executor but private to a single worker. This lets callbacks from
different workers run concurrently instead of serializing through the
one global executor queue.
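
The hand-off described above can be modeled in plain Ruby. This is only a sketch under assumptions: the PR implements the proxy in C, and `WorkerProxy`, `call`, and `destroy` are hypothetical names chosen for illustration. Each proxy owns its mutex, condition variable, and thread, so two workers never contend on the same lock.

```ruby
# Hypothetical Ruby model of a per-worker proxy; not the PR's C code.
class WorkerProxy
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @request = nil     # callable handed over by the worker
    @done = false      # true once the proxy has finished the request
    @result = nil
    @error = nil
    @stopping = false
    @thread = Thread.new { run }
  end

  # Worker side: publish the request, then block until the proxy is done.
  def call(&block)
    @mutex.synchronize do
      @request = block
      @done = false
      @cond.broadcast
      @cond.wait(@mutex) until @done
      raise @error if @error
      @result
    end
  end

  # Ask the proxy thread to exit and wait for it.
  def destroy
    @mutex.synchronize do
      @stopping = true
      @cond.broadcast
    end
    @thread.join
  end

  private

  # Proxy side: wait for a request, run it outside the lock, publish result.
  def run
    loop do
      job = @mutex.synchronize do
        @cond.wait(@mutex) until @request || @stopping
        return if @stopping
        request = @request
        @request = nil
        request
      end
      result, error = begin
        [job.call, nil]
      rescue StandardError => e
        [nil, e]
      end
      @mutex.synchronize do
        @result = result
        @error = error
        @done = true
        @cond.broadcast
      end
    end
  end
end

# Two workers, each with a private proxy, so neither waits on the other.
proxies = Array.new(2) { WorkerProxy.new }
workers = proxies.map do |proxy|
  Thread.new { proxy.call { Thread.current } }
end
proxy_threads = workers.map(&:value)
proxies.each(&:destroy)
```

In this model the block always runs on the proxy's thread, never on the caller's, which is the property the C implementation needs so that the callback executes on a real Ruby thread.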

rbduckdb_function_executor_dispatch_via_proxy() routes the non-Ruby
thread path (Case 3) through a given proxy when non-NULL, falling back
to the global executor when NULL; the existing dispatch() now delegates
to it with NULL, so behavior is unchanged. Live proxies are held in a
GC-protected array. The new symbols are unused until table function
integration lands, so this commit is behavior-preserving (full suite
green).

Wire the execute path to per-worker proxy threads on DuckDB >= 1.5.0.
A local_init callback registered via duckdb_table_function_set_local_init
runs once per worker thread, creates a proxy (allocating its Ruby thread
under the GVL through the global executor, since local_init runs on a
non-Ruby thread), and stores it as thread-local init data. The execute
callback retrieves that proxy and dispatches through it via
rbduckdb_function_executor_dispatch_via_proxy, so callbacks from
different workers run concurrently instead of serializing on the single
global executor. DuckDB frees each proxy through rbduckdb_worker_proxy_destroy.

bind and init stay on the global executor (not on the hot path). On
DuckDB < 1.5.0 the local_init hook is absent and the execute callback
keeps using the global executor unchanged.
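
A rough Ruby model of the lifecycle and fallback described above, assuming a simplified queue-based executor. All names here (`Executor`, `local_init`, `dispatch_via_proxy`, `run_worker`, `supports_local_init`) are hypothetical stand-ins for the C functions, and the sketch does not model the detail that the proxy's Ruby thread is itself created through the global executor.

```ruby
# Hypothetical model: a single-threaded executor fed by a queue.
class Executor
  attr_reader :thread

  def initialize
    @jobs = Queue.new
    @thread = Thread.new do
      # A nil job is the shutdown signal.
      while (job = @jobs.pop)
        block, reply = job
        reply << block.call
      end
    end
  end

  # Run the block on the executor thread and wait for its result.
  def dispatch(&block)
    reply = Queue.new
    @jobs << [block, reply]
    reply.pop
  end

  def destroy
    @jobs << nil
    @thread.join
  end
end

GLOBAL_EXECUTOR = Executor.new

# Use the worker's proxy when there is one, else the global executor.
def dispatch_via_proxy(proxy, &block)
  (proxy || GLOBAL_EXECUTOR).dispatch(&block)
end

# Stands in for local_init: one proxy per worker, only when the hook exists.
def local_init(supports_local_init)
  supports_local_init ? Executor.new : nil
end

# One worker's lifetime: local_init -> execute through proxy -> destroy.
def run_worker(supports_local_init)
  proxy = local_init(supports_local_init)
  dispatch_via_proxy(proxy) { Thread.current }
ensure
  proxy&.destroy
end

with_hook = run_worker(true)      # runs on the worker's own proxy thread
without_hook = run_worker(false)  # falls back to the global executor thread
```

The point of the single `dispatch_via_proxy` entry point in this model is that the execute path does not need to know which DuckDB version it runs on: a missing proxy simply selects the global executor.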

Verified: with SET threads=4 plus cardinality/max_threads hints, a
GVL-releasing callback reaches max_concurrent=4 (vs 2 on the global
executor) for a ~2x speedup; results are identical. The added test
asserts correctness of the local_init -> proxy -> destroy lifecycle
under multi-threaded execution (throughput is checked manually to avoid
CI flakiness).
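
For readers who want to repeat the manual throughput check, one way a `max_concurrent` figure could be collected is with a small counter wrapped around the callback body. This is an assumed approach, not the measurement code used for the numbers above; `ConcurrencyProbe` and `track` are hypothetical names.

```ruby
# Hypothetical probe that records how many callbacks were in flight at once.
class ConcurrencyProbe
  attr_reader :max_concurrent

  def initialize
    @mutex = Mutex.new
    @current = 0
    @max_concurrent = 0
  end

  # Wrap a callback body; the lock is not held while the body runs.
  def track
    @mutex.synchronize do
      @current += 1
      @max_concurrent = [@max_concurrent, @current].max
    end
    yield
  ensure
    @mutex.synchronize { @current -= 1 }
  end
end

# Four Ruby threads standing in for four workers; sleep releases the GVL.
probe = ConcurrencyProbe.new
threads = Array.new(4) { Thread.new { probe.track { sleep 0.05 } } }
threads.each(&:join)
puts "max_concurrent=#{probe.max_concurrent}"
```

The printed value depends on thread scheduling, which is the same reason the PR keeps the throughput check out of CI.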
@otegami otegami force-pushed the refactor/extract-executor-module branch from 8b1d2f0 to 83684e6 Compare May 30, 2026 03:54