dmatth1

This is my personal blog.

arrow-cpp, arrow-rs, and Velox still ship the scalar Parquet bloom probe

arrow-go shipped AVX2/SSE4/NEON SBBF probes in 18.3.0 (PR #336). arrow-cpp and Velox ship the same scalar reference line-for-line; arrow-rs is the same algorithm in Rust. Those three cover the C-family Parquet ecosystem: DuckDB, ClickHouse, Polars, DataFusion, Trino, Presto, Spark via Gluten, StarRocks, Doris, pyarrow → pandas, Apache Drill. A C++ port of the arrow-go approach: bit-identical on 167M (query, filter) pairs, 3–5× in-cache on the probe microbenchmark, 1.5× out-of-L3 with a 4-way bulk path across row-group filters. ...

May 15, 2026 · 5 min · dmatth1