Performance Benchmarks

Rayforce-Py runs within about 1% of native Rayforce and is, on average, about 2.6x faster than Polars and 6.7x faster than Pandas on the six group-by queries below. The benchmarks are based on the H2OAI Group By Benchmark standard.

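Implementation    Mean time (μs)   Relative to Rayforce-Py
Rayforce-Py       945              1.00x
Native Rayforce   944              1.00x
Polars            2,619            2.61x
Pandas            6,626            6.70x

Times are the mean of the six per-query medians (Q1 to Q6). The multipliers are the mean of the per-query ratios, so they differ slightly from the ratio of the mean times.

  • ~2.6x faster than Polars
  • ~6.7x faster than Pandas
  • ~100% of native performance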

Benchmark environment: macOS on M4 with 32 GB RAM; 1M rows, 100 groups; median of 50 runs after 20 warmup runs.

Methodology

  • Dataset: 1,000,000 rows, 6 columns (id1, id2, id3, v1, v2, v3)
  • Timing: Median of 50 runs
  • Warmup: 20 runs per query to warm caches
  • Data: Deterministic (seed=42) for reproducibility
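The methodology above can be sketched as a small timing harness. This is a minimal illustration using only the Python standard library, not the project's actual benchmark code: `make_dataset`, `group_sum`, and `bench` are hypothetical names, the dataset is reduced to two of the six columns, and the workload is a plain-Python stand-in for a real group-by engine.

```python
import random
import statistics
import time
from collections import defaultdict


def make_dataset(n_rows=1_000_000, n_groups=100, seed=42):
    """Build deterministic columns; the same seed always yields the same data."""
    rng = random.Random(seed)
    return {
        "id1": [rng.randrange(n_groups) for _ in range(n_rows)],
        "v1": [rng.random() for _ in range(n_rows)],
    }


def group_sum(data):
    """Placeholder workload: group by id1 and sum v1 in plain Python."""
    totals = defaultdict(float)
    for key, value in zip(data["id1"], data["v1"]):
        totals[key] += value
    return dict(totals)


def bench(fn, *args, runs=50, warmup=20):
    """Return (median, stdev) of wall-clock time in microseconds."""
    for _ in range(warmup):  # untimed runs to warm caches
        fn(*args)
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn(*args)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.median(samples), statistics.stdev(samples)


if __name__ == "__main__":
    data = make_dataset(n_rows=100_000)  # smaller than the published 1M rows
    median_us, stdev_us = bench(group_sum, data)
    print(f"median {median_us:.0f} μs, stdev {stdev_us:.0f} μs")
```

Warmup runs are executed but never recorded, so one-time costs such as cold caches do not leak into the reported median.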

Q1: Group by id1, sum v1

Implementation    Time (μs)   vs Native   vs Pandas   vs Polars
Rayforce-Py       611         1.00x       5.82x       1.90x
Native Rayforce   612         1.00x       5.81x       1.90x
Polars            1,162       1.90x       3.06x       1.00x
Pandas            3,556       5.81x       1.00x       0.33x
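The ratio columns are not defined explicitly, but the numbers are consistent with the following reading: "vs Native" is the implementation's time divided by the native time (lower is better), while "vs Pandas" and "vs Polars" are the Pandas or Polars time divided by the implementation's time (higher is better). The sketch below is an inferred interpretation, not the project's code; it recomputes the Q1 ratios from the published times, and `ratio_columns` is a hypothetical helper name.

```python
# Median times in μs copied from the Q1 table.
Q1_TIMES = {
    "Rayforce-Py": 611,
    "Native Rayforce": 612,
    "Polars": 1162,
    "Pandas": 3556,
}


def ratio_columns(times):
    """Return {implementation: (vs_native, vs_pandas, vs_polars)}, rounded to 2 places."""
    native = times["Native Rayforce"]
    pandas = times["Pandas"]
    polars = times["Polars"]
    return {
        name: (
            round(t / native, 2),  # time relative to native (lower is better)
            round(pandas / t, 2),  # speedup over Pandas (higher is better)
            round(polars / t, 2),  # speedup over Polars (higher is better)
        )
        for name, t in times.items()
    }


for name, (vs_native, vs_pandas, vs_polars) in ratio_columns(Q1_TIMES).items():
    print(f"{name:<16} {vs_native:.2f}x  {vs_pandas:.2f}x  {vs_polars:.2f}x")
```

Because the published times are rounded to whole microseconds, ratios recomputed this way for other queries can differ from the tables in the last digit.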

Q2: Group by id1, id2, sum v1

Implementation    Time (μs)   vs Native   vs Pandas   vs Polars
Rayforce-Py       1,279       0.99x       10.65x      5.28x
Native Rayforce   1,290       1.00x       10.57x      5.23x
Polars            6,753       5.23x       2.02x       1.00x
Pandas            13,631      10.57x      1.00x       0.50x

Performance Insight

The multi-column group-by (Q2) shows the largest advantage of the six queries: Rayforce-Py is 10.65x faster than Pandas and 5.28x faster than Polars.


Q3: Group by id3, sum v1, avg v3

Implementation    Time (μs)   vs Native   vs Pandas   vs Polars
Rayforce-Py       829         1.00x       5.85x       1.63x
Native Rayforce   828         1.00x       5.85x       1.63x
Polars            1,352       1.63x       3.58x       1.00x
Pandas            4,846       5.85x       1.00x       0.28x

Q4: Group by id3, avg v1, v2, v3

Implementation    Time (μs)   vs Native   vs Pandas   vs Polars
Rayforce-Py       1,044       1.00x       5.96x       1.52x
Native Rayforce   1,045       1.00x       5.95x       1.52x
Polars            1,584       1.52x       3.92x       1.00x
Pandas            6,216       5.95x       1.00x       0.25x

Q5: Group by id3, sum v1, v2, v3

Implementation    Time (μs)   vs Native   vs Pandas   vs Polars
Rayforce-Py       1,049       1.01x       6.55x       1.48x
Native Rayforce   1,043       1.00x       6.60x       1.49x
Polars            1,549       1.49x       4.44x       1.00x
Pandas            6,879       6.60x       1.00x       0.23x

Multiple Aggregations

With three sums per group, Q5 shows Rayforce-Py running 6.55x faster than Pandas, the second-largest gap after Q2, and 1.48x faster than Polars, the narrowest Polars margin of the six queries.


Q6: Group by id3, max(v1) - min(v2)

Implementation    Time (μs)   vs Native   vs Pandas   vs Polars
Rayforce-Py       859         1.02x       5.39x       3.86x
Native Rayforce   846         1.00x       5.47x       3.92x
Polars            3,316       3.92x       1.40x       1.00x
Pandas            4,627       5.47x       1.00x       0.72x
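For reference, the semantics of the query shapes benchmarked above can be spelled out in plain Python. This is only an illustrative sketch of what Q1, Q2, Q3, and Q6 compute, using a tiny hand-made sample; it does not use the Rayforce-Py, Polars, or Pandas APIs, and `group_rows` and the `q*` functions are hypothetical names.

```python
from collections import defaultdict

# Tiny hand-made sample with the benchmark's column names (values are illustrative).
ROWS = [
    {"id1": "a", "id2": "x", "id3": 1, "v1": 1, "v2": 5, "v3": 1.0},
    {"id1": "a", "id2": "y", "id3": 1, "v1": 4, "v2": 2, "v3": 3.0},
    {"id1": "b", "id2": "x", "id3": 2, "v1": 7, "v2": 3, "v3": 5.0},
    {"id1": "a", "id2": "x", "id3": 2, "v1": 2, "v2": 9, "v3": 7.0},
]


def group_rows(rows, keys):
    """Bucket rows by the tuple of their values in the key columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)
    return groups


def q1(rows):
    """Group by id1, sum v1."""
    return {k: sum(r["v1"] for r in g) for k, g in group_rows(rows, ["id1"]).items()}


def q2(rows):
    """Group by id1 and id2, sum v1."""
    return {k: sum(r["v1"] for r in g) for k, g in group_rows(rows, ["id1", "id2"]).items()}


def q3(rows):
    """Group by id3, sum v1 and average v3."""
    return {
        k: (sum(r["v1"] for r in g), sum(r["v3"] for r in g) / len(g))
        for k, g in group_rows(rows, ["id3"]).items()
    }


def q6(rows):
    """Group by id3, then take max(v1) minus min(v2) within each group."""
    return {
        k: max(r["v1"] for r in g) - min(r["v2"] for r in g)
        for k, g in group_rows(rows, ["id3"]).items()
    }


print(q1(ROWS))
print(q2(ROWS))
print(q3(ROWS))
print(q6(ROWS))
```

Q4 and Q5 follow the same pattern as Q3, with three averages or three sums per group respectively.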

Query     Rayforce-Py vs Native   Rayforce-Py vs Pandas   Rayforce-Py vs Polars
Q1        1.00x                   5.82x                   1.90x
Q2        0.99x                   10.65x                  5.28x
Q3        1.00x                   5.85x                   1.63x
Q4        1.00x                   5.96x                   1.52x
Q5        1.01x                   6.55x                   1.48x
Q6        1.02x                   5.39x                   3.86x
Average   1.00x                   6.70x                   2.61x

Performance Analysis

Rayforce-Py adds almost no overhead compared to native Rayforce (1.00x on average), which shows that the Python bindings are efficient. Averaged over the six per-query ratios, Rayforce-Py is 6.70x faster than Pandas and 2.61x faster than Polars, making it a strong choice for high-performance data processing.
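The averages can be checked directly from the published tables. The sketch below recomputes them two ways: as the arithmetic mean of the per-query ratios, which matches the published figures, and as the ratio of the mean times, which weights slow queries more heavily. The variable names are illustrative; the numbers are copied from the tables above.

```python
from statistics import mean

# Per-query median times in μs, copied from the Q1 to Q6 tables.
TIMES = {
    "Rayforce-Py": [611, 1279, 829, 1044, 1049, 859],
    "Polars": [1162, 6753, 1352, 1584, 1549, 3316],
    "Pandas": [3556, 13631, 4846, 6216, 6879, 4627],
}

# Published per-query speedups of Rayforce-Py, copied from the summary table.
SPEEDUP_VS_PANDAS = [5.82, 10.65, 5.85, 5.96, 6.55, 5.39]
SPEEDUP_VS_POLARS = [1.90, 5.28, 1.63, 1.52, 1.48, 3.86]

# Mean of per-query ratios: every query counts equally.
mean_of_ratios_pandas = round(mean(SPEEDUP_VS_PANDAS), 2)
mean_of_ratios_polars = round(mean(SPEEDUP_VS_POLARS), 2)

# Ratio of mean times: slower queries (such as Q2) dominate.
ratio_of_means_pandas = round(mean(TIMES["Pandas"]) / mean(TIMES["Rayforce-Py"]), 2)
ratio_of_means_polars = round(mean(TIMES["Polars"]) / mean(TIMES["Rayforce-Py"]), 2)

print("mean of ratios:", mean_of_ratios_pandas, mean_of_ratios_polars)
print("ratio of means:", ratio_of_means_pandas, ratio_of_means_polars)
```

The first line reproduces the published 6.70x and 2.61x. The second gives 7.01x and 2.77x, which is what dividing the mean times (6,626 μs and 2,619 μs by 945 μs) yields.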

Note: In some queries (Q1, Q2, Q4) Rayforce-Py measures slightly faster than native Rayforce. This is an artifact of the measurement methodology: native Rayforce timings include memory deallocation overhead, while Rayforce-Py timings exclude it. The differences in either direction are at most about 2% and within measurement noise, which shows that the Python bindings introduce virtually no overhead.
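The effect described in the note can be illustrated with a generic CPython sketch. It does not reproduce how either Rayforce harness is implemented; `build_result` is a hypothetical stand-in for a query result, and the only point is that where the clock stops relative to freeing the result changes the measured time.

```python
import time


def build_result(n=200_000):
    """Stand-in for a query result: a large list of small objects."""
    return [(i, float(i)) for i in range(n)]


def time_excluding_free(fn):
    """Stop the clock while the result is still alive."""
    start = time.perf_counter_ns()
    result = fn()
    elapsed = time.perf_counter_ns() - start
    del result  # released only after timing has stopped
    return elapsed / 1_000


def time_including_free(fn):
    """Stop the clock only after the result has been released."""
    start = time.perf_counter_ns()
    result = fn()
    del result  # in CPython the last reference is dropped here, freeing the list
    elapsed = time.perf_counter_ns() - start
    return elapsed / 1_000


if __name__ == "__main__":
    print(f"excluding free: {time_excluding_free(build_result):.0f} μs")
    print(f"including free: {time_including_free(build_result):.0f} μs")
```

A single pair of measurements is noisy; comparing the two variants meaningfully needs repeated runs and a median, as in the methodology above.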


Running Your Own Benchmarks

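You can run the benchmarks yourself using the provided benchmark suite. The defaults (15 runs, 5 warmup) are lighter than the configuration used for the published numbers (50 runs, 20 warmup); pass different values through ARGS: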

# Default (15 runs, 5 warmup)
make benchmarkdb

# Custom configuration
make benchmarkdb ARGS="--runs 20 --warmup 5"

For Accurate Results

  • Use at least 15-20 runs so that the median is stable
  • Ensure your system is otherwise idle to minimize interference from other processes
  • Results report the median (more robust to outliers than the mean) along with the standard deviation
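The preference for the median can be seen with a small example. The run times below are hypothetical: four typical runs and one that was disturbed by another process. `summarize` is an illustrative helper built on the standard `statistics` module.

```python
import statistics


def summarize(samples_us):
    """Return median, mean, and sample standard deviation of run times in μs."""
    return {
        "median": statistics.median(samples_us),
        "mean": statistics.mean(samples_us),
        "stdev": statistics.stdev(samples_us),
    }


# Hypothetical run times in μs; the last run hit interference.
samples_us = [610, 612, 611, 609, 2400]
summary = summarize(samples_us)
print(f"median {summary['median']} μs, mean {summary['mean']:.1f} μs, "
      f"stdev {summary['stdev']:.1f} μs")
```

The single outlier pulls the mean to 968.4 μs while the median stays at 611 μs. A large standard deviation relative to the median is a hint that the system was not idle during the run.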

Learn More