Summary
I propose replacing the current gettimeofday()-based timing measurements in the benchmark suite with CPU cycle counting using hardware performance counters (RDTSC). This change would significantly improve benchmark reliability, precision, and consistency.
Current Issues with Time-Based Measurements
The current benchmarking implementation in bench.h uses wall-clock time measurements with microsecond precision:

    static int64_t gettime_i64(void) {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return (int64_t)tv.tv_usec + (int64_t)tv.tv_sec * 1000000LL;
    }

This approach has several significant limitations:
- Limited Precision: Microsecond resolution is insufficient for modern CPUs, where cryptographic operations can complete in hundreds of nanoseconds, so many iterations are needed to get an acceptable result
- System Interference: Wall-clock time is affected by:
  - OS scheduler interrupts and context switches
  - Other running processes competing for CPU time
  - Power management and frequency scaling
  - Thermal throttling
  - Shared caches between cores
- High Variability: Benchmark results can vary by 50%+ between runs due to system noise
- Non-deterministic: Results depend on system load, making comparisons unreliable
- Overhead: System call overhead affects measurement accuracy
- No common baseline: The results don't provide a stable, reliable metric for team-wide performance discussions and comparisons
Proposed Implementation
Use a clock that does not count time during which the process is not running (for example, while it is descheduled or waiting). I propose something like clock_gettime(CLOCK_PROCESS_CPUTIME_ID), or measuring externally with perf stat -e cpu-clock.
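To make the suggestion concrete, here is a minimal sketch of what a CPU-time based helper could look like, assuming a POSIX system that supports CLOCK_PROCESS_CPUTIME_ID. The names gettime_cpu_ns, gettime_wall_ns and measure_sleep are hypothetical and are not part of bench.h; measure_sleep exists only to illustrate the difference between the two clocks.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime() and nanosleep() */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: CPU time consumed by this process, in nanoseconds.
 * Returns -1 if the clock is not available. */
static int64_t gettime_cpu_ns(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0) {
        return -1;
    }
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical helper: monotonic wall-clock time in nanoseconds,
 * used here only for comparison. Returns -1 on failure. */
static int64_t gettime_wall_ns(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return -1;
    }
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Illustration only: sleep for `ms` milliseconds and report how far each
 * clock advanced. The wall clock includes the sleep; the process CPU clock
 * only counts time the process actually spent executing. */
static void measure_sleep(long ms, int64_t *wall_ns, int64_t *cpu_ns) {
    struct timespec req;
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    int64_t wall0 = gettime_wall_ns();
    int64_t cpu0 = gettime_cpu_ns();
    nanosleep(&req, NULL);
    *cpu_ns = gettime_cpu_ns() - cpu0;
    *wall_ns = gettime_wall_ns() - wall0;
}
```

Calling measure_sleep(20, &wall, &cpu) should report a wall-clock delta of at least about 20 ms and a much smaller CPU-time delta, which is the property that matters when the benchmark process is preempted. Note that this clock measures CPU time, not cycles, so it does not by itself remove the effect of frequency scaling, and the process-wide clock sums CPU time over all threads of the process.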