# HeroDB Performance Benchmarking Guide

## Overview

This document describes the comprehensive benchmarking suite for HeroDB, designed to measure and compare the performance characteristics of the two storage backends: **redb** (default) and **sled**.

## Benchmark Architecture

### Design Principles

1. **Fair Comparison**: Identical test datasets and operations across all backends
2. **Statistical Rigor**: Using Criterion for statistically sound measurements
3. **Real-World Scenarios**: Mix of synthetic and realistic workload patterns
4. **Reproducibility**: Deterministic test data generation with fixed seeds
5. **Isolation**: Each benchmark runs in a clean environment

### Benchmark Categories

#### 1. Single-Operation CRUD Benchmarks

Measures the performance of individual database operations:

- **String Operations**
  - `SET` - Write a single key-value pair
  - `GET` - Read a single key-value pair
  - `DEL` - Delete a single key
  - `EXISTS` - Check key existence

- **Hash Operations**
  - `HSET` - Set a single field in a hash
  - `HGET` - Get a single field from a hash
  - `HGETALL` - Get all fields from a hash
  - `HDEL` - Delete a field from a hash
  - `HEXISTS` - Check field existence

- **List Operations**
  - `LPUSH` - Push to the list head
  - `RPUSH` - Push to the list tail
  - `LPOP` - Pop from the list head
  - `RPOP` - Pop from the list tail
  - `LRANGE` - Get a range of elements

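As a minimal sketch of what one of these benchmarks might look like with Criterion (the `setup_redb_storage()` helper is hypothetical, standing in for the backend setup shown later under Backend Setup; the `set` signature follows the concurrent example further below):

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_set(c: &mut Criterion) {
    // Hypothetical helper (see common/backends.rs): opens a fresh
    // redb-backed Storage in a temporary directory.
    let storage = setup_redb_storage();

    let mut i: u64 = 0;
    c.bench_function("single_ops/redb/set/small", |b| {
        b.iter(|| {
            // Unique key per iteration so writes do not collapse into overwrites
            let key = format!("bench:key:{:08}", i);
            i += 1;
            storage
                .set(black_box(key), black_box("value".to_string()))
                .unwrap();
        })
    });
}

criterion_group!(benches, bench_set);
criterion_main!(benches);
```
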
#### 2. Bulk Operation Benchmarks

Tests throughput with varying batch sizes:

- **Bulk Insert**: 100, 1,000, 10,000 records
- **Bulk Read**: Sequential and random access patterns
- **Bulk Update**: Modify existing records
- **Bulk Delete**: Remove multiple records

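A sketch of the bulk-insert benchmark, using Criterion's benchmark groups and `Throughput::Elements` so results are reported as records per second (`generate_test_data` is defined under Data Generation below; `setup_redb_storage` is the same hypothetical helper as above):

```rust
use criterion::{
    criterion_group, criterion_main, BatchSize, BenchmarkId, Criterion, Throughput,
};

fn bench_bulk_insert(c: &mut Criterion) {
    let mut group = c.benchmark_group("bulk_ops/redb/insert");
    for &size in &[100usize, 1_000, 10_000] {
        // Report throughput as records per second
        group.throughput(Throughput::Elements(size as u64));
        group.bench_with_input(BenchmarkId::from_parameter(size), &size, |b, &size| {
            b.iter_batched(
                // Fresh database and deterministic data per iteration (untimed setup)
                || (setup_redb_storage(), generate_test_data(size, 42)),
                |(storage, data)| {
                    for (key, value) in data {
                        storage.set(key, value).unwrap();
                    }
                },
                BatchSize::PerIteration,
            )
        });
    }
    group.finish();
}

criterion_group!(benches, bench_bulk_insert);
criterion_main!(benches);
```
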
#### 3. Query and Scan Benchmarks

Evaluates iteration and filtering performance:

- **SCAN**: Cursor-based key iteration
- **HSCAN**: Hash field iteration
- **KEYS**: Pattern matching (with various patterns)
- **Range Queries**: List range operations

#### 4. Concurrent Operation Benchmarks

Simulates multi-client scenarios:

- **10 Concurrent Clients**: Light load
- **50 Concurrent Clients**: Medium load
- **Mixed Workload**: 70% reads, 30% writes

#### 5. Memory Profiling

Tracks memory usage patterns:

- **Allocation Tracking**: Total allocations per operation
- **Peak Memory**: Maximum memory usage
- **Memory Efficiency**: Bytes per record stored

### Test Data Specifications

#### Dataset Sizes

- **Small**: 1,000 - 10,000 records
- **Medium**: 10,000 records (primary focus)

#### Data Characteristics

- **Key Format**: `bench:key:{id}` (predictable, sortable)
- **Value Sizes**:
  - Small: 50-100 bytes
  - Medium: 500-1,000 bytes
  - Large: 5,000-10,000 bytes
- **Hash Fields**: 5-20 fields per hash
- **List Elements**: 10-100 elements per list

### Metrics Collected

For each benchmark, we collect:

1. **Latency Metrics**
   - Mean execution time
   - Median (p50)
   - 95th percentile (p95)
   - 99th percentile (p99)
   - Standard deviation

2. **Throughput Metrics**
   - Operations per second
   - Records per second (for bulk operations)

3. **Memory Metrics**
   - Total allocations
   - Peak memory usage
   - Average bytes per operation

4. **Initialization Overhead**
   - Database startup time
   - First-operation latency (cold cache)

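These metrics map naturally onto one summary record per benchmark run. A sketch of that record's shape (field names are illustrative, chosen to mirror the CSV and JSON exports shown under Output Formats):

```rust
/// One row of collected metrics per (backend, operation, dataset size) run.
/// Field names are illustrative and mirror the CSV/JSON exports below.
#[derive(Debug, Clone)]
struct BenchmarkMetrics {
    backend: String,       // "redb" or "sled"
    operation: String,     // e.g. "set", "hget", "lpush"
    dataset_size: String,  // "small", "medium", or "large"
    mean_ns: u64,
    median_ns: u64,        // p50
    p95_ns: u64,
    p99_ns: u64,
    std_dev_ns: u64,
    throughput_ops_sec: u64,
    allocations: u64,
    peak_bytes: u64,
}
```
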
## Benchmark Structure

### Directory Layout

```
benches/
├── common/
│   ├── mod.rs             # Shared utilities
│   ├── data_generator.rs  # Test data generation
│   ├── metrics.rs         # Custom metrics collection
│   └── backends.rs        # Backend setup helpers
├── single_ops.rs          # Single-operation benchmarks
├── bulk_ops.rs            # Bulk operation benchmarks
├── scan_ops.rs            # Scan and query benchmarks
├── concurrent_ops.rs      # Concurrent operation benchmarks
└── memory_profile.rs      # Memory profiling benchmarks
```

### Running Benchmarks

#### Run All Benchmarks

```bash
cargo bench
```

#### Run Specific Benchmark Suite

```bash
cargo bench --bench single_ops
cargo bench --bench bulk_ops
cargo bench --bench concurrent_ops
```

#### Run Specific Backend

```bash
cargo bench -- redb
cargo bench -- sled
```

#### Generate Reports

```bash
# Run benchmarks and save results
cargo bench -- --save-baseline main

# Compare against baseline
cargo bench -- --baseline main

# Export to CSV
cargo bench -- --output-format csv > results.csv
```

### Output Formats

#### 1. Terminal Output (Default)

Real-time progress with statistical summaries:

```
single_ops/redb/set/small
                        time:   [1.234 µs 1.245 µs 1.256 µs]
                        thrpt:  [802.5K ops/s 810.2K ops/s 818.1K ops/s]
```

#### 2. CSV Export

Structured data for analysis:

```csv
backend,operation,dataset_size,mean_ns,median_ns,p95_ns,p99_ns,throughput_ops_sec
redb,set,small,1245,1240,1890,2100,810200
sled,set,small,1567,1550,2340,2890,638000
```

#### 3. JSON Export

Detailed metrics for programmatic processing:

```json
{
  "benchmark": "single_ops/redb/set/small",
  "metrics": {
    "mean": 1245,
    "median": 1240,
    "p95": 1890,
    "p99": 2100,
    "std_dev": 145,
    "throughput": 810200
  },
  "memory": {
    "allocations": 3,
    "peak_bytes": 4096
  }
}
```

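For the programmatic-processing case, the JSON export can be deserialized directly; a minimal sketch, assuming `serde` and `serde_json` as dev-dependencies and the field layout shown above:

```rust
use serde::Deserialize;

// Assumed shape, mirroring the JSON export above.
#[derive(Deserialize)]
struct BenchResult {
    benchmark: String,
    metrics: Metrics,
    memory: Memory,
}

#[derive(Deserialize)]
struct Metrics {
    mean: u64,
    median: u64,
    p95: u64,
    p99: u64,
    std_dev: u64,
    throughput: u64,
}

#[derive(Deserialize)]
struct Memory {
    allocations: u64,
    peak_bytes: u64,
}

fn parse_result(json: &str) -> serde_json::Result<BenchResult> {
    serde_json::from_str(json)
}
```
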
## Benchmark Implementation Details

### Backend Setup

Each benchmark creates isolated database instances:

```rust
use tempfile::TempDir;

// redb backend
let temp_dir = TempDir::new()?;
let db_path = temp_dir.path().join("bench.db");
let storage = Storage::new(db_path, false, None)?;

// sled backend
let temp_dir = TempDir::new()?;
let db_path = temp_dir.path().join("bench.sled");
let storage = SledStorage::new(db_path, false, None)?;
```

### Data Generation

Deterministic data generation ensures reproducibility:

```rust
use rand::{Rng, SeedableRng};
use rand::rngs::StdRng;

/// One possible implementation of the value generator used below:
/// a random lowercase ASCII string of the given length.
fn generate_value(rng: &mut StdRng, len: usize) -> String {
    (0..len).map(|_| rng.gen_range(b'a'..=b'z') as char).collect()
}

fn generate_test_data(count: usize, seed: u64) -> Vec<(String, String)> {
    // Fixed seed makes every run produce the same dataset
    let mut rng = StdRng::seed_from_u64(seed);
    (0..count)
        .map(|i| {
            let key = format!("bench:key:{:08}", i);
            let value = generate_value(&mut rng, 100);
            (key, value)
        })
        .collect()
}
```

### Concurrent Testing

Using Tokio for async concurrent operations:

```rust
use std::sync::Arc;

async fn concurrent_benchmark(
    // Send + Sync bounds are required to move the handle into spawned tasks
    storage: Arc<dyn StorageBackend + Send + Sync>,
    num_clients: usize,
    operations: usize,
) {
    let tasks: Vec<_> = (0..num_clients)
        .map(|client_id| {
            let storage = storage.clone();
            tokio::spawn(async move {
                for i in 0..operations {
                    let key = format!("client:{}:key:{}", client_id, i);
                    storage.set(key, "value".to_string()).unwrap();
                }
            })
        })
        .collect();

    futures::future::join_all(tasks).await;
}
```

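The mixed-workload scenario (70% reads, 30% writes) can be sketched the same way. Here the read/write split is driven by a per-client seeded RNG for reproducibility; a `get` method is assumed to exist alongside `set` on the storage trait:

```rust
use std::sync::Arc;
use rand::{Rng, SeedableRng};
use rand::rngs::StdRng;

async fn mixed_workload(
    storage: Arc<dyn StorageBackend + Send + Sync>,
    num_clients: usize,
    operations: usize,
) {
    let tasks: Vec<_> = (0..num_clients)
        .map(|client_id| {
            let storage = storage.clone();
            tokio::spawn(async move {
                // Seed per client so the 70/30 split is reproducible
                let mut rng = StdRng::seed_from_u64(client_id as u64);
                for i in 0..operations {
                    // Reuse a small keyspace so reads hit existing keys
                    let key = format!("client:{}:key:{}", client_id, i % 100);
                    if rng.gen_range(0..100) < 70 {
                        let _ = storage.get(&key); // read path (70%)
                    } else {
                        storage.set(key, "value".to_string()).unwrap(); // write path (30%)
                    }
                }
            })
        })
        .collect();

    futures::future::join_all(tasks).await;
}
```
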
## Interpreting Results

### Performance Comparison

When comparing backends, consider:

1. **Latency vs. Throughput Trade-offs**
   - Lower latency is better for interactive workloads
   - Higher throughput is better for batch processing

2. **Consistency**
   - A lower standard deviation means more predictable performance
   - Check p95/p99 for tail latency

3. **Scalability**
   - How performance changes with dataset size
   - Concurrent operation efficiency

### Backend Selection Guidelines

Based on benchmark results, choose:

**redb** when:
- Need predictable latency
- Working with structured data (separate tables)
- Require high concurrent read performance
- Memory efficiency is important

**sled** when:
- Need high write throughput
- Working with uniform data types
- Require lock-free operations
- Crash recovery is critical

## Memory Profiling

### Using DHAT

For detailed memory profiling:

```bash
# Install valgrind (DHAT ships as a valgrind tool)
sudo apt-get install valgrind

# Run with DHAT
cargo bench --bench memory_profile -- --profile-time=10
```

### Custom Allocation Tracking

The benchmarks include custom allocation tracking via the `dhat` crate:

```rust
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn track_allocations<F>(f: F) -> dhat::HeapStats
where
    F: FnOnce(),
{
    let _profiler = dhat::Profiler::new_heap();
    f();
    // Snapshot of allocation counts and peak usage since the profiler started
    dhat::HeapStats::get()
}
```

## Continuous Benchmarking

### Regression Detection

Compare against a baseline to detect performance regressions:

```bash
# Save current performance as a baseline
cargo bench -- --save-baseline v0.1.0

# After changes, compare
cargo bench -- --baseline v0.1.0

# Criterion will highlight significant changes
```

### CI Integration

Add to the CI pipeline:

```yaml
- name: Run Benchmarks
  run: |
    cargo bench --no-fail-fast -- --output-format json > bench-results.json

- name: Compare Results
  run: |
    python scripts/compare_benchmarks.py \
      --baseline baseline.json \
      --current bench-results.json \
      --threshold 10  # Fail if >10% regression
```

## Troubleshooting

### Common Issues

1. **Inconsistent Results**
   - Ensure the system is idle during benchmarks
   - Disable CPU frequency scaling
   - Run multiple iterations

2. **Out of Memory**
   - Reduce dataset sizes
   - Run benchmarks sequentially
   - Increase system swap space

3. **Slow Benchmarks**
   - Reduce the sample size in the Criterion config (see the sketch below)
   - Use the `--quick` flag for faster runs
   - Focus on specific benchmarks

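A sketch of how the sample size and measurement time might be reduced in a benchmark target's Criterion config (the values are illustrative, and `bench_set` refers to the earlier single-operation sketch):

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use std::time::Duration;

fn quick_config() -> Criterion {
    Criterion::default()
        .sample_size(20)                           // default is 100
        .measurement_time(Duration::from_secs(3))  // default is 5 seconds
        .warm_up_time(Duration::from_secs(1))
}

criterion_group! {
    name = benches;
    config = quick_config();
    targets = bench_set
}
criterion_main!(benches);
```
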
### Performance Tips

```bash
# Quick benchmark run (fewer samples)
cargo bench -- --quick

# Verbose output for debugging
cargo bench -- --verbose

# Profile a specific operation
cargo bench -- single_ops/redb/set
```

## Future Enhancements

Potential additions to the benchmark suite:

1. **Transaction Performance**: Measure MULTI/EXEC overhead
2. **Encryption Overhead**: Compare encrypted vs. non-encrypted
3. **Persistence Testing**: Measure flush/sync performance
4. **Recovery Time**: Database restart and recovery speed
5. **Network Overhead**: Redis protocol parsing impact
6. **Long-Running Stability**: Performance over extended periods

## References

- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [DHAT Memory Profiler](https://valgrind.org/docs/manual/dh-manual.html)
- [Rust Performance Book](https://nnethercote.github.io/perf-book/)