Chapter 30: Performance Characteristics & Benchmarks


This appendix documents Pierre’s performance characteristics, optimization strategies, and benchmarking guidelines for production deployments.

Performance Overview

Pierre is designed for low-latency fitness data processing with the following targets:

| Operation | Target Latency | Notes |
|---|---|---|
| Health check | < 5ms | No DB, no auth |
| JWT validation | < 10ms | Cached JWKS |
| Simple tool call | < 50ms | Cached data |
| Provider API call | < 500ms | Network-bound |
| TSS calculation | < 20ms | CPU-bound |
| Complex analysis | < 200ms | Multi-algorithm |

Algorithmic Complexity

Training Load Calculations

| Algorithm | Time Complexity | Space Complexity |
|---|---|---|
| Average Power TSS | O(1) | O(1) |
| Normalized Power TSS | O(n) | O(w), where w = window size |
| TRIMP | O(n) | O(1) |
| CTL/ATL/TSB | O(n) | O(1) per activity |
| VO2max estimation | O(1) | O(1) |

Normalized Power calculation:

// O(n) time where n = number of power samples
// O(w) space for the rolling-window averages
// Assumes 1 Hz sampling, so window_seconds equals the window size in samples
pub fn calculate_np(power_stream: &[f64], window_seconds: u32) -> f64 {
    let window_size = window_seconds as usize;
    if window_size == 0 || power_stream.len() < window_size {
        return 0.0; // not enough samples for a full rolling window
    }

    // 30-second rolling average of power
    let rolling_averages: Vec<f64> = power_stream
        .windows(window_size)                             // O(n) windows
        .map(|w| w.iter().sum::<f64>() / w.len() as f64)  // O(w) per window
        .collect();

    // Fourth root of the mean of the fourth powers
    let mean_fourth = rolling_averages
        .iter()
        .map(|p| p.powi(4))
        .sum::<f64>() / rolling_averages.len() as f64;

    mean_fourth.powf(0.25)
}

Database Operations

| Operation | Complexity | Index Used |
|---|---|---|
| Get user by ID | O(1) | PRIMARY KEY |
| Get user by email | O(log n) | idx_users_email |
| List activities (paginated) | O(k + log n) | Composite index |
| Get OAuth token | O(1) | UNIQUE constraint |
| Usage analytics (monthly) | O(log n) | idx_api_key_usage_timestamp |
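
These lookups rely on the right indexes being present. The exact schema lives in Pierre's migrations; the following is only a sketch of the indexes the table above implies (table names and any index name other than idx_users_email and idx_api_key_usage_timestamp are assumptions):

use sqlx::SqlitePool;

// Hypothetical migration sketch -- illustrative, not Pierre's actual schema.
pub async fn ensure_indexes(pool: &SqlitePool) -> Result<(), sqlx::Error> {
    // Email lookups: O(log n) B-tree seek instead of a full table scan
    sqlx::query("CREATE UNIQUE INDEX IF NOT EXISTS idx_users_email ON users(email)")
        .execute(pool)
        .await?;

    // Cursor pagination over a user's activities: composite (user_id, id) seek
    sqlx::query("CREATE INDEX IF NOT EXISTS idx_activities_user_id ON activities(user_id, id)")
        .execute(pool)
        .await?;

    // Monthly usage analytics: range scan by API key and timestamp
    sqlx::query(
        "CREATE INDEX IF NOT EXISTS idx_api_key_usage_timestamp \
         ON api_key_usage(api_key_id, timestamp)",
    )
    .execute(pool)
    .await?;

    Ok(())
}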

Memory Characteristics

Static Memory

| Component | Approximate Size |
|---|---|
| Binary size | ~45 MB |
| Startup memory | ~50 MB |
| Per connection | ~8 KB |
| SQLite pool (10 conn) | ~2 MB |
| JWKS cache | ~100 KB |
| LRU cache (default) | ~10 MB |

Dynamic Memory

Activity processing:

// Memory per activity analysis:
// - Activity struct: ~500 bytes
// - Power stream (1 hour @ 1Hz): 3600 * 8 bytes ≈ 29 KB
// - Heart rate stream: 3600 * 8 bytes ≈ 29 KB
// - GPS stream: 3600 * 24 bytes ≈ 86 KB
// - Analysis result: ~2 KB
// Total per activity: ~150 KB peak

Concurrent request handling:

// Per-request memory estimate:
// - Request parsing: ~4 KB
// - Auth context: ~1 KB
// - Response buffer: ~8 KB
// - Tool execution: ~50 KB (varies by tool)
// Total per request: ~65 KB average

Concurrency Model

Tokio Runtime Configuration

// Production runtime (src/bin/pierre-mcp-server.rs)
#[tokio::main(flavor = "multi_thread")]
async fn main() {
    // Worker threads default to the number of CPU cores
    // Blocking work runs on a separate pool that grows on demand
}
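
When the defaults need overriding (for example, to pin the worker count inside a container with CPU limits), the runtime can be built explicitly. A minimal sketch, not taken from Pierre's source:

use std::thread;
use tokio::runtime::Builder;

fn main() -> std::io::Result<()> {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);

    let runtime = Builder::new_multi_thread()
        .worker_threads(workers)            // one async worker per core
        .max_blocking_threads(workers * 2)  // cap the blocking pool explicitly
        .enable_all()                       // I/O and timer drivers
        .build()?;

    runtime.block_on(async {
        // server startup would go here
    });
    Ok(())
}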

Connection Pooling

// SQLite pool configuration
let pool = SqlitePoolOptions::new()
    .max_connections(10)                           // Max concurrent DB connections
    .min_connections(2)                            // Keep-alive connections
    .acquire_timeout(Duration::from_secs(30))      // Fail fast when the pool is saturated
    .idle_timeout(Some(Duration::from_secs(600)))  // Recycle connections idle for 10 minutes
    .connect(&database_url)                        // database_url supplied by configuration
    .await?;

Rate Limiting

| Tier | Requests/Month | Burst Limit | Window |
|---|---|---|---|
| Trial | 1,000 | 10/min | 30 days |
| Starter | 10,000 | 60/min | 30 days |
| Professional | 100,000 | 300/min | 30 days |
| Enterprise | Unlimited | 1,000/min | N/A |
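
Enforcement reduces to two comparisons per request: one against the per-minute burst limit and one against the rolling monthly quota. A minimal sketch of that check; the types and field names here are illustrative, not Pierre's actual implementation:

// Hypothetical types -- for illustration only.
pub enum Tier {
    Trial,
    Starter,
    Professional,
    Enterprise,
}

impl Tier {
    /// (monthly quota, burst per minute); None means unlimited
    fn limits(&self) -> (Option<u64>, u64) {
        match self {
            Tier::Trial => (Some(1_000), 10),
            Tier::Starter => (Some(10_000), 60),
            Tier::Professional => (Some(100_000), 300),
            Tier::Enterprise => (None, 1_000),
        }
    }
}

pub struct UsageCounters {
    pub requests_this_window: u64, // 30-day rolling window
    pub requests_this_minute: u64,
}

pub fn check_rate_limit(tier: &Tier, usage: &UsageCounters) -> Result<(), &'static str> {
    let (monthly, burst) = tier.limits();
    if usage.requests_this_minute >= burst {
        return Err("burst limit exceeded");
    }
    if let Some(quota) = monthly {
        if usage.requests_this_window >= quota {
            return Err("monthly quota exceeded");
        }
    }
    Ok(())
}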

Optimization Strategies

1. Lazy Loading

// Providers loaded only when needed
impl ProviderRegistry {
    pub fn get(&self, name: &str) -> Option<Arc<dyn FitnessProvider>> {
        // Factory creates the provider on first access
        self.factories.get(name)?.create_provider()
    }
}
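
For context, the registry holds a map from provider name to factory. The shapes below are assumptions made for illustration; they are not Pierre's actual definitions:

use std::collections::HashMap;
use std::sync::Arc;

// Assumed shapes -- illustrative only.
pub trait FitnessProvider: Send + Sync {}

pub trait ProviderFactory: Send + Sync {
    /// Returns None when the provider is not configured (e.g. missing OAuth credentials)
    fn create_provider(&self) -> Option<Arc<dyn FitnessProvider>>;
}

pub struct ProviderRegistry {
    factories: HashMap<String, Box<dyn ProviderFactory>>,
}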

2. Response Caching

// LRU cache for expensive computations
pub struct Cache {
    lru: Mutex<LruCache<String, CacheEntry>>,
    default_ttl: Duration,
}

// Cache key patterns:
// - activities:{provider}:{user_id} -> Vec<Activity>
// - athlete:{provider}:{user_id}    -> Athlete
// - stats:{provider}:{user_id}      -> Stats
// - analysis:{activity_id}          -> AnalysisResult
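
A typical read path checks the cache, honours the entry's TTL, and falls back to recomputation on a miss. A sketch of that pattern against the struct above; CacheEntry's layout and the serde_json::Value payload are assumptions, not Pierre's actual code:

use std::time::Instant;

// Assumed entry layout -- illustrative only.
pub struct CacheEntry {
    pub value: serde_json::Value,
    pub inserted_at: Instant,
}

impl Cache {
    pub fn get(&self, key: &str) -> Option<serde_json::Value> {
        let mut lru = self.lru.lock().ok()?;
        let entry = lru.get(key)?;
        // Expired entries count as misses; eviction happens lazily on the next put
        if entry.inserted_at.elapsed() > self.default_ttl {
            return None;
        }
        Some(entry.value.clone())
    }

    pub fn put(&self, key: String, value: serde_json::Value) {
        if let Ok(mut lru) = self.lru.lock() {
            lru.put(key, CacheEntry { value, inserted_at: Instant::now() });
        }
    }
}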

3. Query Optimization

// Efficient pagination with a cursor-based approach
pub async fn list_activities_paginated(
    &self,
    user_id: Uuid,
    cursor: Option<&str>,
    limit: u32,
) -> Result<CursorPage<Activity>> {
    // Uses an indexed seek instead of OFFSET
    let mut rows = sqlx::query_as!(
        Activity,
        r#"
        SELECT * FROM activities
        WHERE user_id = ?1 AND id > ?2
        ORDER BY id
        LIMIT ?3
        "#,
        user_id,
        cursor.unwrap_or(""),
        limit + 1  // Fetch one extra row to detect has_more
    )
    .fetch_all(&self.pool)
    .await?;

    // The extra row only signals another page; it is not returned to the caller
    let has_more = rows.len() > limit as usize;
    if has_more {
        rows.truncate(limit as usize);
    }
    let next_cursor = if has_more {
        rows.last().map(|a| a.id.clone())
    } else {
        None
    };

    // CursorPage field names are assumed here for illustration (see below)
    Ok(CursorPage { items: rows, next_cursor, has_more })
}
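
The CursorPage type is not shown in this appendix; an assumed minimal shape, for illustration only:

// Assumed shape -- not Pierre's actual definition.
pub struct CursorPage<T> {
    pub items: Vec<T>,
    pub next_cursor: Option<String>, // id of the last returned row, passed back as the next ?2
    pub has_more: bool,
}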

4. Zero-Copy Serialization

use std::borrow::Cow;

// Use Cow<str> for strings that are usually borrowed
pub struct ActivityResponse<'a> {
    pub id: Cow<'a, str>,
    pub name: Cow<'a, str>,
    // Avoids cloning when data comes straight from the cache
}
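
Illustrative usage, assuming the response is built either from a cached activity (borrowed, no allocation) or from a freshly formatted string (owned). The CachedActivity type here is hypothetical:

struct CachedActivity {
    id: String,
    name: String,
}

fn response_from_cache(cached: &CachedActivity) -> ActivityResponse<'_> {
    ActivityResponse {
        id: Cow::Borrowed(cached.id.as_str()),     // points into the cache entry
        name: Cow::Borrowed(cached.name.as_str()), // no allocation on the hot path
    }
}

fn response_with_generated_name(cached: &CachedActivity) -> ActivityResponse<'_> {
    ActivityResponse {
        id: Cow::Borrowed(cached.id.as_str()),
        name: Cow::Owned(format!("Activity {}", cached.id)), // allocates only when needed
    }
}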

Benchmarking Guidelines

Running Benchmarks

# Install criterion
cargo install cargo-criterion

# Run all benchmarks
cargo criterion

# Run specific benchmark
cargo criterion --bench tss_calculation

# Save a baseline for later comparison
cargo criterion --bench tss_calculation -- --save-baseline main

Example Benchmark

// benches/tss_benchmark.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn tss_benchmark(c: &mut Criterion) {
    let activity = create_test_activity(3600); // 1 hour

    c.bench_function("tss_avg_power", |b| {
        b.iter(|| {
            TssAlgorithm::AvgPower.calculate(
                black_box(&activity),
                black_box(250.0),
                black_box(1.0),
            )
        })
    });

    c.bench_function("tss_normalized_power", |b| {
        b.iter(|| {
            TssAlgorithm::NormalizedPower { window_seconds: 30 }
                .calculate(
                    black_box(&activity),
                    black_box(250.0),
                    black_box(1.0),
                )
        })
    });
}

criterion_group!(benches, tss_benchmark);
criterion_main!(benches);
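
Note that Criterion benchmarks need the default test harness disabled for the bench target; the crate's Cargo.toml typically carries a [[bench]] entry for tss_benchmark with harness = false.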

Expected Results

| Benchmark | Expected Time | Acceptable Range |
|---|---|---|
| TSS (avg power) | 50 ns | < 100 ns |
| TSS (normalized) | 15 µs | < 50 µs |
| JWT validation | 100 µs | < 500 µs |
| Activity parse | 200 µs | < 1 ms |
| SQLite query | 500 µs | < 5 ms |

Production Monitoring

Key Metrics

// Prometheus metrics exposed at /metrics
counter!("pierre_requests_total", "method" => method, "status" => status);
histogram!("pierre_request_duration_seconds", "method" => method);
gauge!("pierre_active_connections");
gauge!("pierre_db_pool_connections");
counter!("pierre_provider_requests_total", "provider" => provider);
histogram!("pierre_provider_latency_seconds", "provider" => provider);
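
How the request metrics get recorded around a handler, sketched against the metrics crate's macro style in recent versions (0.22+), where the macros return handles; the wrapper function and its fixed "ok" status are illustrative, not Pierre's actual middleware:

use std::time::Instant;
use metrics::{counter, histogram};

async fn timed_request<F, Fut, T>(method: &'static str, handler: F) -> T
where
    F: FnOnce() -> Fut,
    Fut: std::future::Future<Output = T>,
{
    let start = Instant::now();
    let result = handler().await;

    // Count the request and record its duration, labelled by method
    counter!("pierre_requests_total", "method" => method, "status" => "ok").increment(1);
    histogram!("pierre_request_duration_seconds", "method" => method)
        .record(start.elapsed().as_secs_f64());

    result
}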

Alert Thresholds

| Metric | Warning | Critical |
|---|---|---|
| Request latency p99 | > 500ms | > 2s |
| Error rate | > 1% | > 5% |
| DB pool saturation | > 70% | > 90% |
| Memory usage | > 70% | > 90% |
| Provider latency p99 | > 2s | > 10s |

Profiling

CPU Profiling

# Using perf
perf record -g cargo run --release
perf report

# Using flamegraph
cargo install flamegraph
cargo flamegraph --bin pierre-mcp-server

Memory Profiling

# Using heaptrack
heaptrack cargo run --release
heaptrack_gui heaptrack.pierre-mcp-server.*.gz

# Using valgrind
valgrind --tool=massif ./target/release/pierre-mcp-server
ms_print massif.out.*

Key Takeaways

  1. Target latencies: Simple operations < 50ms, provider calls < 500ms.
  2. Algorithm efficiency: NP-TSS is O(n), use AvgPower-TSS for quick estimates.
  3. Memory footprint: ~50MB baseline, ~150KB per activity analysis.
  4. Connection pooling: 10 SQLite connections handle typical workloads.
  5. Cursor pagination: Avoids O(n) OFFSET performance degradation.
  6. LRU caching: Reduces provider API calls and computation.
  7. Prometheus metrics: Monitor latency, error rates, pool saturation.
  8. Benchmark before optimizing: Use criterion for reproducible measurements.

Related Chapters:

  • Chapter 20: Sports Science Algorithms (algorithm complexity)
  • Chapter 25: Deployment (production configuration)
  • Appendix E: Rate Limiting (quota management)