Benchmarks

This page documents Sieve’s performance characteristics across three optimization versions (v1 to v3), measured against a real OFAC SDN dataset (~20,000 entities).

Test Environment

Hardware: Apple Silicon (M-series), 16GB RAM
JVM: Java 21 (Eclipse Temurin)
Dataset: OFAC SDN (~20,000 sanctioned entities, ~100,000 names including aliases)
Tool: Custom HTTP stress test using virtual threads (sieve-benchmark)
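
The stress tool's source is not shown on this page. As a rough illustration of how a virtual-thread load generator of this kind could work on Java 21, the sketch below caps in-flight requests with a semaphore and records per-request latency in microseconds. `LoadSketch`, `run`, and `percentile` are hypothetical names, and the nearest-rank percentile is an assumption about how P50/P99 might be computed, not a description of sieve-benchmark.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a virtual-thread load generator (requires Java 21).
final class LoadSketch {
    record Result(long ok, long errors, long[] latenciesMicros) {}

    // Runs `requests` calls of `request`, with at most `concurrency` in flight.
    static Result run(int requests, int concurrency, Callable<Boolean> request) {
        Semaphore inFlight = new Semaphore(concurrency);
        long[] latencies = new long[requests];
        AtomicLong ok = new AtomicLong();
        AtomicLong errors = new AtomicLong();

        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                final int slot = i;
                inFlight.acquireUninterruptibly(); // blocks while `concurrency` calls are running
                pool.submit(() -> {
                    long start = System.nanoTime();
                    try {
                        if (request.call()) ok.incrementAndGet();
                        else errors.incrementAndGet();
                    } catch (Exception e) {
                        errors.incrementAndGet();
                    } finally {
                        latencies[slot] = (System.nanoTime() - start) / 1_000;
                        inFlight.release();
                    }
                });
            }
        } // close() waits for every submitted task to finish
        return new Result(ok.get(), errors.get(), latencies);
    }

    // Nearest-rank percentile; expects at least one sample and 0 < pct <= 100.
    static long percentile(long[] latenciesMicros, double pct) {
        long[] sorted = latenciesMicros.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

In a real run the `Callable` would wrap an HTTP call (for example via `java.net.http.HttpClient`) against the screening endpoint and return whether the response was successful. The request body schema is not documented on this page, so it is left out here.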

Optimization Evolution

| Version | Single-Thread Latency | Peak Throughput | Throughput Improvement vs. v1 |
|---|---|---|---|
| v1: Baseline (linear scan) | 590 ms | 4 req/s | 1× |
| v2: N-gram + name cache | 12 ms | 435 req/s | ~100× |
| v3: All optimizations | 7.5 ms | 931 req/s | ~230× |

Detailed Results

v1: Baseline

Full linear scan with per-query normalization. Every query normalizes every entity name and runs Jaro-Winkler against all ~100,000 names.

═══ Phase 2: Sustained Load ═══
Concurrency  Requests  Throughput  Avg (µs)     P50 (µs)     P99 (µs)     Errors
200          100       4           18,007,406   17,784,685   22,290,343   0

Peak throughput:  4 req/sec
Avg latency:      18,007,406 µs (18 seconds)

v2: N-gram Index + Name Cache

Names are normalized once at index load time instead of on every query. A trigram inverted index then reduces the candidate set from ~20,000 entities to roughly 50–200 per query.
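
To make the candidate-reduction idea concrete, here is a minimal sketch of a trigram inverted index. It is not Sieve's implementation: `TrigramIndex`, `add`, `candidates`, and the `minShared` cutoff are hypothetical, and names are assumed to be normalized before they reach the index.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: maps each trigram to the IDs of entities whose names contain it.
final class TrigramIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Distinct 3-character windows of a name; names shorter than 3 chars are kept whole.
    static Set<String> trigrams(String normalized) {
        Set<String> grams = new HashSet<>();
        if (normalized.length() < 3) {
            if (!normalized.isEmpty()) grams.add(normalized);
            return grams;
        }
        for (int i = 0; i + 3 <= normalized.length(); i++) {
            grams.add(normalized.substring(i, i + 3));
        }
        return grams;
    }

    // Called once per name (primary or alias) at index load time.
    void add(int entityId, String normalizedName) {
        for (String gram : trigrams(normalizedName)) {
            postings.computeIfAbsent(gram, k -> new HashSet<>()).add(entityId);
        }
    }

    // Returns entities sharing at least `minShared` distinct trigrams with the query.
    Set<Integer> candidates(String normalizedQuery, int minShared) {
        Map<Integer, Integer> hits = new HashMap<>();
        for (String gram : trigrams(normalizedQuery)) {
            for (int id : postings.getOrDefault(gram, Set.of())) {
                hits.merge(id, 1, Integer::sum);
            }
        }
        Set<Integer> result = new HashSet<>();
        for (Map.Entry<Integer, Integer> e : hits.entrySet()) {
            if (e.getValue() >= minShared) result.add(e.getKey());
        }
        return result;
    }
}
```

Only the entities returned by `candidates` would then be scored with Jaro-Winkler, which is where the drop from a full scan to a few dozen or a few hundred comparisons comes from.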

═══ Phase 2: Sustained Load ═══
Concurrency  Requests  Throughput  Avg (µs)  P50 (µs)  P99 (µs)  Errors
200          100       435         131,320   125,504   225,753   0

Peak throughput:  435 req/sec
Avg latency:      131,320 µs

v3: Full Optimization Suite

All optimizations applied: threshold-aware early exit, ThreadLocal array reuse, length-ratio pre-filter, memoized normalization, reduced logging.
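
As an illustration of the ThreadLocal array reuse mentioned above, the following sketch keeps one scratch `boolean[]` per thread and clears only the prefix that the next comparison will use. `MatchScratch` and `flags` are hypothetical names, and the initial capacity of 64 is an arbitrary assumption. Sieve's actual JaroWinkler code is not shown on this page.

```java
import java.util.Arrays;

// Hypothetical sketch: per-thread scratch buffer for Jaro-Winkler match flags.
final class MatchScratch {
    private static final ThreadLocal<boolean[]> FLAGS =
            ThreadLocal.withInitial(() -> new boolean[64]);

    // Returns a buffer with at least `length` slots, the first `length` of them false.
    static boolean[] flags(int length) {
        boolean[] buf = FLAGS.get();
        if (buf.length < length) {
            // Grow rarely; a fresh array is already all false.
            buf = new boolean[Math.max(length, buf.length * 2)];
            FLAGS.set(buf);
        } else {
            // Reuse: reset only the part the caller will read.
            Arrays.fill(buf, 0, length, false);
        }
        return buf;
    }
}
```

The point of the pattern is that a comparison in the hot path allocates nothing once the buffer has grown to the longest name seen on that thread. A caller must finish with the buffer before requesting it again on the same thread.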

═══ Phase 1: Ramp-Up ═══
Concurrency  Requests  Throughput  Avg (µs)  P50 (µs)   P99 (µs)   Errors
1            250       131         7,549     7,207      12,358     0
10           250       755         12,624    11,486     34,036     0
50           250       903         46,084    39,016     151,771    0
100          250       943         83,796    80,736     228,419    0

═══ Phase 2: Sustained Load ═══
Concurrency  Requests  Throughput  Avg (µs)  P50 (µs)   P99 (µs)   Errors
200          1000      931         175,530   119,844    621,418    0

═══ Phase 3: Threshold Sensitivity ═══
Threshold  Throughput  Avg (µs)  P99 (µs)
0.70       1,042       76,814    292,844
0.80       894         84,522    364,805
0.85       899         92,824    379,863
0.90       1,109       71,145    266,972

Peak throughput:  931 req/sec
Avg latency:      175,530 µs (includes HTTP overhead)

Optimization Breakdown

| Optimization | Technique | Impact |
|---|---|---|
| Pre-normalized name cache | `ConcurrentHashMap` per entity | Eliminates ~100k regex ops/query |
| N-gram inverted index | Trigram → entity ID lookup | 20,000 → ~50–200 candidates per query |
| Threshold-aware early exit | Length-ratio upper bound check | Skips impossible comparisons |
| ThreadLocal array reuse | Reusable `boolean[]` in JaroWinkler | Zero GC pressure in hot path |
| Length-ratio pre-filter | Skip if name lengths differ >3× | Further reduces candidate set |
| Memoized normalization | Cached `NameNormalizer` results | No redundant regex for repeated queries |
| Hot-path logging reduction | `log.info` → `log.debug` | Eliminates string formatting overhead |
| CompositeEngine fast path | Skip dedup for single engine | Reduces HashMap allocation |
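
The length-ratio upper bound behind the threshold-aware early exit can be worked out from the Jaro formula: the number of matched characters cannot exceed the shorter length, and transpositions can only lower the score, so Jaro is at most (2 + min/max) / 3. The sketch below turns that into a Jaro-Winkler bound, assuming the standard Winkler parameters (prefix scale 0.1, prefix capped at 4 characters). `LengthBound` and its methods are hypothetical names, and Sieve's exact check may differ.

```java
// Hypothetical sketch: best Jaro-Winkler score two strings could reach, given only their lengths.
final class LengthBound {
    static final double PREFIX_SCALE = 0.1; // standard Winkler scaling factor (assumed)
    static final int MAX_PREFIX = 4;        // standard Winkler prefix cap (assumed)

    static double jaroWinklerUpperBound(int lenA, int lenB) {
        if (lenA == 0 || lenB == 0) return (lenA == lenB) ? 1.0 : 0.0;
        int min = Math.min(lenA, lenB);
        int max = Math.max(lenA, lenB);
        // Best case: every char of the shorter string matches, with no transpositions.
        double jaro = (2.0 + (double) min / max) / 3.0;
        // Best case: the common prefix is as long as allowed.
        int prefix = Math.min(MAX_PREFIX, min);
        return jaro + prefix * PREFIX_SCALE * (1.0 - jaro);
    }

    // False means the pair cannot reach `threshold`, so the full comparison can be skipped.
    static boolean canReach(int lenA, int lenB, double threshold) {
        return jaroWinklerUpperBound(lenA, lenB) >= threshold;
    }
}
```

For example, a 3-character name against a 12-character name is bounded by 0.75 + 3 × 0.1 × 0.25 = 0.825, so at a threshold of 0.85 the pair can be rejected without running the comparison at all.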

Where Time Is Spent (v3)

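At 7.5 ms single-thread latency, the matching engine itself accounts for less than 1 ms. The rest is request-handling overhead in the Spring Boot stack (JSON, validation, HTTP). The estimates below sum to roughly 5–9 ms, which brackets the measured 7.5 ms: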

| Component | Estimated Time |
|---|---|
| JSON deserialization (Jackson) | ~1–2 ms |
| Bean validation (`@Valid`) | ~0.5 ms |
| Matching engine | ~0.4 ms |
| JSON serialization (response) | ~2–4 ms |
| HTTP/TCP + servlet dispatch | ~1–2 ms |

Running Benchmarks

Build and run the stress test:

# Build the project
mvn clean package -DskipTests

# Start the server (in another terminal)
java -jar sieve-server/target/sieve-server-0.1.0-SNAPSHOT.jar

# Run the stress test
java -jar sieve-benchmark/target/sieve-benchmark-0.1.0-SNAPSHOT.jar stress \
  --url=http://localhost:8080/api/v1/screen \
  --requests=1000 \
  --concurrency=200

Run JMH microbenchmarks (engine-level, no HTTP):

java -jar sieve-benchmark/target/sieve-benchmark-0.1.0-SNAPSHOT.jar jmh