Benchmarks¶
This page documents Sieve’s performance characteristics across three optimization phases, measured against a real OFAC SDN dataset (~20,000 entities).
Test Environment¶
Hardware: Apple Silicon (M-series), 16GB RAM
JVM: Java 21 (Eclipse Temurin)
Dataset: OFAC SDN (~20,000 sanctioned entities, ~100,000 names including aliases)
Tool: Custom HTTP stress test using virtual threads (
sieve-benchmark)
Optimization Evolution¶
Version |
Single-Thread Latency |
Peak Throughput |
Improvement |
|---|---|---|---|
v1 — Baseline (linear scan) |
590ms |
4 req/s |
— |
v2 — N-gram + name cache |
12ms |
435 req/s |
100× |
v3 — All optimizations |
7.5ms |
931 req/s |
230× |
Detailed Results¶
v1 — Baseline¶
Full linear scan with per-query normalization. Every query normalizes every entity name and runs Jaro-Winkler against all ~100,000 names.
═══ Phase 2: Sustained Load ═══
Concurrency Requests Throughput Avg (µs) P50 (µs) P99 (µs) Errors
200 100 4 18,007,406 17,784,685 22,290,343 0
Peak throughput: 4 req/sec
Avg latency: 18,007,406 µs (18 seconds)
v2 — N-gram Index + Name Cache¶
Pre-normalized names at index load time. Trigram inverted index reduces candidate set from 20,000 to ~50–200 entities per query.
═══ Phase 2: Sustained Load ═══
Concurrency Requests Throughput Avg (µs) P50 (µs) P99 (µs) Errors
200 100 435 131,320 125,504 225,753 0
Peak throughput: 435 req/sec
Avg latency: 131,320 µs
v3 — Full Optimization Suite¶
All optimizations applied: threshold-aware early exit, ThreadLocal array reuse, length-ratio pre-filter, memoized normalization, reduced logging.
═══ Phase 1: Ramp-Up ═══
Concurrency Requests Throughput Avg (µs) P50 (µs) P99 (µs) Errors
1 250 131 7,549 7,207 12,358 0
10 250 755 12,624 11,486 34,036 0
50 250 903 46,084 39,016 151,771 0
100 250 943 83,796 80,736 228,419 0
═══ Phase 2: Sustained Load ═══
Concurrency Requests Throughput Avg (µs) P50 (µs) P99 (µs) Errors
200 1000 931 175,530 119,844 621,418 0
═══ Phase 3: Threshold Sensitivity ═══
Threshold Throughput Avg (µs) P99 (µs)
0.70 1,042 76,814 292,844
0.80 894 84,522 364,805
0.85 899 92,824 379,863
0.90 1,109 71,145 266,972
Peak throughput: 931 req/sec
Avg latency: 175,530 µs (includes HTTP overhead)
Optimization Breakdown¶
Optimization |
Technique |
Impact |
|---|---|---|
Pre-normalized name cache |
|
Eliminates ~100k regex ops/query |
N-gram inverted index |
Trigram → entity ID lookup |
20,000 → ~50 candidates |
Threshold-aware early exit |
Length-ratio upper bound check |
Skips impossible comparisons |
ThreadLocal array reuse |
Reusable |
Zero GC pressure in hot path |
Length-ratio pre-filter |
Skip if name lengths differ >3× |
Further reduces candidate set |
Memoized normalization |
Cached |
No redundant regex for repeated queries |
Hot-path logging reduction |
|
Eliminates string formatting overhead |
CompositeEngine fast path |
Skip dedup for single engine |
Reduces HashMap allocation |
Where Time Is Spent (v3)¶
At 7.5ms single-thread latency, the matching engine itself accounts for <1ms. The remaining time is Spring Boot HTTP overhead:
Component |
Estimated Time |
|---|---|
JSON deserialization (Jackson) |
~1–2ms |
Bean validation ( |
~0.5ms |
Matching engine |
~0.4ms |
JSON serialization (response) |
~2–4ms |
HTTP/TCP + servlet dispatch |
~1–2ms |
Running Benchmarks¶
Build and run the stress test:
# Build the project
mvn clean package -DskipTests
# Start the server (in another terminal)
java -jar sieve-server/target/sieve-server-0.1.0-SNAPSHOT.jar
# Run the stress test
java -jar sieve-benchmark/target/sieve-benchmark-0.1.0-SNAPSHOT.jar stress \
--url=http://localhost:8080/api/v1/screen \
--requests=1000 \
--concurrency=200
Run JMH microbenchmarks (engine-level, no HTTP):
java -jar sieve-benchmark/target/sieve-benchmark-0.1.0-SNAPSHOT.jar jmh