Benchmarks

This page documents Sieve’s performance characteristics across three optimization phases, measured against a real OFAC SDN dataset (~20,000 entities).

Test Environment

  • Hardware: Apple Silicon (M-series), 16GB RAM

  • JVM: Java 21 (Eclipse Temurin)

  • Dataset: OFAC SDN (~20,000 sanctioned entities, ~100,000 names including aliases)

  • Tool: Custom HTTP stress test using virtual threads (sieve-benchmark)

Optimization Evolution

Version

Single-Thread Latency

Peak Throughput

Improvement

v1 — Baseline (linear scan)

590ms

4 req/s

v2 — N-gram + name cache

12ms

435 req/s

100×

v3 — All optimizations

7.5ms

931 req/s

230×

Detailed Results

v1 — Baseline

Full linear scan with per-query normalization. Every query normalizes every entity name and runs Jaro-Winkler against all ~100,000 names.

═══ Phase 2: Sustained Load ═══
Concurrency  Requests  Throughput  Avg (µs)     P50 (µs)     P99 (µs)     Errors
200          100       4           18,007,406   17,784,685   22,290,343   0

Peak throughput:  4 req/sec
Avg latency:      18,007,406 µs (18 seconds)

v2 — N-gram Index + Name Cache

Pre-normalized names at index load time. Trigram inverted index reduces candidate set from 20,000 to ~50–200 entities per query.

═══ Phase 2: Sustained Load ═══
Concurrency  Requests  Throughput  Avg (µs)  P50 (µs)  P99 (µs)  Errors
200          100       435         131,320   125,504   225,753   0

Peak throughput:  435 req/sec
Avg latency:      131,320 µs

v3 — Full Optimization Suite

All optimizations applied: threshold-aware early exit, ThreadLocal array reuse, length-ratio pre-filter, memoized normalization, reduced logging.

═══ Phase 1: Ramp-Up ═══
Concurrency  Requests  Throughput  Avg (µs)  P50 (µs)   P99 (µs)   Errors
1            250       131         7,549     7,207      12,358     0
10           250       755         12,624    11,486     34,036     0
50           250       903         46,084    39,016     151,771    0
100          250       943         83,796    80,736     228,419    0

═══ Phase 2: Sustained Load ═══
Concurrency  Requests  Throughput  Avg (µs)  P50 (µs)   P99 (µs)   Errors
200          1000      931         175,530   119,844    621,418    0

═══ Phase 3: Threshold Sensitivity ═══
Threshold  Throughput  Avg (µs)  P99 (µs)
0.70       1,042       76,814    292,844
0.80       894         84,522    364,805
0.85       899         92,824    379,863
0.90       1,109       71,145    266,972

Peak throughput:  931 req/sec
Avg latency:      175,530 µs (includes HTTP overhead)

Optimization Breakdown

Optimization

Technique

Impact

Pre-normalized name cache

ConcurrentHashMap per entity

Eliminates ~100k regex ops/query

N-gram inverted index

Trigram → entity ID lookup

20,000 → ~50 candidates

Threshold-aware early exit

Length-ratio upper bound check

Skips impossible comparisons

ThreadLocal array reuse

Reusable boolean[] in JaroWinkler

Zero GC pressure in hot path

Length-ratio pre-filter

Skip if name lengths differ >3×

Further reduces candidate set

Memoized normalization

Cached NameNormalizer results

No redundant regex for repeated queries

Hot-path logging reduction

log.infolog.debug

Eliminates string formatting overhead

CompositeEngine fast path

Skip dedup for single engine

Reduces HashMap allocation

Where Time Is Spent (v3)

At 7.5ms single-thread latency, the matching engine itself accounts for <1ms. The remaining time is Spring Boot HTTP overhead:

Component

Estimated Time

JSON deserialization (Jackson)

~1–2ms

Bean validation (@Valid)

~0.5ms

Matching engine

~0.4ms

JSON serialization (response)

~2–4ms

HTTP/TCP + servlet dispatch

~1–2ms

Running Benchmarks

Build and run the stress test:

# Build the project
mvn clean package -DskipTests

# Start the server (in another terminal)
java -jar sieve-server/target/sieve-server-0.1.0-SNAPSHOT.jar

# Run the stress test
java -jar sieve-benchmark/target/sieve-benchmark-0.1.0-SNAPSHOT.jar stress \
  --url=http://localhost:8080/api/v1/screen \
  --requests=1000 \
  --concurrency=200

Run JMH microbenchmarks (engine-level, no HTTP):

java -jar sieve-benchmark/target/sieve-benchmark-0.1.0-SNAPSHOT.jar jmh