Module Structure ================ Sieve is organized as a multi-module Maven project with strict dependency rules. Each module has a single responsibility and well-defined API boundary. Module Diagram -------------- .. mermaid:: graph TD SERVER["sieve-server
Spring Boot REST API"] --> MATCH SERVER --> INGEST CLI["sieve-cli
Picocli CLI"] --> MATCH CLI --> INGEST BENCH["sieve-benchmark
JMH + stress tests"] --> MATCH BENCH --> INGEST MATCH["sieve-match
Matching engines"] --> CORE INGEST["sieve-ingest
List fetchers & parsers"] --> CORE ADDR["sieve-address
Address normalization"] --> CORE SERVER --> ADDR style CORE fill:#e1f5fe,stroke:#0288d1 style MATCH fill:#fff3e0,stroke:#f57c00 style INGEST fill:#e8f5e9,stroke:#388e3c style SERVER fill:#fce4ec,stroke:#c62828 style CLI fill:#f3e5f5,stroke:#7b1fa2 style BENCH fill:#fff8e1,stroke:#f9a825 style ADDR fill:#e0f2f1,stroke:#00897b Directory Layout ---------------- .. code-block:: text sieve-aml/ ├── sieve-core/ # Zero-dependency domain module ├── sieve-ingest/ # List fetchers and XML parsers ├── sieve-match/ # Matching engine implementations ├── sieve-address/ # Address normalization (libpostal) ├── sieve-server/ # Spring Boot REST API + JPA persistence ├── sieve-cli/ # Picocli command-line interface ├── sieve-benchmark/ # JMH microbenchmarks + HTTP stress tests ├── Dockerfile # Multi-stage Docker build ├── docker-compose.yml # Server + PostgreSQL └── pom.xml # Parent POM Module Details -------------- sieve-core ^^^^^^^^^^ **Zero compile dependencies** (beyond ``slf4j-api``). Contains: - **Domain model** — Java 21 records: ``SanctionedEntity``, ``NameInfo``, ``Address``, ``Identifier``, ``SanctionsProgram`` - **Enums** — ``EntityType``, ``ListSource``, ``NameType``, ``NameStrength``, ``ScriptType``, ``IdentifierType`` - **Index abstraction** — ``EntityIndex`` interface, ``InMemoryEntityIndex`` implementation - **Match SPI** — ``MatchEngine``, ``MatchResult``, ``ScreeningRequest`` All other modules depend on ``sieve-core``. sieve-ingest ^^^^^^^^^^^^ Fetches and parses sanctions lists from official sources. Contains: - **ListProvider SPI** — interface for sanctions list fetchers - **IngestionOrchestrator** — coordinates multi-source ingestion with reporting - **Provider implementations:** - ``OfacSdnProvider`` — OFAC SDN (StAX XML streaming parser, ETag delta detection) - ``EuConsolidatedProvider`` — EU Consolidated Financial Sanctions - ``UnConsolidatedProvider`` — UN Security Council Consolidated List - ``UkHmtProvider`` — UK HM Treasury sieve-match ^^^^^^^^^^^^ Matching engine implementations: - **ExactMatchEngine** — normalized exact string comparison - **FuzzyMatchEngine** — Jaro-Winkler similarity with threshold - **CompositeMatchEngine** — runs multiple engines, deduplicates, sorts - **NormalizedNameCache** — pre-computed normalized names per entity - **NgramIndex** — trigram inverted index for candidate selection - **JaroWinkler** — pure Java string similarity algorithm - **NameNormalizer** — string normalization with memoization sieve-address ^^^^^^^^^^^^^ Address normalization using `libpostal `_ (optional native dependency). Provides structured address parsing and normalization for address-based matching. sieve-server ^^^^^^^^^^^^^ Spring Boot 3.3 REST API: - **Controllers** — screening, list management, health - **DTOs** — request/response records with Bean Validation + OpenAPI schemas - **JPA persistence** — optional PostgreSQL storage via Flyway migrations - **Configuration** — ``@ConfigurationProperties`` with YAML binding - **Error handling** — RFC 7807 Problem Details via ``@RestControllerAdvice`` sieve-cli ^^^^^^^^^^ Standalone Picocli CLI (no Spring dependency): - **Commands** — ``fetch``, ``screen``, ``stats``, ``export`` - CI/CD-friendly exit codes (0 = clear, 1 = match, 2 = error) sieve-benchmark ^^^^^^^^^^^^^^^^ Performance testing: - **JMH microbenchmarks** — engine-level throughput measurement - **HTTP stress test** — high-concurrency load test against the REST API - Uses virtual threads for maximum concurrency Dependency Rules ---------------- 1. ``sieve-core`` has **zero compile dependencies** beyond SLF4J 2. Dependencies flow in one direction: ``server/cli → match → ingest → core`` 3. No dependency cycles between modules 4. ``sieve-match`` does not depend on ``sieve-ingest`` (decoupled via ``EntityIndex``) 5. ``sieve-server`` is the only module with Spring dependencies 6. ``sieve-cli`` is the only module with Picocli dependencies