Java API Reference

Auto-generated API documentation from Javadoc comments using Doxygen and Breathe.

Note

This section is generated automatically from the source code. To rebuild, run doxygen Doxyfile from the project root, then make html from the docs/ directory.

Core Domain Model

Warning

doxygenclass: Cannot find the following classes in doxygen xml output for project “sieve” from directory: /home/runner/work/sieve-aml/sieve-aml/docs/../doxygen/xml

  • dev::sieve::core::model::SanctionedEntity

  • dev::sieve::core::model::NameInfo

  • dev::sieve::core::model::Address

  • dev::sieve::core::model::Identifier

  • dev::sieve::core::model::SanctionsProgram

Index

interface EntityIndex

Abstraction over a store of SanctionedEntity instances.

Implementations must be thread-safe. The index supports bulk loading via addAll, individual inserts via add, and various query methods.

Subclassed by dev.sieve.core.index.InMemoryEntityIndex

Public Functions

void addAll(Collection<SanctionedEntity> entities)

Adds all entities in the given collection to the index.

Parameters:

entities – the entities to add, must not be null

void add(SanctionedEntity entity)

Adds a single entity to the index.

Parameters:

entity – the entity to add, must not be null

void clear()

Removes all entities from the index.

int size()

Returns the total number of entities in the index.

Returns:

entity count, always non-negative

Collection<SanctionedEntity> all()

Returns an unmodifiable view of all entities in the index.

Returns:

all entities, never null

Collection<SanctionedEntity> findBySource(ListSource source)

Returns all entities from a specific sanctions list source.

Parameters:

source – the list source to filter by, must not be null

Returns:

matching entities, never null

Optional<SanctionedEntity> findById(String id)

Looks up a single entity by its source-specific ID.

Parameters:

id – the entity ID, must not be null

Returns:

the entity if found, or empty

IndexStats stats()

Returns statistical information about the index contents.

Returns:

current index statistics, never null

class InMemoryEntityIndex : public EntityIndex

Thread-safe, in-memory implementation of EntityIndex.

Backed by a ConcurrentHashMap keyed on entity ID, with a secondary index by ListSource for efficient filtered queries. All mutating operations are safe for concurrent access from multiple threads.
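The two-level layout described above (a primary map keyed on entity ID plus a secondary index by source) could look roughly like the following. This is a minimal sketch, not the actual InMemoryEntityIndex: SketchEntity and TwoLevelIndexSketch are hypothetical stand-ins, and the source is modelled as a plain String rather than the real ListSource type.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for SanctionedEntity; the real type is not shown in this reference. */
record SketchEntity(String id, String source, String primaryName) {}

/** Illustrative index: primary map by entity ID, secondary map from source to entity IDs. */
class TwoLevelIndexSketch {
    private final Map<String, SketchEntity> byId = new ConcurrentHashMap<>();
    private final Map<String, Set<String>> idsBySource = new ConcurrentHashMap<>();

    void add(SketchEntity entity) {
        SketchEntity previous = byId.put(entity.id(), entity);
        // If the ID was re-registered under another source, drop the stale secondary entry.
        if (previous != null && !previous.source().equals(entity.source())) {
            Set<String> stale = idsBySource.get(previous.source());
            if (stale != null) {
                stale.remove(previous.id());
            }
        }
        idsBySource
            .computeIfAbsent(entity.source(), s -> ConcurrentHashMap.newKeySet())
            .add(entity.id());
    }

    Optional<SketchEntity> findById(String id) {
        return Optional.ofNullable(byId.get(id));
    }

    Collection<SketchEntity> findBySource(String source) {
        // Only the IDs registered for this source are visited, not the whole index.
        return idsBySource.getOrDefault(source, Set.of()).stream()
            .map(byId::get)
            .filter(e -> e != null)
            .toList();
    }

    int size() {
        return byId.size();
    }

    Collection<SketchEntity> all() {
        return Collections.unmodifiableCollection(byId.values());
    }
}
```

In this sketch each map is individually safe for concurrent use, but the two maps are not updated atomically, so a reader may briefly observe an entity in one map and not the other. A production implementation may need stronger coordination.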

Public Functions

inline InMemoryEntityIndex()

Creates a new, empty in-memory entity index.

inline void addAll(Collection<SanctionedEntity> entities)

Adds all entities in the given collection to the index.

Parameters:

entities – the entities to add, must not be null

inline void add(SanctionedEntity entity)

Adds a single entity to the index.

Parameters:

entity – the entity to add, must not be null

inline void clear()

Removes all entities from the index.

inline int size()

Returns the total number of entities in the index.

Returns:

entity count, always non-negative

inline Collection<SanctionedEntity> all()

Returns an unmodifiable view of all entities in the index.

Returns:

all entities, never null

inline Collection<SanctionedEntity> findBySource(ListSource source)

Returns all entities from a specific sanctions list source.

Parameters:

source – the list source to filter by, must not be null

Returns:

matching entities, never null

inline Optional<SanctionedEntity> findById(String id)

Looks up a single entity by its source-specific ID.

Parameters:

id – the entity ID, must not be null

Returns:

the entity if found, or empty

inline IndexStats stats()

Returns statistical information about the index contents.

Returns:

current index statistics, never null

Match Engine

interface MatchEngine

Service Provider Interface for sanctions screening match engines.

Implementations encapsulate a specific matching algorithm (e.g., exact match, fuzzy match, phonetic match) and produce scored results against the entities in an EntityIndex.

Public Functions

List<MatchResult> screen(ScreeningRequest request, EntityIndex index)

Screens the given request against all applicable entities in the index.

Results are filtered by the request’s threshold and optional entity type / source filters, then returned in descending score order.

Parameters:
  • request – the screening request containing the name and filters

  • index – the entity index to screen against

Returns:

matching results sorted by score descending, never null

Warning

doxygenclass: Cannot find the following classes in doxygen xml output for project “sieve” from directory: /home/runner/work/sieve-aml/sieve-aml/docs/../doxygen/xml

  • dev::sieve::core::match::MatchResult

  • dev::sieve::core::match::ScreeningRequest

class FuzzyMatchEngine : public MatchEngine

Match engine that uses Jaro-Winkler fuzzy string similarity.

Compares the screening query against each entity’s primary name and all aliases, keeping the best (highest) score per entity. Results below the request’s threshold are discarded.

Public Functions

inline FuzzyMatchEngine(NormalizedNameCache nameCache, NgramIndex ngramIndex)

Creates a fuzzy match engine with shared name cache and n-gram index.

inline FuzzyMatchEngine(NormalizedNameCache nameCache)

Creates a fuzzy match engine with a shared name cache (no n-gram filtering).

inline FuzzyMatchEngine()

Creates a fuzzy match engine with its own name cache and n-gram index.

class ExactMatchEngine : public MatchEngine

Match engine that performs exact (post-normalization) name comparison.

Names are normalized by lowercasing, trimming, and collapsing whitespace before comparison. Checks the entity’s primary name and all aliases. Produces a score of 1.0 for exact matches and 0.0 otherwise.
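The normalization steps listed above (lowercasing, trimming, collapsing whitespace) followed by an all-or-nothing comparison can be sketched as below. NameNormalizerSketch and its methods are hypothetical names for illustration; the real engine may treat Unicode whitespace, locale, or null input differently.

```java
import java.util.Locale;

/** Illustrative normalizer; not the library's actual implementation. */
class NameNormalizerSketch {

    /** Strips the ends, collapses whitespace runs to one space, and lowercases. */
    static String normalize(String raw) {
        if (raw == null) {
            return ""; // assumption: null is treated as an empty name
        }
        return raw.strip()
            .replaceAll("\\s+", " ")
            .toLowerCase(Locale.ROOT);
    }

    /** Returns 1.0 when the normalized forms are identical, 0.0 otherwise. */
    static double exactScore(String query, String candidate) {
        return normalize(query).equals(normalize(candidate)) ? 1.0 : 0.0;
    }
}
```

Using Locale.ROOT keeps lowercasing independent of the JVM's default locale, which matters for names containing characters such as the Turkish dotted and dotless I.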

Public Functions

inline ExactMatchEngine(NormalizedNameCache nameCache, NgramIndex ngramIndex)

Creates an exact match engine with shared name cache and n-gram index.

inline ExactMatchEngine(NormalizedNameCache nameCache)

Creates an exact match engine with a shared name cache (no n-gram filtering).

inline ExactMatchEngine()

Creates an exact match engine with its own name cache and n-gram index.

class CompositeMatchEngine : public MatchEngine

Composite match engine that delegates to multiple underlying engines.

Runs all registered engines, deduplicates results by entity ID, and keeps the highest score per entity. The final result list is sorted by score in descending order.

Public Functions

inline CompositeMatchEngine(List<MatchEngine> engines)

Creates a composite engine delegating to the given engines.

Parameters:

engines – the match engines to delegate to, must not be null or empty

Throws:
  • NullPointerException – if engines is null

  • IllegalArgumentException – if engines is empty
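The merge behaviour described for the composite engine (deduplicate by entity ID, keep the highest score, sort descending) might be sketched as follows. ScoredHit is a hypothetical stand-in for MatchResult, whose real fields are not documented here, and BestScoreMergeSketch is an illustrative name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for MatchResult: just an entity ID and a score. */
record ScoredHit(String entityId, double score) {}

/** Illustrative merge of per-engine result lists. */
class BestScoreMergeSketch {

    static List<ScoredHit> merge(List<List<ScoredHit>> perEngineResults) {
        Map<String, ScoredHit> bestById = new HashMap<>();
        for (List<ScoredHit> results : perEngineResults) {
            for (ScoredHit hit : results) {
                // Keep whichever hit has the higher score for this entity ID.
                bestById.merge(hit.entityId(), hit,
                    (a, b) -> a.score() >= b.score() ? a : b);
            }
        }
        List<ScoredHit> merged = new ArrayList<>(bestById.values());
        merged.sort(Comparator.comparingDouble(ScoredHit::score).reversed());
        return merged;
    }
}
```

With this approach an exact hit scored 1.0 by one engine always outranks a lower fuzzy score for the same entity from another engine.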

class NgramIndex

Trigram-based inverted index for fast candidate selection.

At build time, every entity’s normalized primary name and alias names are decomposed into overlapping 3-character trigrams. An inverted map is constructed from each trigram to the set of entity IDs containing that trigram.

At query time, the query string is decomposed into trigrams, the candidate entity IDs are collected by trigram overlap, and only entities sharing a minimum fraction of trigrams are returned. This typically reduces the candidate set from tens of thousands to tens of entities.

Thread-safe. Automatically rebuilds when the underlying index size changes.
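The build-time and query-time steps above can be illustrated with a small trigram inverted index. This is a simplified sketch under stated assumptions: TrigramIndexSketch is a hypothetical class, it returns entity IDs rather than entities, the overlap ratio is measured against the query's trigram count, and it is neither thread-safe nor self-rebuilding, unlike the NgramIndex described here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative trigram inverted index; single-threaded for brevity. */
class TrigramIndexSketch {
    private final Map<String, Set<String>> idsByTrigram = new HashMap<>();

    /** Splits a string into its distinct overlapping 3-character substrings. */
    static Set<String> trigrams(String s) {
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + 3 <= s.length(); i++) {
            grams.add(s.substring(i, i + 3));
        }
        return grams;
    }

    /** Build time: register every trigram of the name under the entity ID. */
    void index(String entityId, String normalizedName) {
        for (String gram : trigrams(normalizedName)) {
            idsByTrigram.computeIfAbsent(gram, k -> new HashSet<>()).add(entityId);
        }
    }

    /** Query time: count shared trigrams per entity, filter by ratio, rank by count. */
    List<String> candidates(String normalizedQuery, double minOverlap) {
        Set<String> queryGrams = trigrams(normalizedQuery);
        if (queryGrams.isEmpty()) {
            return List.of();
        }
        Map<String, Integer> shared = new HashMap<>();
        for (String gram : queryGrams) {
            for (String id : idsByTrigram.getOrDefault(gram, Set.of())) {
                shared.merge(id, 1, Integer::sum);
            }
        }
        return shared.entrySet().stream()
            .filter(e -> (double) e.getValue() / queryGrams.size() >= minOverlap)
            .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
            .map(Map.Entry::getKey)
            .toList();
    }
}
```

For example, the query "jon smith" has 7 trigrams and shares 5 of them with "john smith" but only 1 with "jane smythe", so a minimum overlap of 0.5 keeps the first name and drops the second before any more expensive similarity scoring runs.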

Public Functions

inline void ensureBuilt(EntityIndex index, NormalizedNameCache nameCache)

Ensures the index is built and up-to-date for the given entity index.

Parameters:
  • index – the entity index

  • nameCache – the pre-normalized name cache (must already be built)

inline Collection<SanctionedEntity> candidates(String normalizedQuery)

Returns candidate entities whose names share trigrams with the given normalized query.

Candidates are ranked by the number of shared trigrams and filtered by a minimum overlap ratio. The result is a subset of all entities — typically 1%.

Parameters:

normalizedQuery – the query string, already normalized

Returns:

candidate entities, never null

inline int size()

Returns the total number of indexed entities.

class NormalizedNameCache

Cache of pre-normalized entity names for use by match engines.

Normalizing names (lowercasing, trimming, collapsing whitespace) is expensive when repeated for every entity on every query. This cache pre-computes normalized forms once when entities are loaded and serves them on subsequent lookups, eliminating redundant work.

Thread-safe. Automatically rebuilds when the index size changes (indicating new data).

Public Functions

inline record NormalizedEntry(String primaryName, List<String> aliases)

Pre-normalized names for a single entity.

Parameters:
  • primaryName – the normalized primary name

  • aliases – the normalized alias names, in the same order as the entity’s alias list

inline void ensureBuilt(EntityIndex index)

Ensures the cache is built and up-to-date for the given index.

If the index size has changed since the last build, the cache is rebuilt. This method should be called once at the start of each screening operation.

Parameters:

index – the entity index to cache names for

inline NormalizedEntry get(SanctionedEntity entity)

Returns the pre-normalized names for the given entity, computing on cache miss.

Parameters:

entity – the entity to look up

Returns:

the pre-normalized entry, never null

inline void invalidate()

Clears the cache, forcing a full rebuild on the next ensureBuilt call.
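The rebuild-on-size-change behaviour of ensureBuilt and invalidate can be sketched as below. SizeTriggeredCacheSketch is a hypothetical class that caches one normalized name per ID from a plain map; the real cache works on EntityIndex and NormalizedEntry, and its locking strategy is not documented here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative cache that rebuilds only when the source size changes. */
class SizeTriggeredCacheSketch {
    private final Map<String, String> normalizedById = new HashMap<>();
    private int builtForSize = -1; // -1 means "never built" or "invalidated"
    private int rebuilds = 0;

    /** Rebuilds the cache if the source size differs from the last build. */
    synchronized void ensureBuilt(Map<String, String> namesById) {
        if (namesById.size() == builtForSize) {
            return; // same size as last build: assume nothing changed
        }
        normalizedById.clear();
        namesById.forEach((id, name) -> normalizedById.put(
            id, name.strip().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT)));
        builtForSize = namesById.size();
        rebuilds++;
    }

    synchronized String get(String id) {
        return normalizedById.get(id);
    }

    /** Forces the next ensureBuilt call to rebuild regardless of size. */
    synchronized void invalidate() {
        normalizedById.clear();
        builtForSize = -1;
    }

    synchronized int rebuildCount() {
        return rebuilds;
    }
}
```

A size comparison is cheap, but in this sketch it cannot detect a reload that replaces entities while leaving the count unchanged. Calling invalidate after such a reload is one way to cover that case.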

Algorithms

class JaroWinkler

Pure-Java implementation of the Jaro-Winkler string similarity algorithm.

Jaro-Winkler is a string metric commonly used in record linkage and name matching. It produces a similarity score between 0.0 (no similarity) and 1.0 (exact match), with a prefix bonus that favors strings sharing a common prefix.

This implementation follows the original Winkler (1990) formulation with a default prefix scaling factor of 0.1 and a maximum prefix length of 4 characters.
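For orientation, a compact version of the standard algorithm with the parameters stated above (prefix scale 0.1, prefix length capped at 4) is sketched below. JaroWinklerSketch is a hypothetical class, not the library's source; in particular, treating null as empty and scoring two empty strings as 1.0 are assumptions of this sketch.

```java
/** Illustrative Jaro-Winkler; edge-case handling is an assumption of this sketch. */
class JaroWinklerSketch {

    static double jaro(String s1, String s2) {
        int len1 = s1.length();
        int len2 = s2.length();
        if (len1 == 0 && len2 == 0) return 1.0; // assumption: two empty strings match
        if (len1 == 0 || len2 == 0) return 0.0;

        // Characters match only if equal and within this distance of each other.
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];
        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(len2 - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matched2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matched1[i] = true;
                    matched2[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;

        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) continue;
            while (!matched2[k]) k++;
            if (s1.charAt(i) != s2.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        return (m / len1 + m / len2 + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    static double similarity(String s1, String s2) {
        if (s1 == null) s1 = ""; // assumption: null is treated as empty
        if (s2 == null) s2 = "";
        double jaro = jaro(s1, s2);
        int maxPrefix = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < maxPrefix && s1.charAt(prefix) == s2.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }
}
```

As a worked example, "MARTHA" and "MARHTA" have 6 matching characters with one transposition, giving a Jaro score of about 0.944. The shared 3-character prefix then lifts the Jaro-Winkler score to about 0.961.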

Public Static Functions

static inline double similarity(String s1, String s2)

Computes the Jaro-Winkler similarity between two strings.

Both strings are compared as-is (no normalization is applied). Callers should pre-process strings (e.g., lowercasing, trimming) before invoking this method if case-insensitive comparison is desired.

Parameters:
  • s1 – the first string, may be null or empty

  • s2 – the second string, may be null or empty

Returns:

similarity score in the range [0.0, 1.0]

static inline double similarityWithThreshold(String s1, String s2, double threshold)

Computes the Jaro-Winkler similarity, returning 0.0 early if it cannot meet the threshold.

Parameters:
  • s1 – the first string

  • s2 – the second string

  • threshold – minimum score; returns 0.0 if the result cannot meet this

Returns:

similarity score, or 0.0 if below threshold
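How the early return is decided is not documented here. One possible approach, shown purely as an assumption, is a length-based upper bound: the Jaro score of two strings cannot exceed (2 + min/max) / 3, where min and max are the two lengths, and the prefix bonus adds at most 0.1 per shared prefix character, up to 4. ThresholdPruningSketch and its methods are hypothetical, and the full metric is passed in as a function so the block stays self-contained.

```java
import java.util.function.ToDoubleBiFunction;

/** Illustrative length-based pruning; the real method may use a different test. */
class ThresholdPruningSketch {

    /** Highest Jaro-Winkler score any two strings of these lengths could reach. */
    static double upperBound(int len1, int len2) {
        int min = Math.min(len1, len2);
        int max = Math.max(len1, len2);
        if (max == 0) {
            return 1.0; // two empty strings: nothing to prune on
        }
        // At best every character of the shorter string matches with no transpositions.
        double jaroBound = (2.0 + (double) min / max) / 3.0;
        int maxPrefix = Math.min(4, min);
        return jaroBound + maxPrefix * 0.1 * (1.0 - jaroBound);
    }

    /** Skips the full computation when the bound already falls short of the threshold. */
    static double similarityWithThreshold(String s1, String s2, double threshold,
                                          ToDoubleBiFunction<String, String> fullMetric) {
        if (upperBound(s1.length(), s2.length()) < threshold) {
            return 0.0;
        }
        double score = fullMetric.applyAsDouble(s1, s2);
        return score >= threshold ? score : 0.0;
    }
}
```

For a 2-character query against a 20-character name the bound is about 0.76, so with a threshold of 0.85 the pair is rejected without running the character-matching loops at all.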

Ingestion

interface ListProvider

Service Provider Interface for fetching and parsing a specific sanctions list.

Each implementation is responsible for a single ListSource: downloading the raw data, parsing it into SanctionedEntity records, and tracking metadata for delta detection.

Subclassed by dev.sieve.ingest.eu.EuConsolidatedProvider, dev.sieve.ingest.ofac.OfacSdnProvider, dev.sieve.ingest.uk.UkHmtProvider, dev.sieve.ingest.un.UnConsolidatedProvider

Public Functions

ListSource source()

Returns the sanctions list source this provider handles.

Returns:

the list source, never null

ListMetadata metadata()

Returns metadata from the most recent successful fetch.

Returns:

the metadata, never null

List<SanctionedEntity> fetch()

Fetches and parses the sanctions list into normalized entities.

Throws:

ListIngestionException – if fetching or parsing fails

Returns:

the parsed entities, never null

boolean hasUpdates(ListMetadata previousMetadata)

Checks whether the remote list has been updated since the given metadata snapshot.

Implementations should use lightweight mechanisms such as HTTP response headers (for example ETag) to avoid downloading the full file.

Parameters:

previousMetadata – metadata from a prior fetch to compare against

Returns:

true if the remote list has changed, false otherwise
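One lightweight check of the kind hasUpdates calls for is an HTTP HEAD request that compares the server's current ETag with the one recorded at the last fetch. The sketch below uses the JDK's java.net.http client. EtagCheckSketch, its method names, and the idea that the previous ETag is available as a plain String are assumptions, since the fields of ListMetadata are not documented here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

/** Illustrative ETag-based update check; not a ListProvider implementation. */
class EtagCheckSketch {

    /** Issues a HEAD request (no body is downloaded) and returns the ETag, if any. */
    static Optional<String> fetchEtag(HttpClient client, URI listUri)
            throws IOException, InterruptedException {
        HttpRequest head = HttpRequest.newBuilder(listUri)
            .method("HEAD", HttpRequest.BodyPublishers.noBody())
            .build();
        HttpResponse<Void> response =
            client.send(head, HttpResponse.BodyHandlers.discarding());
        return response.headers().firstValue("ETag");
    }

    /** Treats a missing ETag on either side as "changed" to stay on the safe side. */
    static boolean changed(String previousEtag, Optional<String> currentEtag) {
        if (previousEtag == null || currentEtag.isEmpty()) {
            return true;
        }
        return !currentEtag.get().equals(previousEtag);
    }
}
```

Reporting a change when no ETag is available costs an extra download, but it avoids silently missing an update to a sanctions list.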

class IngestionOrchestrator

Orchestrates the ingestion of sanctions lists from all registered ListProviders.

Runs each provider, merges results into the EntityIndex, and produces a detailed IngestionReport. Supports both full and selective (source-filtered) ingestion runs.

Public Functions

inline IngestionOrchestrator(List<ListProvider> providers)

Creates an orchestrator with the given list of providers.

Parameters:

providers – the providers to orchestrate, must not be null or empty

Throws:
  • NullPointerException – if providers is null

  • IllegalArgumentException – if providers is empty

inline IngestionReport ingest(EntityIndex index)

Runs all registered providers and loads their entities into the given index.

Parameters:

index – the entity index to populate

Returns:

an ingestion report summarizing the results

inline IngestionReport ingest(EntityIndex index, Set<ListSource> sources)

Runs only the providers matching the given sources and loads their entities into the index.

Providers not in the set are reported as ProviderResult.Status.SKIPPED.

Parameters:
  • index – the entity index to populate

  • sources – the sources to include, or null to run all providers

Returns:

an ingestion report summarizing the results
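The source-filtered run can be pictured as a loop that either executes a provider or records it as skipped. The sketch below is a simplification under stated assumptions: SelectiveRunSketch is hypothetical, providers are modelled as Runnable values keyed by a source name, a null filter is taken to mean "run everything", and COMPLETED is an invented status name (only SKIPPED appears in this reference).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative source-filtered run; error handling and reporting are omitted. */
class SelectiveRunSketch {

    /** SKIPPED mirrors the documented status; COMPLETED is a placeholder name. */
    enum Status { COMPLETED, SKIPPED }

    static Map<String, Status> run(Map<String, Runnable> providersBySource,
                                   Set<String> sources) {
        Map<String, Status> report = new LinkedHashMap<>();
        providersBySource.forEach((source, provider) -> {
            boolean selected = sources == null || sources.contains(source);
            if (selected) {
                provider.run();
                report.put(source, Status.COMPLETED);
            } else {
                // Unselected providers still appear in the report.
                report.put(source, Status.SKIPPED);
            }
        });
        return report;
    }
}
```

Recording skipped providers explicitly means a report always lists every registered source, which makes it easier to tell "not requested" apart from "not configured".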

inline ListMetadata getMetadata(ListSource source)

Returns cached metadata for the given source from the last successful fetch.

Parameters:

source – the list source

Returns:

the metadata, or null if the source has not been fetched