Architecture Review · Deepening opportunities

biometry-fraud-decision

Post spec-first refactor (branch refactor/spec-first-codegen). The transport rewrite is clean — handlers are thin by design and the spec is the source of truth. The friction that remains is in the decision core (leaky voice seam, stringly-typed signal contract) and in display semantics smeared across views. This report surfaces deepenings, not bugs.

Deep

lots of behaviour behind a small interface — high leverage

Shallow

interface nearly as complex as the implementation

Seam

where behaviour can change without editing in place. 2 adapters = real seam

Deletion test

delete the module — does complexity vanish (pass-through) or reappear across callers (earns keep)?

The map

flowchart LR
  subgraph CORE["decision core"]
    O["Orchestrator.Decide"]
    AUR["auraya.Client<br/>(no interface)"]
    EVA["evaforensics<br/>(branch inside voice)"]
    RUL["RulesConfig.Apply<br/>weights map[string]int"]
  end
  subgraph VEND["phone vendors (real seam ✓)"]
    PV["PhoneVendor"]
    XC["xconnect"]; SK["sekura"]; SN["smartnum"]
  end
  subgraph WEB["transport"]
    HX["dash handlers ×7"]
    MAP["mappers ×2<br/>(dup signal codec)"]
  end
  subgraph UI["frontend"]
    PG["pages ×7<br/>inline risk-colour / time fmt / windows"]
  end
  O -->|"uniform"| PV --> XC & SK & SN
  O -->|"special-cased ❌"| AUR --> EVA
  O --> RUL
  PV -. "string signal names" .-> RUL
  HX --> MAP
  classDef bad fill:#3a1416,stroke:#b04349,color:#ffd7d9
  classDef ok fill:#10241a,stroke:#2f7a52,color:#bff0d4
  classDef warn fill:#2a2410,stroke:#9a7d2e,color:#f0e4b0
  class AUR,EVA bad
  class PV,XC,SK,SN ok
  class RUL,MAP,PG warn

Green = a real seam already (the PhoneVendor family). Red = the leaky voice path. Amber = stringly-typed or duplicated display logic.

Candidates

Strong

C1 · Give voice its own seam — a VoiceVendor family

Files

decide/orchestrator.go · decide/orchestrator.go (ports) · vendors/auraya/client.go · vendors/evaforensics/client.go

Problem

Phone vendors sit behind a clean PhoneVendor seam: two adapters (real + mock) per vendor, fanned out uniformly. Voice has no such seam. The orchestrator holds a raw *auraya.Client and special-cases it in resolveVoice() / callAuraya(): it decodes audio, parses crossmatch speakers, decides whether to call EVA Forensics, times EVA, and falls back to a hidden syntheticMockHits(). EVA is not an adapter; it is a case branch baked into the voice path. The three voice inputs (pre-computed from EVA-for-Genesys, raw-audio ArmorVox, synthetic detection) are entangled in one ~55-line method on the orchestrator.

Why it's friction

Leaky seam: the orchestrator knows Auraya's async internals. CLAUDE.md says the EVA spec is likely to change, and today that edit lands inside orchestration, not in an adapter.
Untestable: there's no way to mock the voice path. Tests pass Auraya: nil and exercise only the pre-computed branch. The "fraudster matches the legit user" voice-spoof edge case can't be tested without hand-faking HTTP.
Not independently wirable: EVA can't be used on its own (e.g. standalone synthetic detection) because it's trapped inside callAuraya().

Solution (plain English)

Lift voice resolution to the same shape as PhoneVendor: a small VoiceVendor seam that takes the request and returns []RawSignal. Three adapters behind it — pre-computed (reads voice.result/synthetic/watchlist_hit), ArmorVox crossmatch (raw audio), EVA synthetic — selected by the same auto/real/mock policy the phone vendors use. The orchestrator then fans voice out exactly like every other vendor and stops knowing how voice is computed.
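A minimal sketch of what that seam could look like, assuming a request/response shape that is not taken from the codebase. VoiceRequest, the field names, the signal names ("voice_match", "voice_synthetic"), fakeVendor and the fail-fast error policy are all hypothetical; only the idea of "request in, []RawSignal out" comes from the proposal above. The loop is sequential for brevity, and the real fan-out may well run vendors concurrently.

```go
package main

import (
	"context"
	"fmt"
)

// RawSignal is a reduced stand-in for the type in domain/types.go.
type RawSignal struct {
	Name  string
	Value bool
}

// VoiceRequest is a hypothetical view of the voice part of a decide request.
type VoiceRequest struct {
	PrecomputedResult string // e.g. "match"; empty when the caller sent none
	Audio             []byte // raw audio; empty when the caller sent none
}

// VoiceVendor is the proposed seam: it takes the request and returns signals.
type VoiceVendor interface {
	Name() string
	Resolve(ctx context.Context, req VoiceRequest) ([]RawSignal, error)
}

// precomputed is the adapter that only reads results shipped with the request.
type precomputed struct{}

func (precomputed) Name() string { return "precomputed" }

func (precomputed) Resolve(_ context.Context, req VoiceRequest) ([]RawSignal, error) {
	if req.PrecomputedResult == "" {
		return nil, nil // nothing to contribute
	}
	// "voice_match" is a hypothetical signal name.
	return []RawSignal{{Name: "voice_match", Value: req.PrecomputedResult == "match"}}, nil
}

// fakeVendor is an in-process test double: it returns whatever it was given.
type fakeVendor struct {
	name    string
	signals []RawSignal
	err     error
}

func (f fakeVendor) Name() string { return f.name }

func (f fakeVendor) Resolve(_ context.Context, _ VoiceRequest) ([]RawSignal, error) {
	return f.signals, f.err
}

// resolveVoice fans out over every configured voice vendor and concatenates
// their signals. Failing on the first error is an assumption of this sketch.
func resolveVoice(ctx context.Context, vendors []VoiceVendor, req VoiceRequest) ([]RawSignal, error) {
	var out []RawSignal
	for _, v := range vendors {
		sigs, err := v.Resolve(ctx, req)
		if err != nil {
			return nil, fmt.Errorf("voice vendor %s: %w", v.Name(), err)
		}
		out = append(out, sigs...)
	}
	return out, nil
}

func main() {
	vendors := []VoiceVendor{
		precomputed{},
		fakeVendor{name: "eva-synthetic", signals: []RawSignal{{Name: "voice_synthetic", Value: true}}},
	}
	sigs, err := resolveVoice(context.Background(), vendors, VoiceRequest{PrecomputedResult: "match"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range sigs {
		fmt.Printf("%s=%v\n", s.Name, s.Value)
	}
}
```

The point of the fake is that a spoof scenario becomes a slice literal in a test rather than a hand-faked HTTP server; the orchestrator never learns which adapter produced a signal.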

Before / After

flowchart TB
  O["Orchestrator.Decide"]
  O --> RV["resolveVoice()"]
  RV --> CA["callAuraya()<br/>decode · crossmatch · classify"]
  CA --> B1{"EVA configured?"}
  B1 -->|yes| EVA["evaforensics.Detect"]
  B1 -->|no| MK["syntheticMockHits()"]
  classDef bad fill:#3a1416,stroke:#b04349,color:#ffd7d9
  class RV,CA,B1,EVA,MK bad

flowchart TB
  O2["Orchestrator.Decide"]
  O2 -->|"fan out, uniform"| VV["VoiceVendor"]
  VV --> A1["precomputed"]
  VV --> A2["armorvox crossmatch"]
  VV --> A3["eva synthetic"]
  classDef ok fill:#10241a,stroke:#2f7a52,color:#bff0d4
  class VV,A1,A2,A3 ok

Benefits

Leverage: the orchestrator's fan-out becomes one uniform loop over vendors, voice included.
Locality: all "how is voice computed" knowledge moves behind one interface; an EVA spec change touches one adapter.
Tests: the interface becomes the test surface. Fake a VoiceVendor in-process and assert the spoof edge cases that are impossible today.
Deletion test: collapsing it back would scatter audio-decode + crossmatch-parse + EVA-dispatch across the orchestrator again, so it earns its keep.

Strong

C2 · Close the stringly-typed signal-name contract

Files

domain/rules.go · domain/types.go (RawSignal) · vendors/*/client.go + mock.go · decide/orchestrator.go

Problem

The seam between vendors and the rules engine is an untyped string. Vendors emit RawSignal{Name:"sim_swap_24h"}; RulesConfig.Apply looks the name up in Weights map[string]int and — on a miss — silently skips it (if !ok { continue }). A typo in a vendor ("sim_swap24h") or a renamed weight produces a signal that contributes zero, with no error, no log, no failing test. The rules engine is tested purely in isolation; the orchestrator tests assert signals are present but never assert the resulting score/level/action. So the one seam most likely to drift has the least enforcement.

The gap, concretely

vendor emits        rules.Weights has        result
"sim_swap_24h"  →   "sim_swap_24h": 60   →   +60   ✓
"sim_swap24h"   →   (no key)             →   +0    ✗  silent
                                              no test catches this

Solution (plain English)

Make the signal vocabulary a single named set shared by both sides of the seam — vendors produce from it, the rules config is keyed by it, and seeding validates that every weight name is a known signal (and vice-versa) at startup. Pair it with one end-to-end test that injects a vendor signal and asserts the final score/level/action, plus the threshold boundaries (30/60/80) that currently have no edge-case test. The point isn't ceremony — it's turning a silent-zero into a compile error or a startup failure.
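One way the startup check could look, as a sketch under stated assumptions. The Signal type, knownSignals and ValidateWeights are hypothetical names; "sim_swap_24h" and its weight of 60 come from the example above, while "voice_synthetic" and its weight are placeholders. The weights stay keyed by string here because that is how the report describes the current config.

```go
package main

import (
	"fmt"
	"sort"
)

// Signal is a named type so vendors and the rules config share one vocabulary.
type Signal string

const (
	SimSwap24h     Signal = "sim_swap_24h"
	VoiceSynthetic Signal = "voice_synthetic" // hypothetical placeholder
)

// knownSignals is the single authority for valid signal names.
var knownSignals = map[Signal]struct{}{
	SimSwap24h:     {},
	VoiceSynthetic: {},
}

// ValidateWeights checks both directions of the contract: every weight key
// must be a known signal, and every known signal must have a weight.
func ValidateWeights(weights map[string]int) error {
	var unknown, missing []string
	for name := range weights {
		if _, ok := knownSignals[Signal(name)]; !ok {
			unknown = append(unknown, name)
		}
	}
	for sig := range knownSignals {
		if _, ok := weights[string(sig)]; !ok {
			missing = append(missing, string(sig))
		}
	}
	if len(unknown) == 0 && len(missing) == 0 {
		return nil
	}
	// Sort so the error message is stable across runs.
	sort.Strings(unknown)
	sort.Strings(missing)
	return fmt.Errorf("signal vocabulary drift: unknown weights %v, signals without weight %v", unknown, missing)
}

func main() {
	good := map[string]int{"sim_swap_24h": 60, "voice_synthetic": 40}
	fmt.Println("good config:", ValidateWeights(good))

	// The typo from the table above: the weight no longer matches any signal.
	typo := map[string]int{"sim_swap24h": 60, "voice_synthetic": 40}
	fmt.Println("typo config:", ValidateWeights(typo))
}
```

Calling something like this during seeding and refusing to start on a non-nil error would turn the silent +0 into a loud failure. Vendors that construct RawSignal from the Signal constants rather than string literals get the compile-time half of the same guarantee.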

Before / After

flowchart LR
  V["vendors<br/>(free-form strings)"] -->|"name: string"| R["rules.Weights<br/>map string→int"]
  R -->|"miss = skip silently"| Z["+0 risk, no signal"]
  classDef bad fill:#3a1416,stroke:#b04349,color:#ffd7d9
  class V,R,Z bad

flowchart LR
  S["Signal vocabulary<br/>(one named set)"] --> V2["vendors emit from it"]
  S --> R2["weights keyed by it"]
  V2 & R2 --> CHK["startup check + e2e score test"]
  CHK --> OK["drift = loud failure"]
  classDef ok fill:#10241a,stroke:#2f7a52,color:#bff0d4
  class S,V2,R2,CHK,OK ok

Benefits

Locality: the signal vocabulary becomes one authority instead of being implied by string equality across ~6 files.
Tests: a vendor→rules contract test makes the most drift-prone seam the test surface.
Leverage: small change, removes a whole class of invisible scoring bugs.
Pairs naturally with C1 (the new voice adapters emit from the same vocabulary).

Worth exploring

C3 · One transport edge — error→HTTP & param decoding

Files

transport/rest/dash/handlers.go · external/decide.go · (new) a small respond/errors helper

Problem

Across the 7 dash handlers, return c.JSON(500, ErrorResponse{Error: err.Error()}) appears verbatim ~12 times, and only GetEvent knows that mongo.ErrNoDocuments → 404. That mapping knowledge isn't shared — add a semantic error (say a 409) and there's no registry to put it in; the next handler author won't know to check. Separately, query-param defaulting is done three different ways (inline strconv in ListEvents, a parseSince() helper in GetSummary, inline string default in GetAnalytics) — the generated *Params structs are accepted then ignored. No single place answers "how do we default params / map errors?"

Solution (plain English)

Concentrate the edge: one error responder that maps domain/storage errors to status+body (register ErrNoDocuments→404 once; everything else 500), wired as Echo's error handler or a one-line respondErr(c, err). And one small params decoder for the recurring limit/since/bucket defaults so handlers state intent, not parsing. Handlers shrink to: decode → call → map.
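A rough sketch of the two concentrated pieces, written against the standard library so it runs without Echo or the Mongo driver. ErrNotFound stands in for mongo.ErrNoDocuments, and statusRules, statusFor, limitParam and the default/maximum values are hypothetical. In the real handlers a respondErr(c, err) would presumably wrap statusFor and write the ErrorResponse body.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// ErrNotFound is a stand-in for mongo.ErrNoDocuments.
var ErrNotFound = errors.New("not found")

// statusRules is the one registry of error-to-status mappings.
// A new semantic error (say a 409) would be added here and nowhere else.
var statusRules = []struct {
	target error
	status int
}{
	{ErrNotFound, http.StatusNotFound},
}

// statusFor returns the registered status for err, or 500 when none matches.
// errors.Is also matches errors wrapped with %w.
func statusFor(err error) int {
	for _, r := range statusRules {
		if errors.Is(err, r.target) {
			return r.status
		}
	}
	return http.StatusInternalServerError
}

// limitParam decodes the "limit" query parameter, falling back to def when it
// is absent, malformed or non-positive, and capping it at max.
func limitParam(q url.Values, def, max int) int {
	n, err := strconv.Atoi(q.Get("limit"))
	if err != nil || n <= 0 {
		return def
	}
	if n > max {
		return max
	}
	return n
}

func main() {
	wrapped := fmt.Errorf("get event: %w", ErrNotFound)
	fmt.Println("wrapped not-found:", statusFor(wrapped))
	fmt.Println("anything else:", statusFor(errors.New("boom")))

	q := url.Values{"limit": {"500"}}
	fmt.Println("limit:", limitParam(q, 50, 200)) // defaults here are illustrative
}
```

Keeping the mapping as data rather than a chain of if statements is one design option; a switch inside a single function would give the same locality.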

Before / After

Before: 7 handlers, each carrying its own error + parse code (~12× repeated error return · 3 param styles · 1 lonely 404 check)

After: handlers thin; mapping concentrated (respondErr() + decodeParams(), one home each)

Benefits & caveat

Locality: error-status policy and param defaults each get one home.
Tests: the responder is unit-testable once instead of implicitly per handler.
Honest deletion test: this is the most "hygiene" of the strong-ish candidates; the win is real but bounded (~40 duplicated lines). Do it with C1/C2, not instead of them.

Speculative

C4 · Shared signal codec across the two mappers

Files

transport/rest/external/mappers.go · transport/rest/dash/mappers.go

Problem

domainSignals() and signalValueToBool() are byte-for-byte identical in both mapper files — the only difference is externalapi.Signal vs internalapi.Signal (codegen emits two structurally-identical types). Deletion test: removing one copy concentrates ~30 lines; it's duplication, not a missing module.

Solution & honest caveat

A generic helper (Go generics over the two structs, or map via a single canonical intermediate) folds the duplication. Caveat: the two types exist because codegen keeps the surfaces independent — forcing them to share fights that grain. Low leverage; only worth it if the signal codec grows (e.g. richer value typing). Listed for completeness, not urgency.
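For illustration only, here is one shape the generic option could take. ExternalSignal and InternalSignal are stand-ins for the two generated types, and their fields, the DomainSignal shape and the body of signalValueToBool are assumptions rather than the real codec. Go does not let a generic function read struct fields through a type parameter, so the sketch passes a small accessor per type.

```go
package main

import (
	"fmt"
	"strconv"
)

// Stand-ins for externalapi.Signal and internalapi.Signal (fields assumed).
type ExternalSignal struct {
	Name  string
	Value string
}

type InternalSignal struct {
	Name  string
	Value string
}

// DomainSignal is an assumed domain-side shape.
type DomainSignal struct {
	Name  string
	Value bool
}

// signalValueToBool is a placeholder codec: anything strconv.ParseBool
// rejects is treated as false. The real rule may differ.
func signalValueToBool(v string) bool {
	b, err := strconv.ParseBool(v)
	return err == nil && b
}

// domainSignals is the single shared mapper. The view function extracts the
// two fields, since field access via a type parameter is not allowed.
func domainSignals[T any](in []T, view func(T) (name, value string)) []DomainSignal {
	out := make([]DomainSignal, 0, len(in))
	for _, s := range in {
		name, value := view(s)
		out = append(out, DomainSignal{Name: name, Value: signalValueToBool(value)})
	}
	return out
}

func main() {
	ext := []ExternalSignal{{Name: "sim_swap_24h", Value: "true"}}
	in := []InternalSignal{{Name: "sim_swap_24h", Value: "false"}}

	fmt.Println(domainSignals(ext, func(s ExternalSignal) (string, string) { return s.Name, s.Value }))
	fmt.Println(domainSignals(in, func(s InternalSignal) (string, string) { return s.Name, s.Value }))
}
```

The per-type accessor is itself a small piece of duplication, which supports the caveat: the fold saves little unless the codec grows. If the generated structs really do have identical underlying types, a plain conversion such as ExternalSignal(internalValue) into one canonical type is a simpler alternative to generics.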

Strong

C5 · A presentational module for Decision & Signal

Files

ui/src/pages/{DecisionDetail,Feed,Analytics,Rules,Summary}.tsx · components/ui.tsx

Problem

Domain display semantics are smeared across pages as inline ternaries. The risk-colour rule — "negative weight is green (lowers risk), positive is red (raises risk)" — is hand-written in three places (DecisionDetail signal table, Rules editor, Analytics). Timestamp formatting is done three different ways (toLocaleTimeString, toLocaleString, a local formatBucket). The signal table exists only in DecisionDetail; any new view that shows signals re-implements it. LevelPill/ActionPill live correctly in ui.tsx but as generic UI, not as a "render a Decision" vocabulary.

Solution (plain English)

A thin presentational module that owns how the domain looks: a riskColor(n) / signed-number formatter, a formatDecisionTime(), and a reusable <SignalTable>. Pages compose these instead of re-deriving them. The domain's visual language gets one definition.

Before / After

flowchart TB
  RC["risk-colour rule"]
  D1["DecisionDetail"]; R1["Rules"]; A1["Analytics"]; F1["Feed"]
  D1 --- RC; R1 --- RC; A1 --- RC
  D1 -.->|"own signal table"| ST1["table A"]
  F1 -.->|"own time fmt"| T1["fmt A"]
  D1 -.-> T2["fmt B"]
  classDef bad fill:#3a1416,stroke:#b04349,color:#ffd7d9
  class RC,ST1,T1,T2 bad

flowchart TB
  P["decision-view module<br/>riskColor · formatTime · SignalTable"]
  D2["DecisionDetail"]; R2["Rules"]; A2["Analytics"]; F2["Feed"]
  D2 --> P; R2 --> P; A2 --> P; F2 --> P
  classDef ok fill:#10241a,stroke:#2f7a52,color:#bff0d4
  class P ok

Benefits

Locality: a colour-semantics or timezone change is one edit, not a grep across pages.
Leverage: the hypothetical "Signals Explorer" page composes <SignalTable> for free.
Tests: the formatters become pure and unit-testable; today they're trapped in JSX.

Worth exploring

C6 · A consistent data-access contract for pages

Files

ui/src/pages/*.tsx · lib/mic.ts + TryIt.tsx · (new) polling constants / useRecorder

Problem

The generated hooks are great, but the usage contract drifts: three styles of spreading queryOptions (spread+override vs bare call vs call-with-params), three error/loading conventions (early-return vs inline badge vs local-state mutation catch), and magic refetchInterval numbers (Feed 3s, Summary 5s, Analytics none — intentional?). Separately, the record→encode flow is coherent in mic.ts but TryIt carries 9 useState for the recorder sub-machine and the WINDOWS time-selector is duplicated in Summary and Analytics.

Solution (plain English)

Settle one convention and give the recurring pieces a home: a tiny polling constants object (documents why 3s vs 5s), one error/empty presentational wrapper pages share, a useRecorder() hook that swallows the 9 useStates, and a single <TimeWindowSelector>. Smaller than C5 but removes the "which pattern do I copy?" tax for the next page author.

Benefits

Locality: polling cadence and window options each get one source.
Leverage: useRecorder makes the capture flow reusable beyond Try-It.
AI-navigability: one obvious pattern per concern beats three.

Top recommendation

Start with C1 (voice seam), carrying C2 (signal contract) with it.

C1 is the one true deepening in the codebase: it turns the orchestrator's special-cased, untestable voice branch into one more adapter behind a small seam — the exact pattern that already works for phone vendors. It's where both the core and the transport explorers (and I) felt the most friction, it's the area the CLAUDE.md flags as most likely to change (EVA spec), and it unlocks tests that are impossible today. C2 rides along naturally: the new voice adapters should emit signals from the same named vocabulary, and closing the silent-skip gap protects the scoring that C1 reshapes. C3–C6 are real but bounded cleanups — schedule them after, and prefer C5 over C3/C4/C6 if you want the next-biggest locality win on the UI side.

C1 → deepest, highest leverage
C2 → pairs with C1, kills silent-zero bugs
C5 → best UI locality win
C3/C4/C6 → bounded hygiene

No CONTEXT.md or ADRs found — domain terms taken from the code (Decision, Signal, Orchestrator, PhoneVendor, rules engine). If we pursue a candidate, I can seed CONTEXT.md with the vocabulary (e.g. VoiceVendor, signal vocabulary) as decisions crystallise.