“This dataclass enforces the information barrier specified in the epistemic_review contract. It EXCLUDES: scores, decision_score, total_score; probabilities, base_probability_pct, effective_probability; valuations, target_price, floor_price, base_case, worst_case; numeric targets, cagr_pct, downside_pct.”
— edenfintech-scanner-python,
epistemic_reviewer.py
What this looks like for me
Most of my LLM work has gone into making model output trustworthy enough to act on — not the models themselves, and not the prompting craft in isolation. What that usually means in practice: isolating the LLM behind a typed interface, writing contracts that the model has to honour, building a second-opinion review stage that is structurally forbidden from seeing certain information, and logging every call so a later run can be reconstructed.
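A minimal sketch of that logging layer, with hypothetical names (Transport, logged, log_path are mine, not the repo's): the LLM call is just a Callable[[dict], dict], and a thin wrapper appends every request/response pair to a JSONL file so a later run can be reconstructed or deduplicated.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Callable

Transport = Callable[[dict], dict]

def logged(transport: Transport, log_path: Path) -> Transport:
    """Wrap a transport so every call is appended to a JSONL audit log."""
    def call(request: dict) -> dict:
        response = transport(request)
        record = {
            "ts": time.time(),
            # Hash of the normalised request makes duplicate calls easy to spot later.
            "request_sha256": hashlib.sha256(
                json.dumps(request, sort_keys=True).encode()
            ).hexdigest(),
            "request": request,
            "response": response,
        }
        with log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return response
    return call
```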
The scanner is the longest-running example. Its agent graph has three roles — analyst, validator, epistemic reviewer — connected through frozen dataclasses that enforce an information barrier at the type level, not by prompting. There is a probability-anchoring detector, a bias-check stage, and a three-agent unanimous exception panel for cases where the deterministic screen and the scored view disagree. The LLM sits inside an adapter layer behind a Callable[[dict], dict] transport, so the core pipeline is stdlib-only.
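The information barrier from the quote above, reduced to a sketch. The excluded field names (decision_score, base_probability_pct, target_price) come from the quoted docstring; the class names and the remaining fields are illustrative, not the repo's actual definitions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnalystOutput:
    ticker: str
    thesis: str
    decision_score: float          # visible to the validator...
    base_probability_pct: float    # ...but never to the epistemic reviewer
    target_price: float

@dataclass(frozen=True)
class EpistemicReviewInput:
    # Deliberately has no fields for scores, probabilities, or valuations.
    ticker: str
    thesis: str
    stated_assumptions: tuple[str, ...]

def to_review_input(analyst: AnalystOutput,
                    assumptions: tuple[str, ...]) -> EpistemicReviewInput:
    # The barrier lives in the type: there is nowhere to put the numbers.
    return EpistemicReviewInput(
        ticker=analyst.ticker,
        thesis=analyst.thesis,
        stated_assumptions=assumptions,
    )
```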
Other pieces of the same instinct show up in smaller projects. yt-ts swaps between DeepSeek and OpenAI through the same OpenAI SDK by changing a base_url, driven by an external YAML system prompt with few-shot examples and a pipe-table I/O contract. InsiderSignalResearch is a research-sprint framework where each agent role has explicit data-discipline rules baked into its prompt. devai.co.za is where earlier experiments — an NL-to-SQL Gradio app, an AI stock analyst, custom GPTs — sit in public view.
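A Python sketch of the yt-ts pattern rather than its actual code: one OpenAI-SDK client, the provider selected by base_url, the system prompt and few-shot examples loaded from an external YAML file, and the reply parsed against a pipe-table contract. The YAML layout, the model names, and the DeepSeek endpoint URL are assumptions.

```python
import os
import yaml                       # PyYAML
from openai import OpenAI

PROVIDERS = {
    "openai":   {"base_url": None, "model": "gpt-4o-mini"},
    "deepseek": {"base_url": "https://api.deepseek.com", "model": "deepseek-chat"},
}

def build_client(provider: str) -> tuple[OpenAI, str]:
    cfg = PROVIDERS[provider]
    client = OpenAI(
        api_key=os.environ[f"{provider.upper()}_API_KEY"],
        base_url=cfg["base_url"],            # None falls back to the OpenAI default
    )
    return client, cfg["model"]

def extract(provider: str, transcript: str, prompt_path: str = "prompt.yaml") -> str:
    # prompt.yaml is assumed to hold: system: str, examples: [{user, assistant}, ...]
    with open(prompt_path, encoding="utf-8") as fh:
        prompt = yaml.safe_load(fh)
    messages = [{"role": "system", "content": prompt["system"]}]
    for ex in prompt.get("examples", []):
        messages.append({"role": "user", "content": ex["user"]})
        messages.append({"role": "assistant", "content": ex["assistant"]})
    messages.append({"role": "user", "content": transcript})

    client, model = build_client(provider)
    reply = client.chat.completions.create(model=model, messages=messages)
    return reply.choices[0].message.content

def parse_pipe_table(text: str) -> list[dict[str, str]]:
    """Parse the markdown pipe table the prompt asks for into a list of row dicts."""
    rows = [r for r in text.splitlines() if r.strip().startswith("|")]
    if len(rows) < 3:                        # header, separator, at least one data row
        return []
    header = [c.strip() for c in rows[0].strip().strip("|").split("|")]
    out = []
    for row in rows[2:]:                     # rows[1] is the |---|---| separator
        cells = [c.strip() for c in row.strip().strip("|").split("|")]
        out.append(dict(zip(header, cells)))
    return out
```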
Projects that back this
- edenfintech-scanner-python — three-role agent graph with code-enforced information barriers and constrained decoding.
- yt-ts — provider-agnostic LLM client for structured extraction; regex pre-filter, YAML-prompt contract, pipe-table parser.
- InsiderSignalResearch — research-sprint framework of specialised agent roles with enforced data-discipline rules.
- devai.co.za — archive of earlier AI side-projects (InvestAI analyst platform, NL-to-SQL app, YouTube summariser, custom GPTs). Inspirations credited: CrewAI, Fabric, PraisonAI.
Decisions that shaped how I do it
- Why I keep LLMs behind typed adapters — code-level contracts over prompt discipline; transport injection; external prompts.
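What transport injection buys in practice, shown with hypothetical names: because the pipeline only ever sees a Callable[[dict], dict], a test can inject a canned transport and exercise the contract checks with no network and no real model. The required-keys contract here is illustrative, not the repo's.

```python
from typing import Callable

Transport = Callable[[dict], dict]

REQUIRED_KEYS = {"verdict", "reasons"}   # illustrative output contract

def run_review(request: dict, transport: Transport) -> dict:
    response = transport(request)
    missing = REQUIRED_KEYS - response.keys()
    if missing:
        # Contract violations fail loudly in code, not silently in a prompt.
        raise ValueError(f"model response missing keys: {sorted(missing)}")
    return response

def fake_transport(request: dict) -> dict:
    # Stand-in for the real LLM adapter; returns a canned, contract-conforming reply.
    return {"verdict": "revise", "reasons": ["assumes base-rate neglect"]}

assert run_review({"thesis": "example"}, fake_transport)["verdict"] == "revise"
```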
Playbooks I use here
- A file-driven planning framework for AI-assisted coding — each phase reads and writes a file; mixed-model workflow (Opus → Gemini → Sonnet); crash-resumable (sketched after this list).
- Adversarial review via a second AI model — piping work to a different-vendor model for a critical pass; the findings I’d have missed come from a reader who isn’t invested in the design choice.
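A stripped-down sketch of the file-driven playbook, under assumed file names and phase wiring: the real workflow drives a different model per phase, but the crash-resumability comes from one check, namely that a phase whose output file already exists is read back instead of re-run.

```python
from pathlib import Path
from typing import Callable

Phase = tuple[str, str, Callable[[str], str]]   # (output file, model label, work function)

def run_plan(workdir: Path, brief: str, phases: list[Phase]) -> str:
    workdir.mkdir(parents=True, exist_ok=True)
    current = brief
    for filename, model, work in phases:
        out = workdir / filename
        if out.exists():                         # crash-resume: completed phases are skipped
            current = out.read_text(encoding="utf-8")
            continue
        current = work(current)                  # in the real workflow this calls `model`
        out.write_text(current, encoding="utf-8")
    return current

# Illustrative wiring only; the real phases call different models
# (e.g. Opus for the plan, Gemini for review, Sonnet for implementation).
phases: list[Phase] = [
    ("01_plan.md",   "opus",   lambda brief: f"# Plan\n{brief}"),
    ("02_review.md", "gemini", lambda plan: plan + "\n# Review notes"),
    ("03_impl.md",   "sonnet", lambda plan: plan + "\n# Implementation"),
]

final = run_plan(Path("plan_run"), "Add retry logic to the fetcher", phases)
```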
Open questions I’m holding
- Is prompt engineering more leveraged than the harness, long-term? — as base models get better at following loose prompts, does the compounding return shift back to prompt-craft or stay with typed adapters, file-driven planning, and review loops?
What I’m usually asked to do
- Design an agent workflow that won’t quietly drift when the model changes
- Retrofit an existing LLM call with proper logging, dedup, and reproducibility
- Write a system prompt + structured-output contract so downstream code can rely on it
- Wire a multi-role review stage onto an analysis pipeline without leaking context between roles