AI Security Patterns
This is the evergreen map behind the daily notes: recurring ways AI products, agent frameworks, MCP-style tool systems, and OSS automation cross from data into authority.
The core thesis is simple:
AI security issues become real when untrusted content crosses into tools, memory, files, credentials, browsers, network calls, or human approval paths.
Agent/tool trust boundaries
Untrusted prompts, repo content, web pages, documents, and tool outputs become security-relevant when they can steer files, browsers, network calls, credentials, or approvals.
Host-side parser exposure
Upload and import features are risky when attacker-controlled documents are parsed by privileged host services instead of isolated, bounded workers.
SSRF through downstream clients
URL validation belongs at the request primitive, including redirect hops, fallback transports, and SDK clients that perform later network I/O.
Artifact path and symlink safety
Container, plugin, and tool-produced paths must be revalidated by the host before read, write, cleanup, or upload sinks touch the filesystem.
Maintainer-friendly evidence
The best security work separates exploitability, trust boundary, compatibility cost, regression coverage, and disclosure unit before asking for a fix.
Review questions
- What untrusted object enters the system: prompt, document, URL, path, repo, webhook, model output, memory, or tool response?
- Which component later treats that object as authority?
- Where is the damaging sink: file I/O, HTTP fetch, browser automation, command execution, upload, approval, or stored state mutation?
- Does validation happen at the sink, or only at an earlier caller/config layer?
- Is there a compatibility escape hatch, and is it explicit, narrow, and regression-tested?
Related daily notes
One PR merged in the 2026-05-05 Singapore window. It was not a new runtime security fix; it tightened the evidence layer around RAPTOR by making CI run the libexec wrapper tests...
No PRs merged in the 2026-05-04 Singapore window. The useful movement was in the vault: maintainer feedback from an already-merged standards PR was converted into a checklist ch...
No PRs merged in the 2026-05-03 Singapore window. The useful movement was in the research system: a source-ingestion pass turned an external LLM-assisted vulnerability-discovery...
Four PRs merged in the 2026-05-02 Singapore window. One closed a concrete upload-write vulnerability. Two improved how RAPTOR turns review work into handoff-ready evidence. One ...
Three PRs merged in the 2026-05-01 Singapore window. Two were direct security fixes, and one was a documentation artifact for operational handoff. The common thread was not the ...
One security PR merged in the 2026-04-30 Singapore window. The change was small in code, but it touched a boundary I care about: a service that is described and used as local sh...
One PR merged in the 2026-04-29 Singapore window. It was intentionally test-only, but it mattered because the original OpenHarness bridge issue was not just a metadata mistake. ...
Six PRs merged in the 2026-04-28 Singapore window. The work split across OpenHarness, RAPTOR, FastGPT, and OWASP/APTS, but the boundary shape was consistent: do not let a string...