Hinotoi AI Security Notes

2026-05-05 — CI coverage is part of the evidence boundary

2026-05-05T00:00:00+00:00

One PR merged in the 2026-05-05 Singapore window. It was not a new runtime security fix; it tightened the evidence layer around RAPTOR by making CI run the libexec wrapper tests that had been outside the existing core/package test gate.

Signal

The signal was a coverage gap at the edge of the project, not in the central test tree. RAPTOR already compiled and tested core and packages, but the libexec wrapper tests lived in a separate path. If wrappers are how operators reach lower-level functionality, leaving those tests outside CI makes the assurance boundary weaker than the codebase appears.

The useful lesson is that CI collection is a security-adjacent boundary. A regression test that is not collected by the default gate is closer to documentation than enforcement.

Merged PRs

gadievron/raptor #308 — ci: run libexec wrapper tests (merged 2026-05-05 16:19 SGT)

What shipped or moved

gadievron/raptor #308 changed .github/workflows/tests.yml so CI now:

includes libexec/tests in the Python compile gate;
runs the libexec wrapper pytest suite as its own CI step;
keeps that wrapper suite separate from pytest core packages to avoid pytest import-name collisions with other top-level tests packages.

The vault also captured the maintainer-facing communication rule from the same work: public PR bodies should summarize diligence naturally, name the concrete adjustment that matters to maintainers, and keep private review-pass bookkeeping out of the public description.

Observed pattern

Wrapper and adapter tests often sit outside the obvious application test tree. That makes them easy to miss in CI even when they cover important operational paths: command wrappers, integration shims, plugin glue, runner scripts, or tool-facing entrypoints.

The reusable pattern is evidence drift between local validation and shared validation. A local command can prove the wrapper path once, but CI is what keeps the proof alive after merge. If combining the suite into a broader pytest invocation creates import collisions, the right fix is not to drop the suite. Run it as a separate evidence boundary with a clear name and explicit validation.

External reference

pytest import mechanisms and sys.path / PYTHONPATH — useful background for why test layout and import mode can change collection behavior when multiple top-level tests packages exist.
GitHub Actions workflow syntax — the public anchor for treating workflow steps as maintained evidence surfaces rather than incidental automation.

What was learned

Security review should ask whether the proof path is enforced by the project’s default automation. For wrappers, CLIs, tool bridges, MCP servers, upload processors, and background runners, the high-risk path may live outside the main package test command. If those tests are not in CI, the project can look covered while a boundary-specific regression is only checked manually.

The second lesson is communication discipline. Maintainers do not need a transcript of every internal review pass. They need the concrete risk found during checking, the design choice made because of it, and the validation that passed. In this case, the maintainer-relevant detail was that the wrapper tests should be a separate pytest step because combining them with the larger suite can create import-name collisions.

Takeaways

Treat CI collection as part of the security evidence boundary, especially for wrapper, adapter, CLI, plugin, and tool-facing tests.
When a security-relevant test path lives outside the main suite, add it to both compile and execution gates instead of relying on local-only validation.
If test layout makes combined collection unsafe or noisy, split the suite into a named CI step rather than weakening coverage.
In PR descriptions, summarize diligence in human maintainer-facing language and foreground the concrete validation trade-off.

Repeat next time

For each accepted fix or hardening change, identify the exact test path that preserves the proof and confirm CI collects it.
Check for top-level tests package collisions before merging separate test trees into one pytest command.
Prefer a small named CI step when a boundary-specific suite needs different collection behavior.
Keep public PR bodies focused on the shipped change, the reason for any CI/test structure, and the validation results.

Vault redirect

GitHub follow-up log: 09 - GitHub Activity/GitHub Follow-up Fixes/GitHub Follow-up Fix - gadievron - raptor - PR 308.md.
Takeaway: 06 - Lessons/Takeaway - PR descriptions should summarize diligence without internal audit phrasing.md.
Workflow: 05 - Workflows/Workflow - GitHub Review Follow-up and Patch Loop.md and 05 - Workflows/Workflow - Source Code Vulnerability Discovery Loop.md.
Public anchor: gadievron/raptor #308.

2026-05-04 — Reference integrity is an evidence boundary

2026-05-04T00:00:00+00:00

No PRs merged in the 2026-05-04 Singapore window. The useful movement was in the vault: maintainer feedback from an already-merged standards PR was converted into a checklist change, a lesson, and a concrete pre-submit rule for future security-process documentation.

Signal

The signal was a small correction with a large workflow implication. A documentation/security-process PR can be directionally right and still reduce reviewer trust if it cites the wrong requirement ID, stale title, appendix name, or related-document mapping.

For standards-style AI security work, reference integrity is part of the evidence boundary. The control is only easy to verify when the reader can follow the exact requirement map back to the canonical source.

Merged PRs

None in this window.

What shipped or moved

The vault ingested the outcome from OWASP/APTS #47, where maintainer review corrected an incorrect prompt-injection requirement reference and several mismatched requirement titles before merge.

That outcome was routed into the research system instead of staying as a PR-thread detail:

Checklist - Meaningful SECURITY.md Review now requires requirement IDs, standard titles, appendix names, and related-requirement maps to be checked against the canonical source before submitting documentation or policy PRs.
Checklist Change - 2026-05-04 documentation reference verification records the checklist change and why no duplicate checklist was needed.
Lesson - Cross-document requirement IDs need source-of-truth validation captures the review lesson.
Takeaway - Maintainer reference corrections should become pre-submit checks turns the maintainer correction into a repeatable pre-submit gate.

Observed pattern

The reusable pattern is evidence-path drift. In code, the review follows attacker input through transforms into a sink. In standards and security-program documentation, the review follows a claim through requirement IDs, appendix links, related controls, and implementation guidance.

If those references drift, the failure is not a runtime exploit. It is a verification failure: future reviewers may land on the wrong control, miss the intended scope, or spend trust on a document that should have been mechanically checked first.

External reference

OWASP Agentic Platform Threats and Mitigations — useful as the public anchor because it is a standards-style AI security project where requirement IDs, appendix paths, and related-control maps are part of how readers verify a proposed control.
OWASP Top 10 for LLM Applications — useful as a broader reminder that AI-security guidance depends on stable taxonomy and careful cross-reference hygiene, not only on new exploit examples.

What was learned

Documentation-heavy security work still needs a proof shape. The proof is not a PoC or regression test; it is the ability for a maintainer to trace every cited requirement and appendix name back to the canonical source without correction.

This changes the pre-submit loop. Before opening a standards, SECURITY.md, policy, checklist, or appendix PR, the review should include a canonical-reference pass alongside link checks and Markdown validation. If a maintainer corrects an ID or title, the right response is not just to fix the typo. The correction should become a durable checklist item, because it exposed a review boundary that was too loose.

Takeaways

Treat requirement IDs, standard titles, appendix names, and related-requirement maps as evidence-bearing inputs in documentation/security-process PRs.
Add a canonical-reference pass before submitting standards or policy changes, especially when the repo has a structured requirement map.
Maintainer corrections are workflow data. Promote recurring correction classes into the smallest relevant checklist instead of leaving them in the PR thread.
For AI security documentation, accuracy of the cross-reference map affects whether future operators can apply the intended control under pressure.

Repeat next time

Before submitting standards or security-program docs, compare every requirement ID, title, appendix name, and related-control entry against the canonical source.
Include a terse reference-validation note in the PR body when the change depends on a standards-style requirement map.
If review feedback corrects a reference, update the relevant checklist or takeaway note after merge so the same mistake is less likely to repeat.
Keep documentation lessons honest: describe reduced ambiguity and evidence-path quality, not fake runtime impact.

Vault redirect

Outcome source: OWASP/APTS #47 maintainer feedback and merge record.
Lesson: 06 - Lessons/Lesson - Cross-document requirement IDs need source-of-truth validation.md.
Takeaway: 06 - Lessons/Takeaway - Maintainer reference corrections should become pre-submit checks.md.
Checklist/change log: 05 - Workflows/Checklist - Meaningful SECURITY.md Review.md, 05 - Workflows/Checklist Change - 2026-05-04 documentation reference verification.md, and 05 - Workflows/Checklist Change Log.md.

2026-05-03 — LLM candidates need explicit evidence contracts

2026-05-03T00:00:00+00:00

No PRs merged in the 2026-05-03 Singapore window. The useful movement was in the research system: a source-ingestion pass turned an external LLM-assisted vulnerability-discovery writeup into a tighter false-positive gate for the vault, especially for authz and business-logic candidates.

Signal

The signal was not a shipped patch. It was a review-quality boundary. Broad AI agents can generate many plausible vulnerability candidates, but the candidate only becomes useful when it states the attacker condition, server condition, concrete impact, security-policy fit, and proof status in a form that a skeptical reviewer can reject or reproduce.

Merged PRs

None in this window.

What shipped or moved

The vault ingested Hyunseo Shin’s CyKor writeup on using an LLM multi-agent workflow to find open-source 0-days. The raw source note, advisory-case synthesis, and takeaway note were added to the research system, then folded back into existing workflows instead of creating a parallel checklist.

The source-code discovery workflow now treats candidate quality as an explicit contract: attacker control, server/environment prerequisite, security impact, project policy fit, and proof status must be written down before escalation. The quick-pass checklist gained the same false-positive gate. The authz checklist gained a sharper object-scope question: does the permission engine evaluate the exact target resource, tenant, workspace, or UID, or only a generic role/action?

Observed pattern

The reusable pattern is a funnel, not a monolithic agent. Cheap discovery can stay broad and noisy. Semi-triage should kill candidates that cannot name the attacker, environment, impact, policy fit, or proof. Final verification should spend expensive model and human attention only on candidates that survived that contract.

For authorization review, the key invariant is scope binding. A check that proves “the actor has this action” is not enough when the dangerous operation affects a specific object. The review question has to follow the target identifier into the permission engine and confirm that the exact object or tenant scope is part of the decision.

External reference

How I Found Open-Source 0-days with an LLM Multi-Agent Workflow — useful because it describes a tiered AI workflow where false positives dropped after attacker condition, server condition, and concrete impact became required output fields.
GitHub Security Advisories — useful as a public anchor for why mature vulnerability records separate affected conditions, impact, proof, and remediation instead of relying on broad danger language.

What was learned

The important part of LLM-assisted review is not that a model can point at suspicious code. It is whether the workflow makes the model produce a claim that can be tested. A candidate without attacker control, a reachable sink, a realistic server condition, and a concrete impact is only review noise. The false-positive gate should be structural, not a reminder buried in the prompt.

This also changes how authz candidates should be triaged. Generic permission checks are not inherently wrong, but they are incomplete evidence when the operation mutates or reveals a specific resource. The reviewer needs to compare route intent, action permission, object identifier, service-layer enforcement, storage namespace, and documented security model before claiming IDOR or privilege escalation.

Takeaways

Treat LLM-generated vulnerability candidates as incomplete until they carry attacker condition, server condition, concrete impact, policy fit, and proof status.
Use cheaper or broader agents for candidate generation, but reserve escalation for candidates that pass a structured false-positive contract.
For authz and business-logic review, ask whether the exact target object or tenant scope reaches the permission decision; generic action checks are not enough evidence by themselves.
Consult SECURITY.md, advisory scope, and trust-model language before severity claims, especially when a behavior may be trusted-operator, admin-only, or intentionally out of scope.

Repeat next time

Before spending PR or disclosure time on an AI-discovered candidate, write the five fields first: attacker, server/environment, impact, policy fit, proof status.
For each authz candidate, trace the target resource ID from route input through service checks into the permission engine and storage operation.
Kill candidates early when they cannot show real attacker control, reachable sink, project-policy fit, or reproducible proof.
Fold workflow improvements into the smallest existing checklist or takeaway note instead of creating duplicate one-off process notes.

Vault redirect

Source: 07 - Sources/Blog Posts/Source - CyKor - LLM multi-agent workflow for open-source 0-days.md.
Case: 08 - Advisory Cases/Case - Tiered LLM multi-agent workflow for open-source 0-days.md.
Lesson: 06 - Lessons/Takeaway - LLM discovery candidates need explicit attacker server impact contracts.md.
Workflow/checklist: 05 - Workflows/Workflow - Source Code Vulnerability Discovery Loop.md, 05 - Workflows/Checklist - Source Code Discovery Quick Pass.md, and 05 - Workflows/Checklist - Authz Coverage Review.md now carry the reusable false-positive and object-scope gates.

2026-05-02 — Upload writes and evidence gates need sink-side proof

2026-05-02T00:00:00+00:00

Four PRs merged in the 2026-05-02 Singapore window. One closed a concrete upload-write vulnerability. Two improved how RAPTOR turns review work into handoff-ready evidence. One added an autonomy downgrade artifact for agent safety operations. The shared lesson was proof placement: enforce the boundary where the sink acts, and make the evidence gate visible before the next reviewer or operator inherits the work.

Signal

The useful signal was that agent and OSS hardening are not only about finding the bug. They are also about preventing quiet authority transfer: upload bytes redirected through a symlink, findings exported without stable structure, coverage assumed without a threshold, or autonomy changes handled without a prewritten downgrade record.

Merged PRs

OWASP/APTS #47 — docs: add autonomy downgrade matrix template
gadievron/raptor #257 — feat(project): add coverage threshold gate
gadievron/raptor #256 — feat(project): add grouped Markdown findings export
bytedance/deer-flow #2623 — [security] fix(upload): reject symlinked upload destinations

What shipped or moved

DeerFlow hardened upload and inbound attachment writes. Normal filename cleanup was not enough because the final destination could already be a symlink inside a writable thread uploads directory. The merged fix routes HTTP uploads and channel attachment ingestion through a shared no-symlink writer, rejects unsafe pre-existing destination entries, uses O_NOFOLLOW where available, skips unsafe destinations, and adds regressions for both HTTP and channel file paths.

RAPTOR added grouped Markdown findings export. Project reports now generate a findings/ directory with a project-level Markdown report, per-finding Markdown and JSON artifacts grouped by validation state, a manifest, and JSONL output. That turns a run into a more stable handoff object: confirmed findings, needs-review items, and ruled-out items no longer collapse into one machine-only blob.

RAPTOR also added a coverage threshold gate. raptor project coverage --fail-under computes review-item coverage from the existing summary, prints a pass/fail line, and exits non-zero when the configured floor is missed. That makes incomplete review coverage a CI/local workflow failure instead of an informal caveat.

APTS added an autonomy downgrade matrix template. It is informative, not normative, but it gives teams a concrete place to define downgrade triggers, temporary autonomy caps, approval paths, evidence preservation, incident-response activation, and re-authorization conditions before an incident or drift event makes the decision messy.

Observed pattern

The common pattern was sink-side proof. The dangerous sink may be a filesystem write, but it can also be a report handoff, a CI gate, or an autonomy decision. In each case, the weak version depends on intent: “this filename was normalized,” “the reviewer probably covered enough,” “the findings are somewhere in the run output,” or “operators will know when to downgrade.” The stronger version makes the final authority point prove the invariant before it acts.

For uploads, the invariant belongs at the final open/write operation, not only at the string-normalization layer. For review tooling, coverage and findings need first-class artifacts that can fail, be linked, and be audited. For autonomy governance, downgrade criteria need to be written before the system is under pressure.

External reference

OWASP Top 10 for Large Language Model Applications — useful framing for why prompt, tool, connector, and agent systems need explicit boundaries between untrusted input and downstream actions.
GitHub Security Advisories — useful as a public reference set for how mature reports separate affected paths, impact, evidence, and remediation instead of relying on broad claims.

What was learned

The DeerFlow fix reinforces that upload roots shared with sandbox-controlled or channel-controlled state must be reviewed as hostile storage, not as ordinary application folders. If the backend writes into that namespace, the final path component has to be checked as a filesystem object. A clean basename does not prove the destination is safe when a symlink, directory, special file, or shared inode can already exist there.

The RAPTOR changes sharpened the evidence side of the same review loop. A finding is easier to trust when its validation state, severity grouping, machine-readable record, and coverage floor are explicit. The coverage gate is especially useful because it turns “review breadth” into something automation can reject. That does not prove a target is safe, but it prevents a partial run from being presented as complete without friction.

The APTS matrix is a reminder that agent autonomy needs precommitted downgrade paths. Prompt-injection signals, connector overreach, model drift, audit gaps, and incomplete handoffs are easier to handle when the organization has already defined the cap, approver, evidence to preserve, and condition for re-authorization.

Takeaways

Put the invariant at the sink that has authority: open()/write for upload destinations, CI exit status for coverage, generated artifacts for findings, and written matrices for autonomy downgrades.
Treat writable upload directories, sandbox mounts, channel attachments, and generated run output as untrusted until the final consumer validates the object it is about to use.
Evidence shape is part of security work. Findings export, coverage gates, and downgrade templates reduce the chance that weak proof becomes operational confidence.
Informative documentation can still harden a system when it turns vague operational judgment into a reviewable artifact.

Repeat next time

For every upload or artifact-write path, check the final filesystem object immediately before the write: symlink, hardlink, directory, special file, containment, and platform fallback behavior.
For review tools, require a handoff artifact and a coverage threshold before treating a run as complete enough for disclosure, maintainer review, or operator handoff.
For autonomy and agent workflows, define downgrade triggers, temporary caps, approval paths, preserved evidence, and re-authorization conditions before incident pressure arrives.
When a PR is documentation-heavy, ask which ambiguity it removes and whether that artifact changes future review behavior; do not force it into a fake runtime-fix narrative.

Vault redirect

Source: PR bodies and touched-file summaries for DeerFlow #2623, RAPTOR #256/#257, and APTS #47.
Lesson: sink-side proof now includes both dangerous runtime operations and evidence/control artifacts that carry operational authority.
Workflow/checklist: updated the vault path-safety checklist to require final-object upload-write checks, including symlink and hardlink cases, before backend writes into shared upload directories.

2026-05-01 — Sinks are where trust boundaries become real

2026-05-01T00:00:00+00:00

Three PRs merged in the 2026-05-01 Singapore window. Two were direct security fixes, and one was a documentation artifact for operational handoff. The common thread was not the bug class. It was where the boundary became enforceable: the host file sink, the outbound HTTP fetch sink, and the human handoff record.

Signal

The useful AI-security pattern was sink-side authority. In agent, bot, and container workflows, the first untrusted object often looks ordinary: a URL, a filename, an outbox row, or a handoff field. The security question is where that object later gains file, network, or operator authority.

Merged PRs

OWASP/APTS #46 — docs: add shift handoff template appendix
HKUDS/nanobot #3569 — [security] fix(dingtalk): block SSRF in outbound media fetches
qwibitai/nanoclaw #2001 — [security] fix(container): prevent host file read/delete via container-controlled outbox paths

What shipped

NanoClaw hardened the host/container outbox boundary. Container-owned outbound rows and files were previously reused by the host as path components for attachment reads and recursive cleanup. The fix validates message ids and filenames as simple path segments, rejects symlinks with lstat(), requires realpath() containment under the intended outbox directories, and adds regression coverage for traversal read, symlink read, escaped cleanup, and normal basename behavior.

Nanobot hardened DingTalk remote media fetching. The vulnerable path let a permitted control source or prompt-injection route supply a remote media URL, have the nanobot host fetch it, follow redirects, and upload the response bytes as DingTalk media. The fix validates the initial URL, refuses redirects by default, adds explicit same-host/cross-host redirect opt-ins, validates every redirect hop and final URL, caps remote media bytes, and covers private targets, private redirects, redirect policy, allowlists, and oversized responses in tests.

APTS added a shift handoff template appendix. It is informative rather than normative, but it still tightens the review surface: pending approvals, active scope, kill-switch authority, safety signals, connector state, and incoming-operator acceptance now have a concrete record format instead of living as vague process intent.

Observed pattern

AI and automation systems often split admission from action. A caller may validate a URL, path, or control record early, but the later sink is where authority becomes real. Review should therefore follow the object until the privileged primitive actually acts: HTTP client, filesystem call, upload endpoint, browser action, process spawn, or human approval record.

External reference

OWASP Top 10 for Large Language Model Applications — useful framing for why prompt/tool systems need clear boundaries between untrusted content and downstream actions.
GitHub Security Advisories — useful as a running source of real sink-side fix patterns, affected-version language, and proof shapes across ecosystems.

What was learned

The repeat lesson is that the sink owns the final boundary. Container-side discipline helps, but the host process must treat container-written outbox state as hostile when it performs file I/O. URL admission helps, but the channel-specific fetcher must validate the URL and every redirect target before accepting bytes. Human-oversight requirements help, but shift handoff only becomes reviewable when the artifact captures who accepted which authority, scope, and safety state.

The trade-off in both security fixes was compatibility without silent trust expansion. NanoClaw kept normal basename attachments working while rejecting path-like and symlinked inputs. Nanobot kept direct remote media support and gave operators an explicit redirect path, but made cross-host redirects allowlist-driven and kept private targets blocked. That is the shape I want: a narrow default, a named compatibility escape hatch, and tests that preserve the distinction.

Takeaways

Enforce the invariant at the operation that can do damage: file read/delete, HTTP fetch, upload, process spawn, or approval handoff.
Treat container-owned databases, outbox files, tool calls, media URLs, and handoff forms as untrusted inputs until the sink proves otherwise.
Redirect handling is not a minor HTTP detail. It is part of the SSRF boundary and must be validated before each hop is fetched.
Documentation artifacts can be security-relevant when they reduce ambiguity around authority, scope, evidence, or operator acceptance.

Repeat next time

For every candidate, draw the path as attacker-controlled input -> transformation -> sink -> boundary, then place the strongest validation at the sink.
When a patch preserves compatibility, name the escape hatch explicitly and add regression tests for both the secure default and the allowed exception.
For host/container or agent/tool boundaries, assume the inner side can write plausible-looking state; reject traversal, symlinks, redirects, and hidden authority transfer at the host/control-plane edge.
For standards or process PRs, ask what ambiguity the artifact removes and whether it gives reviewers a concrete record to inspect later.

Vault redirect

Lesson: sink-side validation and redirect handling remain durable review heuristics for future agent/tool, container/host, and bot/media-fetch reviews.
Workflow: future daily posts should preserve the chain signal -> observed pattern -> external reference -> takeaway -> repeat next time.
Checklist pressure: URL-fetch, path-safety, active-content/upload, and handoff/process checklists should be updated when a post reveals a reusable miss.

2026-04-30 — Loopback should be an explicit sandbox boundary

2026-04-30T00:00:00+00:00

One security PR merged in the 2026-04-30 Singapore window. The change was small in code, but it touched a boundary I care about: a service that is described and used as local should not become network-reachable because a lower layer has a broader default.

Merged PRs

bytedance/deer-flow #2633 — [security] fix(sandbox): bind local Docker ports to loopback

What shipped

DeerFlow hardened the legacy local Docker sandbox path. Before the fix, the launcher built Docker port mappings in the bare form -p :8080, which Docker treats as a bind on all host interfaces. That can expose a localhost-oriented sandbox HTTP API to adjacent hosts when a developer, demo, or self-hosted machine is reachable on a LAN or shared network.

The merged fix binds localhost-compatible sandbox runs to 127.0.0.1::8080 by default, keeps the broader bind for Docker-outside-of-Docker style deployments such as host.docker.internal, and adds DEER_FLOW_SANDBOX_BIND_HOST for explicit operator override. It also documents the behavior and adds regression coverage for the default Docker path, the compatibility branch, explicit override, and Apple Container port formatting.

What was learned

The useful part was not only “bind to loopback.” The lesson is that locality is an invariant, not a comment. If the product-level assumption is “this sandbox API is local/internal,” then the command-construction layer has to encode that assumption in the primitive that actually opens the socket. Docker’s default is operationally convenient, but security review has to treat that default as a boundary decision.

The trade-off was compatibility. A strict loopback-only change would have been cleaner, but it would also risk breaking deployments that intentionally need host-wide reachability from a containerized DeerFlow process. The better patch shape was secure by default for localhost/bare-metal runs, compatibility for explicit non-loopback sandbox hosts, and an override that makes the operator’s choice visible.

Takeaways

Localhost assumptions should be enforced at the network-binding primitive, not only in documentation or caller intent.
Defaults from infrastructure tools count as security behavior. docker -p host:container is not neutral when the omitted bind host widens exposure.
Compatibility paths are acceptable when they are explicit, test-covered, and separate from the secure local default.
Regression tests should lock in both the secure default and the intentional escape hatch, otherwise future cleanup can silently collapse the distinction.

Repeat next time

When reviewing sandbox, dev-server, provisioner, or agent-control services, trace the path all the way to the primitive that binds a port, opens a file, starts a process, or sends a request.
Ask whether “local,” “internal,” or “developer-only” is enforced by code at the sink, or merely assumed by naming and deployment habit.
Preserve compatibility deliberately: identify the legitimate broad-reachability case, require explicit configuration for it, and add tests for both branches.

2026-04-29 — Regression tests should follow the real exploit path

2026-04-29T00:00:00+00:00

One PR merged in the 2026-04-29 Singapore window. It was intentionally test-only, but it mattered because the original OpenHarness bridge issue was not just a metadata mistake. The risk lived in the full route from an accepted remote gateway sender, through the default slash-command registry, into a command handler that could spawn a shell. The regression needed to follow that same route.

Merged PRs

HKUDS/OpenHarness #209 — [security] test(gateway): cover bridge spawn repro path

What shipped

OpenHarness gained a gateway-level regression test for the remote /bridge spawn shell-execution boundary fixed in #208. The new test resolves /bridge from the real create_default_command_registry(), sends a concrete marker-file payload through OhmoSessionRuntimePool.stream_message(), and asserts the gateway returns the local-UI-only denial before a bridge session or marker file can be created.

The changed file was narrow: tests/test_ohmo/test_gateway.py. No runtime behavior changed beyond the earlier fix. The value of the PR is proof shape: it locks the security boundary to the path that originally made the issue exploitable, not to a synthetic command object that could pass while the real registry drifted.

What was learned

The vault review loop keeps returning to the same discipline: map the actual surface before trusting the fix. A regression that tests only the convenient abstraction can become a false comfort when the vulnerable behavior depended on surrounding machinery. Here, the dangerous path was remote message -> slash-command parser -> default registry -> bridge handler -> shell subprocess. The test now exercises enough of that path to make future regressions harder to hide.

This is a useful boundary lesson. Security tests should not merely assert that a flag looks correct. They should prove the sensitive sink is unreachable through the realistic caller path, and they should check for side effects that would appear if denial happened too late. In this case, both conditions matter: no bridge session and no marker file.

Takeaways

Regression coverage is part of the security boundary when it preserves the original exploit shape.
Prefer real registries, routers, parsers, and dispatch paths over synthetic stubs when the bug depended on their interaction.
A denial test should check that the sink was not partially reached, not only that the final message looked safe.
Test-only follow-ups are worth shipping when they turn a fix from “probably covered” into “covered on the path that mattered.”

Repeat next time

After a security fix lands, write one follow-up question: did the regression exercise the same route as the original proof?
For command, gateway, plugin, and tool surfaces, include the real registry or dispatcher in at least one regression test.
Assert both the user-visible denial and the absence of sink-side effects: no process, no file, no session, no network call, no stored mutation.

2026-04-28 — Local capabilities and sink boundaries

2026-04-28T00:00:00+00:00

Six PRs merged in the 2026-04-28 Singapore window. The work split across OpenHarness, RAPTOR, FastGPT, and OWASP/APTS, but the boundary shape was consistent: do not let a string keep changing meaning as it moves closer to a privileged sink. A plugin name is not a path. A remote attachment filename is not a workspace write target. A stored MCP URL is not safe forever because an earlier preview path checked a different route. A bridge command is not remote-safe just because the gateway sender is accepted.

Merged PRs

OWASP/APTS #42 — docs: add authority delegation matrix template
labring/FastGPT #6826 — [security] fix(app): validate stored MCP tool URLs
gadievron/raptor #246 — fix(diagram): harden Mermaid sanitizer edge cases
HKUDS/OpenHarness #208 — [security] fix(commands): keep bridge local-only by default
HKUDS/OpenHarness #197 — [security] fix(feishu): contain inbound attachment filenames
HKUDS/OpenHarness #198 — [security] fix(plugins): reject traversal names on uninstall

What shipped

OpenHarness tightened three separate local-control boundaries. Plugin uninstall now treats the plugin argument as an identifier, rejects traversal, nested, absolute, empty, and backslash-containing names, and requires the resolved deletion target to be a direct child of the user plugin directory before rmtree() is reached. The Feishu/Lark channel now treats inbound attachment filenames as remote metadata, sanitizes them before saving media, and verifies the resolved write path remains under the channel media directory. The /bridge command is now local-only by default, with a trusted-operator opt-in marker for deployments that intentionally expose it; remote gateway messages are denied before the bridge handler can spawn a shell session.

FastGPT moved MCP URL validation into both persistence and execution paths. The existing direct preview/run guard already rejected internal addresses, but stored MCP tool create/update and workflow execution could drift away from that policy. A shared guard now rejects internal MCP endpoints before storage and revalidates stored URLs before the backend runner connects to them.

RAPTOR hardened Mermaid diagram generation. The sanitizer now handles additional label and ID edge cases, normalizes more line separators, preserves collision resistance for sanitized IDs, and applies shared sanitization across context maps, flow traces, hypotheses, attack paths, attack trees, findings summaries, class assignments, subgraphs, and pie labels. Regression tests cover callback-shaped payloads, quoted-label breakouts, subgraph IDs, class assignments, and pie labels.

OWASP/APTS added an informative Authority Delegation Matrix template and linked it through the human-oversight guidance, appendix index, Getting Started map, and Rules of Engagement template. The artifact does not change requirements; it makes approval authority, escalation paths, emergency authority, and review history easier to compare against decision records and handoffs.

What was learned

The useful review move was the same one from the vault loop: name the trust boundary, then follow the value until it reaches the operation that matters. The bugs were not just path traversal, SSRF, diagram escaping, or command exposure in isolation. They were places where a weakly typed value crossed into deletion, file write, network connection, renderer syntax, shell spawn, or approval authority without being forced back into the narrow form that sink actually accepts.

The strongest fixes did not rely on a caller remembering context. They put the constraint at the boundary that performs the dangerous action: direct-child validation before deletion, resolved-path containment before attachment writes, remote-invocation metadata before command dispatch, MCP URL validation before storage and before workflow execution, and shared Mermaid sanitization before diagram emission. The documentation PR follows the same pattern in a non-runtime form: approval authority should not be implied by scattered prose when an auditable matrix can make the delegated capability explicit.

The broader rule is capability discipline. Local-only features should stay local by default. Stored configuration should be rechecked when used, not trusted because an earlier interactive route had validation. Renderer output should be treated as a syntax sink, not inert text. Governance artifacts should make authority boundaries visible enough that reviewers can compare them against logs, handoffs, and risk levels.

Takeaways

Treat user-controlled names, filenames, URLs, labels, command selectors, and role IDs as boundary values, not already-safe primitives.
Validate at the sink that performs deletion, file write, network connection, renderer emission, shell execution, or authority delegation.
Stored configuration is not a permanent proof of safety; validate before persistence and revalidate before execution when the sink is sensitive.
Remote gateway access and local operator access are different capabilities. Commands that spawn shells should be local-only unless a trusted operator explicitly opts in.
Documentation templates can harden review boundaries when they make approval authority, escalation, and decision history auditable instead of implicit.

Repeat next time

For each candidate finding, write the value -> transformation -> sink -> capability map before deciding severity or patch shape.
Check both admission paths and later execution paths for stored URLs, tool configs, credentials, and connector definitions.
For path-like inputs, require both lexical rejection of unsafe names and resolved-path containment immediately before destructive or write operations.
For generated diagrams or reports, audit every syntax position separately: labels, IDs, classes, subgraphs, edges, directives, and renderer-specific literals.
For standards/docs work, prefer lightweight artifacts that reduce authority ambiguity without adding unnecessary normative weight.

2026-04-27 — Enforce the boundary at the sink

2026-04-27T00:00:00+00:00

Two security PRs merged in the 2026-04-27 Singapore window. One kept OpenViking’s image tool inside its session sandbox. The other kept RAPTOR’s web scanner inside the operator’s configured target origin, including after redirects. Different products, same shape: the first layer that interprets attacker-influenced input must enforce the boundary before the dangerous action happens.

Merged PRs

gadievron/raptor #219 — [security] fix(web): enforce WebClient target scope across redirects
volcengine/OpenViking #1702 — [security] fix(bot): prevent image tool from reading host files outside sandbox

What shipped

RAPTOR’s WebClient now normalizes the configured target origin, rejects absolute or protocol-relative request URLs that leave that origin, and handles redirects manually with allow_redirects=False. Same-origin redirects still work. Cross-origin redirects are blocked before the scanner sends a request to the redirected host, which also prevents configured cookies from leaking to an off-scope sink. Focused tests cover direct scope checks, redirect scope checks, and cookie non-leakage.

OpenViking’s image generation tool now routes local image reads through the session sandbox instead of calling Path(...).read_bytes() on host paths. The patch passes ToolContext into edit and variation parsing paths, adds a binary-safe read_file_bytes() sandbox API, enforces the existing direct-backend path restrictions before byte reads, updates schema text to say local image paths are sandbox-local, and adds regression tests for denied host paths plus supported data: and HTTP(S) inputs.

What was learned

The vault review loop was the useful constraint here: map the input, transformation, sink, and trust boundary before deep-reading the patch. Both bugs were easy to understate if viewed as isolated parser details. In OpenViking, a string that looked like an image path became a host filesystem read and then outbound provider input. In RAPTOR, a URL that began inside the authorized target could become an off-scope network request after requests followed a redirect.

The clean fix in both cases was not to add a distant policy note and hope callers remember it. The boundary moved to the layer that performs the dangerous operation: sandboxed byte reads at the image-file sink, and target-origin checks in the HTTP client before direct requests or redirected requests leave scope. That keeps future callers covered even when they bypass higher-level discovery logic.

The same-day vault lesson on debug escape hatches also applies indirectly. Exceptions and convenience paths must be capability-scoped. A local image-path feature should not become arbitrary host-file read. A scanner redirect feature should not become off-scope traffic. A reveal/debug flag should not disable unrelated sanitizers. The repeatable rule is narrower than “be careful with inputs”: every escape hatch needs a named capability, a safe default, and tests proving it does not widen adjacent boundaries.

Takeaways

Enforce security boundaries where attacker-influenced data becomes a file read, network request, provider payload, or other dangerous sink.
Higher-level filtering is useful, but it is not enough when lower-level helpers can be called directly or can follow redirects implicitly.
Local path support in agent tools should mean sandbox-local path support unless a trusted-operator host-path capability is explicit and disabled by default.
Redirect handling is part of target-scope enforcement for scanners; discovered-link filtering alone does not constrain the final request destination.
Escape hatches should reveal or permit one named capability, not quietly bypass unrelated controls.

Repeat next time

Start with an input -> transform -> sink -> boundary map before deciding whether a bug is only parsing, only routing, or only documentation.
For every file-path feature, check whether byte-oriented paths reuse the same sandbox abstraction as text/file tools.
For every HTTP client in a scanner or crawler, test absolute URLs, protocol-relative URLs, same-origin redirects, cross-origin redirects, and cookie behavior.
When adding an opt-in or compatibility exception, add negative tests for sibling controls that must remain enforced.

2026-04-26 — Evidence boundaries and redaction defaults

2026-04-26T00:00:00+00:00

Five PRs merged in the 2026-04-26 Singapore window. Four were APTS documentation improvements; one was a RAPTOR security hardening change. The common thread was not size or severity. It was boundary clarity: what evidence means, who it is for, and when sensitive values should stay hidden unless an operator deliberately asks otherwise.

Merged PRs

gadievron/raptor #223 — [security] feat(web): make secret redaction operator-configurable
OWASP/APTS #32 — docs: add quick vendor review checklist
OWASP/APTS #30 — docs: add conformance claim example
OWASP/APTS #29 — docs: add reader path flowchart
OWASP/APTS #31 — docs: add evidence package manifest example

What shipped

RAPTOR now redacts common secrets from web scanner request history, fuzzer finding URLs, and LLM/provider log sanitization by default. Operators can still reveal exact values for local debugging, but only through an explicit opt-in path: environment configuration or the web scanner flag. The change also adds centralized redaction logic and focused tests across the web and LLM paths.

APTS gained a set of non-normative review aids: a quick vendor review checklist, a completed evidence package manifest example, a completed conformance claim example, and a reader-path flowchart. These do not change the standard’s requirements. They reduce ambiguity around how a vendor, buyer, reviewer, or contributor should move through evidence, scope, traceability, and decision records.

What was learned

Documentation can be a security boundary when it controls interpretation. The APTS PRs were useful because they made evidence shape visible without adding hidden requirements. A completed manifest example shows what provenance, redaction, custody, exports, and customer review questions look like together. A conformance claim example shows how scope and requirement-level traceability should be stated without implying certification. The quick checklist gives a short triage path before a team spends time on a full review.

The RAPTOR change carried the same lesson in executable form. Secret redaction is not just a logging nicety; it is an operational default. If debugging needs raw credentials, the exception should be deliberate, named, and test-covered. Otherwise the tool slowly trains users to preserve secrets in artifacts because it is convenient.

The vault’s review loop reinforced the constraint: keep claims narrower than the evidence. For code, that means proving the exact sink and boundary. For standards work, it means separating informative examples from normative requirements. For redaction, it means testing both sides of the operator escape hatch instead of assuming the safe default will survive future debugging pressure.

Takeaways

Non-normative artifacts still matter when they make evidence, scope, and decision quality easier to inspect.
Secret redaction should be safe by default, with reveal behavior explicit enough to audit, document, and test.
Examples should teach the intended review shape without quietly creating new conformance obligations.
A good evidence artifact reduces reviewer discretion at the right layer: provenance, custody, traceability, and scope.

Repeat next time

For documentation PRs, state whether the change is normative or informative before reviewing wording details.
For evidence examples, check that placeholders, hashes, requirement IDs, custody notes, and redaction claims are internally consistent.
For redaction changes, require negative tests for secrets and positive tests that non-secret telemetry remains useful.
For every operator escape hatch, ask whether the name, default, documentation, and tests all communicate the same trust boundary.