Update the skills to support creating and using markers#17
Conversation
There was a problem hiding this comment.
Really nice expansion — the marker lifecycle walkthrough, the deploy-time vs save-time validation split, and the cleanup-order section are exactly the kind of detail an agent needs. I cross-checked the documented contracts against the current server and gateway behavior and left a handful of suggestive line comments. Two themes worth calling out:
- Release ordering — a few sections describe server behavior that's still in flight (
writableKeySchema, marker tools available without the feature flag). Might be worth holding this until that ships, or noting the pairing in the PR description. - Marker scope — session state is durably scoped to tenant + user, so a couple of "fresh session" verification/recovery steps won't behave as described.
One small thing I couldn't attach to a line since the file isn't in the diff: dtwo/README.md's skills table still scopes dtwo-gateway-policy to "policies and pipelines" — a mention of markers (and gated intent) there would help routing.
Also, good news from the same pass: the canonical writer Rego checks out end-to-end (egress response content really is exposed as input.payload.text, a list of strings), dtwo-delete-policy behaves exactly as documented, and every cross-referenced section name resolves. 🎉
|
|
||
| ### Marker Registry Tools | ||
|
|
||
| Markers are session-state flags that one policy writes and other policies read to gate on (see Managing Markers). These tools are **always registered** on the DTwo MCP server — they do not depend on any feature flag. |
There was a problem hiding this comment.
suggestion: As of current server builds, the marker tools are still registered behind the same feature flag as the intent tools — the change that makes them unconditionally available is still in flight on the server side. If this doc is meant to land together with that change, all good — but the emphatic "always registered … do not depend on any feature flag" could send agents after tools that aren't there on deployments that predate it. Maybe soften to "registered by default (as of )", or hold the merge until the server change ships?
| | `dtwo-add-policy` | Validate and create a new policy (requires name, description, policy, packageName, direction) | | ||
| | `dtwo-update-policy` | Update an existing policy's draft — any field (policy, packageName, name, description, direction, tags). Validates Rego when both policy and packageName are provided | | ||
| | `dtwo-add-policy` | Validate and create a new policy (requires name, description, policy, packageName, direction). Optionally pass `writableKeySchema` to declare the session-state keys the policy is authorized to write — required for any policy that emits a marker (see Managing Markers) | | ||
| | `dtwo-update-policy` | Update an existing policy's draft — any field (policy, packageName, name, description, direction, tags, `writableKeySchema`). Validates Rego when both policy and packageName are provided. `writableKeySchema` is tri-state: omit → leave unchanged, `null` → clear, `[]` → explicit-empty, `[...]` → set | |
There was a problem hiding this comment.
note: writableKeySchema (here and on dtwo-add-policy above, including the tri-state semantics) matches the in-flight server change exactly, but currently released servers don't accept the field yet — an agent following this against one of those would get a schema rejection. Purely a merge-ordering thing; a note about the pairing in the PR description would cover it. 👍
| - `name` — the session-state key, matching the registered marker exactly (e.g. `marker:acme:pii_detected`). | ||
| - **Marker key format enforcement.** The tool boundary rejects `writableKeySchema[].name` values starting with `marker:` unless they match `^marker:[A-Za-z0-9][A-Za-z0-9_-]*:[A-Za-z0-9][A-Za-z0-9_-]*$` — exactly two colon-separated segments of alphanumerics, underscores, or hyphens, each segment starting with an alphanumeric (case-insensitive). Bare keys (no `marker:` prefix) are structurally validated by the backend instead. Registry *existence* — that the key is actually in the marker registry — is enforced separately at deploy time by the `marker-not-in-registry` rule (the deploy fails with `UnknownMarkerReferences`; see the Deploy-time validator note below). | ||
| - `jsonSchema` — a **stringified JSON object** (a JSON Schema) for the value the policy writes. Rejected at the tool boundary if it doesn't parse as a JSON object (arrays and primitives fail). Use it strictly (`additionalProperties: false`, `required` lists) so drift is caught. Add `"x-d2-is-marker": true` for marker keys. | ||
| - `ttlSeconds` — per-key TTL. For a marker key this must be **≥ the marker's registered `minimumTtlSeconds`**; a lower value is rejected at policy save time. |
There was a problem hiding this comment.
suggestion: I couldn't find this save-time rejection in the current implementation (the TTL is only checked for being a positive integer, and the deploy-time validator checks registry membership, not TTL) — so a below-floor value currently saves and deploys quietly. Until enforcement ships, maybe phrase it as "must be ≥ the marker's registered minimumTtlSeconds (keep them in sync manually — not yet enforced)"? Same tweak applies to the twin sentence in dtwo-policy-rego (§ writableKeySchema) and the short-TTL testing tip below, which leans on this check existing.
| 1. **Confirm the deploy and attachment** as for any policy — poll `dtwo-get-deployment` to `completed`, then `dtwo-get-gateway-pipelines` to confirm both the writer and the reader landed with the expected `evalNamespace` and version pins. | ||
| 2. **Trigger the writer first, in one session.** Make the tool call that satisfies the writer's condition (e.g. a response containing PII). This is what stamps the marker — nothing is active until the writer fires. | ||
| 3. **Then exercise the reader in that same session.** Call the reader's guarded tool and confirm it now denies (or transforms) as intended. Because markers are session-scoped, this only proves out within the session where the writer fired. | ||
| 4. **Confirm the negative case in a fresh session.** With no writer having fired (or after the TTL expires), the reader's tool should succeed — proving the reader blocks only when the marker is active, not unconditionally. |
There was a problem hiding this comment.
suggestion: This step (and the "start a new session" recovery advice in Marker constraints today below) may not behave as described: the gateway scopes session state durably to tenant + user, not per connection — so a marker stamped by the same user persists into a brand-new session until its TTL expires, and this "fresh session" negative test would spuriously deny. Suggest making the negative case TTL-based (short TTL + wait it out, as your testing tip already recommends) and describing recovery as TTL-expiry only. Relatedly, dtwo-policy-rego's "scoped by tenant and user (not by server)" wording is the accurate one — worth standardizing both files on that and dropping the "session-scoped" phrasing.
|
|
||
| | Tool | Purpose | | ||
| |------|---------| | ||
| | `dtwo-set-intent` | Declare the current session intent (the working intent captured by the gateway's egress capture policy). Accepts only intents registered in the registry | |
There was a problem hiding this comment.
nit: "Accepts only intents registered in the registry" slightly overstates the current behavior — the tool presently resolves against a built-in vocabulary (registry-driven resolution looked like planned follow-up work), so an intent that's in the registry but not in the built-in set is rejected. Maybe "Accepts only intents from the built-in vocabulary (registry-driven resolution planned)"?
| |------|---------| | ||
| | `dtwo-list-gateways` | List gateways with optional filters (name, status, uid) | | ||
| | `dtwo-get-gateway` | Fetch a single gateway by UID | | ||
| | `dtwo-get-gateway-config` | Fetch the gateway's YAML configuration. Used here **read-only** to discover `mcp_servers[].name` and tool names when authoring policies (see Tool Discovery). Editing gateway YAML belongs to the companion `dtwo-gateway-config` skill | |
There was a problem hiding this comment.
suggestion: dtwo-get-gateway-config returns the draft configuration (as the companion dtwo-gateway-config skill notes), which can diverge from what's actually deployed. Since Tool Discovery here feeds the Dtwo upstream-name check that the Intent Capture section flags as a silent-bypass risk, a one-line caveat like "note this returns the draft — confirm names against the published/deployed config before relying on them" would close that loop nicely.
| A policy that emits a marker must declare the key in its `writableKeySchema` (on `dtwo-add-policy` / `dtwo-update-policy`), or the gateway drops the write. Each entry is: | ||
|
|
||
| - `name` — the session-state key, matching the registered marker exactly (e.g. `marker:acme:pii_detected`). | ||
| - **Marker key format enforcement.** The tool boundary rejects `writableKeySchema[].name` values starting with `marker:` unless they match `^marker:[A-Za-z0-9][A-Za-z0-9_-]*:[A-Za-z0-9][A-Za-z0-9_-]*$` — exactly two colon-separated segments of alphanumerics, underscores, or hyphens, each segment starting with an alphanumeric (case-insensitive). Bare keys (no `marker:` prefix) are structurally validated by the backend instead. Registry *existence* — that the key is actually in the marker registry — is enforced separately at deploy time by the `marker-not-in-registry` rule (the deploy fails with `UnknownMarkerReferences`; see the Deploy-time validator note below). |
There was a problem hiding this comment.
nit: "(case-insensitive)" is a bit of a trap here — the validation accepts either case but nothing normalizes, and keys are exact-match strings, so a case mismatch between the registry entry and session_writes silently drops the write. Suggest dropping "(case-insensitive)" or replacing with "mixed case accepted but never normalized — keys are exact-match, so keep casing identical everywhere (lowercase recommended)".
| The Rego bodies and the two coupling requirements below are detailed in `dtwo-policy-rego` (Intent-capture policies). The two requirements that block them silently if missed: | ||
|
|
||
| - **Upstream server must be named `Dtwo`.** Both policies fire on tool names case-folding to `dtwo-set-intent` / `dtwo-set_intent`. The DTwo MCP upstream entry in the gateway's `mcp_servers` config **must** be named `Dtwo` (anything case-folding to `dtwo`). A different name silently bypasses capture and deadlocks the ingress gate. | ||
| - **UID placeholder swap.** Both policies share the capture policy's own UID (`REPLACE-WITH-INTENT-CAPTURE-POLICY-UID`). After `dtwo-add-policy` returns the real UID, replace the placeholder in both bodies and re-save. Until swapped, the ingress gate denies every non-`set_intent` call and the egress capture treats every set as first-set (no transition enforcement). |
There was a problem hiding this comment.
suggestion: The egress starter policy's unswapped-placeholder behavior is the opposite of "treats every set as first-set (no transition enforcement)" — it deliberately hard-denies every set_intent while the placeholder is present, so the unconfigured state is loud rather than silent. Worth fixing so an agent debugging a blanket set_intent deny gets pointed at the missed swap rather than away from it. Same sentence appears in dtwo-policy-rego (§ Intent-capture policies).
| _pii_active | ||
| } | ||
|
|
||
| reason := "PII was detected earlier in this session; outbound Slack sends are blocked. Wait for the marker TTL to expire or start a new session." if { |
There was a problem hiding this comment.
nit: Tiny follow-on from the scope discussion — since markers persist across sessions for the same user (durable tenant+user scope), "or start a new session" in this user-facing deny reason (and the bullet just below) would set the wrong expectation. "Wait for the marker TTL to expire" alone is the accurate advice.
| "payload": {}, // Hook-specific data (see Payload by Hook Type) | ||
| "tool_metadata": {} | null, // Tool definition (name, url, auth_type, gateway_id, input_schema, ...) | ||
| "context": {}, // Mirror of top-level fields (correlation_id, payload, tool_metadata, ...) | ||
| "context": {}, // Mirror of top-level fields (correlation_id, payload, tool_metadata, ...) plus context.session.policies[<writer_uid>][<key>] for reading markers / session state — see Session State & Markers |
There was a problem hiding this comment.
nit: Keeping "Mirror of top-level fields" while adding context.session.policies slightly breaks the mirror invariant — there's no top-level input.session, so an author generalizing the mirror rule might write input.session.policies[...], which silently never matches. Maybe "Mirror of top-level fields …; additionally carries context.session.policies[<writer_uid>][] (context-only, no top-level equivalent)"? The PARC Context bullet above could pick up session in its enumeration at the same time.
oschaaf
left a comment
There was a problem hiding this comment.
Some comments/suggestions via Claude for you to assess, with that LGTM.
Implements #1167