Incidents & watchers

Watchers are conditional rules over events or scheduled checks. When a watcher's condition holds, it fires an incident — a stateful, dedup'd record that goes through firing → acknowledged → resolved (or escalated).

This is the AI Automation equivalent of "monitoring alerts" — but designed for business signals (high-risk transactions per hour, KYC funnel drop-off, queue depth on a partner integration) instead of infrastructure metrics.

Watcher object

{
  "id": "wch_01HXY...",
  "name": "PEP screening backlog > 50",
  "source": {
    "kind": "cron_check",
    "cronExpression": "*/5 * * * *",
    "serviceSlug": "aml",
    "fetchPath": "/api/screenings?status=pending&limit=1",
    "fetchMethod": "GET"
  },
  "condition": { "field": "total", "op": "gt", "value": 50 },
  "severity": "critical",
  "silenceWindowSec": 900,
  "dedupKey": "$.severity",
  "channelIntegrationIds": ["int_slack_compliance..."],
  "emailTo": ["mlro@bankacme.id"],
  "status": "active",
  "triggerToken": "wtok_...",
  "triggerSecret": "wsec_... (only on read)"
}

Source kinds

`source.kind`	What it does
`sibling_event`	Subscribes to a sibling product's webhook events (e.g. `aml.alert.created`). Evaluates condition against the event payload.
`flow_output`	Re-evaluates a flow's output every time the flow finishes. Useful for catching cases such as "LLM said something concerning."
`cron_check`	On a schedule, fetches from a sibling API and evaluates condition against the response.
`manual`	Push mode — your service POSTs to the watcher's public HMAC-signed URL with arbitrary payload.

Condition operators

Op	Meaning
`gt` `lt` `gte` `lte`	Numeric comparison.
`eq` `ne`	Strict equality / inequality.
`contains`	Substring (string) or membership (array).
`in`	Value is in a fixed list.
`regex`	Anchored regular-expression match against a string (RFC 9485 I-Regexp syntax).
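
As a rough illustration of how these operators behave, the following Python sketch evaluates a condition object against a payload. It is not the service's implementation: the `evaluate` function, the top-level field lookup, the choice to treat a missing field as a non-match, and the use of Python's `re.fullmatch` as a stand-in for an anchored match are all assumptions. Python's regex dialect also differs from RFC 9485.

```python
import re
from typing import Any, Callable, Dict, Mapping

# Each operator takes (actual, expected) and returns a bool.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "gt": lambda a, e: a > e,
    "lt": lambda a, e: a < e,
    "gte": lambda a, e: a >= e,
    "lte": lambda a, e: a <= e,
    "eq": lambda a, e: a == e,
    "ne": lambda a, e: a != e,
    # Substring when `a` is a string, membership when `a` is a list.
    "contains": lambda a, e: e in a,
    # `e` is the fixed list from the condition.
    "in": lambda a, e: a in e,
    # fullmatch anchors the pattern at both ends (assumed meaning of "anchored").
    "regex": lambda a, e: isinstance(a, str) and re.fullmatch(e, a) is not None,
}


def evaluate(condition: Mapping[str, Any], payload: Mapping[str, Any]) -> bool:
    """Return True when the payload satisfies the condition (hypothetical helper)."""
    op = condition["op"]
    if op not in OPERATORS:
        raise ValueError(f"unknown operator: {op}")
    # Assumption: `field` names a top-level key of the payload.
    actual = payload.get(condition["field"])
    if actual is None:
        return False  # Assumption: a missing field never matches.
    return OPERATORS[op](actual, condition["value"])
```

For example, `evaluate({"field": "total", "op": "gt", "value": 50}, {"total": 87})` matches, while a `total` of exactly 50 does not.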

Dedup + silence

silenceWindowSec suppresses re-firings of the same logical incident within the window. The dedupKey field (jsonPath-style) is evaluated against the firing payload to decide what counts as "same logical incident." Without a dedup key, every match creates a new incident.

A common pattern: dedupKey: "$.customerId" so multiple alerts for the same customer collapse into one incident with fireCount ticking up.
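
The interaction between the dedup key and the silence window can be sketched in a few lines of Python. This is a simplified in-memory model, not the service's code. `Deduper`, `Incident`, and `resolve_path` are hypothetical names, and only simple `$.a.b` paths are supported. Two further assumptions: the window is measured from the last firing, and a match arriving after the window creates a new incident.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Incident:
    dedup_value: Any
    fire_count: int
    first_fired_at: float
    last_fired_at: float


def resolve_path(path: str, payload: Any) -> Any:
    """Resolve a simple '$.a.b' path; returns None when a segment is missing."""
    if not path.startswith("$."):
        raise ValueError("only '$.field' style paths are supported in this sketch")
    node = payload
    for key in path[2:].split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


class Deduper:
    def __init__(self, dedup_key: Optional[str], silence_window_sec: int) -> None:
        self.dedup_key = dedup_key
        self.silence_window_sec = silence_window_sec
        self.incidents: List[Incident] = []
        self._latest: Dict[Any, Incident] = {}

    def fire(self, payload: Dict[str, Any], now: float) -> Incident:
        """Record a condition match at time `now` (seconds) and return its incident."""
        if self.dedup_key is None:
            # Without a dedup key, every match creates a new incident.
            incident = Incident(None, 1, now, now)
            self.incidents.append(incident)
            return incident
        value = resolve_path(self.dedup_key, payload)
        current = self._latest.get(value)
        if current is not None and now - current.last_fired_at < self.silence_window_sec:
            # Same logical incident inside the window: bump counters only.
            current.fire_count += 1
            current.last_fired_at = now
            return current
        incident = Incident(value, 1, now, now)
        self._latest[value] = incident
        self.incidents.append(incident)
        return incident
```

With `Deduper("$.customerId", 900)`, two matches for the same customer five minutes apart collapse into one incident with `fire_count == 2`, while a match for a different customer opens a second incident.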

Create a watcher

POST /api/watchers

Auth · API key | Scope · watchers:write

curl -X POST .../api/watchers \
  -H "Authorization: Bearer $QE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Sandbox SAR submitted by non-MLRO",
    "source": {
      "kind": "sibling_event",
      "serviceSlug": "aml",
      "eventName": "sar.submitted"
    },
    "condition": { "field": "submittedByRole", "op": "ne", "value": "mlro" },
    "severity": "critical",
    "silenceWindowSec": 0,
    "channelIntegrationIds": ["int_slack_compliance..."]
  }'

The response includes triggerToken and triggerSecret (the push-mode endpoint credentials) even for non-manual sources, so you can switch the source kind later without re-issuing credentials.
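
This section does not specify how push-mode requests are signed, so the sketch below only shows the general shape of HMAC signing with a shared secret. Everything specific is an assumption: HMAC-SHA256, signing the raw request body, hex encoding, and the helper names `sign_body` and `verify_signature`. The secret value and payload are placeholders. Check the watcher's push-mode reference for the real scheme before relying on this.

```python
import hashlib
import hmac
import json


def sign_body(trigger_secret: str, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed by the docs)."""
    return hmac.new(trigger_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(trigger_secret: str, body: bytes, signature: str) -> bool:
    """Constant-time comparison, as a receiver would typically do."""
    return hmac.compare_digest(sign_body(trigger_secret, body), signature)


# Serialize once and sign exactly the bytes that will be sent.
body = json.dumps({"queueDepth": 120}, separators=(",", ":")).encode("utf-8")
signature = sign_body("wsec_example", body)  # placeholder secret
```

Signing the exact bytes that go on the wire matters: re-serializing the JSON after signing can change whitespace or key order and invalidate the signature.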

Test a watcher

POST /api/watchers/{id}/test

Auth · API key | Scope · watchers:write

{ "payload": { "submittedByRole": "analyst", "sarId": "sar_test..." } }

Returns { matched: true, severity, dedupKey } without actually firing an incident. Use before activating.

Incident lifecycle

firing: newly created

acknowledged: someone is on it

resolved: condition cleared

Re-firings within the silence window bump fireCount and update lastFiredAt but don't create a new incident row. firing can also transition directly to resolved (no ack required), and either step can escalate to the next-tier owner.
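
The transitions described above can be written down as a small state table. This is a sketch for client-side validation, with hypothetical names (`ALLOWED`, `transition`). The transitions out of `escalated` and the treatment of `resolved` as terminal are assumptions, since the text does not spell them out.

```python
from typing import Dict, Set

ALLOWED: Dict[str, Set[str]] = {
    # firing may be acked, resolved directly, or escalated.
    "firing": {"acknowledged", "resolved", "escalated"},
    "acknowledged": {"resolved", "escalated"},
    # Assumption: an escalated incident can still be acked or resolved.
    "escalated": {"acknowledged", "resolved"},
    # Assumption: resolved is terminal.
    "resolved": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise ValueError for a disallowed move."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move incident from {current!r} to {target!r}")
    return target
```

A dashboard could use such a table to decide which action buttons to show for an incident in a given status.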

Incident endpoints

GET /api/incidents

Auth · API key | Scope · incidents:read

GET /api/incidents/{id}

Auth · API key | Scope · incidents:read

POST /api/incidents/{id}/ack

Auth · API key | Scope · incidents:write

POST /api/incidents/{id}/resolve

Auth · API key | Scope · incidents:write

POST /api/incidents/{id}/escalate

Auth · API key | Scope · incidents:write

POST /api/incidents/{id}/comments

Auth · API key | Scope · incidents:write

List filters: status (firing · acknowledged · escalated · resolved), severity, watcherId, q (free-text on title), limit (default 100, max 500). The list response also includes totalsByStatus so dashboards don't need a second roundtrip.

{
  "data": {
    "incidents": [
      {
        "id": "inc_01HXY...",
        "watcherId": "wch_01HXY...",
        "watcher": { "id": "wch_01HXY...", "name": "PEP screening backlog > 50" },
        "severity": "critical",
        "status": "firing",
        "title": "PEP screening backlog > 50 (currently 87)",
        "summary": "...",
        "fireCount": 4,
        "firstFiredAt": "2026-05-25T07:35:00Z",
        "lastFiredAt": "2026-05-25T08:10:00Z"
      }
    ],
    "totalsByStatus": { "firing": 3, "acknowledged": 1, "resolved": 27 }
  }
}
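
A client might assemble the list filters and read the totals roughly as follows. This is a sketch under assumptions: it takes the filters to be query-string parameters with the names listed above, treats 1 as the lower bound for `limit`, and uses hypothetical helpers `incidents_query` and `open_count`.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

VALID_STATUSES = {"firing", "acknowledged", "escalated", "resolved"}


def incidents_query(
    status: Optional[str] = None,
    severity: Optional[str] = None,
    watcher_id: Optional[str] = None,
    q: Optional[str] = None,
    limit: int = 100,
) -> str:
    """Build the request path for GET /api/incidents with the given filters."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    if not 1 <= limit <= 500:  # documented max is 500; the lower bound is assumed
        raise ValueError("limit must be between 1 and 500")
    params = {
        "status": status,
        "severity": severity,
        "watcherId": watcher_id,
        "q": q,
        "limit": limit,
    }
    # Drop unset filters; urlencode handles escaping of free-text values.
    return "/api/incidents?" + urlencode({k: v for k, v in params.items() if v is not None})


def open_count(totals_by_status: Dict[str, int]) -> int:
    """Count incidents that are not resolved, using totalsByStatus from the list response."""
    return sum(n for status, n in totals_by_status.items() if status != "resolved")
```

With the sample response above, `open_count` returns 4 (3 firing plus 1 acknowledged), without a second request.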

Ack / resolve / escalate

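curl -X POST .../api/incidents/inc_01HXY.../ack \
  -H "Authorization: Bearer $QE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "note": "Working on it — adding two more screeners for the next hour." }'

curl -X POST .../api/incidents/inc_01HXY.../resolve \
  -H "Authorization: Bearer $QE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "note": "Backlog drained; root cause was a paused screening rule." }'

curl -X POST .../api/incidents/inc_01HXY.../escalate \
  -H "Authorization: Bearer $QE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "to": "usr_compliance_director_...", "policy": "compliance_chain" }'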

Webhooks

Event	Fires when
`incident.fired`	New incident created.
`incident.refired`	Same incident re-fired within the silence window.
`incident.acknowledged`	Someone acked.
`incident.resolved`	Resolved.
`incident.escalated`	Manually escalated.
`incident.auto_resolved`	Self-resolved without action (cron_check watcher's condition flipped back).

See the Webhooks page for payload shape.
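
A receiver typically routes these events by name. The sketch below shows one way to do that in Python. The handler names and the actions they return are hypothetical. It deliberately takes the event name and the incident object as separate arguments, because how they are extracted from a delivery depends on the payload shape documented on the Webhooks page.

```python
from typing import Any, Callable, Dict, Optional

Handler = Callable[[Dict[str, Any]], str]


def on_fired(incident: Dict[str, Any]) -> str:
    # Placeholder action: a real handler would page someone or open a ticket.
    return f"page on-call for {incident['id']}"


def on_closed(incident: Dict[str, Any]) -> str:
    return f"close ticket for {incident['id']}"


HANDLERS: Dict[str, Handler] = {
    "incident.fired": on_fired,
    "incident.resolved": on_closed,
    "incident.auto_resolved": on_closed,
}


def dispatch(event_name: str, incident: Dict[str, Any]) -> Optional[str]:
    """Run the handler registered for the event, or return None if it is ignored."""
    handler = HANDLERS.get(event_name)
    return handler(incident) if handler else None
```

Events without a registered handler, such as `incident.refired` here, are ignored rather than treated as errors, so new event types do not break the receiver.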

Watchers vs flow logic

A watcher's job is detection. A flow's job is response. If your incident should automatically do something, point a flow's webhook trigger at the incident.fired event — the flow runs the playbook (open a ticket, page on-call, etc.) and the watcher stays focused on "is this still bad?"
