Hermes Integration

Hermes is Axiom-Labs' AI gateway and agent runtime; Forge is built to talk to Hermes-style agents bidirectionally. Hermes remains the first-class provider in Forge, even though the agent registry now also names Claude, Codex, and custom MCP clients explicitly.

This page is for the operator wiring an existing Hermes profile into Forge — or anyone who wants to understand the contract well enough to implement a compatible agent runtime themselves.

The handle: profileKey

A Forge Agent.profileKey is the same string as a Hermes profile directory name. If your Hermes deployment has profiles at ~/.hermes/profiles/victor/ and ~/.hermes/profiles/mizu/, the corresponding Forge agents have profileKey = "victor" and profileKey = "mizu".

That parity is the contract. Both sides recognize the same handle, so operators don't carry a mental translation table.

INFO

profileKey is unique per workspace, not globally. Two workspaces can each have a victor agent with no collision; that's fine and intentional.
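As a rough illustration of that parity, the sketch below derives a profileKey from a Hermes profile directory and builds a workspace-scoped lookup key. Both helper names (profileKeyFromDir, agentLookupKey) and the `workspaceId:profileKey` key format are hypothetical, chosen only to show the per-workspace uniqueness rule.

```typescript
// Sketch: derive a Forge profileKey from a Hermes profile directory path.
// Assumes the <root>/profiles/<name>/ layout described above.
function profileKeyFromDir(profileDir: string): string {
  // Drop trailing slashes, then take the last path segment.
  const trimmed = profileDir.replace(/\/+$/, "");
  const name = trimmed.slice(trimmed.lastIndexOf("/") + 1);
  if (name === "") throw new Error(`no profile name in path: ${profileDir}`);
  return name;
}

// profileKey is unique per workspace, so a lookup key needs both parts.
// The "workspaceId:profileKey" format is an illustrative assumption.
function agentLookupKey(workspaceId: string, profileKey: string): string {
  return `${workspaceId}:${profileKey}`;
}
```

With these helpers, two workspaces that each have a victor agent produce different lookup keys, which matches the "no collision" rule above.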

The push direction: dispatch via webhook (wake signal only)

When the dispatcher selects an agent for an issue, Forge POSTs an envelope to the agent's webhookUrl. Hermes' inbound webhook adapter validates the signature, looks up the routed profile by the in-payload metadata, and hands the work off to the right agent loop.

Webhooks wake; the inbox is the source of truth.

The wake envelope is a low-latency nudge. The canonical record of "Victor owes work on AXI-42" is the AgentRun row that Forge created at assignment time, in the same transaction as the ActivityEvent. If the wake POST is dropped, retried, or arrives out of order, the agent can still recover by calling mcp_forge_agent_inbox_list. The required Hermes loop on every wake is:

mcp_forge_agent_inbox_list({ status: "unacked" }) — find the run.
mcp_forge_agent_inbox_ack({ runId }) — clear the operator's "wake sent" indicator.
mcp_forge_agent_context_bundle({ issueId }) — read the truth.
Act via forge_issues_* / forge_comments_* / forge_chat_*.

Skipping the ack is what leaves Forge's UI showing an infinite thinking animation. Always ack, even for [SILENT] no-ops.
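The loop above can be sketched as follows. The McpClient interface and the InboxRun shape (an array of objects carrying runId and issueId) are assumptions made for illustration; only the three tool names come from this page, and the real response fields may differ.

```typescript
// Hypothetical MCP client: one method that calls a tool by name.
interface McpClient {
  call(tool: string, args: Record<string, unknown>): Promise<unknown>;
}

// Assumed shape of an inbox entry; the real fields may differ.
interface InboxRun {
  runId: string;
  issueId: string;
}

async function handleWake(mcp: McpClient): Promise<string[]> {
  const handled: string[] = [];
  // 1. The inbox, not the webhook body, says what work is owed.
  const runs = (await mcp.call("mcp_forge_agent_inbox_list", {
    status: "unacked",
  })) as InboxRun[];

  for (const run of runs) {
    // 2. Ack first so the operator's "wake sent" indicator clears,
    //    even if the run turns out to be a no-op.
    await mcp.call("mcp_forge_agent_inbox_ack", { runId: run.runId });
    // 3. Read the current truth before acting.
    await mcp.call("mcp_forge_agent_context_bundle", { issueId: run.issueId });
    // 4. Acting on the bundle (issue, comment, or chat tools) is omitted here.
    handled.push(run.runId);
  }
  return handled;
}
```

Because the handler reads the inbox rather than the webhook payload, a duplicated or out-of-order wake simply finds nothing left to ack.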

Engagement mode is part of the same contract. Forge includes the mode instruction and shared run protocol in dispatch, and agent.context.bundle returns runProtocol with the current runId, contract version, mode, and mutation allowance. Forge MCP rejects issue-state mutations from active Research/Review/Discuss runs. Hermes /v1/runs dispatch also carries:

json

{
  "engagement_mode": "REVIEW",
  "forge_contract_version": "2026-06-06.2",
  "tool_allowlist": [],
  "runtime_policy": {
    "contract_version": "2026-06-06.2",
    "engagement_mode": "REVIEW",
    "allowed_host_tools": [],
    "enforcement_layers": [
      { "kind": "forge-mcp", "enforced": true },
      { "kind": "hermes-host", "enforced": true }
    ]
  }
}

The Hermes host must honor tool_allowlist for host-side terminal/filesystem/git enforcement. Mark the Forge Runtime config with modeToolPolicyEnforced: true only after that host behavior is actually enabled. The allowlist controls local host surfaces only: restricted modes disable Hermes terminal, file-patching, code-execution, and desktop-local toolsets, while skills, memory, web/search, Forge context tools, and Hermes delegation remain available. Delegated subagents inherit the same disabled local toolsets, so a Review/Research run can still use Hermes orchestration without regaining repo write access. If a Hermes profile exposes tools while ignoring the allowlist, Forge still blocks Forge MCP issue mutations, but host-side tool restrictions are then enforced only by the prompt, not by the runtime.
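One way a host could apply the allowlist is sketched below. The toolset labels are taken from the prose above rather than from Hermes itself, and the rule "a local toolset is enabled only if it appears in tool_allowlist" is an assumption about how the host interprets the field.

```typescript
// Labels for local host toolsets, taken from the prose above.
// They are illustrative, not confirmed Hermes identifiers.
const LOCAL_HOST_TOOLSETS: readonly string[] = [
  "terminal",
  "file-patching",
  "code-execution",
  "desktop-local",
];

// The subset of the dispatch payload this sketch reads.
interface RunDispatch {
  engagement_mode: string;
  tool_allowlist: string[];
}

// Assumption: a local host toolset stays enabled only when it is listed in
// tool_allowlist; non-local toolsets are never gated by the allowlist.
function enabledToolsets(dispatch: RunDispatch, available: string[]): string[] {
  const allowed = new Set(dispatch.tool_allowlist);
  return available.filter(
    (toolset) => !LOCAL_HOST_TOOLSETS.includes(toolset) || allowed.has(toolset),
  );
}
```

Under that assumption, a REVIEW dispatch with an empty allowlist keeps memory and web/search but drops the terminal, and the same filter would be applied to delegated subagents.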

Manual assignment is enough to start a Hermes agent. Operators do not have to also @mention the agent on the same issue. The webhook prompt should load the full issue context — including recent comments and attachments — via the context bundle before acting, so a comment written immediately before assignment becomes part of the agent's instructions.

For follow-up comments after an agent is already assigned, use an explicit @profileKey mention when you want the agent woken immediately. Plain comments are still persisted on the issue, but they are not treated as targeted agent dispatch unless the agent is mentioned or watching the issue.
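The wake rule for follow-up comments can be sketched as a small predicate. The mention-matching regex is an assumption about what counts as an @profileKey mention; Forge's actual parser is not described on this page, and both function names are hypothetical.

```typescript
// Assumed mention syntax: "@<profileKey>" not glued to a longer word.
function mentionsAgent(body: string, profileKey: string): boolean {
  // Escape regex metacharacters so the key is matched literally.
  const escaped = profileKey.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`(^|[^\\w@])@${escaped}(?![\\w-])`);
  return pattern.test(body);
}

// A comment wakes the agent only if it mentions the agent or the agent
// is watching the issue; otherwise it is just persisted.
function shouldWakeOnComment(
  body: string,
  profileKey: string,
  isWatching: boolean,
): boolean {
  return isWatching || mentionsAgent(body, profileKey);
}
```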

The minimum agent webhook contract:

http

POST /agents/forge HTTP/1.1
Host: hermes.example.com
Content-Type: application/json
X-Forge-Timestamp: 1714080000
X-Forge-Signature: sha256=<hex>
X-Forge-Body-Signature: sha256=<hex>
X-Forge-Event: AGENT_ASSIGNED
X-Forge-Delivery: dlv_01HXYZ...

{
  "kind": "AGENT_ASSIGNED",
  "workspace": { "id": "wks_axi", "slug": "axiom", "key": "AXI" },
  "agent":     { "id": "agt_victor", "profileKey": "victor" },
  "issue":     { "id": "iss_01HX", "key": "AXI-42", "title": "..." },
  "dispatch":  { "mode": "ROUND_ROBIN", "chosen": "agt_victor", "reason": "round-robin" }
}
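A receiver might extract the routing fields from an AGENT_ASSIGNED body with a minimal structural check like the one below. The routeTarget helper is hypothetical, and it checks only the two fields needed for routing; other event kinds may carry different fields.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns the routing fields of an AGENT_ASSIGNED envelope, or null when
// the payload does not have the expected shape. Call this only after the
// signature has been verified against the raw body.
function routeTarget(
  payload: unknown,
): { profileKey: string; issueKey: string } | null {
  if (!isRecord(payload)) return null;
  const agent = payload["agent"];
  const issue = payload["issue"];
  if (!isRecord(agent) || !isRecord(issue)) return null;
  const profileKey = agent["profileKey"];
  const issueKey = issue["key"];
  if (typeof profileKey !== "string" || typeof issueKey !== "string") return null;
  return { profileKey, issueKey };
}
```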

Heartbeat is implicit

Every successful (2xx) delivery is treated as the agent being reachable. The worker calls recordAgentReachable(agentId), which bumps lastHeartbeatAt and flips status: OFFLINE → ONLINE. Hermes profiles do not have to ping Forge on a timer.

The escape hatch is agents.heartbeat (described below) — useful when an agent is up but has had no recent assignments, and you want presence to reflect that.
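The delivery-derived presence rule can be modeled in a few lines. This is an illustration of the behavior described above, not Forge's recordAgentReachable; the markReachable name, the choice to leave presence untouched on failed deliveries, and the choice to preserve BUSY are assumptions.

```typescript
type AgentStatus = "ONLINE" | "BUSY" | "OFFLINE";

interface AgentPresence {
  status: AgentStatus;
  lastHeartbeatAt: Date | null;
}

// Illustration only: returns the presence an agent would have after a
// webhook delivery that finished with the given HTTP status.
function markReachable(
  agent: AgentPresence,
  httpStatus: number,
  now: Date,
): AgentPresence {
  const delivered = httpStatus >= 200 && httpStatus < 300;
  // Assumption: a failed delivery leaves presence untouched.
  if (!delivered) return agent;
  return {
    lastHeartbeatAt: now,
    // Only OFFLINE flips to ONLINE; BUSY is assumed to be preserved.
    status: agent.status === "OFFLINE" ? "ONLINE" : agent.status,
  };
}
```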

The pull direction: MCP

Agents call back into Forge over the MCP surface. The default JSON-RPC catalog is compact for provider compatibility, with catalog helper tools for the full authorized surface. Two transports are available, both Bearer-authenticated:

JSON-RPC — POST /api/mcp/rpc with a standard JSON-RPC 2.0 envelope. Best for clients that already speak MCP.
REST alias — POST /api/mcp/<tool> with a flat JSON body. Friendlier to scripts and out-of-band tooling.

Both transports authenticate the same way: an Authorization: Bearer ... header carrying either a Forge API key (forge_sk_*) or, optionally, a short-lived JWT.

Provider tool caps

Forge has 200+ direct tools internally, but the plain /api/mcp/rpc URL now advertises the compact runtime profile plus catalog.search, catalog.describe, and catalog.call. Some providers reject a request whose advertised tool list is too long (xAI/Grok caps at 200), and Hermes stacks several MCP servers under one provider — so only use ?profile=full for clients that can handle the count or MCP pagination without forwarding every tool into one model request. Keep chat available when hand-picking namespaces, because the Forge platform adapter streams chat replies through chat.startDraft/appendDraftChunk/finalizeDraft. See /reference/mcp.html#limiting-the-advertised-tool-surface.
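A rough pre-flight check for the cap described above: add up the tools advertised by the other MCP servers stacked under the provider, and opt in to ?profile=full only when the total still fits. The helper names, the example tool counts, and the treatment of the cap as an inclusive upper bound are all assumptions.

```typescript
interface McpServerSurface {
  name: string;
  toolCount: number;
}

function totalAdvertisedTools(servers: McpServerSurface[]): number {
  return servers.reduce((sum, server) => sum + server.toolCount, 0);
}

// Picks the Forge RPC URL: the plain URL (compact profile plus catalog
// helpers) unless the full surface fits under the provider's tool cap.
function forgeRpcUrl(
  base: string,
  otherServers: McpServerSurface[],
  fullForgeToolCount: number,
  providerCap: number,
): string {
  const url = new URL("/api/mcp/rpc", base);
  const total = totalAdvertisedTools(otherServers) + fullForgeToolCount;
  if (total <= providerCap) url.searchParams.set("profile", "full");
  return url.toString();
}
```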

http

POST /api/mcp/issues.assigned HTTP/1.1
Host: forge.example
Authorization: Bearer forge_sk_live_...
Content-Type: application/json

{ "limit": 25 }

// Equivalent JSON-RPC
fetch("https://forge.example/api/mcp/rpc", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.FORGE_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    jsonrpc: "2.0",
    id: 1,
    method: "tools/call",
    params: { name: "issues.assigned", arguments: { limit: 25 } },
  }),
});

Self-management: agents.me and agents.heartbeat

Two MCP tools close the self-management loop. Both read the calling agent from the API key — specifically from ApiKey.linkedAgentId. Keys without a linked agent are rejected (UNAUTHORIZED), so you cannot accidentally call them with a "human" key.

agents.me — returns the agent row for the calling key. Use this on startup to discover your own id, profileKey, capabilities, and maxConcurrent without hardcoding them on the agent side.
agents.heartbeat — manual status bump. Accepts an optional status override (ONLINE | BUSY | OFFLINE). Use this when the agent is up but has had no recent assignments, or when transitioning to BUSY to pause the dispatcher.

TIP

Pair this with linkedAgentId on the API key, and issues.assigned becomes "my work" — no profileKey argument needed, no profile lookup, just a tight, scoped query.
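The linked-key rule can be pictured as a guard that runs before either tool. This is a sketch of the described behavior, not Forge's source: the ApiKey shape beyond linkedAgentId, the McpToolError class, and resolveCallingAgentId are hypothetical names.

```typescript
// Only linkedAgentId comes from this page; the other field is illustrative.
interface ApiKey {
  id: string;
  linkedAgentId: string | null;
}

class McpToolError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

// Resolves "who is calling" from the key alone, so agents.me and
// agents.heartbeat need no agent id or profileKey argument.
function resolveCallingAgentId(key: ApiKey): string {
  if (key.linkedAgentId === null) {
    throw new McpToolError("UNAUTHORIZED", "API key is not linked to an agent");
  }
  return key.linkedAgentId;
}
```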

AI provider routing

The same Hermes gateway also serves Forge's first-party AI features.

When Workspace.aiProvider = "hermes" (the default), AI Triage and AI Coach route their model calls through the Hermes gateway at HERMES_GATEWAY_URL, authenticated with HERMES_GATEWAY_TOKEN. The default model is claude-haiku-4-5-20251001.

Other providers (openai, anthropic, custom) hit OpenAI-compatible endpoints directly. Pick whichever your stack already has credentials for. Full matrix lives in AI Triage & Coach.

INFO

Routing AI through Hermes is convenient because the gateway already implements caching, key rotation, and per-profile quota — but it's not required. Forge happily talks to OpenAI or Anthropic directly.
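The routing decision can be summarized in a small resolver. The AiRoute type and resolveAiRoute are hypothetical; the environment variable names, the "hermes" default, and the default model string are the ones given above. How Forge configures the direct providers is not covered here, so that branch only reports the provider name.

```typescript
type AiRoute =
  | { via: "hermes-gateway"; baseUrl: string; token: string; model: string }
  | { via: "direct"; provider: string };

function resolveAiRoute(
  aiProvider: string | undefined,
  env: Record<string, string | undefined>,
): AiRoute {
  // "hermes" is the documented default for Workspace.aiProvider.
  const provider = aiProvider ?? "hermes";
  if (provider !== "hermes") return { via: "direct", provider };

  const baseUrl = env["HERMES_GATEWAY_URL"];
  const token = env["HERMES_GATEWAY_TOKEN"];
  if (!baseUrl || !token) {
    throw new Error("HERMES_GATEWAY_URL and HERMES_GATEWAY_TOKEN must be set");
  }
  // Default model as documented above.
  return { via: "hermes-gateway", baseUrl, token, model: "claude-haiku-4-5-20251001" };
}
```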

A worked example: bringing victor online

End-to-end onboarding for a real Hermes profile.

Local repo tools for code work

Forge can dispatch code/repo issues to Hermes, but Wake/Kick only helps when the Hermes gateway profile actually has repo tools. Configure Hermes first:

the routed profile exists, e.g. victor;
the gateway/profile runs with development or terminal toolsets enabled;
the Forge repo is available to the Hermes process; and
the profile's working directory points at that repo, e.g. /home/bailey/forge or a mounted /work/forge.

Then declare the non-secret surface on the Forge Runtime so preflight and runtime cards show the truth:

bash

forge runtimes configure <runtimeId> \
  --local-workspace-tools \
  --tool terminal \
  --tool filesystem \
  --tool git \
  --workspace-root /home/bailey/forge

The declaration does not grant tools by itself; it records what the Hermes runtime already exposes. If the declaration is false or missing, code-like issues show a runtime tool-surface warning and should be reassigned to a local-tool runtime or held until Hermes is configured.

bash

# 1. Create the agent in the Forge workspace.
curl -sS https://forge.example/api/trpc/agents.create \
  -H "Content-Type: application/json" \
  -H "Cookie: $SESSION" \
  -d '{
    "json": {
      "workspaceId": "wks_axi",
      "name": "Victor",
      "profileKey": "victor",
      "webhookUrl": "https://hermes.example.com/agents/forge",
      "webhookSecret": "whsec_...",
      "capabilities": ["urgent", "high", "infra", "backend"],
      "role": "WORKER",
      "maxConcurrent": 3
    }
  }'

# 2. Issue an API key linked to the agent.
curl -sS https://forge.example/api/trpc/apiKeys.create \
  -H "Content-Type: application/json" \
  -H "Cookie: $SESSION" \
  -d '{
    "json": {
      "workspaceId": "wks_axi",
      "name": "victor:hermes",
      "scopes": ["READ_ISSUES", "WRITE_ISSUES", "READ_COMMENTS",
                 "WRITE_COMMENTS", "SUBSCRIBE_EVENTS"],
      "linkedAgentId": "agt_victor"
    }
  }'
# returns: { plaintext: "forge_sk_live_..." }

// 3. On agent boot, identify yourself and announce presence.
import { mcp } from "./forge-mcp";

const me = await mcp.call("agents.me", {});
console.log(`I am ${me.profileKey} (${me.id}); cap = ${me.maxConcurrent}`);

await mcp.call("agents.heartbeat", { status: "ONLINE" });

// 4. Watch your queue.
const work = await mcp.call("issues.assigned", { limit: 25 });

// 5. The first assignment lands as a webhook POST. A successful 2xx response
//    bumps lastHeartbeatAt; the agent then transitions and comments via MCP.
await mcp.call("issues.transition", {
  issueId: "iss_01HX",
  to: "IN_PROGRESS",
});
await mcp.call("comments.create", {
  issueId: "iss_01HX",
  body: "On it. Estimating ~30 minutes.",
});

That round-trip — webhook in, MCP out — is the Hermes integration.

Claude and Codex use the same Forge MCP tools, but they are currently modeled as single-session clients unless you run a custom persistent bridge. They can still be first-class Forge Agent rows, hold linked API keys, heartbeat via agents.heartbeat, and pull assigned work via issues.assigned; they just do not require a webhookUrl.

The signed envelope

Every outbound webhook is HMAC-signed. Forge sends two signatures so receivers can pick whichever shape they prefer:

X-Forge-Signature — sha256=<hex> over ${timestamp}.${rawBody}. Replay-proof when paired with a tolerance check on X-Forge-Timestamp.
X-Forge-Body-Signature — sha256=<hex> over the raw body alone. Useful for CDN- and proxy-mediated paths that can't preserve the timestamp header reliably.

Both use the per-agent webhookSecret if set, otherwise the workspace synthetic secret.

import crypto from "node:crypto";

const TOLERANCE_SECONDS = 300;

export function verifyForgeWebhook(
  rawBody: string,
  headers: Record<string, string | undefined>,
  secret: string,
): boolean {
  const ts = headers["x-forge-timestamp"];
  const sig = headers["x-forge-signature"];
  if (!ts || !sig) return false;

  const skew = Math.abs(Date.now() / 1000 - Number(ts));
  if (Number.isNaN(skew) || skew > TOLERANCE_SECONDS) return false;

  const expected =
    "sha256=" +
    crypto
      .createHmac("sha256", secret)
      .update(`${ts}.${rawBody}`)
      .digest("hex");

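  // timingSafeEqual throws if the buffers differ in length, so check first.
  const got = Buffer.from(sig);
  const want = Buffer.from(expected);
  return got.length === want.length && crypto.timingSafeEqual(got, want);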
}

WARNING

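Verify the signature before parsing the body. Reject anything outside the timestamp tolerance, and never log the secret. The constant-time compare matters: a naïve === can return early on the first mismatched character, which leaks timing information about how much of the signature was correct.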
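For receiver tests, and for paths that rely on the body-only header, a sketch along these lines may help. It assumes both signatures are computed exactly as described above (HMAC-SHA256, hex-encoded, prefixed with sha256=); signForgeWebhook and verifyBodySignature are hypothetical helper names, not part of Forge.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function hmacHex(secret: string, data: string): string {
  return "sha256=" + createHmac("sha256", secret).update(data).digest("hex");
}

// Builds the three signature-related headers for a raw body,
// e.g. to generate fixtures when testing a receiver.
function signForgeWebhook(
  rawBody: string,
  secret: string,
  timestamp: number,
): Record<string, string> {
  return {
    "x-forge-timestamp": String(timestamp),
    "x-forge-signature": hmacHex(secret, `${timestamp}.${rawBody}`),
    "x-forge-body-signature": hmacHex(secret, rawBody),
  };
}

// Verifies the body-only signature. It covers no timestamp, so on its
// own it offers no replay protection.
function verifyBodySignature(
  rawBody: string,
  signature: string | undefined,
  secret: string,
): boolean {
  if (!signature) return false;
  const got = Buffer.from(signature);
  const want = Buffer.from(hmacHex(secret, rawBody));
  return got.length === want.length && timingSafeEqual(got, want);
}
```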

Chat integration

When a user sends a chat message addressed to a Hermes-backed agent, the flow is:

Forge persists the ChatMessage (role: USER) and emits CHAT_MESSAGE_POSTED.
recordChange (audit branch d in src/server/audit.ts) enqueues a WebhookDelivery to agent:dispatch:{agentId}.
The BullMQ worker resolves the synthetic URL to the agent's real webhookUrl and POSTs the signed envelope.
Hermes' forge-dispatch webhook handler receives the event, routes it to the addressed profile, and runs the agent loop.
If Hermes is wired to the Forge platform adapter (gateway/platforms/forge.py in Bailey's fork at ~/.hermes/hermes-agent/), the response streams token-by-token:

Agent → chat.startDraft({ threadId })        → { draftId }
Agent → chat.appendDraftChunk({ ..., delta }) → (repeat)
Agent → chat.finalizeDraft({ ..., body })     → persisted ChatMessage

The client listens on the chat-thread-stream SSE channel and renders progressive deltas. When finalizeDraft fires, the draft bubble is swapped for the committed message without flicker (the draftId carries through for the swap).

Agents that have not yet been wired to the platform adapter fall back to the single-shot path:

Agent → chat.appendMessage({ threadId, body }) → persisted ChatMessage
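The streaming path might look like the sketch below. The McpClient interface is hypothetical, and the argument shapes are partly assumed: the flow above elides some fields, so passing draftId to appendDraftChunk and finalizeDraft is a guess based on the draftId returned by chat.startDraft.

```typescript
// Hypothetical MCP client: one method that calls a tool by name.
interface McpClient {
  call(tool: string, args: Record<string, unknown>): Promise<unknown>;
}

// Streams a reply chunk by chunk, then commits the full body.
// In practice the chunks would arrive from the model as it generates.
async function streamReply(
  mcp: McpClient,
  threadId: string,
  chunks: readonly string[],
): Promise<string> {
  const started = (await mcp.call("chat.startDraft", { threadId })) as {
    draftId: string;
  };
  let body = "";
  for (const delta of chunks) {
    body += delta;
    // Assumed argument shape: the elided fields are taken to be draftId.
    await mcp.call("chat.appendDraftChunk", { draftId: started.draftId, delta });
  }
  await mcp.call("chat.finalizeDraft", { draftId: started.draftId, body });
  return body;
}
```

An agent without the platform adapter would skip all of this and make a single chat.appendMessage call with the complete body.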

Runtime contract diagnostics

Forge probes managed Hermes runtimes against the contract it actually uses for chat and dispatch:

GET /v1/models verifies the configured gateway base and bearer token.
GET /v1/runs verifies the structured runs route exists without starting a run. Current Hermes returns 405 Method Not Allowed for this probe, which is healthy because POST /v1/runs is the mutating operation.

A successful Hermes probe is diagnostic-only. It proves the gateway contract is reachable, but it does not mark Victor/Mizu online; presence still comes from forge-presence, agents.heartbeat, or delivery-derived activity. That keeps "chat can reach Hermes" separate from "the profile heartbeat is fresh."
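The probe results can be read with a helper like this one. probeHealthy is a hypothetical name, and treating 405 as the only healthy answer for GET /v1/runs is an assumption based on current Hermes behavior as described above.

```typescript
type ProbeRoute = "GET /v1/models" | "GET /v1/runs";

// Interprets a probe's HTTP status. A healthy result is diagnostic only:
// it says nothing about whether any profile is online.
function probeHealthy(route: ProbeRoute, status: number): boolean {
  if (route === "GET /v1/models") {
    // Gateway base URL and bearer token are both good.
    return status >= 200 && status < 300;
  }
  // The runs route only accepts POST, so 405 shows that the route exists
  // without starting a run.
  return status === 405;
}
```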

Implementation note

The Hermes chat integration relies on patches to Hermes core in Bailey's fork of NousResearch/hermes-agent at ~/.hermes/hermes-agent/: specifically a Platform enum extension, run.py adapter-creation logic, and webhook.py re-stamp logic. The new platform adapter file (gateway/platforms/forge.py) is a clean addition. The core patches are specific to this fork and would need generalization before they could be contributed upstream. For internal Axiom-Labs use this is fine.

See Chat for the full chat surface documentation.

Presence (forge-presence skill)

The forge-presence skill at ~/.hermes/skills/forge-presence/ provides cron-driven heartbeats for both the default Victor agent and installed profiles (Mizu, Mizuki, etc.). It calls agents.heartbeat via the MCP surface every minute, keeping lastHeartbeatAt fresh even when there are no active assignments.

See Runtime Modes for setup instructions and the full presence model.

Cross-references

Agents → Overview — model and lifecycle.
Auto-dispatch — what governs which agent receives the next assignment.
AI Triage & Coach — provider matrix and model selection.
Concepts → Scopes & Tenancy — linkedAgentId and the API key narrowing arrays.
Chat — per-agent chat threads and the streaming reply path.
Runtime Modes — PERSISTENT vs EPHEMERAL, forge-presence.
