OpenTelemetry Integrations¶
Hangar is the runtime governance layer for MCP servers. It is not an observability platform. Hangar exports governance telemetry -- enforcement decisions, MCP server lifecycle events, capability violations, and identity-aware audit trails -- through the OpenTelemetry (OTEL) interoperability contract. Partner backends visualize it.
This page covers how to connect Hangar to each supported backend and what governance data flows through.
MCP Attribute Taxonomy¶
Every span, metric, and audit log emitted by Hangar carries MCP-specific attributes defined in src/mcp_hangar/observability/conventions.py. These attributes form a stable contract that partner backends consume without Hangar-specific plugins.
MCP Server attributes¶
| Attribute | Type | Description |
|---|---|---|
mcp.server.id | string | Unique MCP server identifier (e.g. math-server) |
mcp.server.mode | string | Operational mode: subprocess, docker, remote |
mcp.server.state | string | Lifecycle state: COLD, INITIALIZING, READY, DEGRADED, DEAD |
mcp.server.group_id | string | MCP Server group membership |
mcp.server.image | string | Container image reference (docker mode) |
mcp.server.has_capabilities | string | Whether MCP server declares capabilities (true/false) |
mcp.server.enforcement_mode | string | Declared enforcement mode: alert, block, quarantine |
Tool invocation attributes¶
| Attribute | Type | Description |
|---|---|---|
mcp.tool.name | string | Tool name as advertised by the MCP server |
mcp.tool.duration_ms | float | Call duration in milliseconds |
mcp.tool.status | string | Result: success, error, timeout, blocked |
mcp.tool.cold_start | string | Whether this call triggered a cold start (true/false) |
mcp.tool.args_hash | string | Argument hash for audit (raw arguments are never exported) |
mcp.tool.response_tokens | int | Approximate token count of tool response |
mcp.session.id | string | MCP protocol session identifier |
mcp.agent.id | string | Agent or client identifier |
mcp.user.id | string | Human user identity behind the agent request |
mcp.correlation_id | string | Correlation ID for multi-step agent workflows |
Enforcement attributes¶
| Attribute | Type | Description |
|---|---|---|
mcp.enforcement.policy_result | string | Policy evaluation result: allow, deny, quarantine |
mcp.enforcement.policy_name | string | Name of the evaluated policy |
mcp.enforcement.action | string | Action taken: none, alert, block, quarantine, rate_limit |
mcp.enforcement.violation_type | string | Violation category: egress_undeclared, tool_schema_drift, resource_limit_exceeded |
mcp.enforcement.egress_destination | string | Destination involved in egress violation (host:port) |
mcp.enforcement.violation_count | int | Accumulated violations for this MCP server in this session |
Audit attributes¶
| Attribute | Type | Description |
|---|---|---|
mcp.audit.principal_type | string | Principal type: api_key, jwt, oidc, anonymous |
mcp.audit.principal_id | string | Principal identifier (API key ID, JWT sub claim) |
mcp.audit.principal_roles | string | Roles held at call time (comma-separated) |
mcp.audit.authenticated | string | Whether request passed authentication (true/false) |
mcp.audit.authorized | string | Whether request passed authorization (true/false) |
mcp.audit.data_sensitivity | string | Response classification: public, internal, confidential, restricted |
Behavioral attributes¶
| Attribute | Type | Description |
|---|---|---|
mcp.behavioral.matches_baseline | string | Whether call matches baseline pattern (true/false) |
mcp.behavioral.anomaly_score | float | Anomaly score (0.0 = normal, 1.0 = highly anomalous) |
mcp.behavioral.rule_id | string | Detection rule that matched |
mcp.behavioral.pattern_step | int | Sequence position in a detected multi-step pattern |
mcp.behavioral.pattern_name | string | Name of the detected behavioral pattern |
mcp.behavioral.deviation_type | string | Type of behavioral deviation detected |
Caller attributes (identity propagation)¶
| Attribute | Type | Description |
|---|---|---|
mcp.caller.type | string | Caller type: human, agent, service, anonymous |
mcp.caller.id | string | Caller identifier (user ID, service account, API key ID) |
mcp.caller.roles | string | Roles held at invocation time (comma-separated) |
Cost attributes (FinOps)¶
| Attribute | Type | Description |
|---|---|---|
mcp.cost.cents | int | Cost of invocation in hundredths of a cent |
mcp.cost.model | string | Pricing model: token, duration, fixed, composite |
mcp.cost.input_tokens | int | Input tokens consumed (LLM-backed tools) |
mcp.cost.output_tokens | int | Output tokens produced (LLM-backed tools) |
mcp.cost.currency | string | ISO 4217 currency code (default: USD) |
Risk attributes (semantic analysis)¶
Attributes in the mcp.risk.* namespace carry detection rule match signals from the semantic analysis engine. The engine evaluates multi-step tool call sequences per agent session against detection rules (e.g. credential exfiltration, privilege escalation). Partner backends such as OpenLIT, Grafana, and SIEM tools can filter spans by mcp.risk.severity = critical to surface high-risk events.
Requires the ENTERPRISE license tier. The mcp.risk.score and mcp.risk.session_anomaly_score attributes are populated by the WeightedRiskScorer when behavioral profiling is active. The scorer aggregates signals from BehavioralDeviationDetected, DetectionRuleMatched, and CapabilityViolationDetected events with exponential time decay.
| Attribute | Type | Description |
|---|---|---|
mcp.risk.rule_id | string | Matched detection rule identifier (e.g. credential-exfiltration) |
mcp.risk.pattern_name | string | Human-readable name of the matched detection pattern |
mcp.risk.severity | string | Severity: critical, high, medium, low |
mcp.risk.response_action | string | Recommended response: alert, throttle, suspend, block |
mcp.risk.session_id | string | Session ID where the match was detected |
mcp.risk.matched_tools | string | Comma-separated tool names that formed the matched sequence |
mcp.risk.score | float | Aggregate session risk score (0.0 = no risk, 1.0 = maximum risk) |
mcp.risk.session_anomaly_score | float | Per-session anomaly score relative to baseline behavior |
Health attributes¶
| Attribute | Type | Description |
|---|---|---|
mcp.health.result | string | Health check result: passed, failed, timeout |
mcp.health.consecutive_failures | int | Number of consecutive failures |
mcp.health.duration_ms | float | Health check response time in milliseconds |
Prometheus metric names¶
| Metric | Type | Description |
|---|---|---|
mcp_hangar_tool_calls_total | Counter | Total tool invocations |
mcp_hangar_tool_call_duration_seconds | Histogram | Tool call latency distribution |
mcp_hangar_mcp_server_state | Gauge | Current MCP server lifecycle state (0=cold, 1=initializing, 2=ready, 3=degraded, 4=dead) |
mcp_hangar_cold_start_phase_duration_seconds | Histogram | Duration of cold start phases (labels: mcp_server, phase) |
mcp_hangar_cold_starts_in_progress | Gauge | Cold starts currently in progress |
mcp_hangar_health_checks_total | Counter | Total health checks |
mcp_hangar_circuit_breaker_state | Gauge | Circuit breaker state per MCP server |
mcp_hangar_capability_violations_total | Counter | Total capability violations |
mcp_hangar_detection_rule_matches_total | Counter | Total detection rule matches (labels: rule_id, severity) |
mcp_hangar_tool_schema_drifts_total | Counter | Total tool schema drift detections |
mcp_hangar_cost_cents_total | Counter | Total attributed cost in hundredths of a cent (labels: mcp_server, tool, cost_model) |
mcp_hangar_cost_attributions_total | Counter | Total cost attribution computations (labels: mcp_server, tool) |
OTEL Collector¶
The OTEL Collector is the recommended entry point for governance telemetry. It receives OTLP from Hangar and routes spans, metrics, and logs to any supported backend -- Prometheus, Jaeger, OpenLIT, Datadog, or a custom pipeline.
Example: examples/otel-collector/
Getting started¶
Set these environment variables before starting Hangar:
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_SERVICE_NAME=mcp-hangar
MCP_TRACING_ENABLED=true
Start the reference stack:
This starts:
| Service | Port | Purpose |
|---|---|---|
| Hangar | 8080 | MCP control plane with OTLP export |
| OTEL Collector | 4317 (gRPC), 4318 (HTTP) | Telemetry receiver and router |
| Prometheus | 9090 | Metrics storage and query |
The collector config (otel-collector-config.yaml) receives OTLP on both gRPC and HTTP, exports metrics to Prometheus, and prints spans and logs to console for debugging. Replace the logging exporter with your production backend.
What flows through the collector¶
- Traces: Tool invocation spans carrying
mcp.server.id,mcp.tool.name,mcp.tool.status, and enforcement attributes. - Logs: Audit log records for tool invocation events and MCP server state transitions, exported by
OTLPAuditExporter. - Metrics: Prometheus metrics scraped from Hangar's
/metricsendpoint or forwarded through the collector's Prometheus exporter.
OpenLIT¶
OpenLIT provides a trace explorer, session analytics, and cost attribution UI. Hangar exports governance telemetry; OpenLIT visualizes it. They connect through the OTEL Collector -- Hangar does not send data directly to OpenLIT.
Example: examples/openlit/
Getting started¶
This starts Hangar, an OTEL Collector, and OpenLIT. Open the OpenLIT dashboard at http://localhost:3000.
Environment variables for Hangar (already set in the docker-compose):
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_SERVICE_NAME=mcp-hangar
MCP_TRACING_ENABLED=true
What governance data is visible in OpenLIT¶
In the OpenLIT trace explorer, filter on MCP governance attributes:
- By MCP server:
mcp.server.id = "math-server" - By tool:
mcp.tool.name = "add" - By user:
mcp.user.id = "alice" - By enforcement action:
mcp.enforcement.action = "block" - By violation type:
mcp.enforcement.violation_type = "egress_undeclared"
MCP Server lifecycle events (COLD, INITIALIZING, READY, DEGRADED, DEAD) appear as audit log records with mcp.server.state attributes.
Langfuse¶
Langfuse provides LLM-specific observability: input/output recording, token counting, user session tracking, and evaluation workflows. It complements the OTEL governance telemetry path -- Langfuse handles LLM observability while OTEL handles governance observability.
- OTEL path: Enforcement decisions, capability violations, MCP server lifecycle, audit trails. Exported via OTLP to any OTEL-compatible backend.
- Langfuse path: Tool call input/output, token counts, user session traces. Exported via the
LangfuseObservabilityAdapter.
Example: examples/langfuse/
Getting started¶
You need a running Langfuse instance -- either Langfuse Cloud or a self-hosted deployment.
Set these environment variables:
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com # or your self-hosted URL
Enable Langfuse in config.yaml:
observability:
langfuse:
enabled: true
# Keys are read from environment variables
# Never put secret keys in config files
Secret handling
LANGFUSE_SECRET_KEY is a secret. Use environment variables, HashiCorp Vault, or Kubernetes secrets. Never commit secrets to config files or source control.
How Hangar maps to Langfuse concepts¶
| Langfuse concept | Hangar mapping |
|---|---|
| Trace | One MCP session (mcp.session.id) |
| Span | MCP Server tool invocation |
| Generation | Tool call with input/output |
| User | mcp.user.id from identity propagation |
When Hangar propagates caller identity, Langfuse traces carry the same user_id and session_id as the MCP OTEL spans. This enables cross-referencing Langfuse traces with governance enforcement events in OTEL backends.
Grafana¶
Hangar exposes Prometheus metrics at the /metrics HTTP endpoint. Grafana can scrape these directly or consume them through the OTEL Collector's Prometheus exporter.
Pre-built Grafana dashboards are available in the monitoring/ directory:
- Overview dashboard: MCP server states, tool call rates, error rates
- MCP Server details dashboard: Per-MCP server metrics, health check history
- Alerts dashboard: Circuit breaker state, violation counts
Getting started¶
Start the monitoring stack:
| Service | URL | Credentials |
|---|---|---|
| Grafana | http://localhost:3000 | admin / admin |
| Prometheus | http://localhost:9090 | -- |
| Alertmanager | http://localhost:9093 | -- |
Start Hangar in HTTP mode so the /metrics endpoint is available:
Key Prometheus queries¶
# Tool call rate per mcp_server (last 5 minutes)
rate(mcp_hangar_tool_calls_total[5m])
# 95th percentile tool call latency
histogram_quantile(0.95, rate(mcp_hangar_tool_call_duration_seconds_bucket[5m]))
# MCP servers currently in DEGRADED state (3=degraded)
mcp_hangar_mcp_server_state == 3
# Circuit breaker open count
mcp_hangar_circuit_breaker_state == 1
Integration stance¶
Hangar is not trying to become an observability platform. OpenTelemetry-compatible tools -- OpenLIT, Langfuse, Grafana, OTEL Collector, Datadog, Honeycomb -- are extensions to the visibility layer around Hangar. Hangar defines the governance telemetry contract (the MCP attribute taxonomy above) and exports it via OTEL. Partner tools consume, correlate, alert, and visualize.
This separation means:
- Hangar focuses on runtime security enforcement, lifecycle management, and governance policy.
- Partner tools focus on visualization, alerting, session analytics, and cost attribution.
- The OTEL interoperability contract ensures any OTEL-compatible tool works without Hangar-specific plugins.