Sentinel operates with significant autonomy: it classifies incidents, executes tool calls, and proposes remediation commands. That autonomy requires strict boundaries. The guardrail system is the enforcement layer: a set of deterministic checks and a semantic LLM judge that together ensure the pipeline can never be manipulated into harmful behavior, no matter what appears in the logs it analyzes. There are four guardrails, each positioned at a specific point in the pipeline, plus a two-node LangGraph subgraph that orchestrates the combination of rule-based and semantic checks.
Why Deterministic Rules (Not Just LLMs)
The four core guardrails in guardrails.py do not call OpenAI. They use compiled regex and simple logic. This design choice is deliberate:
- Fast — no network round-trip, microsecond execution
- Free — does not count toward the LLM rate limit budget
- Auditable — the exact patterns are in the source code, testable with plain unit tests
- Deterministic — same input always produces the same result
The LLM judge (llm_guardrail.py) is a second layer for semantic edge cases that regex cannot catch. It runs after the rule layer, not instead of it.
Guardrail 1 — Input Check
Function: guardrails.check_input(title, logs) → GuardrailResult
When: Before any LLM sees the incident (first node in the pipeline)
This guardrail protects the pipeline from prompt injection attacks hidden in log content. An attacker who can write to application logs could try to override the agent's instructions by embedding text like "ignore all previous instructions" in a stack trace.
What it does:
- Truncates logs to _MAX_LOG_CHARS = 4000 characters. Logs beyond this limit are dropped because they add noise and expand the injection surface.
- Scans the combined title + truncated logs for 8 prompt injection patterns.
- Neutralizes, does not abort. Each individual log line is checked against the same patterns; matching lines are replaced with [LÍNEA NEUTRALIZADA POR GUARDRAIL — posible inyección] ("line neutralized by guardrail, possible injection"). The sanitized logs are passed downstream so the incident can still be triaged, but the injected instructions are inert.
Guardrail 2 — Classification Check
Function: guardrails.check_incident_type(incident_type) → GuardrailResult
When: Immediately after _classify() returns (end of Lab 1)
The classification LLM is instructed to return one of ten valid incident types. However, LLMs can hallucinate — they may return a plausible-sounding but unsupported value like "database_corruption" or "timeout_error". This guardrail is the enforcement point.
Allowed set: the ten valid incident types the classification LLM is instructed to return. If the returned type is not in that set, the guardrail:
- Sets passed = False
- Forces sanitized = "unknown"
- Logs a warning: [guardrails.classification] Tipo inválido '{incident_type}' → 'unknown' ("invalid type")
The unknown fallback is a valid operational type: the investigation continues, and the specialist agent still inspects the target and produces an analysis. Nothing breaks; the frontend is protected from an unrecognized category string.
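A membership check of this kind might look like the sketch below. The allowed set here is a placeholder: the real set has ten entries that are not reproduced in this page, so example_type_a and example_type_b are purely hypothetical, and including "unknown" in the set is an assumption based on it being described as a valid operational type.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("guardrails.classification")

# Placeholder values only; the real allowed set contains ten incident types.
_ALLOWED_TYPES = frozenset({"example_type_a", "example_type_b", "unknown"})


@dataclass
class GuardrailResult:
    passed: bool
    sanitized: str


def check_incident_type(incident_type: str) -> GuardrailResult:
    if incident_type in _ALLOWED_TYPES:
        return GuardrailResult(passed=True, sanitized=incident_type)
    # Hallucinated or unsupported value: log it and fall back to "unknown".
    logger.warning(
        "[guardrails.classification] Tipo inválido '%s' → 'unknown'", incident_type
    )
    return GuardrailResult(passed=False, sanitized="unknown")
```

Downstream code can then read sanitized unconditionally and never sees a category string outside the allowed set.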
Guardrail 3 — Action Check
Function: guardrails.check_proposed_action(action) → GuardrailResult
When: After _build_proposed_action() returns (end of Lab 3), before proposed_action is written to Supabase
This is the most critical guardrail. It gates every command that will be shown to an engineer for approval — and eventually executed on production infrastructure. Two checks run in sequence:
Step 1 (metacharacter block): Any action containing ;, &&, ||, |, `, $(, >, or < is immediately rejected. These characters enable shell injection chaining and have no place in a safe whitelisted command.
Step 2 (whitelist match): The action must fully match one of six compiled regex patterns.
action=None is valid. When _build_proposed_action determines that no safe action can be inferred (ambiguous target, unsupported runtime, unusual incident type combination), it returns None. check_proposed_action(None) returns passed=True with sanitized="" — no action is proposed, and the incident stays at analyzed.
This guardrail is defense-in-depth: _build_proposed_action already generates only whitelisted commands, but this re-validation is an independent check. If that function were ever modified, or a future agent tried to propose a custom command, this guardrail would intercept it before it reached any human or any database row.
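The two-step check, plus the None case, can be sketched as below. The metacharacter list comes from the text; the whitelist patterns are hypothetical examples (the six real patterns are not shown here), and returning an empty sanitized value on rejection is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Metacharacters listed in the text.
_METACHARS = (";", "&&", "||", "|", "`", "$(", ">", "<")

# Hypothetical whitelist for illustration; the real one has six patterns.
_WHITELIST = [
    re.compile(r"docker restart [A-Za-z0-9_.-]+"),
    re.compile(r"kubectl rollout restart deployment/[a-z0-9-]+"),
]


@dataclass
class GuardrailResult:
    passed: bool
    sanitized: str


def check_proposed_action(action: Optional[str]) -> GuardrailResult:
    # No action proposed is a valid outcome.
    if action is None:
        return GuardrailResult(passed=True, sanitized="")
    # Step 1: reject anything that could chain or redirect shell commands.
    if any(meta in action for meta in _METACHARS):
        return GuardrailResult(passed=False, sanitized="")
    # Step 2: the whole string must match a whitelisted pattern.
    if any(p.fullmatch(action) for p in _WHITELIST):
        return GuardrailResult(passed=True, sanitized=action)
    return GuardrailResult(passed=False, sanitized="")
```

Using fullmatch rather than match or search matters here: a prefix match would accept a whitelisted command followed by arbitrary trailing text, such as a second command on a new line.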
Guardrail 4 — Output Scope Check
Function: guardrails.check_analysis_output(analysis) → GuardrailResult
When: After the specialist agent returns InvestigationResult.analysis (end of Lab 2)
This guardrail ensures the agent’s analysis stayed within the DevOps domain. If a prompt injection in the logs succeeded in partially diverting the agent, the output guardrail is the last line of defense before the analysis is shown to engineers.
Checks performed:
- Length check — analysis shorter than 20 characters is flagged as empty or incomplete
- Off-topic pattern detection: a match flags the analysis, which receives a warning banner rather than being blocked
The LangGraph Guardrail Graph
The deterministic guardrails are composed with the LLM judge into a two-node LangGraph graph defined in guardrail_graph.py. This graph runs for both input and output checks: the rules node runs first, then the LLM judge node.
The graph is built once (_GRAPH = _build_graph()) and reused for all invocations. The supervisor calls:
- guardrail_graph.run_input_guardrail(title, logs): returns GuardrailState; sanitized contains clean logs
- guardrail_graph.run_output_guardrail(analysis): returns GuardrailState; sanitized contains analysis (with banner if flagged)
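To make the ordering concrete, here is a plain-Python stand-in for a rules node followed by a judge node. It deliberately does not use the LangGraph API; build_pipeline, the node functions, and every GuardrailState field other than sanitized are hypothetical names chosen for this sketch.

```python
from typing import Callable, Dict, Tuple, TypedDict


class GuardrailState(TypedDict):
    stage: str          # "input" or "output"
    sanitized: str      # text as it flows through the nodes
    rules_passed: bool
    llm_passed: bool


RuleCheck = Callable[[str], Tuple[bool, str]]       # returns (passed, sanitized)
Judge = Callable[[str, str], Dict[str, bool]]       # returns {"safe", "on_topic"}


def build_pipeline(rule_check: RuleCheck, judge: Judge) -> Callable[[str, str], GuardrailState]:
    def rules_node(state: GuardrailState) -> GuardrailState:
        passed, sanitized = rule_check(state["sanitized"])
        return {**state, "rules_passed": passed, "sanitized": sanitized}

    def llm_node(state: GuardrailState) -> GuardrailState:
        # The judge only ever sees text already sanitized by the rules node.
        verdict = judge(state["sanitized"], state["stage"])
        return {**state, "llm_passed": verdict["safe"] and verdict["on_topic"]}

    def run(text: str, stage: str) -> GuardrailState:
        state: GuardrailState = {
            "stage": stage, "sanitized": text,
            "rules_passed": True, "llm_passed": True,
        }
        for node in (rules_node, llm_node):  # fixed order: rules, then judge
            state = node(state)
        return state

    return run
```

Building the pipeline once and calling run many times mirrors the build-once, reuse-everywhere pattern described above, and running the rules first means known injection strings are removed before the judge sees the text.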
The LLM Judge
Module: llm_guardrail.py
Function: judge(text, stage) → dict
The LLM judge is a dedicated gpt-4o-mini call with a fixed system prompt. It evaluates two dimensions independently:
- safe=false: text is attempting to manipulate the agent, change its instructions, or execute unauthorized actions
- on_topic=false: content is outside the DevOps/SRE domain (containers, databases, logs, metrics, incidents)
If the judge call fails, the result defaults to safe=true, on_topic=true (fail-open). If llm_passed is False on an input check, the sanitized text is completely replaced with [CONTENIDO BLOQUEADO POR GUARDRAIL SEMÁNTICO] ("content blocked by semantic guardrail"). For output checks, it appends a secondary warning banner if one was not already added by the rules node.
Guardrail Positions Summary
| Guardrail | Position in Pipeline | Method | Blocks or Warns? |
|---|---|---|---|
| Input check | Before Lab 1 (classification) | check_input + LLM judge | Neutralizes lines; blocks if LLM flags |
| Classification check | After Lab 1 | check_incident_type | Forces to unknown (never hard-blocks) |
| Action check | After Lab 3 | check_proposed_action | Hard-blocks: clears proposed_action |
| Output scope check | After Lab 2 | check_analysis_output + LLM judge | Prepends warning banner; never blocks |