Incident Reference
Overview
An incident in OpenStatus represents a detected problem or service disruption related to a monitored resource. Incidents are automatically generated when a monitor reports a failure condition that meets predefined criteria. They serve as a central point for tracking, managing, and resolving service impairments.
Key characteristics:
- Automatically triggered by monitor failures.
- Aggregates related failure events for a single monitor.
- Provides a clear status of service health.
Incident Triggering
An incident is triggered when a large enough share of recent checks for a given monitor report a failed status. Requiring several failed checks, rather than a single one, reduces false positives caused by transient network issues.
Trigger Condition:
- Failure Threshold: An incident is initiated when at least 50% of the checks within a defined window (e.g., the last N checks, or the checks within a duration T) have reported a failure or degraded status.
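The threshold rule can be sketched as a small function. This is a minimal illustration only, not the OpenStatus implementation: the function name `should_open_incident`, the `success` status string, and the choice to treat an empty window as "no incident" are assumptions made for the example.

```python
from typing import Sequence

# Statuses that count as failing, as named in the trigger condition above.
FAILING_STATUSES = {"failure", "degraded"}


def should_open_incident(recent_statuses: Sequence[str], threshold: float = 0.5) -> bool:
    """Return True when the share of failing checks in the window meets the threshold.

    `recent_statuses` is the window of check results (for example the last N
    checks). Hypothetical helper; the real windowing logic is not shown here.
    """
    if not recent_statuses:
        # Assumption: with no checks in the window there is nothing to act on.
        return False
    failing = sum(1 for status in recent_statuses if status in FAILING_STATUSES)
    return failing / len(recent_statuses) >= threshold


# Two of four checks failing is exactly 50%, which meets the threshold.
window = ["success", "failure", "degraded", "success"]
print(should_open_incident(window))
```

A single failed check in a window of four stays below the threshold, which is how an isolated network blip avoids opening an incident.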
Incident Lifecycle and States
Incidents progress through several states reflecting their current resolution status. These states are managed through status reports (see Status Report Reference).
Primary States:
- investigating: The incident has been detected, and the team is actively looking into the root cause.
- identified: The root cause of the incident has been identified.
- monitoring: A fix has been deployed or a mitigation is in place, and the service is being monitored to confirm resolution.
- resolved: The incident has been fully resolved, and the service is operating normally.
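For tooling that consumes these states, one way to model them is an enumeration in the order listed above. The sketch below is illustrative: the class and helper names are hypothetical, and the forward-only ordering check is an assumption, since this reference does not state whether an incident may move back to an earlier state.

```python
from enum import Enum


class IncidentState(Enum):
    """The four primary states, declared in the order they are listed above."""
    INVESTIGATING = "investigating"
    IDENTIFIED = "identified"
    MONITORING = "monitoring"
    RESOLVED = "resolved"


# Enum members iterate in definition order, so this list encodes the progression.
STATE_ORDER = list(IncidentState)


def is_forward_transition(current: IncidentState, new: IncidentState) -> bool:
    """Return True if `new` comes later in the listed order than `current`.

    Assumption: states only move forward. Skipping a state is allowed here.
    """
    return STATE_ORDER.index(new) > STATE_ORDER.index(current)


def is_active(state: IncidentState) -> bool:
    """An incident is treated as active until it reaches the resolved state."""
    return state is not IncidentState.RESOLVED


# Parse a state string as it appears in a status report.
state = IncidentState("identified")
print(is_forward_transition(state, IncidentState.MONITORING), is_active(state))
```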
Properties
While an incident is active, it collects and displays key information related to the service disruption.
- Monitor Association: Each incident is directly linked to the monitor that triggered it, providing immediate context to the affected service.
- Start Time: Timestamp indicating when the incident was first detected and created.
- Status Reports: A chronological log of all updates and state changes applied to the incident.
- Impacted Locations: Details on the geographical regions from which the monitor reported failures.
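The properties above can be pictured as a simple record. The following is a hypothetical data model for illustration, not the OpenStatus schema: all field names, the `StatusReportUpdate` type, the monitor identifier, and the region codes are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class StatusReportUpdate:
    """One entry in the incident's log of updates (hypothetical shape)."""
    state: str
    message: str
    created_at: datetime


@dataclass
class Incident:
    monitor_id: str                      # Monitor Association
    started_at: datetime                 # Start Time
    status_reports: List[StatusReportUpdate] = field(default_factory=list)
    impacted_locations: List[str] = field(default_factory=list)

    def add_update(self, update: StatusReportUpdate) -> None:
        """Append an update and keep the log in chronological order."""
        self.status_reports.append(update)
        self.status_reports.sort(key=lambda u: u.created_at)

    def current_state(self) -> Optional[str]:
        """Return the state from the most recent update, or None if there is none."""
        return self.status_reports[-1].state if self.status_reports else None


# Example values below are made up for the sketch.
incident = Incident(
    monitor_id="monitor-1",
    started_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    impacted_locations=["ams", "iad"],
)
incident.add_update(StatusReportUpdate(
    "investigating", "Looking into elevated failures.",
    datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc)))
print(incident.current_state())
```

Sorting on insert keeps the log chronological even when updates arrive out of order, so the latest entry always reflects the current state.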
Related resources
- Status Report Reference - Details on how incident statuses are managed and reported.
- Monitoring Overview - Conceptual overview of monitoring within OpenStatus.
- Monitor Data Collected - All tracked metrics collected by monitors.