Self State Monitor

Self State Monitor is a built-in mechanism that protects end users from false NODATA notifications and notifies administrators about issues in Moira and/or Graphite systems.

Why Self State Monitor

Graphite Relay, Redis DB, or the Moira-Filter service can break down. When that happens, Moira stops receiving metrics from Graphite and has no data against which to check the state of its triggers. Following its normal logic, Moira would then switch triggers to the NODATA state and send alert messages to users.

To handle this situation properly, we recommend turning on the Self State Monitor. With it enabled, Moira stops sending alert messages to end users and instead notifies administrators of the problem.

Warning

When Self State Monitor detects a problem, it disables all notifications to end users and does not turn them back on without manual intervention.

Please read this manual before using Self State Monitor in production.

See also

For a better understanding, look at the architecture of the Moira microservices.

When Self State Monitor Helps

Self State Monitor checks for these situations:

  1. There has been no connection between Moira and Redis for longer than redis_disconect_delay.
  2. Moira-Filter has received no metrics for longer than last_metric_received_delay.
  3. Moira-Checker has checked no triggers for longer than last_check_delay.

See also

All of the above configuration parameters can be found in the Moira-Notifier section of the configuration page.
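The three checks listed above can be pictured as simple threshold comparisons. The following is a minimal sketch, not Moira's actual implementation: the `Delays` class, the `detect_problems` function, and the threshold values are hypothetical and chosen only for illustration. Only the three parameter names come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Delays:
    # Field names mirror the configuration parameters named in this section.
    # The values passed in below are illustrative, not Moira defaults.
    redis_disconect_delay: float
    last_metric_received_delay: float
    last_check_delay: float


def detect_problems(now, last_redis_ok, last_metric_received, last_check, delays):
    """Return a list of detected problems; an empty list means healthy.

    All arguments are timestamps or durations in seconds (hypothetical model).
    """
    problems = []
    if now - last_redis_ok > delays.redis_disconect_delay:
        problems.append("no connection to Redis")
    if now - last_metric_received > delays.last_metric_received_delay:
        problems.append("Moira-Filter receives no metrics")
    if now - last_check > delays.last_check_delay:
        problems.append("Moira-Checker checks no triggers")
    return problems


# Example: metrics stopped arriving 100 seconds ago, everything else is fresh.
delays = Delays(redis_disconect_delay=30,
                last_metric_received_delay=60,
                last_check_delay=120)
print(detect_problems(now=1000, last_redis_ok=995,
                      last_metric_received=900, last_check=990,
                      delays=delays))
# prints ['Moira-Filter receives no metrics']
```

Each check is independent, so several problems can be reported at once, for example when Redis is unreachable and the checker has stalled as a result.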

How Self State Monitor Works

When you turn Self State Monitor on, it works this way:

  • Self State Monitor checks Moira state every 10 seconds.

  • Something breaks down. It can be Graphite-Relay, the connection to Redis DB, or a crashed Moira-Filter Docker container.

  • Self State Monitor sends an alarm message to administrators with a description of the issue.

    Here is an example of the message:

  • Self State Monitor turns the Moira-Notifier service off, switching it to the ERROR state.

    Note

    When Moira-Notifier switches to the ERROR state, it mutes all messages to end users and only alerts administrators about Moira health issues. You need to fix the existing problems and then manually switch Moira-Notifier back to OK using the API.

    When Moira-Notifier is not in the OK state, Moira will show you an error in the Web UI:


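The behaviour described in the steps above amounts to a small state machine with a one-way automatic transition. The sketch below is a simplified model under stated assumptions, not Moira's code: the `SelfStateMonitor` class, its methods, and the message format are hypothetical. In particular, the sketch alerts administrators on every failed check, which the text above does not specify.

```python
class SelfStateMonitor:
    """Hypothetical model of the OK -> ERROR behaviour described above."""

    OK = "OK"
    ERROR = "ERROR"

    def __init__(self, notify_admins):
        self.state = self.OK
        # notify_admins is any callable that accepts a message string.
        self.notify_admins = notify_admins

    def tick(self, problems):
        """Run one periodic check (every 10 seconds, per the text above)."""
        if problems:
            self.notify_admins("Moira health issue: " + "; ".join(problems))
            self.state = self.ERROR
        # No automatic recovery: a clean check does not restore OK.

    def can_notify_users(self):
        """End-user notifications are sent only while the state is OK."""
        return self.state == self.OK

    def reset(self):
        """Manual step performed by an administrator after fixing the issue."""
        self.state = self.OK


admin_inbox = []
monitor = SelfStateMonitor(admin_inbox.append)
monitor.tick([])                                    # healthy
monitor.tick(["Moira-Filter receives no metrics"])  # failure detected
monitor.tick([])                                    # failure gone, still ERROR
print(monitor.state, monitor.can_notify_users())    # prints: ERROR False
monitor.reset()                                     # manual switch back to OK
print(monitor.state, monitor.can_notify_users())    # prints: OK True
```

The key design point is that `tick` never moves the state back to OK on its own, which matches the requirement for manual intervention described in the note above.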
Turn Moira Notifier On and Off

You can view or change the current Moira-Notifier state on the hidden /notifications page.

Notifier toggle

Warning

Please note that this toggle changes the Moira-Notifier state, not user notification preferences.

When you disable notifications with this toggle, Moira-Notifier stops sending messages to all users.