Self State Monitor
Self State Monitor is a built-in mechanism designed to protect
end users from false NODATA notifications and to notify administrators
and end users about issues in Moira and/or Graphite systems.
Why Self State Monitor
Graphite Relay, Redis DB or the Moira-Filter service can break down.
When that happens, Moira stops receiving metrics from Graphite and has
no data against which to check the state of its triggers. By its normal
logic, Moira would then switch the triggers to the NODATA state and send
alert messages to users.
To handle this situation properly, we recommend turning on the Self State Monitor. With it enabled, Moira stops sending alert messages to end users and instead notifies administrators, as well as end users who have subscribed to Self State alerts (see :ref:`system-subscriptions-description`), about the underlying problem.
Warning
When Self State Monitor detects a problem, it disables trigger notifications to end users, but it can still send notifications about the problem via system-subscriptions. When the problem is resolved, Self State Monitor notifies admins, and end users via system-subscriptions, and turns notifications back on automatically.
Please read this manual before using Self State Monitor in production.
See also
For a better understanding, look at the architecture of the Moira microservices.
When Self State Monitor Helps
Self State Monitor checks for these situations:

- There is no connection between Moira and Redis for longer than ``redis_disconect_delay``.
- Moira-Filter receives no metrics for longer than ``last_metric_received_delay``.
- Moira-Checker checks no triggers for longer than ``last_check_delay``.
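The three checks above can be sketched as a single comparison of timestamps against the configured delays. This is only an illustration, not Moira's actual implementation: the ``Delays`` class, the ``detect_problems`` function, its arguments, and the default values are all hypothetical. Only the three parameter names come from the text.

```python
from dataclasses import dataclass


@dataclass
class Delays:
    # Parameter names are taken from the text; the values (in seconds)
    # are placeholders chosen for this sketch.
    redis_disconect_delay: float = 60.0
    last_metric_received_delay: float = 120.0
    last_check_delay: float = 120.0


def detect_problems(now, last_redis_ok, last_metric_received, last_check, delays):
    """Return the names of the checks that failed; an empty list means healthy.

    All time arguments are timestamps in seconds (hypothetical inputs).
    """
    problems = []
    # No connection between Moira and Redis for longer than the delay.
    if now - last_redis_ok > delays.redis_disconect_delay:
        problems.append("redis")
    # Moira-Filter has received no metrics for longer than the delay.
    if now - last_metric_received > delays.last_metric_received_delay:
        problems.append("filter")
    # Moira-Checker has checked no triggers for longer than the delay.
    if now - last_check > delays.last_check_delay:
        problems.append("checker")
    return problems
```

A check fails only when the elapsed time is strictly longer than its delay, which follows the "longer than" wording used in the list.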
See also
All of the above configuration parameters can be found in the Moira-Notifier section on the configuration page.
How Self State Monitor Works
When you turn Self State Monitor on, it works this way:
1. Self State Monitor checks Moira state every 10 seconds.
2. Something breaks down. It can be Graphite-Relay, the connection to Redis DB, or a crashed Moira-Filter docker container.
3. Self State Monitor sends an alarm message to the administrator with a description of the issue and switches its own state to ``WARN``. Here is an example of the message:
4. Self State Monitor turns the Moira-Notifier service off, switching it to the ``ERROR`` state.

   Note

   When Moira-Notifier switches to the ``ERROR`` state, it mutes all messages to end users and only alerts administrators about Moira health issues. You need to fix the existing problems and then manually switch Moira-Notifier back to ``OK`` using the API.

   When Moira-Notifier is not in the ``OK`` state, Moira shows an error in the Web UI.

5. Self State Monitor waits for ``user_notifications_interval`` and switches its own state from ``WARN`` to ``ERROR`` if the problem persists. Users are then notified about the problem via their system-subscriptions.
6. If the problem disappears, Self State Monitor switches its own state to ``OK`` and sends notifications according to these rules:

   - If the Self State Monitor state changes from ``WARN`` to ``OK``, the notification that Moira is back to normal is sent only to admins.
   - If the state changes from ``ERROR`` to ``OK``, the notification is sent to both admins and users via system-subscriptions.
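The state changes described in the steps above can be summarised as a small state machine. The sketch below is a simplified model for illustration, not Moira's code: the ``State`` enum, the ``step`` function, and the recipient labels are hypothetical. It also assumes that the ``WARN`` to ``ERROR`` switch happens once the time spent in ``WARN`` reaches ``user_notifications_interval``, and it does not model repeated admin alerts or the Moira-Notifier toggle.

```python
from enum import Enum


class State(Enum):
    OK = "OK"
    WARN = "WARN"
    ERROR = "ERROR"


def step(state, problem, seconds_in_warn, user_notifications_interval):
    """Return (new_state, recipients) for one check cycle.

    `problem` is True when any health check currently fails.
    `recipients` lists who is notified on this transition.
    """
    if state is State.OK:
        if problem:
            # A new problem: alert administrators and enter WARN.
            return State.WARN, ["admins"]
        return State.OK, []
    if state is State.WARN:
        if not problem:
            # Recovered before users were told: only admins hear about it.
            return State.OK, ["admins"]
        if seconds_in_warn >= user_notifications_interval:
            # Problem persisted: users are notified via system-subscriptions.
            return State.ERROR, ["users"]
        return State.WARN, []
    # state is State.ERROR
    if not problem:
        # Recovery after users were notified: tell both groups.
        return State.OK, ["admins", "users"]
    return State.ERROR, []
```

For example, ``step(State.WARN, False, 100, 300)`` returns ``(State.OK, ["admins"])``, while ``step(State.ERROR, False, 0, 300)`` returns ``(State.OK, ["admins", "users"])``.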
Note
For a better understanding, look at the pictures below.
If Self State Monitor detects a problem, there are two possible outcomes:

If the problem persists for longer than ``user_notifications_interval``:

.. image:: ../_static/selfstate_full_cycle_WARN_to_ERROR.png
   :alt: Self State sends notifications to admins and users

If the problem persists for less than ``user_notifications_interval``:

.. image:: ../_static/selfstate_full_cycle_WARN_to_OK.png
   :alt: Self State sends notifications to admins only
Turn Moira Notifier On and Off
You can view or change the current Moira-Notifier state
on the hidden /notifications page.
Warning
Please note that this toggle changes the Moira-Notifier state, not user notification preferences.
When you disable notifications with this toggle, Moira-Notifier stops sending messages to all users.