Alarm¶
- class lsst.ts.watcher.Alarm(name, log=None)¶
Bases: object
A Watcher alarm.
- Parameters:
- name
str
Name of alarm. This must be unique among all alarms and should be of the form system.[subsystem….]_name so that groups of related alarms can be acknowledged.
- log
logging.Logger, optional
Parent logger.
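The naming convention above exists so that related alarms can be handled as a group. As a minimal sketch, assuming that a group is selected by matching a regular expression against alarm names (the alarm names and the `select_alarms` helper are hypothetical and not part of lsst.ts.watcher):

```python
import re

# Hypothetical alarm names that follow the system.name convention.
alarm_names = ["Enabled.ATDome", "Enabled.ATMCS", "Heartbeat.ATDome"]


def select_alarms(pattern, names):
    """Return the names that fully match the regular expression."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Select every alarm in the hypothetical "Enabled" system.
selected = select_alarms(r"Enabled\..*", alarm_names)
```

A shared prefix makes such a pattern short and unambiguous, which is the practical benefit of the recommended name form.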
- Attributes:
- name
str
Name of alarm.
- acknowledged_by
str
The user argument when the alarm is acknowledged. “” if not acknowledged.
- auto_acknowledge_delay
float
The delay (seconds) after which an alarm will be automatically acknowledged. Never if 0 (the default).
- auto_unacknowledge_delay
float
The delay (seconds) after which an alarm will be automatically unacknowledged. Never if 0 (the default).
- do_escalate
bool
Should the alarm be escalated? The value is set by this class and is intended to be read by the alarm callback.
- escalation_delay
float
If an alarm goes to critical state and remains unacknowledged for this period of time (seconds), the alarm should be escalated. If 0, the alarm will not be escalated.
- escalation_responder
str
Who or what to escalate the alarm to. If blank, the alarm will not be escalated.
- escalated_id
str
ID of the SquadCast escalation alert. “” if not escalated. Set to “Failed: {reason}” if escalation failed. This is set to “” by reset, and is intended to be set to non-empty values by the alarm callback.
- severity_queue
asyncio.Queue or None
Intended only for unit tests. Defaults to None. If a unit test sets this to an asyncio.Queue, set_severity will queue the severity every time it returns True.
- auto_acknowledge_task:
asyncio.Future
A task that monitors the automatic acknowledge timer.
- auto_unacknowledge_task:
asyncio.Future
A task that monitors the automatic unacknowledge timer.
- escalating_task
asyncio.Future
A task that monitors the process of escalating an alarm to a notification service such as SquadCast. This timer is managed by WatcherCsc, because it knows how to communicate with the notification service.
- escalation_timer_task
asyncio.Future
A task that monitors the escalation timer. When this timer fires, it sets do_escalate to true and calls the callback. It is then up to the CSC to actually escalate the alarm (see escalating_task).
- unmute_task:
asyncio.Future
A task that monitors the unmute timer.
Attributes Summary
callback
Get the callback function.
muted
Is this alarm muted?
nominal
True if alarm is in nominal state: severity = max severity = NONE.
Methods Summary
acknowledge(severity, user)
Acknowledge the alarm.
assert_equal(other[, ignore_attrs])
Assert that this alarm equals another alarm.
assert_next_severity(expected_severity[, ...])
Wait for and check the next severity.
close()
Cancel pending tasks.
configure_basics([callback, ...])
Configure the callback function and auto ack/unack delays.
configure_escalation(escalation_delay, ...)
Configure escalation.
Remove all items from the severity queue.
init_severity_queue()
Initialize the severity queue.
make_log_entry(log_server_url)
Post a message to the narrative log in response to the alarm.
mute(duration, severity, user)
Mute this alarm for a specified duration and severity.
reset()
Reset the alarm to nominal state.
run_callback()
Run the callback function, if present.
set_severity(severity, reason)
Set the severity.
unacknowledge([escalate])
Unacknowledge the alarm.
unmute()
Unmute this alarm.
Attributes Documentation
- callback¶
Get the callback function.
- muted¶
Is this alarm muted?
- nominal¶
True if alarm is in nominal state: severity = max severity = NONE.
When the alarm is in nominal state it should not be displayed in the Watcher GUI.
Methods Documentation
- async acknowledge(severity, user)¶
Acknowledge the alarm.
Halt the escalation timer, if running, and set do_escalate False. Restart the auto unacknowledge timer, if configured (self.auto_unacknowledge_delay > 0).
- Parameters:
- severity
lsst.ts.idl.enums.Watcher.AlarmSeverity or int
Severity to acknowledge. Must be >= self.max_severity. If the severity goes above this level the alarm will unacknowledge itself.
- user
str
Name of user; used to set acknowledged_by.
- Returns:
- updated
bool
True if the alarm state changed (any fields were modified other than tasks being cancelled), False otherwise.
- Raises:
- ValueError
If severity < self.max_severity. In this case the acknowledge method does not change the alarm state.
Notes
severity is an argument in order to handle the case where a user acknowledges an alarm just as the alarm severity increases. To avoid the danger of accidentally acknowledging an alarm at a higher severity than intended, the acknowledgement is rejected in that case.
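The severity check described in these notes can be illustrated with a small stand-in. `ToyAlarm` and its integer severities are hypothetical and are not part of lsst.ts.watcher; the sketch models only the rule that an acknowledgement issued below the current maximum severity is rejected without changing state.

```python
class ToyAlarm:
    """Hypothetical stand-in that models only the acknowledge severity check."""

    def __init__(self, max_severity):
        self.max_severity = max_severity
        self.acknowledged = False
        self.acknowledged_by = ""

    def acknowledge(self, severity, user):
        # Reject an acknowledgement issued for a lower severity than the
        # alarm has reached; the state is left untouched in that case.
        if severity < self.max_severity:
            raise ValueError(
                f"severity {severity} < max_severity {self.max_severity}"
            )
        if self.acknowledged:
            return False  # already acknowledged: nothing changed
        self.acknowledged = True
        self.acknowledged_by = user
        return True


# The alarm rose to severity 3 while the user was acknowledging severity 2.
alarm = ToyAlarm(max_severity=3)
try:
    alarm.acknowledge(severity=2, user="operator")
except ValueError:
    rejected = True
else:
    rejected = False
```

The caller sees the ValueError, can look at the new severity, and can then decide whether to acknowledge again at the higher level.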
- assert_equal(other, ignore_attrs=())¶
Assert that this alarm equals another alarm.
Compares all attributes except tasks and those specified in ignore_attrs.
- async assert_next_severity(expected_severity, check_empty=True, flush=False, timeout=10)¶
Wait for and check the next severity.
Only intended for tests. In order to call this you must first call init_severity_queue (once) to set up a severity queue.
- Parameters:
- expected_severity
AlarmSeverity
The expected severity.
- check_empty
bool, optional
If true (the default): check that the severity queue is empty, after getting the severity.
- flush
bool, optional
If true (not the default): flush all existing values from the queue, then wait for the next severity. This is useful for polling alarms.
- timeout
float, optional
Maximum time to wait (seconds).
- Raises:
- AssertionError
If the severity is not as expected, or if check_empty is true and there are additional queued severities.
- asyncio.TimeoutError
If no new severity is seen in time.
- RuntimeError
If you never called init_severity_queue.
Notes
Here is the typical way to use this method:
* Create a rule.
* Call rule.alarm.init_severity_queue().
* Write SAL messages that are expected to change the alarm severity.
* After writing each such message, call: await rule.alarm.assert_next_severity(expected_severity)
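The wait-and-check pattern behind this method can be sketched with a plain asyncio.Queue. The `next_severity` helper and the integer severity are hypothetical stand-ins, not the Watcher implementation; the sketch shows only the timeout, the comparison, and the optional empty-queue check.

```python
import asyncio


async def next_severity(queue, expected, check_empty=True, timeout=10):
    """Hypothetical helper that mimics the documented checks on a plain queue."""
    # Raises asyncio.TimeoutError if nothing arrives in time.
    severity = await asyncio.wait_for(queue.get(), timeout=timeout)
    assert severity == expected, f"got {severity}, expected {expected}"
    if check_empty:
        assert queue.empty(), "additional severities are queued"
    return severity


async def demo():
    queue = asyncio.Queue()
    queue.put_nowait(2)  # stands in for a severity queued by set_severity
    return await next_severity(queue, expected=2, timeout=1)


result = asyncio.run(demo())
```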
- close()¶
Cancel pending tasks.
- configure_basics(callback=None, auto_acknowledge_delay=0, auto_unacknowledge_delay=0)¶
Configure the callback function and auto ack/unack delays.
- Parameters:
- callback
callable, optional
Function or coroutine to call whenever the alarm changes state, or None if no callback wanted. The function receives one argument: this alarm.
- auto_acknowledge_delay
float, optional
Delay (in seconds) before a stale alarm is automatically acknowledged, or 0 for no automatic acknowledgement. A stale alarm is one that has not yet been acknowledged, but its severity has gone to NONE.
- auto_unacknowledge_delay
float, optional
Delay (in seconds) before an acknowledged alarm is automatically unacknowledged, or 0 for no automatic unacknowledgement. Automatic unacknowledgement only occurs if the alarm persists, because an acknowledged alarm is reset if severity goes to NONE.
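Because the callback may be either a plain function or a coroutine, whatever runs it has to handle both. One common way to do that is sketched below; the `run_callback` function here is a hypothetical illustration and is not claimed to match how the Alarm class dispatches its callback. Strings stand in for alarm objects.

```python
import asyncio
import inspect


async def run_callback(callback, alarm):
    """Hypothetical dispatcher: call the callback and await the result if needed."""
    if callback is None:
        return
    result = callback(alarm)
    if inspect.isawaitable(result):
        await result


seen = []


def sync_callback(alarm):
    seen.append(("sync", alarm))


async def async_callback(alarm):
    seen.append(("async", alarm))


async def demo():
    await run_callback(sync_callback, "alarm-1")
    await run_callback(async_callback, "alarm-2")
    await run_callback(None, "alarm-3")  # no callback configured


asyncio.run(demo())
```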
- configure_escalation(escalation_delay, escalation_responder)¶
Configure escalation.
Set the following attributes:
escalation_delay
escalation_responder
- Parameters:
- Raises:
- ValueError
If escalation_delay < 0, if escalation_delay > 0 and escalation_responder is empty, or if escalation_delay = 0 and escalation_responder is not empty.
- TypeError
If escalation_responder is not a str.
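The error conditions listed above amount to one rule: the delay and the responder must be either both set or both unset. A minimal sketch of such a check follows; `check_escalation` is a hypothetical helper, not the library's code, and the order in which the checks run is an assumption.

```python
def check_escalation(escalation_delay, escalation_responder):
    """Hypothetical validator mirroring the documented error conditions."""
    if not isinstance(escalation_responder, str):
        raise TypeError("escalation_responder must be a str")
    if escalation_delay < 0:
        raise ValueError("escalation_delay must be >= 0")
    # Delay and responder must be both set or both unset.
    if (escalation_delay > 0) != (escalation_responder != ""):
        raise ValueError(
            "escalation_delay and escalation_responder must both be set "
            "or both be unset"
        )


check_escalation(0, "")  # escalation disabled
check_escalation(300.0, "some_team")  # escalation enabled
```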
- init_severity_queue()¶
Initialize the severity queue.
You must call this once before calling assert_next_severity. You may call it again to reset the queue, but that is uncommon.
Warning
Only tests should call this method. Calling this in production code will cause a memory leak.
- async make_log_entry(log_server_url)¶
Post a message to the narrative log in response to the alarm.
- Parameters:
- log_server_url
str
URL of the narrativelog service.
- Returns:
- response
dict
JSON response from the POST request.
- async mute(duration, severity, user)¶
Mute this alarm for a specified duration and severity.
Muting also cancels the escalation timer.
- Parameters:
- duration
float
How long to mute the alarm (sec).
- severity
lsst.ts.idl.enums.Watcher.AlarmSeverity or int
Severity to mute. If the alarm’s current or max severity goes above this level the alarm should be displayed.
- user
str
Name of user who muted this alarm. Used to set muted_by.
- Raises:
- ValueError
If duration <= 0, severity == AlarmSeverity.NONE, or severity is not a valid AlarmSeverity enum value.
Notes
An alarm cannot have multiple mute levels and durations. If mute is called multiple times, the most recent call overwrites information from earlier calls.
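The argument checks listed under Raises can be sketched as follows. `ToySeverity` is a hypothetical stand-in for AlarmSeverity with illustrative values, and `check_mute_args` is not part of lsst.ts.watcher; the sketch shows only how an int or enum value could be validated.

```python
import enum


class ToySeverity(enum.IntEnum):
    """Hypothetical stand-in for AlarmSeverity; the values are illustrative."""

    NONE = 1
    WARNING = 2
    SERIOUS = 3
    CRITICAL = 4


def check_mute_args(duration, severity):
    """Return the severity as an enum member, or raise ValueError."""
    if duration <= 0:
        raise ValueError(f"duration={duration} must be positive")
    severity = ToySeverity(severity)  # raises ValueError for unknown values
    if severity == ToySeverity.NONE:
        raise ValueError("severity must be greater than NONE")
    return severity


muted_severity = check_mute_args(duration=60.0, severity=2)
```

Converting through the enum constructor handles both accepted input types, since an int and an enum member with the same value map to the same member.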
- reset()¶
Reset the alarm to nominal state.
Do not call the callback function. This is designed to be called by Model.enable, which first resets alarms and then feeds them data before writing alarm state.
It sets too many fields to be called by set_severity.
- async run_callback()¶
Run the callback function, if present.
- async set_severity(severity, reason)¶
Set the severity.
Call the callback function unless the alarm was nominal and remains nominal. Put the new severity on the severity queue (if it exists), regardless of whether the alarm was nominal.
- Parameters:
- severity
lsst.ts.idl.enums.Watcher.AlarmSeverity or int
New severity.
- reason
str
The reason for this state; this should be a brief message explaining what is wrong. Ignored if severity is NONE.
- Returns:
- updated
bool
True if the alarm state changed (i.e. if any fields were modified), False otherwise.
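The callback rule and the nominal state can be illustrated with a simplified model. `ToySeverityTracker` is hypothetical and omits most of the real behavior (timers, muting, escalation, and automatic unacknowledgement on rising severity); it models only the severity bookkeeping described on this page, with an illustrative integer for NONE.

```python
class ToySeverityTracker:
    """Hypothetical model of the severity bookkeeping; not the real class."""

    NONE = 1  # illustrative value for the NONE severity

    def __init__(self):
        self.severity = self.NONE
        self.max_severity = self.NONE
        self.acknowledged = False
        self.callback_count = 0

    @property
    def nominal(self):
        return self.severity == self.NONE and self.max_severity == self.NONE

    def set_severity(self, severity):
        was_nominal = self.nominal
        old_state = (self.severity, self.max_severity, self.acknowledged)
        self.severity = severity
        if severity == self.NONE and self.acknowledged:
            # An acknowledged alarm is reset when severity returns to NONE.
            self.max_severity = self.NONE
            self.acknowledged = False
        else:
            self.max_severity = max(self.max_severity, severity)
        # Skip the callback only if the alarm was nominal and still is.
        if not (was_nominal and self.nominal):
            self.callback_count += 1
        return (self.severity, self.max_severity, self.acknowledged) != old_state


tracker = ToySeverityTracker()
tracker.set_severity(ToySeverityTracker.NONE)  # nominal stays nominal: no callback
tracker.set_severity(3)  # the alarm becomes active: callback runs
```

In this model an unacknowledged alarm whose severity returns to NONE keeps its max severity, which corresponds to the stale state described under configure_basics.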
- async unacknowledge(escalate=True)¶
Unacknowledge the alarm. Basically a no-op if nominal or not acknowledged.
- async unmute()¶
Unmute this alarm.