ts_watcher documentation

The Watcher is a CSC that monitors other SAL components and uses the data to generate alarms for display by LOVE. The goal is to provide a simple, uniform interface for handling alarms.

The alarms are generated by rules which are defined in this package. The CSC configuration specifies which of the available rules are used, and the configuration for each rule.

Using lsst.ts.watcher

The fundamental objects that make up the Watcher are rules and alarms. Rules monitor topics from remote SAL components and, based on that information, set the severity of alarms. An alarm contains its own state, including the current severity, whether the alarm has been acknowledged, and the maximum severity seen since the last acknowledgement. Rules are instances of subclasses of BaseRule. Alarms are instances of Alarm.

There is a one-to-one relationship between rules and alarms: every rule contains one associated alarm.

The set of rules used by the Watcher and the configuration of each rule is specified by the CSC configuration. The configuration options for each rule are specified by a schema provided by the rule. A typical Watcher configuration file will specify most available rules, and will likely be large. The Watcher configuration also has a list of disabled SAL components, for use when a subsystem is down for maintenance or repair. Rules that use a disabled SAL component are not loaded.
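To illustrate how disabled SAL components affect which rules are loaded, here is a minimal sketch. The helper name, the rule names and the component names are all hypothetical; the Watcher's actual configuration schema and loading code may differ.

```python
def select_loadable_rules(rule_components, disabled_sal_components):
    """Return the names of rules that use no disabled SAL component.

    Hypothetical helper, not part of lsst.ts.watcher.
    rule_components maps a rule name to the SAL components it uses.
    """
    disabled = set(disabled_sal_components)
    return [
        rule_name
        for rule_name, components in rule_components.items()
        if disabled.isdisjoint(components)
    ]


# Illustrative rule and component names (assumptions).
rule_components = {
    "Enabled.ATDome": ["ATDome"],
    "Heartbeat.ATCamera": ["ATCamera"],
    "ATCameraDewar": ["ATCamera"],
}
loadable = select_loadable_rules(rule_components, ["ATCamera"])
print(loadable)
```

In this sketch every rule that touches ATCamera is skipped, and the rule that only uses ATDome is kept.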

Severity Levels

Alarms have the following available severity levels:

  • CRITICAL: Equipment is in danger; phone calls will be made, or texts or emails sent, if not acknowledged in time (a feature that is planned but not yet implemented).

  • SERIOUS: Something is broken.

  • WARNING: Something is wrong but we can probably keep operating.

  • NONE: No alarm condition present.

Note that most alarms will only use one or two of the severity levels higher than NONE.

Each alarm has two severity fields:

  • severity: the current severity, as reported by the rule.

  • max_severity: the maximum severity seen since the alarm was last acknowledged.

Keeping track of max_severity makes sure that transient problems are seen and acknowledged.
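The relationship between the two fields can be sketched as follows. This is not the real Alarm class: the SketchAlarm class, the numeric enum values, and the behavior on acknowledgement (restarting max_severity from the current severity) are assumptions based on the descriptions above.

```python
import enum


class Severity(enum.IntEnum):
    """Stand-in for the Watcher severity levels; values are illustrative."""

    NONE = 1
    WARNING = 2
    SERIOUS = 3
    CRITICAL = 4


class SketchAlarm:
    """Illustrative alarm state; not lsst.ts.watcher.Alarm."""

    def __init__(self, name):
        self.name = name
        self.severity = Severity.NONE
        self.max_severity = Severity.NONE

    def set_severity(self, severity):
        """Record the severity reported by the rule."""
        self.severity = severity
        # max_severity only ever increases between acknowledgements.
        self.max_severity = max(self.max_severity, severity)

    def acknowledge(self):
        """Assumed behavior: restart max_severity from the current severity."""
        self.max_severity = self.severity


alarm = SketchAlarm("example")
alarm.set_severity(Severity.SERIOUS)  # a transient problem appears
alarm.set_severity(Severity.NONE)     # and goes away again
print(alarm.severity.name, alarm.max_severity.name)
```

After the transient problem clears, severity is back to NONE but max_severity still reports SERIOUS until the alarm is acknowledged.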

Auto Acknowledge and Unacknowledge

You may configure the Watcher to auto-acknowledge and auto-unacknowledge alarms after a configurable period of time, using configuration parameters auto_acknowledge_delay and auto_unacknowledge_delay.

An alarm will be automatically acknowledged only if its current severity stays NONE for the full auto_acknowledge_delay period (i.e. if the problem truly appears to have gone away).

An alarm will be automatically unacknowledged only if the condition does not get worse than the level at which it was acknowledged, and does not get resolved (go to NONE), during the full auto_unacknowledge_delay period after being acknowledged.
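The two conditions above can be written as simple predicates. This is a sketch under stated assumptions, not the Watcher's implementation: the function names, the use of plain numeric timestamps in seconds, and the integer severity values are hypothetical.

```python
# Illustrative severity values (assumptions).
NONE, WARNING, SERIOUS, CRITICAL = 1, 2, 3, 4


def auto_acknowledge_due(none_since, now, auto_acknowledge_delay):
    """Return True if severity has stayed NONE for the full delay.

    none_since is the time at which severity most recently became NONE,
    or None if the current severity is not NONE.
    """
    if none_since is None:
        return False
    return now - none_since >= auto_acknowledge_delay


def auto_unacknowledge_due(
    acked_at, acked_severity, severities_since_ack, now, auto_unacknowledge_delay
):
    """Return True if an acknowledged alarm should be unacknowledged.

    severities_since_ack lists every severity reported after the
    acknowledgement. The alarm qualifies only if none of them is NONE
    (resolved) and none exceeds the acknowledged level (got worse).
    """
    if now - acked_at < auto_unacknowledge_delay:
        return False
    return all(NONE < s <= acked_severity for s in severities_since_ack)


print(auto_acknowledge_due(none_since=10.0, now=70.0, auto_acknowledge_delay=60.0))
print(auto_unacknowledge_due(0.0, SERIOUS, [SERIOUS, WARNING], 100.0, 60.0))
```

A real implementation would more likely use timers that are cancelled when the severity changes; the predicates here only capture the conditions.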

Other Classes

Model manages all the rules that are in use. The model uses the Watcher configuration to construct rules, construct salobj remotes and topics, and wire everything together. The model also disables rules when the Watcher CSC is not in the ENABLED state.

To reduce resource usage, remotes (instances of lsst.ts.salobj.Remote) and topics (instances of lsst.ts.salobj.topics.ReadTopic) are only constructed if a rule that is in use needs them. Remotes and topics are also shared: if more than one rule needs the same remote or topic, only one instance is constructed.

Since rules share remotes and topics, the rule’s constructor does not construct remotes or topics (which also means that a rule’s constructor does not make the rule fully functional). Instead a rule specifies the remotes and topics it needs by constructing RemoteInfo objects, which the Model uses to construct the remotes and topics and connect them to the rule.
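The division of labor described above can be sketched with stand-in classes. None of these are the real RemoteInfo, Remote, BaseRule or Model; the class names, attributes and topic names are hypothetical and only show the idea of rules declaring their needs and a model constructing shared objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRemoteInfo:
    """What a rule needs from one SAL component (illustrative)."""

    name: str
    index: int
    topic_names: tuple


class SketchRemote:
    """Stand-in for a remote; records which topics were requested."""

    def __init__(self, name, index):
        self.name = name
        self.index = index
        self.topic_names = set()


class SketchRule:
    """The constructor only records needs; it builds no remotes."""

    def __init__(self, name, remote_info_list):
        self.name = name
        self.remote_info_list = remote_info_list
        self.remotes = {}  # filled in later by the model


class SketchModel:
    """Construct each needed remote once and connect it to the rules."""

    def __init__(self, rules):
        self.remotes = {}
        for rule in rules:
            for info in rule.remote_info_list:
                key = (info.name, info.index)
                if key not in self.remotes:
                    self.remotes[key] = SketchRemote(info.name, info.index)
                remote = self.remotes[key]
                remote.topic_names.update(info.topic_names)
                rule.remotes[key] = remote


rule_a = SketchRule("a", [SketchRemoteInfo("ESS", 1, ("tel_temperature",))])
rule_b = SketchRule("b", [SketchRemoteInfo("ESS", 1, ("evt_heartbeat",))])
model = SketchModel([rule_a, rule_b])
print(len(model.remotes))
```

Both rules end up holding the same remote object, which carries the union of the topics they asked for.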

TopicCallback supports calling more than one rule from a topic. This is needed because a salobj topic can only call back to a single function and we may have more than one rule that wants to be called.
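The fan-out idea can be sketched as a single callable that forwards each sample to several rule callbacks. The class and method names below are hypothetical and do not reflect the actual TopicCallback interface.

```python
class FanOutCallback:
    """One callable that forwards each sample to many rule callbacks.

    Illustrative only; not lsst.ts.watcher.TopicCallback.
    """

    def __init__(self):
        self._rule_callbacks = []

    def add_rule_callback(self, callback):
        """Register one more rule callback for this topic."""
        self._rule_callbacks.append(callback)

    def __call__(self, data):
        # The topic calls this single function; every rule sees the sample.
        for callback in self._rule_callbacks:
            callback(data)


received = []
fan_out = FanOutCallback()
fan_out.add_rule_callback(lambda data: received.append(("rule_a", data)))
fan_out.add_rule_callback(lambda data: received.append(("rule_b", data)))

# A topic would invoke its one callback like this when a sample arrives.
fan_out({"temperature": 21.5})
print(received)
```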

Rules are isolated from each other in two ways, both of which are implemented by wrapping each remote with multiple instances of RemoteWrapper, one instance per rule that uses the remote:

  • A rule can only see the topics that it specifies it wants. This eliminates a source of surprising errors: if rule A uses a topic specified only by rule B, then that topic is only available to rule A when rule B is also in use.

  • A rule can only see the current value of a topic; it cannot wait on the next value of a topic. That prevents one rule from stealing data from another rule.
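Both kinds of isolation can be sketched with a thin per-rule wrapper. The classes below are stand-ins, not the real RemoteWrapper or salobj topics; the names, the get and put methods, and the topic names are assumptions for illustration.

```python
class SketchTopic:
    """Stand-in for a read topic that caches the most recent sample."""

    def __init__(self):
        self._current = None

    def put(self, data):
        self._current = data

    def get(self):
        return self._current


class SketchRemote:
    """Stand-in for a shared remote with one attribute per topic."""

    def __init__(self, topic_names):
        for name in topic_names:
            setattr(self, name, SketchTopic())


class CurrentValueOnly:
    """Expose only the current value of a topic; no waiting for the next one."""

    def __init__(self, topic):
        self._topic = topic

    def get(self):
        return self._topic.get()


class SketchRemoteWrapper:
    """Per-rule view of a shared remote, limited to the requested topics."""

    def __init__(self, remote, topic_names):
        for name in topic_names:
            setattr(self, name, CurrentValueOnly(getattr(remote, name)))


remote = SketchRemote(["tel_temperature", "evt_summaryState"])
remote.tel_temperature.put(21.5)

# This rule asked only for tel_temperature, so that is all it can see.
wrapper = SketchRemoteWrapper(remote, ["tel_temperature"])
print(wrapper.tel_temperature.get(), hasattr(wrapper, "evt_summaryState"))
```

Because each rule gets its own wrapper, a topic requested only by another rule never appears on it, even though the underlying remote is shared.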

Contributing

lsst.ts.watcher is developed at https://github.com/lsst-ts/ts_watcher. You can find Jira issues for this module using labels=ts_watcher.

Python API reference

lsst.ts.watcher Package

Functions

get_filtered_topic_wrapper_key(topic_key, ...)

Get a key for a filtered topic wrapper.

get_rule_class(classname)

Get a rule class given its name.

get_topic_key(topic)

Compute the key unique to a topic.

Classes

Alarm(name)

A Watcher alarm.

BaseFilteredFieldWrapper(model, topic, ...)

Base class for filtered field wrappers.

BaseRule(config, name, remote_info_list)

Base class for watcher rules.

FieldWrapperList()

A sequence of field wrappers.

FilteredEssFieldWrapper(model, topic, ...)

Track a field of an ESS telemetry topic, with a particular sensor name.

FilteredTopicWrapper(model, topic, filter_field)

Topic wrapper that caches data by the value of a filter field.

IndexedFilteredEssFieldWrapper(model, topic, ...)

A filtered field wrapper for an array field, with metadata indicating indices of interest.

MockModel([enabled])

Model(domain, config[, alarm_callback])

Watcher model: constructs and manages rules and alarms.

RemoteInfo(name, index[, callback_names, ...])

Information about a remote SAL component.

RemoteWrapper(remote, topic_names)

Simple access to the current value of a specified set of topics.

RuleDisabled

Raised by BaseRule's constructor if the rule is disabled.

ThresholdHandler(warning_level, ...[, ...])

Compute severity for a rule that involves one float value with multiple threshold levels.

TopicCallback(topic, rule, model)

Call one or more rules when a salobj topic receives a sample.

WatcherCsc([config_dir, initial_state, override])

The Watcher CSC.

Class Inheritance Diagram

Inheritance diagram of lsst.ts.watcher.alarm.Alarm, lsst.ts.watcher.filtered_field_wrapper.BaseFilteredFieldWrapper, lsst.ts.watcher.base_rule.BaseRule, lsst.ts.watcher.field_wrapper_list.FieldWrapperList, lsst.ts.watcher.filtered_field_wrapper.FilteredEssFieldWrapper, lsst.ts.watcher.filtered_topic_wrapper.FilteredTopicWrapper, lsst.ts.watcher.filtered_field_wrapper.IndexedFilteredEssFieldWrapper, lsst.ts.watcher.testutils.MockModel, lsst.ts.watcher.model.Model, lsst.ts.watcher.remote_info.RemoteInfo, lsst.ts.watcher.remote_wrapper.RemoteWrapper, lsst.ts.watcher.base_rule.RuleDisabled, lsst.ts.watcher.threshold_handler.ThresholdHandler, lsst.ts.watcher.topic_callback.TopicCallback, lsst.ts.watcher.watcher_csc.WatcherCsc

lsst.ts.watcher.rules Package

Classes

ATCameraDewar(config)

Monitor ATCamera dewar temperatures and vacuum.

Clock(config)

Monitor the system clock of a SAL component using the heartbeat event.

DewPointDepression(config)

Check the dew point depression.

Enabled(config)

Monitor the summary state of a CSC.

Heartbeat(config)

Monitor the heartbeat event from a SAL component.

Humidity(config)

Check the humidity.

MTCCWFollowingRotator(config)

Check that the MT camera cable wrap is following the camera rotator.

OverTemperature(config)

Check for something being too hot, such as hexapod struts.

Class Inheritance Diagram

Inheritance diagram of lsst.ts.watcher.rules.atcamera_dewar.ATCameraDewar, lsst.ts.watcher.rules.clock.Clock, lsst.ts.watcher.rules.dew_point_depression.DewPointDepression, lsst.ts.watcher.rules.enabled.Enabled, lsst.ts.watcher.rules.heartbeat.Heartbeat, lsst.ts.watcher.rules.humidity.Humidity, lsst.ts.watcher.rules.mt_ccw_following_rotator.MTCCWFollowingRotator, lsst.ts.watcher.rules.over_temperature.OverTemperature

lsst.ts.watcher.rules.test Package

Classes

ConfiguredSeverities(config)

A test rule that transitions through a specified list of severities, repeatedly.

NoConfig(config)

A minimal test rule that has no configuration and no remotes.

Version History