Monitoring theory, from scratch — Indicators and synthetics

Indicators

An indicator you do not want to see light up.

Do ideal indicators exist?

The Nagios monitoring system

If ideal indicators did exist, would they have sufficed?

  • First, not everything can be turned into a clean, unambiguous indicator. Some conditions are vaguer than others, and some failures are intermittent or partial. Full coverage is also unlikely to be feasible given the endless number of possible failure states.
  • Second, indicators are a post-factum mechanism. Even an idealized indicator, once triggered, means we’re already in the problem zone. Since fixing problems is never a zero-time, zero-effort endeavor, there is considerable value in predicting trouble before it starts, something indicators are not well suited to.
  • Third, knowing something broke is half the battle. Fixing it is the other 80%. Indicators have their limits in helping us figure out how it broke and how to fix that. They’re simply not expressive enough.
  • Fourth, modern tech systems change. A lot. Their state changes, we deploy or upgrade them, a third party we interact with does the same… any of these may influence our indicators and expectations (e.g. we may expect a certain component to be unavailable or degraded during an upgrade). These aren’t indicators we need to observe. These are events, and we’ll discuss those later too.
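
To make the fourth point concrete, here is a minimal sketch of an indicator that knows about planned change events. It is only an illustration under stated assumptions: the names Status, ChangeWindow and evaluate are hypothetical and do not come from Nagios or any other monitoring tool, and the raw health result is assumed to be supplied by some probe that is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Discrete indicator states (hypothetical names, for illustration only)."""
    OK = "ok"
    CRITICAL = "critical"
    EXPECTED_DOWN = "expected_down"  # failing, but inside a known change window


@dataclass(frozen=True)
class ChangeWindow:
    """A planned event (deploy, upgrade) during which a component may be down."""
    component: str
    start: float  # seconds since epoch
    end: float    # seconds since epoch


def evaluate(component: str, healthy: bool, now: float,
             windows: list[ChangeWindow]) -> Status:
    """Map a raw probe result to a discrete status, taking events into account."""
    if healthy:
        return Status.OK
    for window in windows:
        # A failure during a matching planned window is expected, not alarming.
        if window.component == component and window.start <= now <= window.end:
            return Status.EXPECTED_DOWN
    return Status.CRITICAL
```

With a window registered for a component named "db", a failed probe inside that window evaluates to EXPECTED_DOWN, while the same failure outside it, or on another component, evaluates to CRITICAL. The indicator stays discrete; the event only changes how a failure is interpreted.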

Synthetics

Summary/Takeaways

  • Try to pick indicators that offer the smallest gap between your desired definition of ‘up’ and their ability to report it.
  • Document and train operators on such gaps and implementation limits.
  • Try to pick indicators that are as discrete as possible and leave as little room as possible for interpretation.
  • If you find yourself scratching your head a lot while looking at your indicators, revise and iterate.
  • Remember that indicators aren’t predictive by nature and need to be complemented with other measures/systems.
  • Remember that system state changes, and you need to make sure your indicators stay fresh and react to it. It’s always an ongoing process.
  • Consider supporting synthetic transactions in your indicator strategy, particularly if you have a complex, distributed end-to-end system.
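
As a rough illustration of the last takeaway, a synthetic transaction can be thought of as a scripted user journey whose steps run in order, with the overall result reduced to a discrete up/down indicator. The sketch below assumes each step is a plain callable returning True on success; run_synthetic, is_up and StepResult are hypothetical names, and a real check would replace the callables with actual requests against the system.

```python
from __future__ import annotations

import time
from typing import Callable, NamedTuple


class StepResult(NamedTuple):
    name: str
    ok: bool
    seconds: float  # how long the step took


def run_synthetic(steps: list[tuple[str, Callable[[], bool]]],
                  clock: Callable[[], float] = time.monotonic) -> list[StepResult]:
    """Run the steps in order and stop at the first failure."""
    results: list[StepResult] = []
    for name, step in steps:
        started = clock()
        try:
            ok = bool(step())
        except Exception:
            ok = False  # an exception counts as a failed step
        results.append(StepResult(name, ok, clock() - started))
        if not ok:
            break  # later steps depend on earlier ones, so skip them
    return results


def is_up(results: list[StepResult], expected_steps: int) -> bool:
    """The journey is 'up' only if every expected step ran and succeeded."""
    return len(results) == expected_steps and all(r.ok for r in results)
```

The per-step results show where in the journey things broke, which a single up/down light cannot, while is_up still gives operators one discrete answer.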

A Gil, of all trades. DevOps roles are often called “a one man show”. As it turns out, I’m not a man and never was. Welcome to this one (trans) woman show.

Gil Bahat (she/her)
