Measure service unavailability during upgrade

Question

I am deploying a microservices-based application using an orchestrator (Rancher, specifically).

During the service upgrade (when new images are being pulled and services re-discover one another), there is a small service outage.

What is the best / recommended way to measure the downtime?

At the moment I am running something like

watch -n1 'wget --spider http://some.endpoint.whoa'

but I want to measure how long the endpoint keeps returning error responses such as 502.

score 1 · Answer 1 · answered Oct 20 '19 at 08:02

What you suggest might be the simplest way to collect the data, but you will have to do quite a bit of work to extract the availability over certain periods.
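To give an idea of the extraction work involved, here is a minimal sketch in Python rather than shell. It assumes the endpoint answers HEAD requests (which is what `wget --spider` sends) and that a probe counts as failed when the status is outside 200-399 or no response arrives at all. The function names `probe`, `outage_windows` and `monitor` are made up for this illustration.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=2.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502 while the upgrade is in progress
    except OSError:
        return 0  # connection refused, DNS failure, timeout, ...


def outage_windows(samples):
    """Group consecutive failing samples into (start, end) windows.

    samples is an iterable of (timestamp, status) pairs in time order.
    A window starts at the first failing sample and ends at the next
    successful one, so its length is only accurate to one probe interval.
    """
    windows = []
    start = None
    for timestamp, status in samples:
        failed = not (200 <= status < 400)
        if failed and start is None:
            start = timestamp
        elif not failed and start is not None:
            windows.append((start, timestamp))
            start = None
    if start is not None:
        windows.append((start, None))  # still failing at the last sample
    return windows


def monitor(url, duration=300.0, interval=1.0):
    """Probe url every `interval` seconds for `duration` seconds."""
    samples = []
    deadline = time.time() + duration
    while time.time() < deadline:
        samples.append((time.time(), probe(url)))
        time.sleep(interval)
    return samples
```

You would start `monitor("http://some.endpoint.whoa")` just before triggering the upgrade and pass the result to `outage_windows`; subtracting start from end for each window gives the downtime in seconds, to within the probe interval.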

I think it's fair to say that if you want availability, you need a monitoring system. This means having an extra service in your catalogue that continuously probes the availability of your microservices over time. Storing the probe results in a time-series database would allow you to run queries that establish availability over various periods.
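As a rough illustration of the kind of query meant here, the sketch below computes availability as the fraction of successful probes in a time window. It assumes the stored results are available as (timestamp, status) pairs and that a status of 200-399 counts as "up"; in a real monitoring system this would be a query against the time-series database rather than Python code, and the function name `availability` is hypothetical.

```python
def availability(samples, start, end):
    """Fraction of successful probes with start <= timestamp < end.

    samples is an iterable of (timestamp, status) pairs.
    Returns None when the window contains no probes at all, since
    "no data" should not be reported as either 0% or 100%.
    """
    in_window = [status for timestamp, status in samples
                 if start <= timestamp < end]
    if not in_window:
        return None
    successes = sum(1 for status in in_window if 200 <= status < 400)
    return successes / len(in_window)


# Four probes, two of which returned 502 during the upgrade.
samples = [(0, 200), (1, 502), (2, 502), (3, 200)]
print(availability(samples, 0, 4))  # 0.5
```

The same calculation over an hour, a day or a month is just a different `start` and `end`, which is why keeping the raw probe results around is more useful than timing a single outage by hand.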

There are many tools that could do this for you. A good starting place would be the CNCF monitoring landscape.
