My home server has about 15 distinct services running, each represented by a Docker container. Depending on how the developers implemented the Dockerfile, some services have health checks, but most do not.
At the moment, I have no way to make sure these services stay healthy and running. I do have `restart: always` set, but in the worst case that just means Docker endlessly restarts a service that fails to start or crashes on startup. Or maybe a `docker pull` caused an issue during an update. There could be many reasons why these services fail to start, or stop running during the day.
What I'd like is to know when they go down. Maybe an email that notifies me, so I can go fix the issue instead of only finding out when a family member tells me or when I try to use that service.
Is there a reasonable way to monitor multiple Docker containers in this way? Do I have to hand-write a bunch of scripts that run `ps` to check process uptime or something? An out-of-the-box solution would be preferable. Note that I use Docker Compose for everything and have only one physical machine running these services, so I don't need to worry about complex setups.
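To make the hand-written route concrete, here is a minimal sketch of the kind of script I mean. It is not a recommended solution, just an illustration. It assumes the `docker` CLI is on the PATH, that `docker ps -a --format '{{json .}}'` prints one JSON object per line with `Names` and `Status` fields, and that a mail relay is listening on localhost. The function names and both email addresses are hypothetical.

```python
import json
import smtplib
import subprocess
from email.message import EmailMessage


def find_problem_containers(ps_output):
    """Return (name, status) pairs for containers that are not up and healthy.

    Expects the text printed by: docker ps -a --format '{{json .}}'
    (one JSON object per line).
    """
    problems = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue
        container = json.loads(line)
        status = container.get("Status", "")
        # A running container reports "Up ..."; a failing health check
        # adds "(unhealthy)". Anything else (Restarting, Exited, ...) is flagged.
        if not status.startswith("Up") or "(unhealthy)" in status:
            problems.append((container.get("Names", "?"), status))
    return problems


def main():
    """Check all containers once and email a summary if any look broken."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{json .}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    problems = find_problem_containers(result.stdout)
    if not problems:
        return
    msg = EmailMessage()
    msg["Subject"] = f"{len(problems)} container(s) need attention"
    msg["From"] = "nuc@example.com"  # hypothetical sender
    msg["To"] = "me@example.com"     # hypothetical recipient
    msg.set_content("\n".join(f"{name}: {status}" for name, status in problems))
    # Assumption: an SMTP relay accepts mail on localhost, port 25.
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```

Something would still have to call `main()` on a schedule (cron, for example), and because of `-a` the script also flags containers I stopped on purpose. That is exactly the sort of upkeep I'd like to avoid with an out-of-the-box tool.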
Your home server is a highly available Kubernetes cluster running on GCP/AWS/Azure/Alibaba Cloud...?
@simbo1905 Nope, just a little Intel NUC running in my office.