Installation of Network & Services monitoring - Grafana/Prometheus
Whether you administer a set of internet services or networked devices, you know how much of a black box they become once they are set up. After the optimizations done during the initial setup, every device or service executes its tasks with only the final results visible and no intermediate problems observable.
There are tons of supposedly "ready-made" solutions, but in my opinion no pre-scripted solution is going to be aware of your individual needs, or of the additional capabilities and data going through your systems. Hence, a good base of a time-series database (Prometheus) and an observability dashboard (Grafana) is plenty to get all the information we need in our administrative tasks.
Preparing the Hardware
I am going to use a Raspberry Pi to host the software mentioned above. This choice comes with some downsides related to its ARM architecture, so an AMD64 device is preferable if you have one.
I am going to connect the Raspberry Pi to an isolated network through a USB Ethernet interface and create some VLANs to access other subnets.
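As a rough sketch of the VLAN part, the helper below only prints the iproute2 commands for one tagged sub-interface so they can be reviewed first. The interface name eth1, the VLAN ID 10 and the address are hypothetical values, not taken from my setup. Running the printed commands needs root, and the result does not survive a reboot unless you also add it to your network manager's configuration.

```shell
#!/bin/sh
# Print (not run) the iproute2 commands for one tagged VLAN sub-interface.
# Usage: vlan_commands <parent-interface> <vlan-id> <address/prefix>
vlan_commands() {
    parent="$1"; vid="$2"; addr="$3"
    echo "ip link add link ${parent} name ${parent}.${vid} type vlan id ${vid}"
    echo "ip addr add ${addr} dev ${parent}.${vid}"
    echo "ip link set dev ${parent}.${vid} up"
}

# Hypothetical values: eth1 is the USB adapter, VLAN 10 carries 192.168.10.0/24.
vlan_commands eth1 10 192.168.10.2/24
```

Each printed line can then be run with sudo, once per VLAN you want the monitoring host to reach.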
Don't forget that we are setting up this device to serve as a source of truth in case of disasters as well, so placing it in a highly available location, behind a backup power supply, is preferable.
Setting up the data collection software
Database
Observability is 100% a statistics problem, so we need a hub that records state changes over time. Without this part, the project might as well not exist at all.
Prometheus is a Go application shipped as a single executable. The prebuilt releases run as they are, so a Go toolchain is only needed if you build it from source. This single application is going to store the state of all points of interest and let us query it for the statistics we are interested in.
Since we are using a prebuilt release, we can proceed with:
wget [Latest release matching your infrastructure from https://github.com/prometheus/prometheus/releases]
tar xf [the downloaded file]
mv [extracted dir] prometheus/ #for organisation
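To pick the release "matching your infrastructure", it helps to know which architecture suffix to look for on the releases page. The sketch below maps the output of uname -m to the suffix used in the release file names (linux-amd64, linux-arm64, linux-armv7, linux-armv6). The function name release_arch is my own, and the mapping only covers the cases I would expect on a Raspberry Pi or a regular PC.

```shell
#!/bin/sh
# Map `uname -m` output to the architecture suffix used in release file names.
# With no argument, the machine type of the current host is used.
release_arch() {
    case "${1:-$(uname -m)}" in
        x86_64)  echo "amd64" ;;
        aarch64) echo "arm64" ;;
        armv7l)  echo "armv7" ;;
        armv6l)  echo "armv6" ;;
        *)       echo "unknown" ;;   # not covered here, check the releases page
    esac
}

echo "Look for a release file ending in: linux-$(release_arch).tar.gz"
```

The same suffix applies to the other exporters published under the prometheus organisation on GitHub.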
and go straight into launching it (for now, just to test if it is going to launch at all):
/home/pi/prometheus/prometheus \
--config.file=/home/pi/prometheus/prometheus.yml \
--storage.tsdb.path=/home/pi/prometheus/data
The config.file and storage.tsdb.path flags are very important: their defaults are relative to the working directory, so setting them explicitly is going to save us from hassle with the defaults.
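For reference, the file that config.file points at can be very small. The sketch below writes a minimal prometheus.yml in which Prometheus scrapes only its own metrics endpoint. It assumes it is run from the directory that contains prometheus/ (in my case /home/pi), the 15s interval is just an assumed value, and it overwrites the sample prometheus.yml shipped in the release archive.

```shell
#!/bin/sh
# Write a minimal Prometheus config that only scrapes Prometheus itself.
# Run from the directory that contains prometheus/ (assumed: /home/pi).
mkdir -p prometheus

cat > prometheus/prometheus.yml <<'EOF'
global:
  scrape_interval: 15s       # how often targets are scraped (assumed value)

scrape_configs:
  - job_name: prometheus     # Prometheus scraping its own /metrics endpoint
    static_configs:
      - targets: ["localhost:9090"]
EOF

echo "wrote prometheus/prometheus.yml"
```

The release archive also contains promtool, so the result can be validated with ./promtool check config prometheus.yml from inside the prometheus/ directory before starting the server.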
We can also set up a systemd service by creating the unit file with sudo touch /etc/systemd/system/prometheus.service
and putting this text in it:
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
#Wants=blackbox_exporter.service snmp_exporter.service
[Service]
User=pi
Restart=on-failure
ExecStart=/home/pi/prometheus/prometheus \
--config.file=/home/pi/prometheus/prometheus.yml \
--storage.tsdb.path=/home/pi/prometheus/data
[Install]
WantedBy=multi-user.target
After sudo systemctl daemon-reload and sudo systemctl enable --now prometheus, if the daemon has started successfully, we can visit http://localhost:9090/ and explore the data collected by Prometheus by default.
Data Exporters
The main Prometheus executable is a database with a query interface for the stored data, and that is it.
The actual data is collected by specialized modules called exporters, most of which are also just Go executables.
Let's start with what I think is the simplest data collection method there is: ICMP pinging of devices to check for their presence.
This task is handled by the Blackbox Exporter service, which we are going to install the same way we installed the Prometheus executable.
wget [Latest release matching your infrastructure from https://github.com/prometheus/blackbox_exporter/releases]
tar xf [the downloaded file]
mv [extracted dir] blackbox_exporter/ #for organisation
and we can also create a systemd service for it, in /etc/systemd/system/blackbox_exporter.service:
[Unit]
Description=Blackbox_Exporter for Prometheus
Documentation=https://github.com/prometheus/blackbox_exporter
After=network-online.target
PartOf=prometheus.service
[Service]
User=pi
Restart=on-failure
ExecStart=/home/pi/blackbox_exporter/blackbox_exporter \
--config.file=/home/pi/blackbox_exporter/blackbox.yml
[Install]
WantedBy=multi-user.target
One thing to note: we can set the "PartOf" setting in this unit and the "Wants" setting in prometheus.service (the commented-out line in that unit), so the data collectors stop and start together with the Prometheus service. This way, we only need a single systemctl start/stop prometheus to manage the whole package of services.
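To make the ICMP check concrete, here is a minimal sketch of the two pieces of configuration involved: a blackbox.yml with a single ICMP module, and a scrape job for prometheus.yml that sends each target through the exporter. The job name blackbox_icmp and the target 192.168.1.1 are placeholders of mine, 9115 is the exporter's default port, and the script overwrites the sample blackbox.yml from the release archive. It assumes it is run from the directory that contains blackbox_exporter/.

```shell
#!/bin/sh
# Sketch: an ICMP module for blackbox_exporter plus a matching scrape job.
# Run from the directory that contains blackbox_exporter/ (assumed: /home/pi).
mkdir -p blackbox_exporter

# Overwrites blackbox_exporter/blackbox.yml with a single ICMP module.
cat > blackbox_exporter/blackbox.yml <<'EOF'
modules:
  icmp:
    prober: icmp
    timeout: 5s
EOF

# Prints the job to paste under scrape_configs: in prometheus.yml.
# 192.168.1.1 is a placeholder target; 9115 is the exporter's default port.
blackbox_scrape_job() {
    cat <<'EOF'
  - job_name: blackbox_icmp
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ["192.168.1.1"]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target   # pass the device address as ?target=
      - source_labels: [__param_target]
        target_label: instance         # keep the device address as the label
      - target_label: __address__
        replacement: localhost:9115    # scrape the exporter, not the device
EOF
}

blackbox_scrape_job
```

With this in place, the probe_success metric should be 1 for every device that answers the ping. Keep in mind that ICMP probes need elevated privileges on Linux, so with User=pi the binary may need something like sudo setcap cap_net_raw+ep on the blackbox_exporter executable, or a suitable net.ipv4.ping_group_range setting.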
Setting up the visualization software
Grafana
Grafana is a somewhat more complicated piece of software, as it includes a whole interactive web service, so it might be more manageable to set it up through Docker. As we are running on a device with limited resources, though, I am going to run it directly.
The installation process is described here, but it boils down to:
$ sudo apt-get install -y apt-transport-https software-properties-common wget
$ sudo mkdir -p /etc/apt/keyrings/
$ wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
$ echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
$ sudo apt-get update
$ sudo apt-get install grafana
The Grafana package installs a systemd service (grafana-server), which can be started with sudo systemctl enable --now grafana-server and checked with
sudo systemctl status grafana-server
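Once Grafana is running (by default on port 3000), it still needs to be told where Prometheus lives. This can be clicked through in the web interface, but as a sketch, the script below generates a provisioning file that registers the local Prometheus as the default data source. The file name prometheus-datasource.yaml is my own choice, and the target directory assumes the default layout of the apt package.

```shell
#!/bin/sh
# Sketch: generate a Grafana provisioning file that registers Prometheus
# as the default data source. The file name is a hypothetical choice.
cat > prometheus-datasource.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                # Grafana's backend queries Prometheus
    url: http://localhost:9090   # the Prometheus instance set up earlier
    isDefault: true
EOF

# Only print the privileged steps, so the file can be reviewed first.
echo "Review prometheus-datasource.yaml, then install it with:"
echo "  sudo cp prometheus-datasource.yaml /etc/grafana/provisioning/datasources/"
echo "  sudo systemctl restart grafana-server"
```

After the restart, the data source should show up in Grafana without any manual setup.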
Further reading about configuration and monitoring methods can be found here:
https://amun.pl/blog/network-services-monitoring-with-prometheus-grafana