This blog post will guide you through setting up a monitoring system for your Proxmox cluster using Prometheus and Grafana. By implementing this solution, you’ll gain valuable insights into the health and performance of your Proxmox environment, allowing you to identify potential issues before they escalate and optimize resource allocation for your virtual machines and containers.
Who should follow this guide?
This guide is ideal for those who are already comfortable with Proxmox and want to add a monitoring layer to their setup. A basic understanding of Linux concepts and familiarity with the command line will be helpful.
The setup#
I’m going to set up my monitoring so that I don’t have to install anything on my Proxmox nodes. Instead, I will install the complete monitoring stack on one of the VMs running in my cluster. As shown in the following diagram, an AlmaLinux server running in my cluster will host the following:
- Prometheus Server
- Grafana
- Prometheus VE Exporter
Security Considerations:
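While the guide demonstrates the configuration process, it’s important to note that the example uses a weak placeholder password for the Proxmox user that the exporter logs in with. Replace it with a strong password in your own setup.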
Introducing the Tools:#
Prometheus:#
An open-source monitoring system that acts like a central collector, scraping metrics from various sources like Proxmox VE at regular intervals and storing them in a time-series database.
Grafana:#
An open-source platform for visualizing data. It allows you to create interactive dashboards that display the metrics collected by Prometheus, making it easy to understand resource utilization, identify trends, and troubleshoot issues.
Combining these two tools in your Proxmox cluster gives you the following benefits:
- Enhanced Visibility: Gain real-time insights into your Proxmox VE performance, including CPU, memory, storage, and network usage.
- Proactive Problem Detection: Identify potential issues before they escalate, allowing you to take preventive action.
- Improved Resource Management: Optimize resource allocation for your virtual machines and containers based on actual usage data.
- Informed Decision Making: Gain data-driven insights to make informed choices about scaling your Proxmox VE infrastructure.
Setting Up Monitoring#
Prometheus#
Installation#
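There are many ways to install Prometheus. I’ll use one of the official precompiled binaries provided on their downloads page.
After downloading the tar file, create a target directory and extract the archive into it with the following commands. The --strip-components=1 option drops the versioned top-level folder inside the archive, so the binaries end up directly in prometheus/.
mkdir -p prometheus
tar xzvf prometheus.tar.gz -C prometheus --strip-components=1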
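For reference, the download step that precedes the extraction might look like the sketch below. The version number and architecture are assumptions rather than values from this guide, so check the downloads page for the current release. The URL follows the naming pattern of the project's GitHub release assets, and the variable names are introduced here for illustration only.

```shell
# Assumed example values: pick the release and architecture that match your VM.
PROM_VERSION="${PROM_VERSION:-2.53.0}"
PROM_ARCH="${PROM_ARCH:-linux-amd64}"

# Release assets are named prometheus-<version>.<arch>.tar.gz.
PROM_DOWNLOAD_URL="https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.${PROM_ARCH}.tar.gz"

# Print the download command so it can be reviewed before running it.
echo "curl -fL -o prometheus.tar.gz ${PROM_DOWNLOAD_URL}"
```

Saving the file as prometheus.tar.gz keeps the name in line with the tar command used in this section.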
Next, add a dedicated user for Prometheus:
sudo useradd --no-create-home --shell /bin/false prometheus
Create the directories Prometheus needs and give the new user ownership of them:
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Copy the essential Prometheus tools (prometheus and promtool) to /usr/local/bin:
sudo cp prometheus/prometheus /usr/local/bin/
sudo cp prometheus/promtool /usr/local/bin/
Make the prometheus user the owner of these binaries:
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
Configuration#
Use a text editor (on AlmaLinux, vi comes preinstalled) to create the Prometheus configuration file:
sudo vi /etc/prometheus/prometheus.yml
Add configuration: Paste the following configuration details into the file:
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
This configuration defines the default scrape interval (how often Prometheus collects data) and sets up a job that scrapes metrics from the Prometheus server itself (localhost:9090) every 5 seconds.
Save and adjust ownership: Once you’ve added the configuration, save the file and ensure the Prometheus user owns it:
sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
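Before starting the server, it can help to validate the file, since YAML indentation mistakes are easy to make. The following is a minimal sketch, assuming promtool was copied to /usr/local/bin as described earlier. The check_prom_config helper is a name introduced here for illustration, not part of Prometheus.

```shell
# check_prom_config FILE: validate a Prometheus config file with promtool.
check_prom_config() {
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found in PATH" >&2
        return 127
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    # "check config" parses the file and reports syntax errors.
    promtool check config "$1"
}

check_prom_config /etc/prometheus/prometheus.yml || echo "config check did not pass"
```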
Set Prometheus as a Linux service#
Since I want to monitor my cluster at any time, I’ll set up the Prometheus server as a Linux service using systemd.
Add the following to the file /etc/systemd/system/prometheus.service:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
Apply Changes and Verify Status
Reload systemd: After saving the unit file, tell systemd about the change:
sudo systemctl daemon-reload
Start Prometheus: Finally, start the Prometheus service and enable it so that it also starts at boot:
sudo systemctl start prometheus
sudo systemctl enable prometheus
Verify Status: To confirm whether Prometheus is running successfully, check its status with:
sudo systemctl status prometheus
By now, you should have Prometheus running in your VM and accessible from the web UI on port 9090.
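A quick way to confirm this from the command line is to query the management endpoints that Prometheus exposes. This is a sketch under the assumption that you run it on the VM itself. PROM_BASE and check_endpoint are names introduced here, so adjust the host if you test from another machine.

```shell
# check_endpoint URL: fail if the server is unreachable or returns an HTTP error.
check_endpoint() {
    curl -fsS --max-time 5 -o /dev/null "$1"
}

PROM_BASE="${PROM_BASE:-http://localhost:9090}"

# /-/healthy and /-/ready are Prometheus' built-in health and readiness checks.
for endpoint in healthy ready; do
    if check_endpoint "${PROM_BASE}/-/${endpoint}"; then
        echo "ok: ${endpoint}"
    else
        echo "failed: ${endpoint}"
    fi
done
```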
Grafana#
Installation#
For Grafana, I’m going to use the distribution’s package manager to install the OSS version:
sudo dnf install grafana
Once installation is complete, start and enable the Grafana server using these commands:
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
The first command starts the service immediately, while the second ensures it starts automatically on system boot.
Confirm that Grafana is running successfully:
sudo systemctl status grafana-server
Configuration#
Web interface access: With Grafana running, you can access its web interface at http://<VM IP address>:3000 in your web browser.
Initial login: The default credentials are username: ‘admin’ and password: ‘admin’. You’ll be prompted to change the password upon first login for security reasons.
Advanced configuration (optional): For further configuration (e.g., email alerts or port changes), edit the /etc/grafana/grafana.ini file using any text editor:
sudo vi /etc/grafana/grafana.ini
Make any necessary adjustments within the file, save your changes, and restart Grafana (sudo systemctl restart grafana-server) so they take effect.
Prometheus VE Exporter#
We’ll use prometheus-pve-exporter to expose the metrics from our cluster in a format that the Prometheus server can scrape. Grafana then reads them from Prometheus.
Installation#
Make sure the Python venv module is available (on Debian-based systems: apt install python3-venv; on AlmaLinux it normally ships with python3), then prepare a new virtual environment for the PVE Exporter:
sudo python3 -m venv /opt/prometheus-pve-exporter
Install the PVE Exporter into the new environment:
sudo /opt/prometheus-pve-exporter/bin/pip install prometheus-pve-exporter
Configuration#
Before running the exporter, we need to create a Proxmox user that the exporter will use to read the metrics. In the Proxmox web UI, create a new user under the Datacenter -> Users section. Fill in the user name (promexporter in this guide), set the realm to Proxmox VE authentication, and give it a password. Then go to Permissions, set the path to ‘/’, and select your new user. Give it the PVEAuditor role.
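If you prefer the command line over the web UI, the same user and permission can probably be created with pveum in a shell on one of the cluster nodes. This is a sketch under assumptions: the user name promexporter@pve matches the exporter config file used in this guide, the password is a placeholder, and the commands are only printed unless DRY_RUN=0 is set. DRY_RUN and the run helper are introduced here for illustration.

```shell
PVE_USER="promexporter@pve"          # same user as in the exporter config
PVE_PASSWORD="supersecretpassword"   # placeholder, replace it

# run: print the command by default, execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# Create the user in the "pve" (Proxmox VE authentication) realm.
run pveum user add "$PVE_USER" --password "$PVE_PASSWORD"
# Grant the read-only PVEAuditor role on the whole resource tree.
run pveum acl modify / --users "$PVE_USER" --roles PVEAuditor
```

Printing first makes it easy to review the commands before running them for real on a node.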
Next, create a configuration file for the exporter: /etc/prometheus/pve.yml
In this file you need to specify the credentials of the promexporter user we just created. We’ll also add a flag that skips TLS certificate verification when connecting to the Proxmox cluster, since Proxmox uses a self-signed certificate by default:
default:
  user: promexporter@pve
  password: supersecretpassword
  verify_ssl: false
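Because this file holds a password in clear text, it seems sensible to make it readable only by the account that runs the exporter. The sketch below assumes that account is the prometheus user created earlier. It writes the same settings to a private temporary file and prints an install command that would copy it into place with restrictive ownership and permissions. STAGING is a variable introduced here.

```shell
# Write the exporter config to a private temporary file first
# (mktemp creates it readable and writable by the current user only).
STAGING="$(mktemp)"
cat > "$STAGING" <<'EOF'
default:
  user: promexporter@pve
  password: supersecretpassword
  verify_ssl: false
EOF

# Print the install step for review. It copies the file into place, owned by
# the prometheus user and with mode 600, in a single command.
echo "sudo install -o prometheus -g prometheus -m 600 $STAGING /etc/prometheus/pve.yml"
```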
PVE exporter as Linux Service#
Since I want to monitor my cluster at any time, I’ll set up the PVE exporter as a Linux service using systemd.
Add the following to the file /etc/systemd/system/prometheus-pve-exporter.service:
[Unit]
Description=Prometheus exporter for Proxmox VE
Documentation=https://github.com/znerol/prometheus-pve-exporter
[Service]
Restart=always
User=prometheus
ExecStart=/opt/prometheus-pve-exporter/bin/pve_exporter /etc/prometheus/pve.yml
[Install]
WantedBy=multi-user.target
Apply Changes and Verify Status
Reload systemd: After saving the unit file, tell systemd about the change:
sudo systemctl daemon-reload
Start the service: Finally, start the PVE exporter service and enable it so that it also starts at boot:
sudo systemctl start prometheus-pve-exporter
sudo systemctl enable prometheus-pve-exporter
Verify Status: To confirm whether the Exporter is running successfully, check its status with:
sudo systemctl status prometheus-pve-exporter
By now, you should have the exporter running in your VM and its metrics accessible on port 9221.
You can access the metrics with curl:
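A request along the following lines should return the metrics. It is a sketch: PVE_NODE is a placeholder for the address of one of your Proxmox nodes, EXPORTER and pve_metrics_url are names introduced here, and the /pve path with its target, cluster, and node parameters is the same endpoint that Prometheus is pointed at in the next section.

```shell
EXPORTER="${EXPORTER:-http://localhost:9221}"
PVE_NODE="${PVE_NODE:-192.0.2.10}"   # placeholder address, replace it

# pve_metrics_url NODE: build the exporter URL for one Proxmox node.
pve_metrics_url() {
    echo "${EXPORTER}/pve?target=$1&cluster=1&node=1"
}

# Show the first lines of the response, or a hint if nothing answers.
if metrics="$(curl -fsS --max-time 10 "$(pve_metrics_url "$PVE_NODE")")"; then
    printf '%s\n' "$metrics" | head -n 20
else
    echo "no response from the exporter"
fi
```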
Associating Prometheus with the PVE exporter#
Now, we need to configure the Prometheus server to collect the metrics exposed by the Proxmox PVE Exporter. To do this, we need to edit the Prometheus config file /etc/prometheus/prometheus.yml:
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "pve"
    static_configs:
      - targets:
          - <IP of node 1> # pve1
          - <IP of node 2> # pve2
    metrics_path: /pve
    params:
      module: [default]
      cluster: ["1"]
      node: ["1"]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: <IP from the VM where the exporter is installed>:9221
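Prometheus reads its configuration at startup, so the new job only shows up after the service has been restarted. The sketch below notes the restart and then asks the Prometheus HTTP API for the up series of the pve job, where a value of 1 means the last scrape of that node succeeded. PROM_BASE and pve_up are names introduced here, and the URL assumes you run this on the VM itself.

```shell
# Restart so the new scrape job is loaded (needs root, so left commented out):
# sudo systemctl restart prometheus

PROM_BASE="${PROM_BASE:-http://localhost:9090}"

# pve_up: query the "up" metric for the pve job through the HTTP API.
pve_up() {
    curl -fsS --max-time 5 -G "${PROM_BASE}/api/v1/query" \
        --data-urlencode 'query=up{job="pve"}'
}

pve_up || echo "Prometheus is not reachable at ${PROM_BASE}"
```

The same information is available in the Prometheus web UI on the targets page.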
If everything is correct, by now you should have the metrics from the Proxmox cluster accessible from your browser:
If for some reason you cannot reach any of the addresses we’ve gone through, I’d recommend checking your network configuration. In a previous post I talked about how the firewall configuration on my AlmaLinux server stopped me from accessing these sites from my local computer.
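On AlmaLinux, firewalld is a likely cause. The following is a sketch, assuming firewalld is in use and that the ports from this guide should be reachable from your network (the exporter port 9221 only needs opening if something outside the VM queries it). The commands are printed rather than executed unless DRY_RUN=0 is set. DRY_RUN and the run helper are introduced here for illustration.

```shell
# run: print the command by default, execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# Ports used in this guide: Prometheus, Grafana, PVE exporter.
for port in 9090 3000 9221; do
    run sudo firewall-cmd --permanent --add-port="${port}/tcp"
done

# Permanent rules only take effect after a reload.
run sudo firewall-cmd --reload
```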
Visualizing Proxmox metrics from Grafana#
Once you’ve set up Prometheus and Grafana, the next step is to connect them so Grafana can visualize the metrics collected by Prometheus. Here’s how to achieve this:
Add Prometheus as a Data Source in Grafana:
- Open the Grafana web interface in your browser
- Login using your credentials
- On the left sidebar, click the configuration cogwheel icon and select “Data Sources”.
- Click “Add data source”.
- Choose “Prometheus” as the type.
- In the “URL” field, enter the address of your Prometheus server, for example http://<Prometheus Server IP>:9090. If Prometheus is running on the same machine as Grafana, http://localhost:9090 also works.
- Click “Save & Test” to verify the connection. If successful, you’ll see a message confirming connectivity.
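As an alternative to clicking through the UI, Grafana can also load data sources from provisioning files at startup. The following is a sketch and not part of the setup described in this post. It writes a provisioning file to a local path (OUT is a variable introduced here), which would then be copied to /etc/grafana/provisioning/datasources/ and picked up after restarting grafana-server. The URL assumes Prometheus runs on the same VM as Grafana.

```shell
OUT="${OUT:-./prometheus-datasource.yaml}"

# Minimal data source definition in Grafana's provisioning format.
cat > "$OUT" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF

echo "wrote $OUT"
```

Keeping the data source in a file makes the setup reproducible if the VM ever has to be rebuilt.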
Explore and Visualize Metrics:
The fastest way to put the metrics to use is to import a dashboard that is already available to the public.
This dashboard can be imported into our Grafana server and comes preconfigured with helpful charts and tables for our cluster, which will look like this:
By now, you should be able to explore the dashboard and start monitoring your home lab. This will give you valuable insights into the health and performance of your Proxmox VE environment.
Thank you for reading!