Amazing script for CPU Temperature Monitoring on Proxmox

Introduction

Today, I relocated my Proxmox Host and ESXi Host to the attic of my garage. Previously, both servers were situated in my home office, but the summer heat and constant noise became unbearable.

To keep an eye on both servers, I need to monitor their temperature and status closely.

Monitoring server temperatures is crucial for optimal performance and hardware longevity. This guide will walk you through setting up a temperature monitoring system on a Proxmox host using lm-sensors, Prometheus Node Exporter, and Grafana.

Installing Necessary Tools

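On a fresh Proxmox installation, the following tools are needed to monitor the CPU temperature. Proxmox normally has you logged in as root and does not install sudo by default, so drop the sudo prefix from the commands below if it is not available on your host.

First, update the package list of your system: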

Bash
sudo apt update


Install lm-sensors, mailutils, and bc (the script uses mail to send warnings and bc to compare decimal temperatures):

Bash
sudo apt install lm-sensors mailutils bc


Detect sensors and configure them:

Bash
sudo sensors-detect


Follow the prompts to complete the configuration. Verify the installation by running:

Bash
sensors


For more detailed instructions, refer to the lm-sensors documentation.
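
Which labels show up depends on the hardware: Tctl comes from the k10temp driver on AMD CPUs, while Intel systems typically report Package id 0 and Core N lines instead. As a rough sketch (the function name list_temp_labels is made up for this example), the following prints every label in the sensors output that carries a °C reading, so you can see which ones are available to grep for:

```shell
#!/bin/bash
# list_temp_labels (hypothetical helper): read `sensors`-style text on stdin
# and print each label whose value starts with a signed reading in °C.
# Lines such as "Adapter: PCI adapter" or voltage readings are skipped.
list_temp_labels() {
    awk -F: '$2 ~ /^ *[+-][0-9.]+°C/ { print $1 }'
}

# Usage on a real host (requires lm-sensors):
#   sensors | list_temp_labels
```

If Tctl or Composite is missing from that list, replace the label in the grep commands of the monitoring script with one that your system does report.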


The Monitoring Script

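Use nano (my favorite editor) to create a new file:

Bash
nano /usr/local/bin/check_temp.sh


Paste in the following script. It greps the sensors output for the Tctl (AMD CPU) and Composite (NVMe) labels, so adjust those labels if your hardware reports different ones, and replace admin@sluijsjes.nl with your own address:

Bash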
#!/bin/bash
# Read the CPU temperature (Tctl)
CPU_TEMP=$(sensors | grep 'Tctl:' | awk '{print $2}' | sed 's/+//g' | sed 's/°C//g')
# Read the NVMe temperature (Composite)
NVME_TEMP=$(sensors | grep 'Composite:' | awk '{print $2}' | sed 's/+//g' | sed 's/°C//g')
# Default thresholds for warnings
CPU_MAX_TEMP=80
NVME_MAX_TEMP=70
# Check that both temperature values are present and numeric (decimals allowed)
if ! [[ "$CPU_TEMP" =~ ^[0-9]+(\.[0-9]+)?$ ]] || ! [[ "$NVME_TEMP" =~ ^[0-9]+(\.[0-9]+)?$ ]]; then
    echo "Could not read temperatures. Check the sensors configuration." | mail -s "Temperature read failed" admin@sluijsjes.nl
    exit 1
fi
# Check the CPU temperature
if (( $(echo "$CPU_TEMP > $CPU_MAX_TEMP" | bc -l) )); then
    echo "Warning: CPU temperature is too high: $CPU_TEMP°C" | mail -s "CPU temperature warning" admin@sluijsjes.nl
fi
# Check the NVMe temperature
if (( $(echo "$NVME_TEMP > $NVME_MAX_TEMP" | bc -l) )); then
    echo "Warning: NVMe temperature is too high: $NVME_TEMP°C" | mail -s "NVMe temperature warning" admin@sluijsjes.nl
fi
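
The parse-and-compare steps in the script can also be written as two small functions, which makes them easy to try against pasted sensors output without touching the real hardware. This is only a sketch: read_temp and is_over are hypothetical helper names, and awk is swapped in here for the grep/sed pipeline and for bc.

```shell
#!/bin/bash
# read_temp LABEL (hypothetical helper): read `sensors`-style text on stdin
# and print the first reading for LABEL, without the leading "+" and
# without the trailing "°C" and anything after it.
read_temp() {
    awk -F: -v label="$1" '
        $1 == label {
            sub(/^ *\+?/, "", $2)   # drop padding and the plus sign
            sub(/°C.*$/, "", $2)    # drop the unit and the (low/high) part
            print $2
            exit                    # only the first matching line counts
        }'
}

# is_over VALUE MAX (hypothetical helper): succeed (exit status 0) when
# VALUE is numerically greater than MAX. Decimals are fine.
is_over() {
    awk -v v="$1" -v max="$2" 'BEGIN { exit !(v + 0 > max + 0) }'
}

# Usage on a real host:
#   CPU_TEMP=$(sensors | read_temp Tctl)
#   if is_over "$CPU_TEMP" 80; then echo "CPU too hot: $CPU_TEMP"; fi
```

Matching on the exact label and stopping at the first hit avoids ending up with several values in one variable when more than one line contains the same text.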


Make your script executable with chmod:

Bash
chmod +x /usr/local/bin/check_temp.sh


Automating with Crontab

Use crontab to run the script on a schedule, in this example every 5 minutes:

Bash
crontab -e


Add the job at the end of the file, after the default comment header:

Bash
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
# Environment settings only apply to the jobs listed after them
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=""
*/5 * * * * /usr/local/bin/check_temp.sh > /dev/null 2>&1
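
With a 5-minute schedule, a host that stays hot sends a new mail on every run. One possible way to limit that, sketched here under the assumption that a small timestamp file is acceptable (the function name should_alert and the path in the usage comment are hypothetical and not part of the original script), is to remember when the last warning went out:

```shell
#!/bin/bash
# should_alert STAMP_FILE COOLDOWN_SECONDS (hypothetical helper):
# succeed if no alert was recorded within the cooldown window, and record
# the current time when it succeeds. Otherwise fail without side effects.
should_alert() {
    local stamp="$1" cooldown="$2" now last=0
    now=$(date +%s)
    [ -f "$stamp" ] && last=$(cat "$stamp")
    last=${last:-0}                      # treat an empty file as "never"
    if (( now - last >= cooldown )); then
        echo "$now" > "$stamp"
        return 0
    fi
    return 1
}

# Usage inside check_temp.sh, at most one CPU warning per hour:
#   if should_alert /var/tmp/check_temp_cpu.stamp 3600; then
#       echo "Warning: CPU temperature is too high" | mail -s "CPU temperature warning" you@example.com
#   fi
```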


Now it's time to send the data to Grafana!

Sending Data to Grafana

To get the temperature data into Grafana, you will need to:

Install and configure Prometheus Node Exporter on the Proxmox host:

Bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar xvfz node_exporter-1.3.1.linux-amd64.tar.gz
sudo cp node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
sudo useradd -rs /bin/false node_exporter


Create a systemd service file for Node Exporter:

Bash
sudo nano /etc/systemd/system/node_exporter.service


Add the following content:

Bash
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=default.target


Start and enable Node Exporter:

Bash
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter


Verify Node Exporter is running:

Bash
curl http://localhost:9100/metrics


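Modify the script to send custom metrics to Prometheus. Node Exporter's textfile collector exposes any metrics it finds in *.prom files in a directory you choose, so the script only has to write the temperatures to such a file. The example below writes to /var/lib/node_exporter/, which does not exist by default, so create that directory first (for example with mkdir -p /var/lib/node_exporter).

For example, modify your script (or make a new one) to write metrics to a file:

Bash
nano /usr/local/bin/sent_temp_grafana.sh


Paste in the following script:

Bash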
#!/bin/bash
# Read the CPU temperature (Tctl)
CPU_TEMP=$(sensors | grep 'Tctl:' | awk '{print $2}' | sed 's/+//g' | sed 's/°C//g')
# Read the NVMe temperature (Composite)
NVME_TEMP=$(sensors | grep 'Composite:' | awk '{print $2}' | sed 's/+//g' | sed 's/°C//g')
# Write metrics to file
echo "# HELP node_cpu_temperature CPU temperature in Celsius" > /var/lib/node_exporter/temperature.prom
echo "# TYPE node_cpu_temperature gauge" >> /var/lib/node_exporter/temperature.prom
echo "node_cpu_temperature $CPU_TEMP" >> /var/lib/node_exporter/temperature.prom
echo "# HELP node_nvme_temperature NVMe temperature in Celsius" >> /var/lib/node_exporter/temperature.prom
echo "# TYPE node_nvme_temperature gauge" >> /var/lib/node_exporter/temperature.prom
echo "node_nvme_temperature $NVME_TEMP" >> /var/lib/node_exporter/temperature.prom
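
Two things can go wrong with the plain echo approach: Node Exporter may read the file while it is half written, and an empty reading produces a line without a value that the collector cannot parse. A possible variant, given here as a sketch only (write_metrics is a hypothetical function name), checks the values first and then writes to a temporary file that is renamed into place:

```shell
#!/bin/bash
# write_metrics FILE CPU_TEMP NVME_TEMP (hypothetical helper):
# write both gauges to FILE, replacing it in one step.
write_metrics() {
    local out="$1" cpu="$2" nvme="$3" tmp
    local number='^-?[0-9]+(\.[0-9]+)?$'
    # Refuse to write anything if a reading is missing or not numeric.
    [[ "$cpu" =~ $number && "$nvme" =~ $number ]] || return 1
    mkdir -p "$(dirname "$out")" || return 1
    # Same directory as the target, so mv is a rename. The name does not
    # end in .prom, so the textfile collector ignores it.
    tmp="$out.$$"
    {
        echo "# HELP node_cpu_temperature CPU temperature in Celsius"
        echo "# TYPE node_cpu_temperature gauge"
        echo "node_cpu_temperature $cpu"
        echo "# HELP node_nvme_temperature NVMe temperature in Celsius"
        echo "# TYPE node_nvme_temperature gauge"
        echo "node_nvme_temperature $nvme"
    } > "$tmp" || return 1
    mv "$tmp" "$out"
}

# Usage in the script, after CPU_TEMP and NVME_TEMP have been read:
#   write_metrics /var/lib/node_exporter/temperature.prom "$CPU_TEMP" "$NVME_TEMP"
```

When a reading is invalid the previous file is left untouched, so Prometheus keeps scraping the last good values rather than hitting a parse error.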


Make your script executable with chmod:

Bash
chmod +x /usr/local/bin/sent_temp_grafana.sh


Use crontab to run this script every 5 minutes as well:

Bash
crontab -e


Add a second job below the first one (the default comment header of the crontab file is omitted here):

Bash
# m h  dom mon dow   command
# Environment settings only apply to the jobs listed after them
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=""
*/5 * * * * /usr/local/bin/check_temp.sh > /dev/null 2>&1
*/5 * * * * /usr/local/bin/sent_temp_grafana.sh > /dev/null 2>&1


Ensure Node Exporter reads the custom metrics file:
Edit the Node Exporter service to include the textfile collector directory:

Bash
sudo nano /etc/systemd/system/node_exporter.service 


Modify the ExecStart line to include the textfile collector:

Bash
ExecStart=/usr/local/bin/node_exporter --collector.textfile.directory=/var/lib/node_exporter/


Restart the Node Exporter service:

Bash
sudo systemctl daemon-reload
sudo systemctl restart node_exporter
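
Once Node Exporter has restarted and the cron job has run at least once, the custom metrics should appear in its output next to the built-in ones. A small filter such as the following can pick a single value out of the metrics text. It is a sketch: metric_value is a hypothetical helper name, and it only handles samples without labels, which is what the temperature script writes.

```shell
#!/bin/bash
# metric_value NAME (hypothetical helper): read Prometheus text-format
# metrics on stdin and print the value of the first label-free sample
# called NAME. "# HELP" and "# TYPE" lines never match because their
# first field is "#".
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage on the Proxmox host:
#   curl -s http://localhost:9100/metrics | metric_value node_cpu_temperature
```

An empty result means the metric is not being exposed, which usually points at the textfile directory path or at the cron job not having written the file yet.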


Configure Prometheus to scrape the Proxmox host:
Add the Proxmox host to your Prometheus configuration (prometheus.yml):

Bash
scrape_configs:
  - job_name: 'node_exporter_proxmox'
    static_configs:
      - targets: ['your_proxmox_host:9100']


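Reload the Prometheus configuration (the reload endpoint is only available when Prometheus was started with --web.enable-lifecycle; without it, restart Prometheus or send it a SIGHUP instead):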

Bash
curl -X POST http://localhost:9090/-/reload


Add the Prometheus data source in Grafana:

  • Go to your Grafana web interface.
  • Navigate to Configuration -> Data Sources.
  • Add a new Prometheus data source with the URL of your Prometheus server.


Create a dashboard in Grafana:

  • Create a new dashboard.
  • Add a new panel with a graph visualization.
  • Add two queries, one for node_cpu_temperature and one for node_nvme_temperature.

This setup will allow you to monitor your Proxmox host’s temperatures in Grafana using Prometheus and Node Exporter.


Copyright © 2024 Sluijsjes Tech Lab
