Service Management¶
Overview¶
All monitoring components run as systemd services, either as native units or as Podman containers managed through systemd quadlets. This page covers common service management operations.
Service Status¶
Check Service Status¶
# InfluxDB
systemctl status influxdb
# Loki
systemctl status loki
# Telegraf
systemctl status telegraf
# Alloy
systemctl status alloy
Check All Monitoring Services¶
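A minimal sketch for checking all four units in one pass, assuming the unit names used elsewhere on this page (`influxdb`, `loki`, `telegraf`, `alloy`). The function name `monitoring_status` is hypothetical, not part of any shipped tooling.

```shell
# Print the state of each monitoring unit, one per line.
# monitoring_status is an illustrative helper name, not an existing tool.
monitoring_status() {
    local unit state
    for unit in influxdb loki telegraf alloy; do
        # is-active prints the state (active, inactive, failed, ...) and
        # exits non-zero when the unit is not active, hence the "|| true".
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        printf '%-10s %s\n' "$unit" "${state:-unknown}"
    done
}
```

Calling `monitoring_status` after sourcing the function prints one line per unit, which is easier to scan than four separate `systemctl status` outputs.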
Starting and Stopping Services¶
Individual Services¶
# Start service
systemctl start influxdb
systemctl start loki
systemctl start telegraf
systemctl start alloy
# Stop service
systemctl stop influxdb
systemctl stop loki
systemctl stop telegraf
systemctl stop alloy
# Restart service
systemctl restart influxdb
systemctl restart loki
systemctl restart telegraf
systemctl restart alloy
Multiple Services¶
# Start all server components
systemctl start influxdb loki
# Start all client components
systemctl start telegraf alloy
# Restart all monitoring services
systemctl restart influxdb loki telegraf alloy
Enable/Disable Auto-Start¶
Enable Services at Boot¶
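A sketch of enabling units at boot, mirroring the disable commands in the next section. The helper name `enable_monitoring_units` is hypothetical. One caveat, stated as an assumption about this deployment: if `influxdb` and `loki` are quadlet-generated units, `systemctl enable` is typically refused for them, and start-at-boot is instead controlled by an `[Install]` section in the corresponding `.container` file.

```shell
# Try to enable each named unit and report the ones that could not be enabled.
# enable_monitoring_units is an illustrative helper name.
enable_monitoring_units() {
    local unit rc=0
    for unit in "$@"; do
        if systemctl enable "$unit"; then
            echo "enabled: $unit"
        else
            # Quadlet-generated units usually land here; see the [Install]
            # section of the .container file instead (assumption, see above).
            echo "could not enable: $unit" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `enable_monitoring_units influxdb loki telegraf alloy` attempts all four and returns non-zero if any of them was refused.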
Disable Services at Boot¶
systemctl disable influxdb
systemctl disable loki
systemctl disable telegraf
systemctl disable alloy
Check if Service is Enabled¶
systemctl is-enabled influxdb
systemctl is-enabled loki
systemctl is-enabled telegraf
systemctl is-enabled alloy
Viewing Logs¶
Recent Logs¶
# View last 50 lines
journalctl -u influxdb -n 50
journalctl -u loki -n 50
journalctl -u telegraf -n 50
journalctl -u alloy -n 50
Follow Logs (Real-time)¶
# Tail logs in real-time
journalctl -u influxdb -f
journalctl -u loki -f
journalctl -u telegraf -f
journalctl -u alloy -f
Logs Since Timestamp¶
# Logs from last hour
journalctl -u telegraf --since "1 hour ago"
# Logs from specific time
journalctl -u alloy --since "2024-01-01 10:00:00"
# Logs between times
journalctl -u influxdb --since "09:00" --until "10:00"
Filter Logs by Priority¶
# Errors and above (err, crit, alert, emerg)
journalctl -u telegraf -p err
# Warning and above
journalctl -u alloy -p warning
Container Management (Podman)¶
View Containers¶
# List running containers
podman ps
# List all containers (including stopped)
podman ps -a
# Filter by name
podman ps --filter name=influxdb
podman ps --filter name=loki
Container Logs¶
# View container logs
podman logs influxdb
podman logs loki
# Follow container logs
podman logs -f influxdb
podman logs -f loki
# Last 100 lines
podman logs --tail 100 influxdb
Container Information¶
# Inspect container
podman inspect influxdb
podman inspect loki
# Container stats (CPU, memory)
podman stats influxdb loki
# Container processes
podman top influxdb
Restarting Containers¶
# Restart via systemd (preferred)
systemctl restart influxdb
systemctl restart loki
# Direct container restart (not recommended)
podman restart influxdb
podman restart loki
Configuration Reload¶
Telegraf¶
Reload configuration without restart:
# Send SIGHUP to reload
systemctl reload telegraf
# Or restart (if reload not supported)
systemctl restart telegraf
Alloy¶
Test configuration before reload:
# Validate configuration
alloy validate /etc/alloy/config.alloy
# Reload configuration (experimental)
systemctl reload alloy
# Restart if reload fails
systemctl restart alloy
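The validate, reload, and fall-back-to-restart steps above can be chained so that a broken configuration never reaches the running service. This is a sketch only: the function name `apply_alloy_config` is hypothetical, and the default path is the one used on this page.

```shell
# Validate the Alloy configuration, then reload; restart only if reload fails.
# apply_alloy_config is an illustrative helper name.
apply_alloy_config() {
    local config=${1:-/etc/alloy/config.alloy}
    if ! alloy validate "$config"; then
        echo "validation failed, leaving alloy untouched" >&2
        return 1
    fi
    # Prefer the reload; fall back to a full restart if it is rejected.
    systemctl reload alloy || systemctl restart alloy
}
```

Run it as `apply_alloy_config` for the default path, or pass another file as the first argument.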
InfluxDB and Loki¶
Configuration changes require a restart, for example with systemctl restart influxdb or systemctl restart loki.
Service Dependencies¶
Start Order¶
When starting the monitoring stack:
- Server components first: InfluxDB, Loki
- Wait for readiness: Check health endpoints
- Client components: Telegraf, Alloy
# Start servers
systemctl start influxdb loki
# Verify servers are ready
curl http://localhost:8086/health
curl http://localhost:3100/ready
# Start clients
systemctl start telegraf alloy
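A single `curl` right after `systemctl start` can run before the servers are listening. The sketch below polls a health endpoint until it answers or a retry budget runs out. The function name `wait_for_url` and its default attempt count and delay are assumptions chosen for illustration; the URLs are the ones shown above.

```shell
# Poll a URL until it returns a successful HTTP status or attempts run out.
# Arguments: URL, max attempts (default 30), delay in seconds (default 2).
# wait_for_url is an illustrative helper name.
wait_for_url() {
    local url=$1 attempts=${2:-30} delay=${3:-2} i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl exit non-zero on HTTP errors such as 503.
        if curl -fsS -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $url" >&2
    return 1
}
```

A start sequence could then read `systemctl start influxdb loki && wait_for_url http://localhost:8086/health && wait_for_url http://localhost:3100/ready && systemctl start telegraf alloy`, so the clients start only once both servers respond.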
Stop Order¶
When stopping the monitoring stack:
- Client components first: Telegraf, Alloy
- Server components: InfluxDB, Loki
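The stop order above can be sketched as a small function, mirroring the start sequence in reverse. The name `stop_monitoring_stack` is hypothetical.

```shell
# Stop the stack in reverse dependency order.
# stop_monitoring_stack is an illustrative helper name.
stop_monitoring_stack() {
    # Clients first, so nothing is still writing when the servers go down.
    systemctl stop telegraf alloy || return 1
    systemctl stop influxdb loki
}
```

If stopping the clients fails, the function returns early and leaves the servers running so the failure can be investigated.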
Orchestrator Tools¶
manage-svc.sh¶
Service lifecycle management:
# Deploy service
./manage-svc.sh deploy monitor11-metrics
./manage-svc.sh deploy ispconfig3-alloy
# Remove service
./manage-svc.sh remove monitor11-metrics
# Prepare environment
./manage-svc.sh prepare monitor11
svc-exec.sh¶
Execute specific tasks:
# Verify service
./svc-exec.sh verify monitor11-metrics
./svc-exec.sh verify ispconfig3-alloy
# Restart service
./svc-exec.sh restart monitor11-metrics
# Configure service
./svc-exec.sh configure ispconfig3-alloy
Health Checks¶
InfluxDB¶
# Health endpoint
curl -I http://localhost:8086/health
# Ping endpoint
curl -I http://localhost:8086/ping
# Check with token
curl -H "Authorization: Token YOUR_TOKEN" \
    http://localhost:8086/api/v2/buckets
Loki¶
# Ready endpoint
curl http://localhost:3100/ready
# Metrics endpoint
curl http://localhost:3100/metrics
# Query test
curl -G "http://localhost:3100/loki/api/v1/query" \
    --data-urlencode 'query={service_type="fail2ban"}' \
    --data-urlencode 'limit=1'
Telegraf¶
# Check service status
systemctl status telegraf
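# Test configuration (runs the inputs once and prints metrics; outputs are not contacted)
telegraf --config /etc/telegraf/telegraf.conf --test
# Look for recent errors, including failed writes to InfluxDB
journalctl -u telegraf -n 20 | grep -i error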
Alloy¶
# Check service status
systemctl status alloy
# Validate configuration
alloy validate /etc/alloy/config.alloy
# Check metrics endpoint
curl http://127.0.0.1:12345/metrics
Automation Scripts¶
Health Check Script¶
#!/bin/bash
# check-monitoring.sh
# Report the state of each monitoring unit; exit non-zero if any is down.
check_service() {
    if systemctl is-active --quiet "$1"; then
        echo "✅ $1 is running"
        return 0
    else
        echo "❌ $1 is not running"
        return 1
    fi
}

echo "Checking monitoring services..."
failed=0
for svc in influxdb loki telegraf alloy; do
    check_service "$svc" || failed=1
done
exit "$failed"
Restart All Script¶
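#!/bin/bash
# restart-monitoring.sh
# Restart the stack in dependency order: clients down, servers cycled, clients up.
echo "Stopping clients..."
systemctl stop telegraf alloy
echo "Restarting servers..."
systemctl restart influxdb loki
echo "Waiting for servers..."
# Fixed grace period, not a readiness check; lengthen it if the servers start slowly.
sleep 5
echo "Starting clients..."
systemctl start telegraf alloy
echo "Checking status..."
# --no-pager keeps the script from blocking in a pager on a terminal.
systemctl --no-pager status influxdb loki telegraf alloy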
Best Practices¶
- Always check status before operations: Verify service state
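- Use systemd for container management: Don't start, stop, or restart containers with podman directly; read-only commands such as podman ps and podman logs are fine
- Follow proper start/stop order: Start servers before clients; stop clients before servers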
- Check logs after restart: Verify no errors occurred
- Test configuration before applying: Avoid service failures
- Use orchestrator tools: Leverage manage-svc.sh and svc-exec.sh
- Document custom procedures: Keep runbooks for complex operations