Version Upgrades

Overview

This page covers upgrading between versions of the solti-monitoring collection and its components. Always test upgrades in non-production environments first.

Collection Upgrade Process

Check Current Version

ansible-galaxy collection list | grep solti_monitoring
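If a script needs the bare version string rather than the whole matching line, the listing can be filtered on the collection name. The sketch below is one way to do that; `installed_version` is a hypothetical helper name, and it assumes the usual two-column layout (name, then version) of the listing.

```shell
# Hypothetical helper: print the version of one collection.
# Reads `ansible-galaxy collection list` output on stdin and assumes
# each data row is "<namespace.name> <version>".
installed_version() {
    # $1: fully qualified collection name, e.g. jackaltx.solti_monitoring
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage:
#   ansible-galaxy collection list | installed_version jackaltx.solti_monitoring
```

The function prints nothing when the collection is not installed, so an empty result can be treated as "not found".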

Upgrade Collection

# Upgrade to latest version
ansible-galaxy collection install jackaltx.solti_monitoring --upgrade

# Upgrade to specific version
ansible-galaxy collection install jackaltx.solti_monitoring:1.2.0 --force

Review Changelog

Before upgrading, review the changelog for:

  • Breaking changes
  • New features
  • Deprecations
  • Bug fixes

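# View documentation for a role in the collection
# (roles are listed with -t role; output requires the role to ship an argument spec)
ansible-doc -t role jackaltx.solti_monitoring.ROLE_NAME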

Component Upgrades

InfluxDB Upgrades

Minor version upgrades (e.g., 2.7.0 → 2.7.5):

  1. Stop InfluxDB:

    systemctl stop influxdb
    

  2. Update version in role:

    influxdb_version: "2.7.5"  # Update version
    

  3. Run playbook:

    ansible-playbook deploy-influxdb.yml
    

  4. Verify:

    systemctl status influxdb
    curl http://localhost:8086/health
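A service can take a few seconds to answer after a restart, so a single curl call may fail even when the upgrade succeeded. The following is a minimal sketch of a retry loop around the same health check; `wait_for_health` and its default attempt count and delay are assumptions, not part of the collection.

```shell
# Hypothetical helper: poll a health URL until it answers successfully.
# $1: URL, $2: max attempts (default 10), $3: seconds between attempts (default 3)
wait_for_health() {
    url=$1
    attempts=${2:-10}
    delay=${3:-3}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl return non-zero on HTTP errors (4xx/5xx)
        if curl -fsS -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "service at $url did not become healthy" >&2
    return 1
}

# Usage:
#   wait_for_health http://localhost:8086/health
```

The same function should work for any component that exposes an HTTP health or readiness endpoint.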
    

Major version upgrades (e.g., 2.x → 3.x):

Warning: InfluxDB 3.x is a major rewrite and is not currently supported by solti-monitoring.

Current recommendation: Stay on InfluxDB 2.x series.
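One way to enforce this recommendation in a wrapper script is to refuse any change to the major version number before running the playbook. This is a sketch only; `is_major_upgrade` is a hypothetical helper and assumes dotted version strings such as 2.7.5.

```shell
# Hypothetical guard: succeed (exit 0) when the target version changes
# the major component, i.e. the text before the first dot.
is_major_upgrade() {
    current_major=${1%%.*}   # "2" from "2.7.0"
    target_major=${2%%.*}    # "3" from "3.0.0"
    [ "$current_major" != "$target_major" ]
}

# Usage:
#   if is_major_upgrade "2.7.0" "$NEW_VERSION"; then
#       echo "Refusing major version change" >&2
#       exit 1
#   fi
```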

Loki Upgrades

Minor version upgrades (e.g., 2.9.0 → 2.9.3):

  1. Stop Loki:

    systemctl stop loki
    

  2. Update version:

    loki_version: "2.9.3"
    

  3. Run playbook:

    ansible-playbook deploy-loki.yml
    

  4. Verify:

    systemctl status loki
    curl http://localhost:3100/ready
    

Major version upgrades (e.g., 2.x → 3.x):

  1. Review Loki upgrade guide
  2. Check for configuration changes
  3. Test in non-production first
  4. Plan downtime window
  5. Backup data (if using local storage)
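The backup step above can be sketched as a timestamped tar archive of the data directory. `backup_dir` is a hypothetical helper; the /var/lib/loki path in the usage line matches the path used in the rollback section and assumes local storage. Stop Loki first so the archive is consistent.

```shell
# Hypothetical helper: archive a directory into a timestamped tarball
# and print the archive path on success.
# $1: directory to back up, $2: directory that receives the archive
backup_dir() {
    src=$1
    dest_dir=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest_dir/$(basename "$src")-$stamp.tar.gz"
    mkdir -p "$dest_dir" &&
        tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
        echo "$archive"
}

# Usage:
#   systemctl stop loki
#   backup_dir /var/lib/loki /backup
#   systemctl start loki
```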

Telegraf Upgrades

Any version upgrade:

  1. Update version:

    telegraf_version: "1.29"
    

  2. Run playbook:

    ansible-playbook deploy-telegraf.yml
    

  3. Verify:

    systemctl status telegraf
    telegraf --version
    

Telegraf upgrades are generally seamless because its plugin API is stable.

Alloy Upgrades

Version upgrade:

  1. Test configuration with new version:

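    # Download the new Alloy binary first, then check the existing config against it.
    # If your Alloy release lacks the validate subcommand,
    # `alloy fmt /etc/alloy/config.alloy` gives a syntax-only check.
    alloy validate /etc/alloy/config.alloy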
    

  2. Update version:

    alloy_version: "1.2.0"
    

  3. Run playbook:

    ansible-playbook deploy-alloy.yml
    

  4. Verify:

    systemctl status alloy
    alloy --version
    

Note: Alloy (the successor to Grafana Agent) is under active development. Check the release notes for configuration changes before each upgrade.

Breaking Changes

Version 1.x → 2.x (Hypothetical Example)

If a future version introduces breaking changes:

  1. Inventory variable renames:

    # Old (v1.x)
    influxdb_admin_pass: "secret"
    
    # New (v2.x)
    influxdb_admin_token: "secret"
    

  2. Role structure changes: Follow migration guide in release notes

  3. Playbook updates: Update playbook syntax if needed
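When variables are renamed, it helps to find every place the old names are still set before running the new version. The sketch below scans a directory for leftover definitions; `find_renamed_vars` is a hypothetical helper, and `influxdb_admin_pass` is only the illustrative name from the example above.

```shell
# Hypothetical lint: report YAML keys that still use old variable names.
# $1: directory to scan; remaining arguments: old variable names.
# Prints each match and returns 1 if any old name is still defined.
find_renamed_vars() {
    dir=$1
    shift
    status=0
    for var in "$@"; do
        # Match "name:" at the start of a line, allowing indentation
        if grep -rn -- "^[[:space:]]*$var:" "$dir"; then
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   find_renamed_vars inventory/ influxdb_admin_pass
```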

Rollback Procedures

Collection Rollback

# Install specific older version
ansible-galaxy collection install jackaltx.solti_monitoring:1.1.0 --force

Component Rollback

  1. Stop service:

    systemctl stop influxdb
    

  2. Update to previous version:

    influxdb_version: "2.7.0"  # Previous version
    

  3. Run playbook:

    ansible-playbook deploy-influxdb.yml
    

  4. Restore data (if needed):

    # InfluxDB
    influx restore /backup/influxdb/
    
    # Loki (if local storage)
    systemctl stop loki
    rsync -av /backup/loki/ /var/lib/loki/
    systemctl start loki
    

Compatibility Matrix

Tested Combinations

Collection  InfluxDB  Loki  Telegraf  Alloy  Ansible
1.0.0       2.7       2.9   1.28      1.0    2.14+
1.1.0       2.7       2.9   1.29      1.1    2.14+
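For scripted checks, the matrix can be encoded as a small lookup. This is a sketch, not something the collection ships: `tested_versions` is a hypothetical helper that mirrors only the two rows listed above and has to be updated by hand when the matrix changes.

```shell
# Hypothetical lookup: print the tested component versions for a
# collection release, or fail if the release is not in the matrix.
tested_versions() {
    case "$1" in
        1.0.0) echo "influxdb=2.7 loki=2.9 telegraf=1.28 alloy=1.0" ;;
        1.1.0) echo "influxdb=2.7 loki=2.9 telegraf=1.29 alloy=1.1" ;;
        *)
            echo "no tested combination recorded for collection $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   tested_versions 1.1.0
```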

Platform Support

  • Rocky Linux 9
  • Debian 12
  • Ubuntu 24.04

Pre-Upgrade Checklist

  • [ ] Review changelog for breaking changes
  • [ ] Check compatibility matrix
  • [ ] Backup data (InfluxDB, Loki)
  • [ ] Test upgrade in non-production
  • [ ] Plan downtime window (if needed)
  • [ ] Notify users of planned maintenance
  • [ ] Verify rollback procedure
  • [ ] Document current versions
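The last checklist item can be partly automated by writing each component's reported version to a file before the upgrade. The sketch below assumes the binaries are on the PATH of the host where it runs; `record_versions` is a hypothetical helper, and components that are not installed are noted rather than treated as errors.

```shell
# Hypothetical helper: record the installed component versions in a file.
# $1: output file (overwritten)
record_versions() {
    out=$1
    : > "$out"
    for entry in "influxd version" "loki --version" "telegraf --version" "alloy --version"; do
        bin=${entry%% *}   # binary name is the first word of the command
        if command -v "$bin" >/dev/null 2>&1; then
            # $entry is left unquoted on purpose so it splits into command + argument
            printf '%s: %s\n' "$bin" "$($entry 2>&1 | head -n 1)" >> "$out"
        else
            printf '%s: not installed\n' "$bin" >> "$out"
        fi
    done
}

# Usage:
#   record_versions /tmp/monitoring-versions-before.txt
```

Keeping this file alongside the data backup gives the rollback procedure a known set of versions to return to.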

Post-Upgrade Checklist

  • [ ] Verify all services running
  • [ ] Check for errors in logs
  • [ ] Verify data ingestion (metrics, logs)
  • [ ] Run verification procedures (Chapter 9)
  • [ ] Check Grafana dashboards
  • [ ] Monitor performance metrics
  • [ ] Update documentation
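The first item on this checklist can be scripted with `systemctl is-active`. The following is a minimal sketch; `check_services` is a hypothetical helper, and the unit names in the usage line are the ones used elsewhere on this page.

```shell
# Hypothetical helper: report whether each named systemd unit is active.
# Returns 1 if at least one unit is not running.
check_services() {
    failed=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "ok: $svc"
        else
            echo "NOT RUNNING: $svc" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage:
#   check_services influxdb loki telegraf alloy
```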

Upgrade Automation

Automated Upgrade Script

#!/bin/bash
# upgrade-monitoring.sh

set -euo pipefail

COLLECTION_VERSION="1.2.0"

echo "Upgrading solti-monitoring to version $COLLECTION_VERSION"

# Back up configurations before anything changes.
# These paths are local, so this assumes the script runs on the monitored host.
echo "Backing up configurations..."
BACKUP_DIR="/tmp/monitoring-backup"
mkdir -p "$BACKUP_DIR"
cp /etc/telegraf/telegraf.conf "$BACKUP_DIR/"
cp /etc/alloy/config.alloy "$BACKUP_DIR/"

# Upgrade collection
echo "Upgrading collection..."
ansible-galaxy collection install \
  "jackaltx.solti_monitoring:$COLLECTION_VERSION" --force

# Run upgrade playbooks
echo "Upgrading components..."
ansible-playbook -i inventory.yml deploy-monitoring.yml

# Verify
echo "Verifying upgrade..."
./bin/verify-monitoring.sh

echo "Upgrade complete!"

Zero-Downtime Upgrades

For high-availability setups:

  1. Run multiple instances: Deploy redundant servers
  2. Upgrade one at a time: Rolling upgrade pattern
  3. Use load balancer: Shift traffic during upgrade
  4. Verify before proceeding: Check each instance before next

Example for multiple InfluxDB instances:

# Upgrade instance 1
ansible-playbook -i inventory.yml deploy-influxdb.yml --limit influxdb1
# Verify instance 1 (e.g., curl its /health endpoint) before continuing
# Upgrade instance 2
ansible-playbook -i inventory.yml deploy-influxdb.yml --limit influxdb2
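The rolling pattern can be sketched as a loop that upgrades one host, checks its health, and stops at the first failure so the remaining instances stay on the old version. `rolling_upgrade` is a hypothetical helper. It assumes the inventory names resolve as hostnames and that each instance serves /health on port 8086; `PLAYBOOK_CMD` and `HEALTH_CMD` are illustrative override variables added so the loop can be dry-run.

```shell
# Hypothetical helper: upgrade hosts one at a time, verifying each
# before moving on. $1: playbook; remaining arguments: inventory hosts.
# PLAYBOOK_CMD and HEALTH_CMD can be overridden for dry runs.
rolling_upgrade() {
    playbook=$1
    shift
    for host in "$@"; do
        echo "Upgrading $host..."
        ${PLAYBOOK_CMD:-ansible-playbook} -i inventory.yml "$playbook" --limit "$host" || return 1
        # Assumption: the inventory name is also a resolvable hostname
        if ! ${HEALTH_CMD:-curl -fsS -o /dev/null} "http://$host:8086/health"; then
            echo "$host failed its health check; stopping rollout" >&2
            return 1
        fi
    done
}

# Usage:
#   rolling_upgrade deploy-influxdb.yml influxdb1 influxdb2
```

Stopping on the first failed check leaves at least one instance untouched, which is what makes the rollback described earlier possible without downtime.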

Support

For upgrade issues:

  • Check collection documentation
  • Review component release notes
  • Search GitHub issues
  • Ask in community forums