
Configuration Migrations

Overview

Configuration migrations handle changes to configuration formats, variable names, or structure between versions. This page documents migration procedures for breaking configuration changes.

Inventory Variable Changes

Variable Renames

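When a variable is renamed between versions, every reference to the old name in the inventory, variable files, and vault must be updated to the new name.

Example migration: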

# Old variable name (v1.0)
influxdb_admin_password: "{{ vault_influxdb_pass }}"

# New variable name (v1.1+)
influxdb_admin_token: "{{ vault_influxdb_token }}"

Migration steps:

  1. Find all occurrences of the old variable
  2. Rename them to the new variable name
  3. Update the vault if needed
  4. Test with --check mode
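Step 1 above can be scripted. The following is a minimal sketch, assuming the inventory is kept in `*.yml` files under a single directory; the helper names `find_variable` and `scan_tree` are hypothetical and not part of the collection.

```python
import re
from pathlib import Path


def find_variable(text, name):
    """Return (line_number, line) pairs where `name` appears as a whole word."""
    # \b keeps "influxdb_admin_password" from matching "influxdb_admin_password_old"
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def scan_tree(root, name):
    """Yield (path, line_number, line) for every *.yml file under `root`."""
    for path in sorted(Path(root).rglob("*.yml")):
        for number, line in find_variable(path.read_text(), name):
            yield path, number, line


if __name__ == "__main__":
    # "inventory" is an assumed directory name; point this at your own tree.
    for path, number, line in scan_tree("inventory", "influxdb_admin_password"):
        print(f"{path}:{number}: {line.strip()}")
```

Listing the matches before editing makes it easier to spot references in vault or group variable files that a plain search-and-replace on one file would miss.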

Variable Structure Changes

Old structure (flat):

telegraf_output_url: "http://localhost:8086"
telegraf_output_token: "token"
telegraf_output_org: "myorg"
telegraf_output_bucket: "telegraf"

New structure (nested, hypothetical):

telegraf_output:
  url: "http://localhost:8086"
  token: "token"
  org: "myorg"
  bucket: "telegraf"

Migration script:

#!/usr/bin/env python3
import yaml

# Read old inventory
with open('inventory.yml') as f:
    inventory = yaml.safe_load(f)

# Transform structure (hosts without variables load as None)
flat_keys = ('url', 'token', 'org', 'bucket')
for host in (inventory['all'].get('hosts') or {}).values():
    if host and 'telegraf_output_url' in host:
        # Move only the flat keys that exist, so a partial set does not raise KeyError
        host['telegraf_output'] = {
            key: host.pop(f'telegraf_output_{key}')
            for key in flat_keys
            if f'telegraf_output_{key}' in host
        }

# Write new inventory (sort_keys=False keeps the original key order)
with open('inventory-new.yml', 'w') as f:
    yaml.safe_dump(inventory, f, sort_keys=False)
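After running the script, it can help to confirm that no flat keys were left behind. This is a small sketch, assuming the same `all.hosts` layout the script reads; `leftover_flat_keys` is a hypothetical helper that works on the already loaded inventory dictionary.

```python
FLAT_PREFIX = "telegraf_output_"


def leftover_flat_keys(inventory):
    """Return {host: [stale keys]} for hosts that still carry flat telegraf_output_* keys."""
    hosts = (inventory.get("all") or {}).get("hosts") or {}
    leftovers = {}
    for name, host_vars in hosts.items():
        # Hosts without variables load as None, so treat them as empty
        stale = sorted(key for key in (host_vars or {}) if key.startswith(FLAT_PREFIX))
        if stale:
            leftovers[name] = stale
    return leftovers


if __name__ == "__main__":
    example = {
        "all": {
            "hosts": {
                "server1": {"telegraf_output": {"url": "http://localhost:8086"}},
                "server2": {"telegraf_output_org": "myorg"},
            }
        }
    }
    print(leftover_flat_keys(example))
```

An empty result means every host was converted; any listed host needs a manual look before the old variables are dropped.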

Configuration File Migrations

Telegraf Configuration

TOML format changes (rare):

If the Telegraf TOML format changes:

  1. Back up the current config:

    cp /etc/telegraf/telegraf.conf /tmp/telegraf.conf.old
    

  2. Generate a sample config in the new format:

    telegraf config > /tmp/telegraf.conf.new
    

  3. Compare and merge:

    diff -u /tmp/telegraf.conf.old /tmp/telegraf.conf.new
    

  4. Test new config:

    telegraf --config /tmp/telegraf.conf.new --test
    

Alloy Configuration

River format changes:

Alloy's configuration syntax (formerly called River) may change between releases:

Old syntax (hypothetical):

loki.write "default" {
  endpoint = "http://localhost:3100/loki/api/v1/push"
}

New syntax:

loki.write "default" {
  endpoint {
    url = "http://localhost:3100/loki/api/v1/push"
  }
}

Migration:

  1. Use alloy fmt to auto-format the configuration
  2. Use alloy validate to check syntax (if your Alloy version provides it)
  3. Review the Alloy release notes for breaking changes

Playbook Migrations

Role Name Changes

If role names change:

Old playbook:

roles:
  - role: jackaltx.solti_monitoring.influxdb2

New playbook:

roles:
  - role: jackaltx.solti_monitoring.influxdb

Migration: Update all playbooks with new role names.
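For more than a handful of playbooks, the rename can be applied with a short script. The sketch below assumes the rename shown above; `ROLE_RENAMES` and `rename_roles` are hypothetical names, and the function works on playbook text so it can be tested without touching files.

```python
import re

# Old role name -> new role name (taken from the example above)
ROLE_RENAMES = {
    "jackaltx.solti_monitoring.influxdb2": "jackaltx.solti_monitoring.influxdb",
}


def rename_roles(text, renames=ROLE_RENAMES):
    """Replace old role names in playbook text, leaving longer identifiers alone."""
    for old, new in renames.items():
        # The lookarounds prevent a match inside a longer dotted or underscored name
        pattern = re.compile(rf"(?<![\w.]){re.escape(old)}(?![\w.])")
        text = pattern.sub(new, text)
    return text


if __name__ == "__main__":
    playbook = "roles:\n  - role: jackaltx.solti_monitoring.influxdb2\n"
    print(rename_roles(playbook))
```

Review the resulting diff before committing, since role names can also appear in `include_role` tasks and requirements files.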

Task Tag Changes

If task tags are reorganized:

Old tags:

ansible-playbook deploy.yml --tags influxdb_install,influxdb_config

New tags:

ansible-playbook deploy.yml --tags influxdb
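If wrapper scripts or CI jobs pass tag lists around, a small translation step can keep them working during the transition. This is only a sketch: the `TAG_RENAMES` mapping mirrors the example above, and `translate_tags` is a hypothetical helper.

```python
# Old tag -> new tag (mirrors the example above; extend as needed)
TAG_RENAMES = {
    "influxdb_install": "influxdb",
    "influxdb_config": "influxdb",
}


def translate_tags(tags, renames=TAG_RENAMES):
    """Map a comma-separated --tags value to the new tag names, dropping duplicates."""
    translated = []
    for tag in tags.split(","):
        tag = tag.strip()
        if not tag:
            continue
        new_tag = renames.get(tag, tag)
        if new_tag not in translated:
            translated.append(new_tag)
    return ",".join(translated)


if __name__ == "__main__":
    print(translate_tags("influxdb_install,influxdb_config"))
```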

Storage Backend Migrations

Local to S3 Migration

Migrate from local storage to S3 backend:

InfluxDB:

  1. Backup current data:

    influx backup /backup/influxdb/
    

  2. Deploy new InfluxDB with S3:

    influxdb_storage_type: "s3"
    influxdb_s3_endpoint: "storage.example.com:8010"
    influxdb_s3_bucket: "influx11"
    influxdb_s3_access_key: "{{ vault_s3_access }}"
    influxdb_s3_secret_key: "{{ vault_s3_secret }}"
    

  3. Restore data:

    influx restore /backup/influxdb/
    

Loki:

Similar process, but note that Loki uses different storage formats. Consider:

  • Running dual instances during migration
  • Accepting log data loss (if retention is short)
  • Gradual cutover (point new clients to the new instance)

Label/Tag Migrations

Standardizing Labels

Migrate to a consistent labeling strategy:

Old labels (inconsistent):

# Host 1
telegraf_global_tags:
  host: "server1"

# Host 2
telegraf_global_tags:
  hostname: "server2"

New labels (consistent):

# All hosts
telegraf_global_tags:
  hostname: "{{ ansible_hostname }}"
  environment: "production"

Migration steps:

  1. Update the inventory with the new labels
  2. Run the playbook to update configurations
  3. Wait for the retention period to pass (old data expires)
  4. Update Grafana dashboards to use the new labels

Alternatively, run queries that handle both labels during the transition:

from(bucket: "telegraf")
  |> range(start: -1h)  // Flux requires a time range; adjust as needed
  |> filter(fn: (r) =>
    r["host"] == "server1" or r["hostname"] == "server1"
  )
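The inventory update in the first migration step can be sketched as a function that rewrites one host's `telegraf_global_tags`. It assumes the target layout shown above; `normalize_global_tags` is a hypothetical helper, and the default environment value is taken from the example.

```python
def normalize_global_tags(tags, environment="production"):
    """Return a copy of `tags` that uses the consistent label set."""
    normalized = dict(tags or {})
    # Drop the inconsistent "host" label in favor of "hostname"
    normalized.pop("host", None)
    # Let Ansible fill in the hostname instead of hard-coding it per host
    normalized["hostname"] = "{{ ansible_hostname }}"
    # Keep an existing environment label; otherwise use the default
    normalized.setdefault("environment", environment)
    return normalized


if __name__ == "__main__":
    print(normalize_global_tags({"host": "server1"}))
```

Any other tags a host already defines are left untouched, so the function is safe to run repeatedly.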

API Changes

InfluxDB API

InfluxDB 2.x uses a different API than 1.x. This does not affect solti-monitoring, which supports only 2.x, but for reference:

  • InfluxDB 1.x: InfluxQL, /query endpoint
  • InfluxDB 2.x: Flux, /api/v2/query endpoint

Loki API

The Loki API has been stable, but check the release notes for deprecations.

Deprecation Warnings

Handling Deprecations

When the collection shows deprecation warnings:

  1. Identify deprecated features:

    ansible-playbook deploy.yml 2>&1 | grep -i deprecat
    

  2. Review migration guide: Check collection changelog

  3. Update configurations: Replace deprecated features

  4. Test thoroughly: Verify no regressions
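When the playbook output is long, the grep in step 1 tends to repeat the same warning once per host. The sketch below does the same case-insensitive match on captured output and removes duplicates; `deprecation_lines` is a hypothetical helper, and the sample text in the usage block is made up for illustration.

```python
def deprecation_lines(output):
    """Return the unique lines of `output` that mention a deprecation, in order."""
    unique = []
    for line in output.splitlines():
        line = line.strip()
        # Same match as `grep -i deprecat`, but each distinct line is kept once
        if "deprecat" in line.lower() and line not in unique:
            unique.append(line)
    return unique


if __name__ == "__main__":
    # Illustrative sample only, not real output from the collection
    sample = (
        "ok: [host1]\n"
        "[DEPRECATION WARNING]: influxdb_use_ssl is deprecated\n"
        "ok: [host2]\n"
        "[DEPRECATION WARNING]: influxdb_use_ssl is deprecated\n"
    )
    for line in deprecation_lines(sample):
        print(line)
```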

Example Deprecation

Deprecated variable (hypothetical):

influxdb_use_ssl: true  # Deprecated in 2.0

Replacement:

influxdb_scheme: "https"  # Use in 2.0+

Migration:

# Find all uses
grep -r "influxdb_use_ssl" inventory/

# Replace with new variable
sed -i 's/influxdb_use_ssl: true/influxdb_scheme: "https"/g' inventory/group_vars/all.yml
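The sed command above only rewrites the `true` case. A sketch that also covers `false` follows; it assumes that `influxdb_use_ssl: false` should become `influxdb_scheme: "http"`, which the hypothetical example does not state, and `replace_use_ssl` is a made-up helper name.

```python
import re

# Matches the deprecated variable at the start of a line, with optional indentation
_USE_SSL = re.compile(
    r"^([ \t]*)influxdb_use_ssl:[ \t]*(true|false|yes|no)\b",
    re.IGNORECASE | re.MULTILINE,
)


def replace_use_ssl(text):
    """Rewrite influxdb_use_ssl entries as influxdb_scheme entries."""

    def to_scheme(match):
        enabled = match.group(2).lower() in ("true", "yes")
        # Assumption: a disabled SSL flag maps to plain "http"
        scheme = "https" if enabled else "http"
        return f'{match.group(1)}influxdb_scheme: "{scheme}"'

    return _USE_SSL.sub(to_scheme, text)


if __name__ == "__main__":
    print(replace_use_ssl("influxdb_use_ssl: true\n"))
```

Trailing comments on the rewritten line are preserved, so stale notes such as "Deprecated in 2.0" should be cleaned up by hand.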

Automated Migration Tools

Migration Script Template

#!/usr/bin/env python3
"""
Migrate inventory from v1.0 to v2.0
"""
import yaml
from pathlib import Path

def migrate_host_vars(host_vars):
    """Migrate host variables"""
    migrations = {
        'influxdb_admin_password': 'influxdb_admin_token',
        'telegraf_flush_time': 'telegraf_flush_interval',
    }

    for old, new in migrations.items():
        if old in host_vars:
            host_vars[new] = host_vars.pop(old)
            print(f"  Migrated {old} -> {new}")

    return host_vars

def main():
    inventory_file = Path('inventory.yml')

    # Backup (copy instead of rename, so the original stays in place if a later step fails)
    backup_file = inventory_file.with_suffix('.yml.backup')
    backup_file.write_bytes(inventory_file.read_bytes())
    print(f"Backed up to {backup_file}")

    # Load
    with open(backup_file) as f:
        inventory = yaml.safe_load(f)

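    # Migrate
    print("Migrating inventory...")
    for group in inventory.get('all', {}).get('children', {}).values():
        # Groups and hosts without variables load as None, so fall back to {}
        for host_name, host_vars in ((group or {}).get('hosts') or {}).items():
            if host_vars:
                print(f"Migrating {host_name}")
                migrate_host_vars(host_vars)

    # Save (sort_keys=False keeps the original key order, so the diff stays small)
    with open(inventory_file, 'w') as f:
        yaml.safe_dump(inventory, f, default_flow_style=False, sort_keys=False)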

    print(f"Migration complete! Saved to {inventory_file}")
    print(f"Review changes with: diff {backup_file} {inventory_file}")

if __name__ == '__main__':
    main()

Testing Migrations

Pre-Migration Testing

  1. Syntax check:

    ansible-playbook deploy.yml --syntax-check
    

  2. Dry run:

    ansible-playbook deploy.yml --check
    

  3. Test environment: Deploy to test hosts first

Post-Migration Verification

  1. Service status: Verify all services running
  2. Data flow: Check metrics and logs arriving
  3. Dashboard check: Verify Grafana dashboards working
  4. Query test: Run sample queries
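The service status check can be partly automated with HTTP health probes. The sketch below assumes InfluxDB and Loki listen on localhost at the ports used elsewhere on this page (8086 and 3100) and uses their `/health` and `/ready` endpoints; the `ENDPOINTS` table and function names are hypothetical, and the `opener` parameter exists only so the check can be exercised without a live service.

```python
import urllib.request

# Assumed local endpoints; change the hosts and ports to match your deployment
ENDPOINTS = {
    "influxdb": "http://localhost:8086/health",
    "loki": "http://localhost:3100/ready",
}


def check(url, opener=urllib.request.urlopen, timeout=5):
    """Return True if `url` answers with HTTP 200, False on any connection or HTTP error."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError are both subclasses of OSError
        return False


def check_all(endpoints=ENDPOINTS, opener=urllib.request.urlopen):
    """Return {service name: healthy?} for every endpoint."""
    return {name: check(url, opener) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, healthy in check_all().items():
        print(f"{name}: {'ok' if healthy else 'FAILED'}")
```

A passing probe only shows that the service is up; the data flow and dashboard checks still need to be done separately.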

Migration Checklist

  • [ ] Review migration guide in release notes
  • [ ] Backup current configurations
  • [ ] Update inventory variables
  • [ ] Update playbooks (if needed)
  • [ ] Test in non-production environment
  • [ ] Run syntax check and dry run
  • [ ] Execute migration in production
  • [ ] Verify services and data flow
  • [ ] Update documentation
  • [ ] Monitor for issues

Rollback Plan

If the migration fails:

  1. Restore configurations:

    cp /backup/inventory.yml inventory.yml
    cp /backup/telegraf.conf /etc/telegraf/telegraf.conf
    cp /backup/config.alloy /etc/alloy/config.alloy
    

  2. Roll back the collection:

    ansible-galaxy collection install jackaltx.solti_monitoring:1.0.0 --force
    

  3. Redeploy:

    ansible-playbook deploy.yml
    

  4. Verify rollback: Run verification procedures
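The restore step only works if backups were taken before the migration. A minimal sketch of a matching backup and restore pair follows, assuming all files are copied into one backup directory under their base names; `backup_files` and `restore_files` are hypothetical helpers.

```python
import shutil
from pathlib import Path


def backup_files(paths, backup_dir):
    """Copy each existing file into `backup_dir`; return {original path: backup path}."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for path in map(Path, paths):
        if path.is_file():
            # Files are stored by base name, so two inputs with the same name would collide
            target = backup_dir / path.name
            shutil.copy2(path, target)
            manifest[path] = target
    return manifest


def restore_files(manifest):
    """Copy every backup in `manifest` back over its original location."""
    for original, backup in manifest.items():
        shutil.copy2(backup, original)


if __name__ == "__main__":
    files = [
        "inventory.yml",
        "/etc/telegraf/telegraf.conf",
        "/etc/alloy/config.alloy",
    ]
    saved = backup_files(files, "/backup")
    print(f"Backed up {len(saved)} file(s)")
```

Keeping the returned manifest makes the rollback a single `restore_files(saved)` call rather than a list of copy commands that has to be kept in sync by hand.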