
Storage Migrations

Overview

Storage migrations involve moving data between storage backends or expanding storage capacity. This page covers common storage migration scenarios.

Local to S3 Migration

InfluxDB Local → S3

Scenario: Migrate from local disk to S3-compatible object storage

Benefits:

  • Virtually unlimited capacity
  • Lower cost per GB
  • Built-in replication
  • Separate compute from storage

Migration steps:

  1. Prepare S3 bucket:

    # Create bucket (MinIO example)
    mc mb minio/influx11
    # Caution: this grants anonymous read access to the bucket;
    # skip it unless your setup genuinely requires public downloads
    mc anonymous set download minio/influx11
    

  2. Backup current data:

    influx backup /backup/influxdb-$(date +%Y%m%d)/
    

  3. Deploy new InfluxDB with S3:

    # inventory update
    influxdb_storage_type: "s3"
    influxdb_s3_endpoint: "storage.example.com:8010"
    influxdb_s3_bucket: "influx11"
    influxdb_s3_access_key: "{{ vault_s3_access }}"
    influxdb_s3_secret_key: "{{ vault_s3_secret }}"
    

    ansible-playbook deploy-influxdb.yml
    

  4. Restore data:

    influx restore /backup/influxdb-YYYYMMDD/
    

  5. Verify data:

    # Query for recent data
    influx query 'from(bucket: "telegraf") |> range(start: -1h) |> limit(n: 10)'
    

  6. Update clients (if endpoint changed):

    telegraf_output_url: "http://NEW_ENDPOINT:8086"
    

Downtime: 15-30 minutes (depending on data size)
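The backup/deploy/restore steps above can be sketched as a single runbook function. The playbook name and backup path follow the examples on this page and are assumptions to adjust to your inventory:

```shell
#!/usr/bin/env bash
# Local -> S3 cutover for InfluxDB. Assumes the S3 variables are
# already set in the inventory before this runs.
set -euo pipefail

migrate_influxdb_to_s3() {
  local backup_dir
  backup_dir="/backup/influxdb-$(date +%Y%m%d)"
  influx backup "$backup_dir"               # snapshot current data
  ansible-playbook deploy-influxdb.yml      # redeploy with S3 backend
  influx restore "$backup_dir"              # load data into S3-backed instance
}

# Run during the maintenance window:
# migrate_influxdb_to_s3
```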

Loki Local → S3

Migration steps:

  1. Prepare S3 bucket:

    mc mb minio/loki11
    

  2. Choose a migration path:

     Option A: Cutover (accept data loss):

       1. Deploy new Loki with S3
       2. Point clients to new Loki
       3. Old logs retained per retention policy

     Option B: Dual instance (no data loss):

       1. Deploy second Loki with S3
       2. Update Alloy to send to both
       3. After retention period, decommission old Loki

Option B configuration:

// Send to both old and new Loki
loki.write "old" {
  endpoint {
    url = "http://old-loki:3100/loki/api/v1/push"
  }
}

loki.write "new_s3" {
  endpoint {
    url = "http://new-loki:3100/loki/api/v1/push"
  }
}

loki.source.journal "logs" {
  forward_to = [
    loki.write.old.receiver,
    loki.write.new_s3.receiver,
  ]
}

After retention period (e.g., 30 days):

// Remove old Loki
loki.source.journal "logs" {
  forward_to = [loki.write.new_s3.receiver]
}
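Before removing the old endpoint, confirm the S3-backed instance has actually been ingesting. A minimal check; the hostname matches the Alloy config above and is an assumption:

```shell
#!/usr/bin/env bash
# Sanity checks on the S3-backed Loki before decommissioning the old one.
set -euo pipefail

check_new_loki() {
  local host="${1:-new-loki}"
  curl -fsS "http://${host}:3100/ready"     # instance up and ready
  # Entries ingested over the last hour; an empty/zero result means
  # Alloy is not dual-writing to this instance yet
  curl -fsS -G "http://${host}:3100/loki/api/v1/query" \
    --data-urlencode 'query=count_over_time({service_type=~".+"}[1h])'
}

# check_new_loki new-loki
```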

S3 Bucket Migration

Move to Different S3 Provider

Scenario: Migrate from one S3-compatible storage to another

Steps:

  1. Create bucket on new provider:

    # New provider
    mc alias set newprovider https://new-s3.example.com ACCESS_KEY SECRET_KEY
    mc mb newprovider/influx11-new
    

  2. Sync data (if supported):

    mc mirror oldprovider/influx11 newprovider/influx11-new
    

  3. Update InfluxDB configuration:

    influxdb_s3_endpoint: "new-s3.example.com"
    influxdb_s3_bucket: "influx11-new"
    influxdb_s3_access_key: "{{ vault_new_s3_access }}"
    influxdb_s3_secret_key: "{{ vault_new_s3_secret }}"
    

  4. Restart InfluxDB:

    systemctl restart influxdb
    

  5. Verify:

    curl http://localhost:8086/health
    influx query 'from(bucket: "telegraf") |> range(start: -1h) |> count()'
    
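Before step 3's cutover, it is worth confirming the mirror is actually complete. `mc diff` prints one line per object that differs between two buckets; empty output means they match. Alias and bucket names follow the examples above:

```shell
#!/usr/bin/env bash
# Compare source and destination buckets after `mc mirror`.
set -euo pipefail

verify_bucket_sync() {
  mc diff oldprovider/influx11 newprovider/influx11-new
}

# verify_bucket_sync
```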

Expanding Local Storage

Add New Disk

Scenario: Local storage running out of space

Steps:

  1. Add physical disk: Install new disk in server

  2. Create filesystem:

    # Create partition
    sudo fdisk /dev/sdb
    # Create filesystem
    sudo mkfs.ext4 /dev/sdb1
    

  3. Stop services:

    systemctl stop influxdb loki
    

  4. Move data:

    # Mount new disk temporarily
    mkdir /mnt/new_storage
    mount /dev/sdb1 /mnt/new_storage
    
    # Copy data
    rsync -av /var/lib/influxdb2/ /mnt/new_storage/influxdb/
    rsync -av /var/lib/loki/ /mnt/new_storage/loki/
    
    # Verify copy
    du -sh /var/lib/influxdb2 /mnt/new_storage/influxdb
    

  5. Update mounts:

    # /etc/fstab — mount the disk once, then bind the per-service
    # subdirectories created by the copy above into place
    # (/srv/monitoring is an example mountpoint)
    /dev/sdb1                /srv/monitoring     ext4  noatime,nodiratime  0  2
    /srv/monitoring/influxdb /var/lib/influxdb2  none  bind                0  0
    /srv/monitoring/loki     /var/lib/loki       none  bind                0  0
    
    # Remount under the new layout
    umount /mnt/new_storage
    mkdir -p /srv/monitoring
    mount -a

  6. Fix permissions:

    chown -R 1000:1000 /var/lib/influxdb2
    chown -R 1000:1000 /var/lib/loki
    

  7. Start services:

    systemctl start influxdb loki
    

  8. Verify:

    df -h /var/lib/influxdb2
    systemctl status influxdb loki
    
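The `du` comparison in step 4 only checks sizes. A stronger check before deleting the source is a checksum-only rsync pass, which reports any file whose contents differ:

```shell
#!/usr/bin/env bash
# Verify a completed copy without transferring anything.
set -euo pipefail

verify_copy() {
  local src="$1" dst="$2"
  # --checksum compares file contents, not just size/mtime;
  # --dry-run reports without copying;
  # --itemize-changes prints one line per difference found
  rsync --archive --checksum --dry-run --itemize-changes "$src" "$dst"
}

# No file lines in the output means the trees match:
# verify_copy /var/lib/influxdb2/ /mnt/new_storage/influxdb/
```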

Extend LVM Volume

If using LVM:

# Add physical volume
pvcreate /dev/sdb1

# Extend volume group
vgextend vg_monitoring /dev/sdb1

# Extend logical volume
lvextend -l +100%FREE /dev/vg_monitoring/lv_monitoring

# Resize filesystem (resize2fs is for ext4; use xfs_growfs for XFS)
resize2fs /dev/vg_monitoring/lv_monitoring

NFS Migration

Local to NFS

Scenario: Migrate to network-attached storage

Steps:

  1. Prepare NFS share:

    # On NFS server
    mkdir -p /exports/monitoring
    chown 1000:1000 /exports/monitoring
    
    # /etc/exports
    /exports/monitoring  192.168.1.0/24(rw,sync,no_root_squash)
    
    exportfs -a
    

  2. Mount on client:

    mkdir /mnt/nfs_monitoring
    mount -t nfs storage.example.com:/exports/monitoring /mnt/nfs_monitoring
    
    # Test write
    touch /mnt/nfs_monitoring/test
    

  3. Stop services and migrate:

    systemctl stop influxdb loki
    rsync -av /var/lib/influxdb2/ /mnt/nfs_monitoring/influxdb/
    rsync -av /var/lib/loki/ /mnt/nfs_monitoring/loki/
    

  4. Update systemd units:

    # /etc/systemd/system/influxdb.service.d/override.conf
    [Service]
    Environment="INFLUXD_ENGINE_PATH=/mnt/nfs_monitoring/influxdb/engine"
    Environment="INFLUXD_BOLT_PATH=/mnt/nfs_monitoring/influxdb/influxd.bolt"
    

  5. Update fstab:

    # /etc/fstab
    storage.example.com:/exports/monitoring  /mnt/nfs_monitoring  nfs  defaults  0  0
    

  6. Start services:

    systemctl daemon-reload
    systemctl start influxdb loki
    
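The `defaults` option in the fstab entry above leaves important NFS behavior implicit. For database-style workloads, explicit options are a common refinement; a sketch, to be validated against what your NFS server exports:

```
# /etc/fstab — hard mounts block rather than silently failing on network
# blips; _netdev delays mounting until the network is up; vers=4.1 is an
# assumption, match it to your server
storage.example.com:/exports/monitoring  /mnt/nfs_monitoring  nfs  hard,noatime,vers=4.1,_netdev  0  0
```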

Data Retention Changes

Reduce Retention to Free Space

If running out of space, reduce retention:

InfluxDB:

# Update bucket retention
influx bucket update \
  --name telegraf \
  --retention 7d  # Reduced from 30d

Loki:

# Update role config
loki_retention: "7d"  # Reduced from 30d

Redeploy:

ansible-playbook deploy-loki.yml

Old data will be deleted according to new retention policy.

Migration Verification

Post-Migration Checklist

  • [ ] Services running without errors
  • [ ] Data accessible via queries
  • [ ] No permission errors in logs
  • [ ] Storage usage as expected
  • [ ] Performance acceptable
  • [ ] Clients connecting successfully
  • [ ] Dashboards showing data
  • [ ] Backup/restore tested
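Most of the checklist above can be run as one script. Service names, paths, and ports are the ones used on this page; adjust for your hosts:

```shell
#!/usr/bin/env bash
# Partial automation of the post-migration checklist.
set -euo pipefail

post_migration_check() {
  systemctl is-active influxdb loki                 # services running
  journalctl -u influxdb -u loki --since=-1h -p err --no-pager  # recent errors
  df -h /var/lib/influxdb2 /var/lib/loki            # storage usage as expected
  curl -fsS http://localhost:8086/health            # InfluxDB answering
  curl -fsS http://localhost:3100/ready             # Loki answering
}

# post_migration_check
```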

Query Test

InfluxDB:

influx query 'from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r["_measurement"] == "cpu")
  |> count()'

Loki:

curl -G "http://localhost:3100/loki/api/v1/query" \
  --data-urlencode 'query={service_type="fail2ban"}' \
  --data-urlencode 'limit=5'

Performance Check

Disk I/O:

iostat -x 1 10

Network (if NFS/S3):

iftop

Query latency:

time influx query '...'

Rollback Procedures

S3 → Local Rollback

  1. Keep local backup until verified
  2. Update configuration back to local
  3. Restore from backup if needed
  4. Restart services
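The rollback above can be sketched as follows. It assumes the pre-migration backup is still available and that the inventory has already been switched back to `influxdb_storage_type: "local"`; the playbook name matches the one used elsewhere on this page:

```shell
#!/usr/bin/env bash
# Roll InfluxDB back from S3 to local storage.
set -euo pipefail

rollback_influxdb_to_local() {
  local backup_dir="$1"                  # the pre-migration backup directory
  ansible-playbook deploy-influxdb.yml   # redeploy with local storage config
  influx restore "$backup_dir"           # restore only if data is missing
  systemctl restart influxdb
}

# rollback_influxdb_to_local /backup/influxdb-YYYYMMDD
```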

NFS → Local Rollback

  1. Copy data back to local disk
  2. Update systemd unit files
  3. Update fstab
  4. Restart services

Best Practices

  1. Always backup: Before any storage migration
  2. Test first: Use test environment
  3. Verify twice: Check data before deleting source
  4. Plan downtime: Communicate maintenance window
  5. Monitor closely: Watch for issues post-migration
  6. Keep backups: Don't delete old data immediately
  7. Document changes: Update infrastructure docs
  8. Verify performance: Ensure migration didn't degrade performance