Client Configuration

Overview

This page covers common patterns for deploying the Telegraf and Alloy collectors on hosts that ship metrics and logs to central monitoring servers.

Standard Client Playbook

Basic playbook for deploying both collectors:

---
- name: Deploy monitoring clients
  hosts: monitoring_clients
  become: true

  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_output_url: "{{ vault_influxdb_url }}"
        telegraf_output_token: "{{ vault_influxdb_token }}"
        telegraf_output_org: "myorg"
        telegraf_output_bucket: "telegraf"

    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_loki_endpoint: "{{ vault_loki_url }}"
        alloy_config_sources:
          - fail2ban
          - syslog
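
The playbook reads `vault_influxdb_url`, `vault_influxdb_token`, and `vault_loki_url`, so those variables have to be defined somewhere Ansible can load them. A minimal sketch of such a file follows. The file path and the token placeholder are assumptions for illustration; the URLs reuse the addresses from the inventory example.

```yaml
# group_vars/monitoring_clients/vault.yml (hypothetical path)
# Encrypt before committing: ansible-vault encrypt group_vars/monitoring_clients/vault.yml
vault_influxdb_url: "http://10.10.0.11:8086"
vault_influxdb_token: "REPLACE_WITH_INFLUXDB_TOKEN"  # placeholder, not a real token
vault_loki_url: "http://10.10.0.11:3100"
```

Run the playbook with `--ask-vault-pass` or a vault password file so the values can be decrypted at runtime.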

Inventory Configuration

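Define clients in inventory with appropriate variables. Values set under `vars:` on a role in the playbook take precedence over inventory variables, so leave a key out of the playbook if you want to set it per host or per group here: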

# inventory.yml
all:
  children:
    monitoring_clients:
      hosts:
        web1.example.com:
          alloy_config_sources:
            - apache
            - fail2ban

        db1.example.com:
          telegraf_enable_postgresql: true
          alloy_config_sources:
            - postgresql
            - syslog

        mail1.example.com:
          alloy_config_sources:
            - mail
            - fail2ban

      vars:
        # Shared client configuration
        telegraf_output_url: "http://10.10.0.11:8086"
        alloy_loki_endpoint: "http://10.10.0.11:3100"
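
When several hosts share a function, per-host lists become repetitive. One option is to nest child groups under `monitoring_clients` and put the shared settings in group `vars`. This is a sketch only: the group names `web_clients` and `db_clients` are assumptions, not names the roles require.

```yaml
# inventory.yml (sketch; web_clients and db_clients are hypothetical group names)
all:
  children:
    monitoring_clients:
      children:
        web_clients:
          hosts:
            web1.example.com:
          vars:
            alloy_config_sources:
              - apache
              - fail2ban
        db_clients:
          hosts:
            db1.example.com:
          vars:
            telegraf_enable_postgresql: true
            alloy_config_sources:
              - postgresql
              - syslog
      vars:
        # Shared client configuration, inherited by both child groups
        telegraf_output_url: "http://10.10.0.11:8086"
        alloy_loki_endpoint: "http://10.10.0.11:3100"
```

Adding a second web server then only requires a new entry under `web_clients.hosts`.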

Network Configurations

Direct Access (Development)

Clients connect directly to monitoring servers:

telegraf_output_url: "http://monitor.example.com:8086"
alloy_loki_endpoint: "http://monitor.example.com:3100"

Pros: Simple setup
Cons: No encryption, requires open firewall rules

WireGuard VPN (Production)

Clients connect via WireGuard tunnel:

telegraf_output_url: "http://10.10.0.11:8086"  # WireGuard IP
alloy_loki_endpoint: "http://10.10.0.11:3100"  # WireGuard IP

Pros: Encrypted, authenticated, works across networks
Cons: Requires WireGuard setup

HTTPS with TLS

Clients connect via HTTPS:

telegraf_output_url: "https://monitor.example.com:8086"
alloy_loki_endpoint: "https://monitor.example.com:3100"

Pros: Encrypted, uses standard HTTPS
Cons: Requires TLS certificates, reverse proxy setup

Host-Specific Customization

Web Server Client

- hosts: web1.example.com
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_enable_nginx: true
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - apache
          - nginx
          - fail2ban

Database Server Client

- hosts: db1.example.com
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_enable_postgresql: true
        telegraf_enable_redis: true
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - postgresql
          - syslog

Mail Server Client

- hosts: mail1.example.com
  become: true
  roles:
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - mail
          - fail2ban

Container Host Client

- hosts: docker1.example.com
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_enable_docker: true
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - docker
          - syslog

Global Tagging Strategy

Add consistent labels across all clients:

# Group vars for all clients
monitoring_clients:
  vars:
    telegraf_global_tags:
      environment: "production"
      datacenter: "dc1"
      collector: "telegraf"

    alloy_global_labels:
      environment: "production"
      datacenter: "dc1"
      collector: "alloy"

Client Verification

After deployment, verify clients are working:

Check Service Status

ansible monitoring_clients -m shell -a "systemctl status telegraf"
ansible monitoring_clients -m shell -a "systemctl status alloy"
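
The ad-hoc commands print status output that has to be read by eye. For a pass/fail result, the same check can be written as a small playbook. This is a sketch that assumes both collectors run as systemd units named `telegraf.service` and `alloy.service`, matching the `systemctl` commands above.

```yaml
---
- name: Verify monitoring services
  hosts: monitoring_clients
  become: true
  gather_facts: false

  tasks:
    - name: Collect the state of all services
      ansible.builtin.service_facts:

    - name: Assert that each collector is running
      ansible.builtin.assert:
        that:
          - ansible_facts.services[item ~ '.service'] is defined
          - ansible_facts.services[item ~ '.service'].state == 'running'
        fail_msg: "{{ item }} is not running on {{ inventory_hostname }}"
      loop:
        - telegraf
        - alloy
```

Hosts that run only one collector, such as a mail server with Alloy alone, would fail the Telegraf assertion, so trim the loop or limit the play accordingly.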

Verify Data Flow

Check that data is reaching servers:

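# InfluxDB - check for recent data (INFLUX_TOKEN is assumed to hold a read token)
curl -X POST "http://monitor.example.com:8086/api/v2/query?org=myorg" \
  -H "Authorization: Token $INFLUX_TOKEN" \
  -H "Content-Type: application/vnd.flux" \
  -H "Accept: application/csv" \
  --data 'from(bucket: "telegraf") |> range(start: -5m) |> limit(n: 5)'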

# Loki - check for recent logs (query_range defaults to the last hour)
curl -G "http://monitor.example.com:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={hostname="web1.example.com"}' \
  --data-urlencode "limit=10"
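
The Loki check can also be automated so that an empty result fails loudly. The sketch below runs from the control node and assumes Loki is reachable without authentication. The variable names `loki_url` and `loki_query` are hypothetical and exist only in this example.

```yaml
---
- name: Check that logs are arriving in Loki
  hosts: localhost
  gather_facts: false

  vars:
    loki_url: "http://monitor.example.com:3100"   # hypothetical variable name
    loki_query: '{hostname="web1.example.com"}'   # hypothetical variable name

  tasks:
    - name: Query Loki for recent log streams
      ansible.builtin.uri:
        url: "{{ loki_url }}/loki/api/v1/query_range?query={{ loki_query | urlencode }}&limit=10"
        return_content: true
      register: loki_result

    - name: Fail if no streams came back
      ansible.builtin.assert:
        that:
          - loki_result.json.data.result | length > 0
        fail_msg: "Loki returned no log streams for {{ loki_query }}"
```

An empty result does not always mean the client is broken. A quiet host may simply have logged nothing in the queried window.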

Incremental Updates

Update client configurations without full redeployment:

# Update only Telegraf configs
ansible-playbook -i inventory.yml deploy-clients.yml --tags telegraf_config

# Restart only Alloy services
ansible-playbook -i inventory.yml deploy-clients.yml --tags alloy_restart
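
The tags used above are presumably defined inside the roles. If coarser selection is also useful, tags can be attached to each role in the playbook. This is a sketch only: the tag names `telegraf` and `alloy` are assumptions chosen for illustration, not names the collection documents.

```yaml
---
- name: Deploy monitoring clients
  hosts: monitoring_clients
  become: true

  roles:
    - role: jackaltx.solti_monitoring.telegraf
      tags: [telegraf]   # hypothetical tag; applies to every task in the role
    - role: jackaltx.solti_monitoring.alloy
      tags: [alloy]      # hypothetical tag; applies to every task in the role
```

With this in place, `--tags alloy` would run the whole Alloy role and skip Telegraf entirely.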

Client Removal

To remove monitoring clients:

- name: Remove monitoring clients
  hosts: monitoring_clients
  become: true

  tasks:
    - name: Stop and disable services
      systemd:
        name: "{{ item }}"
        state: stopped
        enabled: false
      loop:
        - telegraf
        - alloy

    - name: Remove packages
      package:
        name: "{{ item }}"
        state: absent
      loop:
        - telegraf
        - alloy
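
Removing the packages may leave configuration behind. A possible follow-up task is sketched below. Both paths are assumed to be the package defaults, and the roles may write configuration elsewhere, so confirm the locations on a real host before deleting anything.

```yaml
# Additional task for the removal play's tasks list (indent to match).
# Both paths are assumed package defaults; verify them before running.
- name: Remove leftover configuration directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: absent
  loop:
    - /etc/telegraf
    - /etc/alloy
```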

Best Practices

  1. Use inventory groups: Organize clients by function (web, db, mail)
  2. Consistent tagging: Apply meaningful labels for filtering
  3. Test before production: Deploy to test hosts first
  4. Monitor the monitors: Alert on client failures
  5. Document customizations: Keep notes on host-specific configs
  6. Version control: Store inventory and playbooks in git
  7. Secure credentials: Use Ansible Vault for tokens
  8. Network security: Use WireGuard or TLS for production

Troubleshooting

Client Not Sending Data

  1. Check service status
  2. Review logs (journalctl -u telegraf, journalctl -u alloy)
  3. Test network connectivity to servers
  4. Verify authentication tokens
  5. Check firewall rules
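
The connectivity and firewall steps can be checked from the clients themselves. The following sketch assumes `telegraf_output_url` and `alloy_loki_endpoint` are set in inventory with an explicit port, as in the inventory example, so they are visible outside the roles.

```yaml
---
- name: Test connectivity from clients to monitoring servers
  hosts: monitoring_clients
  gather_facts: false

  tasks:
    - name: InfluxDB port is reachable from the client
      ansible.builtin.wait_for:
        host: "{{ telegraf_output_url | ansible.builtin.urlsplit('hostname') }}"
        port: "{{ telegraf_output_url | ansible.builtin.urlsplit('port') }}"
        timeout: 5

    - name: Loki port is reachable from the client
      ansible.builtin.wait_for:
        host: "{{ alloy_loki_endpoint | ansible.builtin.urlsplit('hostname') }}"
        port: "{{ alloy_loki_endpoint | ansible.builtin.urlsplit('port') }}"
        timeout: 5
```

A timeout here points at routing, firewall rules, or a stopped server. A successful connection with no data arriving points back at tokens or collector configuration.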

High Resource Usage

  1. Reduce collection frequency (Telegraf)
  2. Limit log sources (Alloy)
  3. Filter metrics/logs at source
  4. Check for configuration errors

Configuration Drift

  1. Re-run Ansible playbooks to enforce state
  2. Use --check mode to preview changes
  3. Monitor for unauthorized config changes