Client Configuration¶

Overview¶

This page covers common patterns for deploying the Telegraf and Alloy collectors on hosts that ship metrics and logs to central monitoring servers.

Standard Client Playbook¶

Basic playbook for deploying both collectors:

---
- name: Deploy monitoring clients
  hosts: monitoring_clients
  become: true

  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_output_url: "{{ vault_influxdb_url }}"
        telegraf_output_token: "{{ vault_influxdb_token }}"
        telegraf_output_org: "myorg"
        telegraf_output_bucket: "telegraf"

    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_loki_endpoint: "{{ vault_loki_url }}"
        alloy_config_sources:
          - fail2ban
          - syslog

Inventory Configuration¶

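Define clients in the inventory, with per-host variables where hosts differ. Variables set under a role's vars: in the playbook override inventory values, so omit them there (for example alloy_config_sources) when the inventory should decide: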

# inventory.yml
all:
  children:
    monitoring_clients:
      hosts:
        web1.example.com:
          alloy_config_sources:
            - apache
            - fail2ban

        db1.example.com:
          telegraf_enable_postgresql: true
          alloy_config_sources:
            - postgresql
            - syslog

        mail1.example.com:
          alloy_config_sources:
            - mail
            - fail2ban

      vars:
        # Shared client configuration
        telegraf_output_url: "http://10.10.0.11:8086"
        alloy_loki_endpoint: "http://10.10.0.11:3100"

Network Configurations¶

Direct Access (Development)¶

Clients connect directly to monitoring servers:

telegraf_output_url: "http://monitor.example.com:8086"
alloy_loki_endpoint: "http://monitor.example.com:3100"

Pros: Simple setup
Cons: No encryption, requires open firewall rules

WireGuard VPN (Production)¶

Clients connect via WireGuard tunnel:

telegraf_output_url: "http://10.10.0.11:8086"  # WireGuard IP
alloy_loki_endpoint: "http://10.10.0.11:3100"  # WireGuard IP

Pros: Encrypted, authenticated, works across networks
Cons: Requires WireGuard setup

HTTPS with TLS¶

Clients connect via HTTPS:

telegraf_output_url: "https://monitor.example.com:8086"
alloy_loki_endpoint: "https://monitor.example.com:3100"

Pros: Encrypted, uses standard HTTPS
Cons: Requires TLS certificates and a reverse proxy

Host-Specific Customization¶

Web Server Client¶

- hosts: web1.example.com
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_enable_nginx: true
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - apache
          - nginx
          - fail2ban

Database Server Client¶

- hosts: db1.example.com
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_enable_postgresql: true
        telegraf_enable_redis: true
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - postgresql
          - syslog

Mail Server Client¶

- hosts: mail1.example.com
  roles:
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - mail
          - fail2ban

Container Host Client¶

- hosts: docker1.example.com
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_enable_docker: true
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_config_sources:
          - docker
          - syslog

Global Tagging Strategy¶

Add consistent labels across all clients:

# Group vars for all clients
monitoring_clients:
  vars:
    telegraf_global_tags:
      environment: "production"
      datacenter: "dc1"
      collector: "telegraf"

    alloy_global_labels:
      environment: "production"
      datacenter: "dc1"
      collector: "alloy"

Client Verification¶

After deployment, verify clients are working:

Check Service Status¶

ansible monitoring_clients -m shell -a "systemctl status telegraf"
ansible monitoring_clients -m shell -a "systemctl status alloy"
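
The same check can be written as a small play that fails when a collector is not running, which is easier to reuse than ad-hoc commands. This is a sketch, assuming the roles install systemd units named telegraf.service and alloy.service (inferred from the commands above, not confirmed by the roles themselves):

```yaml
---
- name: Verify monitoring clients
  hosts: monitoring_clients
  become: true
  gather_facts: false

  tasks:
    - name: Collect the state of all services
      ansible.builtin.service_facts:

    - name: Assert both collectors are installed and running
      ansible.builtin.assert:
        that:
          # Unit names are assumptions; adjust if the roles use other names
          - item in ansible_facts.services
          - ansible_facts.services[item].state == "running"
        fail_msg: "{{ item }} is missing or not running"
      loop:
        - telegraf.service
        - alloy.service
```

Hosts where either unit is absent or stopped show up as failed in the play recap.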

Verify Data Flow¶

Check that data is reaching servers:

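# InfluxDB - check for recent data. The v2 query API expects a POST with a
# token; INFLUX_TOKEN is assumed to hold one with read access to the bucket.
curl --request POST "http://monitor.example.com:8086/api/v2/query?org=myorg" \
  --header "Authorization: Token ${INFLUX_TOKEN}" \
  --header "Content-Type: application/vnd.flux" \
  --header "Accept: application/csv" \
  --data 'from(bucket: "telegraf") |> range(start: -5m) |> limit(n: 5)'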

# Loki - check for recent logs (query_range defaults to the last hour)
curl -G "http://monitor.example.com:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={hostname="web1.example.com"}' \
  --data-urlencode "limit=10"

Incremental Updates¶

Update client configurations without full redeployment:

# Update only Telegraf configs
ansible-playbook -i inventory.yml deploy-clients.yml --tags telegraf_config

# Restart only Alloy services
ansible-playbook -i inventory.yml deploy-clients.yml --tags alloy_restart

Client Removal¶

To remove monitoring clients:

- name: Remove monitoring clients
  hosts: monitoring_clients
  become: true

  tasks:
    - name: Stop and disable services
      ansible.builtin.systemd:
        name: "{{ item }}"
        state: stopped
        enabled: false
      loop:
        - telegraf
        - alloy

    - name: Remove packages
      ansible.builtin.package:
        name: "{{ item }}"
        state: absent
      loop:
        - telegraf
        - alloy

Best Practices¶

Use inventory groups: Organize clients by function (web, db, mail)
Consistent tagging: Apply meaningful labels for filtering
Test before production: Deploy to test hosts first
Monitor the monitors: Alert on client failures
Document customizations: Keep notes on host-specific configs
Version control: Store inventory and playbooks in git
Secure credentials: Use Ansible Vault for tokens
Network security: Use WireGuard or TLS for production
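
As an illustration of the first practice, the inventory can nest function groups under monitoring_clients so that log sources are set once per group instead of once per host. This is a sketch: the group names web, db, and mail are hypothetical, and the hosts and variables are taken from the inventory example earlier on this page.

```yaml
# inventory.yml (sketch): function groups nested under monitoring_clients
all:
  children:
    monitoring_clients:
      children:
        web:
          hosts:
            web1.example.com:
          vars:
            alloy_config_sources: [apache, fail2ban]
        db:
          hosts:
            db1.example.com:
          vars:
            telegraf_enable_postgresql: true
            alloy_config_sources: [postgresql, syslog]
        mail:
          hosts:
            mail1.example.com:
          vars:
            alloy_config_sources: [mail, fail2ban]
      vars:
        # Shared by every client, whatever its function
        telegraf_output_url: "http://10.10.0.11:8086"
        alloy_loki_endpoint: "http://10.10.0.11:3100"
```

Playbooks can still target monitoring_clients as a whole, while per-function differences live in the child groups.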

Troubleshooting¶

Client Not Sending Data¶

Check service status
Review logs (journalctl -u telegraf, journalctl -u alloy)
Test network connectivity to servers
Verify authentication tokens
Check firewall rules
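
The connectivity step can be run from every client at once with a short play. This is a sketch, assuming telegraf_output_url and alloy_loki_endpoint are defined in the inventory as base URLs (as in the examples above) and that the servers expose the stock InfluxDB 2.x /health and Loki /ready endpoints:

```yaml
---
- name: Check connectivity from clients to monitoring servers
  hosts: monitoring_clients
  gather_facts: false

  tasks:
    # Both requests are made from the client, so they exercise the same
    # network path (firewall, VPN, DNS) that the collectors use.
    - name: InfluxDB answers on its health endpoint
      ansible.builtin.uri:
        url: "{{ telegraf_output_url }}/health"
        status_code: 200
        timeout: 5

    - name: Loki reports ready
      ansible.builtin.uri:
        url: "{{ alloy_loki_endpoint }}/ready"
        status_code: 200
        timeout: 5
```

A timeout or connection error here points to networking or firewall rules. If both tasks pass but no data arrives, the token or the collector configuration is the more likely cause.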

High Resource Usage¶

Reduce collection frequency (Telegraf)
Limit log sources (Alloy)
Filter metrics/logs at source
Check for configuration errors
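
Limiting log sources needs no new settings: trim alloy_config_sources for the affected host. The fragment below is a sketch. The file path is hypothetical, and it only takes effect if the playbook does not also set alloy_config_sources under the role's vars:, which would override it.

```yaml
# host_vars/web1.example.com.yml (hypothetical path)
# Keep only the log sources this host actually needs.
alloy_config_sources:
  - apache
```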

Configuration Drift¶

Re-run Ansible playbooks to enforce state
Use --check mode to preview changes
Monitor for unauthorized config changes
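
Drift checks can also be kept as a dedicated play that never changes hosts. This is a sketch, assuming both roles behave sensibly in check mode and take their variables from the inventory:

```yaml
---
- name: Report configuration drift without changing hosts
  hosts: monitoring_clients
  become: true
  check_mode: true   # same effect as passing --check on the command line
  diff: true         # show what would change in managed files

  roles:
    - role: jackaltx.solti_monitoring.telegraf
    - role: jackaltx.solti_monitoring.alloy
```

Any task reported as changed marks a host whose state no longer matches the playbook.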