# Quick Start

Get a monitoring stack running in 30 minutes or less.
## Prerequisites

### Required

- **Ansible**: 2.9 or higher
- **Python**: 3.6 or higher on the control node
- **Linux host**: Debian 11/12, Rocky 9, or Ubuntu 24
- **SSH access**: to target hosts, with sudo privileges
- **Network**: outbound internet access for package downloads

### Recommended

- **Two hosts**: one for the monitoring server, one for a client
- **4GB RAM**: minimum per host
- **20GB disk**: for short-term storage
## Installation

### Step 1: Install Collection

```bash
# Install from Ansible Galaxy
ansible-galaxy collection install jackaltx.solti_monitoring

# Verify installation
ansible-galaxy collection list | grep solti_monitoring
```
### Step 2: Create Inventory

```yaml
# inventory.yml
all:
  children:
    monitoring_servers:
      hosts:
        monitor1:
          ansible_host: 192.168.1.10
    monitored_clients:
      hosts:
        monitor1: # Monitor the monitoring server itself
          ansible_host: 192.168.1.10
        client1:
          ansible_host: 192.168.1.20
```
**Why monitor1 appears twice:**

- As a member of `monitoring_servers` → runs InfluxDB and Loki (storage)
- As a member of `monitored_clients` → runs Telegraf and Alloy (collectors)
- Reason: monitor the monitor! Track CPU, memory, and disk usage of the monitoring infrastructure itself
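Before deploying, it is worth confirming that Ansible can reach every host in the inventory. A quick check using Ansible's standard ad-hoc `ping` module (this assumes the SSH and sudo access from the prerequisites is already in place):

```shell
# Connectivity check against every host in the inventory
ansible -i inventory.yml all -m ping
```

Each host should report `"ping": "pong"`; fix any connection errors before continuing.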
### Step 3: Deploy Monitoring Server

```yaml
# deploy-server.yml
---
- name: Deploy Monitoring Server
  hosts: monitoring_servers
  become: true
  roles:
    - role: jackaltx.solti_monitoring.influxdb
      vars:
        influxdb_org: "myorg"
        influxdb_bucket: "telegraf"
    - role: jackaltx.solti_monitoring.loki
      vars:
        loki_local_storage: true
```
Run the playbook:
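Assuming the inventory from Step 2 and this playbook sit in the current directory, the invocation is the standard one:

```shell
# Deploy InfluxDB and Loki to the monitoring server
ansible-playbook -i inventory.yml deploy-server.yml
```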
Expected time: 5-10 minutes

What happens:

- InfluxDB deployed with auto-generated tokens
- Loki deployed with local storage
- monitor1 ready to receive metrics and logs
### Step 4: Get InfluxDB Token

After deployment, get the system-operator token:

```bash
# SSH to monitoring server
ssh monitor1

# List all tokens
sudo podman exec influxdb influx auth list --json

# Find the system-operator token
sudo podman exec influxdb influx auth list --json \
  | jq -r '.[] | select(.description == "system-operator") | .token'
```

Copy the token value - you'll need it for the next step.
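The `jq` filter above selects the entry whose `description` field is `system-operator` and prints only its token. If you want to check the filter before running it against the server, you can try it on a mocked `influx auth list --json` payload (illustrative data, not real tokens):

```shell
# Mock of `influx auth list --json` output (sample data only)
cat > /tmp/auth-sample.json <<'EOF'
[
  {"description": "admin-token",     "token": "AAAA"},
  {"description": "system-operator", "token": "BBBB"}
]
EOF

# Same jq filter as in Step 4
jq -r '.[] | select(.description == "system-operator") | .token' /tmp/auth-sample.json
# prints: BBBB
```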
### Step 5: Configure Connection Database

Create connection configurations for Telegraf outputs:

```yaml
# group_vars/all/telegraf2influxdb_configs.yml
---
telgraf2influxdb_configs:
  localhost:
    url: "http://127.0.0.1:8086"
    token: "" # Leave empty - auto-discovered when collector runs on same host as InfluxDB
    bucket: "telegraf"
    org: "myorg"
    namedrop: '["influxdb_oss"]'
  central:
    url: "http://192.168.1.10:8086"
    token: "Z5VHB6JzIEioWv9MH1_lFgTcfFy_yZR8V7ThhiA0lAdXmeS50_y-rjL8PrXNPrlG1zLziOHZwsxVkqojWaPJ4A==" # Paste token from Step 4
    bucket: "telegraf"
    org: "myorg"
```
**Important - Auto-Discovery Feature:**

- `localhost` config: token is empty (`token: ""`)
  - Why: when Telegraf runs on the same host as InfluxDB, the role automatically discovers the token from the local InfluxDB instance
  - Use case: monitor1 monitoring itself (all-in-one deployment)
- `central` config: token is required (paste from Step 4)
  - Why: remote clients cannot auto-discover tokens
  - Use case: client1 sending metrics to monitor1
Directory structure:

```text
./
├── inventory.yml
├── group_vars/
│   └── all/
│       └── telegraf2influxdb_configs.yml
├── deploy-server.yml
└── deploy-client.yml
```
### Step 6: Deploy Monitoring Clients

```yaml
# deploy-client.yml
---
- name: Deploy Telegraf on Monitor Server
  hosts: monitoring_servers
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_outputs: ['localhost'] # Uses localhost config (auto-discovery)

- name: Deploy Clients
  hosts: monitored_clients
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_outputs: ['central'] # Uses central config (requires token)
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_loki_endpoints:
          - label: central
            endpoint: "192.168.1.10"
```
Run the playbook:
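As in Step 3, run the playbook against the inventory from Step 2:

```shell
# Deploy Telegraf and Alloy to all monitored hosts
ansible-playbook -i inventory.yml deploy-client.yml
```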
Expected time: 3-5 minutes
What happens:
- monitor1 gets Telegraf → uses localhost config (token auto-discovered)
- client1 gets Telegraf → uses central config (token from Step 5)
- Both get Alloy → send logs to monitor1
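Before querying the server, you can sanity-check the collector services on any host with systemd's standard status command (this assumes the roles install units named after the services, which may differ in your deployment):

```shell
# Confirm the collectors are running (execute on monitor1 or client1)
systemctl status telegraf alloy --no-pager
```

Both units should show `active (running)`; if not, check `journalctl -u telegraf` or `journalctl -u alloy` before moving on.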
### Step 7: Verify Data Flow

Check metrics from monitor1 (monitoring itself):

```bash
curl -X POST "http://192.168.1.10:8086/api/v2/query?org=myorg" \
  -H "Authorization: Token YOUR_TOKEN_HERE" \
  -H "Content-Type: application/vnd.flux" \
  -d 'from(bucket:"telegraf") |> range(start:-5m) |> filter(fn:(r) => r["host"] == "monitor1") |> limit(n:5)'
```

Check metrics from client1:

```bash
curl -X POST "http://192.168.1.10:8086/api/v2/query?org=myorg" \
  -H "Authorization: Token YOUR_TOKEN_HERE" \
  -H "Content-Type: application/vnd.flux" \
  -d 'from(bucket:"telegraf") |> range(start:-5m) |> filter(fn:(r) => r["host"] == "client1") |> limit(n:5)'
```

Check logs:

```bash
curl -G "http://192.168.1.10:3100/loki/api/v1/query" \
  --data-urlencode 'query={hostname=~"monitor1|client1"}' \
  --data-urlencode 'limit=10'
```
If you see data from both hosts, congratulations! Your monitoring stack is working.
## Alternative: All-in-One Testing

For testing or single-host deployments, you can deploy everything on one host:

### Simplified Inventory

```yaml
# inventory-allinone.yml
all:
  hosts:
    monitor1:
      ansible_host: 192.168.1.10
  children:
    monitoring_servers:
      hosts:
        monitor1:
    monitored_clients:
      hosts:
        monitor1:
```
### Single Playbook

```yaml
# deploy-allinone.yml
---
- name: Deploy Monitoring Server
  hosts: monitoring_servers
  become: true
  roles:
    - role: jackaltx.solti_monitoring.influxdb
      vars:
        influxdb_org: "myorg"
        influxdb_bucket: "telegraf"
    - role: jackaltx.solti_monitoring.loki
      vars:
        loki_local_storage: true

- name: Deploy Collectors
  hosts: monitored_clients
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_outputs: ['localhost'] # Token auto-discovered
    - role: jackaltx.solti_monitoring.alloy
      vars:
        alloy_loki_endpoints:
          - label: localhost
            endpoint: "127.0.0.1"
```
### Run All-in-One

```bash
# Create minimal connection config
mkdir -p group_vars/all
cat > group_vars/all/telegraf2influxdb_configs.yml <<EOF
---
telgraf2influxdb_configs:
  localhost:
    url: "http://127.0.0.1:8086"
    token: "" # Auto-discovered
    bucket: "telegraf"
    org: "myorg"
    namedrop: '["influxdb_oss"]'
EOF

# Deploy everything
ansible-playbook -i inventory-allinone.yml deploy-allinone.yml
```
**What's special:**

- No manual token retrieval needed - the Telegraf role automatically discovers the token from the local InfluxDB
- Perfect for testing - quick setup to verify the collection works
- Production use - also works for small deployments where everything runs on one server
## Summary

You now have:

- ✅ InfluxDB receiving metrics
- ✅ Loki receiving logs
- ✅ Telegraf collecting system metrics
- ✅ Alloy collecting system logs
- ✅ Verified data flow
Total time: ~30 minutes
Next: Explore the rest of this documentation to customize and expand your monitoring setup.