Verification Tasks¶
Overview¶
Verification tasks confirm that roles deployed correctly and that their services function as expected. They run during Molecule testing and can also be executed independently to validate production hosts.
Verification Playbooks¶
Structure¶
roles/influxdb/
├── molecule/
│   └── default/
│       └── verify.yml    # Verification tasks
└── tasks/
    └── verify.yml        # Reusable verification
Basic Verification¶
# roles/influxdb/molecule/default/verify.yml
---
- name: Verify InfluxDB
  hosts: all
  become: true
  gather_facts: false
  tasks:
    - name: Include role verification
      include_role:
        name: influxdb
        tasks_from: verify
Verification Patterns¶
Service Status¶
- name: Check service is running
  systemd:
    name: influxdb
    state: started
  check_mode: true
  register: service_check
  failed_when: service_check.changed
Port Listening¶
- name: Check InfluxDB port
  wait_for:
    host: localhost
    port: 8086
    state: started
    timeout: 30
  register: port_check
HTTP Endpoints¶
- name: Verify InfluxDB health endpoint
  uri:
    url: "http://localhost:8086/health"
    status_code: 200
    return_content: true
  register: health_check
  until: health_check.status == 200
  retries: 10
  delay: 5

- name: Show health status
  debug:
    msg: "InfluxDB health: {{ health_check.json.status }}"
File Existence¶
- name: Check configuration file
  stat:
    path: /etc/influxdb/config.toml
  register: config_file
  failed_when: not config_file.stat.exists

- name: Verify file permissions
  assert:
    that:
      - config_file.stat.mode == '0644'
      # The stat module reports the owner's user name as pw_name
      - config_file.stat.pw_name == 'influxdb'
    fail_msg: "Configuration file has incorrect permissions"
Process Running¶
- name: Check InfluxDB process
  command: pgrep -f influxdb
  register: process_check
  changed_when: false
  failed_when: process_check.rc != 0
Podman Container¶
- name: Verify InfluxDB container running
  # The Go template braces are wrapped in raw tags so Jinja2 does not try to evaluate them
  command: podman ps --filter name=influxdb --format "{% raw %}{{.Status}}{% endraw %}"
  register: container_status
  changed_when: false
  failed_when: "'Up' not in container_status.stdout"

- name: Check container health
  command: podman healthcheck run influxdb
  register: health
  changed_when: false
  failed_when: health.rc != 0
Role-Specific Verification¶
InfluxDB Verification¶
# roles/influxdb/tasks/verify.yml
---
- name: Verify InfluxDB service
  systemd:
    name: influxdb
    state: started
  check_mode: true
  register: service
  failed_when: service.changed

- name: Check InfluxDB API
  uri:
    url: "http://localhost:8086/health"
    status_code: 200
  register: api_health
  until: api_health.status == 200
  retries: 10
  delay: 5

- name: Verify InfluxDB version
  command: podman exec influxdb influx version
  register: version
  changed_when: false
  failed_when: version.rc != 0

- name: Check organization exists
  command: >
    podman exec influxdb
    influx org list --name {{ influxdb_org }}
  register: org_check
  changed_when: false
  failed_when: influxdb_org not in org_check.stdout

- name: Verify bucket created
  command: >
    podman exec influxdb
    influx bucket list --name {{ influxdb_bucket }}
  register: bucket_check
  changed_when: false
  failed_when: influxdb_bucket not in bucket_check.stdout

- name: Test write capability
  uri:
    url: "http://localhost:8086/api/v2/write?org={{ influxdb_org }}&bucket={{ influxdb_bucket }}"
    method: POST
    headers:
      Authorization: "Token {{ influxdb_admin_token }}"
    body: "test,host=verify value=1"
    status_code: 204
  when: not (secure_logging | default(true) | bool)

- name: Test query capability
  uri:
    url: "http://localhost:8086/api/v2/query?org={{ influxdb_org }}"
    method: POST
    headers:
      Authorization: "Token {{ influxdb_admin_token }}"
      Content-Type: "application/vnd.flux"
    body: 'from(bucket:"{{ influxdb_bucket }}") |> range(start:-1m) |> limit(n:1)'
    status_code: 200
  when: not (secure_logging | default(true) | bool)
Loki Verification¶
# roles/loki/tasks/verify.yml
---
- name: Verify Loki service
  systemd:
    name: loki
    state: started
  check_mode: true
  register: service
  failed_when: service.changed

- name: Check Loki ready endpoint
  uri:
    url: "http://localhost:3100/ready"
    status_code: 200
  register: ready_check
  until: ready_check.status == 200
  retries: 10
  delay: 5

# ansible_date_time requires gathered facts on the target host
- name: Test Loki push API
  uri:
    url: "http://localhost:3100/loki/api/v1/push"
    method: POST
    headers:
      Content-Type: "application/json"
    body_format: json
    body:
      streams:
        - stream:
            hostname: verify_test
          values:
            - ["{{ ansible_date_time.epoch }}000000000", "test log message"]
    status_code: 204
- name: Test Loki query API
  uri:
    # GET parameters belong in the URL; query_range accepts log selectors
    url: "http://localhost:3100/loki/api/v1/query_range?query={{ loki_verify_query | urlencode }}&limit=1"
    method: GET
    status_code: 200
  vars:
    loki_verify_query: '{hostname="verify_test"}'
  register: query_result
  until: query_result.status == 200
  retries: 5
  delay: 2
Telegraf Verification¶
# roles/telegraf/tasks/verify.yml
---
- name: Verify Telegraf service
  systemd:
    name: telegraf
    state: started
  check_mode: true
  register: service
  failed_when: service.changed

- name: Test Telegraf configuration
  command: telegraf --test --config /etc/telegraf/telegraf.conf
  register: config_test
  changed_when: false
  failed_when: config_test.rc != 0

- name: Check Telegraf is collecting metrics
  shell: |
    telegraf --test --config /etc/telegraf/telegraf.conf | head -20
  register: metrics
  changed_when: false
  failed_when: "'cpu' not in metrics.stdout"
Alloy Verification¶
# roles/alloy/tasks/verify.yml
---
- name: Verify Alloy service
  systemd:
    name: alloy
    state: started
  check_mode: true
  register: service
  failed_when: service.changed

- name: Validate Alloy configuration
  command: alloy fmt /etc/alloy/config.alloy
  register: fmt_check
  changed_when: false

- name: Check Alloy configuration syntax
  command: alloy validate /etc/alloy/config.alloy
  register: validate_check
  changed_when: false
  failed_when: validate_check.rc != 0

- name: Verify Alloy is forwarding logs
  shell: journalctl -u alloy -n 50 | grep -i "forward"
  register: forward_check
  changed_when: false
  failed_when: forward_check.rc != 0
Integration Verification¶
End-to-End Metrics Pipeline¶
# roles/metrics_tests/tasks/verify.yml
---
- name: Write test metric via Telegraf
  lineinfile:
    path: /tmp/test_metric.txt
    line: "test_metric,host={{ ansible_hostname }} value=42"
    create: true

- name: Wait for metric to be collected
  pause:
    seconds: 15
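
- name: Query test metric from InfluxDB
  uri:
    url: "http://{{ influxdb_url }}/api/v2/query?org={{ influxdb_org }}"
    method: POST
    headers:
      Authorization: "Token {{ influxdb_token }}"
      Content-Type: "application/vnd.flux"
      Accept: "application/csv"
    body: |
      from(bucket:"{{ influxdb_bucket }}")
        |> range(start:-5m)
        |> filter(fn: (r) => r._measurement == "test_metric")
        |> last()
    status_code: 200
    return_content: true
  register: metric_query
  # The query API returns annotated CSV, so inspect the raw content rather than JSON.
  # failed_when replaces the module's own status check, so the status is tested here too.
  failed_when: >-
    metric_query.status != 200 or
    'test_metric' not in (metric_query.content | default(''))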
End-to-End Logging Pipeline¶
# roles/log_tests/tasks/verify.yml
---
- name: Write test log entry
  shell: logger -t test_verify "Verification test log entry {{ ansible_date_time.epoch }}"

- name: Wait for log to be forwarded
  pause:
    seconds: 10

- name: Query test log from Loki
  uri:
    # GET parameters belong in the URL; query_range accepts log selectors
    url: "http://{{ loki_url }}/loki/api/v1/query_range?query={{ log_verify_query | urlencode }}&limit=10"
    method: GET
    status_code: 200
  vars:
    log_verify_query: '{hostname="{{ ansible_hostname }}"} |= "Verification test log"'
  register: log_query
  failed_when: >-
    log_query.status != 200 or
    (log_query.json.data.result | default([]) | length == 0)
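
The fixed pause before the query can make this check either slow or flaky, depending on how quickly logs are forwarded. As a minimal sketch, the wait and the query can be merged into one polling task. The retry count, delay, and the `log_verify_query` helper variable are illustrative assumptions, not values taken from the roles.

```yaml
# Sketch only: poll Loki until the test entry shows up instead of pausing.
# Assumes facts are gathered (ansible_hostname) and loki_url is defined.
- name: Poll Loki until the test log entry appears
  uri:
    url: "http://{{ loki_url }}/loki/api/v1/query_range?query={{ log_verify_query | urlencode }}&limit=10"
    method: GET
    status_code: 200
  vars:
    # Hypothetical helper variable holding the LogQL selector
    log_verify_query: '{hostname="{{ ansible_hostname }}"} |= "Verification test log"'
  register: log_query
  # Succeed as soon as at least one matching stream is returned
  until: >-
    log_query.status == 200 and
    (log_query.json.data.result | default([]) | length > 0)
  retries: 10   # assumed upper bound: 10 attempts
  delay: 3      # assumed: 3 seconds between attempts
```

With this shape the task finishes as soon as the entry is queryable and only fails after all retries are exhausted.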
Verification Reports¶
Generate Test Report¶
- name: Create verification report
  template:
    src: verification_report.md.j2
    dest: /tmp/verification_report.md
  delegate_to: localhost

- name: Display report
  debug:
    msg: "{{ lookup('file', '/tmp/verification_report.md') }}"
Report Template¶
# Verification Report
**Date:** {{ ansible_date_time.iso8601 }}
**Host:** {{ ansible_hostname }}
## Service Status
- InfluxDB: {{ 'PASS' if influxdb_service.state == 'running' else 'FAIL' }}
- Loki: {{ 'PASS' if loki_service.state == 'running' else 'FAIL' }}
- Telegraf: {{ 'PASS' if telegraf_service.state == 'running' else 'FAIL' }}
- Alloy: {{ 'PASS' if alloy_service.state == 'running' else 'FAIL' }}
## Health Checks
- InfluxDB API: {{ influxdb_health.status }}
- Loki API: {{ loki_health.status }}
## Test Results
- Metrics Pipeline: {{ 'PASS' if metrics_test.rc == 0 else 'FAIL' }}
- Logging Pipeline: {{ 'PASS' if logs_test.rc == 0 else 'FAIL' }}
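
The template reads variables such as `influxdb_service` and `influxdb_health` that must be set before it is rendered. One possible way to populate them, sketched here under the assumption that the services run as systemd units named `influxdb.service`, `loki.service`, `telegraf.service`, and `alloy.service`, is to derive them from `service_facts` and non-fatal health probes. `metrics_test` and `logs_test` are not covered; they would need to be registered from the pipeline checks.

```yaml
# Sketch only: populate the variables the report template expects.
- name: Collect service facts
  service_facts:

- name: Map service facts to the names used in the template
  set_fact:
    # Each entry carries a 'state' key such as 'running' or 'stopped';
    # the default covers hosts where the unit is not installed.
    influxdb_service: "{{ ansible_facts.services['influxdb.service'] | default({'state': 'unknown'}) }}"
    loki_service: "{{ ansible_facts.services['loki.service'] | default({'state': 'unknown'}) }}"
    telegraf_service: "{{ ansible_facts.services['telegraf.service'] | default({'state': 'unknown'}) }}"
    alloy_service: "{{ ansible_facts.services['alloy.service'] | default({'state': 'unknown'}) }}"

- name: Probe InfluxDB health for the report
  uri:
    url: "http://localhost:8086/health"
  register: influxdb_health
  ignore_errors: true   # a failed probe should appear in the report, not abort the play

- name: Probe Loki readiness for the report
  uri:
    url: "http://localhost:3100/ready"
  register: loki_health
  ignore_errors: true
```

Because the probes ignore errors, the report can still be rendered when a service is down, and the recorded status shows the failure.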
Running Verification Independently¶
Via Ansible¶
# Run verification playbook
ansible-playbook -i inventory.yml verify.yml
# Verify specific role
ansible-playbook -i inventory.yml verify.yml --tags influxdb
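
The commands above assume a top-level `verify.yml` playbook whose plays are tagged per role. A minimal sketch of such a playbook follows; the inventory group names `influxdb` and `loki` are assumptions, and `import_role` is used here (rather than `include_role`) so that the play-level tags are inherited statically by every verification task.

```yaml
# verify.yml (hypothetical top-level playbook; group names are assumptions)
---
- name: Verify InfluxDB hosts
  hosts: influxdb
  become: true
  gather_facts: false
  tags: [influxdb]
  tasks:
    - name: Run InfluxDB verification tasks
      import_role:
        name: influxdb
        tasks_from: verify

- name: Verify Loki hosts
  hosts: loki
  become: true
  gather_facts: true   # the Loki push test reads ansible_date_time
  tags: [loki]
  tasks:
    - name: Run Loki verification tasks
      import_role:
        name: loki
        tasks_from: verify
```

Running with `--tags influxdb` then executes only the first play's tasks.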
Via Molecule¶
Production Verification¶
Best Practices¶
1. Test Actual Functionality¶
Don't just check whether services are running; verify that they work:
# BAD: Only checks service status
- systemd:
    name: influxdb
    state: started

# GOOD: Verifies API responds
- uri:
    url: "http://localhost:8086/health"
    status_code: 200
2. Include Retries¶
Services need time to start, so poll until the check passes instead of failing on the first attempt:
- uri:
    url: "http://localhost:8086/health"
  register: health
  until: health.status == 200
  retries: 10
  delay: 5
3. Verify Data Flow¶
Test the entire pipeline:
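
A minimal sketch of such a check against InfluxDB: write one point, then read it back through the query API. The measurement name `verify_marker` and the retry values are hypothetical; the variables `influxdb_org`, `influxdb_bucket`, and `influxdb_admin_token` are the ones used by the role verification tasks.

```yaml
# Sketch only: round-trip a marker point through write and query.
- name: Write a marker point
  uri:
    url: "http://localhost:8086/api/v2/write?org={{ influxdb_org }}&bucket={{ influxdb_bucket }}"
    method: POST
    headers:
      Authorization: "Token {{ influxdb_admin_token }}"
    body: "verify_marker,host=verify value=1"   # line protocol, hypothetical measurement
    status_code: 204
  no_log: true   # keep the token out of task output

- name: Read the marker point back
  uri:
    url: "http://localhost:8086/api/v2/query?org={{ influxdb_org }}"
    method: POST
    headers:
      Authorization: "Token {{ influxdb_admin_token }}"
      Content-Type: "application/vnd.flux"
      Accept: "application/csv"
    body: >-
      from(bucket:"{{ influxdb_bucket }}")
      |> range(start:-5m)
      |> filter(fn: (r) => r._measurement == "verify_marker")
    return_content: true
  register: marker_query
  # The response is annotated CSV; the measurement name appears in matching rows
  until: "'verify_marker' in (marker_query.content | default(''))"
  retries: 5
  delay: 2
  no_log: true
```

Checking the response content, not just the status code, is what distinguishes this from an API health check: an empty result still returns 200.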
4. Cleanup Test Data¶
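- name: Remove test data
  uri:
    url: "http://localhost:8086/api/v2/delete?org={{ influxdb_org }}&bucket={{ influxdb_bucket }}"
    method: POST
    headers:
      Authorization: "Token {{ influxdb_admin_token }}"
    body_format: json
    body: {start: "1970-01-01T00:00:00Z", stop: "2100-01-01T00:00:00Z", predicate: '_measurement="test"'}
    status_code: 204
  when: cleanup_after_verify | default(true) | bool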
5. Secure Credential Testing¶
- name: Test with credentials
  uri:
    url: "{{ api_url }}"
    headers:
      Authorization: "Token {{ token }}"
  when: not (secure_logging | default(true) | bool)
  no_log: "{{ secure_logging | default(true) | bool }}"
Next Steps¶
- Operations: Production verification procedures
- Troubleshooting: Common verification failures
- Molecule Framework: Complete testing guide