Introduction

Purpose

An experiment in AI co-development for code, process, and documentation.

The Solti-Monitoring Collection is an Ansible-based monitoring stack designed for small labs and testing environments, not enterprise scale. It provides tested roles for deploying monitoring infrastructure using modern, cloud-native tools.

Key Characteristics:

  • Lab-scale focus: Streamlined for testing and small deployments, not high-availability or massive scale
  • Monitoring sink/source architecture: Collectors (sources) send to storage servers (sinks)
  • Comprehensive testing: Molecule framework with both container and VM-based testing
  • Manual HTTPS: TLS/HTTPS configuration is not automated and must be configured manually if needed
  • AI-assisted development: This collection, its processes, and documentation were developed with AI collaboration

What is Solti-Monitoring?

Solti-Monitoring implements parallel monitoring pipelines:

  • Metrics Pipeline: Telegraf (collector) → InfluxDB (storage)
  • Logging Pipeline: Alloy (collector) → Loki (storage)

Both pipelines support multiple deployment patterns and storage backends.

Key Features

  • Quick setup: Deploy a complete monitoring stack in under 30 minutes
  • Sink/source pattern: Collectors and storage servers deployed independently
  • Molecule testing: Automated testing with containers and VMs across Debian, Rocky, Ubuntu
  • Flexible storage: Local disk, NFS, or S3-compatible object storage
  • Token authentication: InfluxDB uses token-based auth (Loki relies on network security)
  • Manual HTTPS: Transport encryption requires manual configuration

Architecture

Sink/Source Pattern

┌─────────────────┐
│  Source (Host)  │
│                 │
│  ┌──────────┐   │      ┌──────────────┐
│  │ Telegraf │───┼─────>│  InfluxDB    │
│  └──────────┘   │      │  (Sink)      │
│                 │      └──────────────┘
│  ┌──────────┐   │      ┌──────────────┐
│  │  Alloy   │───┼─────>│    Loki      │
│  └──────────┘   │      │  (Sink)      │
└─────────────────┘      └──────────────┘

  • Sources (Collectors): Telegraf and Alloy run on monitored systems
  • Sinks (Storage): InfluxDB and Loki run on monitoring servers
  • Decoupled: Deploy independently, multiple destinations supported
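
One way to express the sink/source split is in the Ansible inventory. The following is a minimal sketch, not a file shipped with the collection: the group names match the playbook shown under First Deployment, while the file name, host names, and addresses are hypothetical placeholders.

```yaml
# inventory.yml (hypothetical file name, hosts, and addresses)
all:
  children:
    monitoring_server:        # sinks: InfluxDB and Loki
      hosts:
        monitor01:
          ansible_host: 192.0.2.10
    monitored_hosts:          # sources: Telegraf and Alloy
      hosts:
        web01:
          ansible_host: 192.0.2.21
        mail01:
          ansible_host: 192.0.2.22
```

Keeping sinks and sources in separate groups lets each side be deployed or re-run on its own.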

Use Cases

All-in-One Testing

  • Single host: Server + collectors on same machine
  • Token auto-discovery: No manual token configuration needed
  • Quick validation: Test that the collection works
  • Development: Validate changes before deployment
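
An all-in-one deployment could look roughly like the sketch below. It assumes an inventory group named allinone containing a single host, that privilege escalation via become is wanted at the play level, and that the role defaults are sufficient; none of these are confirmed by the collection's own playbooks.

```yaml
# all-in-one.yml (hypothetical file name)
# Assumes an inventory group named "allinone" that contains a single host.
- name: Deploy sinks and sources on one machine
  hosts: allinone
  become: true                # assumption: roles need root on the target
  roles:
    # Sinks first, so the storage services exist before the collectors start
    - jackaltx.solti_monitoring.influxdb
    - jackaltx.solti_monitoring.loki
    # Sources
    - jackaltx.solti_monitoring.telegraf
    - jackaltx.solti_monitoring.alloy
```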

Small Production Deployments

  • 1 monitoring server + handful of clients
  • Home labs, edge sites, small offices
  • VPN-connected remote hosts (WireGuard)
  • Resource-constrained environments

NOT For:

  • Multi-host HA deployments - Out of scope for this project
  • Massive scale - Not optimized for hundreds of hosts
  • Enterprise features - No clustering, multi-tenancy, automated HTTPS

What Can You Monitor?

  • System Metrics: CPU, memory, disk, network, processes
  • Application Logs: Apache, Fail2ban, Postfix, Bind9, WireGuard, Gitea, ISPConfig
  • Application Metrics: Apache, MySQL/MariaDB, Redis, Memcached, custom Telegraf plugins

Collection Structure

solti_monitoring/
├── roles/
│   ├── influxdb/          # Metrics storage server
│   ├── loki/              # Log storage server
│   ├── telegraf/          # Metrics collector
│   ├── alloy/             # Log collector
│   ├── log_tests/         # Log pipeline verification
│   ├── metrics_tests/     # Metrics pipeline verification
│   ├── nfs-client/        # NFS mount support
│   └── shared/            # Shared utilities and tasks
├── playbooks/             # Example playbooks
├── molecule/              # Testing scenarios
└── docs/                  # Additional documentation

Note: The fail2ban_config and wazuh_agent roles have moved to the solti-ensemble collection.

Getting Started

Prerequisites

  • Ansible 2.9 or higher
  • Python 3.6 or higher
  • Supported Linux distribution (Debian 11/12, Rocky 9, Ubuntu 24)
  • SSH access to target hosts with sudo privileges

Installation

User Installation (Ansible Galaxy):

ansible-galaxy collection install jackaltx.solti_monitoring
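
For repeatable installs, the same collection can be listed in a requirements file. This is a minimal sketch using the standard ansible-galaxy requirements format; no version is pinned because none is stated here.

```yaml
# requirements.yml (conventional file name for ansible-galaxy)
collections:
  - name: jackaltx.solti_monitoring
# Install with: ansible-galaxy collection install -r requirements.yml
```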

Developer Installation (Local Clone):

git clone https://github.com/jackaltx/solti-monitoring.git
cd solti-monitoring
./create_symlinks.sh

Important for developers: Remove the symlink before running Molecule tests, and recreate it afterward.

First Deployment

# Deploy monitoring server
- hosts: monitoring_server
  roles:
    - jackaltx.solti_monitoring.influxdb
    - jackaltx.solti_monitoring.loki

# Deploy collectors
- hosts: monitored_hosts
  roles:
    - jackaltx.solti_monitoring.telegraf
    - jackaltx.solti_monitoring.alloy

See the Quick Start chapter for a complete 30-minute deployment walkthrough.

Documentation Structure

This book contains:

  • Quick Start: 30-minute deployment walkthrough
  • Architecture Overview: Understanding the monitoring pipelines
  • Installation & Deployment: Detailed deployment patterns
  • Role Documentation: Complete reference for each role
  • Testing Guide: Molecule testing and verification
  • Troubleshooting: Common issues and solutions

Testing Philosophy

  • Molecule framework for automated testing
  • Container tests (Podman) for fast unit testing
  • VM tests for integration testing with real systemd
  • Platform matrix: Debian 11/12, Rocky 9, Ubuntu 24
  • Verification tasks to confirm functionality
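
As an illustration of what a verification task can look like, the sketch below polls the health endpoints of both sinks. It is not taken from the collection's Molecule scenarios. It assumes the services listen on their upstream default ports (8086 for InfluxDB, 3100 for Loki) over plain HTTP on the monitoring server itself; the file name is hypothetical.

```yaml
# verify-sinks.yml (hypothetical file name; not part of the collection)
- name: Check that the sinks answer on their default ports
  hosts: monitoring_server
  gather_facts: false
  tasks:
    - name: InfluxDB health endpoint returns HTTP 200
      ansible.builtin.uri:
        url: http://localhost:8086/health
        status_code: 200
      register: influxdb_health
      retries: 5              # retry while the service is still starting
      delay: 5
      until: influxdb_health.status == 200

    - name: Loki readiness endpoint returns HTTP 200
      ansible.builtin.uri:
        url: http://localhost:3100/ready
        status_code: 200
      register: loki_ready
      retries: 5              # Loki may report not-ready briefly after start
      delay: 5
      until: loki_ready.status == 200
```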

Security Model

  • InfluxDB: Token-based authentication (implemented)
  • Loki: No authentication - relies on network security (VPN/private LAN)
  • Transport: HTTP over VPN acceptable, HTTPS requires manual configuration
  • Network isolation over transport encryption
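
Because Loki has no authentication, network isolation has to be enforced somewhere. The sketch below shows one possible approach rather than anything the collection configures: a firewall rule that admits Loki traffic only from a VPN subnet. It assumes a Debian or Ubuntu monitoring server where ufw is already enabled with a default-deny incoming policy, that the community.general collection is installed, and that Loki uses its default port 3100. The subnet and file name are hypothetical.

```yaml
# restrict-loki.yml (hypothetical file name; assumes ufw is already enabled
# with a default-deny incoming policy)
- name: Allow Loki traffic from the VPN subnet only
  hosts: monitoring_server
  become: true
  vars:
    vpn_subnet: 10.10.0.0/24  # hypothetical WireGuard subnet
  tasks:
    - name: Permit Loki's default port (3100) from the VPN subnet
      community.general.ufw:
        rule: allow
        proto: tcp
        port: "3100"
        from_ip: "{{ vpn_subnet }}"
```

On Rocky hosts a firewalld-based equivalent would be needed instead.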

Support and Community

  • GitHub Issues: Bug reports and feature requests
  • GitHub Discussions: Questions and community support
  • Role READMEs: Detailed role-specific documentation
  • This Book: Comprehensive guides and examples

License

MIT License - See LICENSE file for details.

Acknowledgments

Built with:

  • InfluxData - InfluxDB and Telegraf
  • Grafana Labs - Loki and Alloy
  • Ansible - Infrastructure automation
  • Molecule - Testing framework

AI Collaboration: This collection represents an experiment in AI-assisted development for code, process, and documentation.

Next Steps

  • Quick Start - Deploy your first monitoring stack in 30 minutes
  • Architecture Overview - Understand the sink/source pattern
  • Role Documentation - Explore individual role capabilities