Introduction
Purpose
An experiment in AI co-development for code, process, and documentation.
The Solti-Monitoring Collection is an Ansible-based monitoring stack designed for small labs and testing environments, not enterprise scale. It provides tested roles for deploying monitoring infrastructure using modern, cloud-native tools.
Key Characteristics:
- Lab-scale focus: Streamlined for testing and small deployments, not high-availability or massive scale
- Monitoring sink/source architecture: Collectors (sources) send to storage servers (sinks)
- Comprehensive testing: Molecule framework with both container and VM-based testing
- Manual HTTPS: TLS/HTTPS configuration is not automated - must be configured manually if needed
- AI-assisted development: This collection, its processes, and documentation were developed with AI collaboration
What is Solti-Monitoring?
Solti-Monitoring implements parallel monitoring pipelines:
- Metrics Pipeline: Telegraf (collector) → InfluxDB (storage)
- Logging Pipeline: Alloy (collector) → Loki (storage)
Both pipelines support multiple deployment patterns and storage backends.
Key Features
- Quick setup: Deploy a complete monitoring stack in under 30 minutes
- Sink/source pattern: Collectors and storage servers deployed independently
- Molecule testing: Automated testing with containers and VMs across Debian, Rocky, Ubuntu
- Flexible storage: Local disk, NFS, or S3-compatible object storage
- Token authentication: InfluxDB uses token-based auth (Loki relies on network security)
- Manual HTTPS: Transport encryption requires manual configuration
Architecture
Sink/Source Pattern
┌─────────────────┐
│  Source (Host)  │
│                 │
│  ┌──────────┐   │      ┌──────────────┐
│  │ Telegraf │───┼─────>│   InfluxDB   │
│  └──────────┘   │      │    (Sink)    │
│                 │      └──────────────┘
│  ┌──────────┐   │      ┌──────────────┐
│  │  Alloy   │───┼─────>│     Loki     │
│  └──────────┘   │      │    (Sink)    │
└─────────────────┘      └──────────────┘
- Sources (Collectors): Telegraf and Alloy run on monitored systems
- Sinks (Storage): InfluxDB and Loki run on monitoring servers
- Decoupled: Deploy independently, multiple destinations supported
Use Cases
All-in-One Testing
- Single host: Server + collectors on same machine
- Token auto-discovery: No manual token configuration needed
- Quick validation: Test the collection works
- Development: Validate changes before deployment
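The single-host pattern above can be sketched as one play that applies all four roles to the same machine. This is a minimal illustration, not the collection's own playbook: the file name, the use of localhost, and the become setting are assumptions, and any role variables the roles may require are not shown.

```yaml
# all-in-one.yml (hypothetical file name)
- name: All-in-one monitoring test host
  hosts: localhost        # assumption: the control node is also the test host
  connection: local
  become: true            # assumption: the roles need root to install packages
  roles:
    # Sinks are listed first so the storage services exist
    # before the collectors are configured to send to them.
    - jackaltx.solti_monitoring.influxdb
    - jackaltx.solti_monitoring.loki
    - jackaltx.solti_monitoring.telegraf
    - jackaltx.solti_monitoring.alloy
```

Listing sinks before sources is a design choice made for this sketch; whether token auto-discovery depends on that ordering is not confirmed here.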
Small Production Deployments
- 1 monitoring server + handful of clients
- Home labs, edge sites, small offices
- VPN-connected remote hosts (WireGuard)
- Resource-constrained environments
NOT For:
- Multi-host HA deployments - Out of scope for this project
- Massive scale - Not optimized for hundreds of hosts
- Enterprise features - No clustering, multi-tenancy, automated HTTPS
What Can You Monitor?
- System Metrics: CPU, memory, disk, network, processes
- Application Logs: Apache, Fail2ban, Postfix, Bind9, WireGuard, Gitea, ISPConfig
- Application Metrics: Apache, MySQL/MariaDB, Redis, Memcached, custom Telegraf plugins
Collection Structure
solti_monitoring/
├── roles/
│   ├── influxdb/        # Metrics storage server
│   ├── loki/            # Log storage server
│   ├── telegraf/        # Metrics collector
│   ├── alloy/           # Log collector
│   ├── log_tests/       # Log pipeline verification
│   ├── metrics_tests/   # Metrics pipeline verification
│   ├── nfs-client/      # NFS mount support
│   └── shared/          # Shared utilities and tasks
│
├── playbooks/           # Example playbooks
├── molecule/            # Testing scenarios
└── docs/                # Additional documentation
Note: The fail2ban_config and wazuh_agent roles have moved to the solti-ensemble collection.
Getting Started
Prerequisites
- Ansible 2.9 or higher
- Python 3.6 or higher
- Supported Linux distribution (Debian 11/12, Rocky 9, Ubuntu 24)
- SSH access to target hosts with sudo privileges
Installation
User Installation (Ansible Galaxy):
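One way to pin the collection is a requirements file. This sketch assumes the collection is published on Ansible Galaxy under the jackaltx.solti_monitoring name used by the role references in this chapter; the file name is an illustrative choice.

```yaml
# requirements.yml (hypothetical file name)
collections:
  # Namespace and collection name are taken from the role references
  # in this chapter; assumed, not verified, to match the Galaxy listing.
  - name: jackaltx.solti_monitoring
```

With such a file in place, `ansible-galaxy collection install -r requirements.yml` installs everything it lists.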
Developer Installation (Local Clone):
Important for developers: Remove the symlink before running Molecule tests and recreate it afterward.
First Deployment
# Deploy monitoring server
- hosts: monitoring_server
  roles:
    - jackaltx.solti_monitoring.influxdb
    - jackaltx.solti_monitoring.loki

# Deploy collectors
- hosts: monitored_hosts
  roles:
    - jackaltx.solti_monitoring.telegraf
    - jackaltx.solti_monitoring.alloy
See the Quick Start chapter for a complete 30-minute deployment walkthrough.
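The playbook above targets two inventory groups, monitoring_server and monitored_hosts. A minimal YAML inventory that defines them might look like the following sketch. Only the group names come from the playbook; the file name, host names, and addresses are placeholders.

```yaml
# inventory.yml (hypothetical file name; hosts and addresses are placeholders)
all:
  children:
    monitoring_server:        # sink: runs InfluxDB and Loki
      hosts:
        monitor01:
          ansible_host: 192.0.2.10
    monitored_hosts:          # sources: run Telegraf and Alloy
      hosts:
        web01:
          ansible_host: 192.0.2.21
        mail01:
          ansible_host: 192.0.2.22
```

Assuming the playbook is saved as site.yml (also a hypothetical name), it would be run with `ansible-playbook -i inventory.yml site.yml`.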
Documentation Structure
This book contains:
- Quick Start: 30-minute deployment walkthrough
- Architecture Overview: Understanding the monitoring pipelines
- Installation & Deployment: Detailed deployment patterns
- Role Documentation: Complete reference for each role
- Testing Guide: Molecule testing and verification
- Troubleshooting: Common issues and solutions
Testing Philosophy
- Molecule framework for automated testing
- Container tests (Podman) for fast unit testing
- VM tests for integration testing with real systemd
- Platform matrix: Debian 11/12, Rocky 9, Ubuntu 24
- Verification tasks to confirm functionality
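As a rough illustration of the container side of this approach, a Molecule scenario using the Podman driver could be declared as below. This is a generic sketch, not the collection's actual scenario: the scenario path, platform names, and image references are assumptions, and stock images like these usually need extra setup before systemd-managed services can run in them.

```yaml
# molecule/default/molecule.yml (hypothetical scenario, not copied from the collection)
dependency:
  name: galaxy
driver:
  name: podman                # provided by the molecule-plugins package
platforms:
  # Image references are placeholders chosen to mirror the platform matrix.
  - name: debian12
    image: docker.io/library/debian:12
    pre_build_image: true
  - name: rocky9
    image: docker.io/rockylinux/rockylinux:9
    pre_build_image: true
  - name: ubuntu24
    image: docker.io/library/ubuntu:24.04
    pre_build_image: true
provisioner:
  name: ansible
verifier:
  name: ansible               # verification tasks are written as an Ansible playbook
```

Running `molecule test` inside the role or collection directory then creates the containers, converges, verifies, and destroys them.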
Security Model
- InfluxDB: Token-based authentication (implemented)
- Loki: No authentication - relies on network security (VPN/private LAN)
- Transport: HTTP over VPN acceptable, HTTPS requires manual configuration
- Design priority: Network isolation takes precedence over transport encryption
Support and Community
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: Questions and community support
- Role READMEs: Detailed role-specific documentation
- This Book: Comprehensive guides and examples
License
MIT License - See LICENSE file for details.
Acknowledgments
Built with:
- InfluxData: InfluxDB and Telegraf
- Grafana Labs: Loki and Alloy
- Ansible: Infrastructure automation
- Molecule: Testing framework
AI Collaboration: This collection represents an experiment in AI-assisted development for code, process, and documentation.
Next Steps
- Quick Start - Deploy your first monitoring stack in 30 minutes
- Architecture Overview - Understand the sink/source pattern
- Role Documentation - Explore individual role capabilities