Custom Plugins¶

Overview¶

Telegraf supports custom input plugins for collecting metrics from applications not covered by built-in plugins. This page covers how to add custom plugins to the solti-monitoring collection.

Telegraf Plugin Types¶

Input Plugins¶

Collect metrics from sources:

- Built-in: CPU, memory, disk (enabled by default)
- Optional: Docker, PostgreSQL, Nginx, Redis, etc.
- Custom: Exec, HTTP, StatsD, Socket

Processor Plugins¶

Transform metrics before sending:

- Rename fields/tags
- Drop unwanted metrics
- Add computed fields

Aggregator Plugins¶

Aggregate metrics over time:

- Calculate min/max/mean
- Histograms and percentiles

Output Plugins¶

Send metrics to destinations. In solti-monitoring the destination is InfluxDB.

Enabling Optional Built-in Plugins¶

Docker/Podman Metrics¶

- role: jackaltx.solti_monitoring.telegraf
  vars:
    telegraf_enable_docker: true

Collects:

- Container count (running, stopped)
- Container CPU/memory usage
- Container network I/O

PostgreSQL Metrics¶

- role: jackaltx.solti_monitoring.telegraf
  vars:
    telegraf_enable_postgresql: true
    telegraf_postgresql_address: "host=localhost user=telegraf dbname=postgres"

Collects:

- Database connections
- Transaction rates
- Table sizes
- Query performance

Nginx Metrics¶

- role: jackaltx.solti_monitoring.telegraf
  vars:
    telegraf_enable_nginx: true
    telegraf_nginx_urls: ["http://localhost/nginx_status"]

Requires the nginx stub_status module to be enabled and served at the configured URL.

Collects:

- Active connections
- Request rate
- Accepted/handled connections

Redis Metrics¶

- role: jackaltx.solti_monitoring.telegraf
  vars:
    telegraf_enable_redis: true
    telegraf_redis_servers: ["tcp://localhost:6379"]

Collects:

- Connected clients
- Memory usage
- Keyspace stats
- Command stats

Custom Exec Plugin¶

Run custom scripts to collect metrics:

Script Requirements¶

The script must print metrics to stdout in a format Telegraf can parse, and the data_format setting must match it. Common choices:

- InfluxDB line protocol
- JSON
- Graphite

Example: Custom Application Metrics¶

Script (/usr/local/bin/app-metrics.sh):

#!/bin/bash
# Output in InfluxDB line protocol
echo "app_users,host=$(hostname) active=42,total=150"
echo "app_requests,host=$(hostname) count=1234"

Configuration:

telegraf_exec_plugins:
  - commands: ["/usr/local/bin/app-metrics.sh"]
    timeout: "5s"
    data_format: "influx"

Telegraf config generated:

[[inputs.exec]]
  commands = ["/usr/local/bin/app-metrics.sh"]
  timeout = "5s"
  data_format = "influx"
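
For scripts that build line protocol from variable data, escaping matters: commas, spaces, and equals signs in tag values would otherwise corrupt the line. The following is a minimal sketch of a formatter, not part of the collection; the function name `format_line` is hypothetical. It writes Python integers with the `i` suffix (integer fields), whereas the shell example above writes them as floats, so the two should not be mixed for the same field.

```python
def format_line(measurement, tags, fields):
    """Format one metric as InfluxDB line protocol (no timestamp)."""

    def esc(text, chars):
        # Backslash-escape each character that has special meaning here.
        for ch in chars:
            text = text.replace(ch, "\\" + ch)
        return text

    def fmt_value(value):
        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"
        if isinstance(value, float):
            return repr(value)
        # Anything else is written as a quoted string field.
        quoted = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{quoted}"'

    head = esc(measurement, ", ")
    for key, value in sorted(tags.items()):
        head += "," + esc(key, ",= ") + "=" + esc(str(value), ",= ")
    body = ",".join(
        esc(key, ",= ") + "=" + fmt_value(value) for key, value in fields.items()
    )
    return f"{head} {body}"


print(format_line("app_users", {"host": "web 1"}, {"active": 42, "total": 150}))
```

Printing one such line per measurement gives the same shape of output as the shell script, with the host tag safely escaped.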

JSON Output Example¶

Script:

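#!/usr/bin/env python3
"""Emit application metrics as a single JSON object on stdout."""
import json
import socket

metrics = {
    # String value; it becomes a tag because "host" is listed in tag_keys
    "host": socket.gethostname(),
    # Nested numeric values become fields
    "app_users": {
        "active": 42,
        "total": 150
    }
}

print(json.dumps(metrics))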

Configuration:

telegraf_exec_plugins:
  - commands: ["/usr/local/bin/app-metrics.py"]
    data_format: "json"
    name_override: "app_metrics"
    tag_keys: ["host"]
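
Telegraf's JSON parser flattens nested objects by joining keys with underscores, so a nested `app_users` object should surface as fields such as `app_users_active`. The sketch below approximates that naming in Python so field names can be predicted before deploying a script. It is an illustration, not Telegraf's own code, and the `flatten` helper is hypothetical.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single level, joining keys with '_'."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined prefix.
            fields.update(flatten(value, name))
        else:
            fields[name] = value
    return fields


sample = {"app_users": {"active": 42, "total": 150}}
print(flatten(sample))
```

With `name_override: "app_metrics"`, these flattened names would be the fields of the `app_metrics` measurement.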

HTTP Input Plugin¶

Poll HTTP endpoints for metrics:

Prometheus Metrics¶

telegraf_http_plugins:
  - urls: ["http://localhost:9090/metrics"]
    data_format: "prometheus"

JSON API¶

telegraf_http_plugins:
  - urls: ["http://localhost:8080/metrics"]
    data_format: "json"
    tag_keys: ["service", "environment"]
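
For testing the JSON API configuration, an endpoint that returns a flat JSON object is enough. The following is a minimal sketch using only the Python standard library. The payload keys other than `service` and `environment` are invented for illustration, and `make_server` is a hypothetical helper name.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_payload():
    # "service" and "environment" match tag_keys; the numeric keys are
    # hypothetical example fields.
    return {
        "service": "myapp",
        "environment": "dev",
        "requests": 1234,
        "active_users": 42,
    }


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = json.dumps(build_payload()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Silence per-request logging.


def make_server(port=8080):
    """Create (but do not start) an HTTP server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server(8080).serve_forever()` serves the payload at http://localhost:8080/metrics, the URL used in the configuration above.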

StatsD Input Plugin¶

Receive metrics via StatsD protocol:

telegraf_statsd_enabled: true
telegraf_statsd_port: 8125
telegraf_statsd_protocol: "udp"

Application sends metrics:

import statsd

c = statsd.StatsClient('localhost', 8125)
c.incr('app.requests')
c.timing('app.response_time', 42)
c.gauge('app.users', 150)
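
The client library is a convenience; the StatsD wire format is plain text of the form `name:value|type` sent as a UDP datagram. As a sketch for applications that cannot add a dependency, the same three metrics can be sent with the standard library alone. The helper names `statsd_packet` and `send_packets` are hypothetical.

```python
import socket


def statsd_packet(name, value, metric_type):
    """Encode one StatsD metric: c = counter, ms = timer, g = gauge."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_packets(packets, host="localhost", port=8125):
    """Send each packet as its own UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for packet in packets:
            sock.sendto(packet, (host, port))


packets = [
    statsd_packet("app.requests", 1, "c"),
    statsd_packet("app.response_time", 42, "ms"),
    statsd_packet("app.users", 150, "g"),
]
```

Calling `send_packets(packets)` sends the metrics to the port configured in `telegraf_statsd_port`. UDP gives no delivery confirmation, so a stopped Telegraf does not raise an error in the application.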

Socket Listener Plugin¶

Listen on TCP/UDP socket for metrics:

telegraf_socket_listener:
  - service_address: "tcp://:8094"
    data_format: "json"
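
A client for this listener only needs to open a TCP connection and write JSON. The sketch below sends one newline-terminated JSON object per metric, assuming the listener treats each line as a separate message. That framing, the `send_json_metric` helper, and the `queue_depth` field are assumptions for illustration.

```python
import json
import socket


def send_json_metric(metric, host="127.0.0.1", port=8094):
    """Send one metric as a newline-terminated JSON object over TCP."""
    line = json.dumps(metric) + "\n"
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("utf-8"))
```

For example, `send_json_metric({"queue_depth": 7})` writes a single numeric field to the listener on port 8094.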

Custom Loki Log Sources¶

Custom Application Logs¶

Add custom log file monitoring:

alloy_custom_sources:
  - name: "myapp"
    type: "file"
    path: "/var/log/myapp/app.log"
    labels:
      service_type: "application"
      app_name: "myapp"

Generated Alloy config:

loki.source.file "myapp" {
  targets = [
    {
      __path__ = "/var/log/myapp/app.log",
      service_type = "application",
      app_name = "myapp",
    },
  ]

  forward_to = [loki.write.default.receiver]
}

Custom Journald Source¶

Monitor specific systemd units:

alloy_custom_sources:
  - name: "myservice"
    type: "journald"
    matches:
      _SYSTEMD_UNIT: "myservice.service"
    labels:
      service_type: "application"
      app_name: "myservice"

Processing and Filtering¶

Drop Unwanted Metrics¶

telegraf_processors:
  - type: "drop"
    filter:
      measurement: ["http_response_time"]
      tags:
        url: ["http://localhost/health"]

Rename Fields¶

telegraf_processors:
  - type: "rename"
    replace:
      - measurement: "old_name"
        dest: "new_name"

Add Computed Fields¶

telegraf_processors:
  - type: "enum"
    mapping:
      field: "status"
      value_mappings:
        "ok": 1
        "warning": 2
        "error": 3

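The effect of the enum mapping can be pictured as a lookup applied to one field of each metric. The sketch below mimics that behavior in Python, assuming values missing from the mapping are left unchanged (Telegraf's behavior when no default is configured). It is an illustration only, and `apply_enum` is a hypothetical name.

```python
VALUE_MAPPINGS = {"ok": 1, "warning": 2, "error": 3}


def apply_enum(fields, field="status", mappings=VALUE_MAPPINGS):
    """Return a copy of fields with the mapped field replaced by its number."""
    value = fields.get(field)
    if value in mappings:
        return {**fields, field: mappings[value]}
    # Unmapped or missing values pass through untouched.
    return dict(fields)


print(apply_enum({"status": "warning", "latency_ms": 12}))
```

Numeric status values are easier to graph and alert on than strings, which is the usual reason for this mapping.
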
Testing Custom Plugins¶

Test Telegraf Configuration¶

# Test config without starting service
telegraf --config /etc/telegraf/telegraf.conf --test

# Test specific input plugin
telegraf --config /etc/telegraf/telegraf.conf --input-filter exec --test

Test Script Output¶

# Verify script runs and outputs correct format
/usr/local/bin/app-metrics.sh

# Expected output (InfluxDB line protocol):
# measurement,tag=value field=value timestamp

Test Alloy Configuration¶

# Validate Alloy config
alloy fmt /etc/alloy/config.alloy
alloy validate /etc/alloy/config.alloy

Performance Considerations¶

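- Script execution time: Keep exec scripts well under the configured timeout (5 seconds in the examples above)
- Collection frequency: Balance data freshness against collection overhead when choosing an interval
- Metric cardinality: Avoid high-cardinality tags such as user or request IDs, since each unique tag value creates a new series in InfluxDB
- Error handling: Ensure scripts handle failures gracefully and do not print partial or malformed metrics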
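
As a sketch of graceful failure handling in an exec script: collect first, print only after collection succeeds, and report problems on stderr with a non-zero exit code so that no partial metrics reach stdout. The `collect` function and its values are placeholders for real application queries.

```python
import sys


def collect():
    """Hypothetical collector; replace with real application queries."""
    return {"active": 42, "total": 150}


def main(collector=collect, out=sys.stdout, err=sys.stderr):
    try:
        fields = collector()
    except Exception as exc:
        # Report on stderr and emit nothing on stdout.
        print(f"app-metrics: collection failed: {exc}", file=err)
        return 1
    body = ",".join(f"{key}={value}" for key, value in fields.items())
    print(f"app_users {body}", file=out)
    return 0
```

A script would end with `sys.exit(main())` so the exit code reaches Telegraf.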

Security Considerations¶

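- Script permissions: Run scripts with the minimum privileges they need, and make them writable only by root
- Input validation: Sanitize external inputs before using them in tags, fields, or shell commands
- Credential management: Use vault for sensitive data
- Network access: Restrict plugin network access, for example by binding listeners only to the interfaces that need them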

Example: Complete Custom Monitoring¶

Monitor a custom application with both metrics and logs:

- role: jackaltx.solti_monitoring.telegraf
  vars:
    telegraf_exec_plugins:
      - commands: ["/usr/local/bin/myapp-metrics.sh"]
        interval: "60s"
        timeout: "5s"
        data_format: "influx"

- role: jackaltx.solti_monitoring.alloy
  vars:
    alloy_custom_sources:
      - name: "myapp"
        type: "file"
        path: "/var/log/myapp/*.log"
        labels:
          service_type: "application"
          app_name: "myapp"

Reference¶

For complete plugin documentation:

- Telegraf plugins: https://github.com/influxdata/telegraf/tree/master/plugins/inputs
- Alloy components: https://grafana.com/docs/alloy/latest/reference/components/