Prometheus & Tailscale service discovery
This post is part of the Homelab Monitoring 2025 series.
In my last post on homelab monitoring I was still manually specifying hosts in my scrape configuration, which felt awkward when Tailscale has an API we can query. Time to fix that!
Overview
We will be using cfunkhouser/tailscalesd to poll the Tailscale API and
feed Prometheus with nodes to scrape. I’ve not seen this packaged in
any distribution other than NixOS yet, so you will probably have to
install it yourself.
To make this work you have to:
- Source and install cfunkhouser/tailscalesd on your monitoring machine, to some location like /usr/local/bin (see the sketch after this list)
- Create a Tailscale API token, or better, an OAuth client with devices:core:read assigned
- Save the API token or OAuth secret to a file on your monitoring machine, with minimal permissions (chmod 0400)
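If you are building the binary yourself, it is a standard Go build. A minimal sketch, assuming the main package lives under cmd/tailscalesd in the upstream repository (double-check against its README):

# Build tailscalesd with the Go toolchain; the package path below is an
# assumption about the upstream repository layout -- verify it first.
go install github.com/cfunkhouser/tailscalesd/cmd/tailscalesd@latest
# Install the resulting binary to a system-wide location
sudo install -m 0755 "$(go env GOPATH)/bin/tailscalesd" /usr/local/bin/tailscalesd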
Once this is set up, tailscalesd will poll the Tailscale API every 5
minutes, and Prometheus will use tailscalesd to discover hosts to
scrape. In the example below we will use the same basic tag structure
as in my previous post (tag:monitor) as a filter.
Secrets file
Once you’ve issued an OAuth client or an API token, save it to a file that looks like the one below, somewhere on your system (it can be owned by root). This file will be included as environment variables by systemd, hence the format.
TAILNET="example-network.ts.net"
# For OAuth clients
TAILSCALE_CLIENT_ID=""
TAILSCALE_CLIENT_SECRET="tskey-client-"
# For API token
TAILSCALE_API_TOKEN=""
Make sure to run something like chmod 400 on this file, or use
something like agenix to encrypt it!
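For example, assuming you keep the secrets at /etc/prometheus/secrets-tailscale (the path is arbitrary, just make it match the unit file below):

# Copy the secrets file into place, then lock down owner and permissions
sudo cp secrets-tailscale /etc/prometheus/secrets-tailscale
sudo chown root:root /etc/prometheus/secrets-tailscale
sudo chmod 0400 /etc/prometheus/secrets-tailscale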
systemd services
There’s not much to the systemd configuration in this case – I’m using environment variables to configure this service.
non-NixOS
If you use something other than NixOS (which you probably do), here is
an example systemd service file, which you can save as
/etc/systemd/system/tailscalesd.service:
[Unit]
After=tailscaled.service
Requires=tailscaled.service
Description=Tailscale service discovery for Prometheus
[Service]
# CHANGEME: point these at your secrets file and the tailscalesd binary
EnvironmentFile=/path/to/secrets-tailscale
ExecStart=/path/to/tailscalesd -localapi
Type=simple
# Some hardening, you _can_ omit this if you really want to
CapabilityBoundingSet=
DeviceAllow=/dev/null rw
DevicePolicy=strict
DynamicUser=true
LockPersonality=true
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
PrivateUsers=true
ProtectClock=true
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectProc=invisible
ProtectSystem=full
RemoveIPC=true
RestrictAddressFamilies=AF_INET
RestrictAddressFamilies=AF_INET6
RestrictAddressFamilies=AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
SystemCallArchitectures=native
SystemCallFilter=@system-service
SystemCallFilter=~@privileged
[Install]
WantedBy=multi-user.target
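After saving the unit, reload systemd, start the service, and check that it is running:

sudo systemctl daemon-reload
sudo systemctl enable --now tailscalesd.service
systemctl status tailscalesd.service
journalctl -u tailscalesd.service -f   # watch it poll the Tailscale API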
NixOS
I couldn’t find any existing systemd service for this, so here’s a somewhat basic setup:
{ pkgs, ... }:
{
  services.tailscale.enable = true;

  systemd.services.tailscalesd = {
    wantedBy = [ "multi-user.target" ];
    after = [ "tailscaled.service" ];
    requires = [ "tailscaled.service" ];
    description = "Tailscale service discovery for Prometheus";
    serviceConfig = {
      Type = "simple";
      DynamicUser = true;
      # A quoted string keeps the secrets file out of the world-readable Nix store
      EnvironmentFile = "/etc/nixos/secrets-tailscale";
      ExecStart = "${pkgs.tailscalesd}/bin/tailscalesd -localapi";
      # Hardening!
      CapabilityBoundingSet = [ "" ];
      DeviceAllow = [ "/dev/null rw" ];
      DevicePolicy = "strict";
      LockPersonality = true;
      MemoryDenyWriteExecute = true;
      NoNewPrivileges = true;
      PrivateDevices = true;
      PrivateTmp = true;
      PrivateUsers = true;
      ProtectClock = true;
      ProtectControlGroups = true;
      ProtectHome = true;
      ProtectHostname = true;
      ProtectKernelLogs = true;
      ProtectKernelModules = true;
      ProtectKernelTunables = true;
      ProtectProc = "invisible";
      ProtectSystem = "full";
      RemoveIPC = true;
      RestrictAddressFamilies = [
        "AF_INET"
        "AF_INET6"
        "AF_UNIX"
      ];
      RestrictNamespaces = true;
      RestrictRealtime = true;
      RestrictSUIDSGID = true;
      SystemCallArchitectures = "native";
      SystemCallFilter = [
        "@system-service"
        "~@privileged"
      ];
    };
  };
}
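Whichever variant you use, it is worth checking that the discovery endpoint answers before pointing Prometheus at it. The JSON shape in the comment below is only an illustration of Prometheus’ HTTP SD format with the __meta_tailscale_* labels used later; your targets and tags will differ:

# The endpoint should return a JSON array of target groups, roughly:
#   [{"targets": ["100.64.0.1"],
#     "labels": {"__meta_tailscale_device_hostname": "somehost",
#                "__meta_tailscale_device_name": "somehost.example-network.ts.net",
#                "__meta_tailscale_device_tag": "tag:monitor"}}]
curl -s http://localhost:9242/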
Prometheus configuration
This is not a complete configuration example, just the scrape configuration!
scrape_configs:
  - http_sd_configs:
      - url: http://localhost:9242/
    job_name: tailscale-node-exporter
    relabel_configs:
      # Check that tag:monitor is set, else discard the host!
      - action: keep
        regex: tag:monitor
        source_labels:
          - __meta_tailscale_device_tag
      # Do some relabelling
      - source_labels:
          - __meta_tailscale_device_hostname
        target_label: tailscale_hostname
      - source_labels:
          - __meta_tailscale_device_name
        target_label: tailscale_name
      - source_labels:
          - __meta_tailscale_device_name
        target_label: instance
      - regex: (.*)
        replacement: $1:9100
        source_labels:
          - __address__
        target_label: __address__
  # Another example - check if tag:chronyexporter is set, and scrape said
  # exporter if it is
  - http_sd_configs:
      - url: http://localhost:9242/
    job_name: tailscale-chrony-exporter
    relabel_configs:
      - action: keep
        regex: tag:chronyexporter
        source_labels:
          - __meta_tailscale_device_tag
      - source_labels:
          - __meta_tailscale_device_hostname
        target_label: tailscale_hostname
      - source_labels:
          - __meta_tailscale_device_name
        target_label: tailscale_name
      - source_labels:
          - __meta_tailscale_device_name
        target_label: instance
      - regex: (.*)
        replacement: $1:9123
        source_labels:
          - __address__
        target_label: __address__
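If you manage prometheus.yaml by hand, it does not hurt to validate it before reloading Prometheus (the path below is an assumption, adjust it to wherever your configuration lives):

# Validate the scrape configuration before reloading Prometheus
promtool check config /etc/prometheus/prometheus.yaml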
And the same configuration again, this time as a NixOS module:
{
  services.prometheus = {
    enable = true;
    scrapeConfigs = [
      {
        job_name = "tailscale-node-exporter";
        http_sd_configs = [ { url = "http://localhost:9242/"; } ];
        relabel_configs = [
          {
            source_labels = [ "__meta_tailscale_device_tag" ];
            regex = "tag:monitor";
            action = "keep";
          }
          {
            source_labels = [ "__meta_tailscale_device_hostname" ];
            target_label = "tailscale_hostname";
          }
          {
            source_labels = [ "__meta_tailscale_device_name" ];
            target_label = "tailscale_name";
          }
          {
            source_labels = [ "__meta_tailscale_device_name" ];
            target_label = "instance";
          }
          {
            source_labels = [ "__address__" ];
            regex = "(.*)";
            replacement = "$1:9100";
            target_label = "__address__";
          }
        ];
      }
      {
        job_name = "tailscale-chrony-exporter";
        http_sd_configs = [ { url = "http://localhost:9242/"; } ];
        relabel_configs = [
          {
            source_labels = [ "__meta_tailscale_device_tag" ];
            regex = "tag:chronyexporter";
            action = "keep";
          }
          {
            source_labels = [ "__meta_tailscale_device_hostname" ];
            target_label = "tailscale_hostname";
          }
          {
            source_labels = [ "__meta_tailscale_device_name" ];
            target_label = "tailscale_name";
          }
          {
            source_labels = [ "__meta_tailscale_device_name" ];
            target_label = "instance";
          }
          {
            source_labels = [ "__address__" ];
            regex = "(.*)";
            replacement = "$1:9123";
            target_label = "__address__";
          }
        ];
      }
    ];
  };
}
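After a rebuild (or restart of Prometheus), you can confirm that the discovered nodes actually show up as targets. This assumes Prometheus listens on its default port 9090 and that jq is installed:

# List active targets and their discovered labels via Prometheus' HTTP API
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[].labels'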
Bonus
You can also make Prometheus start after tailscalesd.service by
adding a drop-in override to your Prometheus unit:
# systemctl edit prometheus.service
[Unit]
After=tailscalesd.service
Requires=tailscalesd.service
In NixOS terms:
systemd.services.prometheus = {
  after = [ "tailscalesd.service" ];
  requires = [ "tailscalesd.service" ];
};
And while we are at it – we might as well scrape metrics for
tailscalesd! Add the following scrape definition to your
prometheus.yaml:
  - job_name: 'tailscalesd'
    static_configs:
      - targets: ['localhost:9242']
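The static target above assumes tailscalesd exposes its own metrics on the same port it serves discovery data from; a quick curl confirms it before you reload Prometheus:

# Should print tailscalesd's own metrics if the assumption holds
curl -s http://localhost:9242/metrics | head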