Prometheus & Tailscale service discovery
This post is part of the Homelab Monitoring 2025 series.
- Rebuilding my monitoring infrastructure
- Prometheus & Tailscale service discovery
In my last post on homelab monitoring I was still specifying hosts manually in my scrape configuration, which felt awkward given that Tailscale has an API we can query. Time to fix that!
Overview #
We will be using cfunkhouser/tailscalesd to poll the Tailscale API and feed Prometheus with nodes to scrape. I’ve not seen this packaged in any distribution other than NixOS yet, so you will probably have to install it yourself.
To make this work you have to:
- Source and install cfunkhouser/tailscalesd on your monitoring machine, to some location like /usr/local/bin (see the install sketch just below this list)
- Create a Tailscale API token, or better, an OAuth client with devices:core:read assigned
- Save the API token or OAuth secret to a file on your monitoring machine, with minimal permissions (chmod 0400)
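If your distribution doesn’t package it, building from source with the Go toolchain is the simplest route. A minimal sketch; the module path below is an assumption based on the repository name, so verify it against the project README:

# Build tailscalesd with the Go toolchain. The package path is an
# assumption; check the upstream README if it fails to resolve.
go install github.com/cfunkhouser/tailscalesd/cmd/tailscalesd@latest

# Put the binary somewhere systemd can find it.
sudo install -m 0755 "$(go env GOPATH)/bin/tailscalesd" /usr/local/bin/tailscalesd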
Once this is set up, tailscalesd will poll the Tailscale API every 5 minutes, and Prometheus will use tailscalesd to discover hosts to scrape. In the examples below we will use the same basic tag structure as in my previous post (tag:monitor) as a filter.
Secrets file #
Once you’ve issued an OAuth client or an API token, save it somewhere on your system in a file that looks like the one below (it can be owned by root). systemd will load this file as environment variables, hence the format.
TAILNET="example-network.ts.net"
# For OAuth clients
TAILSCALE_CLIENT_ID=""
TAILSCALE_CLIENT_SECRET="tskey-client-"
# For API token
TAILSCALE_API_TOKEN=""
Make sure to run something like chmod 0400 on this file, or use something like agenix to encrypt it!
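For example, a one-liner that installs the file readable by root only (the destination path is arbitrary, it just has to match the EnvironmentFile= path used further down):

# /etc/secrets-tailscale is only an example path
sudo install -o root -g root -m 0400 ./secrets-tailscale /etc/secrets-tailscale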
systemd services #
There’s not much to the systemd configuration in this case – I’m using environment variables to configure this service.
non-NixOS #
If you use something other than NixOS (which you probably do), here is an example systemd service file which you can save as /etc/systemd/system/tailscalesd.service:
[Unit]
After=tailscaled.service
Requires=tailscaled.service
Description=Tailscale service discovery for Prometheus
[Service]
# CHANGEME: adjust these two paths for your system
EnvironmentFile=/path/to/secrets-tailscale
ExecStart=/path/to/tailscalesd -localapi
Type=simple
# Some hardening, you _can_ omit this if you really want to
CapabilityBoundingSet=
DeviceAllow=/dev/null rw
DevicePolicy=strict
DynamicUser=true
LockPersonality=true
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
PrivateUsers=true
ProtectClock=true
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectProc=invisible
ProtectSystem=full
RemoveIPC=true
RestrictAddressFamilies=AF_INET
RestrictAddressFamilies=AF_INET6
RestrictAddressFamilies=AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
SystemCallArchitectures=native
SystemCallFilter=@system-service
SystemCallFilter=~@privileged
[Install]
WantedBy=multi-user.target
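After saving the unit, reload systemd, start the service, and check that the HTTP SD endpoint answers. The JSON in the comment below is only an illustration of the general target-list shape Prometheus expects from HTTP SD; the exact addresses and labels depend on your tailnet:

sudo systemctl daemon-reload
sudo systemctl enable --now tailscalesd.service

# tailscalesd answers on port 9242 with a standard Prometheus HTTP SD
# target list, roughly of this shape (values are illustrative):
# [{"targets":["100.64.0.1"],
#   "labels":{"__meta_tailscale_device_hostname":"nas","__meta_tailscale_device_tag":"tag:monitor"}}]
curl -s http://localhost:9242/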
NixOS #
I couldn’t find an existing NixOS module for this, so here’s a somewhat basic setup:
{ pkgs, ... }:
{
services.tailscale.enable = true;
systemd.services.tailscalesd = {
wantedBy = [ "multi-user.target" ];
after = [ "tailscaled.service" ];
requires = [ "tailscaled.service" ];
description = "Tailscale service discovery for Prometheus";
serviceConfig = {
Type = "simple";
DynamicUser = true;
EnvironmentFile = "/etc/nixos/secrets-tailscale"; # a string, not a path, so the secret is not copied into the Nix store
ExecStart = "${pkgs.tailscalesd}/bin/tailscalesd -localapi";
# Hardening!
CapabilityBoundingSet = [""];
DeviceAllow = [ "/dev/null rw" ];
DevicePolicy = "strict";
LockPersonality = true;
MemoryDenyWriteExecute = true;
NoNewPrivileges = true;
PrivateDevices = true;
PrivateTmp = true;
PrivateUsers = true;
ProtectClock = true;
ProtectControlGroups = true;
ProtectHome = true;
ProtectHostname = true;
ProtectKernelLogs = true;
ProtectKernelModules = true;
ProtectKernelTunables = true;
ProtectProc = "invisible";
ProtectSystem = "full";
RemoveIPC = true;
RestrictAddressFamilies = [
"AF_INET"
"AF_INET6"
"AF_UNIX"
];
RestrictNamespaces = true;
RestrictRealtime = true;
RestrictSUIDSGID = true;
SystemCallArchitectures = "native";
SystemCallFilter = [
"@system-service"
"~@privileged"
];
};
};
}
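On NixOS the equivalent of the manual steps above is a rebuild plus a quick sanity check that the unit came up and answers on its port:

sudo nixos-rebuild switch
systemctl status tailscalesd.service
curl -s http://localhost:9242/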
prometheus configuration #
This is not a complete configuration example, just the scrape configuration!
scrape_configs:
  - job_name: tailscale-node-exporter
    http_sd_configs:
      - url: http://localhost:9242/
    relabel_configs:
      # Check that tag:monitor is set, else discard the host!
      - source_labels:
          - __meta_tailscale_device_tag
        regex: tag:monitor
        action: keep
      # Do some relabelling
      - source_labels:
          - __meta_tailscale_device_hostname
        target_label: tailscale_hostname
      - source_labels:
          - __meta_tailscale_device_name
        target_label: tailscale_name
      - source_labels:
          - __meta_tailscale_device_name
        target_label: instance
      - source_labels:
          - __address__
        regex: (.*)
        replacement: $1:9100
        target_label: __address__
  # Another example: check if tag:chronyexporter is set, and scrape that
  # exporter if it is
  - job_name: tailscale-chrony-exporter
    http_sd_configs:
      - url: http://localhost:9242/
    relabel_configs:
      - source_labels:
          - __meta_tailscale_device_tag
        regex: tag:chronyexporter
        action: keep
      - source_labels:
          - __meta_tailscale_device_hostname
        target_label: tailscale_hostname
      - source_labels:
          - __meta_tailscale_device_name
        target_label: tailscale_name
      - source_labels:
          - __meta_tailscale_device_name
        target_label: instance
      - source_labels:
          - __address__
        regex: (.*)
        replacement: $1:9123
        target_label: __address__
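If you maintain the YAML by hand, it is worth running it through promtool before reloading Prometheus (the config path below is just an example):

# Validate the scrape configuration before reloading Prometheus
promtool check config /etc/prometheus/prometheus.yml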
And the same configuration again, this time as a NixOS module:
{
services.prometheus = {
enable = true;
scrapeConfigs = [
{
job_name = "tailscale-node-exporter";
http_sd_configs = [ { url = "http://localhost:9242/"; } ];
relabel_configs = [
{
source_labels = [ "__meta_tailscale_device_tag" ];
regex = "tag:monitor";
action = "keep";
}
{
source_labels = [ "__meta_tailscale_device_hostname" ];
target_label = "tailscale_hostname";
}
{
source_labels = [ "__meta_tailscale_device_name" ];
target_label = "tailscale_name";
}
{
source_labels = [ "__meta_tailscale_device_name" ];
target_label = "instance";
}
{
source_labels = [ "__address__" ];
regex = "(.*)";
replacement = "$1:9100";
target_label = "__address__";
}
];
}
{
job_name = "tailscale-chrony-exporter";
http_sd_configs = [ { url = "http://localhost:9242/"; } ];
relabel_configs = [
{
source_labels = [ "__meta_tailscale_device_tag" ];
regex = "tag:chronyexporter";
action = "keep";
}
{
source_labels = [ "__meta_tailscale_device_hostname" ];
target_label = "tailscale_hostname";
}
{
source_labels = [ "__meta_tailscale_device_name" ];
target_label = "tailscale_name";
}
{
source_labels = [ "__meta_tailscale_device_name" ];
target_label = "instance";
}
{
source_labels = [ "__address__" ];
regex = "(.*)";
replacement = "$1:9123";
target_label = "__address__";
}
];
}
];
};
}
Bonus #
You can also make Prometheus start after tailscalesd.service by editing your systemd unit file:
# systemctl edit prometheus.service
[Unit]
After=tailscalesd.service
Requires=tailscalesd.service
In NixOS terms:
systemd.services.prometheus = {
after = [ "tailscalesd.service" ];
requires = [ "tailscalesd.service" ];
};
And while we are at it – we might as well scrape metrics for tailscalesd! Add the following scrape definition to your prometheus.yaml:
- job_name: 'tailscalesd'
  static_configs:
    - targets: ['localhost:9242']