monotux.tech

Prometheus & Tailscale service discovery

Prometheus, Tailscale, Monitoring, systemd

This post is part of the Homelab Monitoring 2025 series.

  1. Rebuilding my monitoring infrastructure
  2. Prometheus & Tailscale service discovery

In my last post on homelab monitoring I was still manually specifying hosts in my scrape configuration, which felt awkward when Tailscale has an API we can query, so time to fix that!

Overview

We will be using cfunkhouser/tailscalesd for polling the Tailscale API, to feed Prometheus with nodes to scrape. I’ve not seen this packaged in any distribution but NixOS yet, so you probably have to install it yourself.

To make this work you have to:

  • Source and install cfunkhouser/tailscalesd on your monitoring machine (to some location like /usr/local/bin)
  • Create a Tailscale API token, or better, an OAuth client with the devices:core:read scope assigned [1]
  • Save the API token or OAuth secret to a file on your monitoring machine, with minimal permissions (chmod 0400)

Once this is set up, tailscalesd will poll the Tailscale API every 5 minutes, and Prometheus will use tailscalesd to discover hosts to scrape. In the example below we will be using the same basic tag structure as in my previous post (tag:monitor) as a filter.
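To make the data flow a bit more concrete: Prometheus' HTTP service discovery expects a JSON list of target groups, each with a `targets` list and a `labels` map. The sketch below parses such a document and lists what was discovered, which can be handy for checking tailscalesd before touching the Prometheus configuration. The sample payload, addresses and device names are made up for illustration, the port and label names are the ones used in the scrape configuration in this post, and the helper names are my own.

```python
import json
import urllib.request

# Hypothetical HTTP SD response; real tailscalesd output will differ.
SAMPLE = """
[
  {"targets": ["100.64.0.1"],
   "labels": {"__meta_tailscale_device_name": "nas.example-network.ts.net",
              "__meta_tailscale_device_tag": "tag:monitor"}},
  {"targets": ["100.64.0.2"],
   "labels": {"__meta_tailscale_device_name": "laptop.example-network.ts.net"}}
]
"""


def fetch(url="http://localhost:9242/"):
    """Fetch the raw discovery document (assumes tailscalesd listens on this URL)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def summarise(payload):
    """Return (address, device name, tag) for every discovered target."""
    rows = []
    for group in json.loads(payload):
        labels = group.get("labels", {})
        for target in group.get("targets", []):
            rows.append((
                target,
                labels.get("__meta_tailscale_device_name", ""),
                labels.get("__meta_tailscale_device_tag", ""),
            ))
    return rows


if __name__ == "__main__":
    # Swap SAMPLE for fetch() when running on the monitoring machine.
    for row in summarise(SAMPLE):
        print(row)
```

Targets without a tag show up with an empty tag field here, and those are the ones a keep rule on tag:monitor would discard.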

Secrets file

Once you’ve issued an OAuth client or an API token, save it to a file like the one below, somewhere on your system (it can be owned by root). systemd will load this file as environment variables, hence the format.

TAILNET="example-network.ts.net"

# For OAuth clients
TAILSCALE_CLIENT_ID=""
TAILSCALE_CLIENT_SECRET="tskey-client-"

# For API token
TAILSCALE_API_TOKEN=""

Make sure to run something like chmod 400 on this file, or use something like agenix to encrypt it!
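If you would rather script this step, here is a minimal sketch that creates the file with mode 0400 from the start, so the secret is never briefly readable by others. The helper name, the file name and the placeholder values are assumptions for illustration, and it assumes the values contain no double quotes.

```python
import os
import stat
import tempfile


def write_env_file(path, values):
    """Create `path` with mode 0400 and fill it with KEY="value" lines."""
    # O_EXCL refuses to overwrite an existing file; 0o400 applies from creation.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "w") as handle:
        for key, value in values.items():
            handle.write(f'{key}="{value}"\n')
    # Report the resulting permission bits so the caller can double-check them.
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Demonstration in a throwaway directory; use your real path instead.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "secrets-tailscale")
        mode = write_env_file(target, {
            "TAILNET": "example-network.ts.net",
            "TAILSCALE_API_TOKEN": "CHANGEME",  # placeholder, not a real token
        })
        print(oct(mode))
```

Run it as the user that should own the file (root is fine, since systemd reads the EnvironmentFile itself).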

systemd services

There’s not much to the systemd configuration in this case – I’m using environment variables to configure the service.

non-NixOS

If you use something other than NixOS (which you probably do), this is an example systemd service file which you can save as /etc/systemd/system/tailscalesd.service:

[Unit]
After=tailscaled.service
Requires=tailscaled.service
Description=Tailscale service discovery for Prometheus

[Service]
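# CHANGEME: path to the secrets file (systemd has no trailing comments, so keep these on their own lines)
EnvironmentFile=/path/to/secrets-tailscale
# CHANGEME: path to the tailscalesd binary
ExecStart=/path/to/tailscalesd -localapi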
Type=simple

# Some hardening, you _can_ omit this if you really want to
CapabilityBoundingSet=
DeviceAllow=/dev/null rw
DevicePolicy=strict
DynamicUser=true
LockPersonality=true
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
PrivateUsers=true
ProtectClock=true
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectProc=invisible
ProtectSystem=full
RemoveIPC=true
RestrictAddressFamilies=AF_INET
RestrictAddressFamilies=AF_INET6
RestrictAddressFamilies=AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
SystemCallArchitectures=native
SystemCallFilter=@system-service
SystemCallFilter=~@privileged

[Install]
WantedBy=multi-user.target

NixOS

I couldn’t find any existing systemd service for this, so here’s a somewhat basic [2] setup:

{ pkgs, ... }:
{
	services.tailscale.enable = true;

	systemd.services.tailscalesd = {
		wantedBy = [ "multi-user.target" ];
		after = [ "tailscaled.service" ];
		requires = [ "tailscaled.service" ];
		description = "Tailscale service discovery for Prometheus";
		serviceConfig = {
			Type = "simple";
			DynamicUser = true;
			# Quoted so it stays a plain string rather than a Nix path literal
			EnvironmentFile = "/etc/nixos/secrets-tailscale";
			ExecStart = "${pkgs.tailscalesd}/bin/tailscalesd -localapi";

			# Hardening!
			CapabilityBoundingSet = [""];
			DeviceAllow = [ "/dev/null rw" ];
			DevicePolicy = "strict";
			LockPersonality = true;
			MemoryDenyWriteExecute = true;
			NoNewPrivileges = true;
			PrivateDevices = true;
			PrivateTmp = true;
			PrivateUsers = true;
			ProtectClock = true;
			ProtectControlGroups = true;
			ProtectHome = true;
			ProtectHostname = true;
			ProtectKernelLogs = true;
			ProtectKernelModules = true;
			ProtectKernelTunables = true;
			ProtectProc = "invisible";
			ProtectSystem = "full";
			RemoveIPC = true;
			RestrictAddressFamilies = [
			  "AF_INET"
			  "AF_INET6"
			  "AF_UNIX"
			];
			RestrictNamespaces = true;
			RestrictRealtime = true;
			RestrictSUIDSGID = true;
			SystemCallArchitectures = "native";
			SystemCallFilter = [
			  "@system-service"
			  "~@privileged"
			];
		};
	};
}

Prometheus configuration

This is not a complete configuration example, just the scrape configuration!

scrape_configs:
- http_sd_configs:
  - url: http://localhost:9242/
  job_name: tailscale-node-exporter
  relabel_configs:
  # Check that tag:monitor is set, else discard the host!
  - action: keep
    regex: tag:monitor
    source_labels:
    - __meta_tailscale_device_tag
  # Do some relabelling
  - source_labels:
    - __meta_tailscale_device_hostname
    target_label: tailscale_hostname
  - source_labels:
    - __meta_tailscale_device_name
    target_label: tailscale_name
  - source_labels:
    - __meta_tailscale_device_name
    target_label: instance
  # Append the node exporter port to the discovered address
  - regex: (.*)
    replacement: $1:9100
    source_labels:
    - __address__
    target_label: __address__

# Another example - check if tag:chronyexporter is set, and scrape said
# exporter if it is
- http_sd_configs:
  - url: http://localhost:9242/
  job_name: tailscale-chrony-exporter
  relabel_configs:
  - action: keep
    regex: tag:chronyexporter
    source_labels:
    - __meta_tailscale_device_tag
  - source_labels:
    - __meta_tailscale_device_hostname
    target_label: tailscale_hostname
  - source_labels:
    - __meta_tailscale_device_name
    target_label: tailscale_name
  - source_labels:
    - __meta_tailscale_device_name
    target_label: instance
  # Append the chrony exporter port to the discovered address
  - regex: (.*)
    replacement: $1:9123
    source_labels:
    - __address__
    target_label: __address__

And the same configuration again, this time as a NixOS module:

{
  services.prometheus = {
	enable = true;
	scrapeConfigs = [
	  {
		job_name = "tailscale-node-exporter";
		http_sd_configs = [ { url = "http://localhost:9242/"; } ];
		relabel_configs = [
		  {
			source_labels = [ "__meta_tailscale_device_tag" ];
			regex = "tag:monitor";
			action = "keep";
		  }
		  {
			source_labels = [ "__meta_tailscale_device_hostname" ];
			target_label = "tailscale_hostname";
		  }
		  {
			source_labels = [ "__meta_tailscale_device_name" ];
			target_label = "tailscale_name";
		  }
		  {
			source_labels = [ "__meta_tailscale_device_name" ];
			target_label = "instance";
		  }
		  {
			source_labels = [ "__address__" ];
			regex = "(.*)";
			replacement = "$1:9100";
			target_label = "__address__";
		  }
		];
	  }
	  {
		job_name = "tailscale-chrony-exporter";
		http_sd_configs = [ { url = "http://localhost:9242/"; } ];
		relabel_configs = [
		  {
			source_labels = [ "__meta_tailscale_device_tag" ];
			regex = "tag:chronyexporter";
			action = "keep";
		  }
		  {
			source_labels = [ "__meta_tailscale_device_hostname" ];
			target_label = "tailscale_hostname";
		  }
		  {
			source_labels = [ "__meta_tailscale_device_name" ];
			target_label = "tailscale_name";
		  }
		  {
			source_labels = [ "__meta_tailscale_device_name" ];
			target_label = "instance";
		  }
		  {
			source_labels = [ "__address__" ];
			regex = "(.*)";
			replacement = "$1:9123";
			target_label = "__address__";
		  }
		];
	  }
	];
  };
}
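To get a feel for what the relabel rules in the YAML version do to a single discovered target, here is a rough Python imitation of them. It is only an illustration of the keep, copy and port-append steps, not how Prometheus implements relabelling; the function name and the sample labels are made up.

```python
import re


def relabel(labels, required_tag, port):
    """Mimic the relabel rules for one target; return None if it is dropped."""
    # action: keep -> Prometheus anchors the regex, hence fullmatch.
    tag = labels.get("__meta_tailscale_device_tag", "")
    if not re.fullmatch(required_tag, tag):
        return None

    # Prometheus discards __meta_* labels once relabelling is done.
    out = {k: v for k, v in labels.items() if not k.startswith("__meta_")}

    # Copy the discovery metadata into labels that survive relabelling.
    out["tailscale_hostname"] = labels.get("__meta_tailscale_device_hostname", "")
    out["tailscale_name"] = labels.get("__meta_tailscale_device_name", "")
    out["instance"] = out["tailscale_name"]

    # regex (.*) with replacement $1:<port> appends the exporter port.
    match = re.fullmatch(r"(.*)", labels["__address__"])
    out["__address__"] = f"{match.group(1)}:{port}"
    return out


if __name__ == "__main__":
    # Hypothetical target as it might arrive from service discovery.
    sample = {
        "__address__": "100.64.0.1",
        "__meta_tailscale_device_hostname": "nas",
        "__meta_tailscale_device_name": "nas.example-network.ts.net",
        "__meta_tailscale_device_tag": "tag:monitor",
    }
    print(relabel(sample, "tag:monitor", 9100))
    print(relabel(sample, "tag:chronyexporter", 9123))
```

Because the regex is anchored, a host tagged with something like tag:monitoring would not match tag:monitor and would be dropped as well.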

Bonus

You can also make Prometheus start after tailscalesd.service by adding a drop-in to its systemd unit:

# systemctl edit prometheus.service
[Unit]
After=tailscalesd.service
Requires=tailscalesd.service

In NixOS terms:

systemd.services.prometheus = {
  after = [ "tailscalesd.service" ];
  requires = [ "tailscalesd.service" ];
};

And while we are at it – we might as well scrape metrics for tailscalesd! Add the following scrape definition to your prometheus.yaml:

  - job_name: 'tailscalesd'
    static_configs:
      - targets: ['localhost:9242']

  1. Settings → OAuth clients → Generate OAuth client → Devices → Core → assign Read

  2. You can omit the hardening bits