
Overengineering Local DNS: High-Performance DNS Chain

Stanislav Cherkasov

DNS is one of those services you only notice when it breaks. In a serious homelab, I want more than “it works”:

  • Network-wide filtering (ads/trackers/malware) without touching every device
  • Split-horizon / authoritative zones for internal services
  • Fast resolution under load (low latency + high QPS)
  • Autonomy when upstreams or the WAN get flaky
  • Security controls (encrypted upstreams + DNSSEC validation)
  • Repeatability: the whole thing is deployed, validated, and re-deployed via Ansible

So I built a DNS Chain. Overengineered on purpose.


What this post matches

This post reflects my current Ansible role and host layout:

  • dnsdist listens on :53 and load-balances into a backend pool
  • pihole runs as N containers on 127.0.0.1:9991–999N
    • in my lab: N = (CPU cores − 1), which currently equals 7
  • bind9 listens on 127.0.0.1:1053
  • unbound listens on 127.0.0.1:2054
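
Because each layer listens on its own port, every hop can be queried directly when something misbehaves. These are just example queries against the layout above:

# Walk the chain hop by hop
dig @127.0.0.1 -p 53   example.com +short   # dnsdist (front door)
dig @127.0.0.1 -p 9991 example.com +short   # first Pi-hole instance
dig @127.0.0.1 -p 1053 example.com +short   # Bind9
dig @127.0.0.1 -p 2054 example.com +short   # Unbound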

Architecture

High-level flow

flowchart LR
  C["Clients (LAN/VPN)"] -->|"UDP/TCP 53"| D["dnsdist :53
LB + health checks + packet cache"] D --> P["Pi-hole pool xN
127.0.0.1:9991-999N
blocking + local cache"] P --> B["Bind9
127.0.0.1:1053
authoritative zones / split-horizon"] B --> U["Unbound
127.0.0.1:2054
recursive cache + DoT + DNSSEC validation*"] U -->|"TLS 853"| Up[("Cloudflare / Quad9 / Google")]

* DNSSEC validation is intended to happen in Unbound (more below). I also include a concrete test so I can prove it’s actually enabled.


Request flow (the big “back-and-forth” diagram)

This is the single diagram I use when I’m debugging. If you can mentally simulate this flow, you can usually pinpoint where things went wrong in under a minute.

sequenceDiagram
  participant C as Client
  participant D as dnsdist :53
  participant P as Pi-hole (pool)
  participant B as Bind9 :1053
  participant U as Unbound :2054
  participant O as Upstream DoT :853

  C->>D: Query A/AAAA
  alt dnsdist packet-cache HIT
    D-->>C: Answer (cache)
  else MISS
    D->>P: forward (LB + health check)
    alt Blocked by Pi-hole policy
      P-->>D: Blocking answer (NXDOMAIN/0.0.0.0)
      D-->>C: blocked
    else Allowed
      P->>B: forward
      alt Internal zone hit
        B-->>P: authoritative answer
      else External domain
        B->>U: forward
        U->>O: DoT + (DNSSEC validate)
        O-->>U: response
        U-->>B: response
        B-->>P: response
      end
      P-->>D: response
      D-->>C: response
    end
  end

Why this exact order

1) dnsdist: a “front door” that stays fast under load

dnsdist earns its place by doing three things well:

  • Load balancing across multiple Pi-hole backends
  • Health checks (unhealthy backends are automatically avoided)
  • Packet cache for the hottest queries (answer without touching downstream layers)

It also keeps client configuration simple: clients always use one DNS IP (this host) on port 53, no matter what the backend pool is doing.
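
A minimal dnsdist.conf sketch of this front door could look like the following; the networks, pool size, and cache numbers are illustrative rather than my exact production values:

-- LAN/VPN-only ACL, Pi-hole backend pool, packet cache
setACL({"127.0.0.0/8", "10.0.0.0/8", "192.168.0.0/16"})
setLocal("0.0.0.0:53")

-- one backend per Pi-hole instance on 127.0.0.1:9991..999N
for i = 1, 7 do
  newServer({address = "127.0.0.1:" .. (9990 + i), checkName = "example.com."})
end

-- answer the hottest repeat queries straight from the packet cache
pc = newPacketCache(100000, {maxTTL = 86400})
getPool(""):setCache(pc)

-- prefer the backend with the fewest outstanding queries
setServerPolicy(leastOutstanding)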


2) Pi-hole xN: filtering at the edge, scaled horizontally

Pi-hole is a convenient “policy layer” for the whole network. I run multiple instances because a pool provides:

  • Isolation: one container restart does not nuke the whole service
  • Throughput headroom: load spreads across instances
  • Operational flexibility: different lists/behavior can be tested on a subset (if desired)

Implementation detail: containers bind to distinct loopback ports (127.0.0.1:9991–999N), and dnsdist distributes traffic.
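
A sketch of that binding pattern as an Ansible task is below; pihole_instances and the image tag are placeholders, and the volume/upstream wiring that points each instance at Bind9 is omitted:

- name: run pi-hole instances on distinct loopback ports
  community.docker.docker_container:
    name: "pihole-{{ item }}"
    image: "pihole/pihole:latest"
    state: started
    restart_policy: unless-stopped
    published_ports:
      - "127.0.0.1:999{{ item }}:53/udp"
      - "127.0.0.1:999{{ item }}:53/tcp"
  loop: "{{ range(1, pihole_instances + 1) | list }}"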


3) Bind9: authoritative split-horizon zones

Bind9 is where my internal universe lives:

  • authoritative zones (e.g., lab.internal)
  • internal records for services (git.lab.internal, wiki.lab.internal, etc.)
  • optional split-horizon logic (internal view vs external)

If Bind9 can answer from an authoritative zone, it replies immediately. Otherwise, it forwards queries for “the internet” further down the chain.
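
The shape of that in named.conf is roughly the following (zone names and paths are examples):

// named.conf excerpt (sketch)
options {
    listen-on port 1053 { 127.0.0.1; };
    recursion yes;
    forward only;
    forwarders { 127.0.0.1 port 2054; };   // non-authoritative queries go to Unbound
    dnssec-validation no;                  // validation is Unbound's job in this chain
};

zone "lab.internal" IN {
    type master;
    file "/etc/bind/zones/db.lab.internal";
};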


4) Unbound: recursive engine, cache, and upstream security

Unbound is my “last hop” for external domains:

  • large recursive cache (and aggressive performance tuning)
  • DoT (DNS-over-TLS) to upstream providers
  • resilience features like serve-expired (use cached records during upstream turbulence)
  • a sensible place to enforce DNSSEC validation in one component
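
An unbound.conf excerpt in that spirit; thread counts, cache sizes, and upstream picks are placeholders rather than my exact tuning:

server:
    interface: 127.0.0.1
    port: 2054
    num-threads: 4                      # scaled to CPU count
    msg-cache-size: 256m
    rrset-cache-size: 512m
    prefetch: yes
    serve-expired: yes
    auto-trust-anchor-file: "/var/lib/unbound/root.key"       # DNSSEC validation
    tls-cert-bundle: "/etc/ssl/certs/ca-certificates.crt"     # verify DoT upstreams

forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 9.9.9.9@853#dns.quad9.net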

Caching strategy: layered on purpose

Yes, there is caching at multiple layers. That is intentional.

  • dnsdist packet cache: fastest possible responses for repeat queries
  • Pi-hole cache: local caching close to the policy decision (block/allow)
  • Bind9: instant answers for internal authoritative zones + cache for forwarded lookups
  • Unbound: the heavy recursive cache + prefetch + serve-expired

The practical result: most “normal browsing” queries become very low latency once the caches are warm, and the system stays stable under bursts.
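
An easy way to watch the layering pay off: run the same query twice against dnsdist and compare the reported query time.

dig @127.0.0.1 example.com | grep "Query time"   # cold: upstream latency
dig @127.0.0.1 example.com | grep "Query time"   # warm: typically 0-1 ms from a cache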


Autonomy: when upstreams fail

Homelabs are where networking experiments happen: firewall restarts, VPN changes, routing updates, ISP hiccups.

Unbound can be tuned to keep things usable via serve-expired and prefetching. The goal isn’t perfection; it’s graceful degradation: internal services keep resolving, and external browsing is less likely to collapse immediately.
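
Concretely, that means Unbound's prefetch and serve-expired family; the values below are illustrative and depend on how much staleness you tolerate:

server:
    prefetch: yes                        # refresh popular records before they expire
    serve-expired: yes
    serve-expired-ttl: 86400             # serve stale data for up to a day if upstreams are down
    serve-expired-client-timeout: 1800   # ms to wait for a fresh answer before falling back to stale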


DNSSEC: security goal, concrete test

If I say “DNSSEC”, I want it to be verifiable.

Where it should happen: in Unbound (single enforcement point).

How I prove it: I test a known-bad DNSSEC domain. If validation is on, the resolver should return SERVFAIL.

# Should resolve (often signed)
dig @127.0.0.1 -p 2054 cloudflare.com +dnssec

# Should SERVFAIL when DNSSEC validation is actually enabled
dig @127.0.0.1 -p 2054 dnssec-failed.org +dnssec

If this does not SERVFAIL, DNSSEC validation is not actually being enforced (or you are not testing the right resolver/port).


Security warning: do not become an open resolver

Two rules I consider non-negotiable:

  1. Restrict who can query you. Enforce LAN/VPN-only access with firewall rules and/or dnsdist ACLs.
  2. Never expose this to the public internet. A publicly reachable recursive resolver will be abused.

I treat dnsdist ACLs and host firewall policy as part of “the design”, not an afterthought.
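
As an illustration of rule 1, the host firewall can limit port 53 to known networks; a ufw-flavored sketch with example ranges (the dnsdist side is the setACL() line shown earlier):

# allow DNS only from LAN and VPN ranges, refuse everything else
ufw allow from 192.168.1.0/24 to any port 53
ufw allow from 10.8.0.0/24 to any port 53
ufw deny 53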


Performance tuning (kernel + service knobs)

I tune the host because high-QPS DNS is mostly “fast UDP + lots of sockets”, and defaults are designed for general-purpose servers.

Example sysctl groups from my role:

  • socket buffers (UDP/TCP)
  • backlog limits
  • TCP reuse/timeouts (important for TLS upstreams)

The Ansible task that applies them:

- name: tune sysctl for dns
  ansible.posix.sysctl:
    sysctl_file: /etc/sysctl.d/9999-ansible-dns.conf
    name:       "{{ item.name }}"
    value:      "{{ item.value }}"
    sysctl_set: yes
  loop:
    - { name: "net.core.rmem_max", value: "4194304" }
    - { name: "net.core.wmem_max", value: "4194304" }
    - { name: "net.core.somaxconn", value: "65535" }
    - { name: "net.ipv4.tcp_tw_reuse", value: "1" }

On the service side:

  • Bind9 runs with multiple worker threads
  • Unbound scales num-threads by CPU
  • dnsdist is configured for caching and backend distribution
  • Pi-hole instances are isolated and can be pinned with cpusets

Ansible as a contract: deploy and verify

I do not trust a DNS deploy that does not validate itself.

My role runs sanity checks for the configs and then performs live resolution tests with retries. If resolution fails, the role fails immediately.
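
For the config side, each daemon ships its own validator; a minimal sketch of how those checks can look:

- name: validate resolver configs before reloading anything
  ansible.builtin.command: "{{ item }}"
  changed_when: false
  loop:
    - named-checkconf
    - unbound-checkconf
    - dnsdist --check-config

The live resolution test is then a simple retry loop: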

- name: Attempt DNS resolution
  command: "dig @{{ resolution_host }} -p {{ resolution_port }} google.com +short"
  register: result
  until: result.rc == 0
  retries: 5
  delay: 10

This turns “I think I deployed DNS” into “I can prove it works end-to-end”.


Why I keep it this way

This chain gives me:

  • Network-wide ad/tracker blocking across every device (including IoT)
  • Internal naming that feels like a real environment (authoritative zones, split-horizon)
  • Fast hot-path DNS with multiple caching layers
  • Resilience when upstreams or WAN connectivity are not perfect
  • A safe lab platform for experiments with resolvers, upstream providers, and policy
  • Automation and reproducibility: rebuildable from scratch using Ansible

Yes, it’s overengineering. But it’s the kind that buys me what I actually care about: autonomy, security, and speed in a homelab environment.

