fix: route buildkit dns through host's systemd-resolved cache #63
@codesmith fix ci

I'm now helping with this PR. Give me a moment to analyze the code and I'll respond shortly.
Pushed the fix. All 4 CI failures (Build and Test, Lint, Verify Build Output, Test Action) were caused by the same issue: Removed the |
With this change, DNS cache utilization went from 2% to 99% on the parallel pnpm stress test. Before: https://github.com/useblacksmith/test-workflows/actions/runs/21959552170/job/63433002178?pr=20#step:8:101
BuildKit falls back to hardcoded public DNS (8.8.8.8/8.8.4.4) when the host uses systemd-resolved (127.0.0.53), because that loopback address is unreachable from build containers in their own network namespace. This causes DNS cache misses and EAI_AGAIN/ETIMEOUT errors during parallel builds that make many DNS queries (e.g. pnpm install).

Fix:
1. Configure systemd-resolved to listen on all interfaces (0.0.0.0) via a drop-in config, not just loopback
2. Resolve the host's routable IP at startup and inject it into buildkitd.toml's [dns] nameservers
3. Keep public DNS as fallback if the host IP can't be determined

This makes the host's DNS cache reachable from build containers on any network mode (host, bridge, custom), matching Docker's built-in behavior for the docker-container driver.

See: moby/buildkit#5009
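For reviewers, step 2 of the fix above could look roughly like the sketch below. It only shows how a source address might be pulled out of `ip route get` output. The function name `parseRouteSrc` and the sample output line are assumptions for illustration and are not taken from the PR's code.

```typescript
// Hypothetical helper, not the PR's actual implementation.
// `ip route get <addr>` prints a line such as:
//   "1.1.1.1 via 10.0.0.1 dev eth0 src 10.0.0.5 uid 0"
// where the value after "src" is the address the host would use to reach <addr>.
function parseRouteSrc(output: string): string | null {
  // Match an IPv4 address following the "src" keyword.
  const match = /\bsrc\s+(\d{1,3}(?:\.\d{1,3}){3})\b/.exec(output);
  return match?.[1] ?? null;
}

// Assumed sample output; real output varies by host and routing table.
const sampleOutput = "1.1.1.1 via 10.0.0.1 dev eth0 src 10.0.0.5 uid 0 \n    cache";
console.log(parseRouteSrc(sampleOutput)); // 10.0.0.5
```

Returning `null` when no `src` field is present leaves the fallback decision to the caller.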
BuildKit round-robins across all nameservers rather than using them as ordered fallbacks. Including public DNS alongside the host IP caused ~50% of queries to bypass the local cache. Now only the host's routable IP (backed by systemd-resolved) is used. systemd-resolved itself already handles upstream fallback to external resolvers when needed.
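To make the ~50% figure concrete, here is a toy model that takes the round-robin description above at face value. It is not BuildKit's resolver code. The function name `shareServedBy` and the addresses are assumptions chosen for illustration.

```typescript
// Illustrative model only: strict round-robin over the configured nameserver list.
// Returns the fraction of queries that land on `target`.
function shareServedBy(nameservers: string[], target: string, queries: number): number {
  let hits = 0;
  for (let i = 0; i < queries; i++) {
    // Query i goes to the next nameserver in rotation.
    if (nameservers[i % nameservers.length] === target) hits++;
  }
  return hits / queries;
}

const hostIp = "10.0.0.5"; // assumed host address backed by systemd-resolved

// Host IP listed next to one public resolver: half the queries skip the cache.
console.log(shareServedBy([hostIp, "8.8.8.8"], hostIp, 1000)); // 0.5

// Host IP only: every query reaches the local cache.
console.log(shareServedBy([hostIp], hostIp, 1000)); // 1
```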
Regenerate dist artifacts with ncc after rebasing onto origin/main.
81a3da4 to d633740
Summary
- BuildKit used hardcoded public DNS (8.8.8.8, 1.1.1.1, etc.) in buildkitd.toml, bypassing the host's systemd-resolved DNS cache
- This caused EAI_AGAIN/ETIMEOUT errors during parallel Docker builds with heavy DNS traffic (e.g. pnpm install in monorepos)
- Configures systemd-resolved to listen on the host's routable IP (not just loopback 127.0.0.53), then injects that IP into buildkitd.toml's [dns] section
- Matches Docker's built-in behavior for the docker-container driver

See: moby/buildkit#5009
Made with Cursor
Note
Medium Risk
Touches host-level DNS configuration and BuildKit daemon startup behavior; misconfiguration could disrupt DNS resolution on runners or in builds.
Overview
BuildKit DNS is reworked to prefer the host’s systemd-resolved cache instead of hardcoded public resolvers. startBuildkitd now (1) installs a systemd-resolved drop-in to listen on a routable interface (DNSStubListenerExtra=0.0.0.0) and restarts the service, and (2) writes buildkitd.toml with a [dns] nameservers list set to the host’s primary routable IP (derived via ip route get), falling back to public DNS only when the host IP can’t be determined.

Written by Cursor Bugbot for commit 5d82af7.
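A minimal sketch of the two files described in the overview, assuming the behavior is as summarized there. All function names and the exact fallback resolver list are hypothetical. Only `DNSStubListenerExtra=0.0.0.0` and the `[dns]` `nameservers` key come from the text above.

```typescript
// Hypothetical helpers; the PR's startBuildkitd may be structured differently.

// Assumed fallback list; the PR text mentions public resolvers such as 8.8.8.8.
const PUBLIC_DNS_FALLBACK = ["8.8.8.8", "8.8.4.4"];

// Use only the host IP when it is known, so no query bypasses the local cache.
function chooseNameservers(hostIp: string | null): string[] {
  return hostIp ? [hostIp] : [...PUBLIC_DNS_FALLBACK];
}

// Body of a systemd-resolved drop-in that adds a stub listener on all interfaces.
function renderResolvedDropIn(): string {
  return ["[Resolve]", "DNSStubListenerExtra=0.0.0.0", ""].join("\n");
}

// The [dns] section of buildkitd.toml for the chosen nameservers.
function renderDnsSection(nameservers: string[]): string {
  const list = nameservers.map((ns) => JSON.stringify(ns)).join(", ");
  return `[dns]\n  nameservers = [${list}]\n`;
}

console.log(renderResolvedDropIn());
console.log(renderDnsSection(chooseNameservers("10.0.0.5")));
```

Keeping the renderers as pure string functions means they can be unit-tested without touching the runner's DNS configuration; writing the files and restarting systemd-resolved would happen in a separate step.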