SERVFAIL high

Resolver returns SERVFAIL (Server Failure)

The recursive resolver couldn't get a usable answer and gave up, returning status SERVFAIL instead of an address.

What you see

;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41522
;; ANSWER SECTION:
(empty)

What’s actually happening

dig or nslookup comes back with status: SERVFAIL and no records. Unlike NXDOMAIN (name doesn't exist), SERVFAIL means the resolver tried and failed: a timeout, a broken upstream, or a validation error. The same name may resolve on one resolver and SERVFAIL on another, which often points to DNSSEC trouble, because a validating resolver rejects an answer that a non-validating one passes through. Browsers show a generic 'site can't be reached' with no clean error code.
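To make the SERVFAIL vs. NXDOMAIN distinction concrete, here is a minimal Python sketch that decodes the status from the fixed 12-byte DNS header, where the response code is the low 4 bits of the flags word. The function name parse_header and the hand-built sample bytes are illustrative assumptions, not output captured from a real resolver.

```python
import struct

# Subset of DNS response codes (RFC 1035).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def parse_header(message: bytes) -> dict:
    """Decode id, status and answer count from a raw DNS message."""
    if len(message) < 12:
        raise ValueError("DNS message is shorter than the 12-byte header")
    # Six big-endian 16-bit fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    ident, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", message[:12])
    rcode = flags & 0x000F  # RCODE lives in the low 4 bits of the flags
    return {
        "id": ident,
        "status": RCODES.get(rcode, f"RCODE{rcode}"),
        "answers": ancount,
    }


# Hypothetical header: id 41522, QR=1 RD=1 RA=1, RCODE=2, one question, no answers.
sample = struct.pack("!6H", 41522, 0x8182, 1, 0, 0, 0)
print(parse_header(sample))
# {'id': 41522, 'status': 'SERVFAIL', 'answers': 0}
```

An RCODE of 3 in the same position would decode as NXDOMAIN, which is a definite answer from the DNS rather than a failure to get one.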

Common causes

  • All authoritative nameservers for the zone are down, unreachable, or refusing queries
  • Lame delegation: the parent points to nameservers that aren't actually authoritative for the zone
  • DNSSEC validation failure on the zone (expired RRSIG, bad DS) so the validating resolver refuses the answer
  • Resolver can't reach the auth servers over the network (firewall blocking UDP/TCP 53, EDNS/fragmentation issues)
  • Overloaded or misconfigured recursive resolver timing out before it gets a response
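For the lame-delegation cause, the first check is a plain set comparison between the NS names the parent hands out and the NS names the zone itself serves. The sketch below assumes you have already collected both lists (for example by hand from dig output). The helper names and ns-old.yourdns.com are hypothetical. A mismatch is only a hint: each name that appears only at the parent still needs to be queried directly to see whether it answers authoritatively.

```python
def normalize(name: str) -> str:
    """Lowercase and drop the trailing dot so 'NS1.Example.com.' matches 'ns1.example.com'."""
    return name.strip().rstrip(".").lower()


def compare_delegation(parent_ns, child_ns) -> dict:
    """Report which NS names differ between the parent and the zone."""
    parent = {normalize(n) for n in parent_ns}
    child = {normalize(n) for n in child_ns}
    return {
        "only_at_parent": sorted(parent - child),  # candidates for lame servers
        "only_in_zone": sorted(child - parent),    # served by the zone, not delegated
        "consistent": parent == child,
    }


# Hypothetical NS sets: the parent still lists a retired server.
report = compare_delegation(
    ["NS1.yourdns.com.", "ns-old.yourdns.com."],
    ["ns1.yourdns.com.", "ns2.yourdns.com."],
)
print(report)
```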

How to fix it

  1. Isolate resolver vs. authoritative: Query a public resolver and the authoritative servers directly: dig @1.1.1.1 example.com then dig @ns1.yourdns.com example.com. If the auth server answers but 1.1.1.1 SERVFAILs, the problem is validation or reachability, not your records.
  2. Test whether DNSSEC is the cause: Run dig +cd example.com (CD = checking disabled). If +cd returns the answer and a normal query SERVFAILs, it's a DNSSEC validation failure. Fix the signatures or the DS record (see related entries).
  3. Confirm the delegation is sane: Compare the NS set at the parent (dig NS example.com @a.gtld-servers.net) against the NS records the zone serves. Mismatched or non-authoritative nameservers cause lame delegation. Run the zone through Zonemaster or dnsviz.net for a full report.
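  4. Check authoritative server health and reachability: Verify every listed nameserver answers on UDP and TCP 53. Test EDNS and TCP against each one: dig +bufsize=1232 @ns1.yourdns.com example.com and dig +tcp @ns1.yourdns.com example.com. A firewall dropping large UDP responses produces intermittent SERVFAIL, typically for large answers such as DNSSEC-signed responses.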
  5. Flush and retest after fixing: Resolvers cache SERVFAIL briefly (seconds to minutes; RFC 2308 caps it at 5 minutes). After correcting the zone, clear your local cache and wait for the resolver's cached failure to expire before declaring it broken again.
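The decision logic in steps 1 and 2 can be sketched as a small Python function. This is a heuristic under stated assumptions, not a definitive diagnosis: the function name triage, the "TIMEOUT" marker, and the wording of the returned hints are all made up for illustration, and the three statuses are expected to be read by hand from the dig output for a plain query, a +cd query, and a direct query to the authoritative server.

```python
def triage(recursive: str, recursive_cd: str, authoritative: str) -> str:
    """Map three dig statuses to the most likely cause.

    recursive      status of a plain query to the resolver
    recursive_cd   status of the same query with +cd (checking disabled)
    authoritative  status when asking the zone's nameserver directly,
                   or "TIMEOUT" if it never answered (hypothetical marker)
    """
    if recursive != "SERVFAIL":
        return "no SERVFAIL: the resolver is answering"
    if recursive_cd == "NOERROR":
        # The answer only appears when validation is switched off.
        return "DNSSEC validation failure: check RRSIG expiry and the DS record"
    if authoritative == "NOERROR":
        # The zone answers you, but the resolver still cannot use it.
        return "resolver-side reachability: check firewall, EDNS and TCP 53"
    return "authoritative side failing: check server health and delegation"


print(triage("SERVFAIL", "NOERROR", "NOERROR"))
print(triage("SERVFAIL", "SERVFAIL", "TIMEOUT"))
```

The order of the checks matters: the +cd result is tested first because it isolates validation from every other cause in a single query.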

Stop it recurring

Run at least two nameservers on independent networks, and monitor both the servers and DNSSEC signature expiry, so that a single failure doesn't take the whole zone to SERVFAIL.
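As one way to monitor signature expiry, the sketch below reads the expiration timestamp out of an RRSIG record in presentation format (as printed by dig +dnssec) and reports the days remaining. The expiration is the fifth field after the RRSIG token, written as YYYYMMDDHHmmSS in UTC (RFC 4034). The sample record, its key tag and its signature are invented for illustration, and the function names are hypothetical. Choosing an alert threshold is left to you.

```python
from datetime import datetime, timezone


def rrsig_expiry(line: str) -> datetime:
    """Return the signature expiration time from one RRSIG line."""
    fields = line.split()
    i = fields.index("RRSIG")
    # After RRSIG: type covered, algorithm, labels, original TTL, expiration, ...
    stamp = fields[i + 5]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def days_left(line: str, now=None) -> float:
    """Days until the signature expires (negative if already expired)."""
    now = now or datetime.now(timezone.utc)
    return (rrsig_expiry(line) - now).total_seconds() / 86400


# Invented sample record; a real one comes from `dig +dnssec example.com`.
SAMPLE = ("example.com. 3600 IN RRSIG A 13 2 3600 "
          "20240301000000 20240201000000 12345 example.com. c2lnbmF0dXJl")

print(days_left(SAMPLE, now=datetime(2024, 2, 20, tzinfo=timezone.utc)))
# 10.0
```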

Related errors