When DNS problems are subtle, basic lookup commands can look normal while clients still fail. This is the sequence I use.

1) Check authoritative vs recursive answers

  • Query authoritative servers directly.
  • Query the recursive resolver used by clients.
  • Compare record values, TTLs, and response flags such as AA and RA.

A mismatch here often explains "works for me" reports across network segments.
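A minimal sketch of how I'd script this comparison with only the Python standard library. It assumes plain UDP on port 53 and decodes only A records (anything else is kept as hex); the function names and the server addresses in the usage note are placeholders, not part of any real tool.

```python
import socket
import struct

def build_query(name, qtype=1, recurse=True, query_id=0x1234):
    """Build a minimal DNS query packet (class IN; qtype 1 = A)."""
    flags = 0x0100 if recurse else 0  # RD bit: ask the server to recurse
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def _skip_name(msg, offset):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:   # a compression pointer is two bytes
            return offset + 2
        if length == 0:             # the root label ends the name
            return offset + 1
        offset += 1 + length

def parse_response(msg):
    """Extract the AA/RA flags, RCODE, and answer records from a response."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(msg, offset) + 4   # skip QTYPE and QCLASS
    answers = []
    for _ in range(ancount):
        offset = _skip_name(msg, offset)
        rtype, _cls, ttl, rdlen = struct.unpack("!HHIH", msg[offset:offset + 10])
        rdata = msg[offset + 10:offset + 10 + rdlen]
        offset += 10 + rdlen
        value = socket.inet_ntoa(rdata) if rtype == 1 else rdata.hex()
        answers.append((rtype, ttl, value))
    return {"aa": bool(flags & 0x0400), "ra": bool(flags & 0x0080),
            "rcode": flags & 0x000F, "answers": sorted(answers)}

def query(server, name, recurse=True, timeout=3.0):
    """Send one UDP query to `server` and return the parsed response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, recurse=recurse), (server, 53))
        data, _ = sock.recvfrom(4096)
    return parse_response(data)

def compare(auth, rec):
    """List the differences between an authoritative and a recursive answer."""
    problems = []
    auth_values = {(t, v) for t, _ttl, v in auth["answers"]}
    rec_values = {(t, v) for t, _ttl, v in rec["answers"]}
    if auth_values != rec_values:
        problems.append(f"values differ: auth={sorted(auth_values)} rec={sorted(rec_values)}")
    if not auth["aa"]:
        problems.append("server queried as authoritative did not set AA")
    if auth["rcode"] != rec["rcode"]:
        problems.append(f"rcode differs: auth={auth['rcode']} rec={rec['rcode']}")
    auth_ttl = max((ttl for _t, ttl, _v in auth["answers"]), default=0)
    rec_ttl = max((ttl for _t, ttl, _v in rec["answers"]), default=0)
    if rec_ttl > auth_ttl:
        problems.append("recursive TTL exceeds authoritative TTL")
    return problems
```

Typical use would be something like compare(query("192.0.2.53", "www.example.com", recurse=False), query("192.0.2.1", "www.example.com")), where both addresses are stand-ins for your own authoritative server and client resolver. An empty list means the two agree on values, flags, and TTL ordering. Names that resolve through a CNAME can differ legitimately, because the recursive answer may include the whole chain.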

2) Check from at least two client locations

  • one host inside the primary network
  • one host from an external or alternate network

If the answers differ, treat it as a resolver path issue and rule that out before touching app config.
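One way to make the two-location check repeatable, sketched under the assumption that you can run a small Python script on each host and collect the results in one place. observe() goes through the host's normal resolver path via socket.getaddrinfo; the location labels and function names are my own placeholders.

```python
import socket

def observe(name, location):
    """Resolve `name` the way local clients would and record the result."""
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except socket.gaierror as exc:
        addresses, error = [], str(exc)
    return {"location": location, "name": name,
            "addresses": addresses, "error": error}

def group_by_answer(observations):
    """Group location labels by the address set (or error) each one saw."""
    groups = {}
    for obs in observations:
        if obs["error"] is None:
            key = tuple(obs["addresses"])
        else:
            key = ("ERROR", obs["error"])
        groups.setdefault(key, []).append(obs["location"])
    return groups

def answers_agree(observations):
    """True when every location saw the same answer."""
    return len(group_by_answer(observations)) <= 1
```

Run observe() on each host, gather the dictionaries (for example as JSON), and pass them to group_by_answer(). More than one group is the signal to look at the resolver path first.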

3) Validate the complete resolution path

  • confirm search domains are not rewriting short names
  • confirm no stale hosts file entries exist
  • confirm CNAME chain resolves to the expected final A/AAAA target
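The first two checks can be sketched as pure functions over the text of the resolver config and the hosts file. This assumes a glibc-style resolver (search list plus ndots, default 1) and the usual "IP name alias..." hosts layout; it ignores the older "domain" directive and does not follow CNAME chains. All function names are illustrative.

```python
def parse_resolv_conf(text):
    """Return (search_domains, ndots) from resolv.conf-style text."""
    search, ndots = [], 1
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", ";")):
            continue                      # blank line or comment
        fields = stripped.split()
        if fields[0] == "search":
            search = fields[1:]           # a later search line replaces earlier ones
        elif fields[0] == "options":
            for opt in fields[1:]:
                if opt.startswith("ndots:"):
                    ndots = int(opt.split(":", 1)[1])
    return search, ndots

def candidate_names(name, search, ndots=1):
    """Order in which a glibc-style resolver would try `name`."""
    if name.endswith("."):
        return [name]                     # already fully qualified
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded      # enough dots: try as-is first
    return expanded + [absolute]          # short name: search list first

def hosts_overrides(text, name):
    """Return the IPs that hosts-file-style text maps to `name`."""
    wanted = name.rstrip(".").lower()
    hits = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and wanted in (f.lower() for f in fields[1:]):
            hits.append(fields[0])
    return hits
```

On a typical Linux host the inputs would be the contents of /etc/resolv.conf and /etc/hosts. If candidate_names() shows a short name being expanded into a domain you did not expect, or hosts_overrides() returns anything for a name that should come from DNS, that explains a client seeing a different answer than dig does.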

4) Correlate with edge logs

If DNS looks right, check reverse proxy access/error logs for host header mismatches, wrong upstream selection, or connection failures. DNS and proxy issues often overlap during migrations.
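As a rough illustration of that correlation, here is a sketch that tallies the three failure types from access log lines. The key=value log format (host, upstream, status) is entirely hypothetical, as are the function names; adapt the parsing to whatever your proxy actually writes.

```python
from collections import Counter

def parse_line(line):
    """Parse a hypothetical 'key=value key=value' access log line."""
    return dict(pair.split("=", 1) for pair in line.split() if "=" in pair)

def classify(entry, expected_upstreams):
    """Return a problem label for one log entry, or None if it looks fine."""
    host = entry.get("host", "").lower()
    if host not in expected_upstreams:
        return "host header mismatch"
    if entry.get("upstream") not in expected_upstreams[host]:
        return "wrong upstream"
    if entry.get("status") in {"502", "503", "504"}:
        return "upstream connection failure"
    return None

def summarize(lines, expected_upstreams):
    """Count problem labels across all log lines."""
    counts = Counter()
    for line in lines:
        issue = classify(parse_line(line), expected_upstreams)
        if issue:
            counts[issue] += 1
    return dict(counts)
```

expected_upstreams maps each hostname the proxy should serve to the set of upstream addresses it may use, for example {"app.example.com": {"10.0.0.5:8080"}}. During a migration, a rising "host header mismatch" count usually means clients are still arriving under the old name.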

5) Document TTL assumptions

Every DNS change note should include:

  • old TTL
  • new TTL
  • exact change timestamp
  • expected full propagation window

Without that, teams declare incidents too early or keep waiting after propagation should already be complete.
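A small sketch of what such a note could look like in code, assuming resolvers honor TTLs: a cache that fetched the old record just before the change can keep serving it for up to the old TTL, so the window closes at the change timestamp plus the old TTL. The class and method names are my own.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class DnsChangeNote:
    old_ttl: int          # seconds, TTL of the record before the change
    new_ttl: int          # seconds, TTL of the record after the change
    changed_at: datetime  # exact change timestamp, timezone-aware

    def propagation_deadline(self):
        """Latest time a TTL-honoring cache can still hold the old record."""
        return self.changed_at + timedelta(seconds=self.old_ttl)

    def status(self, now):
        """Say whether stale answers are still expected at time `now`."""
        if now < self.changed_at:
            return "change not yet made"
        if now < self.propagation_deadline():
            return "still propagating: stale answers are expected"
        return "propagation window over: stale answers need investigation"

note = DnsChangeNote(
    old_ttl=3600,
    new_ttl=300,
    changed_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(note.propagation_deadline().isoformat())  # 2024-01-01T13:00:00+00:00
```

The new TTL does not shorten this window; it only governs how quickly the next change will spread.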