Resolver uses DNS Servers from inactive interfaces on Windows, causing timeouts #1191

@jaraco

dnspython Version: 2.7.0
Platform: Microsoft Windows 11 24H2
Python Version: 3.11.2

Summary

When using the default resolver (dns.resolver.resolve() or dns.resolver.Resolver()) on Windows, dnspython appears to gather DNS server configurations from the registry for all network interfaces, including those that are currently inactive (e.g., disconnected USB Ethernet adapters, disabled adapters). It then attempts to use these DNS servers during resolution. If the DNS servers associated with the inactive interface are unreachable (which they typically are when the interface is down), dnspython experiences significant delays due to query timeouts and retries before eventually falling back to servers configured on active interfaces. This results in unexpectedly long DNS resolution times within Python applications.

Observed Behavior

  • A Python script using dns.resolver.resolve() experiences significant delays (e.g., ~4 seconds) for DNS lookups, leading to a 16-second startup delay for some applications.
  • cProfile analysis shows the time is spent within dns.resolver.resolve, ultimately waiting in select.select, with individual internal query attempts timing out after ~1.3 seconds each.
  • Printing dns.resolver.get_default_resolver().nameservers reveals a list containing DNS server IP addresses that are configured on an inactive network interface (e.g., associated with a disconnected USB Ethernet adapter/docking station). These servers were confirmed to be present in the registry key for that interface (HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{GUID}).
  • Using system tools like nslookup to explicitly query these specific DNS server IPs results in timeouts, confirming they are unreachable when the associated interface is inactive.
  • When the specific network interface associated with the problematic DNS servers is activated (e.g., connecting the USB adapter), the delay in the dnspython script disappears, and resolution becomes fast.
  • System tools like nslookup using default system settings may resolve quickly even when the interface is inactive, suggesting they might handle fallback from inactive interface DNS servers differently or faster than dnspython.
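The per-server check described above (done with nslookup) can also be reproduced from Python. The following is a minimal diagnostic sketch, not part of dnspython: probe_server, report and main are hypothetical helper names, and the query name and 1-second timeout are arbitrary choices. It sends one UDP query to each configured server and records whether an answer arrived and how long that took.

```python
import time


def probe_server(server, qname="example.com", timeout=1.0):
    """Send one UDP query for `qname` to `server`; return (answered, seconds)."""
    # dnspython is imported lazily so report() below can be used with a fake probe.
    import dns.exception
    import dns.message
    import dns.query

    query = dns.message.make_query(qname, "A")
    start = time.monotonic()
    try:
        dns.query.udp(query, server, timeout=timeout)
        answered = True
    except (dns.exception.DNSException, OSError):
        # Timeouts, malformed replies and socket errors all count as "no answer".
        answered = False
    return answered, time.monotonic() - start


def report(servers, probe=probe_server):
    """Probe each server in order; return a list of (server, answered, seconds)."""
    return [(server, *probe(server)) for server in servers]


def main():
    import dns.resolver

    configured = dns.resolver.get_default_resolver().nameservers
    # Only plain address strings are probed; other nameserver objects are skipped.
    addresses = [s for s in configured if isinstance(s, str)]
    for server, answered, seconds in report(addresses):
        state = "answered" if answered else "no answer"
        print(f"{server}: {state} after {seconds:.2f} s")
```

Calling main() on the affected machine should list the servers from the inactive interface as "no answer", which would match the nslookup observation.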

Expected Behavior

dnspython's default resolver configuration on Windows should ideally:

  1. Prioritize DNS servers associated with currently active (OperationalStatus="Up") network interfaces.
  2. Avoid attempting to query DNS servers associated with inactive network interfaces, or fail very quickly on them without long timeouts/retries.
  3. Return, from dns.resolver.get_default_resolver().nameservers, a list that primarily reflects the servers usable by the system at that moment.

Diagnosis / Root Cause Analysis

  1. dnspython appears to correctly read DNS server configuration from the Windows registry, likely iterating through keys under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{GUID}.
  2. However, it does not seem to correlate this configuration data with the current operational status of the corresponding network interface {GUID} before adding the NameServer or DhcpNameServer entries to its internal list of servers to query.
  3. During resolution, dnspython sequentially attempts to query servers from this combined list. When it attempts to query the unreachable servers belonging to the inactive interface, it waits for the standard DNS query timeout, possibly retries, and only then moves to the next server in the list.
  4. This timeout/retry cycle for each unreachable server configured on inactive interfaces causes the cumulative observed delay.
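To make the arithmetic in the diagnosis concrete, here is a toy model of the cumulative delay. It is a simplification under stated assumptions, not dnspython's actual retry algorithm: servers are tried strictly in list order, every unreachable server costs a fixed per-try timeout, and the function name delay_before_first_answer is made up for illustration.

```python
def delay_before_first_answer(servers, per_try_timeout, tries_per_server=1):
    """Estimate the time spent waiting before the first reachable server is queried.

    `servers` is a list of (address, reachable) pairs in query order.
    Each unreachable server is assumed to cost a full timeout per try.
    """
    waited = 0.0
    for _address, reachable in servers:
        if reachable:
            return waited
        waited += per_try_timeout * tries_per_server
    return waited  # every server was unreachable


# Server addresses and the ~1.3 s figure are taken from this report;
# the ordering of the list is an assumption.
servers = [
    ("10.50.10.50", False),    # inactive interface
    ("10.50.50.50", False),    # inactive interface
    ("192.168.107.1", True),   # active interface
]
print(delay_before_first_answer(servers, per_try_timeout=1.3))  # about 2.6 seconds
```

Under this model the delay grows linearly with the number of stale servers listed ahead of the first working one, and disappears once those servers answer or are removed.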

Steps to Reproduce (Conceptual)

  1. On a Windows machine, configure a secondary network adapter (e.g., USB Ethernet, second physical NIC, possibly a persistent virtual adapter) with specific DNS server IPs (e.g., 10.50.10.50 and 10.50.50.50 as in the reported case, or any IP that is unreachable when the adapter is inactive). Note the adapter's GUID from the registry key HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{GUID} where these IPs are set.
  2. Ensure this specific network adapter is inactive (e.g., disconnected, disabled in ncpa.cpl).
  3. Ensure at least one other primary network adapter (e.g., Wi-Fi, main Ethernet) is active and configured with working DNS servers.
  4. Run a simple Python script:
    import dns.resolver
    import time
    import platform
    
    if platform.system() != 'Windows':
        print("This test is designed for Windows.")
    else:
        print(f"dnspython default servers: {dns.resolver.get_default_resolver().nameservers}") # Should show servers from both active and inactive interfaces
    
        print("Starting lookup...")
        start_time = time.monotonic()
        try:
            # Use a common hostname unlikely to be in local cache initially
            result = dns.resolver.resolve('google.com', 'A')
            # Or use the SRV/TXT records from the original scenario if applicable
            # result_srv = dns.resolver.resolve('_mongodb._tcp.cluster0.acvlhai.mongodb.net', 'SRV')
            # result_txt = dns.resolver.resolve('cluster0.acvlhai.mongodb.net', 'TXT')
            duration = time.monotonic() - start_time
            print(f"Lookup successful in {duration:.2f} seconds.")
            # print(result.rrset)
        except Exception as e:
            duration = time.monotonic() - start_time
            print(f"Lookup failed after {duration:.2f} seconds: {e}")
  5. Observe: The script execution time for the resolve() call will be significantly longer (multiple seconds) than expected for a normal DNS lookup, reflecting the timeouts caused by querying servers on the inactive interface.
  6. Activate the network adapter configured in Step 1 (e.g., plug in the USB adapter).
  7. Re-run the Python script.
  8. Observe: The script execution time for resolve() should now be very short (sub-second).

Suggested Fix / Improvement

Consider modifying the Windows-specific DNS configuration logic within dnspython (potentially in dns.platform.windows or called by dns.resolver.Resolver._config_resolver) so that it checks the operational status of the network interface associated with a set of DNS servers before adding them to the usable list. This could use Windows APIs such as GetAdaptersAddresses (checking OperStatus), WMI queries (Win32_NetworkAdapter NetConnectionStatus), or parsing of PowerShell Get-NetAdapter output, though native APIs are preferable. Servers associated with interfaces that are not currently Up should probably be excluded, or placed at the very end of the list with minimal retry/timeout settings.
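A rough sketch of the filtering idea follows. It is not dnspython code and all three function names are hypothetical. It uses the PowerShell route rather than GetAdaptersAddresses only because that is shorter to show, and it assumes Get-NetAdapter reports an InterfaceGuid and a Status of "Up" for active adapters. It also assumes a statically configured NameServer value should win over DhcpNameServer.

```python
import json
import re
import subprocess

INTERFACES_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"


def read_interface_nameservers():
    """Windows only: map interface GUID -> DNS servers found in the registry."""
    import winreg  # imported lazily; this module exists only on Windows

    found = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, INTERFACES_KEY) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            guid = winreg.EnumKey(root, index)
            with winreg.OpenKey(root, guid) as key:
                for value_name in ("NameServer", "DhcpNameServer"):
                    try:
                        raw, _type = winreg.QueryValueEx(key, value_name)
                    except OSError:
                        continue  # value not present on this interface
                    servers = [s for s in re.split(r"[ ,]+", str(raw)) if s]
                    if servers:
                        found[guid.upper()] = servers
                        break  # assumption: first non-empty value wins
    return found


def read_adapter_status():
    """Windows only: map interface GUID -> status string from Get-NetAdapter."""
    command = "Get-NetAdapter | Select-Object InterfaceGuid, Status | ConvertTo-Json"
    output = subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True, text=True, check=True,
    ).stdout
    rows = json.loads(output or "[]")
    if isinstance(rows, dict):  # a single adapter is emitted as a bare object
        rows = [rows]
    return {row["InterfaceGuid"].upper(): row["Status"] for row in rows}


def usable_nameservers(servers_by_guid, status_by_guid):
    """Keep servers from interfaces whose status is "Up", de-duplicated, in order."""
    usable = []
    for guid, servers in servers_by_guid.items():
        if status_by_guid.get(guid) != "Up":
            continue  # unknown or inactive interface: skip its servers
        for server in servers:
            if server not in usable:
                usable.append(server)
    return usable
```

On Windows, usable_nameservers(read_interface_nameservers(), read_adapter_status()) would give a candidate list. Interfaces that Get-NetAdapter does not report are treated as inactive here, which may be too strict for some virtual adapters.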

Workaround

A functional workaround is to explicitly configure the dnspython resolver to use specific known-good DNS servers, bypassing the problematic system configuration discovery:

import dns.resolver

# Build a resolver without reading the system (registry) configuration.
resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ['192.168.107.1']  # Example: using only a known-good local server
# OR
# resolver.nameservers = ['8.8.8.8', '1.1.1.1']  # Example: using public DNS

# Use resolver.resolve(...) instead of dns.resolver.resolve(...)
answer = resolver.resolve('google.com', 'A')

Another workaround involves monkeypatching Resolver._config_resolver to filter the discovered nameservers based on interface status checked via PowerShell, but this is fragile and not recommended for production.
