Internet-Draft in-tree-hints March 2026
Kolkman Expires 18 September 2026
Workgroup:
dnsop
Internet-Draft:
draft-kolkman-in-tree-hints
Published:
Intended Status:
Informational
Expires:
Author:
O. Kolkman

In-Tree Hints for DNS Resiliency

Abstract

We present a methodology by which networks that rely heavily on specific domain names can become more resilient to failures in the parent domain.

The approach presented uses a hints-file-like mechanism in recursive nameservers in addition to having the authoritative servers follow a few operational practices.

The suggested method can be seen as a means for increasing digital sovereignty. We describe the approach, the necessary operational practices, and the dilemmas this approach introduces.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 18 September 2026.

1. Introduction

The Domain Name System (DNS) is a remarkably stable and resilient system. However, in many environments people are looking at how they can remain in control of the continuity of digital services in their own environments and reduce external dependencies. One of those dependencies is the DNS, on which we focus in this document.

Consider the following failure case:

This failure case may sound relatively limited, but here are a few less abstract examples of such failures.

Consider an enterprise campus operating under the domain example.net that provides essential services, such as logistics, to users on its campus. If the transit connection to the broader Internet were to fail, the consequences could be significant. Even when all infrastructure (recursive and authoritative DNS servers, the servers for the services themselves, etc.) is on premises, a failure to resolve the delegation from the top-level domain .net to example.net would eventually lead to an inability to contact services.

Another example is a small island nation state that runs a number of its government services on the island under its own TLD. Now consider a cable-cut scenario in which all upstream connectivity is lost. After a while, when authority information starts to time out from caches (for some implementations after 24 hours), connections to services on the island will start to fail.

A less benign example is an intervention in the DNS root, where delegation data for a country code top-level domain (ccTLD) gets altered or removed. Such an intervention would eventually debilitate users who rely on services within that ccTLD's domain, usually government services and local media outlets within that country.

While unthinkable even a few years ago, these sorts of scenarios are now being considered in the context of international stability in cyberspace.

In this document we describe an operational approach that, with minor support from recursive nameservers, can offer one of the elements towards greater autonomy and resilience of infrastructure dependent on a specific domain. While certainly not the only approach to increasing resiliency (e.g., the small island nation state example would be solved by having a local anycast instance of the root), we introduce this to offer a confidence-building mechanism that does not fundamentally change the DNS design. This approach is consistent with the architecture, design, and operation of the DNS. By following the practices herein we avoid namespace fragmentation. We also avoid fundamental protocol changes; in particular, we avoid alternative roots.

The approach, called 'in-tree hints', offers protection against various attack vectors that could compromise the delegation process. For instance, on-path attackers may attempt to alter delegation records, which could lead to denial of service, particularly in systems utilizing Domain Name System Security Extensions (DNSSEC). Additionally, threats such as DNS supply chain attacks or inadvertent errors can result in unauthorized changes to the delegation, including DS (Delegation Signer) records. More generally, we solve for the case in which a DNS resolver receives parental data that is inconsistent with the intent of the domain owner, i.e., data that is inconsistent with what is published on the authoritative servers. That includes not receiving data at all.

In-tree hints can be seen as a building block for resiliency of critical infrastructure or digital autonomy. The approach is complementary to serving stale data from the cache [RFC8767]; more on this in Section 3.3.

In this memo we describe what the parties that are critically dependent on a specific domain and those that serve zones within that domain will need to do in order to guarantee continuous operation.

In Section 2 we describe the idea, the requirements for a recursive DNS server, and the requirements for the associated zone. In Section 3.2 we briefly point to other measures that must be taken in combination with this mechanism. In Section 6 we discuss some policy considerations and the dilemmas that exist with respect to the intentions of the DNS parent and child.

This document uses uppercase SHOULD, RECOMMENDED and MUST in the meaning defined by [RFC2119]. Their lowercase equivalents do not have normative meaning.

2. The in-tree hints concept

[RFC9499] describes the root hints file: "Operators who manage a DNS recursive resolver typically need to configure a 'root hints file'. This file contains the names and IP addresses of the authoritative name servers for the root zone, so the software can bootstrap the DNS resolution process. For many pieces of software, this list comes built into the software."

The in-tree hints concept borrows from this idea: by configuring a 'hints file' for a specific domain, one can bootstrap from that domain down, even if its parents are not available. Implementing it requires a modification to recursive nameservers and adherence to some operational practices.

2.1. Recursive nameserver

Recursive nameserver software will need to be modified to work with in-tree hints.

An in-tree hint is a configuration for a recursive resolver that provides the names and IP addresses of the authoritative name servers for a specific domain. A recursive name server may be configured with in-tree hints for multiple domains.

When there are no in-domain (in-bailiwick) nameservers ([RFC9499]) in the NS set for the domain, this mechanism MUST [OMK: SHOULD?] NOT be used. Without this requirement the resiliency properties may not be achieved, as there are dependencies outside the control of the domain. This requirement can be enforced by the recursive nameserver software at the moment of configuration parsing. In addition, the in-bailiwick server should fate-share IP connectivity with its dependents. For instance, in our island example one in-domain name server should be on the island. In our enterprise example one in-domain server should be on campus.
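The configuration-time check mentioned above could look roughly like the following sketch. The function names are hypothetical and not taken from any existing resolver. Following [RFC9499], it treats a nameserver name as in-domain when it is at or below the hinted domain, and it compares whole labels so that ns.badexample.net is not mistaken for a name under example.net.

```python
def normalize(name: str) -> tuple[str, ...]:
    """Lowercase a domain name and split it into labels, ignoring a trailing dot."""
    return tuple(label for label in name.lower().rstrip(".").split(".") if label)


def is_in_domain(ns_name: str, domain: str) -> bool:
    """Return True if ns_name is at or below domain, compared label by label."""
    ns_labels, dom_labels = normalize(ns_name), normalize(domain)
    if not dom_labels or len(ns_labels) < len(dom_labels):
        return False
    return ns_labels[-len(dom_labels):] == dom_labels


def validate_hint(domain: str, ns_names: list[str]) -> list[str]:
    """Return the in-domain nameservers of a hint; reject a hint that has none.

    Hypothetical helper: a resolver could call this while parsing its
    configuration and refuse to load the hint when ValueError is raised.
    """
    in_domain = [ns for ns in ns_names if is_in_domain(ns, domain)]
    if not in_domain:
        raise ValueError(f"in-tree hint for {domain!r} has no in-domain nameserver")
    return in_domain
```

For example, validate_hint("example.net", ["ns1.example.net.", "ns.example.org."]) accepts the hint because one of the two nameservers is in-domain, whereas a hint that lists only ns.example.org. would be rejected.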

In-tree hints are only useful if the domain owner follows certain practices. A recursive nameserver MAY implement the in-tree hints mechanism for a specific domain only if the domain owner indicates that it follows these practices. Section 3.1 describes the RECOMMENDED way for domain name owners to signal their intent. [OMK: REVIEW 2019 Keywords]

In-tree hints MUST only be used in combination with a DNSSEC trust anchor, i.e., a trusted public DNSSEC key that is associated with the name. The trust anchor MUST be maintained. It SHOULD be maintained by the mechanism described in [RFC5011]. Alternatively, an appropriate and trustworthy out-of-band mechanism MAY be used. The operator of a recursive nameserver must validate that the domain associated with the in-tree hints follows the operational practices described in this memo. This can be achieved by out-of-band mechanisms, or by querying the TXT record as described in Section 3.1.
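To make the pieces of an in-tree hint concrete, here is a minimal sketch of how a resolver might hold one hint in memory. This memo does not define a configuration format, so the class name, field names, and the choice to store the trust anchor as an opaque string are all assumptions made for illustration. The only rules it encodes are that nameserver names come with IP addresses and that a hint without a trust anchor is refused.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InTreeHint:
    """Hypothetical in-memory form of one in-tree hint (not a defined format)."""

    domain: str
    nameservers: dict[str, list[str]]  # nameserver name -> its IP addresses
    trust_anchor: str                  # opaque here, e.g. a key in presentation format

    def __post_init__(self) -> None:
        # An in-tree hint is only usable together with a DNSSEC trust anchor.
        if not self.trust_anchor.strip():
            raise ValueError(f"{self.domain}: a DNSSEC trust anchor is required")
        if not self.nameservers:
            raise ValueError(f"{self.domain}: at least one nameserver is required")
        for name, addresses in self.nameservers.items():
            if not addresses:
                raise ValueError(f"{self.domain}: {name} has no IP address")
            for address in addresses:
                # ip_address() raises ValueError for malformed IPv4/IPv6 input.
                ipaddress.ip_address(address)
```

A hint could then be built as InTreeHint("example.net", {"ns1.example.net": ["192.0.2.53", "2001:db8::53"]}, trust_anchor), where the addresses are documentation addresses and the trust anchor string is whatever form the resolver's validator expects.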

When a recursive nameserver is configured with an in-tree hint, the NS resource record set contained in the in-tree hint MUST be used during the resolution process. This means that the in-tree hint always overrides the NS and DS resource records received from the parent.

When the NS RRset on the domain's authoritative server changes and has been validated using DNSSEC against the configured key, the in-tree hints configuration SHOULD be updated with the changed authoritative NS set. This requirement guarantees that the intent of the domain holder will be followed.

The recursive nameserver should honor the TTLs in order to regularly check for changes to the authoritative NS RRset. Operators that implement in-tree hints SHOULD use tooling, possibly implemented in the recursive nameserver, to log inconsistencies between the information in the parent and the in-tree configuration and to signal them to the operators of the recursive nameserver. These inconsistencies need to be well understood. They could be the result of a bona fide re-delegation (in which case the parental records are likely a subset of the authoritative NS RRset), the withdrawal of the delegation by the parent, or an error or attack.
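The kind of tooling described above might start from a comparison like the one sketched here. The function and its return labels are hypothetical, and the classification is only a triage heuristic based on the cases listed in the previous paragraph: a subset relation does not prove a bona fide re-delegation, so every result other than "consistent" still needs a human to look at it.

```python
def classify_inconsistency(parent_ns: set[str], hint_ns: set[str]) -> str:
    """Compare the parent's NS set with the in-tree hint's NS set.

    Returns one of four illustrative labels for logging and alerting.
    """
    # Compare names case-insensitively and without the trailing dot.
    parent = {name.lower().rstrip(".") for name in parent_ns}
    hint = {name.lower().rstrip(".") for name in hint_ns}

    if parent == hint:
        return "consistent"
    if not parent:
        # No delegation data received: withdrawn by the parent, or parent unreachable.
        return "delegation-missing"
    if parent < hint:
        # Parent lists a proper subset of the hint's NS set: possibly a
        # re-delegation that the parent has not fully caught up with.
        return "possible-redelegation"
    # Parent lists nameservers the hint does not know about.
    return "error-or-attack"
```

An operator could log the label together with both NS sets and raise an alert for anything other than "consistent".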

The trust anchor MUST be used for the validation of records within the in-tree hint's domain, even when a parental DS record exists. Nota bene: Section 5 of [RFC5011] allows for deletion of a trust anchor if a superior trust point exists. When a trust anchor is part of an in-tree hint, deletion motivated by the existence of a superior trust point MUST NOT happen. When an in-tree hint exists for a subordinate domain, that trust anchor MUST take precedence.
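One way to read the precedence rule is as a closest-enclosing match: for a given query name, the in-tree hint whose domain shares the most trailing labels with the name supplies the trust anchor. The sketch below illustrates that reading; the function name is hypothetical and the matching strategy is an interpretation, not something this memo specifies.

```python
from typing import Optional


def labels(name: str) -> tuple[str, ...]:
    """Lowercase a domain name and split it into labels, ignoring a trailing dot."""
    return tuple(label for label in name.lower().rstrip(".").split(".") if label)


def closest_hint(qname: str, hinted_domains: list[str]) -> Optional[str]:
    """Return the most specific hinted domain that encloses qname, or None."""
    q = labels(qname)
    best: Optional[str] = None
    best_len = 0
    for domain in hinted_domains:
        d = labels(domain)
        # The hinted domain must be a label-wise suffix of the query name.
        if d and len(d) <= len(q) and q[-len(d):] == d and len(d) > best_len:
            best, best_len = domain, len(d)
    return best
```

With hints configured for both example.net and dept.example.net, a lookup for www.dept.example.net would select dept.example.net, so the subordinate trust anchor is the one used for validation.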

Recursive nameservers that implement this mechanism SHOULD have a fallback mechanism implemented that will eventually allow them to reach the in-domain nameserver when other servers in the NS resource record set fail. [OMK: I think this is an existing requirement somewhere else in the mountain of RFCs]

2.2. Domain Owner

This section describes the operational practices that the domain owner has to follow in order to achieve the resiliency within the domain.

The domain owner MUST maintain its DNSSEC configuration using the mechanism described in [RFC5011].

The domain owner MUST have at least one in-domain authoritative nameserver in its NS set. If that nameserver's name is within a delegated child domain, then the nameservers for that delegated domain MUST also have at least one in-domain authoritative nameserver. This requirement is recursive for further delegation.

In order to benefit from the resiliency properties provided by this mechanism, the domain owner should require that delegated domains (zones) within the domain each have at least one nameserver that is in-domain. Note that delegated domains do not have to maintain a trust anchor and can rely on the chain of trust established using DS records from the trust anchor down. [OMK: is this actually clear? Domain, sub-domain, in-domain, may become confusing]

Furthermore, the in-domain nameserver SHOULD be positioned in a network that shares connectivity fate with the clients. For instance, in our enterprise example it should be in the enterprise campus network. More generally, the location is subject to a risk-based assessment of the likelihood of not being able to obtain an IP connection to the in-domain nameserver.

[OMK: should there be language here about out-of-domain nameservers?]

The domain owner should communicate to its community that it is deploying practices that support in-tree hints. That communication MAY be out of band. A RECOMMENDED in-band signaling mechanism is described in Section 3.1.

3. Operational Considerations

3.1. Signaling

It is RECOMMENDED that a domain owner (the owner of <domain>) signals to its user community that it supports the in-tree hints mechanism. Signaling is done by publishing a TXT resource record with owner name _in-tree.<domain> containing an expiry timestamp in [RFC3339] format. The expiry timestamp indicates the date until which the owner is committed to following the instructions in Section 2.2.

The recursive nameserver operator should, at the first opportunity but no later than 30 days after the expiration, check whether a new expiry record has been published by the domain owner. If not, the operator SHOULD disable the in-tree hints configuration for the domain.

_in-tree.<domain> TXT <expiry timestamp>
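As an illustration of how an operator's tooling might act on the expiry timestamp, the sketch below turns the TXT record content into one of three states. The function name and the state labels are hypothetical. It assumes the TXT content has already been retrieved and DNSSEC-validated, and it relies on Python's datetime.fromisoformat, which handles the common RFC 3339 forms but not necessarily every variant the RFC permits.

```python
from datetime import datetime, timedelta

# The operator has at most 30 days after expiry to find a renewed record.
GRACE_PERIOD = timedelta(days=30)


def hint_status(txt_value: str, now: datetime) -> str:
    """Map an _in-tree expiry timestamp to 'active', 'recheck', or 'disable'.

    'now' must be a timezone-aware datetime.
    """
    text = txt_value.strip()
    # Older Python versions do not accept a 'Z' suffix in fromisoformat().
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    expiry = datetime.fromisoformat(text)
    if expiry.tzinfo is None:
        raise ValueError("expiry timestamp must carry a UTC offset")

    if now <= expiry:
        return "active"    # the owner's commitment is still valid
    if now <= expiry + GRACE_PERIOD:
        return "recheck"   # expired: look for a renewed record
    return "disable"       # no renewal within 30 days of expiry
```

A periodic job could call hint_status with datetime.now(timezone.utc) and the latest TXT content, re-query the record while the state is "recheck", and disable the hint once the state becomes "disable".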

[OMK: Alternatively we create a trivial RR type for this. EXP RR containing a timestamp as defined in RFC4034 section-3.1.5 ]

Out of band signaling is not in scope for this memo.

3.2. Achieving true resiliency of services within the domain

This memo describes a method to achieve resiliency of name resolution for a community of interest of a particular domain. This is, by far, not sufficient to achieve actual resiliency for services that are provided within the domain. While a detailed discussion is out of scope for this memo, we would like to remind the reader of the following:

  • The in-domain nameservers should run on IP addresses that can reasonably be expected to be reachable by the community of use. For example, if a service is critical for on-campus enterprise use then the in-domain nameserver should run on the campus network.

  • Any service provider that offers a service under a certain name within the domain should make sure that those services themselves can reasonably be expected to be reachable by the community of use. Any service dependencies should also be local.

  • In an effort to create local resiliency one should not forget that resiliency is also achieved by having no single point of failure. Having in-domain nameservers, and having services within reach of the community of interest, does not mean that one should not also deploy infrastructure elsewhere.

3.3. Serving stale data

In-tree hints are complementary to serving stale data [RFC8767]. Serving stale data allows continuity for all zones when their authoritative servers are not reachable and the data happens to be in the resolver's cache. In-tree hints work for specific domains when data does not happen to be available in recursive nameserver caches or when the parent's server(s) deliver faulty delegation data.

In-tree hints are not scalable, in the sense that there is significant operational overhead for both the domain owner, who has to run in-domain nameservers and follow [RFC5011], and the recursive nameserver operator, who will have to troubleshoot inconsistencies. Serving stale data is highly scalable, as it needs only one configuration setting within the recursive nameserver, which then applies to all domains.

4. Conclusions

[TODO]

5. Security Considerations

In-tree hints can be used in recursive nameservers in combination with protective block-lists and therefore do not weaken the mechanisms available to protect the community of users of a recursive nameserver.

Malware could use its own recursive nameservers, configured with in-tree hints for its command and control domains, to circumvent de-delegation by the parents. However, those recursive nameservers are likely under the control of the malware administrators, and blocking them once it has been established that they are used for command and control seems proportionate, with little risk of disproportionate damage.

This mechanism is intended to provide resilience against network failures. However, it adds complexity to software and operational procedures, and thereby increases fragility.

When DNSSEC validation is performed by clients that are 'behind' a recursive nameserver configured with in-tree hints for a particular domain, inconsistencies between the domain and its parent will lead to undefined behavior. These validating clients SHOULD also implement in-tree hints.

6. Policy Considerations

Inherently, the approach described in this memo provides a mechanism for a community of users of a domain to override the policies of the parent domain. For instance, it allows the community of users to continue to use the domain even when the delegation for that domain expires. As such, this mechanism allows a community to continue to use a domain when the parent has de-delegated the domain, for instance in the context of a court order. At the same time, this in-tree approach can be a building block for creating resilience for critical infrastructure. It can potentially be applied to a country code top-level domain (ccTLD) and its user community. While the likelihood of failure at the ccTLD level is extremely low, this approach may add to confidence in the Domain Name System as a whole in times of international tension.

When an inconsistency exists between what is published in the parent and what is used as in-tree hints, there is a fragmentation of the DNS namespace. The operators of the recursive nameservers should proactively restore consistency. Note that there is no technical enforcement mechanism to aid that restoration, but it is expected that a recursive nameserver operator who configures an in-tree domain is part of the community of interest and therefore has out-of-band means to contact the domain administrator. Also note that the operators of the domain (e.g., example.net) do not have a communication mechanism that can enforce the use or non-use of in-tree hints by recursive nameserver operators.

The authority for using or not using in-tree hints lies with the operator of the recursive nameserver, acting as a user agent for its community. Users have in general been able to override their DNS configuration since the first deployment of the DNS. Users can use a recursive nameserver that does not use in-tree hints for a particular domain and can thereby opt out of the mechanism.

7. IANA Considerations

No IANA considerations herein.

8. Acknowledgments

This document is inspired by various hallway conversations about digital autonomy.

The author is an employee of the Internet Society; this document does not necessarily reflect the position of the Internet Society.

9. References

9.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC3339]
Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, DOI 10.17487/RFC3339, <https://www.rfc-editor.org/rfc/rfc3339>.
[RFC5011]
StJohns, M., "Automated Updates of DNS Security (DNSSEC) Trust Anchors", STD 74, RFC 5011, DOI 10.17487/RFC5011, <https://www.rfc-editor.org/rfc/rfc5011>.
[RFC7344]
Kumari, W., Gudmundsson, O., and G. Barwood, "Automating DNSSEC Delegation Trust Maintenance", RFC 7344, DOI 10.17487/RFC7344, <https://www.rfc-editor.org/rfc/rfc7344>.

9.2. Informative References

[E-Gov-Resilience]
Sommese et al, "Assessing e-Government DNS Resilience", IEEE Proceedings of the 2022 International Conference on Network and Service Management (CNSM 2022).
[RFC8767]
Lawrence, D., Kumari, W., and P. Sood, "Serving Stale Data to Improve DNS Resiliency", RFC 8767, DOI 10.17487/RFC8767, <https://www.rfc-editor.org/rfc/rfc8767>.
[RFC9499]
Hoffman, P. and K. Fujiwara, "DNS Terminology", BCP 219, RFC 9499, DOI 10.17487/RFC9499, <https://www.rfc-editor.org/rfc/rfc9499>.

Author's Address

Olaf Kolkman