---
title: In-tree Hints for DNS Resiliency
abbrev: in-tree-hints
docname: draft-kolkman-in-tree-hints
category: info

ipr: trust200902
area: ops
workgroup: dnsop
keyword: Internet-Draft

stand_alone: yes
pi:
  RFCedstyle: yes
  toc: yes
  tocindent: yes
  sortrefs: yes
  symrefs: yes
  strict: yes
  comments: yes
  inline: yes
  text-list-symbols: -o*+

author:
  ins: O. Kolkman
  name: Olaf Kolkman
  email: kolkman@isoc.org

normative:
  RFC2119:
  RFC3339:
  RFC5011:
  RFC7344:

informative:
  RFC9499:
  RFC8767:
  E-Gov-Resilience:
    title: "Assessing e-Government DNS Resilience"
    date: 2022
    author:
      - ins: Sommese et al.
    seriesinfo:
      "IEEE": Proceedings of the 2022 International Conference on Network and Service Management (CNSM 2022)

--- abstract

Lorem ipsum dolor sit amet, has no senserit reformidans liberavisse. Laudem signiferumque pri in, et vix facer maluisset interesset. Sit consul minimum recusabo at, causae scriptorem cu qui. Cetero democritum consetetur eos ea. Brute delenit invenire eu vix, pri illum iusto definitionem et, altera quaeque quo ut. Quem offendit eu vel. Ea sumo utroque tacimates has.

--- middle

Introduction
============

The Domain Name System (DNS) is a remarkably stable and resilient system. However, in many environments people are looking for ways to remain in control of their own infrastructure and reduce external dependencies. In this document we describe an operational approach that, with minor support from recursive nameservers, can offer one of the elements towards greater autonomy and resilience of infrastructure that depends on a specific domain.

An imaginary case in which this approach is useful is one where an enterprise with the domain example.net provides a critical function (e.g. a financial service) to customers on a set of campus networks that are interconnected but share one transit connection.
If the transit connection breaks, and with it the connection to the rest of the Internet, DNS resolution on the campus networks will fail when the domain data is not in the cache and the delegation from .net to example.net is not available. At that moment customers will be unable to do business with the enterprise, even when the enterprise serves those customers from one of the campus networks.

Another failure case that this mechanism protects against is an attack that targets the delegation: either a man-in-the-middle (MITM) attack that changes delegation records (leading to denial of service when DNSSEC is in use), or a DNS supply chain attack or error by which the delegation, including its DS records, is changed.

The approach is designed for providing resiliency for the Internet's naming function and does not bring full resiliency by itself. But we see this as a building block for resiliency of critical infrastructure or digital autonomy.

The approach is complementary to serving stale data from the cache {{RFC8767}}; more on this in section {{stale}}.

Our approach is designed to be consistent with the architecture, design, and operation of the DNS. We avoid namespace fragmentation and fundamental protocol changes; in particular, we avoid the need for alternative roots.

The approach describes what the parties that critically depend on a specific domain, and those that serve zones within that domain, need to do to ensure continuous operation in case of breakage, such as their nameservers not being reachable or a broken delegation from the ancestor domain. More generally, breakage means that a DNS resolver receives data that is inconsistent with the intent of the domain owner, i.e. data that is inconsistent with what is published on the authoritative servers. That includes not receiving data at all.

In section {{concept}} we describe the idea, the requirements for a recursive DNS server, and the requirements for the associated zone.
In section {{resilience}} we briefly point to other measures that must be taken in combination with this mechanism. In section {{policy}} we discuss some policy considerations.

This document uses uppercase MUST, SHOULD, RECOMMENDED and MAY in the meaning defined by {{RFC2119}}. Their lowercase equivalents do not have normative meaning.

The in-tree hints concept {#concept}
==========================

{{RFC9499}} describes the root hints file as follows: "Operators who manage a DNS recursive resolver typically need to configure a 'root hints file'. This file contains the names and IP addresses of the authoritative name servers for the root zone, so the software can bootstrap the DNS resolution process. For many pieces of software, this list comes built into the software."

The in-tree hints concept borrows from this idea. It requires a modification in recursive nameservers and adherence to some operational practices.

Recursive nameserver {#rec}
----------------------------

An in-tree hint is a configuration for a recursive resolver that provides the names and IP addresses of the authoritative nameservers for a specific domain. One recursive nameserver may be configured with in-tree hints for multiple domains.

If there are no in-domain nameservers ({{RFC9499}}) in the NS set for the domain, then this mechanism MUST NOT be used. The reason for this requirement is that without an in-domain nameserver the resiliency properties cannot be achieved, as there are external name dependencies.

In-tree hints are only useful if the domain owner follows certain practices, and MAY only be used if the domain owner indicates that it does so. Section {{signal}} describes the RECOMMENDED way of signalling that intent.

In-tree hints MUST only be used in combination with a trust anchor, i.e. a trusted public DNSSEC key that is associated with the name. The trust anchor MUST be maintained. It SHOULD be maintained by the mechanism described in {{RFC5011}}.
Alternatively, an appropriate and trustworthy out-of-band mechanism MAY be used.

The operator of a recursive nameserver must validate that the domain associated with the in-tree hints follows the operational practices described in this memo. This can be achieved by out-of-band mechanisms, or by querying the TXT record as described in section {{signal}}.

When a recursive nameserver is configured with an in-tree hint, the NS resource record set contained in the in-tree hint configuration should be refreshed and used in the cache. The trust anchor MUST be used for the validation of records within the in-tree hint's domain, even when a DS record exists in the parent. Note that Section 5 of {{RFC5011}} allows for deletion of a trust point if a superior trust point exists. When a trust anchor is part of an in-tree hint, deletion on the grounds that a superior trust point exists MUST NOT happen. When an in-tree hint exists for a subordinate domain, its trust anchor MUST take precedence.

Once the NS set is its data MUST be used forthwith as an indication for the location of the authoritative NS records. Recursive nameservers should cache these records, respect the resource records' TTL, and actively refresh them from the authoritative servers. In addition, the data will occasionally need to be refreshed in the configuration. This can be achieved with external automation.

Operators that implement in-tree hints SHOULD use tooling, possibly implemented in the recursive nameserver, to log inconsistencies between the information in the parent and the in-tree configuration and to signal them to the operators of the recursive nameserver, in particular changes to the in-domain nameservers.

It is assumed that all modern recursive nameservers implement a fallback mechanism that will eventually allow them to reach the in-domain nameserver when other servers in the NS resource record set fail.
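
The two checks described in this section, the in-domain nameserver requirement and the logging of inconsistencies between the parent and the in-tree configuration, can be sketched as follows. This is an illustrative sketch only, not part of the mechanism: the function names are hypothetical, the NS set published by the parent is assumed to have been obtained by other means (no DNS queries are made here), and names are compared as plain case-insensitive strings.

```python
"""Sketch: sanity checks an operator could run on an in-tree hint (illustrative)."""


def normalize(name: str) -> str:
    """Lower-case a domain name and drop the trailing root dot."""
    return name.rstrip(".").lower()


def is_in_domain(ns_name: str, domain: str) -> bool:
    """True if ns_name is the same as, or subordinate to, domain."""
    ns, dom = normalize(ns_name), normalize(domain)
    return ns == dom or ns.endswith("." + dom)


def may_use_hint(domain: str, hint_ns: list[str]) -> bool:
    """The hint is only usable if its NS set has at least one in-domain name."""
    return any(is_in_domain(ns, domain) for ns in hint_ns)


def ns_inconsistencies(hint_ns: list[str], parent_ns: list[str]) -> dict[str, list[str]]:
    """Report nameserver names present on one side but not on the other."""
    hint = {normalize(n) for n in hint_ns}
    parent = {normalize(n) for n in parent_ns}
    return {
        "only_in_hint": sorted(hint - parent),
        "only_in_parent": sorted(parent - hint),
    }


if __name__ == "__main__":
    # Hypothetical data: the hint as configured, and the NS set seen at the parent.
    hint = ["ns1.example.net.", "ns.provider.example.org."]
    parent = ["ns1.example.net.", "ns2.example.net."]

    if not may_use_hint("example.net", hint):
        raise SystemExit("no in-domain nameserver: this hint must not be used")

    for side, names in ns_inconsistencies(hint, parent).items():
        for name in names:
            print(f"inconsistent NS ({side}): {name}")
```

A real deployment would feed the second function with the delegation NS set as currently served by the parent and route its output to the operator's logging or alerting system.
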

Domain Owner {#auth}
--------------------

This section describes the operational practices that the domain owner has to follow in order to achieve resiliency within the domain.

The domain owner MUST maintain its DNSSEC configuration using the mechanism described in {{RFC5011}}.

The domain owner MUST have at least one in-domain authoritative nameserver in its NS set. If that nameserver's name is within a delegated child domain, then the NS set for that delegated domain MUST also contain at least one in-domain authoritative nameserver. This requirement applies recursively to further delegations.

In order to benefit from the resiliency properties provided by this mechanism, the domain owner should require that all zones within the domain have at least one in-domain nameserver. Note that delegated domains do not have to maintain a trust anchor and can rely on a chain of trust established using DS records from the trust anchor down.

TODO OMK: should there be language here about out-of-domain nameservers?

The domain owner should communicate to its community that it is using this method. That communication MAY be out of band. A RECOMMENDED in-band signalling mechanism is described in section {{signal}}.

Operational Considerations {#operational}
======================

bla

Signalling {#signal}
--------------------

It is RECOMMENDED that a domain owner (the owner of `<domain>`) signals to its user community that it follows the practices in section {{auth}}, using the mechanism described in this section.

Signalling is done by publishing a TXT resource record with owner name `_in-tree.<domain>` that contains an expiry timestamp in {{RFC3339}} format. The expiry timestamp indicates the date until which the owner commits to following the practices in section {{auth}}. The recursive nameserver operator should, at the first opportunity but no later than 30 days after the expiration, check whether the domain owner has published a new expiry record. If not, the operator SHOULD disable the in-tree hints configuration for the domain.

```
_in-tree.<domain> TXT <timestamp>
```

[OMK: Alternatively we create a trivial RR type for this: an EXP RR containing a timestamp as defined in RFC4034 section 3.1.5.]

Out-of-band signalling is not in scope for this memo.

Achieving true resiliency of services within the domain {#resilience}
--------------

This memo describes a method to achieve resiliency of name resolution for the community of interest of a particular domain. This is far from sufficient to achieve actual resiliency for the services that are provided within the domain. While otherwise out of scope for this memo, we would like to remind the reader of the following:

* The in-domain nameservers should run on IP addresses that can reasonably be expected to be reachable by the community of use. For example, if a service is critical for on-campus enterprise use, then the in-domain nameserver should run on the campus network.

* Any service provider that offers a service under a certain name within the domain should make sure that the service itself can reasonably be expected to be reachable by the community of use. Any service dependencies should also be local.

* In an effort to create local resiliency one should not forget that resiliency is also achieved by having no single point of failure. Having in-domain nameservers, and having services within reach of the community of interest, does not mean that one should not also deploy infrastructure elsewhere.

Serving stale data {#stale}
----------------

In-tree hints are complementary to serving stale data {{RFC8767}}. Serving stale data allows continuity for all zones when their authoritative servers are not reachable and the data happens to be in the resolver's cache. In-tree hints work for specific domains when the data does not happen to be available in the recursive nameserver's cache or when the parent's server(s) deliver faulty delegation data.

In-tree hints do not scale well, in the sense that there is significant operational overhead for both the domain owner, who has to run in-domain nameservers and follow {{RFC5011}}, and the recursive nameserver operator, who has to troubleshoot inconsistencies. Serving stale data is highly scalable, as it needs only one configuration setting within the recursive nameserver, which then applies to all domains.

Conclusions
=============

[TODO]

Security Considerations
=======================

In-tree hints can be used in recursive nameservers in combination with protective block-lists and therefore do not weaken the available mechanisms for protecting the community of users of a recursive nameserver.

Malware could use its own recursive nameservers, configured with in-tree hints for its command and control domains, to circumvent de-delegation by the parent. However, those recursive nameservers are likely under the control of the malware operators, and blocking them once it has been established that they are used for command and control seems proportionate, with little risk of disproportionate damage.

Policy Considerations {#policy}
=====================

Inherently, the approach described in this memo provides a mechanism for a community of users of a domain to override the policies of the parent domain. For instance, it allows the community of users to continue to use the domain even when the delegation for that domain expires. As such, this mechanism allows a community to continue to use a domain when the parent has de-delegated the domain in the context of a court order.

At the same time this in-tree approach, when applied to a country code top-level domain (ccTLD) and its user community, can be a building block for the resilience of a country's critical infrastructure. While the likelihood of failure at the ccTLD level is extremely low, this approach may add to confidence in the domain name system as a whole.

When an inconsistency exists between what is published in the parent and what is used as in-tree hints, the DNS namespace is fragmented. The operators of the recursive nameservers should proactively restore consistency. Note that there is no technical enforcement mechanism to aid that restoration, but it is expected that a recursive nameserver operator who configures an in-tree hint for a domain is part of the community of interest and therefore has out-of-band means to contact the domain administrator.

Also note that the operators of the domain (e.g. example.net) do not have a communication mechanism that can enforce the use or non-use of in-tree hints by recursive nameserver operators. The authority for using or not using in-tree hints rests with the operator of the recursive nameserver, acting as a user agent for its community. Users can in general change their DNS configuration to use a recursive nameserver that does not use in-tree hints for a particular domain, and can therefore opt out.

IANA Considerations
===================

The authore

Acknowledgements
=================

This document is inspired by a conversation with [that guy from Internet.nl] in a discussion about digital autonomy.

The author is an employee of the Internet Society; this document does not necessarily reflect the position of the Internet Society.

{olaf: source="olaf"}