
What came first: the CNAME or the A record?
On January 8, 2026, a routine update to 1.1.1.1 aimed at reducing memory usage accidentally triggered a wave of DNS resolution failures for users across the Internet. The root cause wasn't an attack or an outage, but a subtle shift in the order of records within our DNS responses.

While most modern software treats the order of records in DNS responses as irrelevant, we discovered that some implementations expect CNAME records to appear before everything else. When that order changed, resolution started failing. This post explores the code change that caused the shift, why it broke specific DNS clients, and the 40-year-old protocol ambiguity that makes the "correct" order of a DNS response difficult to define.

All timestamps referenced are in Coordinated Universal Time (UTC).

Time                Description
2025-12-02          The record reordering is introduced to the 1.1.1.1 codebase
2025-12-10          The change is released to our testing environment
2026-01-07 23:48    A global release containing the change starts
2026-01-08 17:40    The release reaches 90% of servers
2026-01-08 18:19    Incident is declared
2026-01-08 18:27    The release is reverted
2026-01-08 19:55    Revert is completed. Impact ends

While making some improvements to lower the memory usage of our cache implementation, we introduced a subtle change to CNAME record ordering. The change was introduced on December 2, 2025, released to our testing environment on December 10, and began deployment on January 7, 2026.

How DNS CNAME chains work

When you query for a domain like www.example.com, you might get a CNAME (Canonical Name) record that indicates one name is an alias for another name. It’s the job of public resolvers, such as 1.1.1.1, to follow this chain of aliases until they reach a final response:

www.example.com → cdn.example.com → server.cdn-provider.com → 198.51.100.1

As 1.1.1.1 traverses this chain, it caches every intermediate record. Each record in the chain has its own TTL (Time-To-Live), indicating how long we can cache it.
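The chain-following step described above can be sketched in a few lines of Rust. This is a minimal illustration, not the 1.1.1.1 implementation: the `RData` and `Record` types, the `follow_chain` and `example_zone` functions, the hop limit, and the TTL values are all assumptions made up for the example. A plain `HashMap` stands in for the upstream lookups a real resolver would perform.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Hypothetical record data: an alias to another name, or a final address.
#[derive(Debug, Clone, PartialEq)]
pub enum RData {
    Cname(String),
    A(Ipv4Addr),
}

/// One record with its own TTL (in seconds), as a cache might store it.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub name: String,
    pub ttl: u32,
    pub data: RData,
}

/// Follows aliases starting at `qname` until an A record is found.
/// Returns every record visited, in chain order, or `None` if a name is
/// missing or the chain is longer than `max_hops` records.
pub fn follow_chain(
    zone: &HashMap<String, Record>,
    qname: &str,
    max_hops: usize,
) -> Option<Vec<Record>> {
    let mut chain = Vec::new();
    let mut current = qname.to_string();
    for _ in 0..max_hops {
        let record = zone.get(&current)?.clone();
        // Decide where to go next before moving the record into the chain.
        let next = match &record.data {
            RData::Cname(target) => Some(target.clone()),
            RData::A(_) => None,
        };
        chain.push(record);
        match next {
            Some(target) => current = target,
            None => return Some(chain), // reached the final address
        }
    }
    None // hop limit reached without finding an address
}

/// Builds the example chain from the text. The TTL values are invented.
pub fn example_zone() -> HashMap<String, Record> {
    let entries = [
        ("www.example.com", RData::Cname("cdn.example.com".to_string()), 3600),
        ("cdn.example.com", RData::Cname("server.cdn-provider.com".to_string()), 300),
        ("server.cdn-provider.com", RData::A(Ipv4Addr::new(198, 51, 100, 1)), 60),
    ];
    let mut zone = HashMap::new();
    for (name, data, ttl) in entries {
        zone.insert(name.to_string(), Record { name: name.to_string(), ttl, data });
    }
    zone
}
```

Calling `follow_chain(&example_zone(), "www.example.com", 8)` yields the three records in chain order, each carrying its own TTL. The hop limit is one simple way to guard against alias loops in this sketch.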
Not all the TTLs in a CNAME chain need to be the same:

www.example.com → cdn.example.com (TTL: 3600 seconds) # Still cached
cdn.example.com → 198.51.100.1 (TTL: 300 seconds) # Expired

When one or more records in a CNAME chain expire, the chain is considered partially expired. Fortunately, since parts of the chain are still in our cache, we don’t have to resolve the entire CNAME chain again, only the part that has expired. In our example above, we would take the still valid www.example.com → cdn.example.com chain, and only resolve the expired cdn.example.com A record. Once that’s done, we combine the existing CNAME chain and the newly resolved records into a single response.

The code that merges these two chains is where the change occurred. Previously, the code would create a new list, insert the existing CNAME chain, and then append the new records:

    impl PartialChain {
        /// Merges records to the cache entry to make the cached records complete.
        pub fn fill_cache(&self, entry: &mut CacheEntry) {
            let mut answer_rrs =
                Vec::with_capacity(entry.answer.len() + self.records.len());
            answer_rrs.extend_from_slice(&self.records); // CNAMEs first
            answer_rrs.extend_from_slice(&entry.answer); // Then A/AAAA records
            entry.answer =...
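To see why the order of the merged list can matter, here is a small self-contained sketch. It is not the 1.1.1.1 code and does not model any specific real client: the `Rr` type and the functions `merge_cnames_first`, `strict_lookup`, and `example` are hypothetical names invented for illustration. The merge mirrors the "CNAMEs first, then the new records" order described above, and the client is an assumed order-sensitive parser that reads the answer once, front to back.

```rust
/// Hypothetical, simplified record: an alias or an IPv4 address, with its owner name.
#[derive(Debug, Clone, PartialEq)]
pub enum Rr {
    Cname { name: String, target: String },
    A { name: String, addr: [u8; 4] },
}

/// Sketch of the merge described in the text: the cached CNAME chain goes
/// in first, then the freshly resolved records are appended.
pub fn merge_cnames_first(cached_chain: &[Rr], fresh: &[Rr]) -> Vec<Rr> {
    let mut answer = Vec::with_capacity(cached_chain.len() + fresh.len());
    answer.extend_from_slice(cached_chain);
    answer.extend_from_slice(fresh);
    answer
}

/// Assumed order-sensitive client: scans the answer once, following an alias
/// only when its owner matches the name it is currently looking for.
pub fn strict_lookup(qname: &str, answer: &[Rr]) -> Option<[u8; 4]> {
    let mut expected = qname.to_string();
    for rr in answer {
        match rr {
            Rr::Cname { name, target } if *name == expected => expected = target.clone(),
            Rr::A { name, addr } if *name == expected => return Some(*addr),
            _ => {} // record for a name we are not looking for yet: skipped
        }
    }
    None
}

/// The partially expired chain from the text: a cached alias plus a fresh A record.
pub fn example() -> (Vec<Rr>, Vec<Rr>) {
    let cached = vec![Rr::Cname {
        name: "www.example.com".into(),
        target: "cdn.example.com".into(),
    }];
    let fresh = vec![Rr::A {
        name: "cdn.example.com".into(),
        addr: [198, 51, 100, 1],
    }];
    (cached, fresh)
}
```

With the alias first, `strict_lookup("www.example.com", ...)` finds `198.51.100.1`. If the same two records are presented with the address first, this hypothetical client skips the A record because it has not yet learned that `cdn.example.com` is the name it wants, and the lookup returns `None`. A client that collects all records before matching names would not be affected by the order.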