
Cloudflare 1.1.1.1 Outage Report (July 14, 2025): Global DNS Disruption Root Cause Analysis

 

[Image: Cloudflare logo with "1.1.1.1" above the slogan "The free app that makes your Internet safer"]

Key takeaways

  • Global DNS outage: Cloudflare's 1.1.1.1 resolver failed worldwide for 62 minutes on July 14, 2025, due to a configuration error in their service topology.
  • Root cause: A dormant misconfiguration from June 6 linked 1.1.1.1 to a non-production service. When activated, it withdrew critical IP prefixes globally.
  • Traffic impact: UDP/TCP/DoT queries dropped sharply, but DNS-over-HTTPS (DoH) via cloudflare-dns.com stayed stable thanks to separate IPs.
  • Unrelated hijack: Tata Communications (AS4755) advertised 1.1.1.0/24 during the outage, worsening routing issues for some users.
  • Resolution: Cloudflare restored services by 22:54 UTC after reverting configurations and manually re-announcing routes.

Why 1.1.1.1 matters for the internet

You might not think much about DNS resolvers, but they're like the phonebooks of the internet. Cloudflare's 1.1.1.1 launched back in 2018 as a faster, privacy-focused alternative to ISP-provided DNS. It quickly became one of the most used resolvers globally, handling billions of queries daily. The service uses anycast routing to direct traffic to the nearest data center, which usually means quick responses and reliability. But on July 14, that same design amplified a failure across every continent. For users relying solely on 1.1.1.1, the internet basically stopped working: websites wouldn't load, apps froze, and confusion spread. A lot of folks didn't realize how dependent they'd become on this single service until it vanished.


Timeline of the outage: When everything went dark

Here's how the incident unfolded, minute by minute:

  • 21:48 UTC: A config change for Cloudflare’s Data Localization Suite (DLS) triggered a global refresh. This activated the dormant error from June 6.
  • 21:52 UTC: 1.1.1.1 prefixes began withdrawing from BGP tables. DNS traffic plummeted within minutes.
  • 21:54 UTC: Tata Communications (AS4755) started advertising 1.1.1.0/24, an unrelated hijack now visible due to Cloudflare’s withdrawal.
  • 22:01 UTC: Internal alerts fired. Incident declared.
  • 22:20 UTC: Fix deployed after reverting configurations.
  • 22:54 UTC: Full service restoration after routes stabilized.

Table: Affected IP ranges during the outage

[Table image: withdrawn IPv4 prefixes and their corresponding IPv6 prefixes]

This 62-minute disruption showed how a small config error can cascade into global chaos. Engineers initially missed the June 6 mistake because it caused no immediate problems: no alerts, no complaints. But when that second change hit, it all unraveled fast.


Technical breakdown: What actually broke

The core issue was a service topology misconfiguration. Cloudflare uses internal systems to map which IPs should be advertised where, especially for services like their Data Localization Suite (DLS) that restrict traffic to specific regions. On June 6, a config update accidentally tied 1.1.1.1's prefixes to a non-production DLS service. Since that service wasn't live yet, no one noticed.

Then, on July 14, an engineer attached a test location to that same DLS service. This triggered a global refresh of routing policies. Because of the earlier error, 1.1.1.1’s topology got reduced to one offline data center. Routers worldwide immediately withdrew announcements for its IP ranges. Traffic couldn’t reach Cloudflare’s DNS servers at all.
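
Cloudflare hasn't published its internal configuration format, but the failure mode can be sketched in a few lines of Python. Everything here, from the service names to the location map, is hypothetical; the point is how a prefix list bound to the wrong service quietly maps to an offline location until a refresh makes it real.

```python
# Hypothetical sketch of the misconfiguration; Cloudflare's real
# topology system, names, and data are not public.

# June 6: the resolver prefixes get attached to a non-production DLS
# service instead of the global anycast topology. Nothing breaks yet,
# because the routing plan isn't recomputed.
service_topology = {
    "public-resolver": {
        "prefixes": ["1.1.1.0/24", "1.0.0.0/24", "2606:4700:4700::/48"],
        "locations": ["dls-preprod"],  # should have been "global"
    },
}

location_map = {
    "global": ["every active data center"],
    "dls-preprod": ["single offline test data center"],
}

def refresh_advertisements(topology, locations):
    """Recompute which data centers should announce each service's prefixes."""
    plan = {}
    for service, cfg in topology.items():
        datacenters = [dc for loc in cfg["locations"] for dc in locations[loc]]
        # If the only mapped location is offline, the prefixes are
        # effectively withdrawn from BGP everywhere -- the July 14 outcome.
        plan[service] = {"prefixes": cfg["prefixes"], "announce_from": datacenters}
    return plan

# July 14: attaching a test location to the DLS service triggers a
# global refresh, and the dormant error finally takes effect.
print(refresh_advertisements(service_topology, location_map))
```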

The legacy system managing these topologies lacked safeguards like canary deployments or staged rollouts. A peer-reviewed change still went global in one shot: no gradual testing, no kill switches. Cloudflare's newer topology system avoids hardcoded IP lists, but migrating between systems created fragility. They've since acknowledged this "error-prone" approach needs retiring.
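
What would a staged rollout have looked like here? The post-mortem doesn't say, but the general pattern is easy to sketch; the stage list, soak time, and health check below are all invented for illustration.

```python
import time

# Toy staged-rollout gate: apply a config change to a small canary set
# first, wait, verify health, then widen the blast radius step by step.
ROLLOUT_STAGES = [["canary-dc"], ["region-1"], ["region-2", "region-3"], ["global"]]
SOAK_SECONDS = 300

def healthy(datacenters):
    """Placeholder health check. A real one would compare resolver query
    volume and BGP announcements against the pre-change baseline."""
    return True  # stand-in so the sketch runs end to end

def staged_rollout(apply_change, rollback):
    applied = []
    for stage in ROLLOUT_STAGES:
        apply_change(stage)
        applied.extend(stage)
        time.sleep(SOAK_SECONDS)   # let problems surface before expanding
        if not healthy(applied):
            rollback(applied)      # kill switch: the change never reaches "global"
            return False
    return True
```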


Why detection took 9 minutes: Monitoring gaps

Cloudflare’s internal alerts didn’t fire until 22:01 UTC, 9 minutes after traffic nosedived. Why the delay? A few reasons stand out:

  1. No immediate metric drops: The BGP withdrawal caused routing failure, not server crashes. Queries didn’t fail; they never arrived. Monitoring systems tuned for server errors missed this.
  2. Alert thresholds: Teams avoid overly sensitive alerts to prevent false alarms. As one Hacker News comment noted, operators often wait 5+ minutes before escalating to avoid "alert fatigue" (see the sketch after this list).
  3. Legacy dependencies: Health checks relied on systems that themselves needed DNS resolution, creating blind spots during outages.
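
Cloudflare hasn't published its alerting rules, so the following is only a toy illustration of the trade-off in point 2: an alert that requires a sustained drop is quieter, but it is also structurally several minutes late.

```python
from collections import deque

# Toy sustained-drop detector: page only if query volume stays below
# 50% of baseline for 5 consecutive one-minute samples. The window size
# is exactly what trades detection speed against alert fatigue.
WINDOW_MINUTES = 5
DROP_RATIO = 0.5

class TrafficAlert:
    def __init__(self, baseline_qps):
        self.baseline = baseline_qps
        self.samples = deque(maxlen=WINDOW_MINUTES)

    def observe(self, qps):
        self.samples.append(qps)
        return (
            len(self.samples) == WINDOW_MINUTES
            and all(s < self.baseline * DROP_RATIO for s in self.samples)
        )

alert = TrafficAlert(baseline_qps=1_000_000)
for minute, qps in enumerate([980_000, 90_000, 80_000, 70_000, 60_000, 50_000]):
    if alert.observe(qps):
        print(f"minute {minute}: sustained drop confirmed, paging on-call")
```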

This lag highlights a tricky balance: catching failures fast without drowning teams in noise. Cloudflare's post-mortem implies tighter BGP monitoring might help, but they haven't detailed specific fixes yet.


The BGP hijack that wasn’t: Tata’s role

As Cloudflare's routes vanished, something weird happened: Tata Communications (AS4755) started advertising 1.1.1.0/24. ThousandEyes observed this hijack propagating through some networks, worsening connectivity for users whose queries got routed to Tata.

Crucially, this wasn't malicious. Tata likely advertised 1.1.1.0/24 due to old internal configurations; that prefix had been used for testing long before Cloudflare claimed it. Once Cloudflare re-announced their routes, Tata withdrew the hijacked prefix. But for ~25 minutes, it added chaos. This incident underscores how fragile BGP can be when major routes vanish unexpectedly.
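
A heavily simplified model of what other networks' routers saw is below. Real BGP best-path selection weighs local preference, AS-path length, and more; the only point here is that once the legitimate announcement disappears, whatever stale route remains becomes the best (and only) path.

```python
# Simplified routing table for one prefix. AS13335 is Cloudflare;
# AS4755 is Tata Communications. List ordering stands in for real
# BGP best-path selection, which this sketch deliberately ignores.
routes = {
    "1.1.1.0/24": [
        {"origin_as": 13335, "note": "Cloudflare (legitimate)"},
        {"origin_as": 4755, "note": "Tata (stale internal announcement)"},
    ]
}

def best_path(candidates):
    return candidates[0] if candidates else None

print("before withdrawal:", best_path(routes["1.1.1.0/24"]))

# 21:52 UTC: Cloudflare's announcement is withdrawn globally.
routes["1.1.1.0/24"] = [r for r in routes["1.1.1.0/24"] if r["origin_as"] != 13335]
print("after withdrawal:", best_path(routes["1.1.1.0/24"]))
```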


Impact analysis: Who felt the outage?

The outage hit hardest for users and apps relying exclusively on 1.1.1.1. But patterns emerged in the data:

  • Protocol differences:
    • UDP/TCP/DoT traffic dropped ~90% (these use IPs like 1.1.1.1 directly).
    • DoH (DNS-over-HTTPS) via cloudflare-dns.com stayed near normal. Its IPs weren’t tied to the faulty topology.
  • Backup resolver users: People pairing 1.1.1.1 with a third-party resolver (e.g., 8.8.8.8) saw minimal disruption; failovers kicked in. Pairing it only with 1.0.0.1 helped less, since that address sits on the same withdrawn infrastructure.
  • Regional variances: Reports spiked in North America, Europe, and Asia. Cloudflare Radar confirmed global impact.

Table: Traffic recovery post-fix

"Traffic Restoration Timeline table shows three events from 22:20 to 22:54 UTC. Traffic restoration levels progress from 40% to 98% restored."

Ironically, the outage proved Cloudflare's DoH resilience. By decoupling DNS from raw IPs, it avoided single points of failure. As one user noted, "DoH was working" when traditional DNS failed.
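
You can see the distinction by querying the hostname-based DoH endpoint instead of a raw IP. Cloudflare documents a JSON API at cloudflare-dns.com/dns-query; the snippet below is our own minimal client (note it still needs some working resolver to look up cloudflare-dns.com itself).

```python
import json
import urllib.request

# DNS-over-HTTPS lookup via the hostname-based JSON endpoint. During
# the outage this path kept working because cloudflare-dns.com maps to
# addresses outside the withdrawn 1.1.1.1 prefixes.
url = "https://cloudflare-dns.com/dns-query?name=example.com&type=A"
req = urllib.request.Request(url, headers={"accept": "application/dns-json"})

with urllib.request.urlopen(req, timeout=5) as resp:
    answer = json.load(resp)

for record in answer.get("Answer", []):
    print(record["name"], record["data"])
```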


Lessons for the internet’s infrastructure

This outage wasn't a cyberattack or hardware failure; it was process and system design flaws. Key takeaways for engineers:

  1. Staged rollouts save lives: Had Cloudflare used canary deployments for config changes, they’d have caught the error in one region first. Their new topology system supports this, but legacy tech didn’t.
  2. Validate dormant configs: "No impact" isn’t "safe." Systems must flag unused configurations that could activate later.
  3. Enforce resolver redundancy: Clients should always use multiple DNS resolvers (e.g., 1.1.1.1 + 8.8.8.8). Single-provider setups risk total outages; see the sketch after this list.
  4. Monitor routing layer: Services need BGP/advertisement visibility, not just server health.
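
A minimal sketch of point 3, assuming the third-party dnspython library; OS-level configuration or a local stub resolver achieves the same effect without any code.

```python
import dns.resolver  # third-party: pip install dnspython

# Try independent resolver providers in order, so an outage at one
# provider (as on July 14) doesn't take name resolution down entirely.
RESOLVER_GROUPS = [["1.1.1.1", "1.0.0.1"], ["8.8.8.8", "8.8.4.4"], ["9.9.9.9"]]

def resolve_with_fallback(name, rdtype="A"):
    last_error = None
    for servers in RESOLVER_GROUPS:
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = servers
        resolver.lifetime = 2.0  # fail over quickly instead of hanging
        try:
            return [r.to_text() for r in resolver.resolve(name, rdtype)]
        except Exception as exc:  # timeouts, SERVFAIL, etc.
            last_error = exc
    raise last_error

print(resolve_with_fallback("example.com"))
```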

Cloudflare has pledged to accelerate retiring legacy systems. But as they noted, "This was a humbling event." For the rest of us, it's a reminder: even giants stumble, and backups matter.


FAQs about the Cloudflare 1.1.1.1 outage

Q: Could using 1.0.0.1 as a backup have helped?
A: Yes, but not completely. 1.0.0.1 shares infrastructure with 1.1.1.1, so both failed. Ideal backups use unrelated resolvers like Google's 8.8.8.8 or Quad9.

Q: Why did DNS-over-HTTPS (DoH) keep working?
A: DoH uses domain names (e.g., cloudflare-dns.com), not raw IPs. Those domains resolved via unaffected infrastructure. Always prefer DoH/DoT domains over IPs for resilience.

Q: Was this a BGP hijack?
A: Partially. Tata's route advertisement technically counted as a hijack, but it was a side effect of Cloudflare's withdrawal, not the cause of the outage. It did amplify issues for some users, though.

Q: How often does Cloudflare go down?
A: Rarely. In the last 30 days, 1.1.1.1 had 99.09% uptime vs. 99.99% for Google's 8.8.8.8. This was an exception, not routine.

Q: Did the outage affect other Cloudflare services?
A: Mostly no. Core CDN, security, and dashboard services use different IPs and weren't withdrawn. The 1.1.1.1 resolver was the primary casualty.
