Company that's supposed to prevent outages caused a big one

  • A bad CrowdStrike update crashed Microsoft Windows machines around the world, causing widespread outages

  • CrowdStrike said a fix is now available, but implementing it could be tricky and take time

  • Analysts were left scratching their heads and wondering how this could happen

IT professionals woke up to quite the headache early Friday morning. A bad update from cloud cybersecurity company CrowdStrike bricked millions of Microsoft Windows machines around the world – those used by banks, airlines, governments and even 911 systems – leaving them stuck on the dreaded blue screen.

The incident left analysts and techies across the globe wondering how such a catastrophic failure happened. And involving one of the very companies whose sole purpose is preventing outages, no less.

As AvidThink Founder Roy Chua noted on LinkedIn, “One would think this error should have been caught much earlier as part of a good software development QA, CI/CD process. Or a staged rollout for a Falcon channel update?”

He added that on one hand, the scope of the outage reflects CrowdStrike’s impressively large global footprint. But on the other hand, he said it highlights the fact that “developers who provide auto-updates for near-system-level software (drivers, security sensors, monitoring and telemetry sensors) that hook deep into OSes need to vet their updates with much more diligence.”

Similarly, Moor Insights and Strategy Founder Patrick Moorhead questioned “why enterprises globally update a .sys file without an airgapped test prior to deployment. Speed? Confidence because ‘it never happened before’?”

Meanwhile, Patrick Kelly, Founder of Appledore Research, had an eye on the fallout. He warned that the outage will “cost companies hundreds of billions of $$$ in lost productivity. Although a fix is in process, it will take many weeks to unwind the damage.”

CrowdStrike, what happened?

According to a statement from Microsoft on X, an issue impacting customers’ ability to use their Microsoft 365 services was first noticed on Thursday evening.

Microsoft’s outage dashboard indicated that the breakdown was caused by a “configuration change in a portion of our Azure backend workloads” which interrupted the connection between storage and compute resources. That in turn resulted in access failures downstream.

Intel from Fierce Network's internal IT team suggests the outage impacted the Central US region of Azure.

CrowdStrike CEO George Kurtz said on X that Mac and Linux host devices were not impacted by the faulty update. Kurtz added that as of 5:45 am ET Friday, “The issue has been identified, isolated and a fix has been deployed.”

CrowdStrike has apologized and has been posting updates on the situation on its website.

But just because a fix is available doesn’t necessarily mean that implementing it will be easy.

As Moorhead pointed out on X, “Modern enterprise Windows PCs have BIOS-based tools that can put a device into a known good state on an automated, mass-scale. These tools can also go in and delete a specific file and reboot the machine.”

The question, he continued, is how many of the impacted PCs have that capability. If they don’t, “a human will likely have to intervene to go in and delete that offending CrowdStrike .sys file.”
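
For a sense of what that manual intervention involves, here is a minimal Python sketch of the cleanup step. It assumes the widely reported workaround (boot the affected Windows host into Safe Mode, delete the faulty CrowdStrike channel file, then reboot); the directory path and the "C-00000291*.sys" filename pattern reflect CrowdStrike's published guidance at the time rather than anything stated in this article, so treat them as assumptions to verify against official advisories.

from pathlib import Path

# Directory and filename pattern are assumptions based on CrowdStrike's
# widely reported workaround for this incident; confirm against the
# vendor's official advisories before running anything like this.
CROWDSTRIKE_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
FAULTY_PATTERN = "C-00000291*.sys"  # channel file implicated in the crashes

def remove_faulty_channel_files(dry_run=True):
    """List (and optionally delete) channel files matching the faulty pattern."""
    matches = list(CROWDSTRIKE_DIR.glob(FAULTY_PATTERN))
    for channel_file in matches:
        print(("Would delete: " if dry_run else "Deleting: ") + str(channel_file))
        if not dry_run:
            channel_file.unlink()  # needs admin rights, typically from Safe Mode
    return matches

if __name__ == "__main__":
    # Dry run by default; pass dry_run=False only after booting into Safe Mode
    # and confirming the steps against official guidance.
    remove_faulty_channel_files()

Even with a script like this, each affected machine still has to be booted into Safe Mode by hand, which is why analysts expect recovery to take time.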

As of 10 am ET Friday, Microsoft stated that mitigation efforts remained underway but that its metrics indicated “that the remaining impacted scenarios are progressing towards a full recovery.”

More cloud security woes

Cloud security is already under scrutiny this week as AT&T recovers from a data breach due to a security lapse with Snowflake.

“The breach was caused by exploiting the inherent vulnerability of single-factor credentials – stolen Snowflake customer credentials – that were then used in a credential-stuffing attack to gain access to the customer's databases,” Semperis principal technologist Sean Deuby told Fierce earlier in the week.

This is a developing story.