Talk about irony: The software that paralyzed Windows computers around the world late Thursday night and early Friday morning was planted by a company that protects Windows computers against malware.
That company is CrowdStrike, a publicly traded cybersecurity firm based in Austin, Texas. It acknowledged the problem around 11 p.m. Thursday and started working on a solution, offering a workaround in the wee hours Friday and a fix a few hours later.
The vast sea of “Blue Screens of Death” triggered by CrowdStrike’s error is a testament to the market-leading status of the company’s software, which detects and defends against malicious code planted by hackers. Its approach is known as “endpoint security” because it installs its defenses on devices that connect to the internet, such as computers and smartphones.
Here’s a quick explanation of how things went wrong so quickly for so many Windows users around the world, including airlines, hospitals, banks and government agencies.
The Falcon Sensor update
One of the selling points of CrowdStrike’s service is that it can improve its defenses rapidly as new threats are discovered. As part of that service, it continuously and automatically updates the Falcon Sensor software on its customers’ machines.
Automatic updates are, under normal circumstances, a good cybersecurity practice because they prevent clients from having machines with outdated defenses on their networks. But the latest incident reveals the flip side of the coin.
According to CrowdStrike, the problem was triggered by a “single content update” for its customers with Windows PCs. The buggy code wasn’t detected until after it had been downloaded and installed on many of CrowdStrike’s clients’ machines.
Once loaded, the bad update interfered with core functions of the PC, causing Microsoft’s infamous blue error screen to pop up and convey a message along the lines of, “Your PC ran into a problem and needs to restart.” And as long as the update remained in place, restarting the machine led to the same errant result.
The fix offered by CrowdStrike
CrowdStrike stopped sending out the faulty update early Friday morning, so machines that had not loaded it yet were spared the turmoil.
For machines caught in the cycle of blue screen hell, the company initially offered step-by-step instructions for how to reboot Windows in a mode that would allow them to find and delete the buggy update. The drawback, as many commenters online noted, is that this machine-by-machine approach isn’t much help for organizations with hundreds or thousands of bricked PCs.
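For readers who want to see what the "find and delete" step amounts to, here is a minimal Python sketch. It is not CrowdStrike's tool or its published procedure: the function name `remove_faulty_update_files` is hypothetical, and the directory and filename pattern are left as parameters because this article does not specify them. It defaults to a dry run so the matches can be reviewed before anything is removed.

```python
from pathlib import Path


def remove_faulty_update_files(directory, pattern, dry_run=True):
    """Delete files in `directory` whose names match the glob `pattern`.

    Hypothetical helper for illustration only. Returns the sorted list of
    matching paths. With dry_run=True (the default) nothing is deleted.
    """
    folder = Path(directory)
    if not folder.is_dir():
        # Nothing to do if the directory does not exist on this machine.
        return []
    # Only regular files directly inside the folder are considered.
    matches = sorted(p for p in folder.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

An administrator would call it once with the default `dry_run=True` to list the matches, then again with `dry_run=False` to delete them. The correct directory and pattern must come from the vendor's own guidance. Even scripted, the step still has to run on each affected machine, which is the scaling problem the commenters pointed out.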
According to the tech website 404 Media, Microsoft also suggested that rebooting a crashed machine multiple times, as many as 15, could solve the problem.
Within a few hours, CrowdStrike was distributing a piece of software that removed the buggy code. This worked only for customers whose machines were able to connect to the internet and download the fix, though; everyone else would be left with the PC-by-PC workaround.
The lessons from the CrowdStrike debacle
Some Macintosh and Linux users, who were immune to the CrowdStrike-induced upheaval, devoted part of their Friday morning to spiking the football on Windows, even though the problem wasn’t caused by Microsoft.
Other observers argued that the incident demonstrated the risk of having one potential point of failure affecting millions of computers — a problem that has been demonstrated repeatedly during the broadband era.
At the very least, the incident demonstrates that we need better software in critical systems, said Dan O’Dowd, a developer of security software for the military.
“The immense body of software developed using Silicon Valley’s ‘move fast and break things’ culture means that the software our lives depend on is riddled with defects and vulnerabilities,” O’Dowd said in a statement. “Defects in this software can result in a mass failure event even more serious than the one we have seen today.”
He added, “We must convince the CEOs and Boards of Directors of the companies that build the systems our lives depend on to rewrite their software so that it never fails and can’t be hacked. … These companies will not take cybersecurity seriously until the public demands it. And we must demand it now, before a major disaster strikes.”
This story originally appeared in the Los Angeles Times.