The Spokesman-Review Newspaper
Spokane, Washington  Est. May 19, 1883

Congress grills CrowdStrike about multibillion-dollar July outage

Many flights were delayed or canceled at Washington’s Reagan National Airport and other airports after the CrowdStrike-generated outage July 19. (Danny Nguyen/The Washington Post)
By Joseph Menn
Washington Post

Members of Congress grilled a senior executive of security company CrowdStrike on Tuesday, demanding to know why it triggered a cascading, multibillion-dollar tech failure in July that shut down 911 call centers, handicapped hospitals and stranded airplane passengers around the world.

House Homeland Security Committee Chairman Mark Green, R-Tenn., faulted chief executive George Kurtz for ducking his request to testify and sending a threat intelligence expert instead to face questioning on one of the most catastrophic failures in tech history.

“Everywhere Americans turned, basic societal functions were unavailable,” Green said. “We cannot allow a mistake of this magnitude to happen again.”

CrowdStrike Senior Vice President Adam Meyers apologized for the disaster, echoing previous statements by Kurtz and laying out the technical missteps that allowed a faulty configuration update to balloon into a “Blue Screen of Death” on more than 8 million Windows devices running CrowdStrike’s antivirus sensors. For the first day, rebooting worked only if someone talked each user through a process specific to their machine.

Meyers’ account before the House committee, like a deeper analysis that CrowdStrike has published, accepted responsibility for the outage but couched it as a consequence of the market-leading company’s quest for efficiency in responding quickly to new threats.

More than a dozen former employees, however, recently told web publication Semafor that the company prioritized speed over quality, recounting multiple instances when they had sought to review and improve code before shipping it, only to be rebuffed.

CrowdStrike told the publication that it was practical to deploy programs and then make them better. The company didn’t immediately respond to a request for comment from The Washington Post.

But after the July 19 collapse, the company acknowledged having violated well-established best practices, including testing changes on a wide array of devices and distributing them to a small group of customers before sending them everywhere. Meyers told lawmakers on Tuesday that those patterns had been corrected.

A key reason that the impact was so great is that CrowdStrike’s programs occupy a privileged position inside computers, with access to the Windows kernel that controls nearly all levels of the device. That access is standard for scores of security programs, because they need to be up and running quickly, before viruses or hackers can get deep into the device and shut down the security protections.

Some efforts are underway to find security approaches that entail less systemic vulnerability. Earlier this month Microsoft convened a meeting of software architects, security companies and regulators at its campus and recently said it would develop an alternative to kernel access that offers much of the same functionality. Parts of a test version could be ready in as little as six months, David Weston, Microsoft’s vice president for operating system security, told The Post.

While there are no plans to force security vendors to adopt such an alternative, Weston said any new rules would apply equally to Microsoft’s own security offerings. He said Microsoft would also require better testing regimens for those relying on kernel access.

In the meantime, the estimated $5.4 billion failure has cost high-flying CrowdStrike tens of millions of dollars in revenue and billions in stock market value. Hard-hit Delta Air Lines, which canceled thousands of flights and is being sued by its travelers, has threatened to sue CrowdStrike for $500 million.

Kurtz said CrowdStrike was legally responsible for less than $10 million of those damages. He and Microsoft said Delta’s poor practices left it in much worse shape for longer than other airline customers.

The outage underscored the deep interdependence of computer-based systems, driving home widespread fears about the adequacy of efforts to limit the impact of a deliberate attack by a foreign government.

It has also given new impetus to efforts to devise a way to hold software providers legally liable for gross negligence. They are now broadly exempt.

Jen Easterly, director of the Cybersecurity and Infrastructure Security Agency, said in an interview that she has been speaking to members of the Homeland Security Committee and others about a plan that would permit such lawsuits, supplemented by significant “safe harbor” provisions that would exempt companies following good practices.