Sunday, January 12, 2025

    Microsoft Outage Forces Industry to Reflect on “Fragile and Interconnected Systems”

    The Microsoft outage, which disrupted operations around the world, including flights, hospitals, financial institutions and corporate offices, has forced the industry to reflect on the reliability of digital systems. The outage, which started on 19 July 2024, is yet to be fully resolved, with various airports operating manually in the wake of the “blue screen of death” errors it triggered.

    Both CrowdStrike and Microsoft have acknowledged the issue and promised to resolve it soon. “Yesterday, CrowdStrike released an update that began impacting IT systems globally. We are aware of this issue and are working closely with CrowdStrike and across the industry to provide customers technical guidance and support to safely bring their systems back online,” said Satya Nadella, CEO, Microsoft.

    CrowdStrike, on the other hand, while confirming that the Microsoft outage was not a cyber attack, issued an apology to everyone impacted. “I want to sincerely apologize directly to all of you for today’s outage. All of CrowdStrike understands the gravity and impact of the situation. We quickly identified the issue and deployed a fix, allowing us to focus diligently on restoring customer systems as our highest priority. The outage was caused by a defect found in a Falcon content update for Windows hosts. Mac and Linux hosts are not impacted. This was not a cyberattack,” said George Kurtz, founder and CEO, CrowdStrike.

    Industry Reacts to Microsoft Outage

    The incident has prompted a range of industry reactions urging organisations across the world to examine the reliability of their systems. “The outages represents how fragile and interconnected our systems are. Companies like MSFT have great practices, and the fact that a bug passes through its process is unfortunate. It reiterates the need for good practices of testing before releasing new software to production systems,” said Srirang Srikantha, Founder and CEO, Yethi Consulting.

    In the same vein, Preeti Singh, Director of Information Security and GRC, OSTTRA, says that the Microsoft outage has exposed the fragility of digital ecosystems. “This incident highlighted how heavily reliant our entire digital ecosystem is on a handful of service providers. It’s clear that relying on a single type of operating system or security tool poses significant risks. To build resilience, we must adopt strategies that diversify our technological dependencies,” she said.

    She further adds: “The incident also underscores the importance of basic risk management practices. Had there been thorough testing and a contingency plan in place, the impact could have been mitigated. It’s a stark reminder to always verify updates and maintain transparent communication with stakeholders.”

    Preeti Singh also highlighted the need to enhance incident response capabilities within organisations. “Going forward, we must prioritize redundancy in our systems, rigorously test updates before deployment, and enhance our incident response capabilities. This approach not only improves security but also ensures high availability and quick recovery from system failures,” she advises.

    Concentration and Consolidation Create Risks like Microsoft Outage

    Lina Khan, Chair of the Federal Trade Commission, said the incident reveals how concentration can create fragile systems. “All too often these days, a single glitch results in a system-wide outage, affecting industries from healthcare and airlines to banks and auto-dealers. Millions of people and businesses pay the price. These incidents reveal how concentration can create fragile systems,” she said in a series of posts on X.

    She further adds: “Concentrating production can concentrate risk, so that a single natural disaster or disruption has cascading effects. This fragility has contributed to shortages in areas ranging from IV bags to infant formula,” while cautioning that industries may also lack resilience when it comes to cloud computing, as most of the industry relies on a handful of cloud providers, and consolidation of this sort may create single points of failure.

    How to Recover from the Microsoft Outage

    Microsoft has created a dedicated blog to help those affected by the issue recover quickly. “We have noticed that some Azure VMs are successfully updating via the CrowdStrike Falcon agent after multiple manual Virtual Machine restarts,” says Microsoft, adding that customers can attempt to do so as follows:

    • Using the Azure Portal – attempting ‘Restart’ on affected VMs.
    • Using the Azure CLI or Azure Shell.
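For readers taking the Azure CLI route, the repeated restarts could be scripted roughly as follows. This is only a minimal sketch, not Microsoft's published procedure: it assumes the Azure CLI is installed and already logged in, the helper names `build_restart_command` and `restart_repeatedly` are hypothetical, and the default of three attempts is an arbitrary choice, since Microsoft says only that "multiple" restarts were needed. The script does not check whether the CrowdStrike Falcon agent actually picked up the fix; that still has to be verified on the VM itself.

```python
import subprocess
from typing import Callable, List


def build_restart_command(resource_group: str, vm_name: str) -> List[str]:
    """Return the Azure CLI argument list that restarts a single VM."""
    return [
        "az", "vm", "restart",
        "--resource-group", resource_group,
        "--name", vm_name,
    ]


def _run(cmd: List[str]) -> int:
    """Run the command and return its exit code (0 means the CLI call succeeded)."""
    return subprocess.run(cmd, check=False).returncode


def restart_repeatedly(
    resource_group: str,
    vm_name: str,
    attempts: int = 3,  # assumption: the number of restarts is not specified by Microsoft
    runner: Callable[[List[str]], int] = _run,
) -> List[int]:
    """Issue `attempts` restarts one after another and collect the exit codes."""
    cmd = build_restart_command(resource_group, vm_name)
    return [runner(cmd) for _ in range(attempts)]


# Example call with hypothetical resource names:
# restart_repeatedly("my-resource-group", "my-affected-vm", attempts=3)
```

The `runner` parameter exists so the loop can be exercised without touching a real subscription; in normal use the default simply shells out to `az vm restart`.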
