This is an unpopular opinion, and I get why – people crave a scapegoat. CrowdStrike undeniably pushed a faulty update demanding a low-level fix (booting into recovery). However, this incident lays bare the fragility of corporate IT, particularly for companies entrusted with vast amounts of sensitive personal information.

Robust disaster recovery plans, including automated processes to remotely reboot and remediate thousands of machines, aren’t revolutionary. They’re basic hygiene, especially when considering the potential consequences of a breach. Yet, this incident highlights a systemic failure across many organizations. While CrowdStrike erred, the real culprit is a culture of shortcuts and misplaced priorities within corporate IT.

Too often, companies throw millions at vendor contracts, lured by flashy promises, while neglecting the due diligence needed to ensure those solutions truly fit their needs. This is exacerbated by a corporate culture where CEOs, vice presidents, and managers are often more easily swayed by vendor kickbacks, gifts, and lavish trips than by innovative ideas with measurable outcomes.

This misguided approach not only results in bloated IT budgets but also leaves companies vulnerable to precisely the kind of disruptions caused by the CrowdStrike incident. When decision-makers prioritize personal gain over the long-term health and security of their IT infrastructure, it’s ultimately the customers and their data that suffer.

  • John Richard@lemmy.world (OP) · 4 months ago

    I upvoted because you actually posted technical discussion and details that are accurate. PXE and remote power management are the way. Many workstations already ship with out-of-band management in firmware (IPMI, or Intel AMT on vPro machines). I agree, though, that since these are remote endpoints, it can be more challenging. Still, a script that reboots endpoints into a recovery environment would be a basic step in any DR scenario. Mounting the OS partition to delete a file and reboot wouldn’t be a significant endeavor, although it’s one they’d need to make sure they got right. It would be hard to mess up for anyone with intermediate computer skills… and you’d hope these companies at least have someone trained to do that rather quickly. They’d have to spend more time writing up a CR (change request) explaining all the steps, and then joining a conference call with like 100 people with babies crying in the background… and managers insisting they remain on the call while they write the script.
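
For what it's worth, the recovery flow described in the comment above (force a PXE boot, mount the OS partition, delete the bad file, reboot) could be sketched roughly like this. This is only an illustration under assumptions: the directory and the `C-00000291*.sys` pattern are taken from CrowdStrike's publicly posted workaround rather than from this thread, the function names are made up, and the `ipmitool` commands are only built, never executed. It also assumes the Windows volume is already mounted by the recovery environment and that path case matches on that mount.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: location and name pattern of the faulty channel file, as given in
# CrowdStrike's public workaround. Verify against vendor guidance before use.
CROWDSTRIKE_DIR = Path("Windows/System32/drivers/CrowdStrike")
FAULTY_PATTERN = "C-00000291*.sys"


def ipmi_pxe_reboot_commands(host: str, user: str) -> list[list[str]]:
    """Build (but do not run) ipmitool commands that PXE-boot a machine.

    Hypothetical helper: the password is expected in the IPMI_PASSWORD
    environment variable, which is what ipmitool's -E flag reads.
    """
    base = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E"]
    return [
        base + ["chassis", "bootdev", "pxe"],    # next boot goes to PXE
        base + ["chassis", "power", "cycle"],    # then power-cycle the box
    ]


def remove_faulty_channel_files(mount_root: Path, dry_run: bool = True) -> list[Path]:
    """Find, and optionally delete, matching channel files under a mounted OS volume.

    Returns the matching paths. With dry_run=True (the default) nothing is
    deleted, so the result can be reviewed first.
    """
    driver_dir = mount_root / CROWDSTRIKE_DIR
    if not driver_dir.is_dir():
        return []  # sensor not installed here, or a different path layout
    matches = sorted(driver_dir.glob(FAULTY_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Defaulting to a dry run is deliberate: like the comment says, this is something you'd need to make sure you got right, so listing what would be deleted before deleting it is cheap insurance. BitLocker-encrypted volumes would need to be unlocked first, which this sketch doesn't attempt.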