It’s hard to look anywhere without seeing reference to the CrowdStrike/Microsoft disaster that is still causing issues around the globe. There are plenty of plaudits for the way that both CrowdStrike and Microsoft have handled the fallout and remediation (https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/), but you can’t escape the conclusion that it shouldn’t have happened in the first place. Something clearly went wrong in the processes either in place, or worse, not in place, to make sure that software releases are thoroughly tested before release. I also read somewhere that there had been a previous problem with CrowdStrike software releases which affected at least two Linux distributions, but that this went largely unnoticed. I suppose the predominance of Windows machines in the marketplace would make it impossible to hide a problem of this magnitude.
All that said, what is clear is that there was nothing an organisation using this application could have done itself to prevent it, nor could most disaster recovery plans have dealt with it successfully. The remediation has to come from both CrowdStrike and Microsoft, and it is.
I wrote a piece recently which covered the difference between disaster recovery and business continuity planning (https://hah2.co.uk/what-are-the-questions-business-owners-ask-when-considering-cyber-security/). Disaster Recovery focuses specifically on restoring IT infrastructure and data after a disaster has occurred, and as already pointed out, in this case the fix had to come from outside the affected organisations; there was very little they could do themselves.
Business Continuity refers to the proactive strategies and plans put in place to ensure that essential business functions can continue in the event of a disruption or disaster. This is where organisations can help themselves. Of course, all we really see on the news is the effect of systems crashing; it’s what makes good television. They don’t show the organisations that had good business continuity plans in place and could continue to operate, albeit with reduced functionality.
What struck me, watching it all unfold, was that some big organisations were caught completely on the hop. We saw airline staff reverting to manual ticketing, but the overall impression was that this was being done on the initiative of individuals and onsite managers; it didn’t seem to be part of any coherent plan. We saw the same type of issues in the UK NHS and GP surgeries. If there really was a coherent plan in place, I apologise for suggesting otherwise, but it certainly didn’t look like it. Those two examples are the really big ones that hit the news; there were quite literally hundreds of organisations that were hit and struggled badly.
When I started out in the cyber security game, disaster recovery and business continuity planning were absolute must-haves; in fact, as we know, you can’t achieve ISO 27001 certification without them. These days I see very little emphasis being put on this. Have we reached a stage of total reliance on technology and tech giants like CrowdStrike and Microsoft, falling into a complacency that relies on our suppliers to look after us? If we have, this incident shows what a big mistake that is. A great saying is that you can outsource your IT, but you can’t outsource your responsibility.
Which leads us neatly on to another point: supply chain security. We talk a lot about making sure our supply chain is as robust as our own systems, with good security and good policies and processes. But this incident shows that we need to go further than that. We can’t simply trust that any software we install will work and not cause problems; we need to ask how rigorous the supplier’s testing is, who signs off on a release, how it is released and by whom, and what tests were done before release. These are perfectly valid questions, and any software supplier worth their salt should have good answers to them. Have any of you ever asked?
As a provider of protective monitoring solutions which require a light-touch agent to be installed on systems, albeit on a much smaller scale than CrowdStrike, this has given me considerable pause for thought. I have already had these discussions with my supply chain and got good answers, but I’m not going to take my foot off the gas and will keep asking before agent upgrades, which, admittedly, don’t happen often. But there will be a certain nervousness in the future when they do.