Navigating the responsibility of the dreaded software and firmware upgrade

Of late, it certainly appears that we are seeing more and more instances where firmware and software updates go horribly wrong – and it’s becoming a very real, business-impacting issue for many organisations.

There are many sound reasons for upgrading systems, and a plethora of information is available on why it is important to maintain them – feature enhancements, improved security, and support requirements all factor heavily into the case. In fact, the Essential 8, a guideline developed by the Australian Government to highlight eight critical areas of security for business, states that businesses should “regularly update and patch applications, including operating systems, to address vulnerabilities and reduce the risk of exploitation by attackers.” Clearly, there is no ambiguity around the importance of patching!

But what happens when patching and firmware updates break the very systems you’re trying to enhance or protect, often causing business-impacting outages? Why does it seem to be happening more frequently than ever before? And who should be accountable for the quality and stability of firmware and patching releases?

The importance of well-managed and well-maintained systems in supporting and securing your network cannot be stressed highly enough – but is it time for product vendors to take some accountability in this area?

When we purchase a product, we expect that the product developer, when releasing software and firmware updates, has done significant development and testing prior to release. We naturally expect that this testing covers compatibility with various systems and configurations, both for their own solution and for the most common third-party applications it integrates with. But far too often, the product developer lets us down. In their rush to push out system updates and stay relevant, it appears they pay too little attention to these critical steps.

It’s no secret that threat actors have been prolific in recent times, and this has placed greater pressure on developers to mitigate threats quickly by getting updates out in a timely manner. But there is a difference between timely and rushed, and oftentimes the haste with which updates are pushed causes us to question whether quality control has been sacrificed or diminished. Anyone who has worked in the information technology industry for several years has more than likely been impacted by a vendor update that went wrong, despite having detailed testing and recovery procedures in place. And even after a successful upgrade, bugs are often discovered soon afterwards, forcing another upgrade or, even worse, a downgrade of the solution.

As we have recently seen, Service Providers and MSPs are particularly exposed to this, as they have performance SLAs and are accountable for the impact of their services on their customers’ networks. The very act of updating has the potential to take a Service Provider or MSP out of business. But not upgrading could be even more devastating!

This then raises the question – do you have any rights when a vendor update goes wrong? Is there any recourse for business-impacting outages that cause real material harm to your business?

The answer is unfortunately more complex than you might expect. The area is not well regulated in terms of structured quality control (and this is a much wider conversation), so negligence is often difficult to establish. Other factors that can affect the outcome of an upgrade include how your network is designed and how you implement change.

In terms of accountability, there is perhaps now a good argument for a wider discussion around how vendors and their developers maintain quality control and security when developing bug fixes and firmware updates. Vendors vary drastically in terms of governance, and updates can often be rushed and poorly planned or tested. I would suggest greater governance in this area could go a long way towards helping users establish a clearer case for negligence. Even if that is not achievable, well-known governance, visible to purchasers, may be the very thing that influences buying decisions. Another consideration: while the knee-jerk reaction might be to blame the vendor, responsibility is often shared. Flaws in your change management process, a lack of prior testing, poor design in your network or systems, or the absence of a sound rollback plan may be the very reasons the upgrade fails.

When you are updating your systems, follow best practice implementation procedures. This includes testing the new release in a controlled or lab environment that mirrors production. A comprehensive backup and recovery plan is equally vital to mitigate any negative impacts. Having a Change Advisory Board, with both management and technical involvement, review and approve all changes is also imperative to a successful change. For more complex upgrades, build strong partnerships with your vendors – collaboration is essential during the upgrade process. Ensure you have established support channels, communicate any issues promptly, and consider even logging a pre-emptive support ticket.
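To make that checklist concrete, here is a minimal sketch in Python of the kind of pre-change gate such a process might enforce. It is illustrative only: the ChangeRequest fields, the readiness_blockers function and the specific checks are assumptions for the example, not a prescribed tool or standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeRequest:
    """Hypothetical record of a planned firmware or software upgrade."""
    description: str
    lab_tested: bool = False             # release validated in a lab that mirrors production
    backup_verified: bool = False        # recent backup exists and a restore has been tested
    rollback_plan: bool = False          # documented steps to return to the previous version
    cab_approved: bool = False           # Change Advisory Board sign-off (management and technical)
    vendor_ticket: Optional[str] = None  # pre-emptive vendor support case reference, if any


def readiness_blockers(change: ChangeRequest) -> list[str]:
    """Return the unmet prerequisites; an empty list means the change may proceed."""
    blockers = []
    if not change.lab_tested:
        blockers.append("Release has not been tested in a lab mirroring production")
    if not change.backup_verified:
        blockers.append("Backup and recovery plan has not been verified")
    if not change.rollback_plan:
        blockers.append("No documented rollback plan")
    if not change.cab_approved:
        blockers.append("Change Advisory Board approval is missing")
    if change.vendor_ticket is None:
        blockers.append("No pre-emptive vendor support ticket logged (recommended)")
    return blockers


if __name__ == "__main__":
    change = ChangeRequest(description="Firewall firmware upgrade", lab_tested=True)
    for issue in readiness_blockers(change):
        print("BLOCKER:", issue)
```

In practice the same gate would usually live in a ticketing or change-management system; the point is simply that the prerequisites are explicit and checked before the change window opens.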

As an organisation, it isn’t advisable to be the pioneer when it comes to upgrading systems, unless it’s absolutely necessary. Of course, you also want to ensure your updates are timely – so, when is the right time to update?

This will vary according to criticality. Each update should be carefully considered, and its importance understood and discussed internally by your business’s decision makers. If the update isn’t critical, you can often hold off; once several organisations have already been through the process, you and the vendor will be aware of the potential risks based on what others have experienced. A rule we follow, and one worth considering, is the n-1 policy. If n is the latest version of the software, n-1 is the previous version. Although you should be testing the latest software in your lab environment, operate on the previous version in production. Provided the software is not end of support, vendors will generally provide security updates for both. The reason for moving to a new version of software is often functionality, so ask yourself: is this something your business must have right now?
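As a simple illustration of the n-1 idea, the sketch below selects the latest release for the lab and the previous release for production, provided the previous release has not passed its end-of-support date. The Release structure, version strings and support dates are all hypothetical, purely for illustration.

```python
from datetime import date
from typing import NamedTuple


class Release(NamedTuple):
    version: str          # illustrative version string, e.g. "10.2"
    end_of_support: date  # illustrative support cut-off published by the vendor


def plan_versions(releases: list[Release], today: date) -> tuple[str, str]:
    """Return (lab_version, production_version) under an n-1 policy.

    The newest release (n) goes to the lab; the previous release (n-1) runs in
    production while it is still supported, otherwise production also takes n.
    """
    # Assumes simple version strings that sort correctly as text (e.g. "10.1" < "10.2").
    ordered = sorted(releases, key=lambda r: r.version)
    lab = ordered[-1].version                        # n: latest release, tested in the lab
    prev = ordered[-2] if len(ordered) > 1 else None
    if prev is not None and prev.end_of_support > today:
        return lab, prev.version                     # n-1 still supported: run it in production
    return lab, lab                                  # no supported n-1: production moves to n


if __name__ == "__main__":
    releases = [
        Release("10.1", date(2026, 6, 30)),
        Release("10.2", date(2027, 6, 30)),
    ]
    lab, prod = plan_versions(releases, date.today())
    print(f"Lab runs {lab}, production runs {prod}")
```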

However, in the case of a critical security update, you simply may not have the benefit of time. In these situations, if you are unable to put adequate protections in place, you may have to do the update sooner rather than later. But even then, understand the risk versus the reward; having new updates running in your lab environment can help mitigate that risk. What you should take away from this blog is that the responsibility for the impact of software and firmware upgrades is a shared endeavour. Preparedness is key regardless, as it makes for a smoother deployment. Where you can, test the update in a lab environment or on a subsection of your environment before going all in.
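Where a staged approach is possible, the sketch below shows one simple way to carve out a small “canary” subsection of devices to receive the update first, with the rest of the fleet following only after an agreed soak period with no issues. The host names, the 10% fraction and the fixed seed are assumptions chosen for the example, not recommended values.

```python
import random


def split_canary(hosts: list[str], canary_fraction: float = 0.1,
                 seed: int = 42) -> tuple[list[str], list[str]]:
    """Split a fleet into a small canary group (updated first) and the remainder.

    The canary group receives the update ahead of everyone else; the remainder
    follows only after the canary systems have run cleanly for a soak period.
    """
    rng = random.Random(seed)    # fixed seed keeps the split reproducible across runs
    shuffled = list(hosts)
    rng.shuffle(shuffled)
    cut = max(1, int(len(shuffled) * canary_fraction))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    fleet = [f"branch-fw-{i:02d}" for i in range(1, 21)]   # hypothetical device names
    canary, remainder = split_canary(fleet)
    print("Update first (canary):", canary)
    print("Update after soak period:", remainder)
```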

Michael Demery – Managing Director – Seccom Global