
Beyond Your Firewall: Third-Party Risk, Testing, and the Human Side of Recovery

Written by Innovate | Jan 27, 2026 10:00:00 AM

Some of your biggest continuity risks sit with the cloud platforms and suppliers you depend on. This post shows how to pressure-test third parties, hardwire business continuity and disaster recovery (BC/DR) into governance, and prove your plans work before a real outage does.

Your Risk Does Not End at Your Firewall

Modern IT is a web of cloud platforms, SaaS products and managed service providers. Outsourcing can shift the day-to-day work, but it doesn't remove your accountability.

If you rely on these providers for critical services, you should be crystal clear on:

  • SLAs and uptime commitments
    What do they actually guarantee, and what happens when they fail?
  • The provider’s own BC/DR capabilities
    What are their recovery time objective (RTO) and recovery point objective (RPO), and how do they test them?
  • Data export and portability
    How quickly and in what format can you get your data out if you need to move or recover elsewhere?
  • Evidence of testing and certification
    For example, ISO 22301 for business continuity, ISO 27001 for information security.

A useful question to ask yourself:

“What would happen in our business if one major provider failed for 24 or 48 hours?”

If you don’t like the answer, you have work to do.
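
If it helps to make that question concrete, here is a minimal sketch in Python (with entirely hypothetical provider names, figures and a 24-hour tolerance, not a recommended tool) of the kind of simple register some teams keep for critical providers, flagging any whose stated recovery times would exceed the outage the business can absorb:

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    sla_uptime_pct: float     # contractual uptime commitment
    rto_hours: float          # provider's stated recovery time objective
    rpo_hours: float          # provider's stated recovery point objective
    data_export_format: str   # how you would get your data out
    certifications: list      # e.g. ISO 22301, ISO 27001

# Hypothetical examples -- replace with your own critical providers.
providers = [
    Provider("Cloud platform A", 99.9, rto_hours=4, rpo_hours=1,
             data_export_format="object storage export",
             certifications=["ISO 27001"]),
    Provider("Payroll SaaS B", 99.5, rto_hours=48, rpo_hours=24,
             data_export_format="CSV on request",
             certifications=[]),
]

# The question from above, in code: which providers could leave us
# down for longer than we can tolerate (here, an assumed 24 hours)?
MAX_TOLERABLE_OUTAGE_HOURS = 24

for p in providers:
    if p.rto_hours > MAX_TOLERABLE_OUTAGE_HOURS or not p.certifications:
        print(f"Review needed: {p.name} "
              f"(RTO {p.rto_hours}h, certifications: {p.certifications or 'none'})")
```

Even a lightweight record like this makes the gaps visible long before a contract renewal or a real incident forces the conversation.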

Putting BC and DR on the Leadership Agenda

Business continuity and disaster recovery are not “IT projects”. They are shared responsibilities.

A simple, workable ownership model looks like this:

  • Board or executive sponsor
    Often the COO, CIO or a risk executive. Owns the overall agenda.
  • Business owners
    Accountable for continuity requirements for their services and processes. They decide what “acceptable” looks like.
  • IT
    Accountable for designing, implementing and operating the technical continuity and DR solutions.
  • Risk and compliance
    Support with policy, assurance and reporting.

Then you integrate BC and DR into core governance:

  • The enterprise risk register
  • Investment cases and project approvals
  • Vendor selection and contract reviews

If BC/DR isn’t showing up in those mechanisms, it isn’t truly embedded.

Put Your Plan to the Test Before Reality Does It for You

Having a document called “BCP” or “DRP” is not the same as being prepared.

You don’t want the first real test of your plans to be an actual emergency.

Increase the maturity of your testing over time:

  1. Tabletop or walkthrough exercises
    Bring business and IT together. Walk through an incident scenario and talk about who would do what, when.
  2. Partial technical failover tests
    For example, fail over one critical application or service and measure how long it takes and what breaks.
  3. Full DR simulations or game days
    Switch key services to your DR environment, or simulate a cloud region outage. See what really happens.

While you test, pay close attention to:

  • How long recovery actually took compared to stated RTOs
  • Where manual workarounds did or did not work
  • Gaps in roles, communication and decision-making

Use those outcomes to drive budget and improvements. If a DR test shows you can’t meet your targets, that’s information you can act on.
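
To illustrate that comparison, the short sketch below (hypothetical services, RTOs and measured times, not a prescribed format) shows how even a plain test log makes the gap between stated targets and actual recovery hard to ignore:

```python
# Hypothetical DR test results: (service, stated RTO in hours, measured recovery in hours)
test_results = [
    ("Customer portal", 4, 6.5),
    ("Internal finance system", 24, 18.0),
    ("Order processing API", 2, 2.1),
]

print(f"{'Service':<25}{'RTO (h)':>10}{'Actual (h)':>12}{'Met?':>8}")
for service, rto, actual in test_results:
    met = "yes" if actual <= rto else "NO"
    print(f"{service:<25}{rto:>10}{actual:>12}{met:>8}")

# Services marked 'NO' are the ones where the test gives you evidence
# to ask for budget, a redesign, or a revised target.
```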

The Human and Operational Side: Not Just Servers and Scripts

You can have every technical control in place and still fail in practice if you ignore the human side.

Some uncomfortable but necessary questions:

  • Are key IT and business staff reachable and able to work during an incident?
  • Do you have clear, usable documentation: runbooks, contact lists, decision trees?
  • Are those documents stored somewhere that will be available in an outage, rather than on the very systems that have just gone dark?

Then there is communication.

Internal Communication

Staff need clear, timely instructions:

  • Do they keep working as normal?
  • Are they switching to manual processes?
  • Who is handling customer queries and how?

External Communication

  • Customers will want to know what is going on and when they can expect service to resume. Silence is rarely your friend.
  • Regulators will be concerned about service impact and data safety, especially in a cyber incident.
  • Media may show interest if your outage is big enough, and journalists move quickly when something has gone wrong. It's better to have a prepared line than to scramble in the moment.

Finally, training. Short, regular awareness sessions for leaders and key staff ensure people understand their role in an outage. A beautiful plan that no one remembers under stress is not worth much.

In the final post of this series, we'll look at how to measure resilience, present it to the board and turn it into a multi-year roadmap.