Cultural resilience key to dealing with crisis

Australian companies are fostering internal “cultural resilience” with extensive training, by testing staff on potential outage scenarios and by using past incidents as learning opportunities. Massive disruptions in recent years have exposed systemic weaknesses in digital platforms, and the government, clients, consumers and shareholders now expect corporate Australia to take action to repel, reduce and rapidly recover from these outages.

APRA’s incoming CPS 230 regulation, which takes effect on July 1, 2025, requires an APRA-regulated financial sector entity to “be able to continue to deliver its critical operations within tolerance levels through severe disruptions with a credible business continuity plan”.

Jane Stanton, partner for risk consulting at professional services company Grant Thornton, says cultural resilience entails a company setting up processes and systems to maintain operational resilience and assist with rapid recovery in case of an incident. But it’s not simply “set and forget” – rapid recovery frameworks require constant assessment of the controls that are in place to ensure they remain effective.

Companies should understand the potential magnitude of possible outages, she says. “If the risk actually eventuates, and you do have a risk event, how long will it take to restore those critical operations so that you reduce the impact on the customer, so you continue to service the customer and provide a core business to them?”

She adds that companies can test plausible scenarios against a tolerance of X number of hours for recovery. “But also, this component in the [CPS 230] standard around severe and plausible scenarios, and that’s really saying test what you’re doing to make sure that you set a tolerance of X number of hours,” she says. “Will that actually work in practice? And does everyone actually understand what their role is in that?”

A new white paper by AFR Intelligence, in partnership with PagerDuty, Ahead of the curve: the challenges and opportunities of shifting to proactive operational resilience, says a bad outage can “catalyse a cascade of damaging impacts” but can also be reframed as an opportunity for strategic learning.

A disruption, the paper says, can be “a break from business as usual: these are moments where teams can embark on structured reviews, identify root causes and chart actionable road maps”.

By avoiding the kneejerk reaction of assigning blame, organisations can encourage a “culture of continuous improvement”, the paper adds, and “in the long run, that approach can help knit stakeholders closer together, fostering a strong sense of teamwork and accountability”.

Cloudflare, a US internet infrastructure and website security company, had a huge disruption in 2024. The CEO accepted blame and published a detailed blog post outlining what had happened and apologising. Clients were kept informed, dampening the backlash.

“They weren’t shy about it, they didn’t deflect the blame,” says Shashank Kaul, chief technology officer at Webjet OTA, in the white paper. “Instead, they owned it, improved their systems and moved on.”

Stanton says training and awareness are an important line of defence for all organisations, including APRA-regulated financial sector entities.

Some organisations send out test emails to determine whether employees recognise them as something that should be blocked or reported.

Others use detailed scenarios to test organisational responses at every level: to ensure everyone fully understands their responsibilities in the event of an outage, to make sure the chain of communication operates as intended, and to determine tolerances around critical operations.

Root cause analysis

“So you know that if your system goes down, you’ll still be able to communicate with customers within a certain time frame,” Stanton says. “Particularly with some of the data breaches, you have to communicate with your customers.”

Looming natural disasters, such as Cyclone Alfred, which hammered southern Queensland and northern NSW with huge seas, torrential rain and gusty winds in early March, trigger waves of system testing to find holes and streamline processes before a potential outage occurs.

Australia knew Alfred was coming for days in advance, and there were early forecasts of flooding rains and potentially damaging winds.

When an outage does happen, Stanton says, a “root cause analysis” is critical to determine what exactly failed from a management perspective and how it can be prevented from happening again.

In the months since APRA released its finalised Prudential Standard CPS 230 in July 2023, aimed at “ensuring banks, insurers and superannuation trustees can better manage operational risks and respond to business disruptions”, the Australian financial sector has worked hard to achieve compliance, Stanton says.

“The level of awareness around what the requirements are and what it actually means in practice, I think it’s very high,” she says. “I think for some organisations, it’s been a little bit of an eye-opener.”

Organisations often talk about putting the customer first, Stanton says, but CPS 230 requires them to ensure they can meet set tolerances for servicing customers at difficult times, and “that’s a test that maybe hasn’t necessarily been applied in that way before”.

Australian Financial Review