CIO Network: Architecture Resilience | BCS

With the IT industry still in the shadow of the recent global CrowdStrike outage, the BCS CIO Network met to discuss service resilience. Speakers and guests used the incident – which brought millions of Windows-based computers to their knees – as a starting point to explore why IT systems fail. They also looked at the technical, cultural and leadership changes, along with greater team diversity, that could make systems more robust and better able to withstand cyber attacks and incidents.

Following the event, keynote speakers appeared in a special edition of The jewel of all mechanisms podcast to discuss resilience in architecture.

The resilience of architecture

If there was one key takeaway from the event, it was this: the days when the consequences of cyber incidents were confined to the digital realm are long gone. Software is now integrally woven into the fabric of everyday life, and if software fails – or is slow to recover from a failure – the consequences can be devastating.

Speaking at the event and opening the first fireside chat, Alastair Revell FBCS, Chair of BCS, said: “The public is starting to wake up to the need for trust in the systems we depend on. Having confidence in critical national infrastructure is incredibly important… We’re also thinking about accountability – what accountability do people have at board level?”

Alastair looked broadly at the occasions on which IT has hit the headlines for all the wrong reasons. Beyond the CrowdStrike outage – which shut down airports, disrupted hospital admissions and stifled e-commerce – he pointed to the Post Office Horizon scandal as a prime example of how failures in critical software can lead to dire real-world consequences.

Alastair said that when critical systems fail and lives are affected, there is often a rush to blame IT teams. He observed, “A critical part of the solution, I think, is for boards to be culpable, responsible and accountable for what their organizations put into software.”

Against this backdrop, the event’s guest speakers – active CIOs and board members – explored how successful organizations can build strategies to prevent, survive and recover from critical IT failures. Much of the focus was on practical ways that boards can work productively with CIOs and how CIOs and IT departments can support their governance committees.

The CIO Network meeting was busy, wide-ranging and highly interactive. It also provided many opinions, ideas and perspectives. They fell into three main categories:

  • The case for professionalism
  • Practical advice for boards
  • Useful tips for CIOs

In short, the CIO Network focused on assessing digital risk and the roles of the board and CIO in managing these issues.

The case for professionalism

Given the increasing criticality of IT in business and the public’s daily life, the group discussed the case for professionalizing IT.

“It’s something I’m very passionate about,” said Alastair, pointing to digital health and social care as evidence of the importance of technology in maintaining public safety.

Another speaker, supporting this point, discussed the impact of the June 2024 ransomware attack on a pathology laboratory responsible for processing blood on behalf of NHS organizations in London.

The fallout from the attack rippled through the NHS and eventually led to hospitals in the capital canceling operations due to a lack of blood. This shortage eventually sparked a national appeal for UK blood donors to come forward. A Russian cybercriminal group was believed to be behind the attack.

In this context, the panelists and speakers at the event asked why IT systems can be designed and built by people with few specific IT qualifications. By contrast, hiring an inspector, surgeon, lawyer or accountant without checking that they are properly validated and certified would be unthinkable.

Experts believed that a licensed IT professional should manage the most critical IT projects, where public safety, trust and taxpayers' money are at greatest risk.

It was hoped that one day, parents would be proud to say that their children are Chartered IT Professionals in the same way that they might be if their child becomes a Chartered Accountant or, indeed, a physical architect.

Information for CIOs

It was felt that CIOs should prioritize understanding and mitigating the risks associated with their digital infrastructure. Specifically, they should analyze the dependencies between connected systems and ensure that failsafes are in place. A significant theme was the integration of cybersecurity considerations into every aspect of IT operations, particularly regarding potential vulnerabilities in third-party software.

Modern apps place a lot of emphasis on mapping and understanding user journeys and experiences. However, the panelists felt that this focus on understanding the critical steps in software operation must extend well beyond the presentation and interaction layers.

Instead, they felt that a “golden process” needed to be understood, appreciated, drawn and documented. Organizations need to clearly understand the connections between systems and architecture that combine to enable a service. Without this understanding, making a clear connection between technology and business processes is challenging.
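
One way to make such a map concrete is to record which systems each service depends on and then ask what would be affected if a single component failed. The sketch below is illustrative only: the service names and the `impacted_by` helper are hypothetical and do not come from the event.

```python
from collections import deque

# Hypothetical dependency map: each service lists what it depends on directly.
DEPENDS_ON = {
    "patient_booking": ["identity", "scheduling_db"],
    "blood_test_results": ["pathology_lab_feed", "identity"],
    "identity": ["endpoint_security_agent"],
    "scheduling_db": [],
    "pathology_lab_feed": [],
    "endpoint_security_agent": [],
}


def impacted_by(failed, depends_on):
    """Return every service that depends, directly or indirectly, on `failed`."""
    # Invert the map so we can walk from a component to its dependents.
    dependents = {name: set() for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk over the inverted graph.
    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


if __name__ == "__main__":
    print(sorted(impacted_by("endpoint_security_agent", DEPENDS_ON)))
```

In this toy map, a failure in a single third-party agent reaches both user-facing services through the shared identity system, which is the kind of hidden connection the panelists wanted documented.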

CIOs were encouraged to adopt rigorous governance frameworks, including processes such as technical and business design authorities, to ensure that all technology decisions are well structured and documented. Another vital recommendation was for CIOs to proactively encourage root cause analysis when IT failures occur, take stock, identify risks early and track those risks with indicators to guide decision-making.

The importance of clear and structured communication with non-technical leaders was also highlighted. CIOs should be able to translate complex IT risks and strategies into a language that executives and other decision makers can easily understand. This includes presenting risks in a way that shows how they will be mitigated and how those mitigations align with overall business objectives.

A recurring theme was how technical risk is communicated to boards of directors, whose members are generally non-technical. Boards naturally speak and understand the language of profit and loss statements. As such, it was felt that there was a niche that could be filled for IT team members who could present cyber risks to boards in ways they would naturally understand, appreciate and then prioritize.
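
One common way to express a technical risk in financial terms is annualized loss expectancy: the estimated cost of a single incident multiplied by how often it is expected per year. The sketch below assumes that approach, which was not something the speakers prescribed, and all figures and function names are hypothetical.

```python
def annualized_loss_expectancy(single_loss, annual_rate):
    """Expected yearly loss: cost of one incident times incidents per year."""
    return single_loss * annual_rate


def mitigation_net_benefit(ale_before, ale_after, annual_cost):
    """Yearly loss avoided by a mitigation, minus what the mitigation costs."""
    return (ale_before - ale_after) - annual_cost


if __name__ == "__main__":
    # Hypothetical figures: a 2,000,000 outage expected once every ten years,
    # reduced to once every fifty years by a control costing 50,000 a year.
    before = annualized_loss_expectancy(2_000_000, 1 / 10)
    after = annualized_loss_expectancy(2_000_000, 1 / 50)
    print(f"Expected yearly loss before: {before:,.0f}")
    print(f"Expected yearly loss after:  {after:,.0f}")
    print(f"Net yearly benefit:          {mitigation_net_benefit(before, after, 50_000):,.0f}")
```

The inputs are estimates, so the output is best presented to a board as a range or an order of magnitude rather than a precise forecast.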

Other vital suggestions for CIOs included:

  • Focus on component uptime: move beyond measuring overall server uptime and focus on collecting performance data about critical modules and components that impact user experience
  • Balance agility with security: integrate security and resiliency into agile development processes to avoid compromising long-term stability
  • Address technical debt: prioritize resolving technical debt proactively to prevent future system vulnerabilities and inefficiencies
  • Encourage transparent risk sharing: create an environment where teams feel safe escalating risks and challenges early, encouraging collaborative problem solving
  • Invest in continuing education: ensure ongoing training and upskilling of teams to keep pace with technological advances and cyber security needs. This is where SFIAplus and RoleModelplus can prove useful
  • Implement metrics beyond hardware: move from traditional hardware-based metrics to user-centric measures that reflect actual service availability and quality
  • Adopt proactive risk management: continuously identify and mitigate emerging risks, ensuring the resilience of IT systems
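
The difference between component uptime and user-centric availability can be illustrated with a small calculation. The sketch below assumes per-minute health samples for each component and treats a user journey as available only in minutes when every component it touches is up. The component names, journeys and sample data are hypothetical.

```python
# Hypothetical per-minute health samples: True means the component was up.
SAMPLES = {
    "web_frontend": [True, True, True, True, True, True, True, True, True, True],
    "payments_api": [True, True, False, False, True, True, True, True, True, True],
    "database": [True, True, True, True, True, True, True, False, True, True],
}

# Hypothetical user journeys and the components each one relies on.
JOURNEYS = {
    "browse": ["web_frontend", "database"],
    "checkout": ["web_frontend", "payments_api", "database"],
}


def component_availability(samples):
    """Fraction of sampled minutes in which one component was up."""
    return sum(samples) / len(samples)


def journey_availability(components, samples):
    """Fraction of minutes in which every component of a journey was up."""
    per_minute = zip(*(samples[name] for name in components))
    usable = [all(minute) for minute in per_minute]
    return sum(usable) / len(usable)


if __name__ == "__main__":
    for name, series in SAMPLES.items():
        print(f"{name}: {component_availability(series):.0%}")
    for name, components in JOURNEYS.items():
        print(f"{name} journey: {journey_availability(components, SAMPLES):.0%}")
```

With this sample data no single component is below 80%, yet the checkout journey is usable only 70% of the time, because outages in different components do not overlap.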

Findings for boards

Key takeaways and recurring themes for boards and executives underscore the need for increased accountability, deeper engagement with IT systems, and more robust governance over technology decisions. Boards must accept greater responsibility for IT risks, moving beyond traditional financial oversight to ensure the resilience and security of digital infrastructure. This includes promoting a better understanding of the risks that come with IT systems, particularly around service continuity and potential disruptions.

Experts felt that board members should ensure they are well-informed about IT risks and require clear communication from CIOs and technical leaders to help them understand the implications of new technologies or system changes.

A recurring theme was the need for boards to engage more actively with the technical architecture of their organizations. This means asking the right questions about systems resilience and security, and not relying solely on the IT department for this information.

Training and certification are also essential. Boards should be encouraged to ensure that IT professionals, including system architects, are properly certified, similarly to other professions requiring formal qualifications.

This also applies to board members, who should receive training to better understand the technical dimensions of their oversight.

The recurring message is that boards need to take a proactive, informed and strategic role in IT risk management to ensure organizational resilience.

Summing up

The CIO Network event ended with the expert guests breaking into smaller groups and discussing their main takeaways.

Looking specifically at the CrowdStrike incident, experts felt that this specific software failure was unlikely to change attitudes and approaches to risk. That’s because organizations expect critical software systems to fail or be unavailable – it’s a reality.

