Implementing Chaos Engineering & Continuous Compliance for Financial Services

The Simian Army was originally developed by Netflix as a set of tools to ensure that its video streaming service was always available to global customers without any service degradation (while ensuring compliance with all policies relating to security, conformity and cost). That goal may seem relatively straightforward. However, given the scale of its operations, it is anything but. To frame the challenge, consider some of the following statistics:

Netflix clearly faces requirements (in terms of scale and service quality) that would dwarf those posed by most financial applications. But financial institutions have other unique factors to consider – specifically, much more stringent policies and regulations governing access controls, information security and availability (for example, MAS 644 specifies a maximum total downtime of 4 hours in a 12-month period for all critical systems operated by banks).
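
To put the MAS 644 figure in perspective, the short calculation below converts a downtime budget into an availability percentage. It is an illustrative sketch only: the function name and the 365-day year are assumptions made here, not part of the regulation.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year, ignoring leap years


def availability_from_downtime(max_downtime_hours: float,
                               period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the minimum availability implied by a downtime budget."""
    return 1.0 - max_downtime_hours / period_hours


# 4 hours of permitted downtime in a 12-month period
mas_644_availability = availability_from_downtime(4)
print(f"{mas_644_availability:.4%}")  # prints 99.9543%
```

In other words, a 4-hour annual budget corresponds to roughly "three and a half nines" of availability for each critical system.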

This paper explains why the concepts introduced by the Simian Army are important to any financial institution adopting cloud services. It provides an overview of those concepts – specifically chaos engineering and continuous compliance – along with a more detailed explanation of relevant tools and guidance on how to implement them, both from a practical perspective and in terms of a suggested organisational model.

Why is the Simian Army Important to Financial Institutions?

Public cloud adoption by the financial services industry has lagged behind other sectors. Financial services firms are heavily regulated and subject to more stringent requirements relating to data privacy and security.

Applicable regulations, to name a few, include Dodd-Frank, FFIEC, PCI DSS, GLBA, SOX, USA Patriot Act, MAS TRM, MAS 644, HKMA TM-G and GDPR. Additionally, high-profile data leaks have tempered some of the appetite for hosting critical workloads and sensitive data in the cloud, emphasising the importance of controls and continuous compliance.

Operating in a cloud paradigm differs fundamentally from traditional modes of managing IT infrastructure and software. Cloud supports the creation, modification, and destruction of resources orders of magnitude faster than traditional systems. Cloud environments generally expect relatively high rates of component failure because they are built on large quantities of inexpensive, commodity components.

The growing use of public cloud services in the financial services industry therefore requires a rethink of some key aspects of application development, service management and support:

1) Availability

Public cloud services can experience a higher rate of component failure than traditional on-premises dedicated infrastructure. It is therefore vital that applications developed for the cloud are designed to expect and tolerate failure. Resilience needs to be architected into software. This core requirement has triggered several corresponding trends in software design, including adoption of microservices, a move from stateful to stateless architectures and a tendency to decouple data from applications.

Similarly, when it comes to service management, the ease with which cloud services can be provisioned enables applications to be rebuilt more easily and at regular intervals – ensuring system entropy (another potential cause of availability issues) can be reset.
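
The idea of designing for failure can be illustrated with a toy experiment: terminate one randomly chosen instance of a stateless service and check that requests are still served. This is a minimal in-memory sketch, not a real chaos tool; `ServicePool`, `run_chaos_experiment` and the instance names are hypothetical, and a real experiment would act on actual cloud resources.

```python
import random


class ServicePool:
    """Toy model of a stateless service running on interchangeable instances."""

    def __init__(self, instance_ids):
        self.healthy = set(instance_ids)

    def terminate(self, instance_id):
        # Simulate the loss of a single instance.
        self.healthy.discard(instance_id)

    def handle_request(self):
        # Any healthy instance can serve, because no instance holds unique state.
        if not self.healthy:
            raise RuntimeError("no healthy instances")
        return f"served by {min(self.healthy)}"


def run_chaos_experiment(pool, rng):
    """Terminate one random instance and report whether the service survived."""
    victim = rng.choice(sorted(pool.healthy))
    pool.terminate(victim)
    try:
        pool.handle_request()
        survived = True
    except RuntimeError:
        survived = False
    return victim, survived


pool = ServicePool(["i-a", "i-b", "i-c"])
victim, survived = run_chaos_experiment(pool, random.Random(42))
print(f"terminated {victim}; service survived: {survived}")
```

With three instances the service survives the loss of any one of them; running the same experiment against a single-instance pool reports a failure, which is exactly the kind of weakness such experiments are meant to expose early.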

2) Security

When it comes to information security, as well as identity and access management, the financial industry is subject to much more exacting standards than most other verticals. Although many financial institutions have grown comfortable with the use of Infrastructure as a Service (IaaS) by implementing an Infrastructure as Code (IaC) approach to define and enforce minimum security standards, the adoption of Platform as a Service (PaaS) has introduced greater complexity and new challenges.

The need to lock down all potential attack surfaces in an environment that has primarily been architected to be internet-based, open and multi-tenant requires continuous monitoring to ensure all security policies are properly implemented and do not change.
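
A continuous security check of this kind can be sketched as a function that evaluates a snapshot of resources against a small set of rules and reports every violation. The `StorageBucket` type, its fields and the two rules below are hypothetical examples chosen for illustration; a real tool would read resource configuration from the cloud provider and run on a schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageBucket:
    """Hypothetical snapshot of one storage resource's security settings."""
    name: str
    public_access: bool
    encrypted_at_rest: bool


def find_violations(buckets):
    """Return (resource name, reason) pairs for every breached rule."""
    violations = []
    for bucket in buckets:
        if bucket.public_access:
            violations.append((bucket.name, "public access enabled"))
        if not bucket.encrypted_at_rest:
            violations.append((bucket.name, "encryption at rest disabled"))
    return violations


snapshot = [
    StorageBucket("ledger-archive", public_access=False, encrypted_at_rest=True),
    StorageBucket("scratch-space", public_access=True, encrypted_at_rest=False),
]
for name, reason in find_violations(snapshot):
    print(f"{name}: {reason}")
```

Running the same check repeatedly against fresh snapshots is what turns a one-off audit into continuous monitoring: a setting that changes after deployment shows up in the next run.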

3) Cost

Cloud economics can be compelling when using appropriate software architectures, but they require good hygiene. Making resources easier to procure can lead to sprawl, so organisations will need to continuously monitor services to ensure they are making use of everything they procure. Equally, cloud resources are most cost-effective when software is architected appropriately, with modern architectures helping to reduce reliance on dedicated resources and ensure firms only pay for the CPU cycles necessary to support application processes.
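
One simple way to monitor for sprawl is to flag resources that have not been used for longer than an agreed threshold. The sketch below assumes a mapping of resource names to last-used timestamps is already available; the function name, the resource names and the 30-day threshold are illustrative assumptions rather than recommendations.

```python
from datetime import datetime, timedelta


def find_idle_resources(last_used_by_resource, now, max_idle=timedelta(days=30)):
    """Return the names of resources idle for longer than max_idle, sorted."""
    return sorted(
        name
        for name, last_used in last_used_by_resource.items()
        if now - last_used > max_idle
    )


usage = {
    "vm-reporting": datetime(2024, 1, 30),
    "vm-old-poc": datetime(2023, 11, 1),
}
idle = find_idle_resources(usage, now=datetime(2024, 1, 31))
print(idle)  # prints ['vm-old-poc']
```

Flagged resources would typically be reported to their owners before any clean-up, so that teams can confirm they are genuinely unused.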

4) Conformity

The move towards Agile & DevOps development methodologies has evolved in tandem with the adoption of cloud. These approaches encapsulate a crucial benefit that financial services are trying to unlock – enabling software development teams to innovate faster. However, as resources become easier to provision and application teams take on more responsibility for their own destiny, new risks need to be managed.

As more responsibility shifts to the application teams, it is vital that those teams are continuously monitored to ensure they conform with all relevant IT policies.


Download this whitepaper now to read the rest of these sections:

Introducing the Simian Army

Learn how tools like the Simian Army help organisations adapt to the cloud and minimise the risks associated with software defined environments.

Chaos Engineering

Discover the different techniques used in Chaos Engineering to provide a holistic set of capabilities for enforcing compliance in the Cloud.

Continuous Compliance

Continuous compliance tools have evolved to help address other key aspects of application design and service management that require a rethink in the cloud – namely, security, cost and conformity. Learn about the different techniques used for continuous compliance in the Cloud.

A Roadmap for Implementation

How to prepare developers and ITSM personnel alike for the Simian Army by applying a “shift left” philosophy and embedding chaos into the environments supporting the organisation’s software development and release pipeline.

Organisational Model – Who Should be Involved

Techniques you can use to bring chaos engineering and continuous compliance into your organisation. Doing so requires buy-in from cross-functional teams spanning multiple roles and responsibilities which, historically, have not collaborated particularly effectively.

Build versus Buy

Some key factors to consider when determining whether to buy, build or integrate open source toolsets in implementing chaos engineering and continuous compliance.

 


Authors

Ian Tivey

Associate Partner, New York

Ian has a broad background across DevOps and Infrastructure disciplines in the design, build and operation of globally-distributed market data distribution and trading platforms. He currently leads Citihub Consulting’s Cloud Practice, having worked with clients in Europe, Asia, and North America to design and build hybrid cloud solutions in highly regulated banking environments.

ian.tivey@citihub.com
Brett Aukburg

CTO & Associate Partner, New York

Brett is a senior technology professional with over 15 years of experience in IT engineering, operations, strategy, and architecture. He has broad technical expertise and is an experienced people manager. He has in-depth knowledge of enterprise-wide technologies and their integration in financial services.

brett.aukburg@citihub.com
Paul Jones

Associate Partner, London

Paul is an AWS, Google and Cloudera-certified IT consultant focused on topics such as software architecture, DevOps & CI/CD, Cloud and Big Data. He has strong software project delivery expertise, including Agile methodologies and product management. Recently, he led a multi-disciplinary team that delivered a bank-wide Data Lake platform serving regulatory and analytical requirements. Paul has 5 years' experience of running an international eTrading platform delivery, integration and support team, resulting in a deep understanding of the FX trade lifecycle and accompanying technologies.

paul.jones@citihub.com
Jim Oulton

Associate Partner, London

Jim is an accomplished distributed infrastructure specialist with almost 20 years’ experience with some of the most challenging application platform environments in financial services. He has enjoyed success in a variety of roles with Citihub Consulting, which have called on him to demonstrate a versatile blend of technical, commercial and organisational skills.

jim.oulton@citihub.com
Ming Zheng

Senior Consultant, New York

Ming is a senior software engineer with over 10 years of experience in IT engineering, strategy, and architecture. He has broad software development expertise in building applications on public cloud. In the past, he helped a startup company build a large-scale web application that serves millions of users. He also helped a leading US bank develop the Simian Army, a tool to keep cloud environments clean, secure, resilient and compliant.

ming.zheng@citihub.com