John Bartlett CBCI, DBCI
Many organisations I talk to fail to adopt
appropriate Business Continuity (BC) strategies. They either plan for scenarios
that are unlikely, decide on an approach that is unachievable or fail to align
BC strategies with other strategic initiatives. Identifying, implementing and
maintaining appropriate BC strategies will determine how successful (or not)
your organisation is when responding to a major or localised incident.
The principle behind identifying the appropriate BC
strategies is one of synergy and practicality. BC strategies should be aligned
and integrated with other strategies (such as business, product/service, IT and
premises strategies) and be capable of meeting the organisation's requirements. These
requirements are collected in the Business Impact Analysis (BIA) and
Continuity Requirements Analysis (CRA).
Scenarios
Organisations often believe they should plan for
individual scenarios (such as fire, flood, flu pandemic, power outage, etc.).
This often provides little benefit: it is time-consuming, and it is
impossible to plan for all eventualities. Instead, time and effort are
better spent planning for the consequences of such scenarios (such as loss or
unavailability of people, infrastructure components, suppliers, information or
a combination of these).
Strategies
Information collected in the BIA and CRA for the
individual products and services is required to identify suitable strategies
for each. This will include how quickly the product or service needs to be
recovered (the Recovery Time Objective or RTO), the target point in time for acceptable
data loss (the Recovery Point Objective or RPO) and the maximum tolerable
limits for these before the organisation suffers irreparable damage, known as
the Maximum Tolerable Period of Disruption (MTPD) and Maximum Tolerable Data
Loss (MTDL).
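As a rough illustration of how these four figures relate, the sketch below records them for each product/service and flags any target that sits outside its maximum tolerable limit. The class name, field names and figures are all hypothetical, and expressing everything in hours is simply an assumption made to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class ContinuityRequirement:
    """Recovery targets and limits for one product/service, in hours."""
    name: str
    rto_hours: float   # Recovery Time Objective
    rpo_hours: float   # Recovery Point Objective
    mtpd_hours: float  # Maximum Tolerable Period of Disruption
    mtdl_hours: float  # Maximum Tolerable Data Loss

    def problems(self) -> list[str]:
        """Return any targets that exceed their maximum tolerable limit."""
        issues = []
        if self.rto_hours > self.mtpd_hours:
            issues.append(f"{self.name}: RTO exceeds MTPD")
        if self.rpo_hours > self.mtdl_hours:
            issues.append(f"{self.name}: RPO exceeds MTDL")
        return issues


# Illustrative figures only; real values come from the BIA and CRA.
payroll = ContinuityRequirement(
    "Payroll", rto_hours=48, rpo_hours=24, mtpd_hours=120, mtdl_hours=24
)
web_orders = ContinuityRequirement(
    "Web orders", rto_hours=8, rpo_hours=4, mtpd_hours=6, mtdl_hours=2
)

for requirement in (payroll, web_orders):
    print(requirement.problems())
```

An empty list means the targets sit within the limits; anything else suggests the RTO/RPO or the limits need revisiting before a strategy is chosen.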
Strategies to be considered for essential
activities and products/services include:
Diverse site – carry out activities for the product/service at more than one site. When an incident affects one of the sites, the other can be used to conduct essential activities for that product/service;
Replication – replicate the capability to another site (e.g. a third-party site or another office of the organisation) so it is ready to use, but only use that other site when the main site is affected by an incident;
Standby facilities – have facilities available at another location which are only activated, set up and made available when an incident affects the main site;
Subcontracting work – use a third party to carry out some or all activities for a product or service when an incident occurs, or use multiple suppliers during normal operation (dual supply) in case one fails, for example for manufacturing or a call centre (these arrangements would normally be put in place before an incident occurs);
Post-incident acquisition – purchase equipment, facilities or an alternate site after the main site and activities have been affected by the incident (a shopping list and potential suppliers should be available in advance of an incident);
Insurance – provides financial compensation for loss of assets, business interruption and death/injury, although this would normally only be considered alongside other strategies;
Do nothing – where the RTO/RPO is a considerable period of time (e.g. a month or two), it may be practical to decide on the strategy after the incident.
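One way to picture how the RTO narrows the list above is to note, for each option, roughly how long it takes to become operational and what it costs to keep available, then keep only the options that fit within the RTO. The sketch below does this with placeholder figures; the names `StrategyOption` and `shortlist`, the hours and the costs are all assumptions for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrategyOption:
    name: str
    hours_to_operational: float  # assumed time before activities can resume
    annual_cost: float           # assumed cost of keeping the option available


def shortlist(options, rto_hours):
    """Return the options that can meet the RTO, cheapest first."""
    viable = [o for o in options if o.hours_to_operational <= rto_hours]
    return sorted(viable, key=lambda o: o.annual_cost)


# Placeholder figures; real ones come from the BIA/CRA and supplier quotes.
options = [
    StrategyOption("Diverse site", 0, 100_000),
    StrategyOption("Replication", 4, 60_000),
    StrategyOption("Standby facilities", 48, 20_000),
    StrategyOption("Post-incident acquisition", 336, 1_000),
]

print([o.name for o in shortlist(options, rto_hours=24)])
# ['Replication', 'Diverse site']
print([o.name for o in shortlist(options, rto_hours=72)])
# ['Standby facilities', 'Replication', 'Diverse site']
```

With these made-up numbers, relaxing the RTO from 24 to 72 hours brings a much cheaper option into play, which is the kind of trade-off between speed and cost that needs to be weighed for each product/service.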
The purpose of defining the tactical responses is to identify what needs to be done to implement the chosen strategies for each product/service. Appropriate tactics will need to be chosen to cover the core requirements relating to:
People – Quantity, skills and knowledge;
Premises – Buildings and office facilities (furniture, filing, faxes, photocopiers, telephones, printed stationery, desktop stationery, etc.);
Resources – Central IT systems, voice and data communications links, paper based information, equipment (such as PCs, laptops, printers, scanners, etc.);
Suppliers – Products, services and materials provided by third parties and which third parties can and should be used.
Some organisations (such as manufacturers) should
also take into account:
- Production processes;
- Materials, logistics and inventory/stock;
- Power and utilities (e.g. water, gas, etc.).
Examples of tactics that could be adopted include:
- Cross-training individuals to be able to carry out the roles of others;
- Maintaining documented processes and procedures;
- Storing critical equipment off-site;
- Setting up a dual arrangement with another organisation (a suitable distance away) to share office space in the event of an incident that affects either of you;
- Providing key staff with remote access to log in to systems from home;
- Installing backup generators and UPS power supplies to cover critical equipment;
- Holding extended/contingency levels of stock at alternative premises;
- Holding spare or old equipment, or additional spares for critical equipment;
- Conducting extended due diligence on suppliers to validate robustness.
Assessing Appropriateness
- Conduct an assessment to ensure there is an appropriate balance between the speed of recovery and cost. If the latter is too high, then a compromise RTO/RPO may be required for the product/service;
- Ensure that any alternative supplier or alternative/backup site is far enough away from the organisation's main site not to be affected by the same threat/incident;
- Ensure that any alternative location is suitably accessible by staff (there’s little point in having an alternate site in Barka, if most of your staff live in Wadi Kabir);
- Having alternative suppliers is fine, but ensure their supply is separate and cannot be affected by the same problem (e.g. two different telecoms suppliers are fine, but ensure they do not share the same infrastructure route into the premises and are not sourcing from the same wholesaler);
- Having a strategy to purchase equipment (such as PCs, servers and laptops) after an incident is fine, but if a wide-scale incident affects a number of areas and organisations in Muscat, everyone is likely to be trying to purchase equipment at once, and there may be short supply and delivery delays.
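The supplier-separation check above can be sketched as a comparison of what each supplier depends on. The example below is a minimal illustration: the supplier names, the dependency labels and the function `shared_dependencies` are hypothetical, and it assumes you have already established (through due diligence) what each supplier relies on.

```python
from itertools import combinations


def shared_dependencies(suppliers):
    """Map each pair of suppliers to the dependencies they have in common."""
    overlaps = {}
    for (name_a, deps_a), (name_b, deps_b) in combinations(suppliers.items(), 2):
        common = deps_a & deps_b  # set intersection
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


# Hypothetical suppliers and the things each one relies on.
telecoms = {
    "Telecom A": {"north duct", "wholesaler X"},
    "Telecom B": {"south duct", "wholesaler X"},
    "Telecom C": {"south duct", "wholesaler Y"},
}

for pair, common in shared_dependencies(telecoms).items():
    print(pair, sorted(common))
# ('Telecom A', 'Telecom B') ['wholesaler X']
# ('Telecom B', 'Telecom C') ['south duct']
```

In this made-up case only Telecom A and Telecom C are genuinely independent of each other; the other pairings share either a wholesaler or a route into the premises.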
Maintenance
As with other aspects of Business Continuity Management (BCM), it is vitally important to involve individuals who have specialist knowledge (e.g. procurement, capacity planners, IT strategists, premises management, etc.) to ensure the most appropriate strategy and tactical choices are made. Ensure these choices are regularly reviewed and updated as your organisation changes.