Chapter 3 Planning for Contingencies

Chapter Overview

This chapter articulates the need for contingency planning and explores its major components. The reader will learn how to create a simple set of contingency plans using business impact analysis, and how to prepare and execute a test of those plans.

Chapter Objectives

When you complete this chapter, you will be able to:

· Understand the need for contingency planning

· Know the major components of contingency planning

· Create a simple set of contingency plans, using business impact analysis

· Prepare and execute a test of contingency plans

· Understand the unified contingency plan approach

Introduction

This chapter focuses on planning for the unexpected event, when the use of technology is disrupted and business operations come close to a standstill.

“Procedures are required that will permit the organization to continue essential functions if information technology support is interrupted.”

On average, over 40% of businesses that don't have a disaster plan go out of business after a major loss.

What Is Contingency Planning?

The overall planning for unexpected events is called contingency planning (CP).

CP is the process by which organizational planners position their organizations to prepare for, detect, react to, and recover from events that threaten the security of information resources and assets, both human and artificial.

The main goal of CP is the restoration to normal modes of operation with minimum cost and disruption to normal business activities after an unexpected event.

CP Components

Incident response plan (IRP) focuses on immediate response to an incident.

Disaster recovery plan (DRP) focuses on restoring operations at the primary site after disasters occur.

Business continuity plan (BCP) facilitates establishment of operations at an alternate site until the organization is able to either resume operations at its primary site or select a new primary location.

To ensure continuity across all of the CP processes during the planning process, contingency planners should:

· Identify the mission- or business-critical functions.

· Identify the resources that support the critical functions.

· Anticipate potential contingencies or disasters.

· Select contingency planning strategies.

· Implement selected strategy.

· Test and revise contingency plans.

Four teams of individuals are involved in contingency planning and contingency operations:

· The CP team

· The incident response (IR) team

· The disaster recovery (DR) team

· The business continuity (BC) team

Contingency Planning

NIST describes the need for this type of planning as follows:

“These procedures (contingency plans, business interruption plans, and continuity of operations plans) should be coordinated with the backup, contingency, and recovery plans of any general support systems, including networks used by the application. The contingency plans should ensure that interfacing systems are identified and contingency/disaster planning coordinated.”

Components of Contingency Planning

Incident Response Plan

The incident response plan (IRP) is a detailed set of processes and procedures that anticipate, detect, and mitigate the impact of an unexpected event that might compromise information resources and assets.

In CP an unexpected event is called an incident.

An incident occurs when an attack (natural or man-made) impacts information resources and/or assets, whether through actual damage or the act of successfully attacking.

Incident response (IR), then, is a set of procedures that commence when an incident is detected.

The IRP is usually activated when an incident causes minimal damage—according to criteria set in advance by the organization—with little or no disruption to business operations.

When a threat becomes a valid attack, it is classified as an information security incident if:

· It is directed against information assets

· It has a realistic chance of success

· It threatens the confidentiality, integrity, or availability of information resources and assets
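
The three criteria above can be read as a single yes/no test. The following is a minimal sketch, assuming the criteria are jointly required; the Attack record and its field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Attack:
    """Hypothetical record describing an observed attack."""
    targets_information_assets: bool
    realistic_chance_of_success: bool
    threatens_cia: bool  # confidentiality, integrity, or availability


def is_information_security_incident(attack: Attack) -> bool:
    """Return True only when all three listed criteria hold (assumed conjunctive)."""
    return (
        attack.targets_information_assets
        and attack.realistic_chance_of_success
        and attack.threatens_cia
    )


# Illustrative inputs, not real data.
blocked_probe = Attack(True, False, False)
successful_intrusion = Attack(True, True, True)
print(is_information_security_incident(blocked_probe))         # False
print(is_information_security_incident(successful_intrusion))  # True
```

In practice the judgment behind each flag is made by people following the criteria the organization set in advance; the function only records the combination.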

It is important to understand that IR is a reactive measure, not a preventative one.

During the incident

First, planners develop and document the procedures that must be performed during the incident.

These procedures are grouped and assigned to individuals.

The planning committee drafts a set of function-specific procedures.

After the incident

Once the procedures for handling an incident are drafted, planners develop and document the procedures that must be performed immediately after the incident has ceased.

Separate functional areas may develop different procedures.

Before the incident

Finally, the planners draft a third set of procedures, those tasks that must be performed to prepare for the incident.

These procedures include the details of the data backup schedules, disaster recovery preparation, training schedules, testing plans, copies of service agreements, and business continuity plans, if any.

Preparing to Plan

Planning for an incident and the responses to it requires a detailed understanding of the information systems and the threats they face.

The IR planning team seeks to develop a series of pre-defined responses which will guide the team and information security staff through the steps needed for responding to an incident.

Pre-defining incident responses enables the organization to react quickly and effectively to the detected incident without confusion or wasted time and effort.

The IR team consists of professionals capable of handling the information systems and functional areas affected by an incident.

Each member of the IR team must know his or her specific role, work in concert with the other members, and execute the objectives of the IRP.

Incident Detection

The challenge for every IR team is determining whether an event is the product of routine systems use or an actual incident.

Incident classification is the process of examining a possible incident, or incident candidate, and determining whether or not it constitutes an actual incident.

Initial reports from end users, intrusion detection systems, host- and network-based virus detection software, and systems administrators are all ways to track and detect incident candidates.

Careful training in the reporting of an incident candidate allows end users, the help desk staff, and all security personnel to relay vital information to the IR team.

Possible indicators:

· Presence of unfamiliar files.

· Presence or execution of unknown programs or processes.

· Unusual consumption of computing resources.

· Unusual system crashes.

Probable indicators:

· Activities at unexpected times.

· Presence of new accounts.

· Reported attacks.

· Notification from IDS.

Definite indicators:

· Use of dormant accounts.

· Changes to logs.

· Presence of hacker tools.

· Notifications by partner or peer.

· Notification by hacker.

Occurrences of Actual Incidents:

· Loss of availability.

· Loss of integrity.

· Loss of confidentiality.

· Violation of policy.

· Violation of law.
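
The possible, probable, and definite indicator lists above lend themselves to a simple lookup. This is a rough sketch, assuming an incident candidate is rated by the strongest indicator observed; the shortened indicator keys and the triage function are hypothetical names, not part of the chapter.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    NONE = 0
    POSSIBLE = 1
    PROBABLE = 2
    DEFINITE = 3


# Shortened, hypothetical keys for the indicators listed in the text.
INDICATORS = {
    "unfamiliar_files": Likelihood.POSSIBLE,
    "unknown_programs": Likelihood.POSSIBLE,
    "unusual_resource_use": Likelihood.POSSIBLE,
    "unusual_crashes": Likelihood.POSSIBLE,
    "activity_at_unexpected_times": Likelihood.PROBABLE,
    "new_accounts": Likelihood.PROBABLE,
    "reported_attack": Likelihood.PROBABLE,
    "ids_notification": Likelihood.PROBABLE,
    "dormant_account_use": Likelihood.DEFINITE,
    "log_changes": Likelihood.DEFINITE,
    "hacker_tools": Likelihood.DEFINITE,
    "partner_notification": Likelihood.DEFINITE,
    "hacker_notification": Likelihood.DEFINITE,
}


def triage(observed):
    """Return the strongest likelihood among the observed indicators.

    Unknown indicator names are ignored; an empty report yields NONE.
    """
    levels = [INDICATORS[name] for name in observed if name in INDICATORS]
    return max(levels, default=Likelihood.NONE)


print(triage(["unfamiliar_files", "new_accounts"]).name)  # PROBABLE
print(triage(["unusual_crashes", "log_changes"]).name)    # DEFINITE
```

Taking the maximum is only one possible rule; a team might instead require several possible indicators before raising the rating.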

Incident Response

Once an actual incident has been confirmed and properly classified, the IR team moves from the detection phase to the reaction phase.

In the incident response phase, a number of action steps taken by the IR team and others must occur quickly and may occur concurrently.

These steps include notification of key personnel, the assignment of tasks, and documentation of the incident.

Notification of Key Personnel

As soon as the IR team determines that an incident is in progress, the right people must be immediately notified in the right order.

An alert roster is a document containing contact information on the individuals to be notified in the event of an actual incident.

There are two ways to activate an alert roster:

· Sequentially

· Hierarchically
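
The two activation styles can be contrasted in a short sketch. It assumes the usual reading of the terms: in a sequential roster one contact person calls everyone in turn, while in a hierarchical roster each person calls a few others, who then call their own contacts. The names and the call_tree structure are hypothetical.

```python
from collections import deque


def sequential_calls(caller, roster):
    """One contact person calls every person on the roster, in order."""
    return [(caller, person) for person in roster]


def hierarchical_calls(root, call_tree):
    """Each notified person calls their own contacts (breadth-first order)."""
    calls = []
    notified = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for person in call_tree.get(caller, []):
            if person not in notified:  # never call the same person twice
                notified.add(person)
                calls.append((caller, person))
                queue.append(person)
    return calls


roster = ["ana", "ben", "cho", "dev"]
call_tree = {"lead": ["ana", "ben"], "ana": ["cho"], "ben": ["dev"]}

print(sequential_calls("lead", roster))
# [('lead', 'ana'), ('lead', 'ben'), ('lead', 'cho'), ('lead', 'dev')]
print(hierarchical_calls("lead", call_tree))
# [('lead', 'ana'), ('lead', 'ben'), ('ana', 'cho'), ('ben', 'dev')]
```

The sequential form keeps the message consistent because one person delivers it, at the cost of time; the hierarchical form spreads the calls across several people.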

The alert message is a scripted description of the incident and consists of just enough information so that each responder knows what portion of the IRP to implement without impeding the notification process.

Not everyone is on the alert roster; it lists only those individuals who must respond to a specific actual incident.

During this phase, other key personnel not on the alert roster, such as general management, must be notified of the incident.

This notification should occur only after the incident has been confirmed, but before media or other external sources learn of it.

It is up to the IR planners to determine in advance whom to notify and when, and to offer guidance about additional notification steps to take.

Documenting an Incident

As soon as an incident has been confirmed and the notification process is underway, the team should begin to document it.

The documentation should record the who, what, when, where, why and how of each action taken while the incident is occurring.

This documentation serves as a case study after the fact to determine if the right actions were taken, and if they were effective.

It can also prove the organization did everything possible to deter the spread of the incident.
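
One way to capture the who, what, when, where, why, and how of each action is a small structured log. The sketch below is only an illustration; the ActionRecord and IncidentLog names, and all sample values, are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ActionRecord:
    """One action taken during an incident (immutable once written)."""
    who: str
    what: str
    where: str
    why: str
    how: str
    # Defaults to the current UTC time when the caller does not supply one.
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class IncidentLog:
    def __init__(self):
        self._entries = []

    def record(self, **details):
        """Append a new action record and return it."""
        entry = ActionRecord(**details)
        self._entries.append(entry)
        return entry

    def timeline(self):
        """Return the recorded actions in chronological order."""
        return sorted(self._entries, key=lambda entry: entry.when)


log = IncidentLog()
log.record(
    who="network admin",
    what="blocked inbound traffic on the affected segment",
    where="perimeter firewall",
    why="stop the spread of the incident",
    how="added a temporary filtering rule",
)
for entry in log.timeline():
    print(entry.when.isoformat(), entry.who, "-", entry.what)
```

Making the records immutable reflects their later use as a case study and, possibly, as evidence.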

Incident Containment Strategies

One of the most critical components of IR is to stop the incident or contain its scope or impact. Incident containment strategies vary depending on the incident, and on the amount of damage caused by the incident.

Incident containment strategies focus on two tasks:

· stopping the incident and

· recovering control of the systems

The IR team can stop the incident and attempt to recover control by means of several strategies:

· Disconnect the affected communication circuits.

· Dynamically apply filtering rules to limit certain types of network access.

· Disable compromised user accounts.

· Reconfigure firewalls to block the problem traffic.

· Temporarily disable the compromised process or service.

· Take down the conduit application or server.

· Stop all computers and network devices.

Incident Escalation

At some point in time the incident may increase in scope or severity to the point that the IRP cannot adequately handle the event.

Each organization will have to determine, during the business impact analysis, the point at which the incident becomes a disaster.

The organization must also document when to involve outside response, as discussed in other sections.

Incident Recovery

Once the incident has been contained, and system control regained, incident recovery can begin.

The IR team must assess the full extent of the damage in order to determine what must be done to restore the systems.

The immediate determination of the scope of the breach of confidentiality, integrity, and availability of information and information assets is called incident damage assessment.

Those who document the damage must be trained to collect and preserve evidence, in case the incident is part of a crime or results in a civil action.

Once the extent of the damage has been determined, the recovery process begins:

· Identify the vulnerabilities that allowed the incident to occur and spread. Resolve them.

· Address the safeguards that failed to stop or limit the incident, or were missing from the system in the first place. Install, replace or upgrade them.

· Evaluate monitoring capabilities (if present). Improve detection and reporting methods, or install new monitoring capabilities.

· Restore the data from backups.

· Restore the services and processes in use. Compromised (and interrupted) services and processes must be examined, cleaned, and then restored.

· Continuously monitor the system.

· Restore the confidence of the members of the organization’s communities of interest.

After Action Review

Before returning to routine duties, the IR team must conduct an after-action review, or AAR.

The after-action review is a detailed examination of the events that occurred from first detection to final recovery.

All team members review their actions during the incident and identify areas where the IR plan worked, didn’t work, or should improve.

Law Enforcement Involvement

When an incident violates civil or criminal law, it is the organization’s responsibility to notify the proper authorities.

Selecting the appropriate law enforcement agency depends on the type of crime committed.

· Federal

· State

· Local

Involving law enforcement agencies has both advantages and disadvantages.

Law enforcement agencies are usually much better equipped to process evidence, obtain statements from witnesses, and build legal cases.

However, involving law enforcement can result in loss of control of the chain of events following an incident, including the collection of information and evidence, and the prosecution of suspects.

Disaster Recovery

Disaster recovery planning (DRP) is the preparation for and recovery from a disaster, whether natural or man-made.

In general, an incident is a disaster when:

1) the organization is unable to contain or control the impact of an incident, or

2) the level of damage or destruction from an incident is so severe the organization is unable to quickly recover.

The key role of a DRP is defining how to reestablish operations at the location where the organization is usually located.

Disaster Classifications

A DRP can classify disasters in a number of ways.

The most common method is to separate natural disasters from man-made disasters.

Another way of classifying disasters is by speed of development.

· Rapid onset disasters

· Slow onset disasters

Planning for Disaster

To plan for disaster, the CP team engages in scenario development and impact analysis, and thus categorizes the level of threat each potential disaster poses.

When generating a disaster recovery scenario, start first with the most important asset – people.

Do you have the human resources with the appropriate organizational knowledge to restore business operations?

The DRP must be tested regularly so that the DR team can lead the recovery effort efficiently.

The key points the CP team must build into the DRP include:

· Clear delegation of roles and responsibilities.

· Execution of the alert roster and notification of key personnel.

· Clear establishment of priorities.

· Documentation of the disaster.

· Inclusion of action steps to mitigate the impact of the disaster on the operations of the organization.

· Inclusion of alternative implementations for the various systems components, should primary versions be unavailable.

Crisis Management

Crisis management is a set of focused steps, taken during and after a disaster, that deal primarily with the people involved.

The DR team works closely with the crisis management team to assure complete and timely communication during a disaster.

The crisis management team “is responsible for managing the event from an enterprise perspective and covers the following major activities:

· Supporting personnel and their loved ones during the crisis

· Determining the event's impact on normal business operations and, if necessary, making a disaster declaration

· Keeping the public informed about the event and the actions being taken to ensure the recovery of personnel and the enterprise

· Communicating with major customers, suppliers, partners, regulatory agencies, industry organizations, the media, and other interested parties.”

Two key tasks of the crisis management team are:

· Verifying personnel status.

· Activating the alert roster.

Responding to the Disaster

When a disaster strikes and the DRP is activated, actual events can at times outstrip even the best of plans.

To be prepared, the CP team should incorporate a degree of flexibility into the DRP.

If the physical facilities are intact, the DR team should begin the restoration of systems and data to work toward full operational capability.

If the organization’s facilities are destroyed, alternative actions must be taken until new facilities can be acquired.

When a disaster threatens the viability of an organization at the primary site, the disaster recovery process becomes a business continuity process.

Business Continuity Planning

Business continuity planning ensures that critical business functions can continue if a disaster occurs.

Unlike the DRP, which is usually managed by the IT community of interest, the business continuity plan (BCP) is most properly managed by the CEO of an organization.

The BCP is activated and executed concurrently with the DRP when the disaster is major or long term and requires fuller and more complex restoration of information and information resources.

While the BCP reestablishes critical business functions at an alternate site, the DRP team focuses on the reestablishment of the technical infrastructure and business operations at the primary site.

The identification of critical business functions and the resources to support them is the cornerstone of BCP, as these functions are the first that must be reestablished at the alternate site.
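
The relationship among the three plans can be summarized as a small decision function. This is a simplified sketch based on the descriptions in this chapter: an incident becomes a disaster when it cannot be contained or recovered from quickly, and the BCP runs alongside the DRP when the disaster is major or long term. The function and parameter names are hypothetical, and real thresholds come from the organization's own business impact analysis.

```python
def plans_to_activate(contained, quick_recovery, major_or_long_term):
    """Return the plans in force for an event, under the simplified rules above."""
    is_disaster = (not contained) or (not quick_recovery)
    if not is_disaster:
        return ["IRP"]
    if major_or_long_term:
        # BCP at the alternate site runs concurrently with DRP at the primary site.
        return ["DRP", "BCP"]
    return ["DRP"]


print(plans_to_activate(contained=True, quick_recovery=True, major_or_long_term=False))
# ['IRP']
print(plans_to_activate(contained=False, quick_recovery=False, major_or_long_term=False))
# ['DRP']
print(plans_to_activate(contained=False, quick_recovery=False, major_or_long_term=True))
# ['DRP', 'BCP']
```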

Continuity Strategies

A CP team can choose from several continuity strategies in its planning for business continuity.

The determining factor is usually cost.

In general there are three exclusive-use options:

· hot sites,

· warm sites, and

· cold sites,

and three shared-use options:

· timeshare,

· service bureaus, and

· mutual agreements.

Exclusive Use Options

Hot site: A fully configured computer facility, with all services, communications links, and physical plant operations.

Warm site: Provides many of the same services and options as a hot site, but software applications are typically either not included or not installed and configured.

Cold site: Provides only rudimentary services and facilities.

Shared Use Options

Timeshare: Operates like an exclusive-use site, but is leased with a business partner or other organization.

Service bureau: A service agency that, for a fee, provides physical facilities during a disaster.

Mutual agreement: A contract between two organizations for each to assist the other in the event of a disaster.

Specialized alternatives:

· rolling mobile site

· externally stored resources

Off-Site Disaster Data Storage

To get any of these sites up and running quickly, the organization must be able to move data into the new site’s systems.

Options include:

· Electronic vaulting - The bulk batch-transfer of data to an off-site facility.

· Remote journaling - The transfer of live transactions to an off-site facility.

· Database shadowing - The storage of duplicate online transaction data, along with the duplication of the databases at the remote site to a redundant server.
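
The practical difference between bulk batch transfer and live transaction transfer shows up in how far the off-site copy lags behind. The following in-memory sketch illustrates only that contrast; the class names are hypothetical, no real network or storage is involved, and database shadowing (which also duplicates the databases themselves) is left out.

```python
class OffSiteFacility:
    """Stand-in for the remote site; it simply collects what it is sent."""

    def __init__(self):
        self.received = []

    def receive(self, transactions):
        self.received.extend(transactions)


class ElectronicVaulting:
    """Hold transactions locally and ship them off-site in bulk batches."""

    def __init__(self, facility):
        self.facility = facility
        self.pending = []

    def commit(self, transaction):
        self.pending.append(transaction)

    def run_batch_transfer(self):
        self.facility.receive(self.pending)
        self.pending = []


class RemoteJournaling:
    """Forward each live transaction off-site as it is committed."""

    def __init__(self, facility):
        self.facility = facility

    def commit(self, transaction):
        self.facility.receive([transaction])


vault_site, journal_site = OffSiteFacility(), OffSiteFacility()
vault, journal = ElectronicVaulting(vault_site), RemoteJournaling(journal_site)

for tx in ["tx1", "tx2", "tx3"]:
    vault.commit(tx)
    journal.commit(tx)

print(len(vault_site.received), len(journal_site.received))  # 0 3
vault.run_batch_transfer()
print(len(vault_site.received), len(journal_site.received))  # 3 3
```

Until the batch runs, anything still pending under vaulting exists only at the primary site, which is the exposure the other two options reduce.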

Putting a Contingency Plan Together

The CP team should include:

· Champion

· Project manager

· Team members: business managers, information technology managers, and information security managers

Business Impact Analysis

The business impact analysis (BIA) provides the CP team with information about systems and the threats they face, and is the first phase in the CP process.

The BIA is a crucial component of the initial planning stages, as it provides detailed scenarios of the impact each potential attack can have on the organization.

One of the fundamental differences between a BIA and the risk management process is that risk management focuses on identifying the threats, vulnerabilities, and attacks to determine what controls can protect the information.

The BIA assumes that these controls have been bypassed, have failed, or are otherwise ineffective, and that the attack was successful.

The CP team conducts the BIA in the following stages:

· Threat attack identification and prioritization

· Business unit analysis

· Attack success scenario development

· Potential damage assessment

· Subordinate plan classification

Threat Attack Identification and Prioritization

An organization that has followed the risk management process will have already identified and prioritized threats facing it.

For the BIA, these organizations need only update the threat list and add one additional piece of information, the attack profile.

An attack profile is a detailed description of the activities that occur during an attack.

Business Unit Analysis

The second major BIA task is the analysis and prioritization of business functions within the organization.
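
The chapter does not prescribe a method for this prioritization. One common approach, shown here purely as an assumption, is a weighted score per business function; the criteria, weights, ratings, and function names below are all hypothetical.

```python
def weighted_score(ratings, weights):
    """Sum each criterion's rating multiplied by that criterion's weight."""
    return sum(weights[criterion] * ratings[criterion] for criterion in weights)


def prioritize(functions, weights):
    """Return business function names ordered from highest to lowest score."""
    return sorted(
        functions,
        key=lambda name: weighted_score(functions[name], weights),
        reverse=True,
    )


# Hypothetical criteria and weights (weights sum to 1.0).
weights = {"revenue_impact": 0.5, "customer_impact": 0.3, "regulatory_impact": 0.2}

# Hypothetical ratings on a 1 (low) to 5 (high) scale.
functions = {
    "payroll": {"revenue_impact": 2, "customer_impact": 1, "regulatory_impact": 4},
    "order_processing": {"revenue_impact": 5, "customer_impact": 4, "regulatory_impact": 2},
    "marketing": {"revenue_impact": 1, "customer_impact": 2, "regulatory_impact": 1},
}

print(prioritize(functions, weights))
# ['order_processing', 'payroll', 'marketing']
```

The resulting order suggests which functions to reestablish first; the ratings themselves still have to come from the business units.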

Attack Success Scenario Development

Next the BIA team must create a series of scenarios depicting the impact of an occurrence of each threat on each functional area.

Attack profiles should include scenarios depicting a typical attack, including its methodology, the indicators of attack, and the broad consequences.

Then attack success scenarios with more detail are added to the attack profile, including alternate outcomes—best, worst, and most likely.

Potential Damage Assessment

From these detailed scenarios, the BIA planning team must estimate the cost of the best, worst, and most likely outcomes by preparing an attack scenario end case.

This will allow you to identify what must be done to recover from each possible case.

Subordinate Plan Classification

Once the potential damage has been assessed, and each scenario and attack scenario end case has been evaluated, a related (subordinate) plan must be developed or identified from among the plans already in place.

Each attack scenario end case is categorized as disastrous or not.

Attack end cases that are disastrous find members of the organization waiting out the attack, and planning to recover after it is over.
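
An attack scenario end case can be pictured as a record holding the three cost estimates plus its classification. In the sketch below, the threshold rule for calling a case disastrous and the mapping to plans (DRP and BCP for disastrous cases, IRP otherwise) are assumptions made for illustration; the chapter leaves both to the planning team. All names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AttackScenarioEndCase:
    threat: str
    business_function: str
    best_cost: float
    most_likely_cost: float
    worst_cost: float

    def __post_init__(self):
        # Guard against estimates entered in the wrong order.
        if not self.best_cost <= self.most_likely_cost <= self.worst_cost:
            raise ValueError("expected best <= most likely <= worst cost")


def classify(case, disaster_threshold):
    """Label an end case by its worst-case cost (assumed rule, not from the text)."""
    if case.worst_cost >= disaster_threshold:
        return "disastrous"
    return "not disastrous"


def related_plans(category):
    """Map a category to the plans assumed to cover it."""
    return ["DRP", "BCP"] if category == "disastrous" else ["IRP"]


case = AttackScenarioEndCase(
    threat="data center fire",
    business_function="order processing",
    best_cost=20_000,
    most_likely_cost=150_000,
    worst_cost=900_000,
)
category = classify(case, disaster_threshold=500_000)
print(category, related_plans(category))  # disastrous ['DRP', 'BCP']
```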

Combining the DRP and the BCP

Because the DRP and BCP are closely related, most organizations prepare them concurrently, and may combine them into a single document.

Such a comprehensive plan must be able to support the reestablishment of operations at two different locations: one immediately at an alternate site, and one eventually back at the primary site.

Therefore, although a single planning team can develop the combined DRP/BCP, execution requires separate teams.

A Sample Disaster Recovery Plan

1. Name of agency.

2. Date of completion or update of the plan and test date.

3. Agency staff to be called in the event of a disaster.

4. Emergency services to be called (if needed) in the event of a disaster.

5. Locations of in-house emergency equipment and supplies.

6. Sources of off-site equipment and supplies.

7. Salvage priority list.

8. Agency disaster recovery procedures.

9. Follow-up assessment.

Testing Contingency Plans

Once problems are identified during the testing process, improvements can be made, and the resulting plan can be relied on in times of need.

There are five testing strategies that can be used to test contingency plans:

1. Desk check

2. Structured walkthrough

3. Simulation

4. Parallel testing

5. Full interruption

Continuous Improvement

As a closing thought, just as in all organizational efforts, iteration results in improvement.

A formal implementation of this methodology is a process known as continuous process improvement (CPI).

Each time the organization rehearses its plans, it should learn from the process, improve the plans, and then rehearse again.

Through constant evaluation and improvement, the organization keeps refining the process so that each iteration can yield a better outcome.

Key Terms

After-action review

Alert message

Alert roster

Attack profile

Attack scenario end case

Business continuity planning (BCP)

Business Impact Analysis (BIA)

Champion

Cold site

Contingency planning (CP)

Crisis management

Database shadowing

Desk check

Electronic vaulting

Full interruption

Hierarchical roster

Hot site

Incident candidate

Incident classification

Incident damage assessment

Mutual agreement

Parallel testing

Project manager

Rapid-onset disaster

Remote journaling

Scenarios

Sequential roster

Service bureau

Simulation

Slow-onset disaster

Structured walk-through

Team members

Timeshare

Warm site
