Chapter 3
Planning for Contingencies
Chapter Overview
The third chapter of the book will
articulate the need for contingency planning and explore the major components
of contingency planning. In this chapter, the reader will learn how to create a
simple set of contingency plans using business impact analysis and prepare and
execute a test of contingency plans.
Chapter Objectives
When you complete this chapter, you will be able to:
·
Understand the need for contingency planning
·
Know the major components of contingency
planning
·
Create a simple set of contingency plans, using
business impact analysis
·
Prepare and execute a test of contingency plans
·
Understand the unified contingency plan approach
Introduction
This chapter focuses on planning for the unexpected event,
when the use of technology is disrupted and business operations come close to a
standstill.
“Procedures are required that will permit the organization to
continue essential functions if information technology support is interrupted.”
On average, over 40% of businesses that don't have a disaster
plan go out of business after a major loss.
What Is Contingency Planning?
The overall planning for unexpected events is called
contingency planning (CP).
CP is the process by which organizational planners position
their organizations to prepare for, detect, react to, and recover from events
that threaten the security of information resources and assets, both human and
artificial.
The main goal of CP is the restoration to normal modes of
operation with minimum cost and disruption to normal business activities after
an unexpected event.
CP Components
Incident response plan (IRP) focuses on immediate response to
an incident.
Disaster recovery plan (DRP) focuses on restoring operations
at the primary site after disasters occur.
Business continuity plan (BCP) facilitates establishment of
operations at an alternate site, until the organization is able to either
resume operations back at their primary site or select a new primary location.
To ensure continuity across all of the CP processes during the
planning process, contingency planners should:
·
Identify the mission- or business-critical
functions.
·
Identify the resources that support the critical
functions.
·
Anticipate potential contingencies or disasters.
·
Select contingency planning strategies.
·
Implement selected strategy.
·
Test and revise contingency plans.
Four teams of individuals are involved in contingency planning
and contingency operations:
·
The CP team
·
The incident response (IR) team
·
The disaster recovery (DR) team
·
The business continuity (BC) team
Contingency Planning
NIST describes the need for this type of planning as follows:
“These procedures (contingency plans, business interruption plans, and
continuity of operations plans) should be coordinated with the backup,
contingency, and recovery plans of any general support systems, including
networks used by the application. The contingency plans should ensure that
interfacing systems are identified and contingency/disaster planning
coordinated.”
Components of Contingency Planning
Incident Response Plan
The incident response plan (IRP) is a detailed set of
processes and procedures that anticipate, detect, and mitigate the impact of an
unexpected event that might compromise information resources and assets.
In CP an unexpected event is called an incident.
An incident occurs when an attack (natural or man-made)
impacts information resources and/or assets, whether through actual damage or
the act of successfully attacking.
Incident response (IR), then, is a set of procedures that
commence when an incident is detected.
The IRP is usually activated when an incident causes minimal
damage—according to criteria set in advance by the organization—with little or
no disruption to business operations.
When a threat becomes a valid attack, it is classified as an
information security incident if:
·
It is directed against information assets
·
It has a realistic chance of success
·
It threatens the confidentiality, integrity, or
availability of information resources and assets
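The three criteria above combine as a simple conjunction: an attack counts as
an information security incident only when all of them hold. A minimal sketch
in Python, offered as an illustration rather than a prescribed implementation;
the AttackReport type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AttackReport:
    """Hypothetical record describing a confirmed attack (illustrative only)."""
    targets_information_assets: bool   # directed against information assets
    realistic_chance_of_success: bool  # has a realistic chance of success
    threatens_cia: bool                # threatens confidentiality, integrity, or availability


def is_information_security_incident(report: AttackReport) -> bool:
    """Return True only when all three criteria from the list hold."""
    return (
        report.targets_information_assets
        and report.realistic_chance_of_success
        and report.threatens_cia
    )
```

An attack that meets only one or two of the criteria stays outside the IRP's
definition of an incident in this sketch.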
It is important to understand that IR is a reactive measure,
not a preventative one.
During the incident
First, planners develop and document the procedures that must
be performed during the incident.
These procedures are grouped and assigned to individuals.
The planning committee drafts a set of function-specific
procedures.
After the incident
Once the procedures for handling an incident are drafted,
planners develop and document the procedures that must be performed immediately
after the incident has ceased.
Separate functional areas may develop different procedures.
Before the incident
Finally, the planners draft a third set of procedures, those
tasks that must be performed to prepare for the incident.
These procedures include the details of the data backup
schedules, disaster recovery preparation, training schedules, testing plans,
copies of service agreements, and business continuity plans, if any.
Preparing to Plan
Planning for an incident and the responses to it requires a
detailed understanding of the information systems and the threats they face.
The IR planning team seeks to develop a series of pre-defined
responses which will guide the team and information security staff through the
steps needed for responding to an incident.
Pre-defining incident responses enables the organization to
react quickly and effectively to the detected incident without confusion or
wasted time and effort.
The IR team consists of professionals capable of handling the
information systems and functional areas affected by an incident.
Each member of the IR team must know his or her specific role,
work in concert with each other, and execute the objectives of the IRP.
Incident Detection
The challenge for every IR team is determining whether an event
is the product of routine systems use or an actual incident.
Incident classification is the process of examining a possible
incident, or incident candidate, and determining whether or not it constitutes
an actual incident.
Initial reports from end users, intrusion detection systems,
host- and network-based virus detection software, and systems administrators
are all ways to track and detect incident candidates.
Careful training in the reporting of an incident candidate
allows end users, the help desk staff, and all security personnel to relay
vital information to the IR team.
Possible indicators:
·
Presence of unfamiliar files.
·
Presence or execution of unknown programs or
processes.
·
Unusual consumption of computing resources.
·
Unusual system crashes.
Probable indicators:
·
Activities at unexpected times.
·
Presence of new accounts.
·
Reported attacks.
·
Notification from IDS.
Definite indicators:
·
Use of dormant accounts.
·
Changes to logs.
·
Presence of hacker tools.
·
Notifications by partner or peer.
·
Notification by hacker.
Occurrences of Actual Incidents:
·
Loss of availability.
·
Loss of integrity.
·
Loss of confidentiality.
·
Violation of policy.
·
Violation of law.
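One way to make the possible, probable, and definite indicator lists usable
during triage is a lookup table that maps each observed indicator to its
category. The sketch below is an assumption-laden illustration in Python: the
string keys paraphrase the lists above, and reporting the strongest observed
category is a design choice of this example, not a rule stated in the chapter.

```python
from enum import IntEnum


class IndicatorLevel(IntEnum):
    """Indicator categories from the text, ordered by strength."""
    POSSIBLE = 1
    PROBABLE = 2
    DEFINITE = 3


# Keys paraphrase the indicator lists; the exact strings are illustrative.
INDICATOR_LEVELS = {
    "unfamiliar files": IndicatorLevel.POSSIBLE,
    "unknown programs or processes": IndicatorLevel.POSSIBLE,
    "unusual resource consumption": IndicatorLevel.POSSIBLE,
    "unusual system crashes": IndicatorLevel.POSSIBLE,
    "activity at unexpected times": IndicatorLevel.PROBABLE,
    "new accounts": IndicatorLevel.PROBABLE,
    "reported attack": IndicatorLevel.PROBABLE,
    "ids notification": IndicatorLevel.PROBABLE,
    "use of dormant accounts": IndicatorLevel.DEFINITE,
    "changes to logs": IndicatorLevel.DEFINITE,
    "hacker tools": IndicatorLevel.DEFINITE,
    "notification by partner or peer": IndicatorLevel.DEFINITE,
    "notification by hacker": IndicatorLevel.DEFINITE,
}


def strongest_indicator(observed):
    """Return the highest category among recognized indicators, or None."""
    levels = [INDICATOR_LEVELS[name] for name in observed if name in INDICATOR_LEVELS]
    return max(levels, default=None)
```

For example, a report listing both unfamiliar files and changes to logs would
be treated as carrying a definite indicator.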
Incident Response
Once an actual incident has been confirmed and properly
classified, the IR team moves from the detection phase to the reaction phase.
In the incident response phase, a number of action steps taken
by the IR team and others must occur quickly and may occur concurrently.
These steps include notification of key personnel, the assignment
of tasks, and documentation of the incident.
Notification of Key Personnel
As soon as the IR team determines that an incident is in
progress, the right people must be immediately notified in the right order.
An alert roster is a document containing contact information
on the individuals to be notified in the event of an actual incident.
There are two ways to activate an alert roster:
·
Sequentially
·
Hierarchically
The alert message is a scripted description of the incident
and consists of just enough information so that each responder knows what
portion of the IRP to implement without impeding the notification process.
Not everyone is on the alert roster, only those individuals
who must respond to a specific actual incident.
During this phase, other key personnel not on the alert
roster, such as general management, must be notified of the incident.
This notification should occur only after the incident has
been confirmed, but before media or other external sources learn of it.
It is up to the IR planners to determine in advance whom to
notify and when, and to offer guidance about additional notification steps to
take.
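The two ways of activating an alert roster can be sketched in Python. The
chapter names the methods without defining them, so this example assumes the
usual meaning: in sequential activation one contact person calls everyone on
the roster in order, and in hierarchical activation the first person calls a
few others, who in turn call the people assigned to them. The function names
and the call_tree structure are hypothetical.

```python
from collections import deque


def sequential_activation(roster):
    """One contact person (the first entry) calls every other person in order.

    Returns the (caller, callee) pairs in the order the calls are made.
    """
    if not roster:
        return []
    contact, *others = roster
    return [(contact, person) for person in others]


def hierarchical_activation(first, call_tree):
    """The first person calls their assigned contacts, who call theirs, and so on.

    `call_tree` maps each caller to the list of people that caller notifies.
    """
    calls = []
    notified = {first}          # guards against calling anyone twice
    queue = deque([first])
    while queue:
        caller = queue.popleft()
        for callee in call_tree.get(caller, []):
            if callee in notified:
                continue
            notified.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls
```

In this sketch the sequential form concentrates every call on one person,
while the hierarchical form spreads the calls across several callers.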
Documenting an Incident
As soon as an incident has been confirmed and the notification
process is underway, the team should begin to document it.
The documentation should record the who, what, when, where,
why and how of each action taken while the incident is occurring.
This documentation serves as a case study after the fact to
determine if the right actions were taken, and if they were effective.
It can also prove the organization did everything possible to
deter the spread of the incident.
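A structured log makes the who, what, when, where, why, and how of each action
easy to capture and to replay afterward. The following Python sketch is only
an illustration; the IncidentLogEntry and IncidentLog names, and the choice to
timestamp entries in UTC, are assumptions of this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class IncidentLogEntry:
    """One action taken during the incident (hypothetical structure)."""
    who: str
    what: str
    where: str
    why: str
    how: str
    # "when" defaults to the moment the entry is created, in UTC.
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class IncidentLog:
    """Append-only collection of entries for later review."""

    def __init__(self):
        self._entries = []

    def record(self, **details):
        """Create an entry from keyword arguments and store it."""
        entry = IncidentLogEntry(**details)
        self._entries.append(entry)
        return entry

    def timeline(self):
        """Return the entries ordered by the time each action was taken."""
        return sorted(self._entries, key=lambda e: e.when)
```

Entries are frozen so that a recorded action cannot be altered in place once
it has been logged.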
Incident Containment Strategies
One of the most critical components of IR is to stop the
incident or contain its scope or impact. Incident containment strategies vary
depending on the incident, and on the amount of damage caused by the incident.
Incident containment strategies focus on two tasks:
·
stopping the incident and
·
recovering control of the systems
The IR team can stop the incident and attempt to recover
control by means of several strategies:
·
Disconnect the affected communication circuits.
·
Dynamically apply filtering rules to limit
certain types of network access.
·
Disable compromised user accounts.
·
Reconfigure firewalls to block the problem
traffic.
·
Temporarily disable the compromised process or
service.
·
Take down the conduit application or server.
·
Stop all computers and network devices.
Incident Escalation
At some point in time the incident may increase in scope or
severity to the point that the IRP cannot adequately handle the event.
Each organization will have to determine, during the business
impact analysis, the point at which the incident becomes a disaster.
The organization must also document when to involve outside
response, as discussed in other sections.
Incident Recovery
Once the incident has been contained, and system control
regained, incident recovery can begin.
The IR team must assess the full extent of the damage in order
to determine what must be done to restore the systems.
The immediate determination of the scope of the breach of
confidentiality, integrity, and availability of information and information
assets is called incident damage assessment.
Those who document the damage must be trained to collect and
preserve evidence, in case the incident is part of a crime or results in a
civil action.
Once the extent of the damage has been determined, the
recovery process begins:
·
Identify the vulnerabilities that allowed the
incident to occur and spread. Resolve them.
·
Address the safeguards that failed to stop or
limit the incident, or were missing from the system in the first place.
Install, replace or upgrade them.
·
Evaluate monitoring capabilities (if present).
Improve detection and reporting methods, or install new monitoring
capabilities.
·
Restore the data from backups.
·
Restore the services and processes in use.
Compromised (and interrupted) services and processes must be examined, cleaned,
and then restored.
·
Continuously monitor the system.
·
Restore the confidence of the members of the
organization’s communities of interest.
After Action Review
Before returning to routine duties, the IR team must conduct
an after-action review, or AAR.
The after-action review is a detailed examination of the
events that occurred from first detection to final recovery.
All team members review their actions during the incident and
identify areas where the IR plan worked, didn’t work, or should improve.
Law Enforcement Involvement
When an incident violates civil or criminal law, it is the
organization’s responsibility to notify the proper authorities.
Selecting the appropriate law enforcement agency depends on
the type of crime committed.
·
Federal
·
State
·
Local
Involving law enforcement agencies has both advantages and
disadvantages.
Law enforcement agencies are usually much better equipped to
process evidence, obtain statements from witnesses, and build legal
cases.
However, involving law enforcement can result in loss of
control of the chain of events following an incident, including the collection
of information and evidence, and the prosecution of suspects.
Disaster Recovery
Disaster recovery planning (DRP) is the preparation for and
recovery from a disaster, whether natural or man-made.
In general, an incident is a disaster when:
1) the organization is unable to contain or control the impact of an
incident, or
2) the level of damage or destruction from an incident is so severe the
organization is unable to quickly recover.
The key role of a DRP is defining how to reestablish
operations at the location where the organization is usually located.
Disaster Classifications
A DRP can classify disasters in a number of ways.
The most common method is to separate natural disasters from
man-made disasters.
Another way of classifying disasters is by speed of
development.
·
Rapid onset disasters
·
Slow onset disasters
Planning for Disaster
To plan for disaster, the CP team engages in scenario
development and impact analysis, and thus categorizes the level of threat each
potential disaster poses.
When generating a disaster recovery scenario, start with
the most important asset: people.
Do you have the human resources with the appropriate
organizational knowledge to restore business operations?
The DRP must be tested regularly so that the DR team can lead
the recovery effort efficiently.
The key points the CP team must build into the DRP include:
·
Clear delegation of roles and responsibilities.
·
Execution of the alert roster and notification
of key personnel.
·
Clear establishment of priorities.
·
Documentation of the disaster.
·
Inclusion of action steps to mitigate the impact
of the disaster on the operations of the organization.
·
Inclusion of alternative implementations for the
various systems components, should primary versions be unavailable.
Crisis Management
Crisis management is a set of focused steps, taken during and after a
disaster, that deal primarily with the people involved.
The DR team works closely with the crisis management team to
ensure complete and timely communication during a disaster.
The crisis management team “is responsible for managing the
event from an enterprise perspective and covers the following major activities:
·
Supporting personnel and their loved ones during
the crisis
·
Determining the event's impact on normal
business operations and, if necessary, making a disaster declaration
·
Keeping the public informed about the event and the actions
being taken to ensure the recovery of personnel and the enterprise
·
Communicating with major customers, suppliers, partners,
regulatory agencies, industry organizations, the media, and other interested
parties.”
Two key tasks of the crisis management team are:
·
Verifying personnel status.
·
Activating the alert roster.
Responding to the Disaster
When a disaster strikes and the DRP is activated, actual
events can at times outstrip even the best of plans.
To be prepared, the CP team should incorporate a degree of
flexibility into the DRP.
If the physical facilities are intact, the DR team should
begin the restoration of systems and data to work toward full operational
capability.
If the organization’s facilities are destroyed, alternative
actions must be taken until new facilities can be acquired.
When a disaster threatens the viability of an organization at
the primary site, the disaster recovery process becomes a business continuity
process.
Business Continuity Planning
Business continuity planning ensures that critical business
functions can continue if a disaster occurs.
Unlike the DRP, which is usually managed by the IT community
of interest, the business continuity plan (BCP) is most properly managed by the
CEO of an organization.
The BCP is activated and executed concurrently with the DRP
when the disaster is major or long term and requires fuller and more complex
restoration of information and information resources.
While the BCP reestablishes critical business functions at an
alternate site, the DRP team focuses on the reestablishment of the technical
infrastructure and business operations at the primary site.
The identification of critical business functions and the
resources to support them is the cornerstone of BCP, as these functions are the
first that must be reestablished at the alternate site.
Continuity Strategies
A CP team can choose from several continuity strategies in its
planning for business continuity.
The determining factor is usually cost.
In general there are three exclusive-use options:
·
hot sites,
·
warm sites, and
·
cold sites,
and three shared-use options:
·
timeshare,
·
service bureaus, and
·
mutual agreements.
Exclusive Use Options
Hot Site: A fully configured computer facility, with all
services, communications links, and physical plant operations.
Warm Site: Provides many of the same services and options as a
hot site, but typically software applications are either not included or not
installed and configured.
Cold Site: Provides only rudimentary services and facilities.
Shared Use Options
Timeshare: Operates like an exclusive-use site, but is leased
jointly with a business partner or other organization.
Service Bureau: A service agency that, for a fee, provides
physical facilities during a disaster.
Mutual Agreement: A contract between two organizations for each
to assist the other in the event of a disaster.
Specialized alternatives:
·
rolling mobile site
·
externally stored resources
Off-Site Disaster Data Storage
To get any of these sites up and running quickly, the
organization must be able to move data into the new site’s systems.
Options include:
·
Electronic vaulting - The bulk batch-transfer of
data to an off-site facility.
·
Remote Journaling - The transfer of live
transactions to an off-site facility.
·
Database shadowing - The storage of duplicate online transaction data,
along with the duplication of the databases at the remote site to a
redundant server.
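The difference between electronic vaulting and remote journaling lies in when
data leaves the primary site: in bulk batches or transaction by transaction.
The Python sketch below illustrates that contrast with in-memory stand-ins;
the class names are hypothetical and no real network transfer or storage
product is modeled.

```python
class OffSiteFacility:
    """Stand-in for a remote storage site; it only collects what it receives."""

    def __init__(self):
        self.received = []

    def receive(self, records):
        self.received.extend(records)


class ElectronicVault:
    """Accumulates records locally and ships them off-site in bulk batches."""

    def __init__(self, site):
        self.site = site
        self.pending = []

    def write(self, record):
        self.pending.append(record)     # nothing leaves the primary site yet

    def transfer_batch(self):
        self.site.receive(self.pending)  # bulk batch-transfer
        self.pending = []


class RemoteJournal:
    """Forwards each live transaction off-site as it happens."""

    def __init__(self, site):
        self.site = site

    def write(self, record):
        self.site.receive([record])
```

In this sketch, anything still in the vault's pending list would be lost if
the primary site failed before the next batch, whereas the journal has already
sent every transaction it was given.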
Putting a Contingency Plan Together
The CP team should include:
·
Champion.
·
Project manager.
·
Team members.
·
Business managers
·
Information technology managers
·
Information security managers.
Business Impact Analysis
The business impact analysis (BIA) provides the CP team with
information about systems and the threats they face, and is the first phase in
the CP process.
The BIA is a crucial component of the initial planning stages,
as it provides detailed scenarios of the impact each potential attack can have
on the organization.
One of the fundamental differences between a BIA and the risk
management process is that risk management focuses on identifying the threats,
vulnerabilities, and attacks to determine what controls can protect the
information.
The BIA assumes that these controls have been bypassed, have
failed, or are otherwise ineffective, and that the attack was successful.
The BIA proceeds through five stages:
·
Threat attack identification
·
Business unit analysis
·
Attack success scenarios
·
Potential damage assessment
·
Subordinate plan classification
Threat Attack Identification and Prioritization
An organization that has followed the risk management process
will have already identified and prioritized threats facing it.
For the BIA, these organizations need only update the threat
list and add one additional piece of information, the attack profile.
An attack profile is a detailed description of the activities
that occur during an attack.
Business Unit Analysis
The second major BIA task is the analysis and prioritization
of business functions within the organization.
Attack Success Scenario Development
Next the BIA team must create a series of scenarios depicting
the impact of an occurrence of each threat on each functional area.
Attack profiles should include scenarios depicting a typical
attack, including its methodology, the indicators of attack, and the broad
consequences.
Then attack success scenarios with more detail are added to
the attack profile, including alternate outcomes—best, worst, and most likely.
Potential Damage Assessment
From these detailed scenarios, the BIA planning team must
estimate the cost of the best, worst, and most likely outcomes by preparing an
attack scenario end case.
This will allow you to identify what must be done to recover
from each possible case.
Related Plan Classification
Once the potential damage has been assessed, and each scenario
and attack scenario end case has been evaluated, a related plan must be
developed or identified from among existing plans already in place.
Each attack scenario end case is categorized as disastrous or
not.
Attack end cases that are disastrous find members of the
organization waiting out the attack, and planning to recover after it is over.
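An attack scenario end case can be captured as a small record holding the
three cost estimates and the disastrous-or-not classification. The Python
sketch below is illustrative only: the field names are hypothetical, and the
mapping of non-disastrous cases to the IRP and disastrous cases to the DRP and
BCP is an assumption drawn from how this chapter describes when each plan is
activated.

```python
from dataclasses import dataclass


@dataclass
class AttackScenarioEndCase:
    """Hypothetical record for one threat against one business function."""
    threat: str
    business_function: str
    best_case_cost: float
    most_likely_cost: float
    worst_case_cost: float
    disastrous: bool  # classification assigned by the BIA planning team


def related_plan(end_case: AttackScenarioEndCase) -> str:
    """Pick the plan family that covers this end case (assumed mapping)."""
    return "DRP/BCP" if end_case.disastrous else "IRP"
```

The cost figures are inputs the planning team estimates; this sketch does not
derive the classification from them.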
Combining the DRP and the BCP
Because the DRP and BCP are closely related, most
organizations prepare them concurrently, and may combine them into a single
document.
Such a comprehensive plan must be able to support the
reestablishment of operations at two different locations; one immediately at an
alternate site, and one eventually back at the primary site.
Therefore, although a single planning team can develop the
combined DRP/BCP, execution requires separate teams.
A Sample Disaster Recovery Plan
1. Name of agency.
2. Date of completion or update of the plan and test date.
3. Agency staff to be called in the event of a disaster.
4. Emergency services to be called (if needed) in the event of a disaster.
5. Locations of in-house emergency equipment and supplies.
6. Sources of off-site equipment and supplies.
7. Salvage Priority List.
8. Agency Disaster Recovery Procedures.
9. Follow-up Assessment.
Testing Contingency Plans
Once problems are identified during the testing process,
improvements can be made, and the resulting plan can be relied on in times of
need.
There are five testing strategies that can be used to test
contingency plans:
1. Desk check
2. Structured walkthrough
3. Simulation
4. Parallel testing
5. Full interruption
Continuous Improvement
As a closing thought, just as in all organizational efforts,
iteration results in improvement.
A formal implementation of this methodology is a process known
as continuous process improvement (CPI).
Each time the organization rehearses its plans, it should
learn from the process, improve the plans, and then rehearse again.
Through constant evaluation and improvement, the organization
keeps refining the process and works toward a better outcome.
Key Terms
After-action review
Alert message
Alert roster
Attack profile
Attack scenario end case
Business continuity planning (BCP)
Business Impact Analysis (BIA)
Champion
Cold site
Contingency planning (CP)
Crisis management
Database shadowing
Desk check
Electronic vaulting
Full-interruption
Hierarchical roster
Hot site
Incident candidate
Incident classification
Incident damage assessment
Mutual agreement
Parallel testing
Project manager
Rapid-onset disasters
Remote journaling
Scenarios
Sequential roster
Service bureau
Simulation
Slow-onset disaster
Structured walk-through
Team members
Timeshare
Warm site