APPENDIX C Computer Security Incident Handling Guide
Recommendations of the National Institute of Standards and Technology
Executive Summary
Computer security incident response has become an important component of information technology (IT) programs. Security-related threats have become not only more numerous and diverse but also more damaging and disruptive. New types of security-related incidents emerge frequently. Preventative activities based on the results of risk assessments can lower the number of incidents, but not all incidents can be prevented. An incident response capability is therefore necessary for rapidly detecting incidents, minimizing loss and destruction, mitigating the weaknesses that were exploited, and restoring computing services. To that end, this publication provides guidelines for incident handling, particularly for analyzing incident-related data and determining the appropriate response to each incident. The guidelines can be followed independently of particular hardware platforms, operating systems, protocols, or applications.
Because performing incident response effectively is a complex undertaking, establishing a successful incident response capability requires substantial planning and resources. Continually monitoring threats through intrusion detection systems (IDSs) and other mechanisms is essential. Establishing clear procedures for assessing the current and potential business impact of incidents is critical, as is implementing effective methods of collecting, analyzing, and reporting data. Building relationships and establishing suitable means of communication with other internal groups (e.g., human resources, legal) and with external groups (e.g., other incident response teams, law enforcement) are also vital.
This publication seeks to help both established and newly formed incident response teams. This document assists organizations in establishing computer security incident response capabilities and handling incidents
efficiently and effectively. More specifically, this document discusses the following items:
+Organizing a computer security incident response capability
Establishing incident response policies and procedures
Structuring an incident response team, including outsourcing considerations
Recognizing which additional personnel may be called on to participate in incident response.
+Handling incidents from initial preparation through the post-incident lessons learned phase
+Handling specific types of incidents
Denial of Service (DoS): an attack that prevents or impairs the authorized use of networks, systems, or applications by exhausting resources
Malicious Code: a virus, worm, Trojan horse, or other code-based malicious entity that infects a host
Unauthorized Access: a person gains logical or physical access without permission to a network, system, application, data, or other resource
Inappropriate Usage: a person violates acceptable computing use policies
Multiple Component: a single incident that encompasses two or more incidents; for example, a malicious code infection leads to unauthorized access to a host, which is then used to gain unauthorized access to additional hosts.
Implementing the following requirements and recommendations should facilitate efficient and effective incident response for Federal departments and agencies.
Organizations must create, provision, and operate a formal incident response capability. Federal law requires Federal agencies to report incidents to the Federal Computer Incident Response Center (FedCIRC) office within the Department of Homeland Security.
The Federal Information Security Management Act (FISMA) of 2002 requires Federal agencies to establish incident response capabilities. Each Federal civilian agency must designate a primary and secondary point of contact (POC) with FedCIRC, report all incidents, and internally document corrective actions and their impact. Each agency is responsible for determining specific ways in which these requirements are to be fulfilled.
Establishing an incident response capability should include the following actions:
+Creating an incident response policy
+Developing procedures for performing incident handling and reporting, based on the incident response policy
+Setting guidelines for communicating with outside parties regarding incidents
+Selecting a team structure and staffing model
+Establishing relationships between the incident response team and other groups, both internal (e.g., legal department) and external (e.g., law enforcement agencies)
+Determining what services the incident response team should provide
+Staffing and training the incident response team.
Organizations should reduce the frequency of incidents by effectively securing networks, systems, and applications.
Preventing problems is normally less costly and more effective than reacting to them after they occur. Thus, incident prevention is an important complement to an incident response capability. If security controls are insufficient, high volumes of incidents may occur, overwhelming the resources and capacity for response, which would result in delayed or incomplete recovery and possibly more extensive damage and longer periods of service and data unavailability. Incident handling can be performed more effectively if organizations complement their incident response capability with adequate resources to actively maintain the security of networks, systems, and applications, freeing the incident response team to focus on handling serious incidents.
Organizations should document their guidelines for interactions with other organizations regarding incidents.
During incident handling, the organization may need to communicate with outside parties, including other incident response teams, law enforcement, the media, vendors, and external victims. Because such communications often need to occur quickly, organizations should predetermine communication guidelines so that only the appropriate information is shared with the right parties. If sensitive information is released inappropriately, it can lead to greater disruption and financial loss than the incident itself. Creating and maintaining a list of internal and external POCs, along with backups for each contact, should assist in making communications among parties easier and faster.
Organizations should emphasize the importance of incident detection and analysis throughout the organization.
In an organization, thousands or millions of possible signs of incidents may occur each day, recorded mainly by logging and computer security software. Automation is needed to perform an initial analysis of the data and select events of interest for human review. Event correlation software and centralized logging can be of great value in automating the analysis process. However, the effectiveness of the process depends on the quality of the data that goes into it. Organizations should establish logging standards and procedures to ensure that adequate information is collected by logs and security software and that the data is reviewed regularly.
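As a minimal sketch of the automated initial analysis described above, the following Python filter selects events of interest from already-parsed log records for human review. The record fields (`source`, `severity`, `message`), the severity scale, and the keyword list are illustrative assumptions rather than anything the guideline prescribes.

```python
from dataclasses import dataclass

# Hypothetical keywords that mark a message as worth a human look.
SUSPICIOUS_KEYWORDS = ("failed login", "denied", "malware")


@dataclass(frozen=True)
class LogEvent:
    source: str    # device or application that produced the entry (assumed field)
    severity: int  # assumed scale: 0 = informational ... 5 = critical
    message: str


def select_events_of_interest(events, min_severity=3):
    """Return the events that should be queued for human review."""
    selected = []
    for event in events:
        text = event.message.lower()
        keyword_hit = any(keyword in text for keyword in SUSPICIOUS_KEYWORDS)
        if event.severity >= min_severity or keyword_hit:
            selected.append(event)
    return selected
```

In practice the threshold and keywords would be tuned against the organization's own logging standards; the point is only that routine entries are dropped before an analyst sees them.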
Organizations should create written guidelines for prioritizing incidents.
Prioritizing the handling of individual incidents is a critical decision point in the incident response process. Incidents should be prioritized based on the following:
+Criticality of the affected resources (e.g., public Web server, user workstation)
+Current and potential technical effect of the incident (e.g., root compromise, data destruction).
Combining the criticality of the affected resources and the current and potential technical effect of the incident determines the business impact of the incident — for example, data destruction on a user workstation might result in a minor loss of productivity, whereas root compromise of a public Web server might result in a major loss of revenue, productivity, access to services, and reputation, as well as the release of confidential data (e.g., credit card numbers, Social Security numbers).
Incident handlers may be under great stress during incidents, so it is important to make the prioritization process clear. Organizations should decide how the incident response team should react under various circumstances, and then create a Service Level Agreement (SLA) that documents the appropriate actions and maximum response times. This documentation is particularly valuable for organizations that outsource components of their incident response programs. Documenting the guidelines should facilitate faster and more consistent decision-making.
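The prioritization scheme above can be sketched as a small lookup, assuming three levels for each factor and a simple product as the combining rule. The levels, the multiplication, and every response time in the table are hypothetical placeholders; the guideline says only that the two factors are combined and that an SLA records maximum response times.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical SLA table: maximum response time in minutes per impact score.
# Real values belong in the organization's own written guidelines.
MAX_RESPONSE_MINUTES = {1: 480, 2: 240, 3: 120, 4: 60, 6: 30, 9: 15}


def business_impact(criticality: Level, technical_effect: Level) -> int:
    """Combine resource criticality and technical effect into one score."""
    return int(criticality) * int(technical_effect)


def max_response_minutes(criticality: Level, technical_effect: Level) -> int:
    """Look up the maximum response time for the combined business impact."""
    return MAX_RESPONSE_MINUTES[business_impact(criticality, technical_effect)]
```

Under these assumed values, a root compromise of a public Web server (`Level.HIGH`, `Level.HIGH`) maps to the shortest response time, while a minor effect on a user workstation maps to the longest.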
Organizations should use the lessons learned process to gain value from incidents.
After a major incident has been handled, the organization should hold a lessons learned meeting to review how effective the incident handling process was and identify necessary improvements to existing security controls and practices. Lessons learned meetings should also be held periodically for lesser incidents. The information accumulated from all lessons learned meetings should be used to identify systemic security weaknesses and deficiencies in policies and procedures. Follow-up reports generated for each resolved incident can be important not only for evidentiary purposes but also for reference in handling future incidents and in training new incident response team members. An incident database, with detailed information on each incident that occurs, can be another valuable source of information for incident handlers.
Organizations should strive to maintain situational awareness during large-scale incidents.
Organizations typically find it very challenging to maintain situational awareness for the handling of large-scale incidents because of their complexity. Many people within the organization may play a role in the incident response, and the organization may need to communicate rapidly and efficiently with various external groups. Collecting, organizing, and analyzing all the pieces of information, so that the right decisions can be made and executed, are not easy tasks. The key to maintaining situational awareness is preparing to handle large-scale incidents, which should include the following:
Establishing, documenting, maintaining, and exercising on-hours and off-hours contact and notification mechanisms for various individuals and groups within the organization (e.g., chief information officer [CIO], head of information security, IT support, business continuity planning) and outside the organization (e.g., incident response organizations, counterparts at other organizations).
Planning and documenting guidelines for the prioritization of incident response actions based on business impact.
Preparing one or more individuals to act as incident leads who are responsible for gathering information from the incident handlers and other parties, and distributing relevant information to the parties that need it.
Practicing the handling of large-scale incidents through exercises and simulations on a regular basis; such incidents happen rarely, so incident response teams often lack experience in handling them effectively.
Appendix A — Recommendations
Appendix A lists the major recommendations presented in Sections 2 through 8 of this document. The first group of recommendations applies to organizing an incident response capability. The remaining recommendations have been grouped by the phases of the incident response life cycle: preparation; detection and analysis; containment, eradication, and recovery; and post-incident activity. Each group contains general recommendations for its incident response phase and any applicable recommendations for handling particular categories of incidents (e.g., denial of service [DoS]) during the phase.
A.1 Organizing a Computer Security Incident Response Capability
+Establish a formal incident response capability. Organizations should be prepared to respond quickly and effectively when computer security defenses are breached. The Federal Information Security Management Act (FISMA) requires Federal agencies to establish incident response capabilities.
A.1.1 Incident Response Policy and Procedure Creation
+Create an incident response policy and use it as the basis for incident response procedures. The incident response policy is the foundation of the incident response program. It defines which events are considered incidents, establishes the organizational structure for incident response, defines roles and responsibilities, and lists the requirements for reporting incidents, among other items.
+Establish policies and procedures regarding incident-related information sharing. The organization will want or be required to communicate incident details with outside parties, such as the media, law enforcement agencies, and incident reporting organizations. The incident response team should discuss this requirement at length with the organization’s public affairs staff, legal advisors, and management to establish policies and procedures regarding information sharing. The team should comply with existing organization policy on interacting with the media and other outside parties.
+Provide pertinent information on incidents to the appropriate incident reporting organization. Federal civilian agencies are required to report incidents to the Federal Computer Incident Response Center (FedCIRC). Reporting benefits the agencies because the incident reporting organizations use the reported data to provide information to the agencies regarding new threats and incident trends.
A.1.2 Incident Response Team Structure and Services
+Consider the relevant factors when selecting an appropriate incident response team model. Organizations should carefully weigh the advantages and disadvantages of each possible team structure model and staffing model in the context of the organization’s needs and available resources.
+Select people with appropriate skills for the incident response team. The credibility and proficiency of the team depend largely on the technical skills of its members. Poor technical judgment can undermine the team’s credibility and cause incidents to worsen. Critical technical skills include system administration, network administration, programming, technical support, and intrusion detection. Teamwork and communications skills are also needed for effective incident handling.
+Identify other groups within the organization that may need to participate in incident handling. Every incident response team relies on the expertise and judgment of other teams, including management, information security, information technology (IT) support, legal, public affairs, and facilities management.
+Determine which services the team should offer. Although the main focus of the team is incident response, most teams perform additional functions. Examples include distributing security advisories, performing vulnerability assessments, educating users on security, and monitoring intrusion detection sensors.
A.2 Preparation
+Acquire tools and resources that may be of value during incident handling. The team will be more efficient at handling incidents if various tools and resources are already available to them. Examples include contact lists, encryption software, network diagrams, backup devices, computer forensic software, port lists, and security patches.
+Prevent incidents from occurring by ensuring that networks, systems, and applications are sufficiently secure. Preventing incidents is beneficial to the organization and reduces the workload of the incident response team. Performing periodic risk assessments and reducing the identified risks to an acceptable level are effective in reducing the number of incidents. User, IT staff, and management awareness of security policies and procedures is also very important.
A.2.1 Denial of Service Incidents
+Configure firewall rulesets to prevent reflector attacks. Most reflector attacks can be stopped through network-based and host-based firewall rulesets that reject suspicious combinations of source and destination ports.
+Configure border routers to prevent amplifier attacks. Amplifier attacks can be blocked by configuring border routers not to forward directed broadcasts.
+Determine how the organization’s Internet service providers (ISP) and second-tier providers can assist in handling network-based DoS attacks. ISPs can often filter or limit certain types of traffic, slowing or halting a DoS attack. They can also provide logs of DoS traffic and may be able to assist in tracing the source of the attack. The organization should meet with the ISPs in advance to establish procedures for requesting such assistance.
+Configure security software to detect DoS attacks. Intrusion detection software can detect many types of DoS activity. Establishing network and system activity baselines, and monitoring for significant deviations from those baselines, can also be useful in detecting attacks.
+Configure the network perimeter to deny all incoming and outgoing traffic that is not expressly permitted. By restricting the types of traffic that can enter and leave the environment, the organization will limit the methods that attackers can use to perform DoS attacks.
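The reflector-attack recommendation in this list can be illustrated with a small rule check. This is a sketch under one stated assumption: a port combination is treated as suspicious when both the source and the destination port belong to a short list of UDP services that answer unsolicited requests (echo 7, chargen 19, DNS 53, SNMP 161). The list and the function names are illustrative, and a real ruleset would need exceptions for legitimate traffic such as server-to-server DNS.

```python
# Assumed set of UDP service ports that can act as reflectors:
# echo (7), chargen (19), DNS (53), SNMP (161).
REFLECTOR_PORTS = frozenset({7, 19, 53, 161})


def is_suspicious_port_pair(src_port: int, dst_port: int) -> bool:
    """Flag traffic whose source and destination are both reflector-prone services."""
    return src_port in REFLECTOR_PORTS and dst_port in REFLECTOR_PORTS


def filter_packets(packets):
    """Keep only the (src_port, dst_port) pairs that the rule does not reject."""
    return [pair for pair in packets if not is_suspicious_port_pair(*pair)]
```

A normal client query from an ephemeral port to port 53 passes, whereas a packet claiming to come from chargen and addressed to echo is rejected.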
A.2.2 Malicious Code Incidents
+Make users aware of malicious code issues. Users should be familiar with the methods that malicious code uses to propagate and the symptoms of infections. Holding regular user education sessions helps to ensure that users are aware of the risks that malicious code poses. Teaching users how to safely handle e-mail attachments should reduce the number of infections that occur.
+Read antivirus bulletins. Bulletins regarding new malicious code threats provide timely information to incident handlers.
+Deploy host-based intrusion detection systems, including file integrity checkers, to critical hosts. Host-based IDS software, particularly file integrity checkers, can detect signs of malicious code incidents, such as configuration changes and modifications to executables.
+Use antivirus software, and keep it updated with the latest virus signatures. Antivirus software should be deployed to all hosts and all applications that may be used to transfer malicious code. The software should be configured to detect and disinfect or quarantine malicious code infections. All antivirus software should be kept current with the latest virus signatures so the newest threats can be detected.
+Configure software to block suspicious files. Files that are very likely to be malicious should be blocked from the environment, such as those with file extensions that are usually associated with malicious code and files with suspicious combinations of file extensions.
+Eliminate open Windows shares. Many worms spread through unsecured shares on hosts running Windows. A single infection may rapidly spread to hundreds or thousands of hosts through unsecured shares.
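The recommendation in this list to block suspicious files can be sketched as a filename check. Both extension sets below are hypothetical examples, and the rule that an executable extension is suspicious only when it follows another extension is an assumption chosen to illustrate "suspicious combinations"; an organization would substitute its own policy.

```python
# Hypothetical policy: extensions blocked outright.
ALWAYS_BLOCKED = frozenset({".pif", ".scr", ".vbs"})
# Hypothetical policy: extensions blocked only as part of a double extension.
EXECUTABLE = frozenset({".bat", ".com", ".exe"})


def should_block(filename: str) -> bool:
    """Return True if the filename matches the assumed blocking policy."""
    parts = filename.lower().strip().split(".")
    # Everything after the first dot-separated piece is treated as an extension.
    extensions = ["." + part for part in parts[1:] if part]
    if not extensions:
        return False
    last = extensions[-1]
    if last in ALWAYS_BLOCKED:
        return True
    # A name such as "photo.jpg.exe" disguises an executable as a document.
    return last in EXECUTABLE and len(extensions) >= 2
```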
A.2.3 Unauthorized Access Incidents
+Configure intrusion detection software to alert on attempts to gain unauthorized access. Network and host-based intrusion detection software (including file integrity checking software) is valuable for detecting attempts to gain unauthorized access. Each type of software may detect incidents that the other types of software cannot, so the use of multiple types of computer security software is highly recommended.
+Configure all hosts to use centralized logging. Incidents are easier to detect if data from all hosts across the organization is stored in a centralized, secured location.
+Establish procedures for having all users change their passwords. A password compromise may force the organization to require all users of an application, system, or trust domain (or perhaps the entire organization) to change their passwords.
+Configure the network perimeter to deny all incoming traffic that is not expressly permitted. By limiting the types of incoming traffic, attackers should be able to reach fewer targets and should be able to reach the targets using designated protocols only. This should reduce the number of unauthorized access incidents.
+Secure all remote access methods, including modems and virtual private networks (VPN). Unsecured modems provide easily attainable unauthorized access to internal systems and networks. Remote access clients are often outside the organization’s control, so granting them access to resources increases risk.
+Put all publicly accessible services on secured demilitarized zone (DMZ) network segments. This action permits the organization to allow external hosts to initiate connections to hosts on the DMZ segments only, not to hosts on internal network segments. This should reduce the number of unauthorized access incidents.
+Disable all unneeded services on hosts and separate critical services. Every service that is running presents another potential opportunity for compromise. Separating critical services is important because if an attacker compromises a host that is running a critical service, immediate access should be gained only to that one service.
+Use host-based firewall software to limit individual hosts’ exposure to attacks. Deploying host-based firewall software to individual hosts and configuring it to deny all activity that is not expressly permitted should further reduce the likelihood of unauthorized access incidents.
+Create and implement a password policy. The password policy should require the use of complex, difficult-to-guess passwords and should ensure that authentication methods are sufficiently strong for accessing critical resources. Weak and default passwords are likely to be guessed or cracked, leading to unauthorized access.
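A password policy of the kind recommended above is often enforced with an automated check. The sketch below assumes a minimum length of 12, a requirement for four character classes, and a tiny list of weak or default passwords; all three are illustrative choices, not requirements stated in this guide.

```python
import string

# Hypothetical examples of weak or default passwords to reject outright.
COMMON_PASSWORDS = frozenset({"password", "admin", "changeme", "letmein"})


def password_problems(password: str, min_length: int = 12) -> list:
    """Return a list of policy violations; an empty list means the password passes."""
    problems = []
    if len(password) < min_length:
        problems.append("too short")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("common or default password")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    return problems
```

Returning the list of violations, rather than a bare pass or fail, lets the calling application tell the user what to change.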
A.2.4 Inappropriate Usage Incidents
+Discuss the handling of inappropriate usage incidents with the organization’s human resources and legal departments. Processes for monitoring and logging user activities should comply with the organization’s policies and all applicable laws. Procedures for handling incidents that directly involve employees should incorporate discretion and confidentiality.
+Discuss liability issues with the organization’s legal departments. Liability issues may arise during inappropriate usage incidents, particularly for incidents that are targeted at outside parties. Incident handlers should understand when they should discuss incidents with the allegedly attacked party and what information they should reveal.
+Configure network-based intrusion detection software to detect certain types of inappropriate usage. Intrusion detection software has built-in capabilities to detect certain inappropriate usage incidents, such as the use of unauthorized services, outbound reconnaissance activity and attacks, and improper mail relay usage (e.g., sending spam).
+Log basic information on user activities. Basic information on user activities such as File Transfer Protocol (FTP) commands, Web requests, and e-mail headers may be valuable for investigative and evidentiary purposes.
+Configure all e-mail servers so they cannot be used for unauthorized mail relaying. Mail relaying is commonly used to send spam.
+Implement spam filtering software on all e-mail servers. Spam filtering software can block much of the spam sent by external parties to the organization’s users and spam sent by internal users.
+Implement uniform resource locator (URL) filtering software. URL filtering software prevents access to many inappropriate Web sites. Users should be required to use the software, typically by preventing access to external Web sites unless the traffic passes through a server that performs URL filtering.
A.2.5 Multiple Component Incidents
+Use centralized logging and event correlation software. Incident handlers should identify an incident as having multiple components more quickly if all precursors and indications are accessible from a single point of view.
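To illustrate why a single point of view helps, the sketch below assumes that events from every source have already been collected centrally and reduced to hypothetical `(host, category)` pairs. Grouping them by host surfaces any host showing more than one incident category, which is a hint that the incident may have multiple components.

```python
from collections import defaultdict


def hosts_with_multiple_components(events):
    """Map each host to its incident categories when it shows more than one.

    events: iterable of (host, category) pairs from a central log store.
    """
    categories_by_host = defaultdict(set)
    for host, category in events:
        categories_by_host[host].add(category)
    return {
        host: sorted(categories)
        for host, categories in categories_by_host.items()
        if len(categories) > 1
    }
```

Real event correlation software does far more than this (time windows, cross-host chains); the example shows only the basic grouping step.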
A.3 Detection and Analysis
+Identify precursors and indications through alerts generated by several types of computer security software. Network and host-based intrusion detection systems, antivirus software, and file integrity checking software are valuable for detecting signs of incidents. Each type of software may detect incidents that the other types of software cannot, so the use of several types of computer security software is highly recommended. Third-party monitoring services can also be helpful.
+Establish mechanisms for outside parties to report incidents. Outside parties may want to report incidents to the organization; for example, they may believe that one of the organization’s users is attacking them. Organizations should publish a phone number and e-mail address that outside parties can use to report such incidents.
+Require a baseline level of logging and auditing on all systems, and a higher baseline level on all critical systems. Logs from operating systems, services, and applications frequently provide value during incident analysis, particularly if auditing was enabled. The logs can provide information such as which accounts were accessed and what actions were performed.
+Profile networks and systems. Profiling measures the characteristics of expected activity levels so that changes in patterns can be more easily identified. If the profiling process is automated, deviations from expected activity levels can be detected and reported to administrators quickly, leading to faster detection of incidents and operational issues.
+Understand the normal behaviors of networks, systems, and applications. Team members who understand what normal behavior is should be able to recognize abnormal behavior more easily. This knowledge can best be gained by reviewing log entries and security alerts; the handlers should become familiar with the typical data and can investigate the unusual entries to gain more knowledge.
+Use centralized logging and create a log retention policy. Information regarding an incident may be recorded in several places. Organizations should deploy centralized logging servers and configure devices to send duplicates of their log entries to the centralized servers. The team benefits because it can access all log entries at once; also, changes made to logs on individual hosts will not affect the data already sent to the centralized servers. A log retention policy is important because older log entries may show previous instances of similar or related activity.
+Perform event correlation. Indications of an incident may be captured in several logs. Correlating events among multiple sources can be invaluable in collecting all the available information for an incident and validating whether the incident occurred. Centralized logging makes event correlation easier and faster.
+Keep all host clocks synchronized. If the devices reporting events have inconsistent clock settings, event correlation will be more difficult. Clock discrepancies may also cause issues from an evidentiary standpoint.
+Maintain and use a knowledge base of information. Handlers need to reference information quickly during incident analysis; a centralized knowledge base provides a consistent, maintainable source of information. The knowledge base should include general information, such as commonly used port numbers and links to virus information, and data on precursors and indications of previous incidents.
+Create a diagnosis matrix for less experienced staff. Help desk staff, system administrators, and new incident response team members may need assistance in determining what type of incident may be occurring. A diagnosis matrix that lists incident categories and the symptoms associated with each category can provide guidance as to what type of incident is occurring and how the incident can be validated.
+Start recording all information as soon as the team suspects that an incident has occurred. Every step taken, from the time the incident was detected to its final resolution, should be documented and timestamped. Information of this nature can serve as evidence in a court of law if legal prosecution is pursued. Recording the steps performed can also lead to a more efficient, more systematic, and less error-prone handling of the problem.
+Safeguard incident data. It often contains sensitive information regarding such elements as vulnerabilities, security breaches, and users that may have performed inappropriate actions. The team should ensure that access to incident data is restricted properly, both logically and physically.
+Prioritize incidents by business impact, based on the criticality of the affected resources and the technical impact of the incident. Because of resource limitations, incidents should not be handled on a first-come, first-served basis. Instead, organizations should establish written guidelines that outline how quickly the team must respond to the incident and what actions should be performed, based on the incident’s current and potential business impact. This guidance saves time for the incident handlers and provides a justification to management and system owners for their actions. Organizations should also establish an escalation process for those instances when the team does not respond to an incident within the designated time.
+Include provisions regarding incident reporting in the organization’s incident response policy. Organizations should specify which incidents must be reported, when they must be reported, and to whom. The parties most commonly notified are the chief information officer (CIO), head of information security, local information security officer, other incident response teams within the organization, and system owners.
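The profiling recommendation in this section lends itself to a short illustration. The sketch below assumes the profile is simply the mean and sample standard deviation of past measurements (for example, connections per minute) and that a reading more than three standard deviations from the mean counts as a deviation. Both the statistic and the threshold are assumptions; production profiling usually also accounts for time of day and seasonality.

```python
from statistics import mean, stdev


def build_profile(samples):
    """Summarize expected activity as (mean, standard deviation).

    Requires at least two samples, because stdev() is undefined for fewer.
    """
    return mean(samples), stdev(samples)


def is_deviation(value, profile, threshold=3.0):
    """Report whether a new measurement falls outside the expected range."""
    average, spread = profile
    if spread == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return value != average
    return abs(value - average) / spread > threshold
```

A reading near the historical average is ignored, while a sudden several-fold jump would be reported to administrators for review.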
A.4 Containment, Eradication, and Recovery
+Establish strategies and procedures for containing incidents. It is important to contain incidents quickly and effectively to limit their business impact. Organizations should define acceptable risks in containing incidents and develop strategies and procedures accordingly. Containment strategies should vary based on the type of incident.
+Follow established procedures for evidence gathering and handling. The team should clearly document how all evidence has been preserved. Evidence should be accounted for at all times. The team should meet with legal staff and law enforcement agencies to discuss evidence handling, then develop procedures based on those discussions.
+Capture volatile data from systems as evidence. This effort includes lists of network connections, processes, login sessions, open files, network interface configurations, and the contents of memory. Running carefully chosen commands from trusted media can collect the necessary information without damaging the system’s evidence.
+Obtain system snapshots through full forensic disk images, not file system backups. Disk images should be made to sanitized write-protectable or write-once media. This process is superior to a file system backup for investigatory and evidentiary purposes. Imaging is also valuable in that it is much safer to analyze an image than it is to perform analysis on the original system because the analysis may inadvertently alter the original.
A.4.1 Denial of Service Incidents
+Create a containment strategy that includes several solutions in sequence. The decision-making process for containing DoS incidents is easier if recommended solutions are predetermined. Because the effectiveness of each possible solution will vary among incidents, organizations should select several solutions and determine the sequence in which the solutions should be attempted.
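A predetermined sequence of containment solutions can be sketched as an ordered list that is worked through until the attack stops. The function name, the `(name, action)` pair format, and the `attack_stopped` callback are hypothetical; in practice each action would be a documented manual or automated procedure.

```python
def contain(strategies, attack_stopped):
    """Try containment solutions in their predetermined order.

    strategies: ordered list of (name, action) pairs, where action is a callable.
    attack_stopped: callable returning True once the attack is contained.
    Returns the names of the solutions that were attempted.
    """
    attempted = []
    for name, action in strategies:
        action()
        attempted.append(name)
        if attack_stopped():
            break  # later, more disruptive solutions are not needed
    return attempted
```

Ordering the list from least to most disruptive is one reasonable design choice, since the loop stops at the first solution that works.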
A.4.2 Malicious Code Incidents
+Contain malicious code incidents as quickly as possible. Because malicious code works surreptitiously and can propagate to other systems rapidly, early containment of a malicious code incident is needed to stop it from spreading and causing further damage. Infected systems should be disconnected from the network immediately. Organizations may need to block malicious code at the e-mail server level, or even temporarily suspend e-mail services to gain control over serious e-mail-borne malicious code incidents.
A.4.3 Unauthorized Access Incidents
+Provide change management information to the incident response team. Indications such as system shutdowns, audit configuration changes, and executable modifications are probably caused by routine system administration, rather than attacks. When such indications are detected, the team should be able to use change management information to verify that the indications are caused by authorized activity.
+Select containment strategies that balance mitigating risks and maintaining services. Incident handlers should consider moderate containment solutions that focus on mitigating the risks as much as is practical while maintaining unaffected services.
+Restore or reinstall systems that appear to have suffered a root compromise. The effects of root compromises are often difficult to identify completely. The system should be restored from a known good backup, or the operating system and applications should be reinstalled from scratch. The system should then be secured properly so the incident cannot recur.
A.4.4 Multiple Component Incidents
+Contain the initial incident and then search for signs of other incident components. It can take an extended period of time for a handler to authoritatively determine that an incident has only a single component; meanwhile, the initial incident has not been contained. It is generally better to contain the initial incident first.
A.5 Post-Incident Activity
A.5.1 Unauthorized Access Incidents
+Hold lessons learned meetings after major incidents. Lessons learned meetings are extremely helpful in improving security measures and the incident handling process itself.
+Separately prioritize the handling of each incident component. Resources are probably too limited to handle all incident components simultaneously. Components should be prioritized based on response guidelines for each component and how current each component is.