Chapter 3 — AI Risk Program Management — Part F: AI Incident Response, BIA, Business Continuity, and Disaster Recovery

As AI systems become increasingly integrated into critical infrastructures, business operations, and everyday applications, the need for a structured approach to incident response has become more crucial. AI-driven systems are vulnerable to unique threats, including adversarial attacks, bias exploitation, data poisoning, and system failures. Incident response for AI aims to identify, mitigate, and prevent such threats while ensuring system reliability, security, and ethical compliance.

An incident management program focuses on identifying events that deviate from normal planned operations and determining the proper response to contain and prevent disruptions to business operations. The desired goal is to reduce the impact felt by an enterprise and recover and resume operations at acceptable levels. The speed with which an enterprise can identify, analyze, respond to, and recover from an incident reduces its impact and, ultimately, the associated costs. It is important that security professionals understand the challenges that AI presents to incident management due to reliance on AI models, large datasets, and the ability to make real-time decisions. Common AI-related incidents include:

Adversarial attacks—Manipulating input data to deceive AI models into making incorrect decisions (e.g., misleading facial recognition systems or autonomous vehicles)
Data poisoning—Injecting harmful or biased data into training datasets to corrupt model outputs
Bias exploitation—Using AI models’ inherent biases to generate unethical or unfair outcomes
Model drift—The degradation of AI model performance over time due to changing data patterns
Unauthorized access and exploitation—Hacking into AI systems to manipulate or extract sensitive information
Automation failures—Errors in AI-driven automation leading to incorrect or dangerous outcomes

Incident response for AI must follow a structured approach, framework, and guidance consistent with traditional cybersecurity incident response plans but tailored to AI-specific threats. The phases of AI incident response are shown in figure 3.33.

Figure 3.33—ISO 27035-1 Process for Handling Security Incidents (cyclic diagram showing the steps for handling security incidents)

3.21 AI Business Impact Analysis

Business impact analysis (BIA) plays a pivotal role in ensuring organizational resilience, particularly as enterprises increasingly depend on AI systems to support critical business processes. In the context of AI, BIA serves as a structured process to identify and prioritize those business functions that rely on AI capabilities, thereby enabling informed continuity planning and recovery strategies.

At its core, BIA establishes continuity requirements by determining the impact of losing support from any resource, including AI systems. This involves identifying critical business processes that depend on AI outputs or operations and assessing the consequences of their disruption over time. By doing so, BIA provides senior management with reliable data to make strategic decisions about resource allocation and risk treatment related to AI dependencies.

A comprehensive BIA in AI-driven environments enables organizations to:

Identify and prioritize critical AI-dependent business processes—The addition of AI and automated workflows results in AI being a central part of some critical business processes. Previous BIAs may need to be revisited as a result of AI implementations to see how loss of an AI solution impacts those areas.
Determine protection and recovery priorities—Understanding which AI-supported services require higher levels of protection and the sequence in which they should be restored following an incident guides effective resource deployment.
Define recovery objectives—These objectives specify acceptable downtime and data loss thresholds for AI systems, aligning recovery efforts with business needs and strategic priorities.
Inform business continuity and disaster recovery planning—The insights from BIA guide the development of continuity plans that address AI-specific risk and dependencies, ensuring that critical AI functions can be maintained or rapidly restored.
Capture resource requirements for an alternate site—BIA identifies the necessary personnel, technical infrastructure, and processes needed to sustain operations at reduced capacity during AI disruptions.

Results of the BIA inform control selection for the AI solutions deployed in an enterprise and provide insight into the risk of using AI solutions for business-critical operations.
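As a minimal sketch of how BIA output might be recorded and used, the Python below stores one entry per AI-dependent process and derives a restoration order from it. The field names, the 1-to-3 criticality scale, and the sample processes are illustrative assumptions, not part of any BIA standard. The terms recovery time objective (RTO) and recovery point objective (RPO) are used here for the acceptable downtime and data loss thresholds described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIDependency:
    """One AI-dependent business process captured during a BIA (illustrative fields)."""
    process: str
    ai_system: str
    criticality: int   # 1 = most critical; the scale is an assumption
    rto_hours: float   # recovery time objective: acceptable downtime
    rpo_hours: float   # recovery point objective: acceptable data loss window


def restoration_order(deps):
    """Order processes for restoration: most critical first, then shortest RTO."""
    return sorted(deps, key=lambda d: (d.criticality, d.rto_hours))


def breaches_rto(dep, outage_hours):
    """True when an outage has lasted longer than the process can tolerate."""
    return outage_hours > dep.rto_hours


# Hypothetical inventory used only to demonstrate the ordering.
inventory = [
    AIDependency("Marketing copy drafting", "text-generator", 3, 72.0, 24.0),
    AIDependency("Fraud screening", "fraud-model", 1, 2.0, 0.5),
    AIDependency("Support chat triage", "triage-bot", 2, 8.0, 4.0),
]
ordered = restoration_order(inventory)
```

Sorting on criticality before RTO is one possible design choice. An enterprise could equally weight financial impact or regulatory exposure, as long as the ordering rule is agreed on before an incident occurs.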

3.22 Prepare

Being prepared provides an organization with the best chance for recovery and long-term resiliency, and it includes having runbooks and playbooks. A runbook details the specific, technical steps to complete a task, while a playbook defines the overall strategy and decision framework for a security incident. In other words, runbooks are the step-by-step instructions (“how do I do it”), and playbooks are the general guide that covers the “what to do,” “why,” and “who does it.” Playbooks often contain multiple runbooks for different scenarios. Many organizations have an incident response program (IRP), but most have not adapted their programs, policies and procedures, and people to address AI-specific incidents, including:

Abuse of AI output to create harm to society
Disclosure of confidential or personal data from the training dataset
Incorrect, hallucinated, or biased predictions

An AI-ready IRP should address the unique and additive risk and the impact an AI incident could have on the organization and the welfare of humans.

3.22.1 Policies, Procedures, and Model Documentation

An incident response policy should address AI-specific threats, techniques for detection and response, and ethical considerations. Debate and decision making over the use of AI should occur outside of active incident response efforts. Maintaining thorough documentation (e.g., model cards) helps incident investigators better interpret an AI model’s architecture, training data, and decision-making processes. Documentation should include a communication plan and stakeholder escalation protocols to ensure the proper internal and external teams, business functions, and law enforcement entities are informed and called upon for support.

3.22.2 Incident Response Team

The organization should ensure that the AI incident response team includes adequate representation of key stakeholder groups, subject matter experts, and risk management functions. In addition to conventional IT and security incident responders, these stakeholders include:

Data stewards/owners—Subject matter experts on the dataset used for developing the AI model
Data engineers and scientists—Resources involved in the data preprocessing and training of the AI model who can help interpret anomalous output and behavior
Privacy experts—Resources who can guide the incident response on the possible impact to personal data, data subject rights, and compliance with privacy and data sovereignty regulations
AI ethicists—Resources trained on and responsible for the safe and ethical use of AI

3.22.3 Tabletop Exercises

The organization should train the AI incident response team on standard operating procedures (SOPs) related to the necessary skills and resources (e.g., audit logs, forensic tools, AI model cards) to handle an incident effectively. The procedures for evaluating a system breach with forensic analysis tools are very different from those used to evaluate data poisoning of an AI model. The latter requires access to or a copy of the data lake in which the data was poisoned. This requires collaboration between the incident response team and data engineering/science teams to analyze the dataset for tampering. Adapting the tabletop exercise toward AI-specific activities like dataset exploration and data input/output analysis can help the response team gain experience and be more effective.

3.23 Identify and Report

One key attribute of AI is that it is rooted in mathematical probability. AI solutions make mistakes and offer wrong predictions some percentage of the time. This is by design: AI developers are not aiming to achieve perfection, as overfitted models do not generalize well to new data. An organization must therefore define its threshold for categorizing incorrect AI predictions as an incident, and separating expected errors from genuine threats makes detection significantly more challenging.

Because of the probabilistic nature of AI outputs, AI observability is crucial in detecting the manifestation of an AI threat. An organization needs to have clear metrics and establish a baseline for its AI’s performance and normative behavior to determine when a solution has veered off course. A combination of techniques is needed to judge the AI solution’s performance and detect an incident. While some of the detection processes can be automated through AI observability tools, others require a human in the loop (HITL) to make judgments on subjective matters.
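The idea of a baseline with an incident threshold can be sketched in a few lines of Python. This is a simplified illustration, not a substitute for an observability tool: the class name, the status labels, and the choice of a rolling error rate as the metric are all assumptions, and it presumes that each prediction can eventually be marked right or wrong (for example, by delayed ground truth or human review).

```python
from collections import deque


class ErrorRateMonitor:
    """Compare a rolling error rate against an agreed baseline (illustrative sketch)."""

    def __init__(self, baseline_error_rate, tolerance, window_size):
        self.baseline = baseline_error_rate   # error rate accepted as normal
        self.tolerance = tolerance            # extra margin before escalation
        self.window = deque(maxlen=window_size)

    def record(self, prediction_was_wrong):
        """Store the outcome of one prediction; the oldest outcome drops out when full."""
        self.window.append(1 if prediction_was_wrong else 0)

    def current_error_rate(self):
        if not self.window:
            return 0.0
        return sum(self.window) / len(self.window)

    def status(self):
        # Too few observations to compare fairly against the baseline.
        if len(self.window) < self.window.maxlen:
            return "warming_up"
        if self.current_error_rate() > self.baseline + self.tolerance:
            return "escalate_for_human_review"
        return "within_baseline"


# Hypothetical usage: 5% baseline, 5% tolerance, window of 20 predictions.
monitor = ErrorRateMonitor(baseline_error_rate=0.05, tolerance=0.05, window_size=20)
for wrong in [False] * 19 + [True]:
    monitor.record(wrong)
status_before = monitor.status()
for wrong in [True, True, True]:
    monitor.record(wrong)
status_after = monitor.status()
```

Escalating to a person instead of declaring an incident automatically reflects the point above that some judgments remain subjective.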

Attacks on data, such as data poisoning, are harder to detect than the traditional data loss or leakage incidents security teams have been trained to recognize. Injections of malicious data, either directly or indirectly, can occur at many stages in the data pipeline. Some detection techniques for common AI-specific attacks are shown in figure 3.34.

Figure 3.34—Detection Techniques for Common AI-specific Attacks

Attack	Detection Techniques
Prompt injection	Processes to evaluate and sanitize data input into language models to prevent attacks; monitoring data inputs into the artificial intelligence (AI) model to identify patterns atypical of expected inputs (e.g., code-like structure of text, excessive use of special characters, high-volume but subtle changes to data inputs)
Data poisoning	Monitoring access to systems and change logs in the data supply chain; reviewing data preprocessing scripts for unauthorized changes; reviewing dataset versioning and hashes
Adversarial inference	Analyzing application programming interface (API) call logs; scanning for anomalous patterns that could indicate systematic adversarial testing

Source: ISACA, ISACA AAIA Official Review Manual, USA, 2025
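Reviewing dataset hashes, one of the data poisoning detection techniques in figure 3.34, can be illustrated with Python's standard `hashlib` module. The sketch below assumes the dataset is a directory of files and that a trusted manifest was recorded when the dataset was last approved. The function names and the report keys are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large dataset files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir):
    """Map each file's relative path to its SHA-256 digest."""
    root = Path(dataset_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(trusted, current):
    """Report files that changed, disappeared, or appeared since the trusted snapshot."""
    return {
        "modified": sorted(k for k in trusted if k in current and trusted[k] != current[k]),
        "missing": sorted(k for k in trusted if k not in current),
        "unexpected": sorted(k for k in current if k not in trusted),
    }
```

A mismatch only shows that a file differs from the approved version. Deciding whether the change was authorized still requires the access and change logs mentioned in the figure.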

3.24 Assess

Assessment of an AI incident involves collecting facts to establish the timeline, scope, and impact. This phase of the AI incident response should be completed as quickly as possible, without damaging the integrity of the investigation. Documentation about the AI model can aid in the assessment of the AI incident.

Questions to ask include:

Has the incident stopped or is it still happening?
What are the facts about the incident?
- Who was impacted, and what harm or negative consequences occurred?
- When was the incident discovered (important to determine breach notification requirements)?
- What happened (specific details of the incident)?
- What AI systems and data were impacted?
- What attack tactics were used?
What is still unknown, and what needs to be known in order to return to a safe operational state?
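One way to keep track of these questions during an assessment is a simple record that shows which facts are still missing. The Python sketch below is illustrative only: the field names mirror the questions above, and the notification window is passed in as a parameter because the applicable deadline depends on the regulations that apply to the organization.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class IncidentAssessment:
    """Facts gathered during assessment; None or an empty list means 'not yet known'."""
    still_active: Optional[bool] = None
    discovered_at: Optional[datetime] = None
    who_was_impacted: Optional[str] = None
    what_happened: Optional[str] = None
    systems_and_data: list = field(default_factory=list)
    attack_tactics: list = field(default_factory=list)

    def open_questions(self):
        """Names of the facts that have not been established yet."""
        return [name for name, value in vars(self).items() if value is None or value == []]

    def notification_deadline(self, window):
        """Discovery time plus the applicable notification window (a timedelta)."""
        if self.discovered_at is None:
            return None
        return self.discovered_at + window


# Hypothetical example: the incident has stopped and the discovery time is known.
assessment = IncidentAssessment(still_active=False, discovered_at=datetime(2025, 1, 1, 9, 0))
unknowns = assessment.open_questions()
deadline = assessment.notification_deadline(timedelta(hours=72))
```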

3.25 Respond

Conventional strategies to contain, eradicate, and recover are still relevant to an AI incident. However, these strategies can be significantly less effective for AI solutions because of the complexity and probabilistic nature of AI.

3.25.1 Containment

Actions should be taken to keep an AI incident from spreading further. The appropriate actions depend on the circumstances of the incident. Threats to human life and safety necessitate immediate action. Containment strategies to isolate and disable certain AI functionalities may be difficult to deploy because of the complexities involved with AI solutions. Some containment techniques for common AI-specific attacks are shown in figure 3.35.

Figure 3.35—Containment Techniques for AI-specific Attacks

Attack	Containment Technique
Prompt injection	Data input and output validation and prompt templates can be deployed to screen and sanitize abusive prompts. This technique shields the artificial intelligence (AI) model from direct attacks.
Data poisoning	Access to datasets for all systems involved in the data pipeline should be revoked to contain further direct poisoning of the dataset. Access to code or scripts that execute data preprocessing functions should be revoked.
Adversarial inference	Since these attacks leverage validated data input channels (e.g., application programming interfaces [APIs], user interfaces), additional data input throttling, validation, and sanitization can reduce systematic inference attacks.

Source: ISACA, ISACA AAIA Official Review Manual, USA, 2025
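Two of the containment techniques in figure 3.35, input validation and input throttling, can be sketched in Python as follows. The thresholds, reason codes, and class name are assumptions chosen for illustration. A production control would be tuned to the expected inputs of the specific AI solution and would typically combine several more checks.

```python
import time
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 2000         # assumed length limit
MAX_SPECIAL_CHAR_RATIO = 0.30   # assumed threshold for "excessive" special characters


def screen_prompt(prompt):
    """Return the reasons to hold a prompt for review; an empty list means it passes."""
    reasons = []
    if len(prompt) > MAX_PROMPT_CHARS:
        reasons.append("too_long")
    if prompt:
        special = sum(1 for ch in prompt if not (ch.isalnum() or ch.isspace()))
        if special / len(prompt) > MAX_SPECIAL_CHAR_RATIO:
            reasons.append("special_character_ratio")
    return reasons


class SlidingWindowThrottle:
    """Allow at most max_calls per client within any window_seconds interval."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock                  # injectable so the logic can be tested
        self.calls = defaultdict(deque)     # client id -> timestamps of recent calls

    def allow(self, client_id):
        now = self.clock()
        recent = self.calls[client_id]
        # Discard timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True
```

Throttling per client slows the systematic, high-volume querying that inference attacks depend on, while leaving ordinary usage largely unaffected.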

3.25.2 Eradication

In conventional software development, security vulnerabilities are patched to eradicate incidents. Enterprises can lock intruders out of a system by changing compromised credentials. For AI incidents, the eradication step is more challenging. The eradication of the AI threat depends on the attack techniques used. Figure 3.36 describes eradication techniques for common AI-specific attacks.

Figure 3.36—Eradication Techniques for Common AI-specific Attacks

Attack	Eradication Technique
Prompt injection	The underlying artificial intelligence (AI) model needs to be retrained or fine-tuned to resist prompt injections. Retraining for large language models (LLMs) may be impractical and costly. Data input/output validations and prompt templates are still effective long-term controls for LLMs.
Data poisoning	Poisoned data needs to be isolated and removed from the training dataset. Then a new version of the model needs to be trained, which could be costly and time consuming for large datasets. Short-term actions include data output sanitization to mitigate the impact of poisoned data. Long-term actions include selective retraining on new, clean data that can be used to force the AI model to forget the poisoned data.
Adversarial inference	Regularization techniques make the AI model more resilient to inference attacks. Defensive distillation techniques are also effective.

Source: ISACA, ISACA AAIA Official Review Manual, USA, 2025

3.25.3 Recovery

Returning the AI solution to a safe and operational state involves steps to ensure the threat has been successfully contained and eradicated. Some forms of AI-specific attacks cannot be fully eradicated. The optimal resolution would reduce the attack surface to prevent a relapse, actively monitor any residual threats, and continuously improve the AI models to build in robustness and resilience.

Stricter access controls, new data input/output validations, and other guardrails would need to be implemented. A postincident validation of the new control measures should be carried out and approval from all relevant stakeholders obtained prior to reactivating the AI solution.
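The reactivation gate described above can be expressed as a simple checklist. The Python below is a hypothetical sketch: the control and approver names are placeholders, and a real process would also record evidence and timestamps for each validation and approval.

```python
from dataclasses import dataclass, field


@dataclass
class ReactivationChecklist:
    """Tracks postincident control validation and stakeholder approval (illustrative)."""
    required_controls: set
    required_approvers: set
    validated_controls: set = field(default_factory=set)
    approvals: set = field(default_factory=set)

    def outstanding(self):
        """Controls not yet validated and stakeholders who have not yet approved."""
        return {
            "controls": sorted(self.required_controls - self.validated_controls),
            "approvers": sorted(self.required_approvers - self.approvals),
        }

    def ready_to_reactivate(self):
        gaps = self.outstanding()
        return not gaps["controls"] and not gaps["approvers"]


# Placeholder names for demonstration only.
checklist = ReactivationChecklist(
    required_controls={"access_controls", "input_output_validation"},
    required_approvers={"security", "privacy", "business_owner"},
)
checklist.validated_controls.update({"access_controls", "input_output_validation"})
checklist.approvals.update({"security", "privacy"})
remaining = checklist.outstanding()
```

Here `remaining` still lists the business owner, so `ready_to_reactivate()` returns `False` until that approval is recorded.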

3.26 Postincident Review

Postmortems of AI incidents should be conducted to identify areas for improvement in the AI solution, including data preprocessing, security controls, sufficiency of adversarial testing, data input/output controls, fairness of the outputs, and the AI provider’s control environment and processes. Moving from a reactive to a proactive readiness posture can minimize the impact of a future AI incident.