Skip to content
AAIR Review ManualChapter 3 › Part F 29 / 33
On this page

Part F: AI Incident Response, BIA, Business Continuity, and Disaster Recovery

As AI systems become increasingly integrated into critical infrastructures, business operations, and everyday applications, the need for a structured approach to incident response has become more crucial. AI-driven systems are vulnerable to unique threats, including adversarial attacks, bias exploitation, data poisoning, and system failures. Incident response for AI aims to identify, mitigate, and prevent such threats while ensuring system reliability, security, and ethical compliance.

An incident management program focuses on identifying events that deviate from normal planned operations and determining the proper response to contain and prevent disruptions to business operations. The desired goal is to reduce the impact felt by an enterprise and recover and resume operations at acceptable levels. The speed with which an enterprise can identify, analyze, respond to, and recover from an incident reduces its impact and, ultimately, the associated costs. It is important that security professionals understand the challenges that AI presents to incident management due to reliance on AI models, large datasets, and the ability to make real-time decisions. Common AI-related incidents include:

The use of AI in incident response must follow a structured approach, framework, and guidance consistent with traditional cybersecurity incident response plans but tailored to AI-specific threats. The phases of AI incident response are shown in figure 3.33.

Figure 3.33—ISO 27035-1 Process for Handling Security Incidents

A cyclic diagram shows the steps for handling security incidents.

Source: ISO/IEC, ISO/IEC 27035-1:2023 Information technology – Information security incident management: Part 1: Principles and process, Edition 2, 2023, link

3.21 AI Business Impact Analysis

Business impact analysis (BIA) plays a pivotal role in ensuring organizational resilience, particularly as enterprises increasingly depend on AI systems to support critical business processes. In the context of AI, BIA serves as a structured process to identify and prioritize those business functions that rely on AI capabilities, thereby enabling informed continuity planning and recovery strategies.

At its core, BIA establishes continuity requirements by determining the impact of losing support from any resource, including AI systems. This involves identifying critical business processes that depend on AI outputs or operations and assessing the consequences of their disruption over time. By doing so, BIA provides senior management with reliable data to make strategic decisions about resource allocation and risk treatment related to AI dependencies.

A comprehensive BIA in AI-driven environments enables organizations to:

Results of the BIA inform control selection for the AI solutions deployed in an enterprise and provide insight into the risk of using AI solutions for business-critical operations.

3.22 Prepare

Being prepared provides an organization with the best chance for recovery and long-term resiliency. Being prepared includes having runbooks and playbooks. A runbook details the specific, technical steps to complete a task, while a playbook defines the overall strategy and decision framework for a security incident. In other words, runbooks are the step-by-step instructions (“how do I do it”), and playbooks are the general guide that includes the “what to do,” “why,” and “who does it.” They often contain multiple runbooks for different scenarios. Many organizations have an incident response program (IRP), but most have not adapted their programs, policies and procedures, and people to address AI-specific incidents, including:

An AI-ready IRP should address the unique and additive risk and the impact an AI incident could have on the organization and the welfare of humans.

3.22.1 Policies, Procedures, and Model Documentation

An incident response policy should address AI-specific threats, techniques for detection and response, and ethical considerations. Debate and decision making over the use of AI should occur outside of active incident response efforts. Maintaining thorough documentation (e.g., model cards) helps incident investigators better interpret an AI model’s architecture, training data, and decision-making processes. Documentation should include a communication plan and stakeholder escalation protocols to ensure the proper internal and external teams, business functions, and law enforcement entities are informed and called upon for support.

3.22.2 Incident Response Team

The organization should ensure that the AI incident response team includes adequate representation of key stakeholder groups, subject matter experts, and risk management functions. In addition to conventional IT and security incident responders, these stakeholders include:

3.22.3 Tabletop Exercises

The organization should train the AI incident response team on standard operating procedures (SOPs) related to the necessary skills and resources (e.g., audit logs, forensic tools, AI model cards) to handle an incident effectively. The procedures for evaluating a system breach with forensic analysis tools are very different from those used to evaluate data poisoning of an AI model. The latter requires access to or a copy of the data lake in which the data was poisoned. This requires collaboration between the incident response team and data engineering/science teams to analyze the dataset for tampering. Adapting the tabletop exercise toward AI-specific activities like dataset exploration and data input/output analysis can help the response team gain experience and be more effective.

3.23 Identify and Report

One key attribute of AI is that it is rooted in mathematical probability. AI solutions make mistakes and offer wrong predictions some percentage of the time. This is by design, and AI developers are not aiming to achieve perfection, as overfitted models do not generalize well for new data. Therefore, defining an organization’s threshold for categorizing incorrect AI predictions as an incident makes detection of AI threats significantly more challenging.

Because of the probabilistic nature of AI outputs, AI observability is crucial in detecting the manifestation of an AI threat. An organization needs to have clear metrics and establish a baseline for its AI’s performance and normative behavior to determine when a solution has veered off course. A combination of techniques is needed to judge the AI solution’s performance and detect an incident. While some of the detection processes can be automated through AI observability tools, others require a HITL to make judgments on subjective matters.

Attacks on data, such as data poisoning, are harder to detect than the traditional data loss or leakage incidents security teams have been trained to recognize. Injections of malicious data, either directly or indirectly, can occur at many stages in the data pipeline. Some detection techniques for common AI-specific attacks are shown in figure 3.34.

Figure 3.34—Detection Techniques for Common AI-specific Attacks

AttackDetection Techniques
Prompt injection
  • Processes to evaluate and sanitize data input into language models to prevent attacks
  • Monitoring data inputs into the artificial intelligence (AI) model to identify patterns atypical of expected inputs (e.g., code-like structure of text, excessive use of special characters, high-volume but subtle changes to data inputs)
Data poisoning
  • Monitoring access to systems and change logs in the data supply chain
  • Reviewing data preprocessing scripts for unauthorized changes
  • Reviewing dataset versioning and hashes
Adversarial inference
  • Analyzing application programming interface (API) call logs
  • Scanning for anomalous patterns that could indicate systematic adversarial testing

Source: ISACA, ISACA AAIA Official Review Manual, USA, 2025

3.24 Assess

Assessment of an AI incident involves collecting facts to establish the timeline, scope, and impact. This phase of the AI incident response should be completed as quickly as possible, without damaging the integrity of the investigation. Documentation about the AI model can aid in the assessment of the AI incident.

Questions to ask include:

3.25 Respond

Conventional strategies to contain, eradicate, and recover are still relevant to an AI incident. However, these strategies can be significantly less effective for AI solutions because of the complexity and probabilistic nature of AI.

3.25.1 Containment

Actions should be taken to contain an AI incident from spreading further. The appropriate actions depend on the circumstances of the incident. Threats to human life and safety necessitate immediate action. Containment strategies to isolate and disable certain AI functionalities may be difficult to deploy because of the complexities involved with AI solutions. Some containment techniques for common AI-specific attacks are shown in figure 3.35.

Figure 3.35—Containment Techniques for AI-specific Attacks

AttackContainment Technique
Prompt injectionData input and output validation and prompt templates can be deployed to screen and sanitize abuse prompts. This technique shields the artificial intelligence (AI) model from direct attacks.
Data poisoningAccess to datasets for all systems involved in the data pipeline should be revoked to contain further direct poisoning of the dataset. Access to code or scripts that execute data preprocessing functions should be revoked.
Adversarial inferenceSince these attacks leverage validated data input channels (e.g., application programming interfaces [APIs], user interfaces), additional data input throttling, validation, and sanitization can reduce systematic inference attacks.

Source: ISACA, ISACA AAIA Official Review Manual, USA, 2025

3.25.2 Eradication

In conventional software development, security vulnerabilities are patched to eradicate incidents. Enterprises can lock intruders out of a system by changing their credentials. For AI incidents, the eradication step is more challenging. The eradication of the AI threat depends on the attack techniques used. Figure 3.36 describes eradication techniques for common AI-specific attacks.

Figure 3.36—Eradication Techniques for Common AI-specific Attacks

AttackEradication Technique
Prompt injectionThe underlying artificial intelligence (AI) model needs to be retrained or fine-tuned to resist prompt injections. Retraining for large language models (LLMs) may be impractical and costly. Data input/output validations and prompt templates are still effective long-term controls for LLMs.
Data poisoningPoisoned data needs to be isolated and removed from the training dataset. Then a new version of the model needs to be retrained, which could be costly and time consuming for large datasets. Short-term actions include data output sanitization to mitigate the impact of poisoned data. Long-term actions include selective retraining on new, clean data that can be used to force the AI model to forget the poisoned data.
Adversarial inferenceRegularization techniques make the AI model more resilient to inference attacks. Defensive distillation techniques are also effective.

Source: ISACA, ISACA AAIA Official Review Manual, USA, 2025

3.25.3 Recovery

Returning the AI solution to a safe and operational state involves steps to ensure the threat has been successfully contained and eradicated. Some forms of AI-specific attacks cannot be fully eradicated. The optimal resolution would reduce the attack surface to prevent a relapse, actively monitor any residual threats, and continuously improve the AI models to build in robustness and resilience.

Stricter access controls, new data input/output validations, and other guardrails would need to be implemented. A postincident validation of the new control measures should be carried out and approval from all relevant stakeholders obtained prior to reactivating the AI solution.

3.26 Postincident Review

Postmortems of AI incidents should be conducted to identify areas of improvements for the AI solution, including data preprocessing, security controls, sufficiency of adversarial testing, data input/output controls, fairness of the outputs, and the AI provider’s control environment and processes. Moving from a reactive to a proactive readiness posture can minimize the impact of a future AI incident.