27 controls for AI security
This is our reference framework. We test for these controls during engagements and help you implement what's missing.
- Security controls for LLM and RAG systems.
- Data protection and access control.
- Output safety and policy enforcement.
- Operational controls (logging, monitoring, incident response).
Note: Bias testing and explainability are out of scope—we focus on security.
Full control list
Each control below lists what we test for and what acceptable evidence looks like.
CC-1.1 Access Control and MFA
Requirement: Enforce MFA for cloud consoles and code repos. Restrict access to model weights, prompt repositories, and fine-tuning data to approved ML engineers.
Evidence: Access review logs; IdP configuration; model repo permission screenshots.
CC-1.2 Encryption at Rest and In Transit
Requirement: Use AES-256 for databases and TLS 1.2+ for all traffic. Encrypt vector database volumes and embeddings in transit.
Evidence: KMS policies; storage encryption settings; TLS test report.
CC-1.3 Input Sanitization and Injection Defense
Requirement: All user inputs pass through a guardrail layer that detects and blocks prompt injection patterns before model execution.
Evidence: Middleware code; prompt-injection pen test results; blocked request logs.
CC-1.4 Output Security
Requirement: Scan model outputs for malicious payloads or unsafe links before rendering to users.
Evidence: Output filtering configuration; logs of blocked outputs.
CC-1.5 Incident Response for AI Failures
Requirement: Incident response includes AI-specific scenarios and a kill switch to disable tool use or external API access.
Evidence: IR playbooks; tabletop exercise report simulating runaway model behavior.
CC-1.6 Logging and Anomaly Detection
Requirement: Log prompts, tool calls, and retrieval events with tamper-resistant storage. Alert on abnormal prompt patterns or elevated refusal rates.
Evidence: Logging architecture diagram; SIEM alert rules; sample alert tickets.
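Tamper-resistant storage can be approximated with a hash chain: each log entry commits to the digest of the previous one, so editing any earlier entry breaks verification. This is a sketch; real deployments also ship the digests to write-once or append-only storage.

```python
import hashlib
import json
import time

class HashChainedLog:
    """Append-only log where each entry commits to its predecessor,
    making after-the-fact edits detectable."""

    def __init__(self):
        self.entries: list[tuple[dict, str]] = []
        self._prev = "0" * 64  # genesis digest

    def append(self, event: dict) -> str:
        record = {"ts": time.time(), "event": event, "prev": self._prev}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append((record, digest))
        self._prev = digest
        return digest

    def verify(self) -> bool:
        """Re-walk the chain; any edited record or broken link fails."""
        prev = "0" * 64
        for record, digest in self.entries:
            if record["prev"] != prev:
                return False
            recomputed = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode()
            ).hexdigest()
            if recomputed != digest:
                return False
            prev = digest
        return True
```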
CC-1.7 Tooling Permissions and Least Privilege
Requirement: AI agents only receive scoped tool access, with explicit allowlists for external APIs and file access.
Evidence: Tool permission manifests; access approval records.
C-1.1 Data Classification and Embedding Labels
Requirement: Tag data as Public, Confidential, or Restricted. Embeddings inherit classification and are filtered by user role.
Evidence: Data flow diagrams; vector DB metadata schema; access policy rules.
C-1.2 Retrieval Access Control (RAG)
Requirement: Retrieval enforces user permissions before a chunk enters the context window.
Evidence: Retrieval policy tests; screenshots of denied queries.
C-1.3 Zero-Retention Agreements
Requirement: Contracts with LLM providers include explicit no-training or zero-retention terms.
Evidence: Vendor agreements; API settings showing retention disabled.
C-1.4 Context Flushing
Requirement: Enforce isolation between user sessions and cryptographically verify context window reset between sessions.
Evidence: Session management code review; penetration test attempting cross-session leakage.
C-1.5 PII Scrubbing Before Embedding
Requirement: Detect and mask PII before embedding to prevent irreversible exposure in vector stores.
Evidence: Pipeline configuration; scrubber logs and sample redaction reports.
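A scrubbing step of this kind sits in the pipeline just before the embedding call. The patterns below (email, US SSN, a simple phone format) are examples only; production pipelines pair regexes with NER-based detectors such as Microsoft Presidio.

```python
import re

# Example detectors only -- not an exhaustive PII ruleset.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scrub_pii(text: str) -> tuple[str, list[str]]:
    """Mask PII before embedding; return (scrubbed_text, detected_types).
    The detected_types list feeds the scrubber logs / redaction reports."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[{label}]", text)
    return text, found
```

Scrubbing must happen before embedding because redaction after the fact is impossible: the vector store retains the information irreversibly.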
C-1.6 Secrets and Credential Redaction
Requirement: Remove secrets or API keys from prompts and outputs using pattern detection and secret scanning.
Evidence: Secret scanning configuration; blocked secret leakage logs.
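The same pattern-detection approach applies to secrets in prompts and outputs. The two signatures below (an AWS-style access key ID and a generic `api_key=` assignment) are illustrative; dedicated scanners such as gitleaks or TruffleHog ship hundreds of provider-specific rules plus entropy checks.

```python
import re

# Example signatures only -- real scanners maintain far larger rulesets.
SECRET_PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GENERIC_API_KEY": re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"]?[\w-]{16,}"),
}

def redact_secrets(text: str) -> str:
    """Replace detected secrets with labeled placeholders before the text
    reaches the model or the user."""
    for label, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```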
A-1.1 Disaster Recovery and Model Rollback
Requirement: Maintain backups for model registries and prompt configs with tested rollback within minutes.
Evidence: Rollback test logs; model registry screenshots.
A-1.2 Token-Based Rate Limiting
Requirement: Apply token-level limits to prevent model DoS and runaway compute costs.
Evidence: API gateway configuration; rate limit logs.
A-1.3 Inference Latency Monitoring
Requirement: Track time-to-first-token and end-to-end latency with SLA thresholds.
Evidence: Monitoring dashboards; alert configurations.
A-1.4 Capacity Planning and Cost Guardrails
Requirement: Forecast GPU usage and enforce budget alerts to prevent overload or outage during spikes.
Evidence: Capacity plan; billing alerts; autoscaling policies.
A-1.5 Provider Dependency and Failover
Requirement: Document third-party model dependencies and establish a failover plan or graceful degradation path.
Evidence: Dependency map; failover runbook; outage simulation report.
PI-1.1 Evaluation and Regression Testing
Requirement: Run a golden set evaluation on every model or prompt change; maintain a defined quality threshold.
Evidence: CI/CD evaluation logs; versioned evaluation dataset.
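A golden-set gate can be wired into CI/CD as a simple pass-rate check over a versioned eval dataset. The case format below (prompt plus a substring the answer must contain) is a hypothetical minimal scheme; real eval suites use richer scoring such as LLM-as-judge or semantic similarity.

```python
# Hypothetical golden set; in practice this lives in a versioned dataset file.
GOLDEN_SET = [
    {"prompt": "What is our refund window?", "must_contain": "30 days"},
    {"prompt": "Which plan includes SSO?", "must_contain": "Enterprise"},
]

def run_eval(model_fn, threshold: float = 0.9) -> tuple[float, bool]:
    """Score a model callable against the golden set.
    Returns (pass_rate, gate_passed); the gate blocks promotion on failure."""
    passed = sum(
        1
        for case in GOLDEN_SET
        if case["must_contain"].lower() in model_fn(case["prompt"]).lower()
    )
    score = passed / len(GOLDEN_SET)
    return score, score >= threshold
```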
PI-1.2 Grounding Enforcement
Requirement: For RAG, instruct the model to ground answers in retrieved context, and use low temperature settings for factual tasks.
Evidence: System prompt configuration; grounding metrics.
PI-1.3 Uncertainty Refusal
Requirement: If retrieval confidence is low, the system refuses or escalates instead of fabricating an answer.
Evidence: Refusal logs; confidence threshold logic.
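The confidence-threshold logic is typically a gate between retrieval and generation: if too few chunks clear a similarity threshold, the system refuses or escalates instead of generating. The thresholds below are illustrative, not recommended values.

```python
def should_answer(
    retrieved: list[tuple[str, float]],  # (chunk_text, similarity_score)
    min_score: float = 0.75,  # illustrative threshold, tune per corpus
    min_hits: int = 2,        # illustrative minimum of confident chunks
) -> bool:
    """Gate generation on retrieval confidence: answer only when enough
    chunks score above the threshold; otherwise refuse or escalate."""
    confident = [chunk for chunk, score in retrieved if score >= min_score]
    return len(confident) >= min_hits
```

Each refusal should be logged with the scores that triggered it; those logs are the evidence for this control.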
PI-1.4 Deterministic Handoffs
Requirement: Route calculations, dates, and compliance logic to deterministic code paths rather than the model.
Evidence: Architecture diagram; function-calling or tool-use code.
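With function calling, the model emits a tool name and arguments, and a dispatcher routes the call to deterministic handlers so arithmetic and date logic never depend on token prediction. The tool names and call format below are hypothetical, to show the routing shape.

```python
from datetime import date
from decimal import Decimal

# Hypothetical deterministic handlers the model can invoke but not compute itself.
def days_until(target: str) -> int:
    """Exact calendar arithmetic instead of model guessing."""
    return (date.fromisoformat(target) - date.today()).days

def invoice_total(line_items: list[dict]) -> Decimal:
    """Exact decimal arithmetic for money -- never left to the model."""
    return sum(
        (Decimal(str(item["unit_price"])) * item["qty"] for item in line_items),
        Decimal("0"),
    )

TOOLS = {"days_until": days_until, "invoice_total": invoice_total}

def dispatch(tool_call: dict):
    """Route a model-produced tool call to its deterministic code path."""
    return TOOLS[tool_call["name"]](**tool_call["arguments"])
```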
PI-1.5 Post-Deployment Sampling
Requirement: Sample production outputs and measure hallucination rates with defined remediation thresholds.
Evidence: Sampling plan; review logs; remediation tickets.
CM-1.1 Versioning of Prompts and Models
Requirement: Treat prompts and model configs as production code with version control and change approvals.
Evidence: Git history for prompt/config files; change request tickets.
CM-1.2 Human Review for Model Changes
Requirement: A domain expert reviews a sample of outputs before promotion to production.
Evidence: Sign-off records; review checklist.
CM-1.3 Staged Rollouts and Rollbacks
Requirement: Use canary or staged rollouts with defined rollback triggers for prompt or model updates.
Evidence: Release pipeline logs; rollback criteria documentation.
CM-1.4 Change Impact Assessments
Requirement: Document the expected impact of changes on safety, reliability, and customer-facing behavior.
Evidence: Change impact templates; review meeting notes.
Where this fits
Our framework draws from established AI security standards:
- OWASP LLM Top 10: Attack categories and mitigations.
- MITRE ATLAS: Adversary tactics for ML systems.
- NIST AI RMF: Risk management approach.
- ISO 42001: AI management system requirements.
Not a checklist exercise
We don't just verify that you have policies. We test whether your controls actually work by attacking your system.
A control that exists on paper but fails under attack gets flagged as a finding.
See how you measure up
We'll assess your system against this framework and show you exactly where you stand.