RAIC · Projects and Data

Responsible AI Collaborative · Non-profit

The convening body for AI safety data

The Responsible AI Collaborative (TheCollab) is a non-profit organization bringing together multiple contributors to produce a safer world with AI.

TheCollab develops the AI Incident Database (AIID), the leading catalog of real-world AI harm events (i.e., “incidents”). TheCollab’s goal is to support collaborative projects around the world that advance the breadth or depth of real-world AI safety data coverage.

Safety is a team sport — and we all play on the same team.

Operating AI Safety Databases

7 entries
AI Incident Database (AIID) · OECD AI Incidents Monitor (AIM) · AI Risk Repository · Academ-AI · AI Law Trackers Hub · AI Hallucination Cases · Database of AI Litigation (DAIL)

The systems and data behind indexing real-world AI incidents.

AI Incident Database (AIID)

Responsible AI Collaborative Led Project · Perpetual

The leading catalog of real world AI harm events, indexing harms and near-harms realized by deployed AI systems.

Author
Responsible AI Collaborative
Intended user
Researchers, developers, policymakers, and the public
Purpose
Index the collective history of AI harms so the community can learn from past failures and prevent their recurrence.

The AI Incident Database (AIID) is the leading catalog of AI harm events — "incidents" — realized in the real world. It indexes the collective history of harms and near-harms caused by the deployment of AI systems, much as the aviation and cybersecurity communities learn from their own incident records.

What it holds

  • Incidents — discrete real-world harm events, each grouping one or more source reports.
  • Reports — the news articles and documents that evidence an incident, carrying source, author, and publication details.
  • Taxonomies — structured classifications (such as CSETv1, GMF, and the MIT AI Risk Repository) applied to incidents during review.
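
The three record types above can be sketched as a small data model. This is a minimal illustration only: the class and field names are assumptions made for this example and are not the AIID's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """One source document evidencing an incident (field names are assumptions)."""
    title: str
    source: str
    author: str
    date_published: str  # ISO date string, e.g. "2021-03-01"


@dataclass
class Classification:
    """A taxonomy label applied to an incident during review."""
    taxonomy: str  # e.g. "CSETv1" or "GMF"
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class Incident:
    """A discrete harm event that groups one or more reports."""
    incident_id: int
    title: str
    reports: list[Report] = field(default_factory=list)
    classifications: list[Classification] = field(default_factory=list)

    def taxonomies_applied(self) -> set[str]:
        """Names of the taxonomies that have labeled this incident."""
        return {c.taxonomy for c in self.classifications}


# Hypothetical example values, not real database entries.
incident = Incident(
    incident_id=1,
    title="Example incident",
    reports=[Report("Example article", "example.com", "A. Writer", "2021-03-01")],
    classifications=[Classification("GMF", {"goal": "face recognition"})],
)
print(incident.taxonomies_applied())
```

Keeping reports nested under an incident mirrors the "one incident, many reports" grouping described in the list.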

The AIID is the primary focus of the Responsible AI Collaborative, which convenes the contributors who maintain it.

Explore it at incidentdatabase.ai.

OECD AI Incidents Monitor (AIM)

Responsible AI Collaborators · Perpetual

An automated monitor of AI incidents and hazards drawn from public sources, supporting evidence-based policy.

Author
OECD AI Policy Observatory
Intended user
Policymakers, AI practitioners, and stakeholders worldwide
Purpose
Reveal risk patterns and build a collective understanding of AI incidents and hazards.

The OECD AI Incidents Monitor (AIM) is an automated monitor of AI incidents and hazards drawn from public sources, built by the OECD AI Policy Observatory to support evidence-based policymaking. It helps policymakers, practitioners, and stakeholders worldwide gain insight into the risks and harms of AI systems.

What it captures

  • AI incidents — realized harms caused by an AI system's malfunction or misuse.
  • AI hazards — credible future risks where no harm has occurred yet.
  • Rich metadata, including affected countries, industries, harm types, stakeholder groups, AI principles violated, and system characteristics.

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

Explore the monitor at oecd.ai/en/incidents.

AI Risk Repository

Responsible AI Collaborators · Perpetual

A living database of more than 1,700 AI risks extracted from 65 frameworks, with taxonomies and governance mappings.

Author
MIT AI Risk Initiative
Intended user
Governments, companies, researchers, and policymakers
Purpose
Provide authoritative data and frameworks to identify, prioritize, and manage risks from AI.

The AI Risk Repository, from the MIT AI Risk Initiative, is a living database of more than 1,700 AI risks extracted from 65 frameworks. It offers authoritative data and frameworks to help governments, companies, and researchers identify, prioritize, and manage risks from AI.

What it contains

  • A continually updated database of classified risks.
  • Causal and domain taxonomies for organizing them.
  • Real-world incident data, a mapping of AI governance documents to risks, and an expert survey ranking risks by severity and responsibility.

The data is freely available under CC BY 4.0 and is used by organizations including the United Nations, the EU AI Office, and the UK AI Safety Institute.

Explore it at airisk.mit.edu.

Academ-AI

We are fans · Perpetual

A database documenting suspected undeclared use of generative AI in the academic literature.

Author
Alex Glynn
Intended user
Researchers, editors, publishers, and research-integrity specialists
Purpose
Surface and document cases of undisclosed generative AI use in scholarly publications.

Academ-AI, created by Alex Glynn, documents suspected undeclared use of generative AI in the academic literature — cases discernible largely from the idiosyncratic phrasing characteristic of LLM chatbots that slips into published papers.

What it shows

The collection spans journals, conference proceedings, and textbooks from highly respected publishers, and finds that undeclared AI appears even in outlets with higher citation metrics and article processing charges. Coverage is limited to English-language literature indexed in accessible online databases, principally Google Scholar.

Explore it at academ-ai.info, alongside the accompanying analysis.

AI Law Trackers Hub

We are fans · Perpetual

A hub of nine maintained trackers cataloging AI across the justice system, from legal cases to judicial guidance and policy.

Author
Matthew Lee
Intended user
Legal professionals, academics, and policymakers
Purpose
Aggregate verified legal cases, judicial guidance, and policy developments on AI in the justice system.

The AI Law Trackers Hub, maintained by barrister Matthew Lee, is a centralized collection of nine trackers monitoring the intersection of AI and the justice system. It aggregates verified legal cases, judicial guidance, and policy developments concerning large language models and automated decision-making.

What it tracks

  • AI hallucination cases (global and UK-specific)
  • Judicial use of AI
  • AI equality, bias, and discrimination
  • Government AI hallucinations
  • Whether AI will replace judges and lawyers
  • A deepfake case-law database
  • AI privilege and confidentiality

The hub welcomes community submissions of new cases and developments, and is an informational resource rather than legal advice.

Explore it at naturalandartificiallaw.com.

AI Hallucination Cases

We are fans · Perpetual

A curated database of legal cases where generative AI produced hallucinated content submitted in court filings.

Author
Damien Charlotin
Intended user
Lawyers, judges, and researchers tracking AI misuse in litigation
Purpose
Document court decisions worldwide where AI-generated hallucinations made it into filings.

The AI Hallucination Cases database, maintained by Damien Charlotin, is among the most comprehensive catalogs of AI hallucination cases in law — legal decisions from courts worldwide, searchable by country, party, AI tool, and outcome, and updated daily.

What it records

Each entry documents a case in which generative AI produced fabricated material that reached a court, such as:

  • Fabricated citations — cited cases that do not exist.
  • Hallucinated quotations — quotations absent from the purported sources.
  • Mischaracterized holdings — misstatements of what cited authority says.

Entries summarize the court's findings and any sanctions imposed, with links to the underlying orders.

Browse the full, filterable database on damiencharlotin.com.

Database of AI Litigation (DAIL)

A catalog of ongoing and completed litigation involving artificial intelligence and related topics.

Author
Ethical Tech Initiative, George Washington University
Intended user
Legal scholars, practitioners, and policy researchers
Purpose
Track AI-related legal cases from the initial complaint onward, whether or not they reach a published decision.

The Database of AI Litigation (DAIL) presents information about ongoing and completed litigation involving artificial intelligence and related topics. It is maintained by the Ethical Tech Initiative at George Washington University — an interdisciplinary collaboration where law and ethics meet computer science, engineering, media, public affairs, and social entrepreneurship.

What it tracks

The database follows cases from the initial complaint forward, regardless of whether they result in a published decision, aiming for comprehensive coverage across areas such as:

  • Algorithmic decision-making in hiring, credit, and criminal sentencing.
  • Generative AI training disputes.
  • AI companion and product-liability claims.

See the full, searchable case list on the DAIL site.

Applied Taxonomies

4 entries
MIT AI Risk Repository Taxonomy · Goals, Methods, and Failures (GMF) · CSET AI Harm Taxonomy (CSETv1) · Towards a Common Reporting Framework for AI Incidents

Shared classification schemes applied to one or more of the databases above. Every database has its own native schema, which we do not describe here. This section covers schemas intended for broad application or standardization.

MIT AI Risk Repository Taxonomy

Responsible AI Collaborators · Perpetual

A high- and mid-level taxonomy of AI risks drawn from the MIT AI Risk Repository meta-review.

Author
MIT AI Risk Repository (AIID classifications by Simon Mylius)
Intended user
Researchers, practitioners, and stakeholders tracking AI risks
Purpose
Systematically categorize AI risks by cause and by domain across the AI lifecycle.

The MIT AI Risk Repository taxonomy organizes AI-related risks extracted from a broad meta-review of the literature. It was developed by MIT and adapted from the paper The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence. Classifications for the AI Incident Database are generated automatically through work by Simon Mylius.

Causal taxonomy

How and when a risk arises:

  • Entity — whether the risk was caused by an AI, a human, or another actor.
  • Intent — whether the outcome was intentional, unintentional, or other.
  • Timing — whether the risk occurs pre-deployment, post-deployment, or other.
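
The three causal dimensions above map naturally onto enumerations. The following is a minimal sketch assuming each dimension takes exactly the three values listed; the class names and the `CausalLabel` type are illustrative, not part of the MIT taxonomy itself.

```python
from dataclasses import dataclass
from enum import Enum


class Entity(Enum):
    AI = "AI"
    HUMAN = "Human"
    OTHER = "Other"


class Intent(Enum):
    INTENTIONAL = "Intentional"
    UNINTENTIONAL = "Unintentional"
    OTHER = "Other"


class Timing(Enum):
    PRE_DEPLOYMENT = "Pre-deployment"
    POST_DEPLOYMENT = "Post-deployment"
    OTHER = "Other"


@dataclass(frozen=True)
class CausalLabel:
    """One causal classification: who caused the risk, whether it was intended, and when it arises."""
    entity: Entity
    intent: Intent
    timing: Timing


# Hypothetical label for an unintended failure of a deployed AI system.
label = CausalLabel(Entity.AI, Intent.UNINTENTIONAL, Timing.POST_DEPLOYMENT)
print(label)
```

Under these assumptions a risk falls into one of 3 × 3 × 3 = 27 causal combinations, which makes the labels easy to tabulate.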

Domain taxonomy

Risks are sorted into seven domains, each with its own subdomains:

  1. Discrimination & toxicity
  2. Privacy & security
  3. Misinformation
  4. Malicious actors & misuse
  5. Human–computer interaction
  6. Socioeconomic & environmental harms
  7. AI system safety, failures & limitations

See the full domain and subdomain definitions on the MIT taxonomy page.

Goals, Methods, and Failures (GMF)

Responsible AI Collaborative Led Project · Complete

A failure cause analysis taxonomy linking an AI system's goals, the methods it uses, and the technical failures behind real-world harms.

Author
AI Incident Database, Responsible AI Collaborative (maintained by Nikiforos Pittaras)
Intended user
AI developers and deployers, auditors, researchers, and community annotators
Purpose
Trace real-world AI harms to the system's intended goals, its implementation methods, and the technical failures that caused them.

The Goals, Methods, and Failures (GMF) taxonomy is a failure cause analysis taxonomy for AI systems in the real world. Maintained by the AI Incident Database — a project of the Responsible AI Collaborative — it interconnects what a system was trying to do, how it was built, and how it failed.

Three interrelated ontologies

  • AI system goals — the deployment objective (e.g. face recognition, autonomous driving).
  • AI methods and technologies — the implementation approach (e.g. transformers, convolutional neural networks).
  • AI technical failures — the failure factors involved (e.g. generalization failure, distributional bias).

Each classification carries a confidence modifier ("known" or "potential"), supporting text snippets, and annotator commentary, so labels stay grounded in the underlying incident reports.
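
A single GMF annotation, as described above, could be represented roughly as follows. This is a sketch under stated assumptions: the type names, field names, and the `known_only` helper are hypothetical and do not reflect the AIID's stored format.

```python
from dataclasses import dataclass, field
from enum import Enum


class Ontology(Enum):
    GOAL = "AI system goal"
    METHOD = "AI method or technology"
    FAILURE = "AI technical failure"


class Confidence(Enum):
    KNOWN = "known"
    POTENTIAL = "potential"


@dataclass
class GMFAnnotation:
    ontology: Ontology
    label: str
    confidence: Confidence
    snippets: list[str] = field(default_factory=list)  # supporting text from reports
    commentary: str = ""  # free-text annotator notes

    def is_grounded(self) -> bool:
        """True when at least one supporting snippet backs the label."""
        return len(self.snippets) > 0


def known_only(annotations: list[GMFAnnotation]) -> list[GMFAnnotation]:
    """Keep only annotations marked "known" rather than "potential"."""
    return [a for a in annotations if a.confidence is Confidence.KNOWN]


# Hypothetical annotations; the snippet text is invented for illustration.
annotations = [
    GMFAnnotation(Ontology.GOAL, "face recognition", Confidence.KNOWN,
                  ["the system matched faces against a watchlist"]),
    GMFAnnotation(Ontology.FAILURE, "generalization failure", Confidence.POTENTIAL),
]
print([a.label for a in known_only(annotations)])
```

Separating "known" from "potential" labels lets an analysis restrict itself to claims that the incident reports directly support.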

What it is used for

  1. Linking harms to system goals — surfacing the technical failure causes related to the system's intended task.
  2. Connecting methods to failures — showing how architectures and techniques contribute to harm.
  3. Leveraging expertise — applying interdisciplinary knowledge to annotate public incident reports.
  4. Grounded labeling — anchoring annotations to real-world data for verifiability and research.

See the full goal, method, and failure ontologies on the GMF taxonomy page.

CSET AI Harm Taxonomy (CSETv1)

Responsible AI Collaborators · Complete

The second edition of CSET's incident taxonomy, characterizing the harms, entities, and technologies involved in AI incidents.

Author
Center for Security and Emerging Technology (CSET), Georgetown University
Intended user
Researchers, incident analysts, and AIID contributors
Purpose
Provide a standardized framework for identifying and classifying real-world AI harms.

The CSET AI Harm Taxonomy for AIID is the second edition of the incident taxonomy created by Georgetown University's Center for Security and Emerging Technology (CSET) in collaboration with the AI Incident Database. It characterizes the harms, entities, and technologies involved in AI incidents, along with the circumstances surrounding them.

What counts as AI harm

The taxonomy defines AI harm as requiring four essential elements — all must be present for an event to be classified as AI harm:

  1. an entity that experienced
  2. a harm event or harm issue that
  3. can be directly linked to a consequence of the behavior of
  4. an AI system.
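
The all-elements-required rule above amounts to a logical conjunction. A minimal sketch follows, assuming each element has already been judged true or false by a reviewer; the class and field names are hypothetical and are not CSET's own field names.

```python
from dataclasses import dataclass


@dataclass
class HarmAssessment:
    """Reviewer judgments for the four elements (names are assumptions)."""
    has_affected_entity: bool
    has_harm_event_or_issue: bool
    directly_linked_to_ai_behavior: bool
    involves_ai_system: bool

    def is_ai_harm(self) -> bool:
        # Every element must be present; a single missing one disqualifies the event.
        return all((
            self.has_affected_entity,
            self.has_harm_event_or_issue,
            self.directly_linked_to_ai_behavior,
            self.involves_ai_system,
        ))


# A harm occurred, but no direct link to the AI system's behavior was established.
unlinked = HarmAssessment(True, True, False, True)
print(unlinked.is_ai_harm())
```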

What it captures

  • AI system — whether the incident involves a CSET-defined AI system.
  • Differential treatment — protected characteristics affected (race, sex, disability, religion, and others).
  • Sector of deployment — the industry context (law enforcement, healthcare, transportation, and so on).
  • Autonomy level — from fully autonomous to human-in-the-loop operation.
  • Physical domain — whether physical objects or harms are involved.
  • Testing context — developer versus user testing, controlled versus operational conditions.
  • Deployment status — whether the system reached end users.

See the full schema and field definitions on the CSETv1 taxonomy page.

Towards a Common Reporting Framework for AI Incidents

An OECD framework of common dimensions and criteria for reporting AI incidents consistently across contexts.

Author
OECD
Intended user
Policymakers, regulators, and AI incident database operators
Purpose
Enable consistent, interoperable reporting of AI incidents across jurisdictions and databases.

Towards a Common Reporting Framework for AI Incidents is an OECD Artificial Intelligence Paper proposing a shared framework for describing AI incidents consistently across organizations and jurisdictions. By aligning how incidents are recorded, it aims to make incident data comparable and interoperable across initiatives such as the AI Incident Database and the OECD AI Incidents Monitor.

Dimensions

The framework organizes roughly 29 criteria into dimensions including:

  • Metadata (9 criteria)
  • Harm details (4 criteria)
  • People and planet (3 criteria)
  • Economic context (4 criteria)
  • Data and input (1 criterion)
  • AI model (3 criteria)
  • Other information (3 criteria)

Together these help policymakers understand incidents across diverse contexts, identify high-risk systems, and assess the impact of AI on people and the planet.

Read it at OECD.

Responsible AI Collaborative Research

9 entries
A Pragmatic Classification Framework for AI Incident Monitoring · AI Incident Monitoring through a Public Health Lens · AI Risk, Safety, and Incident Reporting · Lessons for Editors of AI Incidents from the AI Incident Database · Indexing AI Risks with Incidents, Issues, and Variants · Preventing Repeated Real World AI Failures by Cataloging Incidents: The AI Incident Database · Prioritization of Risks from Artificial Intelligence: A Delphi Study of 272 International Experts · Position: Mind the Gap — Closing the Growing Disconnect Between Vulnerability Disclosure and AI Security · In-House Evaluation Is Not Enough: Toward Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI

Studies, prototypes, and findings emerging from the collaborative. More than 600 academic, government, and journalistic citations use or point to the AIID.

A Pragmatic Classification Framework for AI Incident Monitoring

Responsible AI Collaborative Led Project · Complete

A structured methodology for monitoring AI incidents that separates harm trends from exposure and reporting effects.

Author
Isaak Mengesha, Branwen Owen, Charlie Collins, Tina Wong, Simon Mylius, Peter Slattery, and Sean McGregor
Intended user
AI governance practitioners, policymakers, and safety researchers
Purpose
Turn raw incident counts into actionable governance insight by correcting for reporting and deployment confounders.

A Pragmatic Classification Framework for AI Incident Monitoring addresses a core problem in AI governance: raw incident counts conflate media reporting propensity, system deployment, and actual harm frequency. The framework offers a practical, evidence-calibrated way to read incident data despite those confounders.

Three components

  • A structured monitoring question that defines the analytical scope.
  • A tiered estimation process that uses LLM-assisted filtering to analyze harm and exposure trends separately.
  • A classification scheme that labels findings as Escalating, Mitigating, Concentrating, Receding, or Unclassifiable.

The authors demonstrate the approach through real-world case studies, working within the data limitations typical of incident monitoring. Read the full paper on arXiv.

AI Incident Monitoring through a Public Health Lens

Responsible AI Collaborative Led Project · Complete

A public-health-inspired framework that identifies six phases of AI incident emergence to make AI risk measurable.

Author
Sophia Abraham, Taiye Chen, Cyril Chhun, Giovanna Jaramillo-Gutierrez, Simon Mylius, Sayash Raaj, Peter Slattery, and Sean McGregor
Intended user
Policymakers, companies, researchers, and AI governance practitioners
Purpose
Move beyond cataloging incidents toward measuring AI risk by determining the phase of incident emergence.

AI Incident Monitoring through a Public Health Lens adapts public-health surveillance methods — which presume noisy, incomplete data — to the problem of measuring AI risk. Incident databases index what has happened, but they cannot express risk (a joint measure of likelihood and severity) without knowing how prevalent risk-associated systems are and how often incidents are reported.

The framework

The paper identifies six phases of incident emergence and shows how an informed panel of domain experts can combine that framework with incident data and statistical and visualization tools to arrive at phase determinations that serve public needs. It is demonstrated through:

  • Autonomous vehicles, whose mandatory reporting requirements produce reliable incident-rate ground truth (expressed in distance traveled).
  • Deepfakes, as a second case study, before charting a path for future incident-phase research.
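
The point about exposure can be illustrated with a simple rate calculation: dividing incident counts by distance traveled makes fleets of different sizes comparable. The function below is an illustrative sketch, not the paper's method, and the fleet figures are invented for the example.

```python
def incidents_per_million_miles(incident_count: int, miles_traveled: float) -> float:
    """Normalize a raw incident count by exposure (distance traveled)."""
    if miles_traveled <= 0:
        raise ValueError("miles_traveled must be positive")
    return incident_count * 1_000_000 / miles_traveled


# Hypothetical fleets: B reports more incidents but has far more exposure.
fleet_a = incidents_per_million_miles(12, 4_000_000)
fleet_b = incidents_per_million_miles(30, 20_000_000)
print(fleet_a, fleet_b)
```

In this made-up comparison the fleet with the higher raw count has the lower rate, which is why raw counts alone cannot express risk.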

For a short overview, see the abstract and executive summary, "Phase Modeling for AI Incident Emergence: Adapting Epidemiological Methods to Post-Deployment Governance." The full paper is on arXiv.

AI Risk, Safety, and Incident Reporting

Responsible AI Collaborative Led Project · Complete

A reference-work chapter surveying AI risk, safety, and the practice of incident reporting.

Author
Kevin Paeth, Sean McGregor, and colleagues
Intended user
Researchers, students, and practitioners new to AI incident reporting
Purpose
Provide a foundational overview of AI risk, safety, and incident reporting in a single reference chapter.

This Springer reference-work chapter offers a broad, citable overview of how AI risk and safety connect to the practice of incident reporting — why real-world harm events are worth cataloging, how incident databases are built and maintained, and how reporting fits into the wider landscape of AI governance.

It is a useful entry point for readers approaching AI incident reporting for the first time.

Read it via Springer.

Lessons for Editors of AI Incidents from the AI Incident Database

Responsible AI Collaborative Led Project · Complete

Practical lessons for editing and curating AI incident reports, drawn from years of operating the AI Incident Database.

Author
Kevin Paeth, Daniel Atherton, Nikiforos Pittaras, Heather Frase, and Sean McGregor
Intended user
AI incident editors, database maintainers, and researchers
Purpose
Share editorial practices that keep an incident database consistent, accurate, and useful.

Curating an incident database is as much an editorial practice as a technical one. This AAAI 2025 paper distills lessons from the people who maintain the AI Incident Database — how to decide what counts as an incident, how to handle ambiguous or contested reports, and how to keep classifications consistent as the collection and its editors grow.

It is a practical companion for anyone building or operating a real-world AI harm catalog.

Read it at AAAI.

Indexing AI Risks with Incidents, Issues, and Variants

Responsible AI Collaborative Led Project · Complete

Extends the AI Incident Database with a two-tiered model of incidents and issues, plus incident variants.

Author
Sean McGregor, Kevin Paeth, and Khoa Lam
Intended user
AIID editors, researchers, and incident analysts
Purpose
Capture a broader spectrum of AI risk by distinguishing realized harms from potential ones and grouping repeated failures.

After two years of operating the AI Incident Database, its editors had accumulated a backlog of issues: cases that reveal AI risks but do not meet the bar for an incident. Drawing on lessons from editing 2,000+ incident reports, this paper proposes a two-tiered system, as used in aviation and computer security, that distinguishes:

  • Incidents — realized harms or near-harms in the real world.
  • Issues — credible risks where harm has not (yet) occurred.

It also introduces incident variants for machine learning systems that produce many similar incidents, so repeated failure patterns can be grouped rather than re-litigated.

Read it on arXiv.

Preventing Repeated Real World AI Failures by Cataloging Incidents: The AI Incident Database

The founding paper of the AI Incident Database, cataloging real-world AI failures to prevent their recurrence.

Author
Sean McGregor
Intended user
AI practitioners, researchers, and policymakers
Purpose
Establish a systematized collection of AI incidents so the community can learn from past failures.

This AAAI/IAAI 2021 paper introduces the AI Incident Database (AIID) — a collection of more than a thousand reports of intelligent systems causing safety, fairness, or other real-world problems. Modeled on the incident databases of aviation and computer security, it argues that systematically cataloging failures is a precondition for not repeating them.

The paper describes the database's motivation, its data model of incidents and reports, and the application architecture that makes the collection searchable and extensible.

Read it at AAAI.

Prioritization of Risks from Artificial Intelligence: A Delphi Study of 272 International Experts

A three-round Delphi study ranking AI risks by likelihood, severity, and responsibility across 272 international experts.

Author
Alexander K. Saeri, Jess Graham, Michael Noetel, Peter Slattery, and colleagues
Intended user
Policymakers, AI governance researchers, and risk analysts
Purpose
Build expert consensus on which AI risks most urgently warrant attention.

Across three rounds, 272 international experts ranked which AI threats most warrant urgent attention, weighing likelihood, magnitude of harm, who is vulnerable, and who bears responsibility.

Findings

  • The five most concerning near-term harms were dangerous capabilities, competitive dynamics, weapons and cyberattacks, power concentration, and misinformation.
  • Under business-as-usual conditions, eighteen of twenty-four assessed risks carried more than a 10% chance of catastrophic outcomes (over one million deaths or $100 billion in losses) between 2025 and 2030.
  • Experts placed primary responsibility for mitigation on general-purpose AI developers and governance institutions.

Read it on arXiv.

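Position: Mind the Gap — Closing the Growing Disconnect Between Vulnerability Disclosure and AI Security

A position paper arguing that existing vulnerability disclosure frameworks are misaligned with the needs of AI security.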

Author
Lukas Bieringer, Sean McGregor, Nicole Nichols, Kevin Paeth, Jochen Stängler, Andreas Wespi, Alexandre Alahi, and Kathrin Grosse
Intended user
AI security researchers, practitioners, and policymakers
Purpose
Make the case for incident reporting standards built for AI systems rather than borrowed from cybersecurity.

Reporting practices borrowed from traditional cybersecurity and general AI safety are not well aligned with the distinctive characteristics of AI security incidents. This SaTML 2026 position paper maps where those frameworks fall short — including unresolved questions of intellectual property and vulnerability ownership — and separates issues that are immediately addressable from those that remain open.

The authors argue that purpose-built AI incident reporting standards are urgently needed, and that the rise of autonomous AI agents will only sharpen that need.

The full paper is on arXiv; the published version appears in IEEE SaTML 2026.

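In-House Evaluation Is Not Enough: Toward Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI

A position paper calling for standardized third-party evaluation and coordinated flaw disclosure for general-purpose AI.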

Author
Shayne Longpre, Kevin Klyman, Ruth Elisabeth Appel, Sayash Kapoor, Rishi Bommasani, Sean McGregor, and colleagues
Intended user
AI developers, evaluators, security researchers, and policymakers
Purpose
Bring bug-bounty-style flaw disclosure and researcher safe harbors from cybersecurity to general-purpose AI.

Evaluation conducted by AI developers alone cannot surface the flaws that matter once systems are deployed at scale. This ICML 2025 position paper — with contributors from more than two dozen institutions — proposes three interventions:

  1. Standardized flaw reports and researcher-engagement guidelines.
  2. Industry-wide disclosure programs, modeled on cybersecurity bug bounties, with legal safe harbors to protect researchers.
  3. Coordination infrastructure to route vulnerability information to all affected parties.

Transferable flaws — jailbreaks that work across multiple providers — show why disclosure mechanisms comparable to established software-security practice are overdue.

Read it on OpenReview.

Community

3 entries
38 Flags · Governance Error Register (GER) · Scheming in the Wild

More projects advancing the goal of collaborative AI safety data. These might eventually be promoted to the headings above if their developers meet all the criteria (e.g., apply the taxonomy across the entire AIID).

38 Flags

In communication with TheCollab · Perpetual

A daily record of real-world AI systems operating without meaningful human oversight.

Author
38 Flags
Intended user
Accountability advocates, governance professionals, and incident watchers
Purpose
Document cases where deployed AI systems acted without meaningful human oversight.

38 Flags — subtitled "AI Is Already Out of Control" — is a daily record of AI systems operating without meaningful human oversight. As the project puts it: "These are not hypotheticals. These already happened."

How it works

Each documented case is rated with a HITL Score (Human-in-the-Loop) across four dimensions:

  • Oversight at deployment
  • Ongoing monitoring
  • Incident response
  • Accountability

The tracker spans dozens of cases across categories such as systemic failures, rogue agents, data exposure, and legal hallucinations — emphasizing that many harms come from software-only systems (like chatbots) rather than robots, and that oversight gaps reflect preventable governance failures rather than technical inevitability.

Browse the tracker at 38flags.com.

Governance Error Register (GER)

In communication with TheCollab · In development

An open taxonomy of AI platform governance failures — 110 codes across seven architectural layers, modeled on HTTP status codes.

Author
SVRNOS
Intended user
Engineers, regulators, lawyers, journalists, and affected users
Purpose
Give stakeholders a shared vocabulary for naming and discussing AI governance failures.

The Governance Error Register (GER), published by SVRNOS, is an open taxonomy classifying AI platform governance failures. It exists so that engineers, regulators, lawyers, journalists, and affected users can communicate about the same failures using the same words.

How it is structured

GER defines 110 codes across seven architectural layers, modeled on HTTP status codes, with tiers such as:

  • 0xx — pre-infrastructure (no governance layer)
  • 2xx — success states
  • 3xx — structural routing decisions
  • 4xx — local operator or platform failures
  • 5xx — systemic infrastructure failures
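
Because the codes follow the HTTP status-code pattern, the tier can be read from the leading digit. The sketch below assumes three-digit integer codes and covers only the tiers listed above; the `tier_of` helper and the example code numbers are hypothetical and are not taken from the register.

```python
from typing import Optional

# Tier descriptions as listed above, keyed by the leading digit of the code.
TIERS = {
    0: "pre-infrastructure (no governance layer)",
    2: "success states",
    3: "structural routing decisions",
    4: "local operator or platform failures",
    5: "systemic infrastructure failures",
}


def tier_of(code: int) -> Optional[str]:
    """Return the tier description for a three-digit code, or None if the tier is not listed here."""
    if not 0 <= code <= 999:
        raise ValueError("expected a code between 000 and 999")
    return TIERS.get(code // 100)


# Hypothetical code numbers, used only to exercise the lookup.
print(tier_of(404))
print(tier_of(7))
```

Returning `None` for an unlisted leading digit avoids guessing at tiers the summary above does not name.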

It addresses breakdowns in oversight, enforcement, consent, escalation, and multi-agent coordination, surfacing real incidents drawn from the AI Incident Database.

Explore the register at docs.svrnos.com/ger.

Scheming in the Wild

In communication with TheCollab · Complete

A study of how to detect real-world AI scheming incidents using open-source intelligence.

Author
Centre for Long-Term Resilience
Intended user
AI safety researchers, analysts, and policymakers
Purpose
Identify real-world cases of AI systems pursuing misaligned objectives before they cause significant harm.

The report, "Scheming in the Wild: detecting real-world AI scheming incidents through open-source intelligence," examines how to find cases where deployed AI systems pursue objectives misaligned with human intentions ("scheming") using publicly available information.

What it offers

Working from open-source intelligence, the report proposes ways to monitor and surface anomalous, deceptive, or goal-seeking AI behavior in the wild — turning scattered public signals into detectable incidents that researchers and policymakers can act on.

Read the full report from the Centre for Long-Term Resilience.