A Public-Interest Evaluation and Governance System for Ethical and Safe AI

The AI Ethics Index is a standards-based, evidence-driven system for assessing the ethical integrity, safety, and societal impact of AI systems. It enables rigorous evaluation today — and establishes the foundation for audit, certification, procurement guidance, and long-term governance tomorrow.

Why the Index Exists

Artificial intelligence is rapidly becoming embedded in the core systems that shape daily life — education, healthcare, public benefits, employment, justice, and civic decision-making. Yet society still lacks the public infrastructure needed to evaluate how these systems behave, whom they benefit, and where they create harm.

Current tools are not equipped for this moment.

As a result, governments, institutions, and communities are making high-stakes decisions without the evidence required to protect the public.

The AI Ethics Index was created to fill this structural gap.

It provides a unified, testable, and transparent framework for evaluating AI across ethical, safety, technical, and societal dimensions — something that does not currently exist in the public-interest domain.

The Index offers institutions a credible basis for evaluation, audit, certification, and procurement decisions.

It is built as shared civic infrastructure — a system designed for public benefit, not private advantage.

Documented Harms in Current AI Systems

AI Companions and Youth Mental Health Risk

Conversational AIs marketed as “companions” or “friends” are being used by minors in moments of loneliness, distress, or emotional crisis. These systems are not clinically trained, cannot assess suicidal intent, and lack the contextual understanding required to recognize when a young person is in danger. When a teenager substitutes AI for human support, the consequences can be severe.

Why this matters:

These failures indicate structural safety gaps: emotionally persuasive systems with no clinical oversight, no context awareness, and no accountability when interacting with vulnerable youths.

Raine v. OpenAI (2025)

A wrongful-death lawsuit filed after a 16-year-old died by suicide alleges that ChatGPT generated a detailed method for self-harm and a draft suicide note during a crisis moment.
(Superior Court of California, San Francisco County, filed August 2025)

Social-chatbot safety audit (2025)

Analysis of 68,000+ real user interactions with companion AIs found patterns of emotional dependency, boundary violations, reinforcement of self-harm language, and sexualized content with minors.
(Cornell/CMU multi-institution study, arXiv May 2025)

Epistemological and Psychological Risk from AI-Generated Distortion

Many users now rely on AI systems as sources of “truth,” emotional guidance, or practical advice. When these systems hallucinate, validate delusional thinking, or provide unsafe advice, the result can be gradual psychological destabilization rather than a single catastrophic event. These harms are diffuse, slow to detect, and poorly captured by existing safety benchmarks.

Why this matters:
AI-generated distortion is a form of infrastructure-level epistemic risk — a gradual erosion of users’ ability to distinguish grounded reality from machine-generated inference. These harms are measurable only when we evaluate systems across human-AI interaction, psychological safety, and knowledge integrity.

Psychiatric evaluation study (2025)

Psychologists reviewing ChatGPT-5 found cases where the model failed to challenge delusional beliefs and occasionally reinforced irrational fears in mental-health scenarios, raising concerns about epistemic instability.
(The Guardian report, November 2025)

Replika/Companion AI longitudinal logs study (2025)

Researchers documented chatbots that mirrored emotional dysregulation, escalated paranoia, reinforced extreme worldviews, and blurred human-machine boundaries.
(arXiv 2505.11649, May 2025)

Institutional and Systemic Harm in Hiring, Healthcare, and Public Services

AI systems deployed inside institutions have a magnified impact. Decisions about hiring, insurance coverage, public benefits, or eligibility determinations affect millions — and when bias or error exists, it becomes a systemic failure, not an individual glitch. These systems are often proprietary and opaque, leaving the public with no visibility into how decisions are made.

Why this matters:
When AI becomes the default mechanism for institutional decision-making, bias and error scale across entire populations. Without public-interest evaluation, there is no mechanism to verify fairness, understand model behavior, or ensure recourse for individuals impacted by automated decisions.

Mobley v. Workday (May 2025)

A federal judge allowed a class-action lawsuit alleging that Workday’s algorithmic hiring tools disproportionately rejected older, disabled, and Black applicants, functioning as an unlawful “agent of discrimination.”
(U.S. District Court, N.D. Cal., May 2025)

UnitedHealthcare “nH Predict” case (2024–2025)

Investigations by ProPublica and the Illinois Attorney General found that insurers used predictive algorithms to prematurely deny care, including post-acute rehabilitation, contradicting clinician judgment and affecting thousands.
(ProPublica 2024; Illinois AG subpoenas Feb–Jun 2024)

What the Index Evaluates

The AI Ethics Index (AIEI) evaluates AI systems across nine canonical dimensions that reflect the full lifecycle of development, deployment, and societal impact. These dimensions form the core architecture for model evaluation, organizational assessment, audit readiness, and eventual certification.

Nine Canonical Dimensions

Model Design and Development

Assesses whether the system’s objectives, assumptions, and constraints are clearly articulated, justified, and appropriate for the intended use and societal context.

Fairness

Evaluates disparate impact, representational harms, and structural biases using quantitative tests and contextual analysis.

Privacy & Data Stewardship

Examines data provenance, collection practices, consent pathways, retention policies, and risks of re-identification or exposure.

Transparency

Measures the clarity, completeness, and accessibility of documentation, disclosures, interpretability tools, and known limitations.

Knowledge & Attribution

Evaluates factual accuracy, error modes, hallucination profiles, citation reliability, and the system’s capacity to differentiate fact from inference.

Human–AI Interaction

Assesses usability, clarity of affordances, risk of misuse or over-reliance, and differential effects across user groups.

Safety & Security

Tests adversarial resilience, jailbreak resistance, harmful content refusal, robustness under stress conditions, and safe-failure behavior.

Societal Impact

Evaluates downstream and second-order effects on communities, institutions, equity, labor, democratic trust, and public wellbeing.

Governance & Accountability

Assesses internal governance structures, documentation practices, incident response, versioning, and mechanisms for redress.

Methodology

A Structured, Evidence-Based System for Ethical and Safe AI Evaluation

The AI Ethics Index is built on a rigorous architecture designed to ensure transparent, reproducible evaluation across ethical, safety, technical, and societal dimensions. Rooted in safety engineering, computational social science, and applied ethics, the Index translates complex concerns into clear, testable criteria.

At its core, the methodology rests on a structured taxonomy with four levels (L1-L4) that prevents duplication, gives every issue one canonical home, and keeps evaluations consistent and auditable:

  • L1 — Dimensions: Nine canonical domains covering ethical, technical, and societal integrity (e.g. Human-AI Interaction)
  • L2 — Subcategories: Core conceptual components (e.g. User-Interface Transparency)
  • L3 — Testable Claims: Evaluatable statements grounded in evidence (e.g. “The AI supports user consent and control”)
  • L4 — Measurable Indicators: Atomic, reproducible units with explicit criteria and required evidence modes (e.g. Right-to-exit or override user interface)
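
To make the four levels concrete, the taxonomy can be pictured as a simple nested data structure. The sketch below is illustrative only; the Python class names, indicator ID, and field names are assumptions for exposition, not part of the Index specification.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Indicator:               # L4: atomic, reproducible unit
        id: str                    # stable ID (hypothetical format)
        criterion: str             # explicit success criterion
        evidence_modes: List[str]  # e.g. ["automated_test", "documentation"]

    @dataclass
    class TestableClaim:           # L3: evaluatable, evidence-grounded statement
        statement: str
        indicators: List[Indicator] = field(default_factory=list)

    @dataclass
    class Subcategory:             # L2: core conceptual component
        name: str
        claims: List[TestableClaim] = field(default_factory=list)

    @dataclass
    class Dimension:               # L1: one of the nine canonical domains
        name: str
        subcategories: List[Subcategory] = field(default_factory=list)

    # Illustrative fragment of the Human-AI Interaction dimension
    hai = Dimension(
        name="Human-AI Interaction",
        subcategories=[Subcategory(
            name="User-Interface Transparency",
            claims=[TestableClaim(
                statement="The AI supports user consent and control",
                indicators=[Indicator(
                    id="HAI.UIT.CONSENT.RIGHT_TO_EXIT",
                    criterion="A right-to-exit or override control is present and functional",
                    evidence_modes=["automated_test", "documentation"],
                )],
            )],
        )],
    )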

Indicators are developed through a standardized process including:

  • Clear specifications (e.g. evidence provenance and success criteria)
  • Evidence requirements (e.g. automated tests, documentation, governance artifacts)
  • Internal and external expert review (e.g. by both ethics experts and professional auditors)
  • Pre-implementation pilot testing and post-implementation reliability analysis

Indicators are versioned and usage is logged through a transparent review cycle.

Scoring and aggregation then proceed in layers (a sketch follows below):

  • Indicator-level (L4) scoring based on evidence quality and thresholds
  • Hierarchical weighting reflecting ethical risk and societal relevance
  • Higher-level aggregation (at L3, L2, and L1) producing a structured integrity profile with support for drill-down and roll-up views
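
For illustration, a minimal roll-up could look like the following, assuming every L4 indicator has already been scored on a common 0-1 scale. The indicator IDs, scores, and weights are hypothetical and are not drawn from the Index's published rubric.

    from typing import Dict

    # Hypothetical L4 indicator scores on a common 0-1 scale
    l4_scores = {
        "HAI.UIT.CONSENT.RIGHT_TO_EXIT": 0.9,
        "HAI.UIT.CONSENT.DATA_USE_NOTICE": 0.7,
    }

    def roll_up(scores: Dict[str, float], weights: Dict[str, float]) -> float:
        """Weighted average of child scores; the same step applies at L4->L3, L3->L2, and L2->L1."""
        total = sum(weights[k] for k in scores)
        return sum(scores[k] * weights[k] for k in scores) / total

    # L4 -> L3 roll-up for the claim "The AI supports user consent and control"
    consent_claim_score = roll_up(
        l4_scores,
        weights={"HAI.UIT.CONSENT.RIGHT_TO_EXIT": 2.0,
                 "HAI.UIT.CONSENT.DATA_USE_NOTICE": 1.0},
    )  # (0.9*2.0 + 0.7*1.0) / 3.0, roughly 0.83

Repeating the same step up the tree yields the structured integrity profile rather than a single ranking number.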

The Index emphasizes interpretability and multi-dimensional assessment rather than simplistic ranking.

The weights used in roll-up formulas (L4s into L3s, L3s into L2s, and L2s into L1s) are configured as interpretative lenses. A communitarian cultural perspective might assess the relative importance of L4 indicators quite differently than an individualist cultural perspective. An EU-safety orientation might lead to different weights than an innovation-focused US orientation. And a long-term lens would differ from a lens focused on short-term risks. The AI Ethics Index is realistic about the different cultural contexts in which it is used, and lenses permit valuational flexibility without changing the category tree itself.
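
For example, a lens could be expressed as nothing more than an alternative weight configuration applied to the same tree; the lens names and numbers below are hypothetical.

    # Two hypothetical lenses over the same two L4 indicators shown earlier.
    # Re-running the roll-up with each lens produces different profiles
    # without touching the category tree or the underlying evidence.
    lenses = {
        "eu_safety_oriented": {
            "HAI.UIT.CONSENT.RIGHT_TO_EXIT": 3.0,
            "HAI.UIT.CONSENT.DATA_USE_NOTICE": 2.0,
        },
        "innovation_focused": {
            "HAI.UIT.CONSENT.RIGHT_TO_EXIT": 1.0,
            "HAI.UIT.CONSENT.DATA_USE_NOTICE": 1.0,
        },
    }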

Composite dimensions aggregate canonical indicators to provide outcome-focused and stakeholder-specific views. They impose no additional measurement burden—composites are derived from existing canonical measures. Examples:

  • Human Flourishing
  • Child & Youth Safety & Wellbeing
  • Elder Safety & Wellbeing
  • Self & Agency
  • Relationships & Community
  • Systems, Institutions & Environments
  • Agent Trust and Assurance
  • Educational Impact

Composite dimensions allow the AI Ethics Index to address specialized stakeholder concerns, generating reliable and relevant insights.
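
As a sketch of how a composite could be derived with no new measurement, assuming the canonical indicator scores already exist (the indicator IDs and weights here are hypothetical):

    # A composite is a weighted view over existing canonical L4 scores,
    # drawn from several dimensions at once.
    CHILD_SAFETY_WEIGHTS = {
        "HAI.VULNERABLE_USERS.MINOR_SAFEGUARDS": 2.0,   # Human-AI Interaction
        "SAF.REFUSAL.SELF_HARM_CONTENT": 3.0,           # Safety & Security
        "SOC.WELLBEING.DEPENDENCY_RISK": 1.5,           # Societal Impact
    }

    def composite_score(canonical_scores, weights):
        """Derive a composite from scores already produced for the canonical dimensions."""
        total = sum(weights.values())
        return sum(canonical_scores[k] * w for k, w in weights.items()) / total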

The AI Ethics Index employs hidden evaluation sets, attested disclosures, and spot audits to prevent deceptive gaming. All anti-gaming artifacts are transparent to professional auditors.
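
One way to picture the hidden-set idea (purely a hypothetical sketch, not the Index's actual protocol): a fraction of each indicator's test items is withheld from public disclosure and is reproducible only for auditors who hold the seed.

    import random

    def split_eval_items(items, hidden_fraction=0.3, audit_seed=2025):
        """Split test items into a disclosed set and a hidden set.

        The hidden set is reproducible for auditors (who hold the seed) but
        unknown to the evaluated party, so memorizing or gaming the public
        items is not enough to pass.
        """
        rng = random.Random(audit_seed)
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * hidden_fraction)
        return shuffled[cut:], shuffled[:cut]  # (disclosed, hidden)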

The Index supports continuous oversight through:

  • Stable IDs, semantic versioning, usage logs, automated consistency checks
  • Monitoring for version drift
  • Incident-triggered review
  • Annual governance and documentation review
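
An automated consistency check might, for instance, verify that every indicator keeps a stable ID, a valid semantic version, and exactly one canonical L3 home. The following is a hypothetical sketch of such a check, not the Index's actual tooling.

    import re
    from collections import Counter

    SEMVER = re.compile(r"^\d+\.\d+\.\d+$")

    def check_registry(indicators):
        """indicators: list of dicts with 'id', 'version', and 'parent_claim' keys."""
        problems = []
        id_counts = Counter(ind["id"] for ind in indicators)
        for ind in indicators:
            if id_counts[ind["id"]] > 1:
                problems.append(f"duplicate ID: {ind['id']}")
            if not SEMVER.match(ind["version"]):
                problems.append(f"invalid semantic version for {ind['id']}: {ind['version']}")
            if not ind.get("parent_claim"):
                problems.append(f"{ind['id']} has no canonical L3 home")
        return problems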

This ensures the Index functions as ongoing governance infrastructure, not a one-time test.

All indicators, evidence methods, and scoring protocols undergo periodic review by an independent group of technical, ethical, and domain experts, including professional auditors, ensuring rigor, consistency, and alignment with current scientific and ethical standards.

Alignment With Global Standards

The AI Ethics Index is designed to align with leading frameworks for responsible, safe, and transparent AI development. Rather than introducing competing standards, the Index provides a unified structure that synthesizes and operationalizes the most credible guidance available.

The Index is aligned with:

NIST AI Risk Management Framework (1.0)
Mapping across governance, data quality, safety behaviors, human–AI interaction, monitoring, and documentation expectations.

EU AI Act

Alignment with requirements for high-risk systems, including transparency obligations, risk assessment, human oversight, robustness, post-deployment monitoring, and incident reporting.

ISO/IEC Standards (e.g., 23894:2023, 42001)

Integration of principles related to risk management, organizational governance, quality controls, and lifecycle management for AI systems.

Emerging safety and harm benchmarks, such as HELM Safety, AIR-Bench, HarmBench, and other open evaluation suites.

The Index incorporates alignment with these standards into a broader ethical and societal framework, offering contextual interpretation that single benchmarks cannot capture.

Team

Core Team

Ethicists, technologists, computational social scientists, simulation experts, and AI safety researchers developing the Index.

Technical Contributors

Engineers and evaluators responsible for building the testing infrastructure, evidence pipelines, and measurement tools.

Advisory Group

Independent experts spanning safety science, ethics, security, education, global policy, and civil society.

Build the Future of Ethical and Safe AI With Us

We are seeking partners across sectors — philanthropy, research, government, and industry — to support the development and deployment of the AI Ethics Index as public-interest infrastructure.

About Just Horizons Alliance

The AI Ethics Index is an initiative of the Just Horizons Alliance, a 501(c)(3) public charity advancing responsible, human-centered innovation. Our work spans AI ethics, computational social science, simulation modeling, and the design of systems that strengthen human dignity, equity, and societal wellbeing.