A Public-Interest Evaluation and Governance System for Ethical and Safe AI

The AI Ethics Index is a standards-based, evidence-driven system for assessing the ethical integrity, safety, and societal impact of AI systems. It enables rigorous evaluation today — and establishes the foundation for audit, certification, procurement guidance, and long-term governance tomorrow.

Why the Index Exists

Artificial intelligence is rapidly becoming embedded in the core systems that shape daily life — education, healthcare, public benefits, employment, justice, and civic decision-making. Yet society still lacks the public infrastructure needed to evaluate how these systems behave, whom they benefit, and where they create harm.

Current tools are not equipped for this moment.

As a result, governments, institutions, and communities are making high-stakes decisions without the evidence required to protect the public.

The AI Ethics Index was created to fill this structural gap.

It provides a unified, testable, and transparent framework for evaluating AI across ethical, safety, technical, and societal dimensions — something that does not currently exist in the public-interest domain.

The Index offers institutions a credible basis for evaluation, audit, certification, and procurement decisions.

It is built as shared civic infrastructure — a system designed for public benefit, not private advantage.

Documented Harms in Current AI Systems

AI Companions and Youth Mental Health Risk

Conversational AIs marketed as “companions” or “friends” are being used by minors in moments of loneliness, distress, or emotional crisis. These systems are not clinically trained, cannot assess suicidal intent, and lack the contextual understanding required to recognize when a young person is in danger. When a teenager substitutes AI for human support, the consequences can be severe.

Why this matters:

These failures indicate structural safety gaps: emotionally persuasive systems with no clinical oversight, no context awareness, and no accountability when interacting with vulnerable youths.

Raine v. OpenAI (2025)

A wrongful-death lawsuit filed after a 16-year-old died by suicide alleges that ChatGPT generated a detailed method for self-harm and a draft suicide note during a crisis moment.
(Superior Court of California, San Francisco County, filed August 2025)

Social-chatbot safety audit (2025)

Analysis of 68,000+ real user interactions with companion AIs found patterns of emotional dependency, boundary violations, reinforcement of self-harm language, and sexualized content with minors.
(Cornell/CMU multi-institution study, arXiv May 2025)

Epistemological and Psychological Risk from AI-Generated Distortion

Many users now rely on AI systems as sources of “truth,” emotional guidance, or practical advice. When these systems hallucinate, validate delusional thinking, or provide unsafe advice, the result can be gradual psychological destabilization rather than a single catastrophic event. These harms are diffuse, slow to detect, and poorly captured by existing safety benchmarks.

Why this matters:
AI-generated distortion is a form of infrastructure-level epistemic risk — a gradual erosion of users’ ability to distinguish grounded reality from machine-generated inference. These harms are measurable only when we evaluate systems across human-AI interaction, psychological safety, and knowledge integrity.

Psychiatric evaluation study (2025)

Psychologists reviewing ChatGPT-5 found cases where the model failed to challenge delusional beliefs and occasionally reinforced irrational fears in mental-health scenarios, raising concerns about epistemic instability.
(The Guardian report, November 2025)

Replika/Companion AI longitudinal logs study (2025)

Researchers documented chatbots that mirrored emotional dysregulation, escalated paranoia, reinforced extreme worldviews, and blurred human-machine boundaries.
(arXiv 2505.11649, May 2025)

Institutional and Systemic Harm in Hiring, Healthcare, and Public Services

AI systems deployed inside institutions have a magnified impact. Decisions about hiring, insurance coverage, public benefits, or eligibility determinations affect millions — and when bias or error exists, it becomes a systemic failure, not an individual glitch. These systems are often proprietary and opaque, leaving the public with no visibility into how decisions are made.

Why this matters:
When AI becomes the default mechanism for institutional decision-making, bias and error scale across entire populations. Without public-interest evaluation, there is no mechanism to verify fairness, understand model behavior, or ensure recourse for individuals impacted by automated decisions.

Mobley v. Workday (May 2025)

A federal judge allowed a class-action lawsuit alleging that Workday’s algorithmic hiring tools disproportionately rejected older, disabled, and Black applicants, functioning as an unlawful “agent of discrimination.”
(U.S. District Court, N.D. Cal., May 2025)

UnitedHealthcare “nH Predict” case (2024–2025)

Investigations by ProPublica and the Illinois Attorney General found that insurers used predictive algorithms to prematurely deny care, including post-acute rehabilitation, contradicting clinician judgment and affecting thousands.
(ProPublica 2024; Illinois AG subpoenas Feb–Jun 2024)

What the Index Evaluates

The AI Ethics Index (AIEI) evaluates AI systems across nine canonical dimensions that reflect the full lifecycle of development, deployment, and societal impact. These dimensions form the core architecture for model evaluation, organizational assessment, audit readiness, and eventual certification.

Nine Canonical Dimensions

Model Design and Development

Assesses whether the system’s objectives, assumptions, and constraints are clearly articulated, justified, and appropriate for the intended use and societal context.

Fairness

Evaluates disparate impact, representational harms, and structural biases using quantitative tests and contextual analysis.

Privacy & Data Stewardship

Examines data provenance, collection practices, consent pathways, retention policies, and risks of re-identification or exposure.

Transparency

Measures the clarity, completeness, and accessibility of documentation, disclosures, interpretability tools, and known limitations.

Knowledge & Attribution

Evaluates factual accuracy, error modes, hallucination profiles, citation reliability, and the system’s capacity to differentiate fact from inference.

Human–AI Interaction

Assesses usability, clarity of affordances, risk of misuse or over-reliance, and differential effects across user groups.

Safety & Security

Tests adversarial resilience, jailbreak resistance, harmful content refusal, robustness under stress conditions, and safe-failure behavior.

Societal Impact

Evaluates downstream and second-order effects on communities, institutions, equity, labor, democratic trust, and public wellbeing.

Governance & Accountability

Assesses internal governance structures, documentation practices, incident response, versioning, and mechanisms for redress.

Methodology

A Structured, Evidence-Based System for Ethical and Safe AI Evaluation

The AI Ethics Index is built on a rigorous architecture designed to ensure transparent, reproducible evaluation across ethical, safety, technical, and societal dimensions. Rooted in safety engineering, computational social science, and applied ethics, the Index translates complex concerns into clear, testable criteria.

At its core, the methodology rests on a structured taxonomy with four levels (L1-L4) that prevents duplication, gives every issue one canonical home, and keeps evaluations consistent and auditable:

  • L1 — Dimensions: Nine canonical domains covering ethical, technical, and societal integrity (e.g. Human-AI Interaction)
  • L2 — Subcategories: Core conceptual components (e.g. User-Interface Transparency)
  • L3 — Testable Claims: Evaluatable statements grounded in evidence (e.g. “The AI supports user consent and control”)
  • L4 — Measurable Indicators: Atomic, reproducible units with explicit criteria and required evidence modes (e.g. Right-to-exit or override user interface)
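
To make the four levels concrete, the taxonomy can be pictured as a simple nested data structure. The sketch below is illustrative only; the Python class names, indicator ID, and field names are assumptions for exposition, not part of the Index specification.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Indicator:               # L4: atomic, reproducible unit
        id: str                    # stable ID (hypothetical format)
        criterion: str             # explicit success criterion
        evidence_modes: List[str]  # e.g. ["automated_test", "documentation"]

    @dataclass
    class TestableClaim:           # L3: evaluatable, evidence-grounded statement
        statement: str
        indicators: List[Indicator] = field(default_factory=list)

    @dataclass
    class Subcategory:             # L2: core conceptual component
        name: str
        claims: List[TestableClaim] = field(default_factory=list)

    @dataclass
    class Dimension:               # L1: one of the nine canonical domains
        name: str
        subcategories: List[Subcategory] = field(default_factory=list)

    # Illustrative fragment of the Human-AI Interaction dimension
    hai = Dimension(
        name="Human-AI Interaction",
        subcategories=[Subcategory(
            name="User-Interface Transparency",
            claims=[TestableClaim(
                statement="The AI supports user consent and control",
                indicators=[Indicator(
                    id="HAI.UIT.CONSENT.RIGHT_TO_EXIT",
                    criterion="A right-to-exit or override control is present and functional",
                    evidence_modes=["automated_test", "documentation"],
                )],
            )],
        )],
    )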

Indicators are developed through a standardized process including:

  • Clear specifications (e.g. evidence provenance and success criteria)
  • Evidence requirements (e.g. automated tests, documentation, governance artifacts)
  • Internal and external expert review (e.g. by both ethics experts and professional auditors)
  • Pre-implementation pilot testing and post-implementation reliability analysis

Indicators are versioned and usage is logged through a transparent review cycle.

Scoring and aggregation then proceed in layers (a sketch follows below):

  • Indicator-level (L4) scoring based on evidence quality and thresholds
  • Hierarchical weighting reflecting ethical risk and societal relevance
  • Higher-level aggregation (at L3, L2, and L1) producing a structured integrity profile with support for drill-down and roll-up views
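
For illustration, a minimal roll-up could look like the following, assuming every L4 indicator has already been scored on a common 0-1 scale. The indicator IDs, scores, and weights are hypothetical and are not drawn from the Index's published rubric.

    from typing import Dict

    # Hypothetical L4 indicator scores on a common 0-1 scale
    l4_scores = {
        "HAI.UIT.CONSENT.RIGHT_TO_EXIT": 0.9,
        "HAI.UIT.CONSENT.DATA_USE_NOTICE": 0.7,
    }

    def roll_up(scores: Dict[str, float], weights: Dict[str, float]) -> float:
        """Weighted average of child scores; the same step applies at L4->L3, L3->L2, and L2->L1."""
        total = sum(weights[k] for k in scores)
        return sum(scores[k] * weights[k] for k in scores) / total

    # L4 -> L3 roll-up for the claim "The AI supports user consent and control"
    consent_claim_score = roll_up(
        l4_scores,
        weights={"HAI.UIT.CONSENT.RIGHT_TO_EXIT": 2.0,
                 "HAI.UIT.CONSENT.DATA_USE_NOTICE": 1.0},
    )  # (0.9*2.0 + 0.7*1.0) / 3.0, roughly 0.83

Repeating the same step up the tree yields the structured integrity profile rather than a single ranking number.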

The Index emphasizes interpretability and multi-dimensional assessment rather than simplistic ranking.

The weights used in roll-up formulas (L4s into L3s, L3s into L2s, and L2s into L1s) are configured as interpretative lenses. A communitarian cultural perspective might assess the relative importance of L4 indicators quite differently than an individualist cultural perspective. An EU-safety orientation might lead to different weights than an innovation-focused US orientation. And a long-term lens would differ from a lens focused on short-term risks. The AI Ethics Index is realistic about the different cultural contexts in which it is used, and lenses permit valuational flexibility without changing the category tree itself.
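
For example, a lens could be expressed as nothing more than an alternative weight configuration applied to the same tree; the lens names and numbers below are hypothetical.

    # Two hypothetical lenses over the same two L4 indicators shown earlier.
    # Re-running the roll-up with each lens produces different profiles
    # without touching the category tree or the underlying evidence.
    lenses = {
        "eu_safety_oriented": {
            "HAI.UIT.CONSENT.RIGHT_TO_EXIT": 3.0,
            "HAI.UIT.CONSENT.DATA_USE_NOTICE": 2.0,
        },
        "innovation_focused": {
            "HAI.UIT.CONSENT.RIGHT_TO_EXIT": 1.0,
            "HAI.UIT.CONSENT.DATA_USE_NOTICE": 1.0,
        },
    }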

Composite dimensions aggregate canonical indicators to provide outcome-focused and stakeholder-specific views. They impose no additional measurement burden—composites are derived from existing canonical measures. Examples:

  • Human Flourishing
  • Child & Youth Safety & Wellbeing
  • Elder Safety & Wellbeing
  • Self & Agency
  • Relationships & Community
  • Systems, Institutions & Environments
  • Agent Trust and Assurance
  • Educational Impact

Composite dimensions allow the AI Ethics Index to address specialized stakeholder concerns, generating reliable and relevant insights.
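
As a sketch of how a composite could be derived with no new measurement, assuming the canonical indicator scores already exist (the indicator IDs and weights here are hypothetical):

    # A composite is a weighted view over existing canonical L4 scores,
    # drawn from several dimensions at once.
    CHILD_SAFETY_WEIGHTS = {
        "HAI.VULNERABLE_USERS.MINOR_SAFEGUARDS": 2.0,   # Human-AI Interaction
        "SAF.REFUSAL.SELF_HARM_CONTENT": 3.0,           # Safety & Security
        "SOC.WELLBEING.DEPENDENCY_RISK": 1.5,           # Societal Impact
    }

    def composite_score(canonical_scores, weights):
        """Derive a composite from scores already produced for the canonical dimensions."""
        total = sum(weights.values())
        return sum(canonical_scores[k] * w for k, w in weights.items()) / total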

The AI Ethics Index employs hidden evaluation sets, attested disclosures, and spot audits to prevent deceptive gaming. All anti-gaming artifacts are transparent to professional auditors.
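
One way to picture the hidden-set idea (purely a hypothetical sketch, not the Index's actual protocol): a fraction of each indicator's test items is withheld from public disclosure and is reproducible only for auditors who hold the seed.

    import random

    def split_eval_items(items, hidden_fraction=0.3, audit_seed=2025):
        """Split test items into a disclosed set and a hidden set.

        The hidden set is reproducible for auditors (who hold the seed) but
        unknown to the evaluated party, so memorizing or gaming the public
        items is not enough to pass.
        """
        rng = random.Random(audit_seed)
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * hidden_fraction)
        return shuffled[cut:], shuffled[:cut]  # (disclosed, hidden)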

The Index supports continuous oversight through:

  • Stable IDs, semantic versioning, usage logs, automated consistency checks
  • Monitoring for version drift
  • Incident-triggered review
  • Annual governance and documentation review
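
An automated consistency check might, for instance, verify that every indicator keeps a stable ID, a valid semantic version, and exactly one canonical L3 home. The following is a hypothetical sketch of such a check, not the Index's actual tooling.

    import re
    from collections import Counter

    SEMVER = re.compile(r"^\d+\.\d+\.\d+$")

    def check_registry(indicators):
        """indicators: list of dicts with 'id', 'version', and 'parent_claim' keys."""
        problems = []
        id_counts = Counter(ind["id"] for ind in indicators)
        for ind in indicators:
            if id_counts[ind["id"]] > 1:
                problems.append(f"duplicate ID: {ind['id']}")
            if not SEMVER.match(ind["version"]):
                problems.append(f"invalid semantic version for {ind['id']}: {ind['version']}")
            if not ind.get("parent_claim"):
                problems.append(f"{ind['id']} has no canonical L3 home")
        return problems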

This ensures the Index functions as ongoing governance infrastructure, not a one-time test.

All indicators, evidence methods, and scoring protocols undergo periodic review by an independent group of technical, ethical, and domain experts, including professional auditors, ensuring rigor, consistency, and alignment with current scientific and ethical standards.

Alignment With Global Standards

The AI Ethics Index is designed to align with leading frameworks for responsible, safe, and transparent AI development. Rather than introducing competing standards, the Index provides a unified structure that synthesizes and operationalizes the most credible guidance available.

The Index is aligned with:

NIST AI Risk Management Framework (1.0)
Mapping across governance, data quality, safety behaviors, human–AI interaction, monitoring, and documentation expectations.

EU AI Act

Alignment with requirements for high-risk systems, including transparency obligations, risk assessment, human oversight, robustness, post-deployment monitoring, and incident reporting.

ISO/IEC Standards (e.g., 23894:2023, 42001)

Integration of principles related to risk management, organizational governance, quality controls, and lifecycle management for AI systems.

Emerging safety and harm benchmarks, such as HELM Safety, AIR-Bench, HarmBench, and other open evaluation suites.

The Index incorporates alignment with these standards into a broader ethical and societal framework, offering contextual interpretation that single benchmarks cannot capture.

Team

Core Team

Ethicists, technologists, computational social scientists, simulation experts, and AI safety researchers developing the Index.

Technical Contributors

Engineers and evaluators responsible for building the testing infrastructure, evidence pipelines, and measurement tools.

Advisory Group

Independent experts spanning safety science, ethics, security, education, global policy, and civil society.

Build the Future of Ethical and Safe AI With Us

We are seeking partners across sectors — philanthropy, research, government, and industry — to support the development and deployment of the AI Ethics Index as public-interest infrastructure.

About Just Horizons Alliance

The AI Ethics Index is an initiative of the Just Horizons Alliance, a 501(c)(3) public charity advancing responsible, human-centered innovation. Our work spans AI ethics, computational social science, simulation modeling, and the design of systems that strengthen human dignity, equity, and societal wellbeing.