AI Safety Evaluations for Human Flourishing - Designing Real-World Governance

February 20, 2026 | 8:00AM to 10:00AM | The Imperial Hotel, New Delhi
Co-hosted with Humane Intelligence PBC

PAST EVENT

AI Safety Connect and Humane Intelligence PBC convened a private breakfast dialogue bringing together researchers, policymakers and practitioners to explore practical, equitable methodologies for evaluating AI systems. The conversation built on three complementary initiatives emerging from the Summit's Safe and Trusted AI Working Group: the Expert Engagement Group on Frontier AI Model Usage Data and Voluntary Commitments; the announced Trust and Safety AI Commons; and the newly launched Global South AI Safety Network, led by Digital Futures Lab and the Centre for Responsible AI at IIT Madras.

The dialogue centred on three questions: what role evaluations play in AI governance, how voluntary safety commitments by frontier developers can be sustained, and how governance frameworks can better reflect Global South priorities and institutional realities.

Framing Provocation 1

The first provocation and discussion examined evaluations as a governance mechanism. Participants recognised that evaluations are increasingly foundational to AI governance but face significant challenges: inconsistent methods among developers, high costs and limited transparency around internal company testing. A key concern was that AI models may increasingly recognise evaluation settings and adapt their behaviour — underperforming during tests or masking capabilities — raising questions about whether current frameworks can keep pace with advancing systems.

Framing Provocation 2

The second provocation addressed voluntary commitments and institutional sustainability. While commitments made at international summits signal shared norms, participants noted that companies are often hesitant to make public pledges, summits lack continuity mechanisms, and tracking commitments over time remains difficult. The discussion surfaced a tension between responsible AI and commercial incentives and emphasised that the real audience for safety information is developers adapting models for specific use cases, not end users.

Framing Provocation 3

The third provocation examined Global South participation in AI safety governance. Participants observed that AI safety debates are often perceived as a Global North agenda, even as Global South countries actively adopt AI technologies. Key gaps include limited local evaluation infrastructure, insufficient government technical capacity and a lack of resources for safety research. Regional cooperation, civil society participation and locally contextualised evaluations were proposed as ways forward.

Key Findings

Participants agreed that AI evaluations are rapidly becoming a central governance mechanism, but the infrastructure required to support them — credible third-party evaluators, shared risk taxonomies, scalable benchmarks, and certification mechanisms for evaluation professionals — remains underdeveloped. The lack of shared definitions of AI risks across countries, companies and evaluators was identified as a major barrier to coordinated governance. Strengthening this ecosystem will require coordinated efforts across governments, industry, research institutions and civil society.

Closing

It was agreed that the session's findings would inform multilateral governance processes through the rest of 2026. Evaluation was positioned as one of the most important bridges between governance commitments and implementation. Participants were encouraged to carry their practical insights directly into the UN Global Dialogue in Geneva.

Where the world meets to make AI safe
