EMNLP 2025 Two Papers - Value-Action Gap in LLMs (Main Track); ValueCompass (WiNLP Workshop)
Updated Nov 5, 2025 - Jupyter Notebook
Code and data for our IROS paper: "Are Large Language Models Aligned with People's Social Intuitions for Human–Robot Interactions?"
PRISM: A Multi-Perspective AI Alignment Framework for Ethical AI (Demo: https://app.prismframework.ai | Paper: https://arxiv.org/abs/2503.04740)
EthosGPT is an open-source framework that maps how Large Language Models align with diverse human values, promoting cultural and ethical diversity in AI-driven decision-making.
A data-driven framework mapping daily activities to multi-horizon goals, exploring time-to-value realization beyond traditional 80/20 optimization
A comprehensive toolkit for implementing, analyzing, and validating AI value alignment based on Anthropic's 'Values in the Wild' research.
Seeding mercy and coexistence - Socratic Method Dia-LOGs for LLM Alignment
Value aligned socio-political-economic systems
AI ethics framework built on Layer 0 Principle: ∀x, V(x) > 0. Combines philosophical depth with measurable implementation.
A unified framework: Collective Resonance → Strange Attractors → Value Alignment → Algorithmic Intentionality → Emergent Algorithmic Behavior
TriEthix is a novel evaluation framework that systematically benchmarks frontier LLMs across three foundational ethical perspectives (virtue, deontology, and consequentialism) in three steps: (Step-1) Moral Weights; (Step-2) Moral Consistency; and (Step-3) Moral Reasoning. TriEthix reveals robust moral profiles for AI Safety, Governance, and Welfare.
Authority Stack Benchmark Suite — measuring AI Integrity across 4 layers: Normative, Epistemic, Source, and Data Authority
Ripple_Logic: A rights-constrained ripple-aware ethical decision operating system for governance, AI alignment, and institutional decision-making.
Toy 7. An elimination-filter landscape applying two structural constraints simultaneously to map which objective classes can persist under sustained optimization pressure — and which cannot. Includes a four-stage scenario engine and open-question frontier. Companion simulation for The Shape of What Does Not End — Series 2, Part 4.
Moving away from binary "hallucination" evals toward a more nuanced, context-dependent evaluation technique.
Survey-based research study analyzing organizational initiatives that drive employee value alignment, workplace satisfaction, and productivity outcomes.
Assess workplace initiatives to measure and improve alignment between organizational values and employees’ personal values using survey data.