Towards Provably Moral AI Agents in Bottom-up Learning Frameworks

AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2018

Nolan P. Shaw, Andreas Stöckel, Ryan Orr, Thomas F. Lidbetter, Robin Cohen


We examine moral machine decision-making, inspired by a central question posed by Rossi regarding moral preferences: can AI systems based on statistical machine learning (which do not provide a natural way to explain or justify their decisions) be used for embedding morality into a machine in a way that allows us to prove that nothing morally wrong will happen? We argue for an evaluation held to the same standards as a human agent, removing the demand that ethical behavior is always achieved. We introduce four key meta-qualities desired for our moral standards, and then proceed to clarify how we can prove that an agent will correctly learn to perform moral actions given a set of samples within certain error bounds. Our group-dynamic approach enables us to demonstrate that the learned models converge to a common function to achieve stability. We further explain a valuable intrinsic consistency check made possible through the derivation of logical statements from the machine learning model. In all, this work proposes an approach for building ethical AI systems, from the perspective of artificial intelligence, and sheds important light on understanding how much learning is required for an intelligent agent to behave morally with negligible error.

Conference Proceedings

AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society
New Orleans, USA

