Execution errors enable the evolution of fairness in the ultimatum game

Bibliographic Details
Main Author: Santos, Fernando P. (author)
Other Authors: Santos, Francisco C. (author), Paiva, Ana (author), Pacheco, Jorge Manuel Santos (author)
Format: conferencePaper
Language: eng
Published: 2016
Online Access: http://hdl.handle.net/1822/47899
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/47899
Description
Summary: The goal of designing autonomous and successful agents is often pursued by providing mechanisms to choose actions that maximise some reward function. When agents interact with a static environment, the provided reward functions are well-defined and the implementation of traditional learning algorithms becomes feasible. However, agents are not intended to act only in isolation. Often, they interact in dynamic multiagent systems whose decentralised decision making, huge number of opponents and evolving behaviour give rise to a complex adaptive system [3]. It is therefore an important challenge to unveil the long-term outcome of agents’ strategies, both in terms of individual goals and social desirability [9]. This endeavour can be conveniently pursued with tools from, e.g., population dynamics [4] and complex systems research, in order to grasp the effects of implementing agents whose strategies, even if rational in the context of static environments, may turn out to be disadvantageous (individually and socially) when successively applied by members of a dynamic population. In this paper, we present a paradigmatic scenario in which behavioural errors are pernicious if committed in isolation, yet are the source of long-term success when adaptive populations are considered. Moreover, errors support population states in which fairness (less inequality) is enhanced. We assume that the goals and strategies of agents are formalised through the famous Ultimatum Game (UG) [2]. We focus on how the frequency of agents adopting each strategy changes over time. This process of social learning, essentially analogous to the evolution of animal traits in a population, enables us to use the tools of Evolutionary Game Theory (EGT), originally applied in the context of theoretical biology [4]. We describe analytically the behavioural outcome in a discretised strategy space of the UG, in the limit of small exploration rates (the so-called mutations) [1]. This allows us to replicate the results of large-scale simulations [7] while avoiding their computational burden.
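
As a minimal illustration of the analytical route described above, the sketch below computes the stationary distribution of a Markov chain over monomorphic states in the small-mutation limit, for a discretised Ultimatum Game in which a strategy is a pair (p, q): the offer made as proposer and the acceptance threshold used as responder. It assumes pairwise-comparison (Fermi) imitation dynamics and illustrative values for the grid granularity D, population size Z and selection strength BETA; these modelling choices and parameter values are assumptions for the sketch, not the paper's exact setup.

```python
import itertools
import numpy as np

# Illustrative parameters (not taken from the paper): grid granularity D,
# population size Z, and selection strength BETA.
D, Z, BETA = 10, 50, 1.0
GRID = [i / D for i in range(D + 1)]
STRATEGIES = list(itertools.product(GRID, GRID))  # all (p, q) pairs
N = len(STRATEGIES)

def payoff(a, b):
    """Payoff of strategy a = (p_a, q_a) against b = (p_b, q_b), each agent
    playing once as proposer and once as responder; the endowment is
    normalised to 1 and an offer is accepted if it meets the threshold q."""
    pa, qa = a
    pb, qb = b
    gain = 0.0
    if pa >= qb:              # a's proposal is accepted by b
        gain += 1.0 - pa
    if pb >= qa:              # b's proposal is accepted by a
        gain += pb
    return gain

def fixation_probability(mutant, resident):
    """Probability that a single mutant takes over a resident population of
    size Z under pairwise-comparison (Fermi) imitation dynamics."""
    total, prod = 1.0, 1.0
    for k in range(1, Z):     # k = current number of mutants
        f_mut = ((k - 1) * payoff(mutant, mutant)
                 + (Z - k) * payoff(mutant, resident)) / (Z - 1)
        f_res = (k * payoff(resident, mutant)
                 + (Z - k - 1) * payoff(resident, resident)) / (Z - 1)
        prod *= np.exp(-BETA * (f_mut - f_res))   # ratio T-(k)/T+(k)
        total += prod
    return 1.0 / total

# Small-mutation limit: the population is almost always monomorphic, so the
# dynamics reduce to a Markov chain over the N pure strategies whose
# transition probabilities are fixation probabilities of single mutants.
M = np.zeros((N, N))
for i, res in enumerate(STRATEGIES):
    for j, mut in enumerate(STRATEGIES):
        if i != j:
            M[i, j] = fixation_probability(mut, res) / (N - 1)
    M[i, i] = 1.0 - M[i].sum()

# The stationary distribution is the left eigenvector of M for eigenvalue 1.
vals, vecs = np.linalg.eig(M.T)
stat = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
stat /= stat.sum()

# Average offer weighted by the stationary distribution: a simple proxy for
# the level of fairness sustained in the long run.
avg_offer = sum(w * s[0] for w, s in zip(stat, STRATEGIES))
print(f"average offer in the small-mutation limit: {avg_offer:.3f}")
```

Because the transition matrix is built from fixation probabilities rather than from stochastic simulation runs, the long-run frequency of every strategy (and hence the average offer) follows from a single eigenvector computation, which is what makes this analytical route cheaper than large-scale agent-based simulations.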