Interruptible coroutines

The engine that executes agents is designed to make agent code easy to write and to read. One way it does this is by providing coroutines. However, agents in real-time games must also respond to events, so the API must be event-driven as well.

We call our solution interruptible coroutines. They are similar to the interruptible iterators described by Liu, Kimball, and Myers for the ACM, but are a bit more functional for our specific needs. Here is how they work:

Consider the task of extracting any unconscious people from an evacuated building, using four actors. The actors must spread out to cover the whole area as fast as possible, but when a person is discovered, two actors must work together to carry the person out. Each actor provides three functions ("actions") that its agent may invoke, plus a special abort function:

  • go_to_room (run to the specified room and enter it)
  • search_room (inspect the present room until it is proven empty or until a person is found)
  • evacuate (with one other actor, carry the specified person out of the building)
  • abort (stop what you're doing)

Here's the code for one possible agent. (Functions referenced but not defined here do what their names indicate, but their implementations are not relevant to this demonstration.)

    function run() {
        room = determine_closest_unsearched_room()
        while (room != null) {
            try {
                actor.go_to_room(room)
                person = actor.search_room()
                if (person != null) {
                    alert_all_agents(person, room)
                }
            } catch (alert) {
                actor.go_to_room(alert.room)
                actor.evacuate(alert.person)
            }
            room = determine_closest_unsearched_room()
        }
    }

    function receive_alert(person, room) {
        nearby_actors = rank_actors_by_proximity_to(room)
        if (actor == nearby_actors[0] || actor == nearby_actors[1]) {
            actor.abort(person, room)
        }
    }

When the agent calls an action, two API features come into play. They complement each other nicely:

1. Coroutine - Each action takes time (as simulated by the model) to complete. When agent A invokes the action go_to_room(room), the engine stops executing A's code for as long as it takes the model to simulate the actor going to the room. During that time the engine selects another agent and executes its code. If all agents are in the middle of actions, the engine waits for an action to complete.

When the simulation finishes executing agent A's action go_to_room(room), the engine resumes executing A's code, starting with the next line, search_room(), and stops again when that action begins.
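As a rough illustration of this suspend-and-resume behavior (not the engine's actual implementation), the sketch below models each agent as a JavaScript generator. Everything in it is an assumption made for the example: the names agent and runEngine, the idea that an action is requested by yielding an object, and the fixed duration values standing in for the time the model takes to simulate an action.

```javascript
// Hypothetical agent: each `yield` requests an action and suspends the agent
// until the (simulated) action has completed.
function* agent(name, travelTime, log) {
  log.push(`${name}: start go_to_room`);
  yield { action: "go_to_room", duration: travelTime };
  log.push(`${name}: start search_room`);
  yield { action: "search_room", duration: 2 };
  log.push(`${name}: done`);
}

// Hypothetical engine: runs an agent until it requests an action, then
// switches to whichever agent's action completes next.
function runEngine(agents) {
  let now = 0;          // simulated time
  const pending = [];   // agents currently in the middle of an action

  // Run an agent's code until it starts its next action (or finishes).
  const resume = (gen) => {
    const step = gen.next();
    if (!step.done) {
      pending.push({ gen, wakeAt: now + step.value.duration });
    }
  };

  agents.forEach(resume);
  while (pending.length > 0) {
    // Every remaining agent is mid-action: wait for the earliest completion.
    pending.sort((a, b) => a.wakeAt - b.wakeAt);
    const next = pending.shift();
    now = next.wakeAt;
    resume(next.gen);
  }
  return now;
}

const log = [];
const endTime = runEngine([agent("A", 3, log), agent("B", 1, log)]);
console.log(log.join("\n"));
console.log(`finished at t=${endTime}`);
```

In this sketch agent B, whose room is closer, begins searching while agent A is still travelling, even though each agent's code reads as a plain top-to-bottom sequence.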

The agent is in charge of its own behavior, and this kind of API reinforces that: the agent runs its own program, calls actions, and acts on the results. The alternative is inversion of control, where the agent implements a function and waits for the engine to call it whenever the engine wants to know what the agent will do next. We avoid inversion of control in this case because it makes the agent code harder to read.
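For contrast, here is a sketch of what part of the search agent might look like under inversion of control. It is purely illustrative: make_inverted_agent, next_action, the state names, and the found list are all hypothetical and not part of the API described here. The point is only that the agent has to keep its position in an explicit state variable, because it cannot simply pause in the middle of a function.

```javascript
// Hypothetical inverted-control agent: the engine calls next_action() each
// time the actor is idle, passing the result of the previous action.
function make_inverted_agent(rooms) {
  let state = "PICK_ROOM";   // explicit bookkeeping replaces the program counter
  let room = null;
  return {
    found: [],
    next_action(last_result) {
      switch (state) {
        case "PICK_ROOM":
          room = rooms.shift() ?? null;
          if (room === null) return null;       // no rooms left: agent is done
          state = "SEARCH";
          return { action: "go_to_room", room };
        case "SEARCH":
          state = "REPORT";
          return { action: "search_room" };
        case "REPORT":
          if (last_result != null) {
            this.found.push({ person: last_result, room });
          }
          state = "PICK_ROOM";
          return this.next_action(null);        // fall through to the next room
      }
    },
  };
}

// Hypothetical driver standing in for the engine and the model.
const demo = make_inverted_agent(["kitchen", "lab"]);
const requested = [];
let result = null;
for (let step = demo.next_action(result); step !== null; step = demo.next_action(result)) {
  requested.push(step.action);
  // Assumed model behavior: the second search (the lab) finds a person.
  result = step.action === "search_room" && requested.length === 4 ? "P1" : null;
}
console.log(requested, demo.found);
```

Even without any alert handling, the control flow is spread across the switch cases, whereas the coroutine version expresses the same logic as an ordinary loop.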

2. Event - In some cases – in our example, if a fellow agent finds a person – the agent does not want to wait for its current action to complete. It wants to be interrupted and get control back immediately upon important events. The agent execution engine enables this. When any agent calls alert_all_agents(person, room), the engine invokes receive_alert(person, room) on each agent. If that agent was executing search_room, for example, the execution of receive_alert is stacked on top of the search_room action, which is suspended.

During the newly invoked function, the agent may call actor.abort(person, room) before exiting. If it does not, search_room continues and the agent is resumed when it completes, as usual. If the agent does abort, the actions lower in the stack are all interrupted. They throw an exception, which in the example carries the person and room passed to abort; the agent catches the exception and carries out the task of evacuating the found person.
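The abort path can be approximated with a JavaScript generator's throw method, which raises an exception at the point where the generator is suspended. The sketch below is a simplification under stated assumptions: deliver and shouldAbort are hypothetical names, shouldAbort stands in for the agent's receive_alert handler deciding whether to call abort, and the stacking of the handler on top of the suspended action is not modeled.

```javascript
// Hypothetical agent, suspended at a `yield` while an action is in progress.
function* agent(log) {
  try {
    yield { action: "go_to_room", room: "lab" };
    yield { action: "search_room" };
    log.push("search finished normally");
  } catch (alert) {
    // Reached only when the engine interrupts the suspended action.
    log.push(`interrupted: heading to ${alert.room}`);
    yield { action: "go_to_room", room: alert.room };
    yield { action: "evacuate", person: alert.person };
  }
}

// Hypothetical engine-side delivery of an event to a suspended agent.
function deliver(gen, alert, shouldAbort) {
  if (shouldAbort(alert)) {
    // Abort: raise the alert at the suspended `yield`. The value returned is
    // the next action the agent requests from inside its catch block.
    return gen.throw(alert).value;
  }
  return undefined; // no abort: the current action simply continues
}

const log = [];
const gen = agent(log);
gen.next();   // agent requests go_to_room
gen.next();   // go_to_room completed; agent is now suspended in search_room
const next = deliver(gen, { person: "P1", room: "lobby" }, () => true);
console.log(log, next);
```

If shouldAbort returns false instead, deliver leaves the generator untouched, and the next ordinary resume continues after search_room as if no event had arrived.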

Combining these two features lets the engine suspend each agent for the duration of its actions, which shares computing resources among all friendly agents, while ensuring that an agent can regain control when important events occur.