Christopher Amato

Assistant Professor
 Department of Computer Science
University of New Hampshire

camato at cs dot unh dot edu

Home | Publications | Research | Robot videos | Talks | Teaching | Contact

Research: Also check out the robot videos

Optimal coordination of decentralized agents

Coordination under uncertainty is difficult, even in cooperative systems. When communication is unreliable, slow or costly, agents must rely on local information to make decisions. These cooperative systems can be modeled as decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs are very general and can represent agent teams with stochastic actions and partial information. One example is a team of vehicles, such as Mars rovers or reconnaissance drones, optimizing exploration while minimizing fuel and communication usage.
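As a rough sketch of what a Dec-POMDP contains, the standard tuple (agents, states, per-agent actions, joint transition and reward functions, and per-agent observations) can be written out as a simple data structure. The field names below are illustrative, not from any particular library:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Illustrative container for the standard Dec-POMDP tuple
# <I, S, {A_i}, T, R, {Omega_i}, O>; names are ours.
@dataclass
class DecPOMDP:
    agents: List[str]                  # I: the set of agents
    states: List[str]                  # S: the (hidden) world states
    actions: Dict[str, List[str]]      # A_i: each agent's action set
    observations: Dict[str, List[str]] # Omega_i: each agent's observations
    # T(s, joint_action, s') -> probability of transitioning to s'
    transition: Callable[[str, Tuple[str, ...], str], float]
    # R(s, joint_action) -> single shared team reward
    reward: Callable[[str, Tuple[str, ...]], float]
    # O(joint_action, s', joint_obs) -> probability of the joint observation
    observation_fn: Callable[[Tuple[str, ...], str, Tuple[str, ...]], float]
```

The key point the structure makes visible: the reward is shared (one team objective), but each agent only receives its own observations from Omega_i, so no agent can recover the state on its own.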

My research has made several contributions to Dec-POMDP solutions, including:
  • The first (and only) epsilon-optimal solution method for infinite-horizon Dec-POMDPs [JAIR 09]
  • Using centralized planning with decentralized execution to transform (finite-horizon) Dec-POMDPs into continuous-state MDPs, allowing powerful centralized (POMDP) methods to be used [IJCAI 13][JAIR 15]
  • An efficient dynamic programming algorithm that can automatically take advantage of state-space structure [ICAPS 09]
  • The most scalable optimal (top-down) heuristic search technique for finite-horizon Dec-POMDPs [JAIR 13]
These algorithms represent the most scalable bounded and optimal methods for Dec-POMDPs, showing that optimal methods can solve large, interesting problems.
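The centralized (POMDP) machinery referenced above rests on tracking a belief, a probability distribution over hidden states, updated by Bayes' rule after each action and observation. A minimal sketch of that update, with `T` and `O` as plain probability functions (illustrative names, not the papers' code):

```python
def belief_update(belief, action, observation, T, O, states):
    """One step of the Bayes filter used by centralized POMDP methods:
    b'(s') is proportional to O(o | a, s') * sum_s T(s' | s, a) * b(s).
    `belief` is a dict mapping state -> probability."""
    new_belief = {}
    for sp in states:
        # Predict: probability of landing in sp under the action taken.
        predicted = sum(T(s, action, sp) * belief[s] for s in states)
        # Correct: weight by how likely the received observation is in sp.
        new_belief[sp] = O(action, sp, observation) * predicted
    norm = sum(new_belief.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / norm for s, p in new_belief.items()}
```

In the Dec-POMDP-to-MDP transformation, the continuous "state" that centralized methods plan over is a distribution of this kind (over states and agents' observation histories), which is why standard POMDP solvers become applicable.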

Coordination in multi-robot systems

A large focus of my current work has been on coordination for teams of robots when there is sensor uncertainty and limited communication. Previous research has addressed parts of the problem or specific instances, but our approaches are the first general and scalable methods that consider uncertainty in the outcomes, sensors and communication.

My research has made several contributions to multi-robot coordination, including:
  • An asynchronous decision-making framework using macro-actions (i.e., actions that may require different amounts of time to complete, such as navigating to a waypoint or grasping an object) that can model any multi-robot coordination problem, as well as scalable algorithms [AAMAS 14]
  • Simulator-based versions of our algorithms that better fit multi-robot environments, and a demonstration in a multi-robot warehouse scenario that automatically optimizes navigation and communication actions [ICRA 15a]
  • Algorithms that solve continuous-state problems and automatically generate the macro-actions, as well as a demonstration on simulated and real package delivery problems with aerial vehicles [ICRA 15b]
  • More scalable algorithms that require only a very high-level model (or simulator) to optimize solutions, and a demonstration on a heterogeneous multi-robot logistics scenario (bartending) [RSS 15]
These approaches represent a new framework and viewpoint on multi-robot coordination that opens the door for a wide range of future research on problems ranging from search and rescue to cooperative mapping and surveillance.
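A macro-action in the sense used above can be sketched in the option style: a condition for where it may start, a low-level policy to follow, and a termination condition. Because different macro-actions finish after different numbers of low-level steps, agents naturally make decisions asynchronously. The field and function names below are illustrative:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MacroAction:
    name: str
    can_start: Callable[[object], bool]  # initiation condition on local state
    policy: Callable[[object], str]      # low-level action to take each step
    is_done: Callable[[object], bool]    # termination condition

def run_macro_action(macro, state, step, max_steps=100):
    """Execute a macro-action until it terminates; `step(state, action)`
    advances the environment. Returns the final state and the number of
    low-level steps taken -- which varies per macro-action, the source
    of asynchrony across agents."""
    assert macro.can_start(state)
    n = 0
    while not macro.is_done(state) and n < max_steps:
        state = step(state, macro.policy(state))
        n += 1
    return state, n
```

For example, a "navigate to waypoint" macro-action on a 1-D corridor terminates when the position reaches the waypoint, however many primitive moves that takes.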

Learning to coordinate through data and interaction

Generating models by hand is time-consuming, and the resulting models are often incorrect or incomplete. This is especially true in multiagent systems, where agents often require a model not only of the system, but also of the other agents. As a result, agents must be able to learn these models or learn a solution without using a model.

My work has developed scalable methods for learning in different settings:
  • Methods that learn from a small amount of data (generated from a simulator or real-world experience), which outperform hand-coded "expert" solutions [IJCAI 15]
  • Methods for online learning that allow agents to learn while acting in noisy, partially observable domains [AAAI 15]
These learning methods are widely applicable and scalable to large problems, but additional work is needed to produce near-optimal solutions that are robust to data limitations (e.g., having only a small amount of data that is very noisy).  
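To make the online, partially observable setting concrete, here is a toy sketch (not the papers' algorithms) of one standard trick: since the hidden state is unavailable, a model-free learner can condition its tabular Q-function on a truncated observation history instead. All names and the environment interface (`env_reset`, `env_step`) are assumptions for illustration:

```python
import random
from collections import defaultdict

def q_learning_over_histories(env_step, env_reset, actions,
                              episodes=500, horizon=10,
                              alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Q-learning where the 'state' is the recent observation history.
    `env_reset()` returns an initial observation; `env_step(a)` returns
    (observation, reward, done)."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # keyed by (history tuple, action)
    for _ in range(episodes):
        history = (env_reset(),)
        for _ in range(horizon):
            # Epsilon-greedy action selection over the current history.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: Q[(history, x)])
            obs, r, done = env_step(a)
            next_history = (history + (obs,))[-3:]  # keep a short memory
            best_next = max(Q[(next_history, x)] for x in actions)
            Q[(history, a)] += alpha * (r + gamma * best_next - Q[(history, a)])
            history = next_history
            if done:
                break
    return Q
```

Truncating the history keeps the table finite but loses information, one concrete reason why, as noted above, robustness to limited and noisy data remains an open problem.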

Optimizing agent performance with limited resources

When only a single agent is present or centralization is possible, POMDPs are a natural model for sequential decision-making with action and sensor uncertainty. In both POMDPs and Dec-POMDPs, solutions can become large and complicated, and searching through and representing these solutions can be computationally intractable.

My research has developed fixed-memory approaches that can often provide scalable, high quality solutions:
  • A fixed-memory formulation for (infinite-horizon) POMDPs which is optimal with respect to the fixed size and an optimization approach to approximate solutions  [IJCAI 07]
  • A fixed-memory formulation and solution method for Dec-POMDPs [UAI 07]
  • Additional improvements including more scalable solution methods [JAAMAS 09] and a technique that automatically discovers and exploits problem structure [AAAI 10]
These approaches provide concise solutions that balance solution quality and available resources, resulting in the best known performance in a number of common problems. 
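The fixed-memory idea can be pictured as a finite-state controller: the policy's "memory" is one of a fixed set of controller nodes, each node prescribes an action, and the observation received selects the next node. The sketch below is a deterministic variant for clarity (the papers optimize stochastic controllers), with illustrative names:

```python
def run_fsc(action_of, next_node, env_step, start_node, steps=20):
    """Execute a finite-state-controller policy.
    action_of: dict node -> action.
    next_node: dict (node, observation) -> node.
    env_step(action) -> (observation, reward)."""
    node = start_node
    total = 0.0
    for _ in range(steps):
        a = action_of[node]        # the node fully determines the action
        obs, r = env_step(a)
        total += r
        node = next_node[(node, obs)]  # observation drives the memory update
    return total
```

Because the controller has a fixed number of nodes, its size (and thus the memory the agent needs) is bounded regardless of how long it runs, which is what makes the quality/resource trade-off explicit.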

Improving scalability in algorithms for uncertain decentralized systems

Although finding optimal solutions for Dec-POMDPs can be difficult, we can often find high quality approximate solutions and some problems permit more scalability due to their structure. My research has developed:
  • A sample-based algorithm with bounded optimality for finite-horizon and goal-based Dec-POMDPs [AAMAS 09]
  • Scalable solution methods assuming additional structure is present in the form of independence in agent dynamics [UAI 12, AAMAS 13]
These methods solve significantly larger problems in terms of states and agents while retaining solution quality bounds.

Balancing time and accuracy in video surveillance

With the profusion of video data, automated surveillance and intrusion detection are becoming closer to reality. In order to provide a timely response while limiting false alarms, an intrusion detection system must balance resources (e.g., time) and accuracy.

We showed how such a system can be modeled with a partially observable Markov decision process (POMDP), representing possible computer vision filters and their costs in a way that is similar to human vision systems [IAAI 12].

Machine learning for video games

Video games provide a rich testbed for artificial intelligence methods. In particular, creating automated opponents that perform well in strategy games is a difficult task. For instance, human players rapidly discover and exploit the weaknesses of hard-coded strategies.

To build better strategies, we developed a reinforcement learning approach for learning a policy that switches between high-level strategies [AAMAS 10].
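The switching idea can be sketched as ordinary Q-learning where the "actions" are entire scripted strategies: at each decision point the learner picks which strategy to follow for the next interval of the game. The coarse game states, strategy names, and the `play_interval` environment hook below are all assumptions for illustration, not the paper's setup:

```python
import random

def learn_strategy_switching(strategies, play_interval, episodes=300,
                             alpha=0.2, epsilon=0.2, seed=0):
    """Tabular Q-learning over high-level strategies (undiscounted, for
    simplicity). `play_interval(state, strategy)` follows the chosen
    scripted strategy for one interval and returns (next_state, reward);
    an episode ends when it returns the state "end"."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        state = "early"
        while state != "end":
            qs = Q.setdefault(state, {s: 0.0 for s in strategies})
            # Epsilon-greedy choice among the fixed high-level strategies.
            if rng.random() < epsilon:
                strat = rng.choice(strategies)
            else:
                strat = max(qs, key=qs.get)
            next_state, r = play_interval(state, strat)
            next_q = max(Q.setdefault(next_state,
                                      {s: 0.0 for s in strategies}).values())
            qs[strat] += alpha * (r + next_q - qs[strat])
            state = next_state
    return Q
```

Because the learner only chooses among a handful of strategies rather than primitive game actions, the Q-table stays tiny while the resulting policy can still adapt to an opponent in a way no single fixed script can.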