3 Theoretical developments

Along with these technological developments there have been a series of theoretical developments that are critical for building large-scale artificial agents. Some have argued that theoretical developments are not that important, suggesting that standard backpropagation at a sufficiently large scale is enough to capture complex perceptual processing (Krizhevsky et al. 2012). That is, building brain-like models is more a matter of getting a sufficiently large computer with enough parameters and neurons than it is of discovering some new principles about how brains function. If this is true, then the technological developments that I pointed to in the previous section may be sufficient for scaling to sophisticated cognitive agents. However, I am not convinced that this is the case.

As a result, I think that theoretical developments in deep learning, nonlinear adaptive control, high-dimensional brain-like computing, and biological cognition combined will be important to support continued advances in understanding how the mind works. For instance, deep networks continue to achieve state-of-the-art results in a wide variety of perception-like processing challenges (http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#43494641522d3130). And while deep networks have traditionally been used for static processing, such as image classification or document classification, there has been a recent, concerted move to use them to model more dynamic perceptual tasks as well (Graves et al. 2013). In essence, deep networks are one among many techniques for modeling the statistics of time-varying signals, a skill central to animal cognition.
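To make the idea of "modeling the statistics of time-varying signals" concrete, here is a minimal, purely illustrative sketch: a small time-delay network (a feedforward network reading a sliding window of past samples) trained by gradient descent to predict the next sample of a noisy sine wave. All sizes, rates, and the signal itself are my own illustrative choices, not taken from any cited work; modern deep networks for dynamic tasks are of course far larger and typically recurrent.

```python
import numpy as np

# Illustrative sketch (not from any cited work): predict the next sample
# of a time-varying signal from a window of past samples, using a small
# one-hidden-layer network trained by plain gradient descent.
rng = np.random.default_rng(0)

t = np.arange(0, 200, 0.1)
signal = np.sin(t) + 0.05 * rng.standard_normal(t.size)
window = 10
X = np.array([signal[i:i + window] for i in range(signal.size - window)])
y = signal[window:]

W1 = 0.5 * rng.standard_normal((window, 20))  # input -> hidden weights
b1 = np.zeros(20)
W2 = 0.5 * rng.standard_normal(20)            # hidden -> output weights
b2 = 0.0
lr = 0.01

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(X)
mse0 = np.mean((pred0 - y) ** 2)   # error before training

for _ in range(500):
    h, pred = forward(X)
    err = pred - y                 # gradient of squared error w.r.t. pred
    gW2 = h.T @ err / len(y)
    gb2 = err.mean()
    dh = np.outer(err, W2) * (1 - h ** 2)   # backprop through tanh
    gW1 = X.T @ dh / len(y)
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred1 = forward(X)
mse1 = np.mean((pred1 - y) ** 2)   # error after training: should drop
print(mse0, mse1)
```

The point of the sketch is only that learning the statistics of a temporal signal reduces to ordinary function approximation once time is exposed to the network (here, via a delay window; recurrent networks expose it via internal state instead).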

However, animals are also incredibly adept at controlling nonlinear dynamical systems, including their bodies. That is, biological brains can generate time-varying signals that allow successful and sophisticated interactions with their environment through their body. Critically, there have been a variety of important theoretical advances in nonlinear and adaptive control theory as well. New methods for solving difficult optimal control problems have been discovered through careful study of biological motor control (Schaal et al. 2007; Todorov 2008). In addition, advances in hierarchical control allow for real-time computation of difficult inverse kinematics problems on a laptop (Khatib 1987). And, finally, important advances in adaptive control allow for the automatic learning of both kinematic and dynamic models even in highly nonlinear and high-dimensional control spaces (Cheah et al. 2006).
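To give a flavor of the kind of inverse kinematics problem mentioned above, here is a deliberately simple sketch, assuming a two-link planar arm with unit link lengths: the classic Jacobian-transpose iteration, which performs gradient descent on the end-effector error. The arm, gain, and target are my own illustrative choices; the operational-space methods cited above (Khatib 1987) are far more general.

```python
import numpy as np

# Illustrative sketch (assumed two-link planar arm, unit link lengths):
# solve inverse kinematics by the Jacobian-transpose method, i.e.,
# gradient descent on the squared end-effector error.
L1, L2 = 1.0, 1.0

def forward_kin(q):
    """End-effector (x, y) position for joint angles q = (q1, q2)."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q):
    """Partial derivatives of end-effector position w.r.t. joint angles."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def solve_ik(target, q_init=(0.3, 0.3), gain=0.1, iters=1000):
    q = np.array(q_init, dtype=float)
    for _ in range(iters):
        err = target - forward_kin(q)
        if np.linalg.norm(err) < 1e-6:
            break
        q += gain * jacobian(q).T @ err   # descend the squared error
    return q

target = np.array([1.2, 0.8])
q = solve_ik(target)
print(forward_kin(q))   # should be close to the target
```

Even this toy version illustrates why such problems are considered hard: the map from joint angles to hand position is nonlinear and redundant, so control must proceed by local, iterative correction rather than by direct inversion.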

Concurrently with these more abstract characterizations of brain function, there have been theoretical developments in neuroscience that have deepened our understanding of how biological neural networks may perform sophisticated information processing. Work using the Neural Engineering Framework (NEF) has resulted in a wide variety of spiking neural models that mirror data recorded from biological systems (Eliasmith & Anderson 1999, 2003). In addition, the closely related liquid computing (Maass et al. 2002) and FORCE learning (Sussillo & Abbott 2009) paradigms have been successfully exploited by a number of researchers to generate interesting dynamical systems that often closely mirror biological data. Together these kinds of methods provide quantitative characterizations of the computational power available in biologically plausible neural networks. Such developments are crucial for exploiting neuromorphic approaches to building brain-like hardware. And they suggest ways of testing some of the more abstract perceptual and control ideas in real-world, brain-like implementations.
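A minimal sketch can convey the NEF's core representational idea: a quantity is encoded in the heterogeneous tuning curves of a neural population, and both the quantity and functions of it are recovered by linear decoders found via least squares. For simplicity I use rectified-linear rate neurons with randomized parameters of my own choosing; the NEF proper typically uses spiking (e.g., LIF) neurons, and this sketch is illustrative rather than a faithful reproduction of any published model.

```python
import numpy as np

# Illustrative NEF-style sketch (assumed rectified-linear rate neurons,
# randomized gains/biases): encode a scalar x in a population, then
# solve for linear decoders by least squares.
rng = np.random.default_rng(1)
n_neurons = 50
encoders = rng.choice([-1.0, 1.0], n_neurons)    # preferred directions
gains = rng.uniform(0.5, 2.0, n_neurons)
biases = rng.uniform(-1.0, 1.0, n_neurons)

def rates(x):
    """Population firing rates for stimulus x."""
    return np.maximum(0.0, gains * encoders * x + biases)

# Sample the represented range and solve min_d |A d - f(x)|^2, where A
# holds the population tuning curves evaluated at the sample points.
xs = np.linspace(-1, 1, 100)
A = np.array([rates(x) for x in xs])
d_identity = np.linalg.lstsq(A, xs, rcond=None)[0]        # decode x
d_square = np.linalg.lstsq(A, xs ** 2, rcond=None)[0]     # decode x**2

x_hat = A @ d_identity    # population estimate of x
sq_hat = A @ d_square     # population estimate of x**2
print(np.max(np.abs(x_hat - xs)), np.max(np.abs(sq_hat - xs ** 2)))
```

The same decoders, once found, define connection weights to downstream populations, which is how the NEF turns distributed neural activity into the quantitative computations the paragraph above describes.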

Interestingly, several authors have suggested that difficult perceptual and control problems are in fact mathematical duals of one another (Todorov 2009; Eliasmith 2013). This means that there are deep theoretical connections between perception and motor control. This realization points to a need to think hard about how diverse aspects of brain function can be integrated into single, large-scale models. This has been a major focus of research in my lab recently. One result of this focus is Spaun, currently the world’s largest functional brain model. This model incorporates deep networks, recent control methods, and the NEF to perform eight different perceptual, motor, and cognitive tasks (Eliasmith et al. 2012). Importantly, this is not a one-off model, but rather a single example among many that employs a general architecture intended to directly address integrated biological cognition (Eliasmith 2013). Currently, the most challenging constraints for running models like Spaun are technological: computers are not fast enough. However, the neuromorphic technologies mentioned previously should soon remove these constraints. So, in some sense, theory currently outstrips application: we have individually tested several critical assumptions of the model and shown that they scale well (Crawford et al. 2013), but we are not yet able to integrate full-scale versions of the components due to limitations in current computational resources.

Taken together, these recent theoretical developments demonstrate, I believe, that we have a roadmap for how to approach the problem of building sophisticated models of biological cognition. No doubt not all of the methods we need are currently available, but it is not evident that there are any major conceptual roadblocks to building a cognitive system that rivals the flexibility, adaptability, and robustness of those found in nature. I believe this is a unique historical position. In the heyday of the symbolic approach to AI there were detractors who said that the perceptual problems solved easily by biological systems would be a challenge for the symbolic approach (Norman 1986; Rumelhart 1989). They were correct. In the heyday of connectionism there were detractors who said that standard approaches to artificial neural networks would not be able to solve difficult planning or syntactic processing problems (Pinker & Prince 1988; Fodor & Pylyshyn 1988; Jackendoff 2002). They were correct. In the heyday of statistical machine learning approaches (a heyday we are still in) there are detractors who say that mountains of data are not sufficient for solving the kinds of problems faced by biological cognitive systems (Marcus 2013). They are probably correct. However, as many of the insights of these various approaches are combined with control theory, integrated into models able to do efficient syntactic and semantic processing with neural networks, and, in general, become conceptually unified (Eliasmith 2013), it is less and less obvious what might be missing from our characterization of biological cognition.