See http://nengo.ca/build-a-brain/spaunvideos/ for recent movies of Spaun, implementing this work.
Of particular relevance to the present proposal, we have developed methods for simulating cognitive behaviour (Eliasmith, 2004; Stewart and Eliasmith, 2008; Stewart and Eliasmith, 2009), and have successfully applied them in preliminary work to the well-known Wason card task, which tests language-based reasoning (Eliasmith, 2005). These developments begin to extend our work on high-dimensional neural representation and dynamics (Eliasmith, 2005) to more cognitive domains. The overriding challenge that we now encounter, and the focus of this proposal, is scaling up both the theory and the simulations to tackle cognitive tasks in more demanding circumstances. We have begun preliminary work on the scaling of clean-up memory, a kind of associative memory necessary to implement our 'semantic pointer' architecture (Stewart et al., 2009; see Methods). In that work, we demonstrated how to construct a spiking neural network that can scale as well as an ideal associative memory, making this element of the architecture biologically plausible. Specifically, the network could clean up elements in an 8-part symbolic structure, using a vocabulary of 10,000 symbols, with 99% accuracy, using 100,000 neurons. To the best of our knowledge, past models have not employed vocabularies of this size, so these results are encouraging.
Our central objective is to build biologically detailed models of cognition. Building such models is crucial to improving our understanding of cognitive function in several ways, including: justifying or challenging assumptions currently made by cognitive modellers (Stewart and Eliasmith, 2009); applying neuroscientific constraints to cognitive models (Stewart and Eliasmith, 2009); and addressing cognitive functions overlooked by current cognitive models (Eliasmith, 2005).
Despite the potential benefits of constructing such biologically detailed simulations, there are no broadly accepted, systematic methods for relating biological data to our high-level understanding of cognitive systems. While our lab has had some success bridging the neural/cognitive gap in specific circumstances (Eliasmith, 2005), we have not yet demonstrated a consistent, scalable architecture that applies to a range of cognitive tasks.
There are several reasons why such methods are difficult to develop. First, it is challenging to relate realistic neural hardware to cognition without making highly implausible assumptions about the nature of the underlying hardware (see Pertinent Literature for current examples). Second, cognitive models are, by their very nature, large in scale. That is, they often recruit the activation of many areas of cortex. If those areas are to be simulated at the level of single neurons, computational constraints quickly make such simulations difficult in practice. Third, it is unclear that current neural-based architectures can, even in principle, scale beyond toy problems (see Pertinent Literature). Fourth, there are few systematic methods that allow for the design of large-scale neural simulations, even if computational problems are solved and the function of the system is identified. Finally, there have been few proposals about the functional architecture of the system that are detailed enough to implement in neurobiological simulations.
Short-term objectives: Elsewhere, we have argued at length that our Neural Engineering Framework (NEF) in principle provides a method for addressing the first four of these challenges (Eliasmith and Anderson, 2003, see Methods). However, in order to address these challenges successfully in the cognitive case it is important to establish a reasonable working hypothesis about the basic functional architecture underlying cognitive processing. For this reason, work on this project will be organized around the “semantic pointer hypothesis.” A simple statement of the hypothesis is as follows: “High-level cognitive functions in biological systems are made possible by semantic pointers. Semantic pointers are neural representations that carry partial semantic content and are composable into the complex structures necessary to support cognition. Semantic pointers are generated and used by perceptual and motor areas for deep semantic processing.”
Our key short-term objective is to show that this hypothesis can be realized in biologically realistic simulations which meet the first, third, and fourth challenges listed above (i.e., neurally plausible implementation, in-principle scalability, and systematic design). There are two aspects of this hypothesis that must be expanded in detail to meet these challenges. First, we must indicate how semantic information, even if partial, will be captured by the representations that we choose to identify as semantic pointers. Second, we must describe how to construct complex structures using 'semantic pointer' representations (see Methods). Indeed, in the conclusion to a recent review of his own and others' work on the issue of semantic processing, Barsalou (2009) states: “Perhaps the most pressing issue surrounding this area of work is the lack of well-specified computational accounts” (p. 1287). Ideally, this is where the semantic pointer hypothesis will contribute to our understanding of cognitive processing.
In the short term, we will determine if our approach can successfully address these challenges using neural simulations on small-scale problems. However, each of these problems, and the methods adopted, have been carefully chosen to allow for a significant increase in scale. Currently, we have chosen two cognitive tasks: the Wason card task (Eliasmith, 2005), and the Raven's Progressive Matrices (RPM; Rasmussen, 2009). These emphasize language processing (Wason, 1968) and general intelligence (Marshalek et al., 1983), respectively.
Concurrently, we will develop computational tools that allow us to scale up these preliminary models. Specifically, we will extend our Nengo simulation environment to run efficiently on a high-performance computing (HPC) infrastructure, such as Sharcnet (our preferred HPC consortium in Ontario). These extensions will result in parallelization of the network setup and temporal simulation algorithms (the two largest current bottlenecks).
Long-term objectives: Our long-term objective is to meet the second challenge identified above: large-scale simulation of a cognitive architecture. Again, we will tackle this problem both by developing the relevant computational tools and by extending our specific models to exploit those tools. Currently, the theory behind the NEF allows for a wide degree of flexibility in characterizing neural activity. It allows a model to be specified in terms of spiking neurons, non-spiking neurons, population activity, or just in terms of the relevant representations (i.e., with no neurons). Exploiting this flexibility is crucial for scaling cognitive models in practice. As a result, Nengo will be extended to incorporate this flexibility in a parallel environment. This will allow the user to control the degree of detail at which each object of the simulation (i.e., network, subnetwork, neural population, neuron, etc.) is run, while gaining the advantages of parallel implementation.
To exploit these tools, we will scale up the Wason and RPM models. Currently, these models use 100-dimensional vectors (see Methods). To scale up to our current clean-up memory (Stewart et al., 2009), they will need to employ 500-dimensional vectors. This will allow processing of 10,000 distinct symbols, approximating the vocabulary size of a 6-year-old (Anglin, 1993). In the context of the Wason task, this will allow for many more inferences to be tested across a wider variety of contexts, and for the representations used in the inferences to be more complex. For RPM, this scaling will allow the system to take the same test administered to human subjects, aiding a straightforward comparison of the model and human data. In both cases, determining when and how the system fails and succeeds will test the appropriateness of the underlying representational and architectural hypotheses.
Currently, there are no neural cognitive models with this size of lexicon. Establishing such ambitious targets should significantly advance our understanding of how neural resources can be used to account for cognitive behaviour. It will also allow for a more direct comparison of the model with the many fMRI and ERP experiments performed during cognitive tasks, because the scale of the activity generated by the model will be comparable to the collected data. In short, scaling up models should both challenge current theory and more convincingly contact available data.
Perhaps the best known cognitive model is ACT-R (Anderson et al., 2004). This is a hybrid model that implements a classical cognitive architecture (Fodor and Pylyshyn, 1988) by combining a production system with procedural learning and a neurally-inspired model of declarative memory. Despite extensive success at modelling a wide variety of cognitive tasks, there is no characterization of the neural processing underlying the model, and there is strong evidence that including more neurally plausible components in the architecture would allow it to better account for human data (van Maanen, 2009). In short, ACT-R is an ideal point of comparison for the current state-of-the-art in cognitive modelling, as well as being useful as a point of departure for demonstrating the benefits of neural modelling.
A recent, more neurally-based cognitive approach is the LISA model (Hummel and Holyoak, 2003). LISA is also an implementation of a classical architecture: all elements of the structures are explicitly tokened whenever a structure is tokened. In LISA, each population represents only one object, subproposition, or proposition. Unfortunately, this results in exponentially poor scaling and unrealistic resource demands (Stewart and Eliasmith, 2009). More recently, van der Velde and de Kamps (2006) introduced the neural blackboard architecture (NBA) as a means of modelling cognition. To avoid the exponential growth in resources attributable to LISA, the NBA uses a smaller set of “cell assemblies” that represent basic symbols. Larger structures are then built by binding these assemblies together using a highly intricate system of neural gates. While better than LISA, the NBA still scales poorly, and introduces a complex and brittle control system into the architecture. In addition to these theoretical scaling issues, both the NBA and LISA, in practice, avoid important biological constraints. Neither approach uses spiking neurons, enforces local connectivity constraints, or has neural evidence for its assumed binding mechanism (Stewart and Eliasmith, 2009).
Methods and proposed approach
The three main methods that we will exploit to realize these goals are the NEF for neural simulation, a vector symbolic architecture (VSA) for binding, and a semantic pointer architecture for realizing cognitive processing.
The NEF: Over the past several years, we have developed a neural modelling approach that permits the construction of large-scale models (Eliasmith and Anderson, 2003). The size of these models is not limited for two main reasons: 1) because we provide a general mathematical description of a neural subsystem for which the inputs and the outputs are the same (i.e., trains of neural spikes); and 2) because our approach permits the analytical calculation of connection weight matrices. As a result, many subsystems can be concatenated to form complex models (e.g., our Wason model has 9 such subsystems), without the need for learning. Furthermore, because there are no assumptions made about the form of the neural nonlinearity (i.e., spike generator), our methods permit the inclusion of very high degrees of neural realism (e.g., using conductance-based single neuron models), or not, depending on the research question at hand. Finally, because representation in the NEF is described in terms of general vector spaces, the size of a vector space does not adversely influence the application of the theory. Consequently, if we can characterize a cognitive function as a transformation of vector spaces (a very general kind of description), the NEF provides a method to map that function onto a neural substrate, incorporating known neurological constraints (e.g., tuning curves, connection constraints, neurotransmitter type, etc.). Thus, to construct large-scale models of cognition, we need to express the relevant functions in such a vector space.
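As a rough illustration of what "analytical calculation of connection weight matrices" can look like, the following is a minimal sketch rather than the Nengo implementation. It assumes a one-dimensional represented value, rectified-linear rate neurons in place of spiking neurons, and randomly chosen gains, biases, and encoders; the function name `nef_decoders` and all parameter values are hypothetical. Decoders are found by regularised least squares, and the weights for a connection computing the identity function are the outer product of the postsynaptic encoders (scaled by gain) and the presynaptic decoders.

```python
import numpy as np

def nef_decoders(n_neurons=200, n_samples=400, noise_sd=0.1, seed=0):
    """Solve for linear decoders of a 1-D value from rate-neuron activities."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_samples)            # values to be represented
    encoders = rng.choice([-1.0, 1.0], n_neurons)    # preferred directions in 1-D
    gains = rng.uniform(0.5, 2.0, n_neurons)         # assumed gain range
    biases = rng.uniform(-1.0, 1.0, n_neurons)       # assumed bias range
    # Rectified-linear tuning curves (an assumption; any nonlinearity could be used).
    activities = np.maximum(0.0, gains * encoders * x[:, None] + biases)
    # Regularised least squares: d = (A^T A + reg * I)^-1 A^T x
    reg = (noise_sd * activities.max()) ** 2 * n_samples
    gram = activities.T @ activities + reg * np.eye(n_neurons)
    decoders = np.linalg.solve(gram, activities.T @ x)
    return x, activities, decoders, encoders, gains

x, activities, decoders, encoders, gains = nef_decoders()
x_hat = activities @ decoders                        # decoded estimate of x
rmse = float(np.sqrt(np.mean((x - x_hat) ** 2)))

# Weights for an identity connection; for brevity the postsynaptic population
# reuses the same gains and encoders as the presynaptic one.
weights = np.outer(gains * encoders, decoders)       # shape (post, pre)
print(f"decoding RMSE: {rmse:.4f}, weight matrix shape: {weights.shape}")
```

No learning rule appears anywhere in this sketch: the weight matrix follows directly from the tuning curves and the function to be computed, which is what allows subsystems to be chained.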
Vector Symbolic Architectures: The term “Vector Symbolic Architecture” was coined by Gayler (2003) to describe a class of closely related approaches to encoding syntactic structure using distributed, vector representations (Smolensky, 1990; Plate, 1991; Kanerva, 1994). In order to construct structured representations, VSAs define two operations. The first is a binding operation, which is a kind of vector product ($\otimes$). The second operation is a merging operation, which is vector superposition ($+$). Unlike binding, the result of merging vectors is similar to both of the merged vectors. The similarity of vectors is determined by a metric on the vector space, usually the inner product.
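The contrast between the two operations can be checked numerically. The sketch below assumes circular convolution as the binding operation (the choice made in Plate's holographic reduced representations; other VSAs use different products) and random 512-dimensional vectors. The helper names `bind` and `cosine` are illustrative only.

```python
import numpy as np

def bind(a, b):
    """Bind two vectors by circular convolution, computed via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def cosine(a, b):
    """Normalised inner product, used here as the similarity measure."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
dim = 512                                        # assumed dimensionality
a = rng.normal(0.0, 1.0 / np.sqrt(dim), dim)
b = rng.normal(0.0, 1.0 / np.sqrt(dim), dim)

bound = bind(a, b)     # binding: result should be dissimilar to both inputs
merged = a + b         # merging: result should stay similar to both inputs

sim_bound_a = cosine(bound, a)
sim_merged_a = cosine(merged, a)
print(f"similarity(bound, a)  = {sim_bound_a:.3f}")
print(f"similarity(merged, a) = {sim_merged_a:.3f}")
```

For random vectors of this size, the merged vector should have a similarity to each input of roughly 0.7, while the bound vector should be close to orthogonal to both.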
The binding and merging operations can be used to construct structured representations, for example, of “The dog chased the boy,” or chased(dog, boy). To encode this in a VSA, vectors for each of the roles and each of the fillers are needed. Then, to encode such a proposition as $P$, we perform the following calculation: $P = \mathit{verb} \otimes \mathit{chase} + \mathit{agent} \otimes \mathit{dog} + \mathit{theme} \otimes \mathit{boy}$. To make use of this representation, the terms can be unbound. This is done by binding with the inverse of a term. For example, to decode the $\mathit{agent}$ from the structure, given only $P$, we bind $P$ with the inverse of $\mathit{agent}$, and the result is approximately equal to $\mathit{dog}$. Unlike classical architectures, VSAs rely on 'reduced' representations. In such representations, the output of the vector binding operation does not explicitly include the bound components. As a result, the unbound elements must be recognized in the face of noise: i.e., they must be cleaned up. This is why clean-up memories are crucial to the proper functioning of VSAs, and why we have been exploring their implementation in spiking neurons (Stewart et al., 2009).
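The encode, unbind, and clean-up cycle described here can be sketched in a few lines. This is an idealised, non-neural illustration: binding is assumed to be circular convolution, the inverse is Plate's approximate inverse (an index reversal), the clean-up memory is a plain nearest-neighbour lookup instead of a spiking network, and the role and filler names (`verb`, `agent`, `theme`, `chase`, and so on) are chosen purely for illustration.

```python
import numpy as np

def bind(a, b):
    """Circular convolution via the FFT (assumed binding operation)."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def approx_inverse(a):
    """Approximate inverse for circular convolution: a*[i] = a[-i mod n]."""
    return np.concatenate(([a[0]], a[:0:-1]))

def clean_up(noisy, vocab):
    """Idealised clean-up memory: return the most similar vocabulary item."""
    return max(vocab, key=lambda name: float(noisy @ vocab[name]))

rng = np.random.default_rng(2)
dim = 512                                        # assumed dimensionality
names = ["verb", "agent", "theme", "chase", "dog", "boy", "cat", "girl"]
vocab = {}
for name in names:
    v = rng.normal(size=dim)
    vocab[name] = v / np.linalg.norm(v)          # random unit vectors

# Encode chased(dog, boy) as a sum of role-filler bindings.
proposition = (bind(vocab["verb"], vocab["chase"])
               + bind(vocab["agent"], vocab["dog"])
               + bind(vocab["theme"], vocab["boy"]))

# Unbind the agent role; the result is a noisy version of the filler.
noisy = bind(proposition, approx_inverse(vocab["agent"]))
decoded = clean_up(noisy, vocab)
similarity_to_dog = float(noisy @ vocab["dog"]) / float(np.linalg.norm(noisy))
print(decoded, round(similarity_to_dog, 3))
```

The unbound vector is only partially similar to the stored filler, which is why the final nearest-neighbour step is needed; with these settings the lookup should recover the filler bound to the agent role.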
Because the NEF maps vector transformations onto neural function, it is a natural method for examining the biological plausibility of VSAs (Eliasmith, 2004). Furthermore, VSAs can be used to compose semantic pointers in a manner suitable for supporting language-like processing. Consequently, our past work has provided proof-of-concept implementations of VSAs, and hence the neural composability of semantic pointers (Stewart and Eliasmith, 2009; Eliasmith, 2005). However, the vectors we have employed are low-dimensional and randomly generated, thus unable to effectively reflect semantic relationships.
Semantic Pointers: It has been suggested for some time that a vector space is a natural way to represent semantic relationships in a cognitive system. However, there is concern that such spaces are not sufficiently sophisticated to represent the kinds of complex representational structures that underlie cognition (Barsalou, 1999). This is where the notion of a “pointer” is of crucial importance. In computer science, a pointer is a set of numbers that indicates the memory address of a piece of information. Notably, a pointer and the information contained at its address are arbitrarily related. This, however, does not describe linguistic representation well, hence we suggest that 'cognitive' pointers are (shallowly) semantic. Nevertheless, the notion of a 'pointer' does provide the insight that the manipulation of compact, address-like representations can provide great efficiency when building a flexible architecture.
Recent empirical evidence is consistent with this notion of semantic pointers. For instance, Solomon and Barsalou (2004) demonstrated that pairings of target words and properties can result in significant differences in response times to determining if a property belongs to a target word. Specifically, false pairings that were lexically associated took longer to process than those that were not. This suggests that the semantics of the pointer were sufficient in the semantically easy cases, but insufficient in the hard cases, forcing the pointer to be 'de-referenced' in semantic memory, taking extra time. A similar result was reported in an fMRI experiment carried out by Kan et al. (2003). There, activation in semantically rich perceptual systems was only present in the difficult cases, while activation of frontal areas was evident in all cases. This work demonstrates that deep processing is not needed when a simpler word association (i.e., shallow semantic) strategy is sufficient to complete the task.
Given this functional characterization of semantic pointers, the remaining issue is how such pointers are generated. Given their semantic content, we assume that they are generated on the basis of the full semantics of a lexical item. These semantics are, arguably, found largely in perceptual processing areas (Barsalou, 2009). Consequently, we adopt hierarchical, generative statistical models of perception (Hinton, 2007), from which we can extract 'compressed' representations (Eliasmith, 2007); i.e., semantic pointers. In such models, a higher level in the processing hierarchy attempts to build a statistical model of the level below it. Taken together, the levels define a model of the original input data. This kind of hierarchical structure naturally allows the progressive generation of more complex features at higher levels, and progressively captures higher-order correlations in the data. Higher levels have fewer nodes than lower levels, so the highest level of the hierarchy can act as a compressed version of the state of the perceptual cortex, i.e., it can be a semantic pointer. To de-reference such a pointer, the activity of the highest level can be fixed, acting as a statistical prior on the activity of lower levels, allowing the entire network to sample the distribution (i.e., fill in the semantic details).
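The compress and de-reference cycle can be illustrated with a deliberately simplified stand-in. The sketch below does not implement a hierarchical generative model of the kind cited above; it swaps in a single linear layer found by principal component analysis, applied to synthetic data, purely to show a low-dimensional code acting as a pointer from which the detailed state is approximately regenerated. All names and sizes (`compress`, `dereference`, 64 "perceptual" units, 5-dimensional pointers) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n_items, n_latent, n_units, pointer_dim = 200, 5, 64, 5

# Synthetic "perceptual" states: low-dimensional structure plus a little noise.
latent = rng.normal(size=(n_items, n_latent))
mixing = rng.normal(size=(n_latent, n_units))
data = latent @ mixing + 0.05 * rng.normal(size=(n_items, n_units))

# Fit the linear stand-in: principal components of the centred data.
mean = data.mean(axis=0)
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
components = vt[:pointer_dim]                    # shape (pointer_dim, n_units)

def compress(state):
    """Map a full perceptual state to a compact 'pointer'."""
    return (state - mean) @ components.T

def dereference(pointer):
    """Regenerate an approximate perceptual state from a pointer."""
    return pointer @ components + mean

pointers = compress(data)                        # shape (n_items, pointer_dim)
reconstruction = dereference(pointers)
rel_error = float(np.linalg.norm(reconstruction - data) / np.linalg.norm(data))
print(f"pointer size: {pointers.shape[1]} of {n_units} units, "
      f"relative reconstruction error: {rel_error:.3f}")
```

In a generative hierarchy, de-referencing would involve sampling lower levels given the fixed top level, not a deterministic linear projection; the sketch only conveys the asymmetry between a compact code that is cheap to manipulate and a rich state that is recovered on demand.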
In applications to vision, these models have been shown to generate neuron tuning curves that mimic those seen in visual cortex (Lee et al., 2007). As well, many of the most actively researched models of biological object recognition can be seen as constructing exactly these kinds of statistical models (e.g., Riesenhuber and Poggio, 1999). Consequently, we are actively developing a simple visual model of this class to generate 'grounded' semantic pointers for the visual processing in the RPM task.
In sum, we are proposing to use the neural modelling methods of the NEF to implement a VSA that can appropriately process vectors in syntactic structures, where the surface semantics of those vectors are determined by the highest level of a hierarchical statistical model of perceptual systems. We have had preliminary success with these methods, and have chosen them due to their in-principle ability to scale well in a detailed neural simulation. We will extend our Nengo modelling environment to evaluate the scalability of these methods, and bring them in closer contact with empirical evidence.