Abstract: The speed–accuracy trade-off (SAT) is the tendency for fast decisions to come at the expense of accurate performance. Evidence accumulation models such as the drift diffusion model can reproduce a variety of behavioral data related to the SAT, and their parameters have been linked to neural activity in the brain. However, our understanding of how biological neural networks realize the associated cognitive operations remains incomplete, limiting our ability to unify neurological and computational accounts of the SAT. We address this gap by developing and analyzing a biologically plausible spiking neural network that extends the drift diffusion approach. We apply our model to both perceptual and nonperceptual tasks, investigate several contextual manipulations, and validate model performance using neural and behavioral data. Behaviorally, we find that our model (a) reproduces individual response time distributions; (b) generalizes across experimental contexts, including the number of choice alternatives, speed or accuracy emphasis, and task difficulty; and (c) predicts accuracy data, despite being fit only to response time data. Neurally, we show that our model (a) recreates observed patterns of spiking neural activity and (b) captures age-related deficits that are consistent with the behavioral data. More broadly, our model exhibits the SAT across a variety of tasks and contexts and explains how individual differences in speed and accuracy arise from synaptic weights within a spiking neural network. Our work showcases a method for translating mathematical models into functional neural networks and demonstrates that simulating such networks permits analyses and predictions that are outside the scope of purely mathematical models.
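For readers unfamiliar with the underlying mathematical model, the following is a minimal NumPy sketch of a classical two-boundary drift diffusion process (not the authors' spiking network), illustrating how lowering the decision threshold trades accuracy for speed; all parameter values are illustrative.

```python
import numpy as np

def simulate_ddm(drift=0.3, threshold=1.0, noise=1.0, dt=1e-3, max_t=5.0, rng=None):
    """Simulate one trial of a two-choice drift diffusion process.

    Returns (response_time, choice), where choice is +1/-1 for the
    upper/lower boundary, or (max_t, 0) if no boundary is reached.
    """
    if rng is None:
        rng = np.random.default_rng()
    x, t = 0.0, 0.0
    while t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if abs(x) >= threshold:
            return t, np.sign(x)
    return max_t, 0

# Speed emphasis (low threshold) yields faster but less accurate responses:
rng = np.random.default_rng(0)
for bound in (0.6, 1.4):
    trials = [simulate_ddm(threshold=bound, rng=rng) for _ in range(2000)]
    rts = [t for t, c in trials]
    acc = np.mean([c > 0 for _, c in trials])  # drift > 0, so +1 is correct
    print(f"threshold={bound}: mean RT={np.mean(rts):.2f}s, accuracy={acc:.2f}")
```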
Abstract: The amygdala (AMY) is widely implicated in fear learning and fear behaviour, but it remains unclear how the many biological components present within AMY interact to achieve these abilities. Building on previous work, we hypothesize that individual AMY nuclei represent different quantities and that fear conditioning arises from error-driven learning on the synapses between AMY nuclei. We present a computational model of AMY that (a) recreates the divisions and connections between AMY nuclei and their constituent pyramidal and inhibitory neurons; (b) accommodates scalable high-dimensional representations of external stimuli; (c) learns to associate complex stimuli with the presence (or absence) of an aversive stimulus; (d) preserves feature information when mapping inputs to salience estimates, such that these estimates generalize to similar stimuli; and (e) induces a diverse profile of neural responses within each nucleus. Our model predicts (1) defensive responses and neural activities in several experimental conditions, (2) the consequence of artificially ablating particular nuclei and (3) the tendency to generalize defensive responses to novel stimuli. We test these predictions by comparing model outputs to neural and behavioural data from animals and humans. Despite the relative simplicity of our model, we find significant overlap between simulated and empirical data, which supports our claim that the model captures many of the neural mechanisms that support fear conditioning. We conclude by comparing our model to other computational models and by characterizing the theoretical relationship between pattern separation and fear generalization in healthy versus anxious individuals.
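As an illustration of the error-driven learning hypothesized here, the sketch below implements a simple delta-rule (PES-style) update that associates a high-dimensional stimulus vector with the presence of an aversive stimulus. The dimensionality, learning rate, and vector representation are assumptions for illustration, not the model's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
d_stim, kappa = 64, 5e-2            # stimulus dimensionality, learning rate (assumed)
decoders = np.zeros(d_stim)         # synaptic weights from stimulus to salience

def salience(stim):
    return decoders @ stim          # the model's aversiveness estimate

cs = rng.standard_normal(d_stim) / np.sqrt(d_stim)   # conditioned stimulus
for _ in range(200):                # pairing trials: aversive outcome, target = 1
    error = salience(cs) - 1.0      # prediction minus actual outcome
    decoders -= kappa * error * cs  # delta-rule (PES-style) weight update

# The learned estimate generalizes to a perturbed version of the stimulus:
similar = cs + 0.3 * rng.standard_normal(d_stim) / np.sqrt(d_stim)
print(f"trained: {salience(cs):.2f}, similar: {salience(similar):.2f}")
```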
Abstract: With the advancement of large pretrained language models (PLMs), many question answering (QA) benchmarks have been developed to evaluate the reasoning capabilities of these models. Augmenting PLMs with external knowledge in the form of Knowledge Graphs (KGs) has been a popular method to improve their reasoning capabilities, and a common way to reason over KGs is to use Graph Neural Networks (GNNs). As an alternative to GNNs for augmenting PLMs, we propose a novel graph reasoning module using Vector Symbolic Algebra (VSA) graph representations and a k-layer MLP. We demonstrate that our VSA-based model performs as well as QA-GNN, a model combining a PLM and a GNN module, on 3 multiple-choice question answering (MCQA) datasets. Our model has a simpler architecture than QA-GNN and also converges 39% faster.
Abstract: Social interaction is one of humanity's defining features. Through it, we develop ideas, express emotions, and form relationships. In this thesis, we explore the topic of social cognition by building biologically-plausible computational models of learning and decision making. Our goal is to develop mechanistic explanations for how the brain performs a variety of social tasks, to test those theories by simulating neural networks, and to validate our models by comparing to human and animal data. We begin by introducing social cognition from functional and anatomical perspectives, then present the Neural Engineering Framework, which we use throughout the thesis to specify functional brain models. Over the course of four chapters, we investigate many aspects of social cognition using these models. We begin by studying fear conditioning using an anatomically accurate model of the amygdala. We validate this model by comparing the response properties of our simulated neurons with real amygdala neurons, showing that simulated behavior is consistent with animal data, and exploring how simulated fear generalization relates to normal and anxious humans. Next, we show that biologically-detailed networks may realize cognitive operations that are essential for social cognition. We validate this approach by constructing a working memory network from multi-compartment cells and conductance-based synapses, then show that its mnemonic performance is comparable to animals performing a delayed match-to-sample task. In the next chapter, we study decision making and the tradeoffs between speed and accuracy: our network gathers information from the environment and tracks the value of choice alternatives, making a decision once certain criteria are met. We apply this model to a two-choice decision task, fit model parameters to recreate the behavior of individual humans, and reproduce the speed-accuracy tradeoff evident in the human population. Finally, we combine our networks for learning, working memory, and decision making into a cognitive agent that uses reinforcement learning to play a simple social game. We compare this model with two other cognitive architectures and with human data from an experiment we ran, and show that our three cognitive agents recreate important patterns in the human data, especially those related to social value orientation and cooperative behavior. Our concluding chapter summarizes our contributions to the field of social cognition and proposes directions for further research. The main contribution of this thesis is the demonstration that a diverse set of social cognitive abilities may be explained, simulated, and validated using a functionally-descriptive, biologically-plausible theoretical framework. Our models lay a foundation for studying increasingly-sophisticated forms of social cognition in future work.
Abstract: The Neural Engineering Framework (NEF; Eliasmith & Anderson, 2003) is a long-standing method for implementing high-level algorithms constrained by low-level neurobiological details. In recent years, this method has been expanded to incorporate more biological details and applied to new tasks. This paper brings together these ongoing research strands, presenting them in a common framework. We expand on the NEF’s core principles of (a) specifying the desired tuning curves of neurons in different parts of the model, (b) defining the computational relationships between the values represented by the neurons in different parts of the model, and (c) finding the synaptic connection weights that will cause those computations and tuning curves. In particular, we show how to extend this approach to include complex spatiotemporal tuning curves, and then apply it to produce functional computational models of grid cells, time cells, path integration, sparse representations, probabilistic representations, and symbolic representations in the brain.
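The three principles can be stated compactly in code. Below is a minimal NumPy sketch of the standard NEF procedure using rate neurons: define tuning curves, sample activities over the represented space, and solve a regularized least-squares problem for decoders that compute a desired function (here x², chosen arbitrarily). All sizes and parameter distributions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 1                                    # neurons, represented dimensions
encoders = rng.choice([-1.0, 1.0], size=(n, d))  # preferred directions
gains = rng.uniform(0.5, 2.0, n)
biases = rng.uniform(-1.0, 1.0, n)

def rates(x):
    """Rectified-linear tuning curves (principle a: representation)."""
    j = gains * (x @ encoders.T) + biases
    return np.maximum(j, 0.0)

# Principles b/c: solve for decoders that compute f(x) = x**2 from activities.
xs = np.linspace(-1, 1, 200)[:, None]
A = rates(xs)                                    # activities at sample points
target = xs ** 2
reg = 0.1 * A.max()                              # regularize against noise
decoders = np.linalg.solve(A.T @ A + reg**2 * len(xs) * np.eye(n), A.T @ target)

# Full connection weights would be outer(encoders_post, decoders_pre).
print(np.max(np.abs(A @ decoders - target)))     # small approximation error
```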
Abstract: In this paper, we present a fully spiking neural network running on Intel’s Loihi chip for operational space control of a simulated 7-DOF arm. Our approach uniquely combines neural engineering and deep learning methods to successfully implement position and orientation control of the end effector. The development process involved four stages: 1) designing a node-based network architecture implementing an analytical solution; 2) developing rate neuron networks to replace the nodes; 3) retraining the network to handle spiking neurons and temporal dynamics; and finally 4) adapting the network for the specific hardware constraints of the Loihi. We benchmark the controller on a center-out reaching task, using the deviation of the end effector from the ideal trajectory as our evaluation metric. The RMSE of the final neuromorphic controller running on Loihi is only slightly worse than that of the analytic solution, differing by 4.13%.
Abstract: To navigate in new environments, an animal must be able to keep track of its position while simultaneously creating and updating an internal map of features in the environment, a problem formulated as simultaneous localization and mapping (SLAM) in the field of robotics. This requires integrating information from different domains, including self-motion, sensory, and semantic information. Several specialized neuron classes have been identified in the mammalian brain as being involved in solving SLAM. While biology has inspired a whole class of SLAM algorithms, the use of semantic information has not been explored in such work. We present a novel, biologically plausible SLAM model called SSP-SLAM—a spiking neural network designed using tools for large-scale cognitive modeling. Our model uses a vector representation of continuous spatial maps, which can be encoded via spiking neural activity and bound with other features (continuous and discrete) to create compressed structures containing semantic information from multiple domains (e.g., spatial, temporal, visual, conceptual). We demonstrate that the dynamics of these representations can be implemented with a hybrid oscillatory-interference and continuous attractor network of head direction cells. The estimated self-position from this network is used to learn an associative memory between semantically encoded landmarks and their positions, i.e., an environment map, which is used for loop closure. Our experiments demonstrate that environment maps can be learned accurately and their use greatly improves self-position estimation. Furthermore, grid cell, place cell, and object vector cell responses emerge in this model. We also run our path integrator network on the NengoLoihi neuromorphic emulator to demonstrate feasibility for a full neuromorphic implementation for energy-efficient SLAM.
Abstract: Path integration, the ability to maintain an estimate of one's location by continuously integrating self-motion cues, is a vital component of the brain's navigation system. We present a spiking neural network model of path integration derived from a starting assumption that the brain represents continuous variables, such as spatial coordinates, using Spatial Semantic Pointers (SSPs). SSPs are a representation for encoding continuous variables as high-dimensional vectors, and can also be used to create structured, hierarchical representations for neural cognitive modelling. Path integration can be performed by a recurrently-connected neural network using SSP representations. Unlike past work, we show that our model can be used to continuously update variables of any dimensionality. We demonstrate that symbol-like object representations can be bound to continuous SSP representations. Specifically, we incorporate a simple model of working memory to remember environment maps with such symbol-like representations situated in 2D space.
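A minimal sketch of the SSP encoding itself may help: a continuous value is encoded by raising a fixed unitary base vector to a fractional power, which amounts to elementwise exponentiation of its Fourier coefficients. The construction below follows the published definition in spirit; the dimensionality and base-vector sampling are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512                                   # SSP dimensionality (illustrative)

def make_unitary(d, rng):
    """Base vector with unit-magnitude, conjugate-symmetric Fourier
    coefficients, so fractional powers stay real-valued in the time domain."""
    n_pos = (d - 1) // 2
    fc = np.ones(d, dtype=complex)
    fc[1:n_pos + 1] = np.exp(1j * rng.uniform(-np.pi, np.pi, n_pos))
    fc[-n_pos:] = np.conj(fc[n_pos:0:-1])
    return fc

X, Y = make_unitary(d, rng), make_unitary(d, rng)

def ssp(x, y):
    """Encode the 2D point (x, y): circular convolution of fractional powers
    X^x and Y^y is an elementwise product in the Fourier domain."""
    return np.fft.ifft(X**x * Y**y).real

# Similarity falls off smoothly with distance, giving a kernel over 2D space:
a = ssp(0.0, 0.0)
for dx in (0.0, 0.5, 1.0, 3.0):
    print(dx, np.dot(a, ssp(dx, 0.0)))
```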
Abstract: In this report we consider the following problem: Given a trained model that is partially faulty, can we correct its behaviour without having to train the model from scratch? In other words, can we “debug” neural networks similar to how we address bugs in our mathematical models and standard computer code? We base our approach on the hypothesis that debugging can be treated as a two-task continual learning problem. In particular, we employ a modified version of a continual learning algorithm called Orthogonal Gradient Descent (OGD) to demonstrate, via two simple experiments on the MNIST dataset, that we can in fact unlearn the undesirable behaviour while retaining the general performance of the model, and that we can additionally relearn the appropriate behaviour, both without having to train the model from scratch.
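A sketch of the core OGD mechanism relied on here, under the simplifying assumption (ours, for illustration) that gradients from the task to be preserved are stored and orthonormalized: new "unlearning" gradients are projected onto the subspace orthogonal to them, so updates cannot move the loss along directions that mattered for the retained behaviour.

```python
import numpy as np

def orthonormalize(vectors):
    """Gram-Schmidt over stored gradients (OGD keeps such a basis)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = v - np.dot(v, b) * b
        norm = np.linalg.norm(v)
        if norm > 1e-10:
            basis.append(v / norm)
    return basis

def project_orthogonal(grad, basis):
    """Remove the components of a new gradient that lie along directions
    important to the task whose behaviour we want to retain."""
    g = grad.copy()
    for b in basis:
        g -= np.dot(g, b) * b
    return g

# Usage sketch: step along the projected gradient while unlearning, so the
# retained task's outputs are (to first order) unaffected.
rng = np.random.default_rng(0)
retained = orthonormalize([rng.standard_normal(10) for _ in range(3)])
g_fix = rng.standard_normal(10)
g_safe = project_orthogonal(g_fix, retained)
print([f"{np.dot(g_safe, b):.1e}" for b in retained])  # ~0: no interference
```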
Abstract: Unmanned aerial vehicles (UAVs) need more autonomy. In light of inherent size, weight and power (SWaP) constraints, avionics with artificial intelligence implemented using neuromorphic technology offers a potential solution. We demonstrate intelligent drone control using spiking neural networks (SNNs), which can run on neuromorphic hardware. We present "BatSLAM", a modular SNN for autonomous localization, navigation, and control of UAVs. The bio-inspired algorithms are implemented using the neural modeling and simulation software Nengo, and are able to control the drone to autonomously perform a complex "house search" task in an AirSim simulated environment using only local sensor feedback. In this task, the drone is randomly placed and given a target object. The BatSLAM network localizes the drone, retrieves the target object location from memory, and then guides the drone along the most efficient path through the house to the target, avoiding obstacles and maneuvering through doors and stairways. We present benchmark results showing the BatSLAM network achieves 97.2 percent success rate in navigating to target objects in the house. To the best of our knowledge, the BatSLAM network presented here is the first in the world to carry out localization, navigation, and control in a fully spiking implementation.
Abstract: Mixed-signal neuromorphic computers often emulate some variant of the LIF neuron model. While, in theory, two-layer networks of these neurons are universal function approximators, single-layer networks consisting of slightly more complex neurons can, at the cost of universality, be more efficient. In this paper, we discuss a family of LIF neurons with passive dendrites. We provide rules that describe how input channels targeting different dendritic compartments interact, and test to what extent these interactions can be harnessed in a spiking neural network context. We find that a single layer of two-compartment neurons approximates some functions with smaller errors than similarly sized hidden-layer networks. Single-layer networks with three-compartment neurons can approximate functions such as XOR and four-quadrant multiplication well; adding more compartments offers only small improvements in accuracy. From the perspective of mixed-signal neuromorphic systems, our results suggest that only small modifications to the neuron circuit are necessary to construct more computationally powerful and energy-efficient systems that move more computation into the dendritic, analogue domain.
Abstract: Improving biological plausibility and functional capacity are two important goals for brain models that connect low-level neural details to high-level behavioral phenomena. We develop a method called “oracle-supervised Neural Engineering Framework” (osNEF) to train biologically-detailed spiking neural networks that realize a variety of cognitively-relevant dynamical systems. Specifically, we train networks to perform computations that are commonly found in cognitive systems (communication, multiplication, harmonic oscillation, and gated working memory) using four distinct neuron models (leaky-integrate-and-fire neurons, Izhikevich neurons, 4-dimensional nonlinear point neurons, and 4-compartment, 6-ion-channel layer-V pyramidal cell reconstructions) connected with various synaptic models (current-based synapses, conductance-based synapses, and voltage-gated synapses). We show that osNEF networks exhibit the target dynamics by accounting for nonlinearities present within the neuron models: performance is comparable across all four systems and all four neuron models, with variance proportional to task and neuron model complexity. We also apply osNEF to build a model of working memory that performs a delayed response task using a combination of pyramidal cells and inhibitory interneurons connected with NMDA and GABA synapses. The baseline performance and forgetting rate of the model are consistent with animal data from delayed match-to-sample tasks (DMTST): we observe a baseline performance of 95 percent and exponential forgetting with time constant τ = 8.5 s, while a recent meta-analysis of DMTST performance across species observed baseline performances of 58–99 percent and exponential forgetting with time constants of τ = 2.4–71 s. These results demonstrate that osNEF can train functional brain models using biologically-detailed components and open new avenues for investigating the relationship between biophysical mechanisms and functional capabilities.
Abstract: Researchers study nervous systems at levels of scale spanning several orders of magnitude, both in terms of time and space. While some parts of the brain are well understood at specific levels of description, there are few overarching theories that systematically bridge low-level mechanism and high-level function. The Neural Engineering Framework (NEF) is an attempt at providing such a theory. The NEF enables researchers to systematically map dynamical systems—corresponding to some hypothesised brain function—onto biologically constrained spiking neural networks. In this thesis, we present several extensions to the NEF that broaden both the range of neural resources that can be harnessed for spatiotemporal computation and the range of available biological constraints. Specifically, we suggest a method for harnessing the dynamics inherent in passive dendritic trees for computation, allowing us to construct single-layer spiking neural networks that, for some functions, achieve substantially lower errors than larger multi-layer networks. Furthermore, we suggest “temporal tuning” as a unifying approach to harnessing temporal resources for computation through time. This allows modellers to directly constrain networks to temporal tuning observed in nature, in ways not previously well-supported by the NEF. We then explore specific examples of neurally plausible dynamics using these techniques. In particular, we propose a new “information erasure” technique for constructing LTI systems generating temporal bases. Such LTI systems can be used to establish an optimal basis for spatiotemporal computation. We demonstrate how this captures “time cells” that have been observed throughout the brain. As well, we demonstrate the viability of our extensions by constructing an adaptive filter model of the cerebellum that successfully reproduces key features of eyeblink conditioning observed in neurobiological experiments. Outside the cognitive sciences, our work can help exploit resources available on existing neuromorphic computers, and inform future neuromorphic hardware design. In machine learning, our spatiotemporal NEF populations map cleanly onto the Legendre Memory Unit (LMU), a promising artificial neural network architecture for stream-to-stream processing that outperforms competing approaches. We find that one of our LTI systems derived through “information erasure” may serve as a computationally less expensive alternative to the LTI system commonly used in the LMU.
Abstract: Distributed vector representations are a key bridging point between connectionist and symbolic representations of cognition. It is unclear how uncertainty should be modelled in systems using such representations. One may place vector-valued distributions over vector representations, although that may assign non-zero probabilities to vector symbols that cannot occur. In this paper we discuss how bundles of symbols in Vector Symbolic Architectures (VSAs) can be understood as defining an object that has a relationship to a probability distribution, and how statements in VSAs can be understood as being analogous to probabilistic statements. We sketch novel designs for networks that compute entropy and mutual information of VSA-represented distributions. In this paper we restrict ourselves to operators proposed for Holographic Reduced Representations, and representing real-valued data. However, we suggest that the methods presented in this paper should translate to any VSA where the dot product between fractionally bound symbols induces a valid kernel.
Abstract: Modern neural networks have allowed substantial advances in robotics, but these algorithms make implicit assumptions about the discretization of time. In this document we argue that there are benefits to be gained, especially in robotics, by designing learning algorithms that exist in continuous time, as well as state, and only later discretizing the algorithms for implementation on traditional computing models, or mapping them directly onto analog hardware. We survey four arguments to support this approach: that continuum representations provide a unified theory of functions for robotic systems; that many algorithms formulated as temporally continuous demonstrate anytime properties; that we can exploit temporal sparsity to effect energy efficiency in both traditional and analog hardware; and that these algorithms reflect the instantiations of intelligence that have evolved in organisms. Further, we present learning algorithms that are derived from continuous representations. Finally, we discuss robotic precedents for this approach, and conclude with the implications of using continuum representations in robotic systems.
Abstract: Social environments often impose tradeoffs between pursuing personal goals and maintaining a favorable reputation. We studied how individuals navigate these tradeoffs using Reinforcement Learning (RL), paying particular attention to the role of social value orientation (SVO). We had human participants play an iterated Trust Game against various software opponents and analyzed their behavior. We then incorporated RL into two cognitive models, trained these RL agents against the same software opponents, and performed similar analyses. Our results show that the RL agents reproduce many interesting features in the human data, such as the dynamics of convergence during learning and the tendency to defect once reciprocation becomes impossible. We also endowed some of our agents with SVO by incorporating terms for altruism and inequality aversion into their reward functions. These prosocial agents differed from proself agents in ways that resembled the differences between prosocial and proself participants. This suggests that RL is a useful framework for understanding how people use feedback to make social decisions.
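One common way to write down such a reward function (a Fehr–Schmidt-style formulation; the paper's exact terms may differ) is sketched below: the agent's reward mixes its own payoff, an altruism-weighted share of the partner's payoff, and a penalty on unequal outcomes.

```python
def svo_reward(own, other, altruism=0.0, inequality_aversion=0.0):
    """Fold social preferences into an RL reward: altruism weights the
    partner's payoff; inequality aversion penalizes unequal outcomes."""
    return (own
            + altruism * other
            - inequality_aversion * abs(own - other))

# A proself agent values only its own payoff; a prosocial agent also
# values the partner's payoff and dislikes unequal outcomes:
print(svo_reward(10, 2))                                         # proself: 10.0
print(svo_reward(10, 2, altruism=0.5, inequality_aversion=0.3))  # prosocial: 8.6
```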
Abstract: Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory component of the LMU to construct a simplified variant that can be parallelized during training (and yet executed as an RNN during inference), resulting in up to 200 times faster training. We note that our efficient parallelizing scheme is general and is applicable to any deep network whose recurrent components are linear dynamical systems. We demonstrate the improved accuracy of our new architecture compared to the original LMU and a variety of published LSTM and transformer networks across seven benchmarks. For instance, our LMU sets a new state-of-the-art result on psMNIST, and uses half the parameters while outperforming DistilBERT and LSTM models on IMDB sentiment analysis.
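The parallelization rests on the fact that a discretized LTI system's states are a convolution of the input with the system's impulse response. The sketch below demonstrates this equivalence on the LMU's LTI memory (the Legendre Delay Network) in NumPy; the Euler discretization and the O(T²) convolution are simplifications for clarity, where in practice one would use a better discretization and FFT-based convolution on an accelerator.

```python
import numpy as np

q, theta, dt = 6, 0.5, 1e-3     # memory order, window length, timestep

# LDN state-space matrices (Voelker's Legendre Delay Network)
i = np.arange(q)[:, None]
j = np.arange(q)[None, :]
A = np.where(i < j, -1.0, (-1.0) ** (i - j + 1)) * (2 * i + 1) / theta
B = ((2 * i + 1) * (-1.0) ** i / theta).flatten()

# Simple Euler discretization (the paper's choice may differ)
Ad = np.eye(q) + dt * A
Bd = dt * B

rng = np.random.default_rng(0)
u = rng.standard_normal(1000)

# Sequential (RNN-style) execution:
x, xs_seq = np.zeros(q), []
for u_t in u:
    x = Ad @ x + Bd * u_t
    xs_seq.append(x)

# Parallel execution: the same states are a convolution of the input with
# the impulse response H[k] = Ad**k @ Bd, so all timesteps can be computed
# at once during training.
T = len(u)
H = np.stack([np.linalg.matrix_power(Ad, k) @ Bd for k in range(T)])
xs_par = np.stack([(H[:t + 1][::-1] * u[:t + 1, None]).sum(axis=0)
                   for t in range(T)])
print(np.allclose(xs_seq, xs_par))  # True: identical states, parallel compute
```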
Abstract: We discuss the notion of "discrete function bases" with a particular focus on the discrete basis derived from the Legendre Delay Network (LDN). We characterize the performance of these bases in a delay computation task, and as fixed temporal convolutions in neural networks. Networks using fixed temporal convolutions are conceptually simple and yield state-of-the-art results in tasks such as psMNIST.
Abstract: We present an alternative derivation of the LTI system underlying the Legendre Delay Network (LDN). To this end, we first construct an LTI system that generates the Legendre polynomials. We then dampen the system by approximating a windowed impulse response, using what we call a "delay re-encoder". The resulting LTI system is equivalent to the LDN system. This technique can be applied to arbitrary polynomial bases, although there typically is no closed-form equation that describes the state-transition matrix.
Abstract: While neural networks are highly effective at learning task-relevant representations from data, they typically do not learn representations with the kind of symbolic structure that is hypothesized to support high-level cognitive processes, nor do they naturally model such structures within problem domains that are continuous in space and time. To fill these gaps, this work exploits a method for defining vector representations that bind discrete (symbol-like) entities to points in continuous topological spaces in order to simulate and predict the behavior of a range of dynamical systems. These vector representations are spatial semantic pointers (SSPs), and we demonstrate that they can (1) be used to model dynamical systems involving multiple objects represented in a symbol-like manner and (2) be integrated with deep neural networks to predict the future of physical trajectories. These results help unify what have traditionally appeared to be disparate approaches in machine learning.
Abstract: Neurophysiology and neuroanatomy constrain the set of possible computations that can be performed in a brain circuit. While detailed data on brain microcircuits is sometimes available, cognitive modelers are seldom in a position to take these constraints into account. One reason for this is the intrinsic complexity of accounting for biological mechanisms when describing cognitive function. In this paper, we present multiple extensions to the neural engineering framework (NEF), which simplify the integration of low-level constraints such as Dale's principle and spatially constrained connectivity into high-level, functional models. We focus on a model of eyeblink conditioning in the cerebellum, and, in particular, on systematically constructing temporal representations in the recurrent granule–Golgi microcircuit. We analyze how biological constraints impact these representations and demonstrate that our overall model is capable of reproducing key properties of eyeblink conditioning. Furthermore, since our techniques facilitate variation of neurophysiological parameters, we gain insights into why certain neurophysiological parameters may be as observed in nature. While eyeblink conditioning is a somewhat primitive form of learning, we argue that the same methods apply for more cognitive models as well. We implemented our extensions to the NEF in an open-source software library named “NengoBio” and hope that this work inspires similar attempts to bridge low-level biological detail and high-level function.
Abstract: Recent studies have demonstrated that the performance of transformers on the task of language modeling obeys a power-law relationship with model size over six orders of magnitude. While transformers exhibit impressive scaling, their performance hinges on processing large amounts of data, and their computational and memory requirements grow quadratically with sequence length. Motivated by these considerations, we construct a Legendre Memory Unit based model that introduces a general prior for sequence processing and exhibits an O(n) and O(n ln n) (or better) dependency for memory and computation, respectively. Over three orders of magnitude, we show that our new architecture attains the same accuracy as transformers with 10x fewer tokens. We also show that for the same amount of training our model improves the loss over transformers about as much as transformers improve over LSTMs. Additionally, we demonstrate that adding global self-attention complements our architecture and the augmented model improves performance even further.
Abstract: Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks, such as the Neural Engineering Framework, tend to assume a linear superposition of postsynaptic currents. In this letter, we present a series of extensions to the Neural Engineering Framework that facilitate the construction of networks incorporating Dale's principle and nonlinear conductance-based synapses. We apply these extensions to a two-compartment LIF neuron that can be seen as a simple model of passive dendritic computation. We show that it is possible to incorporate neuron models with input-dependent nonlinearities into the Neural Engineering Framework without compromising high-level function and that nonlinear postsynaptic currents can be systematically exploited to compute a wide variety of multivariate, band-limited functions, including the Euclidean norm, controlled shunting, and nonnegative multiplication. By avoiding an additional source of spike noise, the function approximation accuracy of a single layer of two-compartment LIF neurons is on a par with or even surpasses that of two-layer spiking neural networks up to a certain target function bandwidth.
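To make the nature of the nonlinearity concrete, here is a sketch of the steady-state current a passive dendritic compartment injects into the soma (a standard two-compartment derivation; the conductance values are illustrative, not the paper's fitted parameters). Because the input conductances appear in the denominator, excitation and inhibition interact divisively rather than superposing linearly.

```python
# Reversal potentials and conductances (illustrative values)
E_E, E_I, E_L = 20e-3, -75e-3, -65e-3     # volts
g_L, g_C = 50e-9, 100e-9                  # leak and coupling (siemens)
v_som = -50e-3                            # somatic potential, held fixed here

def somatic_current(g_E, g_I):
    """Steady-state current the passive dendritic compartment injects into
    the soma. The conductances appear in the denominator, so the current is
    a nonlinear (rational) function of the two input channels."""
    v_den = (g_L * E_L + g_E * E_E + g_I * E_I + g_C * v_som) \
            / (g_L + g_E + g_I + g_C)
    return g_C * (v_den - v_som)

# Inhibition acts multiplicatively (shunting), not subtractively:
for g_I in (0.0, 50e-9, 200e-9):
    print(g_I, somatic_current(100e-9, g_I))
```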
Abstract: We present the context-unified encoding (CUE) model, a large-scale spiking neural network model of human memory. It combines and integrates activity-based short-term memory with weight-based long-term memory. The implementation with spiking neurons ensures biological plausibility and allows for predictions on the neural level. At the same time, the model produces behavioral outputs that have been matched to human data from serial and free recall experiments. In particular, well-known results such as primacy, recency, transposition error gradients, and forward recall bias have been reproduced with good quantitative matches. Additionally, the model accounts for the Hebb repetition effect. The CUE model combines and extends the ordinal serial encoding (OSE) model, a spiking neuron model of short-term memory, and the temporal context model (TCM), a mathematical memory model matching free recall data. To implement the modification of the required association matrices, a novel learning rule, the association matrix learning rule (AML), is derived that allows for one-shot learning without catastrophic forgetting. Its biological plausibility is discussed and it is shown that it accounts for changes in neural firing observed in human recordings from an association learning experiment.
Abstract: Mutual information (MI) is a standard objective function for driving exploration. The use of Gaussian processes to compute information gain is limited by time and memory complexity that grows with the number of observations collected. We present an efficient implementation of MI-driven exploration by combining vector symbolic architectures with Bayesian Linear Regression. We demonstrate equivalent regret performance to a GP-based approach with memory and time complexity that is constant in the number of samples collected, as opposed to t² and t³, respectively, enabling long-term exploration.
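A sketch of the constant-memory mechanism, with hypothetical dimensions and noise scale: Bayesian linear regression over a fixed d-dimensional feature space (the features here standing in for VSA encodings of candidate queries) maintains only a d×d posterior precision matrix, and the information gain for a candidate follows the standard closed form.

```python
import numpy as np

class BLRExplorer:
    """Bayesian linear regression with mutual-information-driven queries.
    Memory is O(d^2) in the feature dimension, independent of sample count."""

    def __init__(self, d, noise_var=0.01):
        self.precision = np.eye(d)      # posterior precision (prior = identity)
        self.b = np.zeros(d)
        self.noise_var = noise_var

    def update(self, phi, y):
        """Condition on one observation y at feature vector phi."""
        self.precision += np.outer(phi, phi) / self.noise_var
        self.b += phi * y / self.noise_var

    def information_gain(self, phi):
        """MI between an observation at phi and the weights:
        0.5 * log(1 + phi^T Sigma phi / noise_var)."""
        var = phi @ np.linalg.solve(self.precision, phi)
        return 0.5 * np.log1p(var / self.noise_var)

rng = np.random.default_rng(0)
blr = BLRExplorer(d=128)
phi = rng.standard_normal(128) / np.sqrt(128)   # e.g., a VSA feature vector
print(blr.information_gain(phi))   # high before observing this direction
blr.update(phi, y=1.0)
print(blr.information_gain(phi))   # lower afterwards
```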
Abstract: As neuromorphic hardware begins to emerge as a viable target platform for artificial intelligence (AI) applications, there is a need for tools and software that can effectively compile a variety of AI models onto such hardware. Nengo (http://nengo.ai) is an ecosystem of software designed to fill this need with a suite of tools for creating, training, deploying, and visualizing neural networks for various hardware backends, including CPUs, GPUs, FPGAs, microcontrollers, and neuromorphic hardware. While backpropagation-based methods are powerful and fully supported in Nengo, there is also a need for frameworks that are capable of efficiently mapping dynamical systems onto such hardware while best utilizing its computational resources. The neural engineering framework (NEF) is one such method that is supported by Nengo. Most prominently, Nengo and the NEF have been used to engineer the world's largest functional model of the human brain. In addition, as a particularly efficient approach to training neural networks for neuromorphics, the NEF has been ported to several neuromorphic platforms. In this chapter, we discuss the mathematical foundations of the NEF and a number of its extensions and review several recent applications that use Nengo to build models for neuromorphic hardware. We focus in-depth on a particular class of dynamic neural networks, Legendre Memory Units (LMUs), which have demonstrated advantages over state-of-the-art approaches in deep learning with respect to energy efficiency, training time, and accuracy.
Abstract: We discuss a minimal, unconstrained log-Cholesky parametrisation of radial basis functions (RBFs) and the corresponding partial derivatives. This is useful when using RBFs as part of a neural network that is either trained in a supervised fashion via error backpropagation or unsupervised using a homeostasis mechanism. We perform some experiments and discuss potential caveats when using RBFs in this way. Furthermore, we compare RBFs to the Spatial Semantic Pointer similarity, which can be used to construct networks with sparse hidden representations resembling those found in RBF networks.
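A sketch of the parametrisation (our illustration of the standard log-Cholesky construction): the unconstrained parameter vector packs a lower-triangular matrix L whose diagonal is stored in log-space; using L L^T as the RBF's precision matrix guarantees positive definiteness with exactly d(d+1)/2 free parameters, and the whole expression is differentiable for backpropagation.

```python
import numpy as np

def rbf(x, mu, theta):
    """RBF with full covariance via log-Cholesky: `theta` packs a
    lower-triangular L (diagonal in log-space); L @ L.T serves as the
    precision matrix, so positive definiteness holds by construction."""
    d = len(mu)
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = theta
    L[np.diag_indices(d)] = np.exp(L[np.diag_indices(d)])  # positive diagonal
    delta = L.T @ (x - mu)         # (x-mu)^T L L^T (x-mu) = ||L^T (x-mu)||^2
    return np.exp(-0.5 * delta @ delta)

# d = 2 needs d*(d+1)/2 = 3 unconstrained parameters:
x, mu = np.array([0.5, -0.2]), np.zeros(2)
theta = np.array([0.0, 0.3, -0.5])
print(rbf(x, mu, theta))
```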
Abstract: In this thesis I explore a biologically inspired method of encoding continuous space within a population of neurons. This method provides an extension to the Semantic Pointer Architecture (SPA) to encompass Semantic Pointers with real-valued spatial content in addition to symbol-like representations. I demonstrate how these Spatial Semantic Pointers (SSPs) can be used to generate cognitive maps containing objects at various locations. A series of operations are defined that can retrieve objects or locations from the encoded map as well as manipulate the contents of the memory. These capabilities are all implemented by a network of spiking neurons. I explore the topology of the SSP vector space and show how it preserves metric information while compressing all coordinates to unit length vectors. This allows a limitless spatial extent to be represented in a finite region. Neurons encoding space represented in this manner have firing fields similar to entorhinal grid cells. Beyond constructing biologically plausible models of spatial cognition, SSPs are applied to the domain of machine learning. I demonstrate how replacing traditional spatial encoding mechanisms with SSPs can improve performance on networks trained to compute a navigational policy. In addition, SSPs are also effective for training a network to localize within an environment based on sensor measurements as well as perform path integration. To demonstrate a practical, integrated system using SSPs, I combine a goal driven navigational policy with the localization network and cognitive map representation to produce an agent that can navigate to semantically defined goals. In addition to spatial tasks, the SSP encoding is applied to a more general class of machine learning problems involving arbitrary continuous signals. Results on a collection of 122 benchmark datasets across a variety of domains indicate that neural networks trained with SSP encoding outperform commonly used methods for the majority of the datasets. Overall, the experiments in this thesis demonstrate the importance of exploring new kinds of representations within neural networks and how they shape the kinds of functions that can be effectively computed. They provide an example of how insights regarding how the brain may encode information can inspire new ways of designing artificial neural networks.
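The bind-and-query operations described here can be sketched compactly in the Fourier domain, where circular convolution is an elementwise product. The dimensions and vector sampling below are illustrative, and the thesis realizes these operations in spiking neurons rather than raw NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024

def make_unitary(d, rng):
    """Vector with unit-magnitude, conjugate-symmetric Fourier coefficients."""
    n = (d - 1) // 2
    fc = np.ones(d, dtype=complex)
    fc[1:n + 1] = np.exp(1j * rng.uniform(-np.pi, np.pi, n))
    fc[-n:] = np.conj(fc[n:0:-1])
    return fc

# Axis vectors for space and unitary symbol vectors for objects, all kept
# in the Fourier domain, where binding is an elementwise product.
Xf, Yf = make_unitary(d, rng), make_unitary(d, rng)
CUP, KEY = make_unitary(d, rng), make_unitary(d, rng)
ssp = lambda x, y: Xf**x * Yf**y

# A cognitive map: a sum of object-location bindings.
memory = CUP * ssp(1.0, 2.0) + KEY * ssp(-2.0, 0.5)

# "Where is the cup?": unbind CUP; the result is approximately ssp(1, 2)
# and can be compared against candidate locations by dot product.
where = memory * np.conj(CUP)
sim = lambda af, bf: np.real(np.vdot(af, bf)) / d   # dot product via Parseval
print(sim(where, ssp(1.0, 2.0)), sim(where, ssp(-2.0, 0.5)))  # ~1 vs ~0
```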
Abstract: Spatial cognition relies on an internal map-like representation of space provided by hippocampal place cells, which in turn are thought to rely on grid cells as a basis. Spatial Semantic Pointers (SSPs) have been introduced as a way to represent continuous spaces and positions via the activity of a spiking neural network. In this work, we further develop the SSP representation to replicate the firing patterns of grid cells. This adds biological realism to the SSP representation and links biological findings with a larger theoretical framework for representing concepts. Furthermore, replicating grid cell activity with SSPs results in greater accuracy when constructing place cells. Improved accuracy is a result of grid cells forming the optimal basis for decoding positions and place cell output. Our results have implications for modelling spatial cognition and more general cognitive representations over continuous variables.
Abstract: Neuromorphic hardware has long promised to provide power advantages by leveraging the kind of event-driven, temporally sparse computation observed in biological neural systems. Only recently, however, has this hardware been developed to a point that allows for general purpose AI programming. In this paper, we provide an overview of tools and methods for building applications that run on neuromorphic computing devices. We then discuss reasons for observed efficiency gains in neuromorphic systems, and provide a concrete illustration of these gains by comparing conventional and neuromorphic implementations of a keyword spotting system trained on the widely used Speech Commands dataset. We show that replacing floating point operations in a conventional neural network with synaptic operations in a spiking neural network results in a roughly 4x energy reduction, with minimal performance loss.
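The source of the gain can be illustrated with back-of-envelope arithmetic (all constants below are assumptions for illustration, not the paper's measurements): energy scales with operation counts, and temporal sparsity means only a fraction of the network's synapses are active on any timestep.

```python
# Illustrative model of where a ~4x gain can come from: temporally sparse
# spiking turns most multiply-accumulates into no-ops. All constants are
# assumed for illustration, not measured values from the paper.
macs_per_inference = 2.0e6        # conventional net: every weight is touched
joules_per_mac = 2.0e-12          # assumed energy per MAC

spike_sparsity = 0.10             # assumed fraction of neurons spiking per step
synops_per_inference = spike_sparsity * macs_per_inference
joules_per_synop = 5.0e-12        # assumed energy per synaptic event

e_conv = macs_per_inference * joules_per_mac
e_neuro = synops_per_inference * joules_per_synop
print(f"conventional: {e_conv:.1e} J, neuromorphic: {e_neuro:.1e} J "
      f"({e_conv / e_neuro:.1f}x reduction)")
```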
Abstract: Emergent communication in artificial agents has been studied to understand language evolution, as well as to develop artificial systems that learn to communicate with humans. We show that agents performing a cooperative navigation task in various gridworld environments learn an interpretable communication protocol that enables them to efficiently, and in many cases, optimally, solve the task. An analysis of the agents' policies reveals that emergent signals spatially cluster the state space, with signals referring to specific locations and spatial directions such as left, up, or upper left room. Using populations of agents, we show that the emergent protocol has basic compositional structure, thus exhibiting a core property of natural language.
Abstract: Decision making (DM) requires the coordination of anatomically and functionally distinct cortical and subcortical areas. While previous computational models have studied these subsystems in isolation, few models explore how DM holistically arises from their interaction. We propose a spiking neuron model that unifies various components of DM, then show that the model performs an inferential decision task in a human-like manner. The model (a) includes populations corresponding to dorsolateral prefrontal cortex, orbitofrontal cortex, right inferior frontal cortex, pre-supplementary motor area, and basal ganglia; (b) is constructed using 8000 leaky-integrate-and-fire neurons with 7 million connections; and (c) realizes dedicated cognitive operations such as weighted valuation of inputs, accumulation of evidence for multiple choice alternatives, competition between potential actions, dynamic thresholding of behavior, and urgency-mediated modulation. We show that the model reproduces reaction time distributions and speed-accuracy tradeoffs from humans performing the task. These results provide behavioral validation for tasks that involve slow dynamics and perceptual uncertainty; we conclude by discussing how additional tasks, constraints, and metrics may be incorporated into this initial framework.
Abstract: The cerebellum is classically described in terms of its role in motor control. Recent evidence suggests that the cerebellum supports a wide variety of functions, including timing-related cognitive tasks and perceptual prediction. Correspondingly, deciphering cerebellar function may be important to advance our understanding of cognitive processes. In this paper, we build a model of eyeblink conditioning, an extensively studied low-level function of the cerebellum. Building such a model is of particular interest, since, as of now, it remains unclear how exactly the cerebellum manages to learn and reproduce the precise timings observed in eyeblink conditioning that are potentially exploited by cognitive processes as well. We employ recent advances in large-scale neural network modeling to build a biologically plausible spiking neural network based on the cerebellar microcircuitry. We compare our simulation results to neurophysiological data and demonstrate how the recurrent Granule-Golgi subnetwork could generate the dynamic representations required for triggering motor trajectories in the Purkinje cell layer. Our model is capable of reproducing key properties of eyeblink conditioning while generating neurophysiological data that could be experimentally verified.
Abstract: We present several experiments demonstrating the efficiency and scalability of a biologically inspired spatial representation on navigation tasks using artificial neural networks. Specifically, we demonstrate that encoding coordinates with Spatial Semantic Pointers (SSPs) outperforms six other proposed encoding methods when training a neural network to navigate to arbitrary goals in a 2D environment. The SSP representation naturally generalizes to larger spaces, as it requires no definition of a boundary (unlike most other methods). Additionally, we show how this navigational policy can be integrated into a larger system that combines memory retrieval and self-localization to produce a behavioural agent capable of finding cued goal objects. We further demonstrate that explicitly incorporating a hexagonal grid cell-like structure in the generation of SSPs can improve performance. This biologically inspired spatial representation has been shown to support spiking neural models of spatial cognition. The link between SSPs and higher-level cognition allows models using this representation to be seamlessly integrated into larger neural models to elicit complex behaviour.
Abstract: Linguistic communication is a unique characteristic of intelligent behaviour that distinguishes humans from non-human animals. Natural language is a structured, complex communication system supported by a variety of cognitive functions, realized by hundreds of millions of neurons in the brain. Artificial neural networks typically used in natural language processing (NLP) are often designed to focus on benchmark performance, where one of the main goals is reaching state-of-the-art performance on a set of language tasks. Although the advances in NLP have been tremendous in the past decade, such networks provide only limited insights into biological mechanisms underlying linguistic processing in the brain. In this thesis, we propose an integrative approach to the study of computational mechanisms underlying fundamental language processes, spanning biologically plausible neural networks and the learning of basic communicative abilities through environmentally grounded behaviour. In doing so, we argue for the usage-based approach to language, where language is supported by a variety of cognitive functions and learning mechanisms. Thus, we focus on the following three questions: How are basic linguistic units, such as words, represented in the brain? Which neural mechanisms operate on those representations in cognitive tasks? How can aspects of such representations, such as associative similarity and structure, be learned in a usage-based framework? To answer the first two questions, we build novel, biologically realistic models of neural function that perform different semantic processing tasks: the Remote Associates Test (RAT) and the semantic fluency task. Both tasks have been used in experimental and clinical environments to study organizational principles and retrieval mechanisms from semantic memory. The models we propose realize the mental lexicon and cognitive retrieval processes operating on that lexicon using associative mechanisms in a biologically plausible manner. We argue that such models are the first and only biologically plausible models that both propose specific mechanisms and reproduce a wide range of human behavioural data on those tasks, further corroborating their plausibility. To address the last question, we use an interactive, collaborative agent-based reinforcement learning setup in a navigation task where agents learn to communicate to solve the task. We argue that agents in such a setup learn to jointly coordinate their actions, and develop a communication protocol that is often optimal for performance on the task, while exhibiting some core properties of language, such as representational similarity structure and compositionality, essential for associative mechanisms underlying cognitive representations.
Abstract: The machine learning community has become increasingly interested in the energy efficiency of neural networks. The Spiking Neural Network (SNN) is a promising approach to energy-efficient computing, since its activation levels are quantized into temporally sparse, one-bit values (i.e., "spike" events), which additionally converts the sum over weight-activity products into a simple addition of weights (one weight for each spike). However, the goal of maintaining state-of-the-art (SotA) accuracy when converting a non-spiking network into an SNN has remained an elusive challenge, primarily due to spikes having only a single bit of precision. Adopting tools from signal processing, we cast neural activation functions as quantizers with temporally-diffused error, and then train networks while smoothly interpolating between the non-spiking and spiking regimes. We apply this technique to the Legendre Memory Unit (LMU) to obtain the first known example of a hybrid SNN outperforming SotA recurrent architectures – including the LSTM, GRU, and NRU – in accuracy, while reducing activities to at most 3.74 bits on average with 1.26 significant bits multiplying each weight. We discuss how these methods can significantly improve the energy efficiency of neural networks.
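One simple instance of the quantizer-with-diffused-error view (our sketch, not the paper's exact training procedure) is a sigma-delta-style neuron: the ideal activation is accumulated each timestep, whole spikes are emitted, and the quantization residual carries forward so the error averages out over time.

```python
import numpy as np

def spiking_relu(x_seq, dt=1e-3, max_rate=1000.0):
    """Interpret a ReLU activation as a spike rate and quantize it with
    temporally diffused error: the residual carries over between timesteps,
    so the spike train's running average converges to the ideal activation."""
    voltage, spikes = 0.0, []
    for x in x_seq:
        voltage += max(x, 0.0) * dt * max_rate   # accumulate ideal activity
        s = np.floor(voltage)                    # emit whole spikes
        voltage -= s                             # keep the quantization error
        spikes.append(s / (dt * max_rate))       # rescale to activation units
    return np.array(spikes)

x = np.full(1000, 0.37)                 # constant ideal activation
out = spiking_relu(x)
print(out.mean())                        # ~0.37: the error diffuses away
```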
Abstract: The ability to flexibly route information between brain regions is crucial to perform new multi-step tasks. The mechanisms postulated by the global neuronal workspace theory of consciousness are thought to underlie this ability. We developed a spiking neural architecture based on the theory, which is able to route concepts in order to perform a cognitive task composed of chained operations.
Abstract: In this paper we demonstrate how the Nengo neural modeling and simulation libraries enable users to quickly develop robotic perception and action neural networks for simulation on neuromorphic hardware using tools they are already familiar with, such as Keras and Python. We identify four primary challenges in building robust, embedded neurorobotic systems, including: (1) developing infrastructure for interfacing with the environment and sensors; (2) processing task-specific sensory signals; (3) generating robust, explainable control signals; and (4) compiling neural networks to run on target hardware. Nengo helps to address these challenges by: (1) providing the NengoInterfaces library, which defines a simple but powerful API for users to interact with simulations and hardware; (2) providing the NengoDL library, which lets users use the Keras and TensorFlow API to develop Nengo models; (3) implementing the Neural Engineering Framework, which provides white-box methods for implementing known functions and circuits; and (4) providing multiple backend libraries, such as NengoLoihi, that enable users to compile the same model to different hardware. We present two examples using Nengo to develop neural networks that run on CPUs and GPUs as well as Intel's neuromorphic chip, Loihi, to demonstrate two variations on this workflow. The first example is an implementation of an end-to-end spiking neural network in Nengo that controls a rover simulated in Mujoco. The network integrates a deep convolutional network that processes visual input from cameras mounted on the rover to track a target, and a control system implementing steering and drive functions in connection weights to guide the rover to the target. The second example uses Nengo as a smaller component in a system that has addressed some but not all of those challenges. Specifically, it is used to augment a force-based operational space controller with neural adaptive control to improve performance during a reaching task using a real-world Kinova Jaco2 robotic arm. The code and implementation details are provided (https://github.com/abr/neurorobotics-2020), with the intent of enabling other researchers to build and run their own neurorobotic systems.
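For readers unfamiliar with this workflow, here is a minimal runnable Nengo model illustrating the pattern described above: an ensemble represents a signal, a connection computes a function in its synaptic weights via the NEF, and swapping the simulator backend retargets the same model to different hardware. The networks in the paper are, of course, far larger.

```python
import numpy as np
import nengo

# A minimal Nengo model: spiking ensembles representing a scalar, with a
# connection whose weights are solved to compute a function (here x**2).
model = nengo.Network()
with model:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    a = nengo.Ensemble(n_neurons=100, dimensions=1)
    b = nengo.Ensemble(n_neurons=100, dimensions=1)
    nengo.Connection(stim, a)
    nengo.Connection(a, b, function=lambda x: x ** 2)  # NEF-solved weights
    probe = nengo.Probe(b, synapse=0.01)

with nengo.Simulator(model) as sim:   # swap in nengo_loihi.Simulator to
    sim.run(1.0)                      # target Intel's Loihi instead

print(sim.data[probe][-5:])           # decoded estimate of sin(...)**2
```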