Abstract: Path integration, the ability to maintain an estimate of one's location by continuously integrating self-motion cues, is a vital component of the brain's navigation system. We present a spiking neural network model of path integration derived from a starting assumption that the brain represents continuous variables, such as spatial coordinates, using Spatial Semantic Pointers (SSPs). SSPs are a representation for encoding continuous variables as high-dimensional vectors, and can also be used to create structured, hierarchical representations for neural cognitive modelling. Path integration can be performed by a recurrently-connected neural network using SSP representations. Unlike past work, we show that our model can be used to continuously update variables of any dimensionality. We demonstrate that symbol-like object representations can be bound to continuous SSP representations. Specifically, we incorporate a simple model of working memory to remember environment maps with such symbol-like representations situated in 2D space.
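The SSP idea in the abstract above can be sketched in a few lines. This is a minimal, non-spiking illustration and not the authors' model. It assumes the usual construction in which each axis vector has a unit-magnitude Fourier spectrum, a point is encoded by raising those spectra to the coordinate values, and a position update is a circular convolution (binding) with the SSP of the displacement. The names `make_axis_phases`, `encode`, and `bind` are hypothetical.

```python
import numpy as np

def make_axis_phases(n_axes, d, rng):
    """Random Fourier phases for n_axes unitary axis vectors of dimension d."""
    phases = rng.uniform(-np.pi, np.pi, size=(n_axes, d // 2 + 1))
    phases[:, 0] = 0.0           # DC coefficient must be real
    if d % 2 == 0:
        phases[:, -1] = 0.0      # Nyquist coefficient must be real
    return phases

def encode(x, phases, d=256):
    """SSP of point x: each axis spectrum raised to the matching coordinate."""
    spectrum = np.exp(1j * (np.asarray(x, dtype=float) @ phases))
    return np.fft.irfft(spectrum, n=d)

def bind(a, b):
    """Circular convolution, computed in the Fourier domain."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

rng = np.random.default_rng(0)
phases = make_axis_phases(n_axes=2, d=256, rng=rng)

position = encode([1.0, 2.0], phases)
displacement = encode([0.3, -0.5], phases)   # e.g. velocity times a time step
updated = bind(position, displacement)

# Similarity of the updated SSP to the SSP of the summed coordinates.
print(np.dot(updated, encode([1.3, 1.5], phases)))
```

Because binding adds the exponents, repeatedly binding with small displacement SSPs integrates a velocity signal. The same code works for any number of axes, which is the sense in which the representation is not tied to 2D.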
Keywords: Machine Learning (cs.LG), Artificial Intelligence (cs.AI), FOS: Computer and information sciences
Abstract: In this report we consider the following problem: Given a trained model that is partially faulty, can we correct its behaviour without having to train the model from scratch? In other words, can we “debug” neural networks similar to how we address bugs in our mathematical models and standard computer code? We base our approach on the hypothesis that debugging can be treated as a two-task continual learning problem. In particular, we employ a modified version of a continual learning algorithm called Orthogonal Gradient Descent (OGD) to demonstrate, via two simple experiments on the MNIST dataset, that we can in fact unlearn the undesirable behaviour while retaining the general performance of the model, and we can additionally relearn the appropriate behaviour, both without having to train the model from scratch.
Abstract: Unmanned aerial vehicles (UAVs) need more autonomy. In light of inherent size, weight and power (SWaP) constraints, avionics with artificial intelligence implemented using neuromorphic technology offers a potential solution. We demonstrate intelligent drone control using spiking neural networks (SNNs), which can run on neuromorphic hardware. We present "BatSLAM", a modular SNN for autonomous localization, navigation, and control of UAVs. The bio-inspired algorithms are implemented using the neural modeling and simulation software Nengo, and are able to control the drone to autonomously perform a complex "house search" task in an AirSim simulated environment using only local sensor feedback. In this task, the drone is randomly placed and given a target object. The BatSLAM network localizes the drone, retrieves the target object location from memory, and then guides the drone along the most efficient path through the house to the target, avoiding obstacles and maneuvering through doors and stairways. We present benchmark results showing the BatSLAM network achieves 97.2 percent success rate in navigating to target objects in the house. To the best of our knowledge, the BatSLAM network presented here is the first in the world to carry out localization, navigation, and control in a fully spiking implementation.
Abstract: Mixed-signal neuromorphic computers often emulate some variant of the LIF neuron model. While, in theory, two-layer networks of these neurons are universal function approximators, single-layer networks consisting of slightly more complex neurons can, at the cost of universality, be more efficient. In this paper, we discuss a family of LIF neurons with passive dendrites. We provide rules that describe how input channels targeting different dendritic compartments interact, and test to what extent these interactions can be harnessed in a spiking neural network context. We find that a single layer of two-compartment neurons approximates some functions at smaller errors than similarly sized hidden-layer networks. Single-layer networks with three-compartment neurons can approximate functions such as XOR and four-quadrant multiplication well; adding more compartments only offers small improvements in accuracy. From the perspective of mixed-signal neuromorphic systems, our results suggest that only small modifications to the neuron circuit are necessary to construct more computationally powerful and energy efficient systems that move more computation into the dendritic, analogue domain.
Abstract: Improving biological plausibility and functional capacity are two important goals for brain models that connect low-level neural details to high-level behavioral phenomena. We develop a method called “oracle-supervised Neural Engineering Framework” (osNEF) to train biologically-detailed spiking neural networks that realize a variety of cognitively-relevant dynamical systems. Specifically, we train networks to perform computations that are commonly found in cognitive systems (communication, multiplication, harmonic oscillation, and gated working memory) using four distinct neuron models (leaky-integrate-and-fire neurons, Izhikevich neurons, 4-dimensional nonlinear point neurons, and 4-compartment, 6-ion-channel layer-V pyramidal cell reconstructions) connected with various synaptic models (current-based synapses, conductance-based synapses, and voltage-gated synapses). We show that osNEF networks exhibit the target dynamics by accounting for nonlinearities present within the neuron models: performance is comparable across all four systems and all four neuron models, with variance proportional to task and neuron model complexity. We also apply osNEF to build a model of working memory that performs a delayed response task using a combination of pyramidal cells and inhibitory interneurons connected with NMDA and GABA synapses. The baseline performance and forgetting rate of the model are consistent with animal data from delayed match-to-sample tasks (DMTST): we observe a baseline performance of 95 percent and exponential forgetting with time constant tau = 8.5 s, while a recent meta-analysis of DMTST performance across species observed baseline performances of 58 to 99 percent and exponential forgetting with time constants of tau = 2.4 to 71 s. 
These results demonstrate that osNEF can train functional brain models using biologically-detailed components and open new avenues for investigating the relationship between biophysical mechanisms and functional capabilities.
Abstract: Researchers study nervous systems at levels of scale spanning several orders of magnitude, both in terms of time and space. While some parts of the brain are well understood at specific levels of description, there are few overarching theories that systematically bridge low-level mechanism and high-level function. The Neural Engineering Framework (NEF) is an attempt at providing such a theory. The NEF enables researchers to systematically map dynamical systems—corresponding to some hypothesised brain function—onto biologically constrained spiking neural networks. In this thesis, we present several extensions to the NEF that broaden both the range of neural resources that can be harnessed for spatiotemporal computation and the range of available biological constraints. Specifically, we suggest a method for harnessing the dynamics inherent in passive dendritic trees for computation, allowing us to construct single-layer spiking neural networks that, for some functions, achieve substantially lower errors than larger multi-layer networks. Furthermore, we suggest “temporal tuning” as a unifying approach to harnessing temporal resources for computation through time. This allows modellers to directly constrain networks to temporal tuning observed in nature, in ways not previously well-supported by the NEF. We then explore specific examples of neurally plausible dynamics using these techniques. In particular, we propose a new “information erasure” technique for constructing LTI systems generating temporal bases. Such LTI systems can be used to establish an optimal basis for spatiotemporal computation. We demonstrate how this captures “time cells” that have been observed throughout the brain. As well, we demonstrate the viability of our extensions by constructing an adaptive filter model of the cerebellum that successfully reproduces key features of eyeblink conditioning observed in neurobiological experiments. 
Outside the cognitive sciences, our work can help exploit resources available on existing neuromorphic computers, and inform future neuromorphic hardware design. In machine learning, our spatiotemporal NEF populations map cleanly onto the Legendre Memory Unit (LMU), a promising artificial neural network architecture for stream-to-stream processing that outperforms competing approaches. We find that one of our LTI systems derived through “information erasure” may serve as a computationally less expensive alternative to the LTI system commonly used in the LMU.
Abstract: Distributed vector representations are a key bridging point between connectionist and symbolic representations of cognition. It is unclear how uncertainty should be modelled in systems using such representations. One may place vector-valued distributions over vector representations, although that may assign non-zero probabilities to vector symbols that cannot occur. In this paper we discuss how bundles of symbols in Vector Symbolic Architectures (VSAs) can be understood as defining an object that has a relationship to a probability distribution, and how statements in VSAs can be understood as being analogous to probabilistic statements. We sketch novel designs for networks that compute entropy and mutual information of VSA-represented distributions. In this paper we restrict ourselves to operators proposed for Holographic Reduced Representations, and representing real-valued data. However, we suggest that the methods presented in this paper should translate to any VSA where the dot product between fractionally bound symbols induces a valid kernel.
Abstract: Modern neural networks have allowed substantial advances in robotics, but these algorithms make implicit assumptions about the discretization of time. In this document we argue that there are benefits to be gained, especially in robotics, by designing learning algorithms that exist in continuous time, as well as state, and only later discretizing the algorithms for implementation on traditional computing models, or mapping them directly onto analog hardware. We survey four arguments to support this approach: That continuum representations provide a unified theory of functions for robotic systems; That many algorithms formulated as temporally continuous demonstrate anytime properties; That we can exploit temporal sparsity to effect energy efficiency in both traditional and analog hardware; and that these algorithms reflect the instantiations of intelligence that have evolved in organisms. Further, we present learning algorithms that are derived from continuous representations. Finally, we discuss robotic precedents for this approach, and conclude with the implications of using continuum representations in robotic systems.
Abstract: Social environments often impose tradeoffs between pursuing personal goals and maintaining a favorable reputation. We studied how individuals navigate these tradeoffs using Reinforcement Learning (RL), paying particular attention to the role of social value orientation (SVO). We had human participants play an iterated Trust Game against various software opponents and analyzed the behaviors. We then incorporated RL into two cognitive models, trained these RL agents against the same software opponents, and performed similar analyses. Our results show that the RL agents reproduce many interesting features in the human data, such as the dynamics of convergence during learning and the tendency to defect once reciprocation becomes impossible. We also endowed some of our agents with SVO by incorporating terms for altruism and inequality aversion into their reward functions. These prosocial agents differed from proself agents in ways that resembled the differences between prosocial and proself participants. This suggests that RL is a useful framework for understanding how people use feedback to make social decisions.
Abstract: Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory component of the LMU to construct a simplified variant that can be parallelized during training (and yet executed as an RNN during inference), resulting in up to 200 times faster training. We note that our efficient parallelizing scheme is general and is applicable to any deep network whose recurrent components are linear dynamical systems. We demonstrate the improved accuracy of our new architecture compared to the original LMU and a variety of published LSTM and transformer networks across seven benchmarks. For instance, our LMU sets a new state-of-the-art result on psMNIST, and uses half the parameters while outperforming DistilBERT and LSTM models on IMDB sentiment analysis.
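The claim that a linear recurrent memory can be trained in parallel yet run as an RNN rests on a standard identity: the state of a discrete LTI system equals the causal convolution of its input with the impulse response. The sketch below checks that identity numerically. It is not the authors' implementation; the matrices `A` and `B` are random stand-ins (assumptions) rather than the LMU's own, and `run_recurrent` and `run_parallel` are hypothetical names.

```python
import numpy as np

def run_recurrent(A, B, u):
    """Step m_t = A m_{t-1} + B u_t one sample at a time (inference style)."""
    m = np.zeros(A.shape[0])
    states = []
    for u_t in u:
        m = A @ m + B * u_t
        states.append(m)
    return np.array(states)

def run_parallel(A, B, u):
    """Compute every state at once by convolving u with the impulse response."""
    n = len(u)
    # Impulse response H[k] = A^k B. It does not depend on the input,
    # so it can be computed once and reused for every sequence.
    H = np.empty((n, A.shape[0]))
    h = B.copy()
    for k in range(n):
        H[k] = h
        h = A @ h
    # Zero-padded FFT gives a linear (not circular) convolution over time.
    size = 2 * n
    U = np.fft.rfft(u, n=size)
    Hf = np.fft.rfft(H, n=size, axis=0)
    return np.fft.irfft(Hf * U[:, None], n=size, axis=0)[:n]

rng = np.random.default_rng(0)
d, n = 4, 64
A = 0.3 * rng.normal(size=(d, d)) / np.sqrt(d)   # stand-in state matrix
B = rng.normal(size=d)                            # stand-in input vector
u = rng.normal(size=n)                            # scalar input sequence

print(np.abs(run_recurrent(A, B, u) - run_parallel(A, B, u)).max())
```

The parallel form has no sequential dependence on the input, which is what allows training-time speedups, while the recurrent form keeps constant memory per step at inference.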
Abstract: We discuss the notion of "discrete function bases" with a particular focus on the discrete basis derived from the Legendre Delay Network (LDN). We characterize the performance of these bases in a delay computation task, and as fixed temporal convolutions in neural networks. Networks using fixed temporal convolutions are conceptually simple and yield state-of-the-art results in tasks such as psMNIST.
Abstract: We present an alternative derivation of the LTI system underlying the Legendre Delay Network (LDN). To this end, we first construct an LTI system that generates the Legendre polynomials. We then dampen the system by approximating a windowed impulse response, using what we call a "delay re-encoder". The resulting LTI system is equivalent to the LDN system. This technique can be applied to arbitrary polynomial bases, although there typically is no closed-form equation that describes the state-transition matrix.
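For reference, the LDN system that the derivation above arrives at can be written down directly. The sketch below builds the continuous-time matrices from the closed form commonly used in LMU and LDN implementations. That formula is taken as given here and is not derived from this abstract; `ldn_matrices` is a hypothetical helper name, and `q` (order) and `theta` (window length) are the usual parameter names.

```python
import numpy as np

def ldn_matrices(q, theta):
    """Continuous-time (A, B) of an order-q LDN with window length theta."""
    idx = np.arange(q, dtype=float)
    scale = (2 * idx + 1)[:, None] / theta
    j, i = np.meshgrid(idx, idx)      # i indexes rows, j indexes columns
    # Entries above the diagonal are -1; the rest alternate in sign.
    A = np.where(i < j, -1.0, (-1.0) ** (i - j + 1)) * scale
    B = ((-1.0) ** idx)[:, None] * scale
    return A, B

A, B = ldn_matrices(q=6, theta=1.0)
# Largest real part among the eigenvalues of the state-transition matrix.
print(np.linalg.eigvals(A).real.max())
```

For q = 2 and theta = 1 the characteristic polynomial of A is s^2 + 4s + 6, whose roots lie in the left half-plane, so the impulse response decays rather than growing like the undamped Legendre polynomials.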
Abstract: While neural networks are highly effective at learning task-relevant representations from data, they typically do not learn representations with the kind of symbolic structure that is hypothesized to support high-level cognitive processes, nor do they naturally model such structures within problem domains that are continuous in space and time. To fill these gaps, this work exploits a method for defining vector representations that bind discrete (symbol-like) entities to points in continuous topological spaces in order to simulate and predict the behavior of a range of dynamical systems. These vector representations are spatial semantic pointers (SSPs), and we demonstrate that they can (1) be used to model dynamical systems involving multiple objects represented in a symbol-like manner and (2) be integrated with deep neural networks to predict the future of physical trajectories. These results help unify what have traditionally appeared to be disparate approaches in machine learning.
Abstract: Neurophysiology and neuroanatomy constrain the set of possible computations that can be performed in a brain circuit. While detailed data on brain microcircuits is sometimes available, cognitive modelers are seldom in a position to take these constraints into account. One reason for this is the intrinsic complexity of accounting for biological mechanisms when describing cognitive function. In this paper, we present multiple extensions to the neural engineering framework (NEF), which simplify the integration of low-level constraints such as Dale's principle and spatially constrained connectivity into high-level, functional models. We focus on a model of eyeblink conditioning in the cerebellum, and, in particular, on systematically constructing temporal representations in the recurrent granule–Golgi microcircuit. We analyze how biological constraints impact these representations and demonstrate that our overall model is capable of reproducing key properties of eyeblink conditioning. Furthermore, since our techniques facilitate variation of neurophysiological parameters, we gain insights into why certain neurophysiological parameters may be as observed in nature. While eyeblink conditioning is a somewhat primitive form of learning, we argue that the same methods apply for more cognitive models as well. We implemented our extensions to the NEF in an open-source software library named “NengoBio” and hope that this work inspires similar attempts to bridge low-level biological detail and high-level function.
Abstract: Recent studies have demonstrated that the performance of transformers on the task of language modeling obeys a power-law relationship with model size over six orders of magnitude. While transformers exhibit impressive scaling, their performance hinges on processing large amounts of data, and their computational and memory requirements grow quadratically with sequence length. Motivated by these considerations, we construct a Legendre Memory Unit based model that introduces a general prior for sequence processing and exhibits an O(n) and O(n ln n) (or better) dependency for memory and computation respectively. Over three orders of magnitude, we show that our new architecture attains the same accuracy as transformers with 10x fewer tokens. We also show that for the same amount of training our model improves the loss over transformers about as much as transformers improve over LSTMs. Additionally, we demonstrate that adding global self-attention complements our architecture and the augmented model improves performance even further.
Abstract: Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks, such as the Neural Engineering Framework, tend to assume a linear superposition of postsynaptic currents. In this letter, we present a series of extensions to the Neural Engineering Framework that facilitate the construction of networks incorporating Dale's principle and nonlinear conductance-based synapses. We apply these extensions to a two-compartment LIF neuron that can be seen as a simple model of passive dendritic computation. We show that it is possible to incorporate neuron models with input-dependent nonlinearities into the Neural Engineering Framework without compromising high-level function and that nonlinear postsynaptic currents can be systematically exploited to compute a wide variety of multivariate, band-limited functions, including the Euclidean norm, controlled shunting, and nonnegative multiplication. By avoiding an additional source of spike noise, the function approximation accuracy of a single layer of two-compartment LIF neurons is on a par with or even surpasses that of two-layer spiking neural networks up to a certain target function bandwidth.
Abstract: We present the context-unified encoding (CUE) model, a large-scale spiking neural network model of human memory. It combines and integrates activity-based short-term memory with weight-based long-term memory. The implementation with spiking neurons ensures biological plausibility and allows for predictions on the neural level. At the same time, the model produces behavioral outputs that have been matched to human data from serial and free recall experiments. In particular, well-known results such as primacy, recency, transposition error gradients, and forward recall bias have been reproduced with good quantitative matches. Additionally, the model accounts for the Hebb repetition effect. The CUE model combines and extends the ordinal serial encoding (OSE) model, a spiking neuron model of short-term memory, and the temporal context model (TCM), a mathematical memory model matching free recall data. To implement the modification of the required association matrices, a novel learning rule, the association matrix learning rule (AML), is derived that allows for one-shot learning without catastrophic forgetting. Its biological plausibility is discussed and it is shown that it accounts for changes in neural firing observed in human recordings from an association learning experiment.
Abstract: Mutual information (MI) is a standard objective function for driving exploration. The use of Gaussian processes to compute information gain is limited by time and memory complexity that grows with the number of observations collected. We present an efficient implementation of MI-driven exploration by combining vector symbolic architectures with Bayesian Linear Regression. We demonstrate equivalent regret performance to a GP-based approach with memory and time complexity that is constant in the number of samples collected, as opposed to $t^2$ and $t^3$, respectively, enabling long-term exploration.
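The constant-cost claim follows from keeping a fixed-size posterior over the weights instead of a kernel matrix that grows with each observation. The sketch below is a generic online Bayesian linear regression with a rank-one update and a Gaussian information-gain score. It is not the authors' code: the class name `OnlineBLR` and its parameters are hypothetical, and `phi` stands in for whatever fixed-length feature vector (for example, a VSA encoding of a location) the method would supply.

```python
import numpy as np

class OnlineBLR:
    """Bayesian linear regression over d fixed features, O(d^2) per update."""

    def __init__(self, d, prior_var=1.0, noise_var=0.1):
        self.cov = prior_var * np.eye(d)   # posterior covariance of the weights
        self.mean = np.zeros(d)            # posterior mean of the weights
        self.noise_var = noise_var

    def info_gain(self, phi):
        """Mutual information between the weights and an observation at phi."""
        return 0.5 * np.log1p(phi @ self.cov @ phi / self.noise_var)

    def update(self, phi, y):
        """Rank-one (Sherman-Morrison) posterior update for one sample."""
        cov_phi = self.cov @ phi
        gain = cov_phi / (self.noise_var + phi @ cov_phi)
        self.mean = self.mean + gain * (y - phi @ self.mean)
        self.cov = self.cov - np.outer(gain, cov_phi)

rng = np.random.default_rng(0)
model = OnlineBLR(d=8)
candidates = rng.normal(size=(5, 8))       # stand-in feature vectors

# Pick the candidate whose observation is expected to be most informative.
best = max(range(len(candidates)), key=lambda k: model.info_gain(candidates[k]))
model.update(candidates[best], y=rng.normal())
print(best, model.info_gain(candidates[best]))
```

Storage is one d-by-d matrix and one d-vector regardless of how many samples have been seen, in contrast to the quadratic memory and cubic time in the number of samples that the abstract attributes to the GP-based approach.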
Abstract: As neuromorphic hardware begins to emerge as a viable target platform for artificial intelligence (AI) applications, there is a need for tools and software that can effectively compile a variety of AI models onto such hardware. Nengo (http://nengo.ai) is an ecosystem of software designed to fill this need with a suite of tools for creating, training, deploying, and visualizing neural networks for various hardware backends, including CPUs, GPUs, FPGAs, microcontrollers, and neuromorphic hardware. While backpropagation-based methods are powerful and fully supported in Nengo, there is also a need for frameworks that are capable of efficiently mapping dynamical systems onto such hardware while best utilizing its computational resources. The neural engineering framework (NEF) is one such method that is supported by Nengo. Most prominently, Nengo and the NEF have been used to engineer the world's largest functional model of the human brain. In addition, as a particularly efficient approach to training neural networks for neuromorphics, the NEF has been ported to several neuromorphic platforms. In this chapter, we discuss the mathematical foundations of the NEF and a number of its extensions and review several recent applications that use Nengo to build models for neuromorphic hardware. We focus in-depth on a particular class of dynamic neural networks, Legendre Memory Units (LMUs), which have demonstrated advantages over state-of-the-art approaches in deep learning with respect to energy efficiency, training time, and accuracy.
Abstract: We discuss a minimal, unconstrained log-Cholesky parametrisation of radial basis functions (RBFs) and the corresponding partial derivatives. This is useful when using RBFs as part of a neural network that is either trained in a supervised fashion via error backpropagation, or unsupervised using a homeostasis mechanism. We perform some experiments and discuss potential caveats when using RBFs in this way. Furthermore, we compare RBFs to the Spatial Semantic Pointer similarity that can be used to construct networks with sparse hidden representations resembling those found in RBF networks.
Abstract: In this thesis I explore a biologically inspired method of encoding continuous space within a population of neurons. This method provides an extension to the Semantic Pointer Architecture (SPA) to encompass Semantic Pointers with real-valued spatial content in addition to symbol-like representations. I demonstrate how these Spatial Semantic Pointers (SSPs) can be used to generate cognitive maps containing objects at various locations. A series of operations are defined that can retrieve objects or locations from the encoded map as well as manipulate the contents of the memory. These capabilities are all implemented by a network of spiking neurons. I explore the topology of the SSP vector space and show how it preserves metric information while compressing all coordinates to unit length vectors. This allows a limitless spatial extent to be represented in a finite region. Neurons encoding space represented in this manner have firing fields similar to entorhinal grid cells. Beyond constructing biologically plausible models of spatial cognition, SSPs are applied to the domain of machine learning. I demonstrate how replacing traditional spatial encoding mechanisms with SSPs can improve performance on networks trained to compute a navigational policy. In addition, SSPs are also effective for training a network to localize within an environment based on sensor measurements as well as perform path integration. To demonstrate a practical, integrated system using SSPs, I combine a goal driven navigational policy with the localization network and cognitive map representation to produce an agent that can navigate to semantically defined goals. In addition to spatial tasks, the SSP encoding is applied to a more general class of machine learning problems involving arbitrary continuous signals. 
Results on a collection of 122 benchmark datasets across a variety of domains indicate that neural networks trained with SSP encoding outperform commonly used methods for the majority of the datasets. Overall, the experiments in this thesis demonstrate the importance of exploring new kinds of representations within neural networks and how they shape the kinds of functions that can be effectively computed. They provide an example of how insights regarding how the brain may encode information can inspire new ways of designing artificial neural networks.
Abstract: Spatial cognition relies on an internal map-like representation of space provided by hippocampal place cells, which in turn are thought to rely on grid cells as a basis. Spatial Semantic Pointers (SSP) have been introduced as a way to represent continuous spaces and positions via the activity of a spiking neural network. In this work, we further develop SSP representation to replicate the firing patterns of grid cells. This adds biological realism to the SSP representation and links biological findings with a larger theoretical framework for representing concepts. Furthermore, replicating grid cell activity with SSPs results in greater accuracy when constructing place cells. Improved accuracy is a result of grid cells forming the optimal basis for decoding positions and place cell output. Our results have implications for modelling spatial cognition and more general cognitive representations over continuous variables.
Abstract: Neuromorphic hardware has long promised to provide power advantages by leveraging the kind of event-driven, temporally sparse computation observed in biological neural systems. Only recently, however, has this hardware been developed to a point that allows for general purpose AI programming. In this paper, we provide an overview of tools and methods for building applications that run on neuromorphic computing devices. We then discuss reasons for observed efficiency gains in neuromorphic systems, and provide a concrete illustration of these gains by comparing conventional and neuromorphic implementations of a keyword spotting system trained on the widely used Speech Commands dataset. We show that replacing floating point operations in a conventional neural network with synaptic operations in a spiking neural network results in a roughly 4x energy reduction, with minimal performance loss.
Abstract: Emergent communication in artificial agents has been studied to understand language evolution, as well as to develop artificial systems that learn to communicate with humans. We show that agents performing a cooperative navigation task in various gridworld environments learn an interpretable communication protocol that enables them to efficiently, and in many cases, optimally, solve the task. An analysis of the agents' policies reveals that emergent signals spatially cluster the state space, with signals referring to specific locations and spatial directions such as left, up, or upper left room. Using populations of agents, we show that the emergent protocol has basic compositional structure, thus exhibiting a core property of natural language.
Abstract: Decision making (DM) requires the coordination of anatomically and functionally distinct cortical and subcortical areas. While previous computational models have studied these subsystems in isolation, few models explore how DM holistically arises from their interaction. We propose a spiking neuron model that unifies various components of DM, then show that the model performs an inferential decision task in a human-like manner. The model (a) includes populations corresponding to dorsolateral prefrontal cortex, orbitofrontal cortex, right inferior frontal cortex, pre-supplementary motor area, and basal ganglia; (b) is constructed using 8000 leaky-integrate-and-fire neurons with 7 million connections; and (c) realizes dedicated cognitive operations such as weighted valuation of inputs, accumulation of evidence for multiple choice alternatives, competition between potential actions, dynamic thresholding of behavior, and urgency-mediated modulation. We show that the model reproduces reaction time distributions and speed-accuracy tradeoffs from humans performing the task. These results provide behavioral validation for tasks that involve slow dynamics and perceptual uncertainty; we conclude by discussing how additional tasks, constraints, and metrics may be incorporated into this initial framework.
Abstract: The cerebellum is classically described in terms of its role in motor control. Recent evidence suggests that the cerebellum supports a wide variety of functions, including timing-related cognitive tasks and perceptual prediction. Correspondingly, deciphering cerebellar function may be important to advance our understanding of cognitive processes. In this paper, we build a model of eyeblink conditioning, an extensively studied low-level function of the cerebellum. Building such a model is of particular interest, since, as of now, it remains unclear how exactly the cerebellum manages to learn and reproduce the precise timings observed in eyeblink conditioning that are potentially exploited by cognitive processes as well. We employ recent advances in large-scale neural network modeling to build a biologically plausible spiking neural network based on the cerebellar microcircuitry. We compare our simulation results to neurophysiological data and demonstrate how the recurrent Granule-Golgi subnetwork could generate the dynamic representations required for triggering motor trajectories in the Purkinje cell layer. Our model is capable of reproducing key properties of eyeblink conditioning, while generating neurophysiological data that could be experimentally verified.
Abstract: We present several experiments demonstrating the efficiency and scalability of a biologically inspired spatial representation on navigation tasks using artificial neural networks. Specifically, we demonstrate that encoding coordinates with Spatial Semantic Pointers (SSPs) outperforms six other proposed encoding methods when training a neural network to navigate to arbitrary goals in a 2D environment. The SSP representation naturally generalizes to larger spaces, as there is no definition of a boundary required (unlike most other methods). Additionally, we show how this navigational policy can be integrated into a larger system that combines memory retrieval and self-localization to produce a behavioural agent capable of finding cued goal objects. We further demonstrate that explicitly incorporating a hexagonal grid cell-like structure in the generation of SSPs can improve performance. This biologically inspired spatial representation has been shown to be able to produce spiking neural models of spatial cognition. The link between SSPs and higher level cognition allows models using this representation to be seamlessly integrated into larger neural models to elicit complex behaviour.
Abstract: Linguistic communication is a unique characteristic of intelligent behaviour that distinguishes humans from non-human animals. Natural language is a structured, complex communication system supported by a variety of cognitive functions, realized by hundreds of millions of neurons in the brain. Artificial neural networks typically used in natural language processing (NLP) are often designed to focus on benchmark performance, where one of the main goals is reaching the state-of-the-art performance on a set of language tasks. Although the advances in NLP have been tremendous in the past decade, such networks provide only limited insights into biological mechanisms underlying linguistic processing in the brain. In this thesis, we propose an integrative approach to the study of computational mechanisms underlying fundamental language processes, spanning biologically plausible neural networks, and learning of basic communicative abilities through environmentally grounded behaviour. In doing so, we argue for the usage-based approach to language, where language is supported by a variety of cognitive functions and learning mechanisms. Thus, we focus on the three following questions: How are basic linguistic units, such as words, represented in the brain? Which neural mechanisms operate on those representations in cognitive tasks? How can aspects of such representations, such as associative similarity and structure, be learned in a usage-based framework? To answer the first two questions, we build novel, biologically realistic models of neural function that perform different semantic processing tasks: the Remote Associates Test (RAT) and the semantic fluency task. Both tasks have been used in experimental and clinical environments to study organizational principles and retrieval mechanisms from semantic memory. 
The models we propose realize the mental lexicon and cognitive retrieval processes operating on that lexicon using associative mechanisms in a biologically plausible manner. We argue that such models are the first and only biologically plausible models that propose specific mechanisms as well as reproduce a wide range of human behavioural data on those tasks, further corroborating their plausibility. To address the last question, we use an interactive, collaborative agent-based reinforcement learning setup in a navigation task where agents learn to communicate to solve the task. We argue that agents in such a setup learn to jointly coordinate their actions, and develop a communication protocol that is often optimal for the performance on the task, while exhibiting some core properties of language, such as representational similarity structure and compositionality, essential for associative mechanisms underlying cognitive representations.
Abstract: The ability to flexibly route information between brain regions is crucial to perform new multi-step tasks. The mechanisms postulated by the global neuronal workspace theory of consciousness are thought to underlie this ability. We developed a spiking neural architecture based on the theory, which is able to route concepts in order to perform a cognitive task composed of chained operations.
Abstract: In this paper we demonstrate how the Nengo neural modeling and simulation libraries enable users to quickly develop robotic perception and action neural networks for simulation on neuromorphic hardware using tools they are already familiar with, such as Keras and Python. We identify four primary challenges in building robust, embedded neurorobotic systems, including: (1) developing infrastructure for interfacing with the environment and sensors; (2) processing task-specific sensory signals; (3) generating robust, explainable control signals; and (4) compiling neural networks to run on target hardware. Nengo helps to address these challenges by: (1) providing the NengoInterfaces library, which defines a simple but powerful API for users to interact with simulations and hardware; (2) providing the NengoDL library, which lets users use the Keras and TensorFlow API to develop Nengo models; (3) implementing the Neural Engineering Framework, which provides white-box methods for implementing known functions and circuits; and (4) providing multiple backend libraries, such as NengoLoihi, that enable users to compile the same model to different hardware. We present two examples using Nengo to develop neural networks that run on CPUs and GPUs as well as Intel's neuromorphic chip, Loihi, to demonstrate two variations on this workflow. The first example is an implementation of an end-to-end spiking neural network in Nengo that controls a rover simulated in MuJoCo. The network integrates a deep convolutional network that processes visual input from cameras mounted on the rover to track a target, and a control system implementing steering and drive functions in connection weights to guide the rover to the target. The second example uses Nengo as a smaller component in a system that has addressed some but not all of those challenges.
Specifically, it is used to augment a force-based operational space controller with neural adaptive control to improve performance during a reaching task using a real-world Kinova Jaco2 robotic arm. The code and implementation details are provided (https://github.com/abr/neurorobotics-2020), with the intent of enabling other researchers to build and run their own neurorobotic systems.
Abstract: Neurophysiology and neuroanatomy limit the set of possible computations that can be performed in a brain circuit. Although detailed data on individual brain microcircuits is available in the literature, cognitive modellers seldom take these constraints into account. One reason for this is the intrinsic complexity of accounting for mechanisms when describing function. In this paper, we present multiple extensions to the Neural Engineering Framework that simplify the integration of low-level constraints such as Dale's principle and spatially constrained connectivity into high-level, functional models. We apply these techniques to a recent model of temporal representation in the Granule-Golgi microcircuit in the cerebellum, extending it towards higher degrees of biological plausibility. We perform a series of experiments to analyze the impact of these changes on a functional level. The results demonstrate that our chosen functional description can indeed be mapped onto the target microcircuit under biological constraints. Further, we gain insights into why these parameters are as observed by examining the effects of parameter changes. While the circuit discussed here only describes a small section of the brain, we hope that this work inspires similar attempts at bridging low-level biological detail and high-level function. To encourage the adoption of our methods, we published the software developed for building our model as an open-source library.
Abstract: Network models that reconstruct (a) the dynamics of individual neurons, (b) the anatomy of specific brain regions, and (c) the behaviors governed by these regions are important for understanding mental disorders and their pharmacological treatment. We present a spiking neuron model of the rat amygdala that undergoes fear conditioning, and is appropriately modulated by simulated pharmacological perturbation (including oxytocin OXY, serotonin 5-HT, dopamine DA, and muscimol MUSC). The network includes neural populations for the central-lateral (CeL), central-medial (CeM), and basolateral (BLA) amygdala; interneurons in BLA and CeL; inputs from spinal cord, cortex and hippocampus; and motor output through the periaqueductal gray (PAG). The model is trained by pairing negative stimuli (footshocks) with neutral stimuli (auditory tones) within a prescribed context (conditioning cage). Prediction error signals drive associative learning of synaptic connection weights in CeL and BLA. Following an experimentally-vetted training regime, the model exhibits the fear response (freezing, via CeM inhibition of tonically driven PAG) to presentation of the conditioned tone or context (Fig. 1). Furthermore, repeatedly presenting tones without shocks in a new context causes extinction of the fear response in that context, but not in others, via synaptic plasticity in BLA. To simulate pharmacology, we excite particular neural subpopulations that are known to express receptors for the corresponding neurotransmitters. Fig. 2 reports mean freezing in response to twelve pharmacological manipulations, applied during various stages of conditioning, extinction, or expression. Simulated freezing is consistent with empirical data from rats injected with appropriate agonists or antagonists.
These results demonstrate that the mechanisms underlying fear conditioning, including associative learning, extinction, and pharmacology, can be understood through the dynamic interactions between amygdala nuclei. Extensions to the model will allow targeted biophysical manipulations and anatomical reconstructions of human amygdala; experimental predictions made from this model may profitably inform human pharmacology and the treatment of conditions such as post-traumatic stress disorder.
Abstract: We propose a spiking recurrent neural network model of flexible human timing behavior based on the delay network. The well-known 'scalar property' of timing behavior arises from the model in a natural way, and critically depends on how many dimensions are used to represent the history of stimuli. The model also produces heterogeneous firing patterns that scale with the timed interval, consistent with available neural data. This suggests that the scalar property and neural scaling are tightly linked. Further extensions of the model are discussed that may capture additional behavior, such as continuative timing, temporal cognition, and learning how to time.
Abstract: There are numerous behavioral and physiological studies that show how the brain compensates for uncertainties and unexpected changes in the sensory environment while still successfully performing motor tasks. To date, a variety of phenomenological models have been proposed to explain this sensorimotor adaptation. But in order to relate the suggested control algorithms to their neural realizations, it is important to have biologically plausible mechanisms that capture both neural activities and the higher-order behaviors that they give rise to. Here we extend a previous model, the Recurrent Error-driven Adaptive Control Hierarchy (REACH), that accounts for dynamic and kinematic adaptation, to also capture visuomotor adaptations. To demonstrate this behavior, we consider the conventional task of 'visuomotor rotation' using a two-link arm in a planar reaching task. The model consists of anatomically organized structures including M1, pMd and cerebellum to incorporate different aspects of the behavior. The extended model has a multimodal Kalman filter to accommodate an internal model for dynamic prediction of limb states and sensory integration of vision and proprioception. A spike-based algorithm is implemented to learn the visuomotor transformation. Replicating experiments in humans and non-human primates, the model is made to reach when given either abrupt or fast implicit rotations. We also use the model to explore multirate adaptations to further demonstrate classical characteristics such as savings and interference. While the proposed model is consistent with the experimental data of rapid adaptations, the model also exhibits spiking activity comparable to empirical data. A plausible and anatomically organized neuron model describing the adaptation to visuomotor rotations, and eventually to other visuomotor transformations, can provide insights and significantly improve our understanding of the motor system organization and function.
Abstract: Most computational models of timing rely on well-defined start- and stop-signals; however, these are quite rare in our natural environments. Moreover, theories typically propose different mechanisms to account for retrospective and prospective timing, an assumption that is difficult to align with naturalistic, continuative types of timing. Here we propose a spiking recurrent neural network model of flexible human timing behavior: Legendre Memory Timing (LMT). Our model continually and optimally represents the history of its input by compressing the input into a q-dimensional state, consisting of Legendre polynomials. At any point in time, the network represents a rolling window of its input history that spans from the current time to θ seconds in the past and uses this window to assess time (Voelker & Eliasmith, 2018). Where previous models require constrained ramping, decaying, or oscillating neural activity, our model is not restricted to a single firing pattern, but - consistent with available experimental data from monkeys (Wang, Narain, Hosseini, and Jazayeri, 2018) - utilizes heterogeneous firing patterns in individual spiking neurons that temporally scale with the timed interval. Without an explicit clock, or any specific clock-focussed assumptions, this model accounts for a number of key timing phenomena. For example, the scalar property naturally arises from our model and is functionally linked to the number of dimensions that are used to represent input history. Moreover, the model explains why constant standard deviation may be observed for well-trained subjects (e.g., Fetterman & Killeen, 1992). The model suggests that the scalar property and neural scaling are tightly linked: inter-trial variability in timing responses is correlated with inter-trial variability in neural scaling.
In doing so, the model explains how accurate timing performance, both prospective and retrospective, may be accomplished in more ecological settings without relying on clear start- and stop-signals (van Rijn, 2018).
Abstract: Predicting future motion of other vehicles or, more generally, the development of traffic situations, is an essential step towards secure, context-aware automated driving. On the one hand, human drivers are able to anticipate driving situations continuously based on the currently perceived behavior of other traffic participants while incorporating prior experience. On the other hand, the most successful data-driven prediction models are typically trained on large amounts of recorded data before deployment achieving remarkable results. In this paper, we present a mixture-of-experts online learning model encapsulating both ideas. Our system learns at run time to choose between several models, which have been previously trained offline, based on the current situational context. We show that our model is able to improve over the offline models already after a short ramp-up phase. We evaluate our system on real world driving data.
Abstract: Feedback alignment has been proposed as a biologically plausible alternative to error backpropagation in multi-layer perceptrons. However, feedback alignment currently has not been demonstrated to scale beyond relatively shallow network topologies, or to solve cognitively interesting tasks such as high-resolution image classification. In this paper, we provide an overview of feedback alignment and review suggested mappings of feedback alignment onto biological neural networks. We then discuss a novel geometric interpretation of the feedback alignment algorithm that can be used to analyze its limitations. Finally, we discuss a series of experiments in which we compare the performance of backpropagation and feedback alignment. We hope that these insights can be used to systematically improve feedback alignment under biological constraints, which may allow us to build better models of learning in cognitive systems.
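To make the contrast between backpropagation and feedback alignment concrete, the following is a minimal sketch and not the experimental setup of the paper. It assumes a two-layer network with a tanh hidden layer and a squared-error loss; the function names `forward` and `gradients` and the argument `feedback` are hypothetical. The only difference between the two rules is the matrix used to carry the output error back to the hidden layer: the transposed forward weights for backpropagation, or a fixed random matrix for feedback alignment.

```python
import numpy as np


def forward(x, W1, W2):
    """Two-layer network: tanh hidden layer followed by a linear readout."""
    h = np.tanh(x @ W1)
    y = h @ W2
    return h, y


def gradients(x, target, W1, W2, feedback):
    """Weight gradients for the loss 0.5 * sum((y - target) ** 2).

    `feedback` has shape (n_out, n_hidden). Passing W2.T gives ordinary
    backpropagation; passing a fixed random matrix gives feedback alignment.
    """
    h, y = forward(x, W1, W2)
    err = y - target                          # dL/dy
    grad_W2 = h.T @ err                       # identical under both rules
    delta_h = (err @ feedback) * (1.0 - h ** 2)  # error routed to hidden layer
    grad_W1 = x.T @ delta_h
    return grad_W1, grad_W2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 3))
    target = rng.normal(size=(5, 2))
    W1 = 0.5 * rng.normal(size=(3, 4))
    W2 = 0.5 * rng.normal(size=(4, 2))
    B = rng.normal(size=(2, 4))               # fixed random feedback weights

    g1_bp, g2_bp = gradients(x, target, W1, W2, W2.T)
    g1_fa, g2_fa = gradients(x, target, W1, W2, B)
    # Cosine of the angle between the two hidden-layer updates
    cos = np.sum(g1_bp * g1_fa) / (np.linalg.norm(g1_bp) * np.linalg.norm(g1_fa))
    print("alignment between BP and FA hidden updates:", cos)
```

The printed cosine is one simple way to monitor how closely the feedback-alignment update follows the true gradient direction during training.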
Abstract: We propose a novel memory cell for recurrent neural networks that dynamically maintains information across long windows of time using relatively few resources. The Legendre Memory Unit (LMU) is mathematically derived to orthogonalize its continuous-time history – doing so by solving $d$ coupled ordinary differential equations (ODEs), whose phase space linearly maps onto sliding windows of time via the Legendre polynomials up to degree $d - 1$. Backpropagation across LMUs outperforms equivalently-sized LSTMs on a chaotic time-series prediction task, improves memory capacity by two orders of magnitude, and significantly reduces training and inference times. LMUs can efficiently handle temporal dependencies spanning $100{,}000$ time-steps, converge rapidly, and use few internal state-variables to learn complex functions spanning long windows of time – exceeding state-of-the-art performance among RNNs on permuted sequential MNIST. These results are due to the network's disposition to learn scale-invariant features independently of step size. Backpropagation through the ODE solver allows each layer to adapt its internal time-step, enabling the network to learn task-relevant time-scales. We demonstrate that LMU memory cells can be implemented using $m$ recurrently-connected Poisson spiking neurons, $\mathcal{O}(m)$ time and memory, with error scaling as $\mathcal{O}(d / \sqrt{m})$. We discuss implementations of LMUs on analog and digital neuromorphic hardware.
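The $d$ coupled ODEs mentioned above can be written as $\theta\,\dot{m}(t) = A\,m(t) + B\,u(t)$ for a window of length $\theta$. The sketch below builds the state-space matrices in the form commonly given for the LMU and reads out a delayed copy of the input with shifted Legendre polynomials. It is a minimal illustration under stated assumptions, not the trained memory cell: the function names (`lmu_matrices`, `delay_decoder`, `simulate`) are hypothetical, and plain forward-Euler integration is used only for brevity.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre


def lmu_matrices(d):
    """Return (A, B) for theta * dm/dt = A m + B u with a d-dimensional state."""
    A = np.zeros((d, d))
    B = np.zeros(d)
    for i in range(d):
        B[i] = (2 * i + 1) * (-1.0) ** i
        for j in range(d):
            sign = -1.0 if i < j else (-1.0) ** (i - j + 1)
            A[i, j] = (2 * i + 1) * sign
    return A, B


def delay_decoder(d, r):
    """Weights that read out u(t - r * theta), with 0 <= r <= 1, from m(t)."""
    # Shifted Legendre polynomials: P_i evaluated at 2r - 1
    return np.array([Legendre.basis(i)(2.0 * r - 1.0) for i in range(d)])


def simulate(u, dt, theta, d):
    """Forward-Euler integration; u is a 1-D array of input samples."""
    A, B = lmu_matrices(d)
    m = np.zeros(d)
    states = np.empty((len(u), d))
    for k, u_k in enumerate(u):
        m = m + (dt / theta) * (A @ m + B * u_k)
        states[k] = m
    return states


if __name__ == "__main__":
    dt, theta, d = 1e-3, 1.0, 6
    t = np.arange(0.0, 5.0, dt)
    u = np.sin(2 * np.pi * 0.5 * t)            # slow sine as a test input
    states = simulate(u, dt, theta, d)
    u_delayed = states @ delay_decoder(d, 1.0)  # estimate of u(t - theta)
    print("last decoded sample:", u_delayed[-1], "target:", u[-1 - int(theta / dt)])
```

A constant input drives the state to $m = (1, 0, \dots, 0)$, so every delay decoder returns that constant, which is a quick sanity check on the matrices.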
Abstract: Predicting future vehicle behaviour is an essential task to enable safe and situation-aware automated driving. In this paper, we propose to encapsulate spatial information of multiple objects in a semantic vector-representation. Assuming that future vehicle motion is influenced not only by past positions but also by the behaviour of other traffic participants, we use this representation as input for a Long Short-Term Memory (LSTM) network for sequence to sequence prediction of vehicle positions. We train and evaluate our system on real-world driving data collected mainly on highways in southern Germany and compare it to other models for reference.
Abstract: NengoDL is a software framework designed to combine the strengths of neuromorphic modelling and deep learning. NengoDL allows users to construct biologically detailed neural models, intermix those models with deep learning elements (such as convolutional networks), and then efficiently simulate those models in an easy-to-use, unified framework. In addition, NengoDL allows users to apply deep learning training methods to optimize the parameters of biological neural models. In this paper we present basic usage examples, benchmarking, and details on the key implementation elements of NengoDL. More details can be found at https://www.nengo.ai/nengo-dl.
Abstract: Modelling biologically-plausible neural structures for intelligent agents presents a unique challenge when operating in real-time domains. Neurons in our brains have different response properties, firing rates, and propagation lengths, creating noise that cannot be reliably decoded. This research explores the strengths and limitations of LIF spiking neuron ensembles for application in OpenAI virtual environments. Topics discussed include how we represent arbitrary environmental signals from multiple senses, choosing between equally viable actions in a given scenario, and how one can create a generic model that can learn and operate in a variety of situations.
Abstract: We present a new binding operation, vector-derived transformation binding (VTB), for use in vector symbolic architectures (VSA). The performance of VTB is compared to circular convolution, used in holographic reduced representations (HRRs), in terms of list and stack encoding capacity. A special focus is given to the possibility of a neural implementation by means of the Neural Engineering Framework (NEF). While the scaling of required neural resources is slightly worse for VTB, it is found to be on par with circular convolution for list encoding and better for encoding of stacks. Furthermore, VTB influences the vector length less, which also benefits a neural implementation. Consequently, we argue that VTB is an improvement over HRRs for neurally implemented VSAs.
Abstract: Social robotics is a highly useful field that is rapidly growing. Advances in embedded systems and fields like neuromorphic computing provide hardware solutions for the computationally complex models needed to produce realistic, pro-social socio-emotional robots. This work details a robot which executes a simplified amygdala model to determine an emotional state from visual input and a subsequent behavioral response. Each nucleus of this model is processed on a different neuromorphic platform, including the SpiNNaker, Loihi, and Braindrop chips. Although simplified, this robot and its underlying model illustrate a proof of concept for more complicated and biologically-plausible socio-emotional robots.
Abstract: We present a novel method for constructing neurally implemented spatial representations that we show to be useful for building models of spatial cognition. This method represents continuous (i.e., real-valued) spaces using neurons, and identifies a set of operations for manipulating these representations. Specifically, we use "fractional binding" to construct "spatial semantic pointers" (SSPs) that we use to generate and manipulate representations of spatial maps encoding the positions of objects. We show how these representations can be transformed to answer queries about the location and identities of objects, move the relative or global position of items, and answer queries about regions of space, among other things. We demonstrate that the neural implementation of SSPs in spiking networks has similar accuracy and capacity to the mathematical ideal.
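Fractional binding can be illustrated in a few lines of non-spiking NumPy. The sketch below assumes the usual construction in which a unitary base vector is raised to a real-valued power in the Fourier domain and binding is circular convolution; the helper names (`make_axis_phases`, `encode`, `bind`) and the dimensionality are hypothetical, and only a one-dimensional variable is encoded. It shows the property that makes such vectors useful for moving items around a map: binding the encodings of two values yields the encoding of their sum.

```python
import numpy as np


def make_axis_phases(d, rng):
    """Random Fourier phases defining a unitary base vector of dimension d."""
    phases = rng.uniform(-np.pi, np.pi, size=d // 2 + 1)
    phases[0] = 0.0           # DC coefficient must be real
    if d % 2 == 0:
        phases[-1] = 0.0      # Nyquist coefficient must be real for even d
    return phases


def encode(x, phases, d):
    """Fractional binding: the base vector raised to the real power x."""
    return np.fft.irfft(np.exp(1j * phases * x), n=d)


def bind(a, b):
    """Circular convolution, computed in the Fourier domain."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))


if __name__ == "__main__":
    d = 256
    rng = np.random.default_rng(0)
    phases = make_axis_phases(d, rng)
    s_a = encode(1.5, phases, d)
    s_b = encode(-0.25, phases, d)
    shifted = bind(s_a, s_b)                  # should encode 1.5 + (-0.25)
    print("similarity to encode(1.25):", shifted @ encode(1.25, phases, d))
    print("similarity to encode(4.0): ", shifted @ encode(4.0, phases, d))
```

A two-dimensional position could be encoded under the same assumptions by binding the encodings from two independent sets of phases, one per axis.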
Abstract: Silicon neurons designed using subthreshold analog circuit techniques offer low power and compact area but are exponentially sensitive to threshold-voltage mismatch in transistors. The resulting heterogeneity in the neurons’ responses, however, provides a diverse set of basis functions for smooth nonlinear function approximation. For low-order polynomials, neuron spiking thresholds ought to be distributed uniformly across the function’s domain. This uniform distribution is difficult to achieve solely by sizing transistors to titrate mismatch. With too much mismatch, many neurons’ thresholds fall outside the domain (i.e. they either always spike or remain silent). With too little mismatch, all their thresholds bunch up in the middle of the domain. Here, we present a silicon-neuron design methodology that minimizes overall area by optimizing transistor sizes in concert with a few locally-stored programmable bits to adjust each neuron’s offset (and gain). We validated this methodology in a 28-nm mixed analog-digital CMOS process. Compared to relying on mismatch alone, augmentation with digital correction effectively reduced silicon area by 38 percent.
Abstract: Braindrop is the first neuromorphic system designed to be programmed at a high level of abstraction. Previous neuromorphic systems were programmed at the neurosynaptic level and required expert knowledge of the hardware to use. In stark contrast, Braindrop's computations are specified as coupled nonlinear dynamical systems and synthesized to the hardware by an automated procedure. This procedure not only leverages Braindrop's fabric of subthreshold analog circuits as dynamic computational primitives but also compensates for their mismatched and temperature-sensitive responses at the network level. Thus, a clean abstraction is presented to the user. Fabricated in a 28-nm FDSOI process, Braindrop integrates 4096 neurons in $0.65\,\text{mm}^2$. Two innovations, sparse encoding through analog spatial convolution and weighted spike-rate summation through digital accumulative thinning, cut digital traffic drastically, reducing the energy Braindrop consumes per equivalent synaptic operation to 381 fJ for typical network configurations.
Abstract: Dynamical systems are universal computers. They can perceive stimuli, remember, learn from feedback, plan sequences of actions, and coordinate complex behavioural responses. The Neural Engineering Framework (NEF) provides a general recipe to formulate models of such systems as coupled sets of nonlinear differential equations and compile them onto recurrently connected spiking neural networks – akin to a programming language for spiking models of computation. The Nengo software ecosystem supports the NEF and compiles such models onto neuromorphic hardware. In this thesis, we analyze the theory driving the success of the NEF, and expose several core principles underpinning its correctness, scalability, completeness, robustness, and extensibility. We also derive novel theoretical extensions to the framework that enable it to far more effectively leverage a wide variety of dynamics in digital hardware, and to exploit the device-level physics in analog hardware. At the same time, we propose a novel set of spiking algorithms that recruit an optimal nonlinear encoding of time, which we call the Delay Network (DN). Backpropagation across stacked layers of DNs dramatically outperforms stacked Long Short-Term Memory (LSTM) networks—a state-of-the-art deep recurrent architecture—in accuracy and training time, on a continuous-time memory task, and a chaotic time-series prediction benchmark. The basic component of this network is shown to function on state-of-the-art spiking neuromorphic hardware including Braindrop and Loihi. This implementation approaches the energy-efficiency of the human brain in the former case, and the precision of conventional computation in the latter case.
Abstract: We propose a cognitively plausible method for representing and querying spatial relationships in a neural architecture. This technique employs a fractional binding operator that captures continuous spatial information in spatial semantic pointers (SSPs). We propose a model that takes an image with several objects, parses the image into an SSP memory representation, and answers queries about the objects. We demonstrate that our model allows us to not only store and extract objects and their spatial information, but also perform queries based on location and in relation to other objects. We show that we can query images with 2, 3, and 4 objects with relative spatial locations. We also show that the model qualitatively reproduces Kosslyn's famous map experiment.
Keywords: Emotion, Language, Appraisal, Construction, Multi-Level Mechanisms, Affective Computing, Neural Engineering Framework, Semantic Pointers
Abstract: Emotion theory needs to explain the relationship of language and emotions, and the embodiment of emotions, by specifying the computational mechanisms underlying emotion generation in the brain. We used Chris Eliasmith’s Semantic Pointer Architecture to develop POEM, a computational model that explains numerous important phenomena concerning emotions, including how some stimuli generate immediate emotional reactions, how some emotional reactions depend on cognitive evaluations, how bodily states influence the generation of emotions, how some emotions depend on interactions between physiological inputs and cognitive appraisals, and how some emotional reactions concern syntactically complex representations. We contrast our theory with current alternatives, and discuss some possible applications to individual and social emotions.
Abstract: Modern machine learning models are beginning to rival human performance on some realistic object recognition tasks, but we still lack a full understanding of how the human brain solves this same problem. This thesis combines knowledge from machine learning and computational neuroscience to create models of human object recognition that are increasingly realistic both in their treatment of low-level neural mechanisms and in their reproduction of high-level human behaviour. First, I present extensions to the Neural Engineering Framework to make its preferred type of model—the “fixed-encoding” network—more accurate for object recognition tasks. These extensions include better distributions—such as Gabor filters—for the encoding weights, and better loss functions—namely weighted squared loss, softmax loss, and hinge loss—to solve for decoding weights. Second, I introduce increased biological realism into deep convolutional neural networks trained with backpropagation, by training them to run using spiking leaky integrate-and-fire (LIF) neurons. These models have been successful in machine learning, and I am able to convert them to spiking networks while retaining similar levels of performance. I present a novel method to smooth the LIF rate response function in order to avoid the common problems associated with differentiating spiking neurons in general and LIF neurons in particular. I also derive a number of novel characterizations of spiking variability, and use these to train spiking networks to be more robust to this variability. Finally, to address the problems with implementing backpropagation in a biological system, I train spiking deep neural networks using the more biological Feedback Alignment algorithm. I examine this algorithm in depth, including many variations on the core algorithm, methods to train using non-differentiable spiking neurons, and some of the limitations of the algorithm. 
Using these findings, I construct a spiking model that learns online in a biologically realistic manner. The models developed in this thesis help to explain both how spiking neurons in the brain work together to allow us to recognize complex objects, and how the brain may learn this behaviour. Their spiking nature allows them to be implemented on highly efficient neuromorphic hardware, opening the door to object recognition on energy-limited devices such as cell phones and mobile robots.
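The idea of smoothing the LIF rate response, mentioned in the abstract above, can be sketched as follows. This is a minimal illustration and not necessarily the thesis's exact formulation: it assumes the standard steady-state LIF rate equation with a unit firing threshold, and it assumes that smoothing is done by replacing the hard rectification of the input current with a softplus of width `gamma`. The function names and default time constants are hypothetical.

```python
import numpy as np


def lif_rate(j, tau_rc=0.02, tau_ref=0.002, v_th=1.0):
    """Steady-state LIF firing rate (Hz) for input current j; zero below threshold."""
    j = np.atleast_1d(np.asarray(j, dtype=float))
    rates = np.zeros_like(j)
    above = j > v_th
    rates[above] = 1.0 / (tau_ref + tau_rc * np.log1p(v_th / (j[above] - v_th)))
    return rates


def soft_lif_rate(j, gamma=0.02, tau_rc=0.02, tau_ref=0.002, v_th=1.0):
    """Smoothed LIF rate: max(j - v_th, 0) is replaced by a softplus of width gamma."""
    j = np.atleast_1d(np.asarray(j, dtype=float))
    # gamma * log(1 + exp(x / gamma)), computed without overflow
    rho = gamma * np.logaddexp(0.0, (j - v_th) / gamma)
    rho = np.maximum(rho, np.finfo(float).tiny)   # guard against division by zero
    return 1.0 / (tau_ref + tau_rc * np.log1p(v_th / rho))


if __name__ == "__main__":
    currents = np.linspace(0.5, 2.0, 7)
    print("hard:", np.round(lif_rate(currents), 2))
    print("soft:", np.round(soft_lif_rate(currents), 2))
```

The hard curve has an unbounded slope just above threshold, whereas the smoothed curve has a finite derivative everywhere, which is what makes gradient-based training straightforward; as `gamma` shrinks, the smoothed curve approaches the hard one.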
Abstract: The interaction of humans and robots (HRI) is of great relevance for the field of neurorobotics as it can provide insights on motor control and sensor processing mechanisms in humans that can be applied to robotics. We propose a spiking neural network (SNN) to trigger motion reflexes on a robotic hand based on human EMG data. The first part of the network takes EMG signals to measure muscle activity, then classifies the data to detect which finger is active in the human hand. The second part triggers single finger reflexes using the classification output. The finger reflexes are modeled with motion primitives activated with an oscillator and mapped to the robot kinematics. We evaluated the SNN by having users wear a non-invasive EMG sensor, record a training dataset, and then flex different fingers, one at a time. The muscle activity was recorded using a Myo sensor with eight channels. EMG signals were successfully encoded into spikes as input for the SNN. The classification could detect the active finger to trigger motion generation of finger reflexes. The SNN was able to control a real Schunk SVH robotic hand. Being able to map myo-electric activity to functions of motor control for a task can provide an interesting interface for robotic applications, and can also help to study brain functioning. SNNs provide a challenging but interesting framework to interact with human data. In future work the approach will be extended to control a robot arm at the same time.
Abstract: The representation of semantic knowledge poses a central modelling decision in many models of cognitive phenomena. However, not all such representations reflect properties observed in human semantic networks. Here, we evaluate the psychological plausibility of two distributional semantic models widely used in natural language processing: word2vec and GloVe. We use these models to construct directed and undirected semantic networks and compare them to networks of human association norms using a set of graph-theoretic analyses. Our results show that all such networks display small-world characteristics, while only undirected networks show similar degree distributions to those in the human semantic network. Directed networks also exhibit a hierarchical organization that is reminiscent of the human semantic network.
Abstract: We examine moral machine decision-making, inspired by a central question posed by Rossi regarding moral preferences: can AI systems based on statistical machine learning (which do not provide a natural way to explain or justify their decisions) be used for embedding morality into a machine in a way that allows us to prove that nothing morally wrong will happen? We argue for an evaluation held to the same standards as a human agent, removing the demand that ethical behavior is always achieved. We introduce four key meta-qualities desired for our moral standards, and then proceed to clarify how we can prove that an agent will correctly learn to perform moral actions given a set of samples within certain error bounds. Our group-dynamic approach enables us to demonstrate that the learned models converge to a common function to achieve stability. We further explain a valuable intrinsic consistency check made possible through the derivation of logical statements from the machine learning model. In all, this work proposes an approach for building ethical AI systems, from the perspective of artificial intelligence, and sheds important light on understanding how much learning is required for an intelligent agent to behave morally with negligible error.
Abstract: Learning in the Neural Engineering Framework (NEF) and the Semantic Pointer Architecture (SPA) has been recently extended beyond the supervised Prescribed Error Sensitivity (PES) to include the unsupervised Vector Oja (Voja). This thesis demonstrates how the combination of these learning rules can be used to learn associative memories. Moreover, these techniques are used to provide explanations of two behaving cognitive phenomena that are modeled with spiking neurons. First, the standard progression of cognitive addition strategies from counting to memorization, as occurs in children, is modelled as a transfer of skills. Initially, addition by counting is performed in the slow basal ganglia based system, before being overtaken by a rapid cortical associative memory as a type of pre-frontal, cortical consolidation. Second, a word-pair recognition task, where two distinct types of word-pairs are memorized, is modelled. The Voja learning rule is modified to match temporal lobe magnetoencephalography (MEG) data generated by each word-pair type observed during the task. This empirically grounds the associative memory model, which has not been possible using other cognitive modeling paradigms. The distinct implementation of Voja for each area, pre-frontal and temporal, demonstrates the different roles that the areas perform during learning.
Abstract: In our previous work, we have implemented a biologically realistic action selection system that can perform complex tasks such as sentence parsing, the n-Back task and the Tower of Hanoi. Although our models have successfully performed those tasks, they have so far required human researchers to tune multiple parameters before the models can be expected to exhibit good performance. In this paper, we show that an improved, parameter-sparse learning rule can be applied to a cognitive sequencing task.
Abstract: We present a whole-task spiking neural network model of associative recognition, developed in the Nengo framework. Because the resulting model is very complex (>750,000 neurons) we used magnetoencephalographic (MEG) data to constrain the model. The model matches data in occipital, temporal, prefrontal, and motor cortices, and shows how the associative recognition process could be effectively implemented in the human brain.
Abstract: I present the context-unified encoding (CUE) model, a large-scale spiking neural network model of human memory. It combines and integrates activity-based short-term memory with weight-based long-term memory. The implementation with spiking neurons ensures biological plausibility and allows for predictions on the neural level. At the same time, the model produces behavioural outputs that have been matched to human data from serial and free recall experiments. In particular, well-known results such as primacy, recency, transposition error gradients, and forward recall bias have been reproduced with good quantitative matches. Additionally, the model accounts for the effects of the acetylcholine antagonist scopolamine, and the Hebb repetition effect. The CUE model combines and extends the ordinal serial encoding (OSE) model, a spiking neuron model of short-term memory, and the temporal context model (TCM), a mathematical model of free recall. To the former, a neural mechanism for tracking the list position is added. The latter is converted into a spiking neural network under considerations of the main features and simplification of equations where appropriate. Previous models of the recall process in the TCM are replaced by a new independent accumulator recall process that is more suited to the integration into a large-scale network. To implement the modification of the required association matrices, a novel learning rule, the association matrix learning rule (AML), is derived that allows for one-shot learning without catastrophic forgetting. Its biological plausibility is discussed and it is shown that it accounts for changes in neural firing observed in human recordings from an association learning experiment. Furthermore, I discuss a recent proposal of an optimal fuzzy temporal memory as replacement for the TCM context signal and show it to be likely to require more neurons than there are in the human brain. 
To construct the CUE model, I have used the Neural Engineering Framework (NEF) and Semantic Pointer Architecture (SPA). This thesis makes novel contributions to both. I propose to distribute NEF intercepts according to the distribution of cosine similarities of random uniformly distributed unit vectors. This leads to a uniform distribution of active neurons and considerably reduces the error introduced by spiking noise in high-dimensional neuronal representations. It improves the asymptotic scaling of the noise error with the number of dimensions $d$ from $O(d)$ to $O(d^{3/4})$. These results are applied to achieve improved Semantic Pointer representations in neural networks that are on par with or better than previous methods of optimizing neural representations for the Semantic Pointer Architecture. Furthermore, the vector-derived transformation binding (VTB) is investigated as an alternative to circular convolution in the SPA, with promising results.
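The intercept proposal above rests on how cosine similarities between random unit vectors are distributed. The following is a minimal sketch of that distribution, sampled empirically with the Python standard library; it is not the thesis's implementation, and the helper name `cosine_similarity_sample`, the dimension, and the sample count are assumptions chosen for illustration. It relies on the standard fact that such similarities have mean zero and variance 1/d.

```python
import math
import random


def cosine_similarity_sample(d, rng):
    """Cosine similarity between a fixed unit vector and a random unit vector in R^d.

    Hypothetical helper for illustration only.
    """
    # A normalised Gaussian vector is uniformly distributed on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    # With the fixed vector taken to be the first basis vector, the cosine
    # similarity is simply the first normalised coordinate.
    return g[0] / norm


rng = random.Random(1)
d = 64  # illustrative dimensionality (assumption)
samples = [cosine_similarity_sample(d, rng) for _ in range(5000)]

mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)

# The similarities concentrate around zero, with variance close to 1/d.
print(f"mean={mean:.4f}, variance={variance:.4f}, 1/d={1 / d:.4f}")
```

As d grows, the samples cluster ever more tightly around zero, which is why intercepts drawn uniformly from [-1, 1] would leave many neurons almost always or almost never active in high-dimensional representations.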
Abstract: Low-power, high-speed neural networks are critical for providing deployable embedded AI applications at the edge. We describe an FPGA implementation of Neural Engineering Framework (NEF) networks with online learning that outperforms mobile GPU implementations by an order of magnitude or more. Specifically, we provide an embedded Python-capable PYNQ FPGA implementation supported with a High-Level Synthesis (HLS) workflow that allows sub-millisecond implementation of adaptive neural networks with low-latency, direct I/O access to the physical world. We tune the precision of the different intermediate variables in the code to achieve competitive absolute accuracy against slower and larger floating-point reference designs. The online learning component of the neural network exploits immediate feedback to adjust the network weights to best support a given arithmetic precision. As the space of possible design configurations of such networks is vast and is subject to a target accuracy constraint, we use the Hyperopt hyper-parameter tuning tool instead of manual search to find Pareto optimal designs. Specifically, we are able to generate the optimized designs in under 500 iterations of Vivado HLS before running the complete Vivado place-and-route phase on that subset. For neural network populations of 64–4096 neurons and 1–8 representational dimensions our optimized FPGA implementation generated by Hyperopt has a speedup of 10–484× over a competing cuBLAS implementation on the Jetson TX1 GPU while using 2.4–9.5× less power. Our speedups are a result of HLS-specific reformulation (15× improvement), precision adaptation (4× improvement), and low-latency direct I/O access (1000× improvement).
Abstract: Nonlinear interaction in the dendritic tree is known to be an important computational resource in biological neurons. Yet, high-level neural compilers – such as the Neural Engineering Framework (NEF), or the predictive coding method published by Denève et al. in 2013 – tend not to include conductance-based nonlinear synaptic interactions in their models, and so do not exploit these interactions systematically. In this study, we extend the NEF to include synaptic computation of nonlinear multivariate functions, such as controlled shunting, multiplication, and the Euclidean norm. We present a theoretical framework that provides sufficient conditions under which nonlinear synaptic interaction yields a similar precision compared to traditional NEF methods, while reducing the number of layers, neurons, and latency in the network. The proposed method lends itself to increasing the computational power of neuromorphic hardware systems and improves the NEF's biological plausibility by mitigating one of its long-standing limitations, namely its reliance on linear, current-based synapses. We perform a series of numerical experiments with a conductance-based two-compartment LIF neuron model. Preliminary results show that nonlinear interactions in conductance-based synapses are sufficient to compute a wide variety of nonlinear functions with performance competitive to using an additional layer of neurons as a nonlinearity.
Abstract: We have previously shown that a biologically realistic spiking neuron implementation of an action selection/execution system (constrained by the neurological connectivity of the cortex, basal ganglia, and thalamus) is capable of performing complex tasks, such as the Tower of Hanoi, n-Back, and semantic memory search. However, because the neural implementation approximates a strict rule-based structure of a production system, such models have involved hand-tweaking of multiple parameters to get the desired behaviour. Here, we show that a simple, local, online learning rule can be used to learn these parameters, resulting in neural models of cognitive behaviours that are more reliable and easier to construct than with prior methods.
Abstract: Hierarchical categorization interleaved with sequence recognition of incoming stimuli in the mammalian brain is theorized to be performed by circuits composed of the thalamus and the six-layer cortex. Using these circuits, the cortex is thought to learn a ‘brain grammar’ composed of recursive sequences of categories. A thalamo-cortical, hierarchical classification and sequence learning “Core” circuit was implemented as a linear matrix simulation and published by Rodriguez, Whitson & Granger in 2004. In the brain, these functions are implemented by cortical and thalamic circuits composed of recurrently-connected, spiking neurons. The Neural Engineering Framework (NEF) (Eliasmith & Anderson, 2003) allows for the construction of large-scale biologically plausible neural networks. NEF models of the basal ganglia and the thalamus exist, but to the best of our knowledge there is no integrated, spiking-neuron, cortical-thalamic-Core network model. We construct a more biologically-plausible version of the hierarchical-classification function of the Core circuit using leaky-integrate-and-fire neurons, which performs progressive visual classification of static image sequences, relying on the neural activity levels to trigger the progressive classification of the stimulus. We proceed by implementing a recurrent NEF model of the cortical-thalamic Core circuit and then test the resulting model on the hierarchical categorization of images.
Abstract: Building large-scale brain models is one method used by theoretical neuroscientists to understand the way the human brain functions. Researchers typically use either a bottom-up approach, which focuses on the detailed modelling of various biological properties of the brain and places less importance on reproducing functional behaviour, or a top-down approach, which generally aims to reproduce the behaviour observed in real cognitive agents, but typically sacrifices adherence to constraints imposed by the neurobiology. The focus of this thesis is Spaun, a large-scale brain model constructed using a combination of the bottom-up and top-down approaches to brain modelling. Spaun is currently the world’s largest functional brain model, capable of performing eight distinct cognitive tasks ranging from digit recognition to inductive reasoning. The thesis is organized to discuss three aspects of the Spaun model. First, it describes the original Spaun model, and explores how a top-down approach, known as the Semantic Pointer Architecture (SPA), has been combined with a bottom-up approach, known as the Neural Engineering Framework (NEF), to integrate six existing cognitive models into a unified cognitive model that is Spaun. Next, the thesis identifies some of the concerns with the original Spaun model, and shows the modifications made to the network to remedy these issues. It also characterizes how the Spaun model was re-organized and re-implemented (to include the aforementioned modifications) as the Spaun 2.0 model. As part of the discussion of the Spaun 2.0 model, task performance results are presented that compare the original Spaun model and the re-implemented Spaun 2.0 model, demonstrating that the modifications to the Spaun 2.0 model have improved its accuracy on the working memory task and the two induction tasks. Finally, three extensions to Spaun 2.0 are presented.
These extensions take advantage of the re-organized Spaun model, giving Spaun 2.0 new capabilities – a motor system capable of adapting to unknown force fields applied to its arm; a visual system capable of processing 256×256 full-colour images; and the ability to follow general instructions. The Spaun model and architecture presented in this thesis demonstrate that by using the SPA and the NEF, it is not only possible to construct functional large-scale brain models, but to do so in a manner that supports complex extensions to the model. The final Spaun 2.0 model consists of approximately 6.6 million neurons, can perform 12 cognitive tasks, and has been demonstrated to reproduce behavioural and neurological data observed in natural cognitive agents.
Abstract: Researchers building spiking neural networks face the challenge of improving the biological plausibility of their model networks while maintaining the ability to quantitatively characterize network behavior. In this work, we extend the theory behind the neural engineering framework (NEF), a method of building spiking dynamical networks, to permit the use of a broad class of synapse models while maintaining prescribed dynamics up to a given order. This theory improves our understanding of how low-level synaptic properties alter the accuracy of high-level computations in spiking dynamical networks. For completeness, we provide characterizations for both continuous-time (i.e., analog) and discrete-time (i.e., digital) simulations. We demonstrate the utility of these extensions by mapping an optimal delay line onto various spiking dynamical networks using higher-order models of the synapse. We show that these networks nonlinearly encode rolling windows of input history, using a scale invariant representation, with accuracy depending on the frequency content of the input signal. Finally, we reveal that these methods provide a novel explanation of time cell responses during a delay task, which have been observed throughout hippocampus, striatum, and cortex.
Abstract: Using Intel's Loihi neuromorphic research chip and ABR's Nengo Deep Learning toolkit, we analyze the inference speed, dynamic power consumption, and energy cost per inference of a two-layer neural network keyword spotter trained to recognize a single phrase. We perform comparative analyses of this keyword spotter running on more conventional hardware devices including a CPU, a GPU, Nvidia's Jetson TX1, and the Movidius Neural Compute Stick. Our results indicate that for this inference application, Loihi outperforms all of these alternatives on an energy cost per inference basis while maintaining near-equivalent inference accuracy. Furthermore, an analysis of tradeoffs between network size, inference speed, and energy cost indicates that Loihi's comparative advantage over other low-power computing devices improves for larger networks.
Abstract: Neural networks have long been used to study linguistic phenomena spanning the domains of phonology, morphology, syntax, and semantics. Of these domains, semantics is somewhat unique in that there is little clarity concerning what a model needs to be able to do in order to provide an account of how the meanings of complex linguistic expressions, such as sentences, are understood. We argue that one thing such models need to be able to do is generate predictions about which further sentences are likely to follow from a given sentence; these define the sentence's “inferential role.” We then show that it is possible to train a tree-structured neural network model to generate very simple examples of such inferential roles using the recently released Stanford Natural Language Inference (SNLI) dataset. On an empirical front, we evaluate the performance of this model by reporting entailment prediction accuracies on a set of test sentences not present in the training data. We also report the results of a simple study that compares human plausibility ratings for both human-generated and model-generated entailments for a random selection of sentences in this test set. On a more theoretical front, we argue in favor of a revision to some common assumptions about semantics: understanding a linguistic expression is not only a matter of mapping it onto a representation that somehow constitutes its meaning; rather, understanding a linguistic expression is mainly a matter of being able to draw certain inferences. Inference should accordingly be at the core of any model of semantic cognition.
Abstract: In previous work, we have implemented spiking neuron models that use a biologically realistic action selection system to solve complex cognitive tasks, including the Tower of Hanoi and semantic memory search. However, such models often require the fine-tuning of multiple parameters so that the model can reach a desired level of performance. Recently, we demonstrated that a local, online learning rule, which only requires a single parameter, the learning rate, is sufficient for teaching our model how to solve a general cognitive sequencing task. Here, we refine our method by showing that adding adaptive learning is more robust regarding our choice of parameters, and will achieve better performance on all versions of the cognitive task that we tested. These results provide a foundation for building complex cognitive models that require no hand-tuning of parameters.
Abstract: Standardized tests exist for the diagnostics of developmental lexical disorders, but it is still difficult to associate the resulting behavior of a child while speaking with functional deficits in the child's brain. The mental lexicon is part of the speech and language knowledge repository of individuals. It enables humans to produce as well as to understand speech. The computational frameworks we used for implementing a model of the mental lexicon and speech processing are the NEF (Neural Engineering Framework, Eliasmith et al. 2012, Eliasmith 2013) and the SPA (Semantic Pointer Architecture, Eliasmith et al. 2012, Stewart & Eliasmith 2014). These frameworks allow modeling of large-scale neural networks, comprising sensory, motor and cognitive components. The modeled task is the WWT 6-10 (Word range and Word Retrieval Test, see Glück 2011), which comprises 95 items and is a picture naming and word comprehension task. In case of incorrect answers, semantic and phonological cues are also given in order to facilitate word production. A major goal of this study is to introduce a quantitative neurocomputational model for lexical storage as well as for lexical retrieval. A further goal of this study is to associate neural dysfunctions with deficits in speech behavior. Concretely, the deficits of interest are in lexical storage and lexical access. The dysfunctions introduced here are the lesioning of specific neural SPA-buffers and of specific neural connections between these buffers. Based on the behavioral data given by the WWT, we are now able to associate functional neural deficits with symptomatic behavioral data. This allows us to identify potential dysfunctions at the neural level for word retrieval and word storage.
Abstract: Artificial neural networks are known to perform function approximation, but with increasingly large non-redundant input spaces, the number of required neurons grows drastically. Functions have to be sampled densely, leading to large data sets which pose problems for applications such as neurorobotics and require a long time for training. Furthermore, such networks perform poorly on extrapolation as there are no model assumptions about the target function. This paper presents a novel network architecture of spiking neural networks for efficient model-based function approximation and prediction based on the concept of multivariate polynomial function approximation. This approach reduces the number of both training samples and required neurons, provides generalization and extrapolation depending on the chosen basis, and is capable of supervised learning. The network is implemented using the Neural Engineering Framework in the Nengo simulator and is centered around a mechanism for efficiently computing products of many input signals. We present the construction of the compound network and a performance evaluation, and propose a use case for its application.
Abstract: We present a novel approach to achieving temperature-robust behavior in neuromorphic systems that operates at the population level, trading an increase in silicon-neuron count for robustness across temperature. Our silicon neurons' tuning curves were highly sensitive to temperature, which could be decoded from a 400-neuron population with a precision of 0.07°C. We overcame this temperature-sensitivity by combining methods from robust optimization theory with the Neural Engineering Framework. We developed two algorithms and compared their temperature-robustness across a range of 2°C by decoding one period of a sinusoid-like function from populations with 25 to 800 neurons. We find that 560 neurons are required to achieve the same precision across this temperature range as 35 neurons achieved at a single temperature.
Abstract: The most accurate stereo disparity algorithms take dozens or hundreds of seconds to process a single frame. This timescale is impractical for many applications. However, high accuracy is often not needed throughout the scene. Here, we investigate a “foveation” approach (in which some parts of an image are processed more intensively than others) in the context of modern stereo algorithms. We consider two scenarios: disparity estimation with a convolutional network in a robotic grasping context, and disparity estimation with a Markov random field in a navigation context. In each case, combining fast and slow methods in different parts of the scene improves frame rates while maintaining accuracy in the most task-relevant areas. We also demonstrate a simple and broadly applicable utility function for choosing foveal regions, which combines image and task information. Finally, we characterize the benefits of defining multiple individually placed small foveae per image, rather than a single large fovea. We find little benefit, supporting the use of hardware foveae of fixed size and shape. More generally, our results reaffirm that foveation is a practical way to combine speed with task-relevant accuracy. Foveae are present in the most complex biological vision systems, suggesting that they may become more important in artificial vision systems, as these systems become more complex.
Abstract: We explore the effects of parameters in our novel model of model-based reinforcement learning. In this model, spiking neurons are used to represent state-action pairs, learn state transition probabilities, and compute the resulting Q-values needed for action selection. All other aspects of model-based reinforcement learning are computed normally, without neurons. We test our model on a two-stage decision task, and compare its behaviour to ideal model-based behaviour. While some of these parameters have expected effects, such as increasing the learning rate and the number of neurons, we find that the model is surprisingly sensitive to variations in the distribution of neural tuning curves and the length of the time interval between state transitions.
Abstract: We provide a short proof that the uniform distribution of points for the n-ball is equivalent to the uniform distribution of points for the (n + 1)-sphere projected onto n dimensions. This implies the surprising result that one may uniformly sample the n-ball by instead uniformly sampling the (n + 1)-sphere and then arbitrarily discarding two coordinates. Consequently, any procedure for sampling coordinates from the uniform (n + 1)-sphere may be used to sample coordinates from the uniform n-ball without any modification. For purposes of the Semantic Pointer Architecture (SPA), these insights yield an efficient and novel procedure for sampling the dot-product of vectors—sampled from the uniform ball—with unit-length encoding vectors.
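The sampling procedure described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `sample_ball` is hypothetical, and the sphere is sampled with the standard normalised-Gaussian method. In the abstract's convention the (n + 1)-sphere lives in n + 2 dimensions, so the sketch draws n + 2 coordinates and keeps n of them.

```python
import math
import random


def sample_ball(n, rng):
    """Sample one point uniformly from the unit n-ball (hypothetical helper)."""
    # Normalising n + 2 independent Gaussians gives a point uniformly
    # distributed on the unit sphere embedded in R^(n + 2).
    g = [rng.gauss(0.0, 1.0) for _ in range(n + 2)]
    norm = math.sqrt(sum(x * x for x in g))
    # Discard two coordinates; which two are dropped is arbitrary.
    return [x / norm for x in g[:n]]


rng = random.Random(0)
n = 3
points = [sample_ball(n, rng) for _ in range(20000)]
radii = [math.sqrt(sum(x * x for x in p)) for p in points]

# For a uniform n-ball, P(r <= t) = t^n, so roughly 0.5^3 = 12.5% of the
# points should fall within radius 0.5 when n = 3.
frac_inner = sum(r <= 0.5 for r in radii) / len(radii)
print(f"fraction with r <= 0.5: {frac_inner:.4f}")
```

The radius check is only a sanity test of the radial distribution; it does not by itself establish uniformity, which is what the proof in the paper addresses.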
Abstract: Theoretical neuroscience is fundamentally concerned with the relationship between biological mechanisms, information processing, and cognitive abilities, yet current models often lack either biophysical realism or cognitive functionality. This thesis aims to partially fill this gap by incorporating geometrically and electrophysiologically accurate models of individual neurons into the Neural Engineering Framework (NEF). After discussing the relationship between biologically complex neurons and the core principles/assumptions of the NEF, a neural model of working memory is introduced to demonstrate the NEF's existing capacity to capture biological and cognitive features. This model successfully performs the delayed response task and provides a medium for simulating a mental disorder (ADHD) and its pharmacological treatments. Two methods of integrating more biologically sophisticated NEURON models into the NEF are subsequently explored and their ability to implement networks of varying complexity is assessed: the trained synaptic weights do realize the core NEF principles, though several errors remain unresolved. Returning to the working memory model, it is shown that bioneurons can perform the requisite computations in context, and that simulating the biophysical effects of pharmacological compounds produces results consistent with electrophysiological and behavioral data from monkeys.
Abstract: Finger gnosis (the ability to identify which finger has been touched) and magnitude comparison (the ability to determine which of two numbers is larger) are surprisingly correlated. We present a spiking neuron model of a common component that could be used in both tasks: an array of pointers. We show that if the model's single tuned parameter is set to match human accuracy performance in one task, then it also matches on the other task (with the exception of one data point). This provides a novel explanation of the relation, and proposes a common component that could be used across cognitive tasks.
Abstract: Generating associations is important for cognitive tasks including language acquisition and creative problem solving. It remains an open question how the brain represents and processes associations. The Remote Associates Test (RAT) is a task, originally used in creativity research, that is heavily dependent on generating associations in a search for the solutions to individual RAT problems. In this work we present a model that solves the test. Compared to earlier modeling work on the RAT, our hybrid (i.e. non-developmental) model is implemented in a spiking neural network by means of the Neural Engineering Framework (NEF), demonstrating that it is possible for spiking neurons to be organized to store the employed representations and to manipulate them. In particular, the model shows that distributed representations can support sophisticated linguistic processing. The model was validated on human behavioral data including the typical length of response sequences and similarity relationships in produced responses. These data suggest two cognitive processes that are involved in solving the RAT: one process generates potential responses and a second process filters the responses.
Abstract: We present a trajectory generating circuit using efficient function representation coding in a spiking neural network that can generate multiple complex trajectories dynamically from a single network. Integrating multiple trajectories within a single network allows us to explore the transitions between movements. We suggest that this kind of network is a possible mechanism for efficiently storing a wide array of movement features in the cortex, and compare our results to experimental data.
Abstract: One critical factor limiting the size of neural cognitive models is the time required to simulate such models. To reduce simulation time, specialized hardware is often used. However, such hardware can be costly, not readily available, or require specialized software implementations that are difficult to maintain. Here, we present an algorithm that optimizes the computational graph of the Nengo neural network simulator, allowing simulations to run more quickly on commodity hardware. This is achieved by merging identical operations into single operations and restructuring the accessed data in larger blocks of sequential memory. In this way, a time speed-up of up to 6.8 is obtained. While this does not beat the specialized OpenCL implementation of Nengo, this optimization is available on any platform that can run Python. In contrast, the OpenCL implementation supports fewer platforms and can be difficult to install.
Abstract: A model producing behavior mimicking that of a homing desert ant while approaching the nest along a habitual route is presented. The model combines two strategies that interact with each other: local vector navigation and landmark guidance with an average landmark vector. As a multi-segment route with several waypoints is traversed, local vector navigation is mainly used when leaving a waypoint, landmark guidance is mostly used when approaching a waypoint, and a weighted interplay of the two is used in between waypoints. The model comprises a spiking neural network that is developed based on the principles of the Neural Engineering Framework. Its performance is demonstrated with a simulated robot in a virtual environment, which is shown to successfully navigate to the final waypoint in different scenes.
Abstract: We review our current software tools and theoretical methods for applying the Neural Engineering Framework to state-of-the-art neuromorphic hardware. These methods can be used to implement linear and nonlinear dynamical systems that exploit axonal transmission time-delays, and to fully account for nonideal mixed-analog-digital synapses that exhibit higher-order dynamics with heterogeneous time-constants. This summarizes earlier versions of these methods that have been discussed in a more biological context (Voelker & Eliasmith, 2017) or regarding a specific neuromorphic architecture (Voelker et al., 2017).
Abstract: Large-scale neuromorphic hardware platforms, specialized computer systems for energy efficient simulation of spiking neural networks, are being developed around the world, for example as part of the European Human Brain Project (HBP). Due to conceptual differences, a universal performance analysis of these systems in terms of runtime, accuracy and energy efficiency is non-trivial, yet indispensable for further hardware and software development. In this paper we describe a scalable benchmark based on a spiking neural network implementation of the binary neural associative memory. We treat neuromorphic hardware and software simulators as black-boxes and execute exactly the same network description across all devices. Experiments on the HBP platforms under varying configurations of the associative memory show that the presented method makes it possible to test the quality of the neuron model implementation and to explain significant deviations from the expected reference output.
Abstract: We believe that a Standard Model of the Mind should take into account continuous state representations, continuous timing, continuous actions, continuous learning, and parallel control loops. For each of these, we describe initial models that we have made exploring these directions. While we have demonstrated that it is possible to construct high-level cognitive models with these features (which are uncommon in most cognitive modeling approaches), there are many theoretical challenges still to be faced to allow these features to interact in useful ways and to characterize what may be gained by including these features.
Abstract: Winner-take-all (WTA) mechanisms are an important component of many cognitive models. For example, they are often used to decide between multiple choices or to selectively direct attention. Here we compare two biologically plausible, spiking neural WTA mechanisms. We first provide a novel spiking implementation of the well-known leaky, competing accumulator (LCA) model, by mapping the dynamics onto a population-level representation. We then propose a two-layer spiking independent accumulator (IA) model, and compare its performance against the LCA network on a variety of WTA benchmarks. Our findings suggest that while the LCA network can rapidly adapt to new winners, the IA network is better suited for stable decision making in the presence of noise.
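For readers unfamiliar with the leaky, competing accumulator, the following is a minimal non-spiking sketch of its commonly cited dynamics, in which each unit is driven by its input, leaks at rate `k`, and is inhibited by the other units with strength `beta`. It is not the spiking implementation the abstract describes; the function name `simulate_lca` and all parameter values are assumptions chosen for illustration, and noise is omitted.

```python
import numpy as np

def simulate_lca(inputs, k=1.0, beta=2.0, tau=0.1, dt=0.001, steps=1000):
    """Euler-integrate noise-free LCA dynamics and return the final state."""
    rho = np.asarray(inputs, dtype=float)
    x = np.zeros_like(rho)
    for _ in range(steps):
        # Lateral inhibition: each unit is inhibited by the sum of the others.
        inhibition = beta * (x.sum() - x)
        dx = (rho - k * x - inhibition) * dt / tau
        # Activities are clamped at zero, as in the rectified LCA.
        x = np.maximum(x + dx, 0.0)
    return x

# Hypothetical inputs: the first option has the strongest evidence.
final_state = simulate_lca([1.0, 0.8, 0.6])
```

With inhibition stronger than the leak, the unit with the largest input suppresses the others and settles near `input / k`, which is the winner-take-all behaviour.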
Abstract: The development of the field of reinforcement learning was based on psychological studies of the instrumental conditioning of humans and other animals. Recently, reinforcement learning algorithms have been applied to neuroscience to help characterize neural activity and animal behaviour in instrumental conditioning tasks. A specific example is the hybrid learner developed to match human behaviour on a two-stage decision task. This hybrid learner is composed of a model-free and a model-based system. The model presented in this thesis is an implementation of that model-based system where the state transition probabilities and Q-value calculations use biologically plausible spiking neurons. Two variants of the model demonstrate the behaviour when the state transition probabilities are encoded in the network at the beginning of the task, and when these probabilities are learned over the course of the task. Various parameters that affect the behaviour of the model are explored, and ranges of these parameters that produce characteristically model-based behaviour are found. This work provides an important first step toward understanding how a model-based system in the human brain could be implemented, and how this system contributes to human behaviour.
Abstract: Conceptors, although biologically implausible, admirably capture high dimensional dynamical patterns. This report contains a concise overview of Conceptors and describes how the same dynamical pattern approximation can be achieved, with limitations, in a biologically plausible manner using the Neural Engineering Framework. Two methods are compared to Conceptors with this goal in mind: Rhythmic Dynamic Movement Primitives, both with and without point attractors. In terms of representing a dynamic signal, Dynamic Movement Primitives implemented directly performed well for low-frequency signals, while Conceptors performed well for high-frequency sinusoidal signals. In terms of blending between dynamic signals, Conceptors are distinct from Dynamic Movement Primitives, but the usefulness of this is unclear.
Abstract: Prescribed Error Sensitivity (PES) is a biologically plausible supervised learning rule that is frequently used with the Neural Engineering Framework (NEF). PES modifies the connection weights between populations of spiking neurons to minimize an error signal. Continuing the work of Voelker (2015), we solve for the dynamics of PES, while filtering the error with an arbitrary linear synapse model. For the most common case of a lowpass filter, the continuous-time weight changes are characterized by a second-order bandpass filter with frequency $\omega = \sqrt{\tau^{-1} \kappa \|{\bf a}\|^2}$ and bandwidth $Q = \sqrt{\tau \kappa \|{\bf a}\|^2}$, where τ is the exponential time constant, κ is the learning rate, and ${\bf a}$ is the activity vector. Therefore, the error converges to zero, yet oscillates if and only if $\tau \kappa \|{\bf a}\|^2 > \frac{1}{4}$. This provides a heuristic for setting κ based on the synaptic τ, and a method for engineering remarkably accurate decaying oscillators using only a single spiking leaky integrate-and-fire neuron.
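The expressions quoted in this abstract are easy to evaluate numerically. The helper below is a hypothetical convenience function, not part of any library: it takes the synaptic time constant, the learning rate, and the squared norm of the activity vector, and returns the frequency, the quantity the abstract calls Q, and whether the stated oscillation condition holds. How κ relates to the learning-rate parameter of a particular simulator is not addressed here.

```python
import math

def pes_filter_params(tau, kappa, a_norm_sq):
    """Evaluate the abstract's expressions for omega, Q, and oscillation.

    tau       -- exponential time constant of the lowpass synapse (seconds)
    kappa     -- learning rate
    a_norm_sq -- squared norm of the activity vector, ||a||^2
    """
    gain = kappa * a_norm_sq
    omega = math.sqrt(gain / tau)   # omega = sqrt(kappa * ||a||^2 / tau)
    q = math.sqrt(tau * gain)       # Q = sqrt(tau * kappa * ||a||^2)
    oscillates = tau * gain > 0.25  # condition stated in the abstract
    return omega, q, oscillates

# Hypothetical values chosen only to show both regimes.
oscillating_case = pes_filter_params(tau=0.1, kappa=1.0, a_norm_sq=10.0)
damped_case = pes_filter_params(tau=0.1, kappa=1.0, a_norm_sq=1.0)
```

Because Q squared equals τκ‖a‖², the oscillation condition is equivalent to Q > 1/2, so for a fixed τ and activity norm the heuristic amounts to keeping κ at or below 1 / (4 τ ‖a‖²) when oscillation is unwanted.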
Abstract: Cognitive models have long been used to study linguistic phenomena spanning the domains of phonology, syntax, and semantics. Of these domains, semantics is unique in that there is little clarity concerning what a model ought to do to provide an account of how the meanings of complex linguistic expressions are understood. To address this problem, we introduce a neural model that is trained to generate sentences that follow from an input sentence. The model is trained using the Stanford Natural Language Inference dataset, and to evaluate its performance, we report entailment prediction accuracies on test sentences not present in the training data. We also report the results of a simple study that compares human plausibility ratings for both ground-truth and model-generated entailments for a random selection of test sentences. Taken together, these analyses indicate that the model accounts for important inferential relationships amongst linguistic expressions.
Abstract: We develop a novel, biologically detailed neural model of reinforcement learning (RL) processes in the brain. This model incorporates a broad range of biological features that pose challenges to neural RL, such as temporally extended action sequences, continuous environments involving unknown time delays, and noisy/imprecise computations. Most significantly, we expand the model into the realm of hierarchical reinforcement learning (HRL), which divides the RL process into a hierarchy of actions at different levels of abstraction. Here we implement all the major components of HRL in a neural model that captures a variety of known anatomical and physiological properties of the brain. We demonstrate the performance of the model in a range of different environments, in order to emphasize the aim of understanding the brain’s general reinforcement learning ability. These results show that the model compares well to previous modelling work and demonstrates improved performance as a result of its hierarchical ability. We also show that the model’s behaviour is consistent with available data on human hierarchical RL, and generate several novel predictions.
Abstract: The "backprop" algorithm has led to incredible successes for machines on object recognition tasks (among others), but how similar types of supervised learning may occur in the brain remains unclear. We present a fully spiking, biologically plausible supervised learning algorithm that extends the Feedback Alignment (FA) algorithm to run in spiking LIF neurons. This entirely spiking learning algorithm is a novel hypothesis about how biological systems may perform deep supervised learning. It addresses a number of the key problems with the biological plausibility of the backprop algorithm: 1) It does not use the transpose weight matrix to propagate error backwards, but rather uses a random weight matrix. 2) It does not use the derivative of the hidden unit activation function, but rather uses a function of the hidden neurons' filtered spiking outputs. We test this algorithm on a simple input-output function learning task with a two-hidden-layer deep network. The algorithm is able to learn at both hidden layers, and performs much better than shallow learning. Future work includes extending this algorithm to more challenging datasets, and comparing it with other candidate algorithms for more biologically plausible learning.
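The first point in this abstract, replacing the transposed forward weights with a fixed random matrix when propagating error backwards, can be shown in a few lines of NumPy. The sketch below is a non-spiking, rate-based simplification with one hidden layer; the function name `fa_updates` is hypothetical. Unlike the algorithm in the abstract, it uses the tanh derivative (written as a function of the hidden output) rather than a function of filtered spikes.

```python
import numpy as np

def fa_updates(x, target, W1, W2, B):
    """Compute feedback-alignment weight updates for one batch.

    x: (batch, n_in), target: (batch, n_out)
    W1: (n_in, n_hid), W2: (n_hid, n_out), B: (n_out, n_hid) fixed random feedback
    Returns (dW1, dW2, error); subtract learning_rate * dW from each weight matrix.
    """
    h = np.tanh(x @ W1)   # hidden activity
    y = h @ W2            # linear readout
    e = y - target        # output error
    dW2 = h.T @ e         # output layer update, identical to backprop
    # Backprop would compute e @ W2.T here; feedback alignment uses B instead.
    delta_h = (e @ B) * (1.0 - h ** 2)
    dW1 = x.T @ delta_h
    return dW1, dW2, e

# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
target = rng.normal(size=(4, 2))
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 2))
B = rng.normal(size=(2, 5))
dW1_fa, dW2_fa, err = fa_updates(x, target, W1, W2, B)
dW1_bp, _, _ = fa_updates(x, target, W1, W2, W2.T)  # B = W2.T recovers backprop
```

Passing `W2.T` as the feedback matrix reproduces the ordinary backprop gradient, which makes the single line that differs between the two algorithms explicit.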
Keywords: Working memory; Delayed response task; Neural Engineering Framework; Pharmacology; ADHD
Abstract: We use a spiking neural network model of working memory (WM) capable of performing the spatial delayed response task (DRT) to investigate two drugs that affect WM: guanfacine (GFC) and phenylephrine (PHE). In this model, the loss of information over time results from changes in the spiking neural activity through recurrent connections. We reproduce the standard forgetting curve and then show that this curve changes in the presence of GFC and PHE, whose application is simulated by manipulating functional, neural, and biophysical properties of the model. In particular, applying GFC causes increased activity in neurons that are sensitive to the information currently being remembered, while applying PHE leads to decreased activity in these same neurons. Interestingly, these differential effects emerge from network-level interactions because GFC and PHE affect all neurons equally. We compare our model to both electrophysiological data from neurons in monkey dorsolateral prefrontal cortex and to behavioral evidence from monkeys performing the DRT.
Abstract: The most general goal of semantic theory is to explain facts about language use. In keeping with this goal, I introduce a framework for thinking about linguistic expressions in terms of (a) the inferences they license, (b) the behavioral predictions that their uses thereby sustain, and (c) the affordances that they provide to language users in virtue of these inferential and predictive involvements. Within this framework, linguistic expressions acquire meanings by regulating social practices that involve “intentional interpretation,” wherein people explain and predict one another’s behavior through linguistically specified mental state attributions. Developing a theory of meaning therefore requires formalizing the inferential roles that determine how linguistic expressions license predictions in the context of intentional interpretation. Accordingly, the view I develop is an inferential role semantics for natural language. To describe this semantics, I take advantage of recently developed techniques in the field of natural language processing. I introduce a model that assigns inferential roles to arbitrary linguistic expressions by learning from examples of how sentences are distributed as premises and conclusions in a space of possible inferences. I then empirically evaluate the model’s ability to generate accurate entailments for novel sentences not used as training examples. I argue that this model takes a small but important step towards codifying the meanings of the expressions it manipulates. Next, I examine the theoretical implications of this work with respect to debates about the compositionality of language, the relationship between language and cognition, and the relationship between language and the world.
With respect to compositionality, I argue that the debate is really about generalization in language use, and that the required sort of generalization can be achieved by “interpolating” between familiar examples of correct inferential transitions. With respect to the relationship between thought and language, I argue that it is a mistake to try to derive a theory of natural language semantics from a prior theory of mental representation because theories of mental representation invoke the sort of intentional interpretation at play in language use from the get-go. With respect to the relationship between language and the world, I argue that questions about truth conditions and reference relations are best thought of in terms of questions about the norms governing language use. These norms, in turn, are best characterized in primarily inferential terms. I conclude with an all-things-considered evaluation of my theory that demonstrates how it overcomes a number of challenges associated with semantic theories that take inference, rather than reference, as their starting point.
Abstract: The Neural Engineering Framework (NEF) is a theory for mapping computations onto biologically plausible networks of spiking neurons. This theory has been applied to a number of neuromorphic chips. However, within both silicon and real biological systems, synapses exhibit higher-order dynamics and heterogeneity. To date, the NEF has not explicitly addressed how to account for either feature. Here, we analytically extend the NEF to directly harness the dynamics provided by heterogeneous mixed-analog-digital synapses. This theory is successfully validated by simulating two fundamental dynamical systems in Nengo using circuit models validated in SPICE. Thus, our work reveals the potential to engineer robust neuromorphic systems with well-defined high-level behaviour that harness the low-level heterogeneous properties of their physical primitives with millisecond resolution.
Abstract: Insect-scale flapping robots are challenging to stabilize due to their fast dynamics, unmodeled parameter variations, and the periodic nature of their control input. Effective controller designs must tolerate wing asymmetries that occur due to manufacturing errors and react quickly to stabilize the fast unstable modes of the system. Additionally, they should have minimal power requirements to fit within the tightly constrained power budget associated with insect-scale flying robots. Adaptive control methods are capable of learning online to account for uncertain physical parameters and other model uncertainties, and can thus improve system performance over time. In this work, a spiking neural network is used to stabilize hovering of an insect-scale robot in the presence of unknown parameter variations. The controller is shown to adapt rapidly during a simulated flight test and requires a total of only 800 neurons, allowing it to be implemented with minimal power requirements.
Abstract: In the Neural Engineering Framework (NEF), individual neuron tuning curves are often characterized in terms of a maximum firing rate and an $x$-intercept. However, for LIF neurons with conductance-based synapses it is not immediately clear how maximum rate and $x$-intercept should be mapped to excitatory and inhibitory conductance input functions $g_\mathrm E(x)$, $g_\mathrm I(x)$. In this technical report we describe a method for deriving such functions and compare the resulting conductance-based tuning curves to current-based tuning curves with equivalent parameters. For large maximum rates and $x$-intercepts the conductance-based tuning curves possess a significantly steeper spike-rate onset compared to their current-based counterparts.
Abstract: The semantic fluency task has been used to understand the effects of semantic relationships on human memory search. A variety of computational models have been proposed that explain human behavioral data, yet it remains unclear how millions of spiking neurons work in unison to realize the cognitive processes involved in memory search. In this paper, we present a biologically constrained neural network model that performs the task in a fashion similar to humans. The model reproduces experimentally observed response timing effects, as well as similarity trends within and across semantic categories derived from responses. Three different sources of the association data have been tested by embedding associations in neural connections, with free association norms providing the best match.
Abstract: The mathematical model underlying the Neural Engineering Framework (NEF) expresses neuronal input as a linear combination of synaptic currents. However, in biology, synapses are not perfect current sources and are thus nonlinear. Detailed synapse models are based on channel conductances instead of currents, which require independent handling of excitatory and inhibitory synapses. This, in particular, significantly affects the influence of inhibitory signals on the neuronal dynamics. In this technical report we first summarize the relevant portions of the NEF and conductance-based synapse models. We then discuss a naïve translation between populations of LIF neurons with current- and conductance-based synapses based on an estimation of an average membrane potential. Experiments show that this simple approach works relatively well for feed-forward communication channels, yet performance degrades for NEF networks describing more complex dynamics, such as integration.
Abstract: Agent-based models are versatile tools for studying how societal opinion change, including political polarization and cultural diffusion, emerges from individual behavior. This study expands agents’ psychological realism using empirically-motivated rules governing interpersonal influence, commitment to previous beliefs, and conformity in social contexts. Computational experiments establish that these extensions produce three novel results: (a) sustained “strong” diversity of opinions across society, (b) opinion subcultures, and (c) pluralistic ignorance. These phenomena arise from a combination of agents’ intolerance, susceptibility and conformity, with extremist agents and social networks playing important roles. The distribution and dynamics of simulated opinions reproduce two empirical datasets on Americans' political opinions.
Abstract: The reconciliation of theories of concepts based on prototypes, exemplars, and theory-like structures is a longstanding problem in cognitive science. In response to this problem, researchers have recently tended to adopt either hybrid theories that combine various kinds of representational structure, or eliminative theories that replace concepts with a more finely grained taxonomy of mental representations. In this paper, we describe an alternative approach involving a single class of mental representations called “semantic pointers.” Semantic pointers are symbol-like representations that result from the compression and recursive binding of perceptual, lexical, and motor representations, effectively integrating traditional connectionist and symbolic approaches. We present a computational model using semantic pointers that replicates experimental data from categorization studies involving each prior paradigm. We argue that a framework involving semantic pointers can provide a unified account of conceptual phenomena, and we compare our framework to existing alternatives in accounting for the scope, content, recursive combination, and neural implementation of concepts.
Keywords: Biologically-inspired robots, Force and tactile sensing, Neurorobotics
Abstract: Giving robots the ability to classify surface textures requires appropriate sensors and algorithms. Inspired by the biology of human tactile perception, we implement a neurorobotic texture classifier with a recurrent spiking neural network, using a novel semi-supervised approach for classifying dynamic stimuli. Input to the network is supplied by accelerometers mounted on a robotic arm. The sensor data is encoded by a heterogeneous population of neurons, modeled to match the spiking activity of mechanoreceptor cells. This activity is convolved by a hidden layer using bandpass filters to extract nonlinear frequency information from the spike trains. The resulting high-dimensional feature representation is then continuously classified using a neurally implemented support vector machine. We demonstrate that our system classifies 18 metal surface textures scanned in two opposite directions at a constant velocity. We also demonstrate that our approach significantly improves upon a baseline model that does not use the described feature extraction. This method can be performed in real-time using neuromorphic hardware, and can be extended to other applications that process dynamic stimuli online.
Keywords: semantic spaces; vector representations; spiking neurons; insight; Remote Associates Test
Abstract: The ability to associate words is an important cognitive skill. In this study we investigate different methods for representing word associations in the brain, using the Remote Associates Test (RAT) as a task. We explore representations derived from free association norms and statistical n-gram data. Although n-gram representations yield better performance on the test, a closer match with the human performance is obtained with representations derived from free associations. We propose that word association strengths derived from free associations play an important role in the process of RAT solving. Furthermore, we show that this model can be implemented in spiking neurons, and estimate the number of biologically realistic neurons that would suffice for an accurate representation.
Keywords: FPGA; Neural Engineering Framework; neuromorphic engineering
Abstract: Models of neural systems often use idealized inputs and outputs, but there is also much to learn by forcing a neural model to interact with a complex simulated or physical environment. Unfortunately, sophisticated interactions require models of large neural systems, which are difficult to run in real time. We have prototyped a system that can simulate efficient surrogate models of a wide range of neural circuits in real time, with a field programmable gate array (FPGA). The scale of the simulations is increased by avoiding simulation of individual neurons, and instead simulating approximations of the collective activity of groups of neurons. The system can approximate roughly a million spiking neurons in a wide range of configurations.
Keywords: Neural engineering framework, Semantic pointer architecture, Nengo, Cognitive modeling, Mathematical ability, Dyscalculia, Skill consolidation
Abstract: The ability to improve in speed and accuracy as a result of repeating some task is an important hallmark of intelligent biological systems. Although gradual behavioral improvements from practice have been modeled in spiking neural networks, few such models have attempted to explain cognitive development of a task as complex as addition. In this work, we model the progression from a counting-based strategy for addition to a recall-based strategy. The model consists of two networks working in parallel: a slower basal ganglia loop and a faster cortical network. The slow network methodically computes the count from one digit given another, corresponding to the addition of two digits, whereas the fast network gradually “memorizes” the output from the slow network. The faster network eventually learns how to add the same digits that initially drove the behavior of the slower network. Performance of this model is demonstrated by simulating a fully spiking neural network that includes basal ganglia, thalamus, and various cortical areas. Consequently, the model incorporates various neuroanatomical data, in terms of brain areas used for calculation and makes psychologically testable predictions related to frequency of rehearsal. Furthermore, the model replicates developmental progression through addition strategies in terms of reaction times and accuracy, and naturally explains observed symptoms of dyscalculia.
Abstract: Production and comprehension of speech are closely interwoven. For example, the ability to detect an error in one's own speech, halt speech production, and finally correct the error can be explained by assuming an inner speech loop which continuously compares the word representations induced by production to those induced by perception at various cognitive levels (e.g. conceptual, word, or phonological levels). Because spontaneous speech errors are relatively rare, a picture naming and halt paradigm can be used to evoke them. In this paradigm, picture presentation (target word initiation) is followed by an auditory stop signal (distractor word) for halting speech production. The current study seeks to understand the neural mechanisms governing self-detection of speech errors by developing a biologically inspired neural model of the inner speech loop. The neural model is based on the Neural Engineering Framework (NEF) and consists of a network of about 500,000 spiking neurons. In the first experiment we induce simulated speech errors semantically and phonologically. In the second experiment, we simulate a picture naming and halt task. Target-distractor word pairs were balanced with respect to variation of phonological and semantic similarity. The results of the first experiment show that speech errors are successfully detected by a monitoring component in the inner speech loop. The results of the second experiment show that the model correctly reproduces human behavioral data on the picture naming and halt task. In particular, the halting rate in the production of target words was lower for phonologically similar words than for semantically similar or fully dissimilar distractor words. We thus conclude that the neural architecture proposed here to model the inner speech loop reflects important interactions in production and perception at phonological and semantic levels.
Abstract: The Semantic Pointer Architecture (SPA) is a proposal that specifies the computations and architectural elements needed to account for cognitive functions. By means of the Neural Engineering Framework (NEF), this proposal can be realized in a spiking neural network. However, in any such network each SPA transformation will accumulate noise. By increasing the accuracy of common SPA operations, the overall network performance can be increased considerably. As well, the representations in such networks present a trade-off between being able to represent all possible values and being only able to represent the most likely values, but with high accuracy. We derive a heuristic to find the near-optimal point in this trade-off. This allows us to improve the accuracy of common SPA operations by up to 25 times. Ultimately, it allows for a reduction of neuron number and a more efficient use of both traditional and neuromorphic hardware, which we demonstrate here.
Abstract: This report investigates how neurons with complex dynamics, specifically adaptation, can be incorporated into the Neural Engineering Framework. The focus of the report is fitting a linear-nonlinear system model to an adapting neuron model using system identification techniques. By characterizing the neuron dynamics in this way, we hope to gain a better understanding of what sort of temporal basis the neurons in a population provide, which will determine what kinds of dynamics can be decoded from the neural population. The report presents four system identification techniques: a correlation-based method, a least-squares method, an iterative least-squares technique based on Paulin's algorithm, and a general iterative least-squares method based on gradient descent optimization. These four methods are all used to fit linear-nonlinear models to the adapting neuron model. We find that the Paulin least-squares method performs the best in this situation, and linear-nonlinear models fit in this manner are able to capture the relevant adaptation dynamics of the neuron model. Other questions related to the system identification, such as the type of input to use and the amount of regularization required for the least-squares methods, are also answered empirically. The report concludes by performing system identification on 20 neurons with a range of adaptation parameters, and examining what type of temporal basis these neurons provide.
Abstract: We describe a method to train spiking deep networks that can be run using leaky integrate-and-fire (LIF) neurons, achieving state-of-the-art results for spiking LIF networks on five datasets, including the large ImageNet ILSVRC-2012 benchmark. Our method for transforming deep artificial neural networks into spiking networks is scalable and works with a wide range of neural nonlinearities. We achieve these results by softening the neural response function, such that its derivative remains bounded, and by training the network with noise to provide robustness against the variability introduced by spikes. Our analysis shows that implementations of these networks on neuromorphic hardware will be many times more power-efficient than the equivalent non-spiking networks on traditional hardware.
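One way to picture the softening step is to replace the hard rectification inside the LIF rate equation with a softplus, so that the rate rises smoothly around the firing threshold instead of with an unbounded slope. The sketch below assumes this particular form, a threshold normalized to 1, and illustrative parameter values; the abstract itself states only that the response function is softened so its derivative stays bounded. The function name `lif_rate` and the smoothing parameter `gamma` are hypothetical.

```python
import numpy as np

def lif_rate(j, tau_ref=0.002, tau_rc=0.02, gamma=None):
    """Steady-state LIF firing rate for input current j (threshold = 1).

    gamma=None gives the standard hard-threshold rate;
    gamma > 0 smooths the threshold with a softplus of width gamma.
    """
    x = np.atleast_1d(np.asarray(j, dtype=float)) - 1.0
    if gamma is None:
        rho = np.maximum(x, 0.0)                    # hard rectification
    else:
        # Softplus: positive except where it underflows to zero.
        rho = gamma * np.logaddexp(0.0, x / gamma)
    rate = np.zeros_like(rho)
    on = rho > 0
    rate[on] = 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / rho[on]))
    return rate

# Hypothetical currents below, at, and above threshold.
currents = np.array([0.5, 1.0, 2.0])
hard_rates = lif_rate(currents)
soft_rates = lif_rate(currents, gamma=0.1)
```

Well above threshold the two curves nearly coincide, while near threshold the softened curve is nonzero and smooth, which is what makes gradient-based training workable.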
Abstract: The ability to improve in speed and accuracy as a result of repeating some task is an important hallmark of intelligent biological systems. We model the progression from a counting-based strategy for addition to a recall-based strategy. The model consists of two networks working in parallel: a slower basal ganglia loop, and a faster cortical network. The slow network methodically computes the count from one digit given another, corresponding to the addition of two digits, while the fast network gradually "memorizes" the output from the slow network. The faster network eventually learns how to add the same digits that initially drove the behaviour of the slower network. Performance of this model is demonstrated by simulating a fully spiking neural network that includes basal ganglia, thalamus and various cortical areas. (*) Best Student Paper Award: Computational Modeling Prize in Applied Cognition
Abstract: Past research on action planning has shed light on the neural mechanisms underlying the selection of simple motor actions, along with the cognitive mechanisms underlying the planning of action sequences in constrained problem solving domains. We extend this research by describing a neural model that rapidly plans action sequences in relatively unconstrained domains by manipulating structured representations of objects and the actions they typically afford. We provide an analysis that indicates our model is able to reliably accomplish goals that require correctly performing a sequence of up to 5 actions in a simulated environment. We also provide an analysis of the scaling properties of our model with respect to the number of objects and affordances that constitute its knowledge of the environment. Using simplified simulations we find that our model is likely to function effectively while picking from 10,000 actions related to 25,000 objects.
Abstract: We present a spiking neuron model of the motor cortices and cerebellum of the motor control system. The model consists of anatomically organized spiking neurons encompassing premotor, primary motor, and cerebellar cortices. The model proposes novel neural computations within these areas to control a nonlinear three-link arm model that can adapt to unknown changes in arm dynamics and kinematic structure. We demonstrate the mathematical stability of both forms of adaptation, suggesting that this is a robust approach for common biological problems of changing body size (e.g. during growth), and unexpected dynamic perturbations (e.g. when moving through different media, such as water or mud). To demonstrate the plausibility of the proposed neural mechanisms, we show that the model accounts for data across 19 studies of the motor control system. These data include a mix of behavioural and neural spiking activity, across subjects performing adaptive and static tasks. Given this proposed characterization of the biological processes involved in motor control of the arm, we provide several experimentally testable predictions that distinguish our model from previous work.
Keywords: Biological system modeling;Computational modeling;Decoding;Neural engineering;Neurons;Spinnaker;Neuromorphics
Abstract: The biological brain is a highly plastic system within which the efficacy and structure of synaptic connections are constantly changing in response to internal and external stimuli. While numerous models of this plastic behavior exist at various levels of abstraction, how these mechanisms allow the brain to learn meaningful values is unclear. The Neural Engineering Framework (NEF) is a hypothesis about how large-scale neural systems represent values using populations of spiking neurons, and transform them using functions implemented by the synaptic weights between populations. By exploiting the fact that these connection weight matrices are factorable, we have recently shown that static NEF models can be simulated very efficiently using the SpiNNaker neuromorphic architecture. In this paper, we demonstrate how this approach can be extended to efficiently support both supervised and unsupervised learning rules designed to operate on these factored matrices. We then present a heteroassociative memory architecture built using these learning rules and prove that it is capable of learning a human-scale semantic network. Finally we demonstrate a 100 000 neuron version of this architecture running on the SpiNNaker simulator with a speed-up exceeding 150x when compared to the Nengo reference simulator.
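The factorability mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual NEF reading in which a full weight matrix is the product of an encoder matrix and a decoder matrix; the function names and matrix shapes below are hypothetical and are not taken from the paper or from the SpiNNaker toolchain.

```python
def full_weight_currents(encoders, decoders, activities):
    """Input currents via the explicit n_post x n_pre weight matrix."""
    dims = len(decoders)
    n_pre = len(activities)
    # W[i][j] = sum_k encoders[i][k] * decoders[k][j]
    weights = [
        [sum(enc[k] * decoders[k][j] for k in range(dims)) for j in range(n_pre)]
        for enc in encoders
    ]
    return [sum(w * a for w, a in zip(row, activities)) for row in weights]


def factored_currents(encoders, decoders, activities):
    """Same currents, computed without ever forming the full matrix."""
    # Step 1: decode the low-dimensional represented value from activities.
    value = [sum(d * a for d, a in zip(row, activities)) for row in decoders]
    # Step 2: encode that value into each postsynaptic neuron.
    return [sum(e * v for e, v in zip(enc, value)) for enc in encoders]


def weight_counts(n_pre, n_post, dims):
    """Stored parameters for the (full, factored) forms."""
    return n_pre * n_post, dims * (n_pre + n_post)


# Hypothetical toy populations: 3 presynaptic neurons, 2 dimensions,
# 4 postsynaptic neurons.
decoders = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]                  # dims x n_pre
encoders = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.6, 0.8]]     # n_post x dims
activities = [20.0, 5.0, 12.0]

full = full_weight_currents(encoders, decoders, activities)
fact = factored_currents(encoders, decoders, activities)
```

With two populations of 1000 neurons representing a 2-dimensional value, `weight_counts(1000, 1000, 2)` gives 1,000,000 stored weights for the full matrix against 4,000 for the factored form, which is the kind of saving the factored approach relies on.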
Abstract: Current state-of-the-art approaches to computational speech recognition and synthesis are based on statistical analyses of extremely large data sets. It is currently unknown how these methods relate to the methods that the human brain uses to perceive and produce speech. In this thesis, I present a conceptual model, Sermo, which describes some of the computations that the human brain uses to perceive and produce speech. I then implement three large-scale brain models that accomplish tasks theorized to be required by Sermo, drawing upon techniques in automatic speech recognition, articulatory speech synthesis, and computational neuroscience. The first model extracts features from an audio signal by performing a frequency decomposition with an auditory periphery model, then decorrelating the information in that power spectrum with methods commonly used in audio and image compression. I show that the features produced by this model implemented with biologically plausible spiking neurons can be used to classify phones in pre-segmented speech with significantly better accuracy than the features typically used in automatic speech recognition systems. Additionally, I show that this model can be used to compare auditory periphery models in terms of their ability to support phone classification of pre-segmented speech. The second model uses a symbol-like neural representation of a sequence of syllables to generate a trajectory of premotor commands that can be used to control an articulatory synthesizer. I show that the model can produce trajectories up to several seconds in length from a static syllable sequence representation that result in intelligible synthesized speech. The trajectories reflect the high temporal variability of human speech, and smoothly transition between successive syllables, even in rapid utterances. The third model classifies syllables from a trajectory of premotor commands. 
I show that the model is able to classify syllables online despite high temporal variability, and can produce the same syllable representations used by the second model. These two models can be connected in future work in order to implement a closed-loop sensorimotor speech system. Unlike current computational approaches, all three of these models are implemented with biologically plausible spiking neurons, which can be simulated with neuromorphic hardware, and can interface naturally with artificial cochleas. All models are shown to scale to the level of adult human vocabularies in terms of the neural resources required, though limitations on their performance as a result of scaling will be discussed.
Keywords: Python, Bayesian optimization, machine learning, Scikit-learn
Abstract: Sequential model-based optimization (also known as Bayesian optimization) is one of the most efficient methods (per function evaluation) of function minimization. This efficiency makes it appropriate for optimizing the hyperparameters of machine learning algorithms that are slow to train. The Hyperopt library provides algorithms and parallelization infrastructure for performing hyperparameter optimization (model selection) in Python. This paper presents an introductory tutorial on the usage of the Hyperopt library, including the description of search spaces, minimization (in serial and parallel), and the analysis of the results collected in the course of minimization. This paper also gives an overview of Hyperopt-Sklearn, a software project that provides automatic algorithm configuration of the Scikit-learn machine learning library. Following Auto-Weka, we take the view that the choice of classifier and even the choice of preprocessing module can be taken together to represent a single large hyperparameter optimization problem. We use Hyperopt to define a search space that encompasses many standard components (e.g. SVM, RF, KNN, PCA, TFIDF) and common patterns of composing them together. We demonstrate, using search algorithms in Hyperopt and standard benchmarking data sets (MNIST, 20-newsgroups, convex shapes), that searching this space is practical and effective. In particular, we improve on best-known scores for the model space for both MNIST and convex shapes. The paper closes with some discussion of ongoing and future work.
Abstract: Prescribed Error Sensitivity (PES) is a biologically plausible supervised learning rule that is frequently used with the Neural Engineering Framework (NEF). PES modifies the connection weights between populations of neurons to minimize an external error signal. We solve the discrete dynamical system for the case of constant inputs and no noise, to show that the decoding vectors given by the NEF have a simple closed-form expression in terms of the number of simulation timesteps. Moreover, with $\gamma = (1 - \kappa \|a\|^2) < 1$, where $\kappa$ is the learning rate and $a$ is the vector of firing rates, the error at timestep $k$ is the initial error times $\gamma^k$. Thus, $\gamma > -1$ implies exponential convergence to a unique stable solution, $\gamma < 0$ results in oscillatory weight changes, and $\gamma \le -1$ implies instability.
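The closed-form error decay stated in the abstract above can be checked numerically with a short sketch. It assumes a scalar decoded value, constant firing rates, no noise, and an update of the form "decoders minus learning rate times error times activities"; here `kappa` stands for the effective per-timestep learning rate, and any simulator-specific scaling of the learning rate is deliberately left out. The function names are hypothetical.

```python
def pes_errors(activities, decoders, target, kappa, steps):
    """Run the discrete PES update and return the error at each timestep."""
    d = list(decoders)
    errors = []
    for _ in range(steps):
        estimate = sum(di * ai for di, ai in zip(d, activities))
        error = estimate - target
        errors.append(error)
        # PES step: move each decoder against the error, scaled by activity.
        d = [di - kappa * error * ai for di, ai in zip(d, activities)]
    return errors


def pes_gamma(activities, kappa):
    """gamma = 1 - kappa * ||a||^2, the per-step error multiplier."""
    return 1.0 - kappa * sum(ai * ai for ai in activities)


a = [1.0, 2.0, 0.5]           # constant firing rates, ||a||^2 = 5.25
d0 = [0.0, 0.0, 0.0]          # initial decoders

stable = pes_errors(a, d0, target=1.0, kappa=0.1, steps=8)       # gamma = 0.475
oscillating = pes_errors(a, d0, target=1.0, kappa=0.3, steps=8)  # gamma = -0.575
unstable = pes_errors(a, d0, target=1.0, kappa=0.4, steps=8)     # gamma = -1.1
```

In each run the error at step k equals the initial error times `pes_gamma(a, kappa) ** k`: the first case decays monotonically, the second decays while alternating in sign, and the third grows in magnitude.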
Abstract: This thesis explores the application of a biologically inspired adaptive controller to quadcopter flight control. This begins with an introduction to modelling the dynamics of a quadcopter, followed by an overview of control theory and neural simulation in Nengo. The Virtual Robotics Experimentation Platform (V-REP) is used to simulate the quadcopter in a physical environment. Iterative design improvements leading to the final controller are discussed. The controller model is run on a series of benchmark tasks and its performance is compared to conventional controllers. The results show that the neural adaptive controller performs on par with conventional controllers on simple tasks but far outperforms them on tasks involving unexpected external forces in the environment.
Abstract: Evaluating the effectiveness and performance of neuromorphic hardware is difficult. It is even more difficult when the task of interest is a closed-loop task; that is, a task where the output from the neuromorphic hardware affects some environment, which then in turn affects the hardware's future input. However, closed-loop situations are one of the primary potential uses of neuromorphic hardware. To address this, we present a methodology for generating closed-loop benchmarks that makes use of a hybrid of real physical embodiment and a type of 'minimal' simulation. Minimal simulation has been shown to lead to robust real-world performance, while still maintaining the practical advantages of simulation, such as making it easy for the same benchmark to be used by many researchers. This method is flexible enough to allow researchers to explicitly modify the benchmarks to identify specific task domains where particular hardware excels. To demonstrate the method, we present a set of novel benchmarks that focus on motor control for an arbitrary system with unknown external forces. Using these benchmarks, we show that an error-driven learning rule can consistently improve motor control performance across a randomly generated family of closed-loop simulations, even when there are up to 15 interacting joints to be controlled.
Keywords: natural language processing; parsing; optimization; harmonic grammar; holographic reduced representations; semantic pointer architecture
Abstract: The idea that optimization plays a key role in linguistic cognition is supported by an increasingly large body of research. Building on this research, we describe a new approach to parsing distributed representations via optimization over a set of soft constraints on the wellformedness of parse trees. This work extends previous research involving the use of constraint-based or “harmonic” grammars by suggesting how parsing can be accomplished using fully distributed representations that preserve their dimensionality with arbitrary increases in structural complexity. We demonstrate that this method can be used to correctly evaluate the wellformedness of linguistic structures generated by a simple context-free grammar, and discuss a number of extensions concerning the neural implementation of the method and its application to complex parsing tasks.
Abstract: The modeling of neural systems often involves representing the temporal structure of a dynamic stimulus. We extend the methods of the Neural Engineering Framework (NEF) to generate recurrently connected populations of spiking neurons that compute functions across the history of a time-varying signal, in a biologically plausible neural network. To demonstrate the method, we propose a novel construction to approximate a pure delay, and use that approximation to build a network that represents a finite history (sliding window) of its input. Specifically, we solve for the state-space representation of a pure time-delay filter using Pade-approximants, and then map this system onto the dynamics of a recurrently connected population. The construction is robust to noisy inputs over a range of frequencies, and can be used with a variety of neuron models including: leaky integrate-and-fire, rectified linear, and Izhikevich neurons. Furthermore, we extend the approach to handle various models of the post-synaptic current (PSC), and characterize the effects of the PSC model on overall dynamics. Finally, we show that each delay may be modulated by an external input to scale the spacing of the sliding window on-the-fly. We demonstrate this by transforming the sliding window to compute filters that are linear (e.g., discrete Fourier transform) and nonlinear (e.g., mean squared power), with controllable frequency.
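As a rough illustration of the Padé idea in the abstract above, the sketch below evaluates a rational approximation of the pure delay transfer function exp(-theta*s) on the imaginary axis and compares it with the exact response. It assumes the diagonal [n/n] approximant, which may differ from the order used in the paper, and it stops at the transfer function rather than building the state-space system or the neural network. All names are hypothetical.

```python
import cmath
from math import factorial


def pade_exp_coeffs(order):
    """Coefficients c_k of the diagonal [n/n] Pade numerator of exp(x)."""
    n = order
    return [
        factorial(2 * n - k) * factorial(n)
        / (factorial(2 * n) * factorial(k) * factorial(n - k))
        for k in range(n + 1)
    ]


def pade_delay_response(theta, omega, order):
    """Approximate frequency response of a delay of theta seconds at omega rad/s."""
    x = -1j * omega * theta  # exponent of exp(-theta * s) at s = j * omega
    coeffs = pade_exp_coeffs(order)
    numerator = sum(c * x ** k for k, c in enumerate(coeffs))
    denominator = sum(c * (-x) ** k for k, c in enumerate(coeffs))
    return numerator / denominator


theta, omega = 0.1, 10.0                 # omega * theta = 1 radian of phase lag
exact = cmath.exp(-1j * omega * theta)
approx = pade_delay_response(theta, omega, order=3)
approx_error = abs(approx - exact)
```

The diagonal approximant has unit gain at every frequency, so all of the approximation error is in the phase; the error grows as `omega * theta` increases and shrinks as the order is raised.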
Abstract: Our ongoing investigations into biologically plausible syntactic and semantic parsing have identified a novel methodology for processing complex structured information. This approach combines Vector Symbolic Architectures (a method for representing sentence structures as distributed vectors), the Neural Engineering Framework (a method for organizing biologically realistic neurons to approximate algorithms), and constraint-based parsing (a method for creating dynamic systems that converge to correct parsings). Here, we present some of our initial findings that show the promise of this approach for explaining the complex, flexible, and scalable parsing abilities found in humans.
Abstract: We train spiking deep networks using leaky integrate-and-fire (LIF) neurons, and achieve state-of-the-art results for spiking networks on the CIFAR-10 and MNIST datasets. This demonstrates that biologically-plausible spiking LIF neurons can be integrated into deep networks and can perform as well as other spiking models (e.g. integrate-and-fire). We achieved this result by softening the LIF response function, such that its derivative remains bounded, and by training the network with noise to provide robustness against the variability introduced by spikes. Our method is general and could be applied to other neuron types, including those used on modern neuromorphic hardware. Our work brings more biological realism into modern image classification models, with the hope that these models can inform how the brain performs this difficult task. It also provides new methods for training deep networks to run on neuromorphic hardware, with the aim of fast, power-efficient image classification for robotics applications.
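One way to picture the softening described in the abstract above is sketched below. The standard LIF rate curve has an unbounded derivative at the firing threshold; replacing the hard rectification of the input current with a softplus removes that kink. The softplus form, the smoothing parameter `gamma`, and the time constants are assumptions chosen for illustration, not values confirmed by the abstract.

```python
import math


def softplus(x, gamma):
    """Numerically stable gamma * log(1 + exp(x / gamma))."""
    return max(x, 0.0) + gamma * math.log1p(math.exp(-abs(x) / gamma))


def lif_rate(current, tau_ref=0.002, tau_rc=0.02, v_th=1.0):
    """Steady-state firing rate (Hz) of a standard LIF neuron."""
    if current <= v_th:
        return 0.0
    return 1.0 / (tau_ref + tau_rc * math.log1p(v_th / (current - v_th)))


def soft_lif_rate(current, gamma=0.02, tau_ref=0.002, tau_rc=0.02, v_th=1.0):
    """LIF rate with the hard threshold replaced by a softplus (assumed form)."""
    drive = softplus(current - v_th, gamma)
    if drive <= 0.0:  # guard against underflow far below threshold
        return 0.0
    return 1.0 / (tau_ref + tau_rc * math.log1p(v_th / drive))


at_threshold_hard = lif_rate(1.0)
at_threshold_soft = soft_lif_rate(1.0)
far_above_gap = abs(soft_lif_rate(5.0) - lif_rate(5.0))
```

At threshold the hard curve is exactly zero while the softened curve is small but positive and smooth; well above threshold the two curves are practically identical, so the softened version can stand in for the original during gradient-based training.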
Keywords: n-back task; neural engineering; computational neuroscience; vector symbolic architecture
Abstract: We present a computational model performing the n-back task. This task requires a number of cognitive processes including rapid binding, updating, and retrieval of items in working memory. The model is implemented in spiking leaky integrate-and-fire neurons with physiologically constrained parameters and anatomically constrained organization. The methods of the Semantic Pointer Architecture (SPA) are used to construct the model. Accuracies and reaction times produced by the model are shown to match human data. Namely, the characteristic decline in accuracy and response speed as n increases is reproduced. Furthermore, the model provides evidence, contrary to some past proposals, that an active removal process of items in working memory is not necessary for accurate performance on the n-back task.
Abstract: It has been suggested that Marr took the three levels he famously identifies to be independent. In this paper, we argue that Marr’s view is more nuanced. Specifically, we show that the view explicitly articulated in his work attempts to integrate the levels, and in doing so results in Marr attacking both reductionism and vagueness. The result is a perspective in which both high-level information-processing constraints and low-level implementational constraints play mutually reinforcing and constraining roles. We discuss our recent work on Spaun — currently the world’s largest functional brain model — that demonstrates the positive impact of this kind of unifying integration of Marr’s levels. We argue that this kind of integration avoids his concerns with both reductionism and vagueness. In short, we suggest that the methods behind Spaun can be used to satisfy Marr’s explicit interest in combining high-level functional and detailed mechanistic explanations.
Keywords: Knowledge representation, Connectionism, Neural network, Biologically plausible, Vector symbolic architecture, WordNet, Scaling
Abstract: Several approaches to implementing symbol-like representations in neurally plausible models have been proposed. These approaches include binding through synchrony (Shastri & Ajjanagadde, 1993), "mesh" binding (van der Velde & de Kamps, 2006), and conjunctive binding (Smolensky, 1990). Recent theoretical work has suggested that most of these methods will not scale well, that is, that they cannot encode structured representations using any of the tens of thousands of terms in the adult lexicon without making implausible resource assumptions. Here, we empirically demonstrate that the biologically plausible structured representations employed in the Semantic Pointer Architecture (SPA) approach to modeling cognition (Eliasmith, 2013) do scale appropriately. Specifically, we construct a spiking neural network of about 2.5 million neurons that employs semantic pointers to successfully encode and decode the main lexical relations in WordNet, which has over 100,000 terms. In addition, we show that the same representations can be employed to construct recursively structured sentences consisting of arbitrary WordNet concepts, while preserving the original lexical structure. We argue that these results suggest that semantic pointers are uniquely well-suited to providing a biologically plausible account of the structured representations that underwrite human cognition.
Keywords: Freezing of speech movements
Abstract: Background: Reduction of dopamine in basal ganglia is a common cause of Parkinson's Disease (PD). If dopamine-producing cells die in the substantia nigra, as seen in PD, a typical symptom is freezing of articulatory movements during speech production. Goal: It is the goal of this study to simulate syllable sequencing tasks by computer modelling of the cortico-basal ganglia-thalamus-cortical action selection loop using different levels of dopamine in order to investigate the freezing effect in more detail. Method: This simulation was done using the Neural Engineering Object (Nengo) software tool. In the simulation, two dopamine level parameters (lg and le), representing the effect of D1 and D2 receptors, and therefore the level of dopamine in striatum respectively, can be differentiated and modified. Results: By a decrease of the dopamine level parameters lg and le to 50
Abstract: By building and simulating neural systems we hope to understand how the brain may work and use this knowledge to build neural and cognitive systems to tackle engineering problems. The Neural Engineering Framework (NEF) is a hypothesis about how such systems may be constructed and has recently been used to build the world's first functional brain model, Spaun. However, while the NEF simplifies the design of neural networks, simulating them using standard computer hardware is still computationally expensive – often running far slower than biological real-time and scaling very poorly: problems the SpiNNaker neuromorphic simulator was designed to solve. In this paper we (1) argue that employing the same model of computation used for simulating general purpose spiking neural networks on SpiNNaker for NEF models results in suboptimal use of the architecture, and (2) provide and evaluate an alternative simulation scheme which overcomes the memory and compute challenges posed by the NEF. This proposed method uses factored weight matrices to reduce memory usage by around 90
Abstract: This report discusses how to implement the multiplication of two numbers in the NEF with a high accuracy. The main improvement will be achieved by using diagonal encoders instead of randomly distributed encoders. In the selection of the evaluation points a trade-off is found as they cannot give a uniform distribution when projected to the encoders while being limited to the input domain. That leads to an alternative multiplication network architecture improving the accuracy further.
Keywords: neuromorphic, robotics, NEF, Spinnaker
Abstract: (Best Paper Honorable Mention) Brain-inspired, spike-based computation in electronic systems is being investigated for developing alternative, non-conventional computing technologies. The Neural Engineering Framework provides a method for programming these devices to implement computation. In this paper we apply this approach to perform arbitrary mathematical computation using a mixed signal analog/digital neuromorphic multi-neuron VLSI chip. This is achieved by means of a network of spiking neurons with multiple weighted connections. The synaptic weights are stored in a 4-bit on-chip programmable SRAM block. We propose a parallel event-based method for appropriately calibrating the synaptic weights and demonstrate the method by encoding and decoding arbitrary mathematical functions, and by implementing dynamical systems via recurrent connections.
Abstract: Hyperopt-sklearn is a new software project that provides automatic algorithm configuration of the Scikit-learn machine learning library. Following Auto-Weka, we take the view that the choice of classifier and even the choice of pre-processing module can be taken together to represent a single large hyperparameter optimization problem. We use Hyperopt to define a search space that encompasses many standard components (e.g. SVM, RF, KNN, PCA, TFIDF) and common patterns of composing them together. We demonstrate, using search algorithms in Hyperopt and standard benchmarking data sets (MNIST, 20-Newsgroups, Convex Shapes), that searching this space is practical and effective. In particular, we improve on best-known scores for the model space for both MNIST and Convex Shapes.
Abstract: Model selection, also known as hyperparameter tuning, can be viewed as a blackbox optimization problem. Recently the HPOlib benchmarking suite was advanced to facilitate algorithm comparison between hyperparameter optimization algorithms. We compare seven optimization algorithms implemented in the Hyperopt optimization package, including a new annealing-type algorithm and a new family of Gaussian Process-based SMBO methods, on four screening problems from HPOlib. We find that methods based on Gaussian Processes (GPs) are the most call-efficient. Vanilla GP-based methods using stationary RBF kernels and maximum likelihood kernel parameter estimation provide a near-perfect ability to optimize the benchmarks. Despite being slower than more heuristic baselines, a Theano-based GP-SMBO implementation requires at most a few seconds to produce a candidate evaluation point. We compare this vanilla approach to Hybrid Monte-Carlo integration of the kernel lengthscales and fail to find compelling advantages of this more expensive procedure.
Abstract: A long-standing challenge in cognitive science is how neurons could be capable of the flexible structured processing that is the hallmark of cognition. We present a spiking neural model that can be given an input sequence of words (a sentence) and produces a structured tree-like representation indicating the parts of speech it has identified and their relations to each other. While this system is based on a standard left-corner parser for constituency grammars, the neural nature of the model leads to new capabilities not seen in classical implementations. For example, the model gracefully decays in performance as the sentence structure gets larger. Unlike previous attempts at building neural parsing systems, this model is highly robust to neural damage, can be applied to any binary-constituency grammar, and requires relatively few neurons ( 150,000).
Keywords: Brain modeling, Computational modeling, Decoding, NEF/SPA/Nengo combination, Network architecture, Neural ENGineering Objects, Neural computation, Neural networks, Neurons, Neuroscience, Spaun, Spaun scale, Statistics, biologically plausible spiking networks, brain model, brain models, cognition, cognitive task, function resources, functional spiking neural circuits, large-scale synthesis, large-scale systems, mammalian brain, mathematical theory, medical computing, motor task, neural engineering framework, neural engineering framework (NEF), neural model simulation, neural model synthesis, neural modeling, neural nets, neuromorphic engineering, neuron-like components, neurophysiology, nonlinear dynamical systems, organization resources, perceptual task, representational resources, reverse-engineer, review, reviews, semantic pointer architecture, semantic pointer architecture (SPA), simple nonlinear components, software tool, spiking neural networks, synapses, theoretical tool
Abstract: In this paper, we review the theoretical and software tools used to construct Spaun, the first (and so far only) brain model capable of performing cognitive tasks. This tool set allowed us to configure 2.5 million simple nonlinear components (neurons) with 60 billion connections between them (synapses) such that the resulting model can perform eight different perceptual, motor, and cognitive tasks. To reverse-engineer the brain in this way, a method is needed that shows how large numbers of simple components, each of which receives thousands of inputs from other components, can be organized to perform the desired computations. We achieve this through the neural engineering framework (NEF), a mathematical theory that provides methods for systematically generating biologically plausible spiking networks to implement nonlinear and linear dynamical systems. On top of this, we propose the semantic pointer architecture (SPA), a hypothesis regarding some aspects of the organization, function, and representational resources used in the mammalian brain. We conclude by discussing Spaun, which is an example model that uses the SPA and is implemented using the NEF. Throughout, we discuss the software tool Neural ENGineering Objects (Nengo), which allows for the synthesis and simulation of neural models efficiently on the scale of Spaun, and provides support for constructing models using the NEF and the SPA. The resulting NEF/SPA/Nengo combination is a general tool set for both evaluating hypotheses about how the brain works, and for building systems that compute particular functions using neuron-like components.
Abstract: Damage to the right parietal cortex often leads to a syndrome known as unilateral neglect in which the patient fails to attend or respond to stimuli in left space. Recent work attempting to rehabilitate the disorder has made use of rightward-shifting prisms that displace visual input further rightward. After a brief period of adaptation to prisms, many of the symptoms of neglect show improvements that can last for hours or longer, depending on the adaptation procedure. Recent work has shown, however, that differential effects of prisms can be observed on actions (which are typically improved) and perceptual biases (which often remain unchanged). Here, we present a computational model capable of explaining some basic symptoms of neglect (line bisection behaviour), the effects of prism adaptation in both healthy controls and neglect patients and the observed dissociation between action and perception following prisms. The results of our simulations support recent contentions that prisms primarily influence behaviours normally thought to be controlled by the dorsal stream.
Abstract: In this thesis I present the Recurrent Error-driven Adaptive Control Hierarchy (REACH), a large-scale spiking neuron model of the motor cortices and cerebellum of the motor control system. The REACH model consists of anatomically organized spiking neurons that control a nonlinear three-link arm to perform reaching and handwriting, while being able to adapt to unknown changes in arm dynamics and structure. I show that the REACH model accounts for data across 19 clinical and experimental studies of the motor control system. These data include a mix of behavioural and neural spiking activity, across normal and damaged subjects performing adaptive and static tasks. The REACH model is a dynamical control system based on modern control theoretic methods, specifically operational space control, dynamic movement primitives, and nonlinear adaptive control. The model is implemented in spiking neurons using the Neural Engineering Framework (NEF). The model plans trajectories in end-effector space, and transforms these commands into joint torques that can be sent to the arm simulation. Adaptive components of the model are able to compensate for unknown kinematic or dynamic system parameters, such as arm segment length or mass. Using the NEF, the adaptive components of the system can be seeded with approximations of the system kinematics and dynamics, allowing faster convergence to stability. Stability proofs for nonlinear adaptation methods implemented in distributed systems with scalar output are presented. By implementing the motor control model in spiking neurons, biological constraints such as neurotransmitter time-constants and anatomical connectivity can be imposed, allowing further comparison to experimental data for model validation. The REACH model is compared to clinical data from human patients as well as neural recordings from monkeys performing reaching experiments.
The REACH model represents a novel integration of control theoretic methods and neuroscientific constraints to specify a general, adaptive, biologically plausible motor control algorithm.
Keywords: neuromorphic, robotics, NEF, Spinnaker
Abstract: Living organisms are capable of autonomously adapting to dynamically changing environments by receiving inputs from highly specialized sensory organs and elaborating them on the same parallel, power-efficient neural substrate. In this paper we present a prototype for a comprehensive integrated platform that allows replicating principles of neural information processing in real-time. Our system consists of (a) an autonomous mobile robotic platform, (b) on-board actuators and multiple (neuromorphic) sensors, and (c) the SpiNNaker computing system, a configurable neural architecture for exploration of parallel, brain-inspired models. The simulation of neurally inspired perception and reasoning algorithms is performed in real-time by distributed, low-power, low-latency event-driven computing nodes, which can be flexibly configured using C or specialized neural languages such as PyNN and Nengo. We conclude by demonstrating the platform in two experimental scenarios, exhibiting real-world closed loop behavior consisting of environmental perception, reasoning and execution of adequate motor actions.
Keywords: Aging,Cognitive decline,Raven's Progressive Matrices,Spiking neural model,Vector symbolic architectures
Abstract: Noise and heterogeneity are both known to benefit neural coding. Stochastic resonance describes how noise, in the form of random fluctuations in a neuron's membrane voltage, can improve neural representations of an input signal. Neuronal heterogeneity refers to variation in any one of a number of neuron parameters and is also known to increase the information content of a population. We explore the interaction between noise and heterogeneity and find that their benefits to neural coding are not independent. Specifically, a neuronal population better represents an input signal when either noise or heterogeneity is added, but adding both does not always improve representation further. To explain this phenomenon, we propose that noise and heterogeneity operate using two shared mechanisms: (1) temporally desynchronizing the firing of neurons in the population and (2) linearizing the response of a population to a stimulus. We first characterize the effects of noise and heterogeneity on the information content of populations of either leaky integrate-and-fire or FitzHugh-Nagumo neurons. We then examine how the mechanisms of desynchronization and linearization produce these effects, and find that they work to distribute information equally across all neurons in the population in terms of both signal timing (desynchronization) and signal amplitude (linearization). Without noise or heterogeneity, all neurons encode the same aspects of the input signal; adding noise or heterogeneity allows neurons to encode complementary aspects of the input signal, thereby increasing information content. The simulations detailed in this letter highlight the importance of heterogeneity and noise in population coding, demonstrate their complex interactions in terms of the information content of neurons, and explain these effects in terms of underlying mechanisms.
Abstract: As we experience life, we are constantly creating new memories, and the hippocampus plays an important role in the formation and recall of these episodic memories. We begin by describing the neural mechanisms that make the hippocampus ideally suited for memory formation, consolidation and recall. We then describe a biologically plausible spiking-neuron model of the hippocampus' role in episodic memory. The model includes a mechanism for generating temporal indexing vectors, for associating these indices with experience vectors to form episodes, and for replaying the original experience vectors in sequence when prompted. The model also associates these episodes with context vectors using synaptic plasticity, such that it is able to retrieve an episodic memory associated with a given context and replay it, even after long periods of time. We demonstrate the model's ability to experience sequences of sensory information in the form of semantic pointer vectors and replay the same sequences later, comparing the results to experimental data. In particular, the model runs a T-maze experiment in which a simulated rat is forced to choose between left or right at a decision point, during which the neural firing patterns of the model's place cells closely match those found in real rats performing the same task. We demonstrate that the model is robust to both spatial and non-spatial data, since the vector representation of the input data remains the same in either case. To our knowledge, this is the first spiking neural hippocampal model that can encode and recall sequences of both spatial and non-spatial data, while exhibiting temporal and spatial selectivity at a neural level.
Abstract: Associative memories have been an active area of research over the last forty years (Willshaw et al., 1969; Kohonen, 1972; Hopfield, 1982) because they form a central component of many cognitive architectures (Pollack, 1988; Anderson & Lebiere, 1998). We focus specifically on associative memories that store associations between arbitrary pairs of neural states. When a noisy version of an input state vector is presented to the network, it must output a "clean" version of the associated state vector. We describe a method for building large-scale networks for online learning of associations using spiking neurons, which works by exploiting the techniques of the Neural Engineering Framework (Eliasmith & Anderson, 2003). This framework has previously been used by Stewart et al. (2011) to create memories that possess a number of desirable properties including high accuracy, a fast, feedforward recall process, and efficient scaling, requiring a number of neurons linear in the number of stored associations. These memories have played a central role in several recent neural cognitive models including Spaun, the world's largest functional brain model (Eliasmith et al., 2012), as well as a proposal for human-scale, biologically plausible knowledge representation (Crawford et al., 2013). However, these memories are constructed using an offline optimization method that is not biologically plausible. Here we demonstrate how a similar set of connection weights can be arrived at through a biologically plausible, online learning process featuring a novel synaptic learning rule inspired in part by the well-known Oja learning rule (Oja, 1989). We present the details of our method and report the results of simulations exploring the storage capacity of these networks. We show that our technique scales up to large numbers of associations, and that recall performance degrades gracefully as the theoretical capacity is exceeded.
This work has been implemented in the Nengo simulation package (http://nengo.ca), which will allow straightforward implementations of spiking neural networks on neuromorphic hardware. The result of our work is a fast, adaptive, scalable associative memory composed of spiking neurons which we expect to be a valuable addition to large systems performing online neural computation.
Abstract: Neuroscience currently lacks a comprehensive theory of how cognitive processes can be implemented in a biological substrate. The Neural Engineering Framework (NEF) proposes one such theory, but has not yet gathered significant empirical support, partly due to the technical challenge of building and simulating large-scale models with the NEF. Nengo is a software tool that can be used to build and simulate large-scale models based on the NEF; currently, it is the primary resource for both teaching how the NEF is used, and for doing research that generates specific NEF models to explain experimental data. Nengo 1.4, which was implemented in Java, was used to create Spaun, the world's largest functional brain model (Eliasmith et al., 2012). Simulating Spaun highlighted limitations in Nengo 1.4's ability to support model construction with simple syntax, to simulate large models quickly, and to collect large amounts of data for subsequent analysis. This paper describes Nengo 2.0, which is implemented in Python and overcomes these limitations. It uses simple and extendable syntax, simulates a benchmark model on the scale of Spaun 50 times faster than Nengo 1.4, and has a flexible mechanism for collecting simulation results.
Abstract: Subjects performing simple reaction-time tasks can improve reaction times by learning the expected timing of action-imperative stimuli and preparing movements in advance. Success or failure on the previous trial is often an important factor for determining whether a subject will attempt to time the stimulus or wait for it to occur before initiating action. The medial prefrontal cortex (mPFC) has been implicated in enabling the top-down control of action depending on the outcome of the previous trial. Analysis of spike activity from the rat mPFC suggests that neural integration is a key mechanism for adaptive control in precisely timed tasks. We show through simulation that a spiking neural network consisting of coupled neural integrators captures the neural dynamics of the experimentally recorded mPFC. Errors lead to deviations in the normal dynamics of the system, a process that could enable learning from past mistakes. We expand on this coupled integrator network to construct a spiking neural network that performs a reaction-time task by following either a cue-response or timing strategy, and show that it performs the task with similar reaction times as experimental subjects while maintaining the same spiking dynamics as the experimentally recorded mPFC.
Keywords: bayesian optimization, model selection, hyperparameter optimization, scikit-learn
Abstract: Hyperopt-sklearn is a new software project that provides automatic algorithm configuration of the Scikit-learn machine learning library. Following Auto-Weka, we take the view that the choice of classifier and even the choice of preprocessing module can be taken together to represent a single large hyperparameter optimization problem. We use Hyperopt to define a search space that encompasses many standard components (e.g. SVM, RF, KNN, PCA, TFIDF) and common patterns of composing them together. We demonstrate, using search algorithms in Hyperopt and standard benchmarking data sets (MNIST, 20-Newsgroups, Convex Shapes), that searching this space is practical and effective. In particular, we improve on best-known scores for the model space for both MNIST and Convex Shapes.
Abstract: We present a mobile robot with sufficient computing power to simulate up to a quarter of a million neurons in real-time. We use this computing power, combined with various on-board sensory and motor systems (including silicon retinae) to implement a novel method for learning sensorimotor competences by example. That is, by temporarily manually controlling the robot, it can gather information about what sensorimotor mapping it should be performing. We show that such a learning-by-example system is well-suited to power efficient neuron-based computation (60 W for all quarter of a million neurons), that it can learn quickly (a few tens of seconds), and that its learning generalizes well to novel situations.
Keywords: Biology and life sciences, Circuit models, Coding mechanisms, Computational biology, Computational neuroscience, Neuroscience, Research Article, Sensory systems, Single neuron function, Visual system
Abstract: Visuospatial attention produces myriad effects on the activity and selectivity of cortical neurons. Spiking neuron models capable of reproducing a wide variety of these effects remain elusive. We present a model called the Attentional Routing Circuit (ARC) that provides a mechanistic description of selective attentional processing in cortex. The model is described mathematically and implemented at the level of individual spiking neurons, with the computations for performing selective attentional processing being mapped to specific neuron types and laminar circuitry. The model is used to simulate three studies of attention in macaque, and is shown to quantitatively match several observed forms of attentional modulation. Specifically, ARC demonstrates that, with shifts of spatial attention, neurons may exhibit shifting and shrinking of receptive fields; that responses may increase without changes in selectivity for non-spatial features (i.e., response gain); and that the effect on contrast-response functions is better explained as a response-gain effect than as a contrast-gain effect. Unlike past models, ARC embodies a single mechanism that unifies the above forms of attentional modulation, is consistent with a wide array of available data, and makes several specific and quantifiable predictions.
Abstract: We present a biologically based neural model capable of performing reinforcement learning in complex tasks. The model is unique in its ability to solve tasks that require the agent to make a sequence of unrewarded actions in order to reach the goal, in an environment where there are unknown and variable time delays between actions, state transitions, and rewards. Specifically, this is the first neural model of reinforcement learning able to function within a Semi-Markov Decision Process (SMDP) framework. We believe that this extension of current modelling efforts lays the groundwork for increasingly sophisticated models of human decision making.
Keywords: heterogeneity, noise, population coding, stochastic resonance
Keywords: semantic memory; convolution; random permutation; vector space models; distributional semantics
Abstract: Distributed models of lexical semantics increasingly incorporate information about word order. One influential method for encoding this information into high-dimensional spaces uses convolution to bind together vectors to form representations of numerous n-grams that a target word is a part of. The computational complexity of this method has led to the development of an alternative that uses random permutation to perform order-sensitive vector combinations. We describe a simplified form of order encoding with convolution that yields comparable performance to earlier models, and we discuss considerations of neural implementation that favor the use of the proposed encoding. We conclude that this new encoding method is a more neurally plausible alternative than its predecessors.
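To make the two order-encoding operations named in this abstract concrete, here is a minimal sketch in plain Python of binding by circular convolution and of reordering by a fixed random permutation. It does not reproduce the simplified encoding proposed in the paper. The vector dimension, the random vectors `left`, `word`, and `other`, and all function names are illustrative assumptions.

```python
import math
import random


def circular_convolution(a, b):
    """Bind two equal-length vectors: c[k] = sum_j a[j] * b[(k - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def involution(a):
    """Approximate inverse under circular convolution: a*[k] = a[-k mod n]."""
    n = len(a)
    return [a[(-k) % n] for k in range(n)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


def random_vector(n, rng):
    """Elements drawn from N(0, 1/n), so the expected squared length is 1."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]


rng = random.Random(0)
n = 256
left, word, other = (random_vector(n, rng) for _ in range(3))

# Convolution: bind a (hypothetical) position marker to a word, then unbind.
bound = circular_convolution(left, word)
decoded = circular_convolution(bound, involution(left))

# Permutation: reorder the word's elements, then undo the reordering.
perm = list(range(n))
rng.shuffle(perm)
permuted = [word[p] for p in perm]
inverse = [0] * n
for i, p in enumerate(perm):
    inverse[p] = i
restored = [permuted[inverse[i]] for i in range(n)]

sim_decoded = cosine(decoded, word)    # noisy copy of `word`
sim_unrelated = cosine(decoded, other) # unrelated vector, near zero
sim_bound = cosine(bound, word)        # bound vector resembles neither input
```

Unbinding with the involution recovers only a noisy copy of `word`, whereas the permutation is inverted exactly. The convolution here costs O(n^2) operations (O(n log n) with an FFT), while the permutation costs O(n).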
Abstract: In this thesis, I present a computational neural model that reproduces the high-level behavioural results of well-known fear conditioning experiments: first-order conditioning, second-order conditioning, sensory preconditioning, context conditioning, blocking, first-order extinction and renewal (AAB, ABC, ABA), and extinction and renewal after second-order conditioning and sensory preconditioning. The simulated neural populations used to account for the behaviour observed in these experiments correspond to known anatomical regions of the mammalian brain. Parts of the amygdala, periaqueductal gray, cortex and thalamus, and hippocampus are included and are connected to each other in a biologically plausible manner. The model was built using the principles of the Neural Engineering Framework (NEF): a mathematical framework that allows information to be encoded and manipulated in populations of neurons. Each population represents information via the spiking activity of simulated neurons, and is connected to one or more other populations; these connections allow computations to be performed on the information being represented. By specifying which populations are connected to which, and what functions these connections perform, I developed an information processing system that behaves analogously to the fear conditioning circuit in the brain.
Abstract: (Commentary) Quantum probability theory can be seen as a type of Vector Symbolic Architecture: mental states are vectors storing structured information and manipulated using algebraic operations. Furthermore, the operations needed by QP match those in other VSAs. This allows existing biologically realistic neural models to be adapted to provide a mechanistic explanation of the cognitive phenomena described in the target article.
Abstract: We present a novel, biologically plausible model of visual motion processing and perceptual decision making that is independent of the number of choice categories or alternatives. The implementation is a large-scale spiking neural circuit consisting of: 1) a velocity filter using the principle of oscillator interference to determine the direction and speed of pattern motion in V1; 2) a representation of motion evidence in the middle temporal area (MT); and 3) integration of sensory evidence over time by a higher-dimensional attractor network in the lateral intraparietal area (LIP). We demonstrate the model by reproducing behavioral and neural results from classic perceptual decision making experiments that test the perceived direction of motion of variable coherence dot kinetograms. Specifically, these results capture monkey data from two-alternative forced-choice motion decision tests. We note that without any reconfiguration of the circuit, the implementation can be used to make decisions among a continuum of alternatives.
Abstract: We present a neural mechanism for interpreting and executing visually presented commands. These are simple verb-noun commands (such as WRITE THREE) and can also include conditionals ([if] SEE SEVEN, [then] WRITE THREE). We apply this to a simplified version of our large-scale functional brain model "Spaun", where input is a 28x28 pixel visual stimulus, with a different pattern for each word. Output controls a simulated arm, giving hand-written answers. Cortical areas for categorizing, storing, and interpreting information are controlled by the basal ganglia (action selection) and thalamus (routing). The final model has approximately 100,000 LIF spiking neurons. We show that the model is extremely robust to neural damage (40 percent of neurons can be destroyed before performance drops significantly). Performance also drops for visual display times less than 250ms. Importantly, the system also scales to large vocabularies (approximately 100,000 nouns and verbs) without requiring an exponentially large number of neurons.
Keywords: category representation, image categorization, Neural Engineering Framework, vector symbolic architecture
Abstract: Although studies of categorization have been a staple of psychological research for decades, there continues to be substantial disagreement about how unique classes of objects are represented in the brain. We present a neural architecture for categorizing visual stimuli based on the Neural Engineering Framework and the manipulation of semantic pointers. The model accounts for how the visual system computes semantic representations from raw images, and how those representations are then manipulated to produce category judgments. All computations of the model are carried out in simulated spiking neurons. We demonstrate that the model matches human performance on two seminal behavioural studies of image-based concept acquisition: Posner and Keele (1968) and Regehr and Brooks (1993).
Abstract: We provide an overview and comparison of several recent large-scale brain models. In addition to discussing challenges involved with building large neural models, we identify several expected benefits of pursuing such a research program. We argue that these benefits are only likely to be realized if two basic guidelines are made central to the pursuit. The first is that such models need to be intimately tied to behavior. The second is that models, and more importantly their underlying methods, should provide mechanisms for varying the level of simulated detail. Consequently, we express concerns with models that insist on a 'correct' amount of detail while expecting interesting behavior to simply emerge.
Keywords: motor control, automaticity, expertise, procedural learning, basal ganglia, motor cortex
Abstract: The ability to develop expertise through practice is a hallmark of biological systems, for both cognitive and motor based skills. At first, animals exhibit high variability and perform slowly, reliant on feedback signals constantly evaluating performance. With practice, the system develops a proficiency and consistency in skill execution, reflected in an increase in the associated cortical area (Pascual-Leone, 1995). Here we present a neural model of this expertise development. In the model, initial attempts at performing a task are based on generalizing previously learned control signals, which we refer to generically as 'actions', stored in the cortex. The basal ganglia evaluates these actions and modulates their contributions to the output signal, creating a novel action that performs the desired task. With repeated performance, the cortex learns to generate this action on its own, eventually developing an explicit representation of the action that can be called directly. This transference allows the system to more quickly and consistently execute the task, reflecting development of expertise. We present simulation results matching both behavioral and single cell spiking data.
Keywords: cognitive architecture
Abstract: The predictive processing framework lacks many of the architectural and implementational details needed to fully investigate or evaluate the ideas it presents. One way to begin to fill in these details is by turning to standard control-theoretic descriptions of these types of systems (e.g., Kalman filters), and by building complex, unified computational models in biologically realistic neural simulations.
Abstract: We present a spiking neuron brain model implemented in 318,870 LIF neurons organized with distinct cortical modules, a basal ganglia, and a thalamus, that is capable of flexibly following memorized commands. Neural activity represents a structured set of rules, such as "If you see a 1, then push button A, and if you see a 2, then push button B". Synaptic connections between these neurons and the basal ganglia, thalamus, and other areas cause the system to detect when rules should be applied and to then do so. The model gives a reaction time difference of 77 ms between the simple and two-choice reaction time tasks, and requires 384 ms per item for sub-vocal counting, consistent with human experimental results. This is the first biologically realistic spiking neuron model capable of flexibly responding to complex structured instructions.
Abstract: We present a novel learning rule for learning transformations of sophisticated neural representations in a biologically plausible manner. We show that the rule can learn to transmit and bind semantic pointers. Semantic pointers have previously been used to build Spaun, which is currently the world's largest functional brain model (Eliasmith et al., 2012) and can perform several complex cognitive tasks. The learning rule combines a previously proposed supervised learning rule and a novel spiking form of the BCM unsupervised learning rule. We show that spiking BCM increases sparsity of connection weights at the cost of increased signal transmission error. We demonstrate that the combined learning rule can learn transformations as well as the supervised rule alone, and as well as the offline optimization used previously. We also demonstrate that the combined learning rule is more robust to changes in parameters and leads to better outcomes in higher dimensional spaces.
Abstract: Reinforcement learning based on rewarding or aversive stimuli is critical to understanding the adaptation of cognitive systems. One of the most basic and well-studied forms of reinforcement learning in mammals is found in fear conditioning. We present a biologically plausible spiking neuron model of mammalian fear conditioning and show that the model is capable of reproducing the results of four well known fear conditioning experiments (conditioning, second-order conditioning, blocking, and context-dependent extinction and renewal). The model contains approximately 2000 spiking neurons which make up various populations of primarily the amygdala, periaqueductal gray, and hippocampus. The connectivity and organization of these populations follows what is known about the fear conditioning circuit in mammalian brains. Input to the model is made up of populations representing sensory stimuli, contextual information, and electric shock, while the output is a population representing an autonomic fear response: freezing. Using a novel learning rule for spiking neurons, associations are learned between cues, contexts, and the aversive shock, reproducing the behaviors seen in rats during fear conditioning experiments.
Abstract: We discuss work aimed at building functional models of the whole brain implemented in large-scale simulations of millions of individual neurons. Recent developments in this area demonstrate that such models can explain a variety of behavioral, neurophysiological, and neuroanatomical data. We argue that these models hold the potential to expand our understanding of the brain by connecting these levels of analysis in new and informative ways. However, current modeling efforts fall short of the target of whole-brain modeling. Consequently, we discuss different avenues of research that continue to progress toward that distant, but achievable, goal.
Keywords: Fourier, Neural Engineering Framework, oscillators, path integration
Abstract: In 2005, Hafting et al. reported that some neurons in the entorhinal cortex (EC) fire bursts when the animal occupies locations organized in a hexagonal grid pattern in its spatial environment. Previous to that, place cells had been observed, firing bursts only when the animal occupied a particular region of the environment. Both of these types of cells exhibit theta-cycle modulation, firing bursts in the 4-12 Hz range. In particular, grid cells fire bursts of action potentials that precess with respect to the theta cycle, a phenomenon dubbed "theta precession". Since then, various models have been proposed to explain the relationship between grid cells, place cells, and theta precession. However, most models have lacked a fundamental, overarching framework. As a reformulation of the pioneering work of Welday et al. (2011), we propose that the EC is implementing its spatial coding using the Fourier Transform. We show how the Fourier Shift Theorem relates to the phases of velocity-controlled oscillators (VCOs), and propose a model for how various other spatial maps might be implemented (e.g., border cells). Our model exhibits the standard EC behaviours: grid cells, place cells, and phase precession, as borne out by theoretical computations and spiking-neuron simulations. We hope that framing this constellation of phenomena in Fourier Theory will accelerate our understanding of how the EC – and perhaps the hippocampus – encodes spatial information.
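The link between the Fourier Shift Theorem and oscillator phases can be illustrated with a small sketch. It is a simplification under stated assumptions, not the model from the paper: space is a 1D periodic track of `n` positions rather than a 2D environment, there is one hypothetical oscillator per discrete frequency index `k`, and the names `dft`, `integrate_phases`, and `bump` are invented for illustration. Each oscillator advances its phase in proportion to velocity. Applying the accumulated phases to the Fourier coefficients of an activity bump then yields the coefficients of the bump shifted by the distance travelled.

```python
import cmath
import math


def dft(signal):
    """Discrete Fourier transform: F[k] = sum_t f[t] * exp(-2*pi*i*k*t/n)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def integrate_phases(n, velocities, dt):
    """Advance each oscillator's phase by (2*pi*k/n) * v * dt per time step."""
    phases = [0.0] * n
    for v in velocities:
        for k in range(n):
            phases[k] += 2.0 * math.pi * k / n * v * dt
    return phases


n = 16
# Activity bump centred on position 4 of the periodic track.
bump = [math.exp(-0.5 * ((t - 4) / 1.5) ** 2) for t in range(n)]

# Self-motion signal: varying speed, total displacement of 3 positions.
velocities = [2.0, 2.0, 1.0, 1.0]
dt = 0.5
displacement = sum(v * dt for v in velocities)

# Path integration in the Fourier domain: rotate each coefficient by its phase.
phases = integrate_phases(n, velocities, dt)
coeffs = dft(bump)
shifted_coeffs = [c * cmath.exp(-1j * p) for c, p in zip(coeffs, phases)]

# Reference: shift the bump directly in space and transform it.
shift = int(round(displacement))
shifted_bump = [bump[(t - shift) % n] for t in range(n)]
direct_coeffs = dft(shifted_bump)

max_error = max(abs(a - b) for a, b in zip(shifted_coeffs, direct_coeffs))
```

The two sets of coefficients agree to floating-point precision, which is the shift theorem: translating a function by `s` multiplies its coefficient at frequency `k` by `exp(-2*pi*i*k*s/n)`.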
Keywords: cleanup memory, knowledge representation, Semantic Pointer Architecture, vector symbolic architecture, WordNet
Abstract: Several approaches to implementing symbol-like representations in neurally plausible models have been proposed. These approaches include binding through synchrony (Shastri & Ajjanagadde, 1993), mesh binding (van Der Velde & de Kamps, 2006), and conjunctive binding (Smolensky, 1990; Plate, 2003). Recent theoretical work has suggested that most of these methods will not scale well – that is, they cannot encode structured representations that use any of the tens of thousands of terms in the adult lexicon without making implausible resource assumptions (Stewart & Eliasmith, 2011; Eliasmith, 2013). Here we present an approach that will scale appropriately, and which is based on neurally implementing a type of Vector Symbolic Architecture (VSA). Specifically, we construct a spiking neural network composed of about 2.5 million neurons that employs a VSA to encode and decode the main lexical relations in WordNet, a semantic network containing over 100,000 concepts (Fellbaum, 1998). We experimentally demonstrate the capabilities of our model by measuring its performance on three tasks which test its ability to accurately traverse the WordNet hierarchy, as well as to decode sentences employing any WordNet term while preserving the original lexical structure. We argue that these results show that our approach is uniquely well-suited to providing a biologically plausible, human-scale account of the structured representations that underwrite cognition.
Abstract: Recent papers have shown that it is possible to implement large-scale neural network models that perform complex algorithms in a biologically realistic way. However, such models have been simulated on architectures unable to perform real-time simulations. In previous work we showed that simple models can be simulated in real-time on the SpiNNaker neuromimetic architecture. However, such models were "static": the algorithm performed was defined at design-time. In this paper we present a novel learning rule that exploits the peculiarities of the SpiNNaker system, enabling models designed with the Neural Engineering Framework (NEF) to learn transfer functions using a supervised framework. We show that the proposed learning rule, belonging to the Prescribed Error Sensitivity (PES) class, is able to learn, effectively, both linear and non-linear functions.
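As a rough illustration of an error-driven rule in the PES family, the sketch below adjusts linear decoders in proportion to the product of a decoded error and each neuron's activity. It is not the SpiNNaker rule from the paper. It assumes rate-based rectified-linear neurons instead of spiking neurons, and the tuning-curve parameters, learning rate, target function, and all function names are hypothetical.

```python
import random


def make_population(n, rng):
    """Hypothetical rate neurons: (encoder, gain, bias) for x in [-1, 1]."""
    return [
        (rng.choice((-1.0, 1.0)), rng.uniform(0.5, 2.0), rng.uniform(-1.0, 1.0))
        for _ in range(n)
    ]


def activities(population, x):
    """Rectified-linear tuning curves: a_i = max(0, gain * encoder * x + bias)."""
    return [max(0.0, gain * enc * x + bias) for enc, gain, bias in population]


def decode(decoders, acts):
    """Linear readout of the represented value."""
    return sum(d * a for d, a in zip(decoders, acts))


def pes_step(decoders, acts, error, learning_rate):
    """Delta-rule update on decoders: d_i += (kappa / n) * error * a_i."""
    scale = learning_rate / len(decoders)
    return [d + scale * error * a for d, a in zip(decoders, acts)]


def mean_squared_error(population, decoders, target, xs):
    """Average squared decoding error over a set of test inputs."""
    errors = [target(x) - decode(decoders, activities(population, x)) for x in xs]
    return sum(e * e for e in errors) / len(errors)


def target(x):
    """A non-linear function to learn (illustrative choice)."""
    return x * x


rng = random.Random(1)
population = make_population(50, rng)
decoders = [0.0] * len(population)
test_points = [i / 10.0 - 1.0 for i in range(21)]

error_before = mean_squared_error(population, decoders, target, test_points)
for _ in range(5000):
    x = rng.uniform(-1.0, 1.0)
    acts = activities(population, x)
    error = target(x) - decode(decoders, acts)  # supervised error signal
    decoders = pes_step(decoders, acts, error, learning_rate=0.2)
error_after = mean_squared_error(population, decoders, target, test_points)
```

Sign conventions for the error term differ between descriptions of PES; here the error is defined as target minus estimate, so the update adds to the decoders.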
Abstract: The Neural Engineering Framework (NEF) is a general methodology that allows the building of large-scale, biologically plausible, neural models of cognition. The NEF acts as a neural compiler: once the properties of the neurons, the values to be represented, and the functions to be computed are specified, it solves for the connection weights between components that will perform the desired functions. Importantly, this works not only for feed-forward computations, but also for recurrent connections, allowing for complex dynamical systems including integrators, oscillators, Kalman filters, etc. The NEF also incorporates realistic local error-driven learning rules, allowing for the online adaptation and optimisation of responses. The NEF has been used to model visual attention, inductive reasoning, reinforcement learning and many other tasks. Recently, we used it to build Spaun, the world's largest functional brain model, using 2.5 million neurons to perform eight different cognitive tasks by interpreting visual input and producing hand-written output via a simulated 6-muscle arm. Our open-source software Nengo was used for all of these, and is available at http://nengo.ca, along with tutorials, demos, and downloadable models.
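The statement that recurrent connections yield dynamical systems such as integrators can be sketched at the level of the represented value, leaving out the neurons entirely. The sketch assumes a first-order low-pass synapse with time constant `tau`. Under that assumption, the standard NEF mapping for a desired system dx/dt = a*x + b*u uses a recurrent transform of `tau * a + 1` and an input transform of `tau * b`. The function name, the time constant, and the input signal are illustrative choices.

```python
def simulate_recurrent(a, b, tau, inputs, dt):
    """Simulate dx/dt = a*x + b*u using only a low-pass filter plus feedback.

    The synapse obeys tau * ds/dt = -s + drive. Feeding back (tau*a + 1) * s
    and scaling the input by tau * b makes s follow the desired dynamics.
    """
    feedback = tau * a + 1.0
    input_scale = tau * b
    state = 0.0
    trace = []
    for u in inputs:
        drive = feedback * state + input_scale * u  # total synaptic input
        state += dt / tau * (drive - state)         # Euler step of the filter
        trace.append(state)
    return trace


dt = 0.001
tau = 0.1
# Input of 1.0 for the first 0.5 s, then silence for 0.5 s.
inputs = [1.0] * 500 + [0.0] * 500

# Integrator: a = 0, b = 1, so the feedback gain is 1 and the input gain is tau.
trace = simulate_recurrent(a=0.0, b=1.0, tau=tau, inputs=inputs, dt=dt)
value_at_input_end = trace[499]
value_at_end = trace[-1]
```

The state ramps to about 0.5 while the input is on and then holds that value, which is the behaviour expected of an integrator. With spiking neurons the held value would drift because of representational noise, which this value-level sketch does not capture.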
Keywords: large-scale spiking model, oscillator interference, visual motion
Abstract: A prerequisite for the perception of motion in primates is the transformation of varying intensities of light on the retina into an estimation of position, direction and speed of coherent objects. The neuro-computational mechanisms relevant for object feature encoding have been thoroughly explored, with many neurally plausible models able to represent static visual scenes. However, motion estimation requires the comparison of successive scenes through time. Precisely how the necessary neural dynamics arise and how other related neural system components interoperate have yet to be shown in a large-scale, biologically realistic simulation. The proposed model simulates a spiking neural network computation for representing object velocities in cortical areas V1 and middle temporal area (MT). The essential neural dynamics, hypothesized to reside in networks of V1 simple cells, are implemented through recurrent population connections that generate oscillating spatiotemporal tunings. These oscillators produce a resonance response when stimuli move in an appropriate manner in their receptive fields. The simulation shows close agreement between the predicted and actual impulse responses from V1 simple cells using an ideal stimulus. By integrating the activities of like V1 simple cells over space, a local measure of visual pattern velocity can be produced. This measure is also the linear weight of an associated velocity in a retinotopic map of optical flow. As a demonstration, the classic motion stimuli of drifting sinusoidal gratings and variably coherent dots are used as test stimuli and optical flow maps are generated. Vector field representations of this structure may serve as inputs for perception and decision making processes in later brain areas.
Abstract: In a simple reaction-time (RT) task with predictable foreperiods, subjects employ two strategies. They either wait until the cue and then respond, or they time the foreperiod and respond when the cue should occur. Evidence for these performance strategies has been detected in rodents, humans and other primates. A key brain region for implementing these control strategies is the medial prefrontal cortex (mPFC). Neurons in this brain region show changes in firing rates around the start of trials or fire persistently during the foreperiod of simple RT tasks, and exert control over the motor system by influencing firing rates in the motor cortex during the foreperiod activity (Narayanan & Laubach, 2006). Here, we describe a neural circuit model based on the known neuroanatomy that reproduces the observed activity patterns in rat mPFC and exhibits adjustments in the behavioral strategy based on the subject's recent outcomes. A neural circuit based on Singh and Eliasmith (2006) tracks the behavioural state and the time elapsed in that state. This circuit serves as a top-down controller acting on a neural control system. When the top-down control is not being exerted, the system waits for the cue and responds at cue onset. When the foreperiod can be timed, top-down control is exerted when the behavioral response is predicted to occur. These adjustments can occur at any time and do not require synaptic weight changes.
Abstract: We present a large-scale cognitive neural model called Spaun (Semantic Pointer Architecture: Unified Network), and show simulation results on 6 tasks (digit recognition, tracing from memory, serial working memory, question answering, addition by counting, and symbolic pattern completion). The model consists of 2.3 million spiking neurons whose neural properties, organization, and connectivity match that of the mammalian brain. Input consists of images of handwritten and typed numbers and symbols, and output is the motion of a 2 degree-of-freedom arm that writes the model's responses. Tasks can be presented in any order, with no “rewiring” of the brain for each task. Instead, the model is capable of internal cognitive control (via the basal ganglia), selectively routing information throughout the brain and recruiting different cortical components as needed for each task.
Abstract: The Neural Engineering Framework (NEF) is a general methodology that allows you to build large-scale, biologically plausible, neural models of cognition. In particular, it acts as a neural compiler: you specify the properties of the neurons, the values to be represented, and the functions to be computed, and it solves for the connection weights between components that will perform the desired functions. Importantly, this works not only for feed-forward computations, but recurrent connections as well, allowing for complex dynamical systems including integrators, oscillators, Kalman filters, and so on. It also incorporates realistic local error-driven learning rules, allowing for online adaptation and optimization of responses. The NEF has been used to model visual attention, inductive reasoning, reinforcement learning, and many other tasks. Recently, we used it to build Spaun, the world's largest functional brain model, using 2.5 million neurons to perform eight different cognitive tasks by interpreting visual input and producing hand-written output via a simulated 6-muscle arm. Our open-source software Nengo was used for all of these, and is available at http://nengo.ca, along with tutorials, demos, and downloadable models.
Abstract: We expand our existing spiking neuron model of decision making in the cortex and basal ganglia to include local learning on the synaptic connections between the cortex and striatum, modulated by a dopaminergic reward signal. We then compare this model to animal data in the bandit task, which is used to test rodent learning in conditions involving forced choice under rewards. Our results indicate a good match in terms of both behavioral learning results and spike patterns in the ventral striatum. The model successfully generalizes to learning the utilities of multiple actions, and can learn to choose different actions in different states. The purpose of our model is to provide both high-level behavioral predictions and low-level spike timing predictions while respecting known neurophysiology and neuroanatomy.
Abstract: Motor prostheses aim to restore function to disabled patients. Despite compelling proof of concept systems, barriers to clinical translation remain. One challenge is to develop a low-power, fully-implantable system that dissipates only minimal power so as not to damage tissue. To this end, we implemented a Kalman-filter based decoder via a spiking neural network (SNN) and tested it in brain-machine interface (BMI) experiments with a rhesus monkey. The Kalman filter was trained to predict the arm's velocity and mapped on to the SNN using the Neural Engineering Framework (NEF). A 2,000-neuron embedded Matlab SNN implementation runs in real-time and its closed-loop performance is quite comparable to that of the standard Kalman filter. The success of this closed-loop decoder holds promise for hardware SNN implementations of statistical signal processing algorithms on neuromorphic chips, which may offer power savings necessary to overcome a major obstacle to the successful clinical translation of neural motor prostheses.
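For readers unfamiliar with the decoder that the abstract above maps onto a spiking network, the following is a minimal sketch of a standard (non-spiking) linear Kalman filter in NumPy. It is not the authors' decoder: the toy 2D "velocity" state, the ten simulated observation channels, all noise levels, and the matrix names `A`, `W`, `C`, and `Q` are assumptions made for illustration only.

```python
import numpy as np


def kalman_step(x, P, y, A, W, C, Q):
    """One predict/update cycle of a linear Kalman filter.

    x: state estimate (n,), P: estimate covariance (n, n), y: observation (m,)
    A: state transition, W: process noise covariance,
    C: observation matrix, Q: observation noise covariance.
    """
    # Predict the next state from the dynamics model.
    x_pred = A @ x
    P_pred = A @ P @ A.T + W
    # Correct the prediction using the new observation.
    S = C @ P_pred @ C.T + Q
    K = P_pred @ C.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new


# Toy problem (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n_state, n_obs, n_steps = 2, 10, 500
A = 0.95 * np.eye(n_state)                 # velocity decays slowly
W = (0.3 ** 2) * np.eye(n_state)           # process noise covariance
C = rng.normal(size=(n_obs, n_state))      # channels respond linearly to velocity
Q = (0.5 ** 2) * np.eye(n_obs)             # observation noise covariance

v_true = np.zeros(n_state)
x_est = np.zeros(n_state)
P = np.eye(n_state)
sq_err_filter, sq_err_zero = [], []
for _ in range(n_steps):
    v_true = A @ v_true + rng.normal(scale=0.3, size=n_state)
    y = C @ v_true + rng.normal(scale=0.5, size=n_obs)
    x_est, P = kalman_step(x_est, P, y, A, W, C, Q)
    sq_err_filter.append(np.sum((x_est - v_true) ** 2))
    sq_err_zero.append(np.sum(v_true ** 2))  # baseline: always guess zero

rmse_filter = np.sqrt(np.mean(sq_err_filter))
rmse_zero = np.sqrt(np.mean(sq_err_zero))
print("filter RMSE:", rmse_filter, "zero-guess RMSE:", rmse_zero)
```

The recurrent predict/update structure is the kind of dynamical system that the NEF can, according to the abstract, map onto a spiking network. That mapping is not shown here.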
Abstract: Inductive reasoning is a fundamental and complex aspect of human intelligence. In particular, how do subjects, given a set of particular examples, generate general descriptions of the rules governing that set? We present a biologically plausible method for accomplishing this task and implement it in a spiking neuron model. We demonstrate the success of this model by applying it to the problem domain of Raven's Progressive Matrices, a widely used tool in the field of intelligence testing. The model is able to generate the rules necessary to correctly solve Raven's items, as well as recreate many of the experimental effects observed in human subjects.
Abstract: Learning is central to the exploration of intelligence. Psychology and machine learning provide high-level explanations of how rational agents learn. Neuroscience provides low-level descriptions of how the brain changes as a result of learning. This thesis attempts to bridge the gap between these two levels of description by solving problems using machine learning ideas, implemented in biologically plausible spiking neural networks with experimentally supported learning rules. We present three novel neural models that contribute to the understanding of how the brain might solve the three main problems posed by machine learning: supervised learning, in which the rational agent has a fine-grained feedback signal; reinforcement learning, in which the agent gets sparse feedback; and unsupervised learning, in which the agent has no explicit environmental feedback. In supervised learning, we argue that previous models of supervised learning in spiking neural networks solve a problem that is less general than the supervised learning problem posed by machine learning. We use an existing learning rule to solve the general supervised learning problem with a spiking neural network. We show that the learning rule can be mapped onto the well-known backpropagation rule used in artificial neural networks. In reinforcement learning, we augment an existing model of the basal ganglia to implement a simple actor-critic model that has a direct mapping to brain areas. The model is used to recreate behavioural and neural results from an experimental study of rats performing a simple reinforcement learning task. In unsupervised learning, we show that the BCM rule, a common learning rule used in unsupervised learning with rate-based neurons, can be adapted to a spiking neural network. We recreate the effects of STDP, a learning rule with strict time dependencies, using BCM, which does not explicitly remember the times of previous spikes.
The simulations suggest that BCM is a more general rule than STDP. Finally, we propose a novel learning rule that can be used in all three of these simulations. The existence of such a rule suggests that the three types of learning examined separately in machine learning may not be implemented with separate processes in the brain.
Abstract: We present a novel error-modulated spike-timing-dependent learning rule that utilizes a global error signal and the tuning properties of neurons in a population to learn arbitrary transformations on n-dimensional signals. This rule addresses the gap between low-level spike-timing learning rules modifying individual synaptic weights and higher-level learning schemes that characterize behavioural changes in an animal. The learning rule is first analyzed in a small spiking neural network. Using the encoding/decoding framework described by Eliasmith and Anderson (2003), we show that the rule can learn linear and non-linear transformations on n-dimensional signals. The learning rule arrives at a connection weight matrix that differs significantly from the connection weight matrix found analytically by Eliasmith and Anderson's method, but performs similarly well. We then use the learning rule to augment Stewart et al.'s biologically plausible implementation of action selection in the basal ganglia (2009). Their implementation forms the "actor" module in the actor-critic reinforcement learning architecture described by Barto (1995). We add a "critic" module, inspired by the physiology of the ventral striatum, that can modulate the model's likelihood of selecting actions based on the current state and the history of rewards obtained as a result of taking certain actions in that state. Despite being a complicated model with several interconnected populations, we are able to use our learning rule without any modifications. As a result, we suggest that this rule provides a unique and biologically plausible characterization of supervised and semi-supervised learning in the brain.
Abstract: Concepts are widely agreed to be the basic constituents of thought. Amongst philosophers and psychologists, however, the question of how concepts are structured has been a longstanding problem and a locus of disagreement. I draw on recent work describing how representational content is ascribed to populations of neurons to develop a novel solution to this problem. Because disputes over the structure of concepts often reflect divergent explanatory goals, I begin by arguing for a set of six criteria that a good theory ought to accommodate. These criteria address philosophical concerns related to content, reference, scope, publicity, and compositionality, and psychological concerns related to categorization phenomena and neural plausibility. Next, I evaluate a number of existing theoretical approaches in relation to these six criteria. I consider classical views that identify concepts with definitions, similarity-based views that identify concepts with prototypes or exemplars, theory-based views that identify concepts with explanatory schemas, and atomistic views that identify concepts with unstructured mental symbols that enter into law-like relations with their referents. I conclude that none of these accounts can satisfactorily accommodate all of the criteria. I then describe the theory of representational content that I employ to motivate a novel account of concept structure. I briefly defend this theory against competitors, and I describe how it can be scaled from the level of basic perceptual representations to the level of highly complex conceptual representations. On the basis of this description, I contend that concepts are structured dynamically through sets of transformations of a single source representation, and that the content of a given concept specifies the set of potential transformations it can enter into. I conclude by demonstrating the ability of this account to meet all of the criteria introduced beforehand.
I consider objections to my views throughout.
Abstract: We present a model of attentional routing called the Attentional Routing Circuit (ARC) that extends an existing model of spiking neurons with dendritic nonlinearities. Specifically, we employ the Poirazi et al. (2003) pyramidal neuron in a population coding framework. ARC demonstrates that the dendritic nonlinearities can be exploited to result in selective routing, with a decrease in the number of cells needed by a factor of 5 as compared with a linear dendrite model. Routing of attended information occurs through the modulation of feedforward visual signals by a cortical control signal specifying the location and size of the attended target. The model is fully specified in spiking single cells. Our approach differs from past work on shifter circuits by having more efficient control, and using a more biologically detailed substrate. Our approach differs from existing models that use gain fields by providing precise hypotheses about how the control signals are generated and distributed in a hierarchical model in spiking neurons. Further, the model accounts for numerous experimental findings regarding the timing, strength and extent of attentional modulation in ventral stream areas, and the perceived contrast enhancement of attended stimuli. To further demonstrate the plausibility of ARC, it