Jun 28, 2018
by Baolin Peng; Zhengdong Lu; Hang Li; Kam-Fai Wong
We propose Neural Reasoner, a framework for neural network-based reasoning over natural language sentences. Given a question, Neural Reasoner can infer over multiple supporting facts and find an answer to the question in specific forms. Neural Reasoner has 1) a specific interaction-pooling mechanism, allowing it to examine multiple facts, and 2) a deep architecture, allowing it to model the complicated logical relations in reasoning tasks. Assuming no particular structure exists in the question...
Topics: Computation and Language, Artificial Intelligence, Computing Research Repository, Learning, Neural...
Source: http://arxiv.org/abs/1508.05508
Jun 28, 2018
by Matthew Lai
This report presents Giraffe, a chess engine that uses self-play to discover all its domain-specific knowledge, with minimal hand-crafted knowledge given by the programmer. Unlike previous attempts using machine learning only to perform parameter-tuning on hand-crafted evaluation functions, Giraffe's learning system also performs automatic feature extraction and pattern recognition. The trained evaluation function performs comparably to the evaluation functions of state-of-the-art chess engines...
Topics: Artificial Intelligence, Computing Research Repository, Learning, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1509.01549
Jun 29, 2018
by Naman Agarwal; Zeyuan Allen-Zhu; Brian Bullins; Elad Hazan; Tengyu Ma
We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. The time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. Our algorithm applies to a general class of optimization problems including training a neural network and other non-convex objectives arising...
Topics: Data Structures and Algorithms, Machine Learning, Mathematics, Optimization and Control,...
Source: http://arxiv.org/abs/1611.01146
Jun 28, 2018
by Ryan Lowe; Nissan Pow; Iulian Serban; Joelle Pineau
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from...
Topics: Computation and Language, Artificial Intelligence, Computing Research Repository, Learning, Neural...
Source: http://arxiv.org/abs/1506.08909
Jun 28, 2018
by Takayuki Osogami; Makoto Otsuka
We propose a particularly structured Boltzmann machine, which we refer to as a dynamic Boltzmann machine (DyBM), as a stochastic model of a multi-dimensional time-series. The DyBM can have infinitely many layers of units but allows exact and efficient inference and learning when its parameters have a proposed structure. This proposed structure is motivated by postulates and observations, from biological neural networks, that the synaptic weight is strengthened or weakened, depending on the...
Topics: Statistics, Machine Learning, Neural and Evolutionary Computing, Learning, Computing Research...
Source: http://arxiv.org/abs/1509.08634
Jun 25, 2018
by Chao Dong; Chen Change Loy; Kaiming He; Xiaoou Tang
We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component...
Topics: Computer Vision and Pattern Recognition, Computing Research Repository, Neural and Evolutionary...
Source: http://arxiv.org/abs/1501.00092
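
The three-stage mapping described above is compact enough to sketch. Below is a minimal PyTorch rendering of the paper's 9-1-5 configuration with 64 and 32 feature maps, operating on a single luminance channel that has already been bicubic-upscaled to the target size; the padding is added here only to preserve spatial size and is not part of the original design.

    import torch.nn as nn

    # SRCNN, 9-1-5 configuration: patch extraction/representation,
    # non-linear mapping, reconstruction. Input: one-channel image
    # already upscaled to the target resolution by bicubic interpolation.
    srcnn = nn.Sequential(
        nn.Conv2d(1, 64, kernel_size=9, padding=4),
        nn.ReLU(),
        nn.Conv2d(64, 32, kernel_size=1),
        nn.ReLU(),
        nn.Conv2d(32, 1, kernel_size=5, padding=2),
    )
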
Jun 29, 2018
by Tejas D. Kulkarni; Ardavan Saeedi; Simanta Gautam; Samuel J. Gershman
Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components -- a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state and the reward predictor maps states to scalar rewards. The value function of a state can...
Topics: Machine Learning, Artificial Intelligence, Learning, Statistics, Neural and Evolutionary Computing,...
Source: http://arxiv.org/abs/1606.02396
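
The value decomposition above is easy to state in code. Here is a tabular sketch of the successor-representation updates, with hypothetical learning rate and discount; the paper's contribution is a deep variant in which the table is replaced by learned features.

    import numpy as np

    n_states, gamma, alpha = 5, 0.95, 0.1
    M = np.zeros((n_states, n_states))  # successor map: expected discounted occupancy
    w = np.zeros(n_states)              # reward predictor (here: one weight per state)

    def sr_update(s, r, s_next):
        # TD update of the successor map, plus a running estimate of the
        # reward observed on entering s_next.
        onehot = np.eye(n_states)[s]
        M[s] += alpha * (onehot + gamma * M[s_next] - M[s])
        w[s_next] += alpha * (r - w[s_next])

    def value(s):
        # The value function factors into occupancy times predicted reward.
        return M[s] @ w
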
Jun 28, 2018
by Pierre Baldi; Peter Sadowski
In a physical neural system, where storage and processing are intimately intertwined, the rules for adjusting the synaptic weights can only depend on variables that are available locally, such as the activity of the pre- and post-synaptic neurons, resulting in local learning rules. A systematic framework for studying the space of local learning rules is obtained by first specifying the nature of the local variables, and then the functional form that ties them together into each learning rule....
Topics: Statistics, Computing Research Repository, Learning, Machine Learning, Neural and Evolutionary...
Source: http://arxiv.org/abs/1506.06472
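
As a concrete instance of the kind of rule this framework covers, here is Oja's classic local rule, in which the weight change depends only on the pre-synaptic input and the post-synaptic output. This is an illustration of locality, not a rule proposed in the paper.

    import numpy as np

    def oja_step(w, x, eta=0.01):
        # Local rule: the update uses only the pre-synaptic activity x
        # and the post-synaptic activity y = w . x.
        y = w @ x
        return w + eta * y * (x - y * w)
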
Jun 28, 2018
by Dzmitry Bahdanau; Jan Chorowski; Dmitriy Serdyuk; Philemon Brakel; Yoshua Bengio
Many of the current state-of-the-art Large Vocabulary Continuous Speech Recognition Systems (LVCSR) are hybrids of neural networks and Hidden Markov Models (HMMs). Most of these systems contain separate components that deal with the acoustic modelling, language modelling and sequence decoding. We investigate a more direct approach in which the HMM is replaced with a Recurrent Neural Network (RNN) that performs sequence prediction directly at the character level. Alignment between the input...
Topics: Computation and Language, Artificial Intelligence, Computing Research Repository, Learning, Neural...
Source: http://arxiv.org/abs/1508.04395
Jun 25, 2018
by Iñaki Fernández Pérez; Amine Boumaza; François Charpillet
In this paper, we study the impact of selection methods in the context of on-line on-board distributed evolutionary algorithms. We propose a variant of the mEDEA algorithm in which we add a selection operator, and we apply it in a task-driven scenario. We evaluate four selection methods that induce different intensities of selection pressure in a multi-robot navigation-with-obstacle-avoidance task and a collective foraging task. Experiments show that a small intensity of selection pressure is...
Topics: Neural and Evolutionary Computing, Multiagent Systems, Artificial Intelligence, Computing Research...
Source: http://arxiv.org/abs/1501.01457
Jun 27, 2018
by Ankit B. Patel; Tan Nguyen; Richard G. Baraniuk
A grand challenge in machine learning is the development of computational algorithms that match or outperform humans in perceptual inference tasks that are complicated by nuisance variation. For instance, visual object recognition involves the unknown object position, orientation, and scale, while speech recognition involves the unknown voice pronunciation, pitch, and speed. Recently, a new breed of deep learning algorithms has emerged for high-nuisance inference tasks...
Topics: Machine Learning, Statistics, Computer Vision and Pattern Recognition, Neural and Evolutionary...
Source: http://arxiv.org/abs/1504.00641
Jun 28, 2018
by Hao Yi Ong; Kevin Chavez; Augustus Hong
We propose a distributed deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is based on the deep Q-network, a convolutional neural network trained with a variant of Q-learning. Its input is raw pixels and its output is a value function estimating future rewards from taking an action given a system state. To distribute the deep Q-network training, we adapt the DistBelief software framework to the context...
Topics: Artificial Intelligence, Computing Research Repository, Distributed, Parallel, and Cluster...
Source: http://arxiv.org/abs/1508.04186
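
The value function being distributed here is the standard Q-learning target. Below is a tabular stand-in (state and action counts are hypothetical); the deep Q-network regresses toward the same bootstrapped target with a convolutional network in place of the table.

    import numpy as np

    n_states, n_actions = 10, 4
    Q = np.zeros((n_states, n_actions))

    def q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
        # One-step Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        target = r + gamma * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])
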
Jun 28, 2018
by Guillaume Desjardins; Karen Simonyan; Razvan Pascanu; Koray Kavukcuoglu
We introduce Natural Neural Networks, a novel family of algorithms that speed up convergence by adapting their internal representation during training to improve conditioning of the Fisher matrix. In particular, we show a specific example that employs a simple and efficient reparametrization of the neural network weights by implicitly whitening the representation obtained at each layer, while preserving the feed-forward computation of the network. Such networks can be trained efficiently via...
Topics: Statistics, Computing Research Repository, Machine Learning, Learning, Neural and Evolutionary...
Source: http://arxiv.org/abs/1507.00210
Jun 29, 2018
by Scott Wisdom; Thomas Powers; John R. Hershey; Jonathan Le Roux; Les Atlas
Recurrent neural networks are powerful models for processing sequential data, but they are generally plagued by vanishing and exploding gradient problems. Unitary recurrent neural networks (uRNNs), which use unitary recurrence matrices, have recently been proposed as a means to avoid these issues. However, in previous experiments, the recurrence matrices were restricted to be a product of parameterized unitary matrices, and an open question remains: when does such a parameterization fail to...
Topics: Machine Learning, Learning, Neural and Evolutionary Computing, Computing Research Repository,...
Source: http://arxiv.org/abs/1611.00035
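
One way to make "full capacity" concrete: the unitary matrix nearest in Frobenius norm to an arbitrary square matrix comes from its SVD. The projection below is an illustration only; the paper instead optimizes directly over the full set of unitary matrices with a manifold update.

    import numpy as np

    def nearest_unitary(W):
        # Polar/SVD projection: replace the singular values of W by ones.
        u, _, vh = np.linalg.svd(W)
        return u @ vh
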
Jun 29, 2018
by Hado van Hasselt; Arthur Guez; Matteo Hessel; Volodymyr Mnih; David Silver
Most learning algorithms are not invariant to the scale of the function that is being approximated. We propose to adaptively normalize the targets used in learning. This is useful in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to play Atari games, where the rewards were all clipped to a predetermined range. This clipping facilitates learning...
Topics: Machine Learning, Artificial Intelligence, Statistics, Learning, Neural and Evolutionary Computing,...
Source: http://arxiv.org/abs/1602.07714
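
A minimal sketch of adaptively normalizing targets with running statistics follows (the step size is a hypothetical choice). The paper's method additionally rescales the network's output layer whenever the statistics change, so that previously learned predictions are preserved.

    import numpy as np

    class TargetNormalizer:
        def __init__(self, beta=1e-3):
            self.mean, self.sq_mean, self.beta = 0.0, 1.0, beta

        def update(self, target):
            # Exponential moving estimates of the first two moments.
            self.mean += self.beta * (target - self.mean)
            self.sq_mean += self.beta * (target ** 2 - self.sq_mean)

        def normalize(self, target):
            std = max(np.sqrt(self.sq_mean - self.mean ** 2), 1e-6)
            return (target - self.mean) / std
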
Jun 28, 2018
by Balázs Hidasi; Alexandros Karatzoglou; Linas Baltrunas; Domonkos Tikk
We apply recurrent neural networks (RNN) on a new domain, namely recommender systems. Real-life recommender systems often face the problem of having to base recommendations only on short session-based data (e.g. a small sportswear website) instead of long user histories (as in the case of Netflix). In this situation the frequently praised matrix factorization approaches are not accurate. This problem is usually overcome in practice by resorting to item-to-item recommendations, i.e. recommending...
Topics: Learning, Neural and Evolutionary Computing, Information Retrieval, Computing Research Repository
Source: http://arxiv.org/abs/1511.06939
Jun 29, 2018
by Paul Merolla; Rathinakumar Appuswamy; John Arthur; Steve K. Esser; Dharmendra Modha
Recent results show that deep neural networks achieve excellent performance even when, during training, weights are quantized and projected to a binary representation. Here, we show that this is just the tip of the iceberg: these same networks, during testing, also exhibit a remarkable robustness to distortions beyond quantization, including additive and multiplicative noise, and a class of non-linear projections where binarization is just a special case. To quantify this robustness, we show...
Topics: Computer Vision and Pattern Recognition, Neural and Evolutionary Computing, Computing Research...
Source: http://arxiv.org/abs/1606.01981
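
The simplest of the projections mentioned is binarization itself. Here is a sketch of a test-time sign projection; scaling the signs by the mean weight magnitude is one common choice, not necessarily the paper's.

    import numpy as np

    def binarize(w):
        # Project trained weights onto {-alpha, +alpha}; alpha set to the
        # mean magnitude so the layer's overall scale is roughly preserved.
        alpha = np.abs(w).mean()
        return alpha * np.sign(w)
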
Jun 30, 2018
by Zachary C. Lipton; Subarna Tripathi
Generative adversarial networks (GANs) transform latent vectors into visually plausible images. It is generally thought that the original GAN formulation gives no out-of-the-box method to reverse the mapping, projecting images back into latent space. We introduce a simple, gradient-based technique called stochastic clipping. In experiments, for images generated by the GAN, we precisely recover their latent vector pre-images 100% of the time. Additional experiments demonstrate that this method...
Topics: Learning, Machine Learning, Neural and Evolutionary Computing, Statistics, Computing Research...
Source: http://arxiv.org/abs/1702.04782
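
A sketch of the stochastic clipping step follows. Recovery proceeds by gradient descent on z to minimize ||G(z) - x||^2 for a trained generator G (not shown); after each step, components that saturate at a boundary are re-drawn at random instead of being left stuck. The [-1, 1] latent range is an assumption.

    import numpy as np

    def stochastic_clip(z, lo=-1.0, hi=1.0):
        # Standard clipping leaves saturated components pinned at the
        # boundary; stochastic clipping re-draws them uniformly instead.
        z = np.clip(z, lo, hi)
        stuck = (z == lo) | (z == hi)
        z[stuck] = np.random.uniform(lo, hi, size=int(stuck.sum()))
        return z
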
Jun 30, 2018
by Mohammad Taha Bahadori; Krzysztof Chalupka; Edward Choi; Robert Chen; Walter F. Stewart; Jimeng Sun
In application domains such as healthcare, we want accurate predictive models that are also causally interpretable. In pursuit of such models, we propose a causal regularizer to steer predictive models towards causally-interpretable solutions and theoretically study its properties. In a large-scale analysis of Electronic Health Records (EHR), our causally-regularized model outperforms its L1-regularized counterpart in causal accuracy and is competitive in predictive performance. We perform...
Topics: Learning, Computing Research Repository, Machine Learning, Neural and Evolutionary Computing,...
Source: http://arxiv.org/abs/1702.02604
Jun 28, 2018
by Carlos Pedro Gonçalves
Econophysics has developed as a research field that applies the formalism of Statistical Mechanics and Quantum Mechanics to address Economics and Finance problems. The branch of Econophysics that applies Quantum Theory to Economics and Finance is called Quantum Econophysics. In Finance, Quantum Econophysics' contributions have ranged from option pricing to market dynamics modeling, behavioral finance and applications of Game Theory, integrating the empirical finding, from human decision...
Topics: Physics and Society, Quantitative Finance, Physics, General Finance, Neural and Evolutionary...
Source: http://arxiv.org/abs/1508.06586
Jun 29, 2018
by Albert Zeyer; Patrick Doetsch; Paul Voigtlaender; Ralf Schlüter; Hermann Ney
We present a comprehensive study of deep bidirectional long short-term memory (LSTM) recurrent neural network (RNN) based acoustic models for automatic speech recognition (ASR). We study the effect of size and depth and train models of up to 8 layers. We investigate the training aspect and study different variants of optimization methods, batching, truncated backpropagation, different regularization techniques such as dropout and $L_2$ regularization, and different gradient clipping variants....
Topics: Learning, Sound, Neural and Evolutionary Computing, Computing Research Repository, Computation and...
Source: http://arxiv.org/abs/1606.06871
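
Of the training ingredients surveyed, gradient clipping is the easiest to pin down. Here is one common variant, clipping by the global norm of all gradients; the threshold is a hypothetical hyperparameter.

    import numpy as np

    def clip_by_global_norm(grads, max_norm=5.0):
        # Rescale all gradient arrays jointly so their combined L2 norm
        # does not exceed max_norm; the update direction is preserved.
        total = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
        scale = min(1.0, max_norm / (total + 1e-12))
        return [g * scale for g in grads]
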
Jun 28, 2018
by Patrick O. Glauner
This thesis describes the design and implementation of a smile detector based on deep convolutional neural networks. It starts with a summary of neural networks, the difficulties of training them and new training methods, such as Restricted Boltzmann Machines or autoencoders. It then provides a literature review of convolutional neural networks and recurrent neural networks. In order to select databases for smile recognition, comprehensive statistics of databases popular in the field of facial...
Topics: Computer Vision and Pattern Recognition, Computing Research Repository, Learning, Neural and...
Source: http://arxiv.org/abs/1508.06535
Jun 27, 2018
by Dhagash Mehta; Crina Grosan
Function optimization and finding simultaneous solutions of a system of nonlinear equations (SNE) are two closely related and important optimization problems. However, unlike in the case of function optimization in which one is required to find the global minimum and sometimes local minima, a database of challenging SNEs where one is required to find stationary points (extrema and saddle points) is not readily available. In this article, we initiate building such a database of important SNE...
Topics: Mathematical Software, Neural and Evolutionary Computing, Numerical Analysis, Optimization and...
Source: http://arxiv.org/abs/1504.02366
Jun 29, 2018
by Scott Reed; Zeynep Akata; Santosh Mohan; Samuel Tenka; Bernt Schiele; Honglak Lee
Generative Adversarial Networks (GANs) have recently demonstrated the capability to synthesize compelling real-world images, such as room interiors, album covers, manga, faces, birds, and flowers. While existing models can synthesize images based on global constraints such as a class label or caption, they do not provide control over pose or object location. We propose a new model, the Generative Adversarial What-Where Network (GAWWN), that synthesizes images given instructions describing what...
Topics: Computer Vision and Pattern Recognition, Neural and Evolutionary Computing, Computing Research...
Source: http://arxiv.org/abs/1610.02454
Jun 28, 2018
by Peter Kvam; Joseph Cesario; Jory Schossau; Heather Eisthen; Arend Hintze
Most research on adaptive decision-making takes a strategy-first approach, proposing a method of solving a problem and then examining whether it can be implemented in the brain and in what environments it succeeds. We present a method for studying strategy development based on computational evolution that takes the opposite approach, allowing strategies to develop in response to the decision-making environment via Darwinian evolution. We apply this approach to a dynamic decision-making problem...
Topics: Neurons and Cognition, Computing Research Repository, Quantitative Biology, Neural and Evolutionary...
Source: http://arxiv.org/abs/1509.05646
Jun 27, 2018
by Giacomo Indiveri; Shih-Chii Liu
A striking difference between brain-inspired neuromorphic processors and current von Neumann processor architectures is the way in which memory and processing are organized. As Information and Communication Technologies continue to address the need for increased computational power through the increase of cores within a digital processor, neuromorphic engineers and scientists can complement this need by building processor architectures where memory is distributed with the processing. In this...
Topics: Computing Research Repository, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1506.03264
Jun 27, 2018
by Behnam Neyshabur; Ryota Tomioka; Nathan Srebro
We investigate the capacity, convexity and characterization of a general family of norm-constrained feed-forward networks.
Topics: Machine Learning, Statistics, Neural and Evolutionary Computing, Learning, Artificial Intelligence,...
Source: http://arxiv.org/abs/1503.00036
Jun 28, 2018
by Hongyuan Mei; Mohit Bansal; Matthew R. Walter
We propose an end-to-end, domain-independent neural encoder-aligner-decoder model for selective generation, i.e., the joint task of content selection and surface realization. Our model first encodes a full set of over-determined database event records via an LSTM-based recurrent neural network, then utilizes a novel coarse-to-fine aligner to identify the small subset of salient records to talk about, and finally employs a decoder to generate free-form descriptions of the aligned, selected...
Topics: Computation and Language, Artificial Intelligence, Computing Research Repository, Learning, Neural...
Source: http://arxiv.org/abs/1509.00838
Jun 27, 2018
by Henok Mengistu; Joost Huizinga; Jean-Baptiste Mouret; Jeff Clune
Hierarchical organization -- the recursive composition of sub-modules -- is ubiquitous in biological networks, including neural, metabolic, ecological, and genetic regulatory networks, and in human-made systems, such as large organizations and the Internet. To date, most research on hierarchy in networks has been limited to quantifying this property. However, an open, important question in evolutionary biology is why hierarchical organization evolves in the first place. It has recently been...
Topics: Computing Research Repository, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1505.06353
Jun 28, 2018
by Anton V. Eremeev
This manuscript contains an outline of the lecture course "Evolutionary Algorithms" taught by the author at Omsk State University n.a. F. M. Dostoevsky. The course covers the Canonic Genetic Algorithm and various other genetic algorithms, as well as evolutionary algorithms in general. Some facts, such as the Rotation Property of crossover, the Schemata Theorem, GA performance as a local search and "almost surely" convergence of evolutionary algorithms are given with complete proofs. The...
Topics: Neural and Evolutionary Computing, Computing Research Repository
Source: http://arxiv.org/abs/1511.06987
Jun 30, 2018
by Alex Graves; Marc G. Bellemare; Jacob Menick; Remi Munos; Koray Kavukcuoglu
We introduce a method for automatically selecting the path, or syllabus, that a neural network follows through a curriculum so as to maximise learning efficiency. A measure of the amount that the network learns from each data sample is provided as a reward signal to a nonstationary multi-armed bandit algorithm, which then determines a stochastic syllabus. We consider a range of signals derived from two distinct indicators of learning progress: rate of increase in prediction accuracy, and rate...
Topics: Neural and Evolutionary Computing, Computing Research Repository
Source: http://arxiv.org/abs/1704.03003
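
Here is a sketch of the bandit half of this scheme using plain Exp3 over K tasks, with the learning-progress signal supplied as the reward (rescaled to [0, 1]); the paper uses a nonstationary variant of this kind of algorithm, so this is only the basic mechanism.

    import numpy as np

    class Exp3:
        def __init__(self, K, gamma=0.1):
            self.w, self.K, self.gamma = np.ones(K), K, gamma

        def probs(self):
            # Mix exponential weights with uniform exploration.
            return (1 - self.gamma) * self.w / self.w.sum() + self.gamma / self.K

        def sample(self):
            return int(np.random.choice(self.K, p=self.probs()))

        def update(self, arm, reward):
            # Importance-weighted reward estimate for the chosen task only.
            p = self.probs()[arm]
            self.w[arm] *= np.exp(self.gamma * reward / (p * self.K))
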
Jun 27, 2018
by Oriol Vinyals; Meire Fortunato; Navdeep Jaitly
We introduce a new neural architecture to learn the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in an input sequence. Such problems cannot be trivially addressed by existing approaches such as sequence-to-sequence and Neural Turing Machines, because the number of target classes in each step of the output depends on the length of the input, which is variable. Problems such as sorting variable sized sequences, and various...
Topics: Computational Geometry, Statistics, Machine Learning, Learning, Neural and Evolutionary Computing,...
Source: http://arxiv.org/abs/1506.03134
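
The heart of the architecture is an attention distribution used directly as a pointer over input positions. A sketch of the scoring step, with encoder states e_j, a decoder state d, and the weight shapes treated as assumptions:

    import numpy as np

    def pointer_distribution(enc, dec, W1, W2, v):
        # u_j = v . tanh(W1 e_j + W2 d); softmax over input positions j.
        # Shapes: enc (n_inputs, d_enc), dec (d_dec,),
        #         W1 (h, d_enc), W2 (h, d_dec), v (h,).
        u = np.tanh(enc @ W1.T + dec @ W2.T) @ v
        e = np.exp(u - u.max())
        return e / e.sum()
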
Jun 28, 2018
by Gabriel Makdah
The brain is a powerful tool used to achieve amazing feats. There have been several significant advances in neuroscience and artificial brain research in the past two decades. This article is a review of such advances, ranging from the concepts of connectionism, to neural network architectures and high-dimensional representations. There have also been advances in biologically inspired cognitive architectures of which we will cite a few. We will be positioning relatively specific models in a...
Topics: Artificial Intelligence, Computing Research Repository, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1507.01122
Jun 30, 2018
by Juergen Schmidhuber
In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of...
Topics: Neural and Evolutionary Computing, Computing Research Repository, Learning
Source: http://arxiv.org/abs/1404.7828
Jun 28, 2018
by José C. Pereira; Fernando G. Lobo
The Parameter-less Genetic Algorithm was first presented by Harik and Lobo in 1999 as an alternative to the usual trial-and-error method of finding, for each given problem, an acceptable set-up of the parameter values of the genetic algorithm. Since then, the same strategy has been successfully applied to create parameter-less versions of other population-based search algorithms such as the Extended Compact Genetic Algorithm and the Hierarchical Bayesian Optimization Algorithm. This report...
Topics: Computing Research Repository, Mathematical Software, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1506.08694
Jun 28, 2018
by Ankit Kumar; Ozan Irsoy; Peter Ondruska; Mohit Iyyer; James Bradbury; Ishaan Gulrajani; Victor Zhong; Romain Paulus; Richard Socher
Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a...
Topics: Computation and Language, Computing Research Repository, Learning, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1506.07285
Jun 30, 2018
by Nathan Wiebe; Ashish Kapoor; Krysta M. Svore
In recent years, deep learning has had a profound impact on machine learning and artificial intelligence. At the same time, algorithms for quantum computers have been shown to efficiently solve some problems that are intractable on conventional, classical computers. We show that quantum computing not only reduces the time required to train a deep restricted Boltzmann machine, but also provides a richer and more comprehensive framework for deep learning than classical computing and leads to...
Topics: Quantum Physics, Neural and Evolutionary Computing, Computing Research Repository, Learning
Source: http://arxiv.org/abs/1412.3489
Jun 27, 2018
by Andrej Karpathy; Justin Johnson; Li Fei-Fei
Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing an...
Topics: Computation and Language, Computing Research Repository, Learning, Neural and Evolutionary Computing
Source: http://arxiv.org/abs/1506.02078
Jun 30, 2018
by Peva Blanchard; El Mahdi El Mhamdi; Rachid Guerraoui; Julien Stainer
The growth of data, the need for scalability and the complexity of models used in modern machine learning call for distributed implementations. Yet, as of today, distributed machine learning frameworks have largely ignored the possibility of arbitrary (i.e., Byzantine) failures. In this paper, we study the robustness to Byzantine failures at the fundamental level of stochastic gradient descent (SGD), the heart of most machine learning algorithms. Assuming a set of $n$ workers, up to $f$ of...
Topics: Learning, Optimization and Control, Distributed, Parallel, and Cluster Computing, Computing...
Source: http://arxiv.org/abs/1703.02757
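
A sketch of the Krum-style selection rule proposed for this setting: keep the single worker gradient whose summed squared distance to its n - f - 2 nearest peers is smallest, which requires n > 2f + 2.

    import numpy as np

    def krum(gradients, f):
        # gradients: array of shape (n, d); at most f workers are Byzantine.
        g = np.asarray(gradients)
        n = g.shape[0]
        d2 = ((g[:, None, :] - g[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
        scores = []
        for i in range(n):
            others = np.sort(np.delete(d2[i], i))
            scores.append(others[: n - f - 2].sum())
        return g[int(np.argmin(scores))]
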
Jun 26, 2018
by James J. Q. Yu; Albert Y. S. Lam; Victor O. K. Li
Evolutionary algorithms (EAs) are very popular tools to design and evolve artificial neural networks (ANNs), especially to train them. These methods have advantages over the conventional backpropagation (BP) method because of their low computational requirement when searching in a large solution space. In this paper, we employ Chemical Reaction Optimization (CRO), a newly developed global optimization method, to replace BP in training neural networks. CRO is a population-based metaheuristic...
Topics: Neural and Evolutionary Computing, Computing Research Repository
Source: http://arxiv.org/abs/1502.00193
Jun 28, 2018
by Yoon Kim; Yacine Jernite; David Sontag; Alexander M. Rush
We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art despite having 60% fewer parameters. On languages with rich morphology (Arabic,...
Topics: Computation and Language, Statistics, Computing Research Repository, Machine Learning, Neural and...
Source: http://arxiv.org/abs/1508.06615
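
The highway component mentioned above is small enough to sketch: a transform gate t mixes a non-linear transform of the input with the input itself. The gate form is standard; the choice of tanh as the non-linearity below is an assumption.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def highway(x, W_H, b_H, W_T, b_T):
        # y = t * g(W_H x + b_H) + (1 - t) * x, with t = sigmoid(W_T x + b_T).
        t = sigmoid(W_T @ x + b_T)
        g = np.tanh(W_H @ x + b_H)
        return t * g + (1 - t) * x
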
Jun 27, 2018
by Song Han; Jeff Pool; John Tran; William J. Dally
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy by learning only the important connections. Our method prunes...
Topics: Computer Vision and Pattern Recognition, Computing Research Repository, Learning, Neural and...
Source: http://arxiv.org/abs/1506.02626
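
A minimal sketch of the learn-prune-retrain idea: zero the smallest-magnitude weights and keep a binary mask so that retraining updates only the surviving connections. The single global magnitude threshold used here is illustrative.

    import numpy as np

    def magnitude_prune(weights, sparsity=0.9):
        # Zero the smallest |w| entries; return pruned weights plus the
        # mask to apply to gradients during retraining.
        flat = np.abs(weights).ravel()
        k = int(sparsity * flat.size)
        threshold = np.partition(flat, k)[k] if k < flat.size else np.inf
        mask = np.abs(weights) >= threshold
        return weights * mask, mask
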
Jun 27, 2018
by Sarath Chandar; Mitesh M. Khapra; Hugo Larochelle; Balaraman Ravindran
Common Representation Learning (CRL), wherein different descriptions (or views) of the data are embedded in a common subspace, has been receiving a lot of attention recently. Two popular paradigms here are Canonical Correlation Analysis (CCA) based approaches and Autoencoder (AE) based approaches. CCA based approaches learn a joint representation by maximizing correlation of the views when projected to the common subspace. AE based methods learn a common representation by minimizing the error of...
Topics: Machine Learning, Statistics, Neural and Evolutionary Computing, Learning, Computing Research...
Source: http://arxiv.org/abs/1504.07225
Jun 27, 2018
by Geoffrey Hinton; Oriol Vinyals; Jeff Dean
A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an...
Topics: Machine Learning, Learning, Computing Research Repository, Statistics, Neural and Evolutionary...
Source: http://arxiv.org/abs/1503.02531
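
A sketch of the distillation objective: cross-entropy against the teacher's temperature-softened distribution, scaled by T^2 so gradient magnitudes stay comparable as the temperature changes. T = 4 here is an arbitrary choice.

    import numpy as np

    def softmax(z, T=1.0):
        z = z / T
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, T=4.0):
        # Soft targets from the teacher; cross-entropy for the student.
        p_t = softmax(teacher_logits, T)
        p_s = softmax(student_logits, T)
        return -(T ** 2) * (p_t * np.log(p_s + 1e-12)).sum(axis=-1).mean()

In practice this soft-target term is combined with the usual hard-label cross-entropy at T = 1.
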
Jun 28, 2018
by Yasir Shoaib; Olivia Das
In this article, artificial neural networks (ANN) are used for modeling the number of requests received by the 1998 FIFA World Cup website. Modeling is done by means of time-series forecasting. The log traces of the website, available through the Internet Traffic Archive (ITA), are processed to obtain two time-series data sets that are used for finding the following measurements: requests/day and requests/second. These are modeled by training and simulating ANN. The method followed to collect and...
Topics: Distributed, Parallel, and Cluster Computing, Computing Research Repository, Neural and...
Source: http://arxiv.org/abs/1507.07204
Jun 27, 2018
by Alex Kendall; Matthew Grimes; Roberto Cipolla
We present a robust and real-time monocular six degree of freedom relocalization system. Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need for additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking 5ms per frame to compute. It obtains approximately 2m and 6 degree accuracy for large scale outdoor scenes and 0.5m and 10 degree accuracy indoors....
Topics: Computer Vision and Pattern Recognition, Computing Research Repository, Robotics, Neural and...
Source: http://arxiv.org/abs/1505.07427
Jun 26, 2018
by Peter Wittek; Sándor Darányi; Efstratios Kontopoulos; Theodoros Moysiadis; Ioannis Kompatsiaris
Based on the Aristotelian concept of potentiality vs. actuality allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Falling back on the distributional hypothesis to statistically model word meaning, we used evolving fields as a metaphor to express time-dependent changes in a vector space model by a combination of random indexing and evolving self-organizing maps (ESOM). To monitor semantic drifts within the observation period, an...
Topics: Neural and Evolutionary Computing, Statistics, Computing Research Repository, Learning, Computation...
Source: http://arxiv.org/abs/1502.01753
Jun 26, 2018
by James J. Q. Yu; Victor O. K. Li; Albert Y. S. Lam
Air pollution monitoring is a very popular research topic and many monitoring systems have been developed. In this paper, we formulate the Bus Sensor Deployment Problem (BSDP) to select the bus routes on which sensors are deployed, and we use Chemical Reaction Optimization (CRO) to solve BSDP. CRO is a recently proposed metaheuristic designed to solve a wide range of optimization problems. Using the real world data, namely Hong Kong Island bus route data, we perform a series of simulations and...
Topics: Neural and Evolutionary Computing, Computing Research Repository
Source: http://arxiv.org/abs/1502.00195
Jun 29, 2018
by Emre Neftci; Charles Augustine; Somnath Paul; Georgios Detorakis
An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. One increasingly popular and successful approach is to take inspiration from inference and learning algorithms used in deep neural networks. However, the workhorse of deep learning, the gradient descent Back Propagation (BP) rule, often relies on the immediate availability of network-wide...
Topics: Artificial Intelligence, Neural and Evolutionary Computing, Computing Research Repository
Source: http://arxiv.org/abs/1612.05596
Jun 30, 2018
by Benjamin Doerr; Huu Phuoc Le; Régis Makhmara; Ta Duy Nguyen
For genetic algorithms using a bit-string representation of length $n$, the general recommendation is to take $1/n$ as mutation rate. In this work, we discuss whether this is really justified for multimodal functions. Taking jump functions and the $(1+1)$ evolutionary algorithm as the simplest example, we observe that larger mutation rates give significantly better runtimes. For the $\mathrm{Jump}_{m,n}$ function, any mutation rate between $2/n$ and $m/n$ leads to a speed-up at least exponential in $m$...
Topics: Neural and Evolutionary Computing, Computing Research Repository
Source: http://arxiv.org/abs/1703.03334
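
A sketch of the experimental setting: the (1+1) EA on a jump function, where the mutation rate can be set to the conventional 1/n or to the larger m/n discussed above. Problem sizes here are arbitrary.

    import numpy as np

    def jump(x, m):
        # Jump_{m,n}: fitness m + |x| if |x| <= n - m or |x| = n,
        # else n - |x|; the optimum is only reachable via an m-bit flip.
        n, ones = len(x), int(x.sum())
        if ones == n or ones <= n - m:
            return ones + m
        return n - ones

    def one_plus_one_ea(n=20, m=3, rate=None, max_iters=10**6):
        rate = m / n if rate is None else rate   # try rate=1.0/n to compare
        x = np.random.randint(0, 2, n)
        for t in range(max_iters):
            flips = np.random.rand(n) < rate
            y = np.where(flips, 1 - x, x)
            if jump(y, m) >= jump(x, m):
                x = y
            if x.sum() == n:
                return t
        return max_iters

Comparing one_plus_one_ea(rate=1.0/20) against the default m/n rate over repeated runs illustrates the speed-up the abstract describes.
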