Glass-Box Program Synthesis: A Machine Learning Approach

Recently proposed models which learn to write computer programs from data use either input/output examples or rich execution traces. Instead, we argue that a novel alternative is to use a glass-box loss function, given as a program itself that can be directly inspected. Glass-box optimization covers a wide range of problems, from computing the greatest common divisor of two integers, to learning-to-learn problems. In this paper, we present an intelligent search system which learns, given the partial program and the glass-box problem, the probabilities over the space of programs. We empirically demonstrate that our informed search procedure leads to significant improvements compared to brute-force program search, both in terms of accuracy and time. For our experiments we use rich context free grammars inspired by number theory, text processing, and algebra. Our results show that (i) performing 4 rounds of our framework typically solves about 70% of the target problems, (ii) our framework can improve itself even in domain agnostic scenarios, and (iii) it can solve problems that would be otherwise too slow to solve with brute-force search.

Identification of Causalities in Spatio-temporal Data

This paper contributes to the understanding of strongly coupled spatio-temporal processes by describing a generic method based on Granger causality. The method is validated by the robust identification of causality regimes and of their phase diagram for an urban morphogenesis model that couples network growth with density. The application to the real case study of Greater Paris transportation projects shows a link between territorial dynamics, more particularly of real estate and socio-economic, and the anticipated network growth. We finally discuss potential extensions to other temporal and spatial scales.

DOC: Deep Open Classification of Text Documents

Traditional supervised learning makes the closed-world assumption that the classes appeared in the test data must have appeared in training. This also applies to text learning or text classification. As learning is used increasingly in dynamic open environments where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification presents an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep learning based approach. It outperforms existing state-of-the-art techniques dramatically.

Stochastic Nonconvex Optimization with Large Minibatches

We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large minibatches of samples, using only first-order gradient information. Our algorithms provably converge to an approximate critical point of the expected objective with faster rates than minibatch stochastic gradient descent, and facilitate better parallelization by allowing larger minibatches.

Understanding a Version of Multivariate Symmetric Uncertainty to assist in Feature Selection

In this paper, we analyze the behavior of the multivariate symmetric uncertainty (MSU) measure through the use of statistical simulation techniques under various mixes of informative and non-informative randomly generated features. Experiments show how the number of attributes, their cardinalities, and the sample size affect the MSU. We discovered a condition that preserves good quality in the MSU under different combinations of these three factors, providing a new useful criterion to help drive the process of dimension reduction.

Active Learning amidst Logical Knowledge

Structured prediction is ubiquitous in applications of machine learning such as knowledge extraction and natural language processing. Structure often can be formulated in terms of logical constraints. We consider the question of how to perform efficient active learning in the presence of logical constraints among variables inferred by different classifiers. We propose several methods and provide theoretical results that demonstrate the inappropriateness of employing uncertainty guided sampling, a commonly used active learning method. Furthermore, experiments on ten different datasets demonstrate that the methods significantly outperform alternatives in practice. The results are of practical significance in situations where labeled data is scarce.

On the regularization of Wasserstein GANs

Since their invention, generative adversarial networks (GANs) have become a popular approach for learning to model a distribution of real (unlabeled) data. Convergence problems during training are overcome by Wasserstein GANs which minimize the distance between the model and the empirical distribution in terms of a different metric, but thereby introduce a Lipschitz constraint into the optimization problem. A simple way to enforce the Lipschitz constraint on the class of functions, which can be modeled by the neural network, is weight clipping. It was proposed that training can be improved by instead augmenting the loss by a regularization term that penalizes the deviation of the gradient of the critic (as a function of the network’s input) from one. We present theoretical arguments why using a weaker regularization term enforcing the Lipschitz constraint is preferable. These arguments are supported by experimental results on toy data sets.

Multi-Rack Distributed Data Storage Networks

The majority of works in distributed storage networks assume a simple network model with a collection of identical storage nodes with the same communication cost between the nodes. In this paper, we consider a realistic multi-rack distributed data storage network and present a code design framework for this model. Considering the cheaper data transmission within the racks, our code construction method is able to locally repair the nodes failure within the same rack by using only the survived nodes in the same rack. However, in the case of severe failure patterns when the information content of the survived nodes is not sufficient to repair the failures, other racks will participate in the repair process. By employing the criteria of our multi-rack storage code, we establish a linear programming bound on the size of the code in order to maximize the code rate.

On Tropical Linear and Integer Programs

We present simple compact proofs of the strong and weak duality theorems of tropical linear programming. It follows that there is no duality gap for a pair of tropical primal-dual problems. This result together with known properties of subeigenvectors enables us to directly solve a special tropical linear program with two-sided constraints. We also study the duality gap in tropical integer linear programming. A direct solution is available for the primal problem. An algorithm of quadratic complexity is presented for the dual problem. A direct solution is available provided that all coefficients of the objective function are integer. This solution yields a good estimate of the optimal objective function value in the general case.

Embodied Evolution in Collective Robotics: A Review

This paper provides an overview of evolutionary robotics techniques applied to on-line distributed evolution for robot collectives — namely, embodied evolution. It provides a definition of embodied evolution as well as a thorough description of the underlying concepts and mechanisms. The paper also presents a comprehensive summary of research published in the field since its inception (1999-2017), providing various perspectives to identify the major trends. In particular, we identify a shift from considering embodied evolution as a parallel search method within small robot collectives (fewer than 10 robots) to embodied evolution as an on-line distributed learning method for designing collective behaviours in swarm-like collectives. The paper concludes with a discussion of applications and open questions, providing a milestone for past and an inspiration for future research.

Analysis of structured Markov processes

Markov processes are popular mathematical models, studied by theoreticians for their intriguing properties, and applied by practitioners for their flexible structure. With this book we teach how to model and analyze Markov processes. We classify Markov processes based on their structural properties, which in turn determine which analytic methods are required for solving them. In doing so, we start in each chapter with specific examples that naturally lead up to general theory and general methods. In this way the reader learns about Markov processes on the job. By studying this book, the reader becomes acquainted with the basic analytic methods that come into play when systems are modeled as structured Markov processes. These basic methods will likely prove useful, in real-time when studying the examples at hand, but more importantly for future encounters with Markov processes not covered in this book. Methods are more important than examples. The methods have a large scope of application, even outside the scope of Markov processes, in areas like probability theory, industrial engineering, mechanical engineering, physics and financial mathematics.

PMV: Pre-partitioned Generalized Matrix-Vector Multiplication for Scalable Graph Mining

How can we analyze enormous networks including the Web and social networks which have hundreds of billions of nodes and edges? Network analyses have been conducted by various graph mining methods including shortest path computation, PageRank, connected component computation, random walk with restart, etc. These graph mining methods can be expressed as generalized matrix-vector multiplication which consists of few operations inspired by typical matrix-vector multiplication. Recently, several graph processing systems based on matrix-vector multiplication or their own primitives have been proposed to deal with large graphs; however, they all have failed on Web-scale graphs due to insufficient memory space or the lack of consideration for I/O costs. In this paper, we propose PMV (Pre-partitioned generalized Matrix-Vector multiplication), a scalable distributed graph mining method based on generalized matrix-vector multiplication on distributed systems. PMV significantly decreases the communication cost, which is the main bottleneck of distributed systems, by partitioning the input graph in advance and judiciously applying execution strategies based on the density of the pre-partitioned sub-matrices. Experiments show that PMV succeeds in processing up to 16x larger graphs than existing distributed memory-based graph mining methods, and requires 9x less time than previous disk-based graph mining methods by reducing I/O costs significantly.

Adaptive Nonparametric Clustering

This paper presents a new approach to non-parametric cluster analysis called Adaptive Weights Clustering (AWC). The idea is to identify the clustering structure by checking at different points and for different scales on departure from local homogeneity. The proposed procedure describes the clustering structure in terms of weights \( w_{ij} \) each of them measures the degree of local inhomogeneity for two neighbor local clusters using statistical tests of ‘no gap’ between them. % The procedure starts from very local scale, then the parameter of locality grows by some factor at each step. The method is fully adaptive and does not require to specify the number of clusters or their structure. The clustering results are not sensitive to noise and outliers, the procedure is able to recover different clusters with sharp edges or manifold structure. The method is scalable and computationally feasible. An intensive numerical study shows a state-of-the-art performance of the method in various artificial examples and applications to text data. Our theoretical study states optimal sensitivity of AWC to local inhomogeneity.

Tensors Come of Age: Why the AI Revolution will help HPC

This article discusses how the automation of tensor algorithms, based on A Mathematics of Arrays and Psi Calculus, and a new way to represent numbers, Unum Arithmetic, enables mechanically provable, scalable, portable, and more numerically accurate software.

Tensor Product Generation Networks

We present a new tensor product generation network (TPGN) that generates natural language descriptions for images. The model has a novel architecture that instantiates a general framework for encoding and processing symbolic structure through neural network computation. This framework is built on Tensor Product Representations (TPRs). We evaluated the proposed TPGN on the MS COCO image captioning task. The experimental results show that the TPGN outperforms the LSTM based state-of-the-art baseline with a significant margin. Further, we show that our caption generation model can be interpreted as generating sequences of grammatical categories and retrieving words by their categories from a plan encoded as a distributed representation.

Output Range Analysis for Deep Neural Networks

Deep neural networks (NN) are extensively used for machine learning tasks such as image classification, perception and control of autonomous systems. Increasingly, these deep NNs are also been deployed in high-assurance applications. Thus, there is a pressing need for developing techniques to verify neural networks to check whether certain user-expected properties are satisfied. In this paper, we study a specific verification problem of computing a guaranteed range for the output of a deep neural network given a set of inputs represented as a convex polyhedron. Range estimation is a key primitive for verifying deep NNs. We present an efficient range estimation algorithm that uses a combination of local search and linear programming problems to efficiently find the maximum and minimum values taken by the outputs of the NN over the given input set. In contrast to recently proposed ‘monolithic’ optimization approaches, we use local gradient descent to repeatedly find and eliminate local minima of the function. The final global optimum is certified using a mixed integer programming instance. We implement our approach and compare it with Reluplex, a recently proposed solver for deep neural networks. We demonstrate the effectiveness of the proposed approach for verification of NNs used in automated control as well as those used in classification.

EDEN: Evolutionary Deep Networks for Efficient Machine Learning

Deep neural networks continue to show improved performance with increasing depth, an encouraging trend that implies an explosion in the possible permutations of network architectures and hyperparameters for which there is little intuitive guidance. To address this increasing complexity, we propose Evolutionary DEep Networks (EDEN), a computationally efficient neuro-evolutionary algorithm which interfaces to any deep neural network platform, such as TensorFlow. We show that EDEN evolves simple yet successful architectures built from embedding, 1D and 2D convolutional, max pooling and fully connected layers along with their hyperparameters. Evaluation of EDEN across seven image and sentiment classification datasets shows that it reliably finds good networks — and in three cases achieves state-of-the-art results — even on a single GPU, in just 6-24 hours. Our study provides a first attempt at applying neuro-evolution to the creation of 1D convolutional networks for sentiment analysis including the optimisation of the embedding layer.

String order parameters for 1d Floquet Symmetry Protected Topological Phases
Collaborative Service Caching for Edge Computing in Dense Small Cell Networks
Fast Vehicle Detection in Aerial Imagery
Multimodal Image Super-Resolution via Joint Sparse Representations induced by Coupled Dictionaries
Two applications of polylog functions and Euler sums
Can you fool AI with adversarial examples on a visual Turing test?
Shadows of characteristic cycles, Verma modules, and positivity of Chern-Schwartz-MacPherson classes of Schubert cells
Comparing Powers of Edge Ideals
Entanglement dynamics in random media
Topology-dependent density optima for efficient simultaneous network exploration
On Extreme Value Index Estimation under Random Censoring
A Simulation Comparison of Estimators of Conditional Extreme Value Index under Right Random Censoring
Decorated Dyck paths, the Delta conjecture, and a new q,t-square
Camera-Aware Multi-Resolution Analysis (CAMRA) for Raw Sensor Data Compression
On the error of a priori sampling: zero forcing sets and propagation time
An application of kissing numbers in sum-product estimates
Image similarity using Deep CNN and Curriculum Learning
The importance of scale in spatially varying coefficient modeling
Network Topology and Communication-Computation Tradeoffs in Decentralized Optimization
BOSS-LDG: A Novel Computational Framework that Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery
On the Model Shrinkage Effect of Gamma Process Edge Partition Models
Dynamic Reconfiguration of Mission Parameters in Underwater Human-Robot Collaboration
Estimation of the Hurst Exponent Using Trimean Estimators on Nondecimated Wavelet Coefficients
Evaluating the Sensitivity of Flood Return Levels to Data Length by Maximum Likelihood Estimation
Convex Relaxations for Global Optimization Under Uncertainty Described by Continuous Random Variables
Green Heterogeneous Cloud Radio Access Networks: Potential Techniques, Performance Tradeoffs, and Challenges
Exploiting Dual Connectivity in Heterogeneous Cellular Networks
Closure of resource-bounded randomness notions under polynomial time permutations
On Stein’s Identity and Near-Optimal Estimation in High-dimensional Index Models
Ultra-Dense HetNets Meet Big Data: Green Frameworks, Techniques, and Approaches
TuringMobile: A Turing Machine of Oblivious Mobile Robots with Limited Visibility and its Applications
A Deep Learning Model for Traffic Flow State Classification Based on Smart Phone Sensor Data
Converting Your Thoughts to Texts: Enabling Brain Typing via Deep Feature Learning of EEG Signals
Sensing-Constrained LQG Control
Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks
Catching Anomalous Distributed Photovoltaics: An Edge-based Multi-modal Anomaly Detection
UAV and Service Robot Coordination for Indoor Object Search Tasks
Conjugate Phase Retrieval on ${\mathbb C}^M$ by real vectors
Conic Optimization Theory: Convexification Techniques and Numerical Algorithms
Learning a Predictive Model for Music Using PULSE
Simulation-based Estimation and Inference of Production Frontiers
Object-oriented Neural Programming (OONP) for Document Understanding
Learning to Inpaint for Image Compression
Polysemy Detection in Distributed Representation of Word Sense
Bayesian Inference of Network Epidemics
Learning Multi-grid Generative ConvNets by Minimal Contrastive Divergence
Spectral radius of a star with one long arm
Learning to Label Affordances from Simulated and Real Data
Control Problems and Invariant Subspaces for the Sabra Shell Model of Turbulence
Generating Sentences by Editing Prototypes
An enhanced method to compute the similarity between concepts of ontology
Redesigning Bitcoin’s fee market
Signatures of many-body localization in steady states of open quantum systems
Partial matching width and its application to lower bounds for branching programs
Perfect matchings in highly cyclically connected regular graphs
Performance Limitations of Distributed Integral Control in Power Networks Under Noisy Measurements
Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
In-Place Initializable Arrays
A New Cooperative Framework for Parallel Trajectory-Based Metaheuristics
Input-to-Output Gate to Improve RNN Language Models
On the Benefits of QoS-Differentiated Posted Pricing in Cloud Computing: An Analytical Model
Cubature rules and expected value of some complex functions
Telling Cause from Effect using MDL-based Local and Global Regression
SCARFF: a Scalable Framework for Streaming Credit Card Fraud Detection with Spark
Further studies on square-root boundaries for Bessel processes
UBSegNet: Unified Biometric Region of Interest Segmentation Network
Interplay of interaction and disorder in the steady state of an open quantum system
Frame difference families and resolvable balanced incomplete block designs
Computing Tree Decompositions with FlowCutter: PACE 2017 Submission
Russo-Seymour-Welsh estimates for the Kostlan ensemble of random polynomials
Multi-layer Visualization for Medical Mixed Reality
On $k$-rainbow independent domination in graphs
Graph rigidity for unitarily invariant matrix norms
User and Developer Interaction with Editable and Readable Ontologies
All Classical Adversary Methods are Equivalent for Total Functions
Reachability Switching Games
Selective final state spectroscopy and multifractality in two-component ultracold Bose-Einstein condensates: a numerical study
Optimal Stationary Synchronization of Heterogeneous Linear Multi-Agent Systems
Backtracking strategies for accelerated descent methods with smooth composite objectives
Response to a small external force and fluctuations of a passive particle in a one-dimensional diffusive environment
Interpretable High-Dimensional Inference Via Score Projection with an Application in Neuroimaging
The smallest eigenvalues of Hamming graphs, Johnson graphs and other distance-regular graphs with classical parameters
On the existence of a solution to a spectral estimation problem \emph{à la} Byrnes-Georgiou-Lindquist
AutoEncoder by Forest
Motions of grid-like reflection frameworks
A note on preconditioning weighted linear least squares, with consequences for weakly-constrained variational data assimilation
Visibility in the vacant set of the Brownian interlacements and the Brownian excursion process
On the separation conjecture in Avoider-Enforcer games
MDP environments for the OpenAI Gym
Geodesics Toward Corners in First Passage Percolation
Automated sub-cortical brain structure segmentation combining spatial and deep convolutional features
Multi-Person Brain Activity Recognition via Comprehensive EEG Signal Analysis
Exploring cascading outages and weather via processing historic data
Distributed Information Bottleneck Method for Discrete and Gaussian Sources
From time-series to complex networks: Application to the cerebrovascular flow patterns in atrial fibrillation
Beyond opening up the black box: Investigating the role of algorithmic systems in Wikipedian organizational culture
Generalized Sparse and Low-Rank Optimization for Ultra-Dense Networks
Region-Based Image Retrieval Revisited
BCS-type second-order phase transition of classical Langmuir wave system
Discrete Choice and Rational Inattention: a General Equivalence Result
Integration of Japanese Papers Into the DBLP Data Set
Joint Detection and Recounting of Abnormal Events by Learning Deep Generic Knowledge
Interactive Coding for Markovian Protocols
Automatic Error Analysis of Human Motor Performance for Interactive Coaching in Virtual Reality
Rao-Blackwellization to give Improved Estimates in Multi-List Studies
V-cycle multigrid algorithms for discontinuous Galerkin methods on non-nested polytopic meshes
Modelling reporting delays for disease surveillance data
On cubic graphical regular representations of finite simple groups
Quasitrivial semigroups: characterizations and enumerations
Activated Random Walk on a cycle