Exterior Distance Function

We introduce and study the exterior distance function (EDF) and the corresponding exterior point method (EPM) for convex optimization. The EDF is a classical Lagrangian for an equivalent problem obtained from the initial one by a monotone transformation of both the objective function and the constraints. The constraints transformation is scaled by a positive scaling parameter. Thus, the EDF is a particular realization of the Nonlinear Rescaling (NR) principle. Along with the ‘center’, the EDF has two extra tools: the barrier (scaling) parameter and the vector of Lagrange multipliers. We show that the EPM generates a primal-dual sequence that converges to the primal-dual solution in value under minimal assumptions on the input data. Moreover, convergence takes place for any fixed interior point as a ‘center’ and any fixed positive scaling parameter, solely due to the Lagrange multipliers update. If the second-order sufficient optimality condition is satisfied, then the EPM converges at a Q-linear rate for any fixed interior point as a ‘center’ and any fixed, but large enough, positive scaling parameter.

Simulation optimization: A review of algorithms and applications

Simulation Optimization (SO) refers to the optimization of an objective function subject to constraints, both of which can be evaluated through a stochastic simulation. To address specific features of a particular simulation—discrete or continuous decisions, expensive or cheap simulations, single or multiple outputs, homogeneous or heterogeneous noise—various algorithms have been proposed in the literature. As one can imagine, there exist several competing algorithms for each of these classes of problems. This document emphasizes the difficulties in simulation optimization as compared to mathematical programming, makes reference to state-of-the-art algorithms in the field, examines and contrasts the different approaches used, reviews some of the diverse applications that have been tackled by these methods, and speculates on future directions in the field.

Domain reduction techniques for global NLP and MINLP optimization

Optimization solvers routinely utilize presolve techniques, including model simplification, reformulation and domain reduction techniques. Domain reduction techniques are especially important in speeding up convergence to the global optimum for challenging nonconvex nonlinear programming (NLP) and mixed-integer nonlinear programming (MINLP) optimization problems. In this work, we survey the various techniques used for domain reduction of NLP and MINLP optimization problems. We also present a computational analysis of the impact of these techniques on the performance of various widely available global solvers on a collection of 1740 test problems.

Hierarchical Model for Long-term Video Prediction

Video prediction has been an active topic of research in the past few years. Many algorithms focus on pixel-level predictions, which generate results that blur and disintegrate within a few frames. In this project, we use a hierarchical approach for long-term video prediction. We first estimate the high-level structure in the input frame, then predict how that structure evolves in the future. Finally, we use an image analogy network to recover a realistic image from the predicted structure. Our method is largely adopted from the work by Villegas et al. The method is built with a combination of LSTMs and analogy-based convolutional auto-encoder networks. Additionally, in order to generate more realistic frame predictions, we also adopt an adversarial loss. We evaluate our method on the Penn Action dataset, and demonstrate good results on high-level long-term structure prediction.

DE-PACRR: Exploring Layers Inside the PACRR Model

Recent neural IR models have demonstrated deep learning’s utility in ad-hoc information retrieval. However, deep models have a reputation for being black boxes, and the roles of a neural IR model’s components may not be obvious at first glance. In this work, we attempt to shed light on the inner workings of a recently proposed neural IR model, namely the PACRR model, by visualizing the output of intermediate layers and by investigating the relationship between intermediate weights and the ultimate relevance score produced. We highlight several insights, hoping that such insights will be generally applicable.

Topometric Localization with Deep Learning

Compared to LiDAR-based localization methods, which provide high accuracy but rely on expensive sensors, visual localization approaches only require a camera and are thus more cost-effective, although their accuracy and reliability are typically inferior to those of LiDAR-based methods. In this work, we propose a vision-based localization approach that learns from LiDAR-based localization methods by using their output as training data, thus combining a cheap, passive sensor with an accuracy that is on par with LiDAR-based localization. The approach consists of two deep networks trained on visual odometry and topological localization, respectively, and a successive optimization to combine the predictions of these two networks. We evaluate the approach on a new challenging pedestrian-based dataset captured over the course of six months in varying weather conditions with a high degree of noise. The experiments demonstrate that the localization errors are up to 10 times smaller than with traditional vision-based localization methods.

When Neurons Fail

We view a neural network as a distributed system whose neurons can fail independently, and we evaluate its robustness in the absence of any (recovery) learning phase. We give tight bounds on the number of neurons that can fail without harming the result of a computation. To determine our bounds, we leverage the fact that neural activation functions are Lipschitz-continuous. Our bound is on a quantity we call the Forward Error Propagation, which captures how much error is propagated by a neural network when a given number of components fail. Computing this quantity only requires looking at the topology of the network, whereas experimentally assessing the robustness of a network requires the costly experiment of looking at all the possible inputs and testing all the possible configurations of the network corresponding to different failure situations, facing a discouraging combinatorial explosion. We distinguish the case of neurons that can fail and stop their activity (crashed neurons) from the case of neurons that can fail by transmitting arbitrary values (Byzantine neurons). Interestingly, as we show in the paper, our bound can easily be extended to the case where synapses can fail. We show how our bound can be leveraged to quantify the effect of memory cost reduction on the accuracy of a neural network and to estimate the amount of information any neuron needs from its preceding layer, thereby enabling a boosting scheme that prevents neurons from waiting for unnecessary signals. We finally discuss the trade-off between neural network robustness and learning cost.
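
As a rough illustration of how Lipschitz continuity can bound the effect of crashed neurons, the sketch below uses a hypothetical one-hidden-layer tanh network. It is not the paper's Forward Error Propagation bound; the network shape, the names forward and crash_error_bound, and the random weights are all assumptions made for this example. Because tanh is 1-Lipschitz and bounded by 1 in absolute value, crashing a set of hidden neurons changes the output by at most the sum of the absolute output weights of those neurons, which in turn is at most the sum of the largest absolute output weights.

```python
import numpy as np

def forward(x, W1, w2, alive=None):
    """One-hidden-layer tanh network; `alive` is a 0/1 mask over hidden neurons."""
    h = np.tanh(W1 @ x)
    if alive is not None:
        h = h * alive  # a crashed neuron outputs 0
    return np.tanh(w2 @ h)

def crash_error_bound(w2, n_crashed, lipschitz=1.0, h_max=1.0):
    """Worst-case output change when any `n_crashed` hidden neurons crash.

    Uses |phi(a) - phi(b)| <= lipschitz * |a - b| and |h_j| <= h_max,
    so the change is at most the sum of the largest |w2_j|.
    """
    worst_weights = np.sort(np.abs(w2))[::-1][:n_crashed]
    return lipschitz * h_max * worst_weights.sum()

rng = np.random.default_rng(0)
n_in, n_hidden, n_crashed = 4, 16, 3
W1 = rng.standard_normal((n_hidden, n_in))
w2 = rng.standard_normal(n_hidden)
x = rng.standard_normal(n_in)

bound = crash_error_bound(w2, n_crashed)
healthy = forward(x, W1, w2)

# Try many random crash patterns and record the largest observed error.
max_err = 0.0
for _ in range(200):
    alive = np.ones(n_hidden)
    alive[rng.choice(n_hidden, size=n_crashed, replace=False)] = 0.0
    max_err = max(max_err, abs(forward(x, W1, w2, alive) - healthy))

print(f"largest observed error: {max_err:.4f}, topology-only bound: {bound:.4f}")
```

The bound depends only on the weights and the number of failures, not on the input, which mirrors the abstract's point that such a quantity can be computed without testing every input and failure configuration.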

Fast Algorithms for Learning Latent Variables in Graphical Models

We study the problem of learning latent variables in Gaussian graphical models. Existing methods for this problem assume that the precision matrix of the observed variables is the superposition of a sparse and a low-rank component. In this paper, we focus on the estimation of the low-rank component, which encodes the effect of marginalization over the latent variables. We introduce fast, proper learning algorithms for this problem. In contrast with existing approaches, our algorithms are manifestly non-convex. We support their efficacy via a rigorous theoretical analysis, and show that our algorithms match the best possible in terms of sample complexity, while achieving computational speed-ups over existing methods. We complement our theory with several numerical experiments.
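
The sparse-plus-low-rank assumption mentioned above follows from a standard Schur complement identity, which can be checked numerically. The sketch below is only an illustration of that identity, not the paper's estimation algorithm; the matrix sizes, the diagonal observed block, and the helper name marginal_precision_parts are assumptions chosen for this example.

```python
import numpy as np

def marginal_precision_parts(K, n_obs):
    """Split the marginal precision of the observed variables into
    the observed block of the joint precision and a low-rank correction.

    For a joint precision K = [[K_oo, K_oh], [K_ho, K_hh]], the precision
    of the observed variables alone is K_oo - K_oh K_hh^{-1} K_ho.
    """
    K_oo = K[:n_obs, :n_obs]
    K_oh = K[:n_obs, n_obs:]
    K_hh = K[n_obs:, n_obs:]
    low_rank = K_oh @ np.linalg.solve(K_hh, K_oh.T)
    return K_oo, low_rank

rng = np.random.default_rng(0)
n_obs, n_lat = 8, 2

# Hypothetical joint precision: sparse (diagonal) observed block,
# dense coupling between observed and latent variables.
K = np.zeros((n_obs + n_lat, n_obs + n_lat))
K[:n_obs, :n_obs] = np.diag(rng.uniform(2.0, 3.0, size=n_obs))
K[:n_obs, n_obs:] = 0.2 * rng.standard_normal((n_obs, n_lat))
K[n_obs:, :n_obs] = K[:n_obs, n_obs:].T
K[n_obs:, n_obs:] = 4.0 * np.eye(n_lat)
np.linalg.cholesky(K)  # raises LinAlgError if K is not positive definite

# Marginalize the latent variables: take the observed block of the covariance.
Sigma = np.linalg.inv(K)
marginal_precision = np.linalg.inv(Sigma[:n_obs, :n_obs])

K_oo, low_rank = marginal_precision_parts(K, n_obs)
print("rank of low-rank part:", np.linalg.matrix_rank(low_rank, tol=1e-10))
print("identity holds:", np.allclose(marginal_precision, K_oo - low_rank))
```

The rank of the correction term is at most the number of latent variables, which is why the low-rank component encodes the effect of marginalization.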

Exploring Generalization in Deep Learning

With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization, highlighting the importance of scale normalization, and making a connection between sharpness and PAC-Bayes theory. We then investigate how well the measures explain different observed phenomena.

Orthogonal Symmetric Chain Decompositions of Hypercubes
Symmetric Chain Decompositions of Products of Posets with Long Chains
Semidefinite Programming and Nash Equilibria in Bimatrix Games
New insights into non-central beta distributions
Group Synchronization on Grids
Illuminating Pedestrians via Simultaneous Detection & Segmentation
Coverage Probability Fails to Ensure Reliable Inference
MolecuLeNet: A continuous-filter convolutional neural network for modeling quantum interactions
Empirical priors and posterior concentration rates for a monotone density
Neural Question Answering at BioASQ 5B
Parareal Algorithm Implementation and Simulation in Julia
Detecting Small Signs from Large Images
Using Frame Theoretic Convolutional Gridding for Robust Synthetic Aperture Sonar Imaging
Invariant Causal Prediction for Nonlinear Models
Learning Local Feature Aggregation Functions with Backpropagation
Treewidth Bounds for Planar Graphs Using Three-Sided Brambles
Corrigendum for ‘Second-order reflected backward stochastic differential equations’ and ‘Second-order BSDEs with general reflection and game options under uncertainty’
Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification
Developing Bug-Free Machine Learning Systems With Formal Mathematics
Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
The Minor Fall, the Major Lift: Inferring Emotional Valence of Musical Chords through Lyrics
Relating Complexity-theoretic Parameters with SAT Solver Performance
Do Deep Neural Networks Suffer from Crowding?
SUNNY-CP and the MiniZinc Challenge
Self-Sustaining Caching Stations: Towards Cost-Effective 5G-Enabled Vehicular Networks
Dense Non-rigid Structure-from-Motion Made Easy – A Spatial-Temporal Smoothness based Solution
Refined Cyclic Sieving on Words for the Major Index Statistic
Preservation of quantum Fisher information and geometric phase of a single qubit system in a dissipative reservoir through the addition of qubits
An Isomorphism between Lyapunov Exponents and Shannon’s Channel Capacity
A combinatorial proof of the smoothness of catalecticant schemes associated to complete intersections
Laplace deconvolution in the presence of indirect long-memory data
A Unified approach for Conventional Zero-shot, Generalized Zero-shot and Few-shot Learning
Fast and accurate classification of echocardiograms using deep learning
To slow, or not to slow? New science in sub-second networks
Fast and robust tensor decomposition with applications to dictionary learning
Proceedings of the First International Workshop on Deep Learning and Music
A stable Langevin model with diffusive-reflective boundary conditions
Memory-augmented Chinese-Uyghur Neural Machine Translation
Material Recognition CNNs and Hierarchical Planning for Biped Robot Locomotion on Slippery Terrain
Deviation inequalities for convex functions motivated by the Talagrand conjecture
Large-scale Datasets: Faces with Partial Occlusions and Pose Variations in the Wild
Sensitivity analysis for network aggregative games
Mixing time of an unaligned Metropolis algorithm on the square
Controlled Tactile Exploration and Haptic Object Recognition
Two-Stage Hybrid Day-Ahead Solar Forecasting
PasMoQAP: A Parallel Asynchronous Memetic Algorithm for solving the Multi-Objective Quadratic Assignment Problem
Beyond Moore-Penrose Part II: The Sparse Pseudoinverse
PSK Precoding in Multi-User MISO Systems
Minimum BER Precoding in 1-Bit Massive MIMO Systems
Power- and Spectral Efficient Communication System Design Using 1-Bit Quantization
Independent motion detection with event-driven cameras
MMSE precoder for massive MIMO using 1-bit quantization
DFE/THP duality for FBMC with highly frequency selective channels
Spatial Coding Based on Minimum BER in 1-Bit Massive MIMO Systems
Typical Approximation Performance for Maximum Coverage Problem
Spectral shaping with low resolution signals
On efficiently solving the subproblems of a level-set method for fused lasso problems
Fountain Codes under Maximum Likelihood Decoding
Beamforming and Scheduling for mmWave Downlink Sparse Virtual Channels With Non-Orthogonal and Orthogonal Multiple Access
Hamilton-Jacobi equations for optimal control on networks with entry or exit costs
Extrinsic Gaussian processes for regression and classification on manifolds
Centralized and Distributed Sparsification for Low-Complexity Message Passing Algorithm in C-RAN Architectures
Computing denumerants in numerical 3-semigroups
Dynamics of a planar Coulomb gas
Equilibrium large deviations for mean-field systems with translation invariance
The Complexity of Counting Surjective Homomorphisms and Compactions
A decentralized approach to multi-agent MILPs: finite-time feasibility and performance guarantees
Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis
Landscape of Configurational Density of States for Discrete Large Systems
NOMA based Random Access with Multichannel ALOHA
On the R-superlinear convergence of the KKT residues generated by the augmented Lagrangian method for convex composite conic programming
Approximate Reflection Symmetry in a Point Set: Theory and Algorithm with an Application
NOMA: Principles and Recent Results
Recurrent Residual Learning for Action Recognition
A universal law for Voronoi cell volumes in infinitely large maps
Forecasting and Granger Modelling with Non-linear Dynamical Dependencies
Gabor frames and deep scattering networks in audio processing
Evolution of quantum entanglement with disorder in fractional quantum Hall liquids
archivist: An R Package for Managing, Recording and Restoring Data Analysis Results
The Second Leaper Theorem
A special case of completion invariance for the $c_2$ invariant of a graph
Large deviations for stochastic models of two-dimensional second grade fluids driven by Lévy noise
Hypergraphs with vanishing Turán density in uniformly dense hypergraphs
Rate-Distortion Classification for Self-Tuning IoT Networks
Unsupervised Feature Selection Based on Space Filling Concept
Critical properties of disordered XY model on sparse random graphs
Constant composition codes derived from linear codes
Detecting in-plane tension induced crystal plasticity transition with nanoindentation
Determinants of Random Block Hankel Matrices
Invariant components of synergy, redundancy, and unique information among three variables
Cross-Country Skiing Gears Classification using Deep Learning
Subspace Clustering with the Multivariate-t Distribution
Classical Music Clustering Based on Acoustic Features
Reexamining Low Rank Matrix Factorization for Trace Norm Regularization
The multipartite Ramsey number for the 3-path of length three
Robust and Efficient Parametric Spectral Estimation in Atomic Force Microscopy
Training a Fully Convolutional Neural Network to Route Integrated Circuits
Combinatorial approach to detection of fixed points, periodic orbits, and symbolic dynamics
Graphs that contain multiply transitive matchings