A Neural Framework for Generalized Topic Models

Topic models for text corpora comprise a popular family of methods that have inspired many extensions to encode properties such as sparsity, interactions with covariates, and the gradual evolution of topics. In this paper, we combine certain motivating ideas behind variations on topic models with modern techniques for variational inference to produce a flexible framework for topic modeling that allows for rapid exploration of different models. We first discuss how our framework relates to existing models, and then demonstrate that it achieves strong performance, with the introduction of sparsity controlling the trade-off between perplexity and topic coherence.


Latent Geometry and Memorization in Generative Models

It can be difficult to tell whether a trained generative model has learned to generate novel examples or has simply memorized a specific set of outputs. In published work, it is common to attempt to address this visually, for example by displaying a generated example and its nearest neighbor(s) in the training set (in, for example, the L2 metric). As any generative model induces a probability density on its output domain, we propose studying this density directly. We first study the geometry of the latent representation and generator, relate this to the output density, and then develop techniques to compute and inspect the output density. As an application, we demonstrate that ‘memorization’ tends to a density made of delta functions concentrated on the memorized examples. We note that without first understanding the geometry, the measurement would be essentially impossible to make.
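The visual check mentioned above (finding a generated example's nearest neighbor in the training set under the L2 metric) can be sketched as follows. This is only an illustration of that baseline check, not the density-based technique the abstract proposes; the function names and the toy vectors are assumptions made for the example.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_training_neighbor(sample, training_set):
    """Return (index, distance) of the training example closest to `sample`."""
    distances = [l2_distance(sample, t) for t in training_set]
    best = min(range(len(training_set)), key=distances.__getitem__)
    return best, distances[best]


# Toy data (hypothetical): three "training images" flattened to 2-D vectors.
training_set = [[0.0, 0.0], [1.0, 1.0], [4.0, 5.0]]
generated = [1.0, 2.0]

index, distance = nearest_training_neighbor(generated, training_set)
print(index, distance)
```

A very small distance suggests the generated example may be a near copy of a training example, but as the abstract notes, such a check is only a rough proxy.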


Predictive State Recurrent Neural Networks

We present a new model, called Predictive State Recurrent Neural Networks (PSRNNs), for filtering and prediction in dynamical systems. PSRNNs draw on insights from both Recurrent Neural Networks (RNNs) and Predictive State Representations (PSRs), and inherit advantages from both types of models. Like many successful RNN architectures, PSRNNs use (potentially deeply composed) bilinear transfer functions to combine information from multiple sources, so that one source can act as a gate for another. These bilinear functions arise naturally from the connection to state updates in Bayes filters like PSRs, in which observations can be viewed as gating belief states. We show that PSRNNs can be learned effectively by combining backpropagation through time (BPTT) with an initialization based on a statistically consistent learning algorithm for PSRs called two-stage regression (2SR). We also show that PSRNNs can be factorized using tensor decomposition, reducing model size and suggesting interesting theoretical connections to existing multiplicative architectures such as LSTMs. We applied PSRNNs to 4 datasets, and showed that we outperform several popular alternative approaches to modeling dynamical systems in all cases.


Time-Based Label Refinements to Discover More Precise Process Models

Process mining is a research field focused on the analysis of event data with the aim of extracting insights related to dynamic behavior. Applying process mining techniques on data from smart home environments has the potential to provide valuable insights into (un)healthy habits and to contribute to ambient assisted living solutions. Finding the right event labels to enable the application of process mining techniques is, however, far from trivial, as simply using the triggering sensor as the label for sensor events results in uninformative models that allow for too much behavior (overgeneralizing). Refinements of sensor-level event labels suggested by domain experts have been shown to enable discovery of more precise and insightful process models. However, there exists no automated approach to generate refinements of event labels in the context of process mining. In this paper we propose a framework for the automated generation of label refinements based on the time attribute of events, allowing us to distinguish behaviourally different instances of the same event type based on their time attribute. We show on a case study with real-life smart home event data that using automatically generated refined labels in process discovery, we can find more specific, and therefore more insightful, process models. We observe that one label refinement could have an effect on the usefulness of other label refinements when used together. Therefore, we explore four strategies to generate useful combinations of multiple label refinements and evaluate those on three real-life smart home event logs.
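To make the idea of a time-based label refinement concrete, here is a minimal sketch that splits one sensor label into several labels according to the hour at which the event occurred. The fixed time-of-day bins, the sensor names, and the function name are all assumptions for illustration; the abstract does not specify how the framework derives its refinements, and this is not the authors' method.

```python
from datetime import datetime

# Hypothetical time-of-day bins: (start hour inclusive, end hour exclusive, name).
TIME_BINS = [
    (0, 6, "night"),
    (6, 12, "morning"),
    (12, 18, "afternoon"),
    (18, 24, "evening"),
]


def refine_label(label, timestamp):
    """Append a time-of-day suffix to an event label based on its timestamp."""
    for start, end, name in TIME_BINS:
        if start <= timestamp.hour < end:
            return f"{label}_{name}"
    raise ValueError("hour out of range")


# Toy event log (hypothetical): (sensor label, timestamp) pairs.
events = [
    ("bedroom_sensor", datetime(2017, 5, 1, 7, 30)),
    ("bedroom_sensor", datetime(2017, 5, 1, 23, 10)),
    ("kitchen_sensor", datetime(2017, 5, 1, 12, 5)),
]

refined = [refine_label(label, ts) for label, ts in events]
print(refined)
```

After refinement, the two bedroom events carry different labels, so a process discovery algorithm can treat waking up and going to bed as distinct activities.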


Stabilizing Training of Generative Adversarial Networks through Regularization

Deep generative models based on Generative Adversarial Networks (GANs) have demonstrated impressive sample quality but in order to work they require a careful choice of architecture, parameter initialization, and selection of hyper-parameters. This fragility is in part due to a dimensional mismatch between the model distribution and the true distribution, causing their density ratio and the associated f-divergence to be undefined. We overcome this fundamental limitation and propose a new regularization approach with low computational cost that yields a stable GAN training procedure. We demonstrate the effectiveness of this approach on several datasets including common benchmark image generation tasks. Our approach turns GAN models into reliable building blocks for deep learning.


Multimodal Machine Learning: A Survey and Taxonomy

Our experience of the world is multimodal – we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.


An Efficient Algorithm for Bayesian Nearest Neighbours

K-Nearest Neighbours (k-NN) is a popular classification and regression algorithm, yet one of its main limitations is the difficulty in choosing the number of neighbours. We present a Bayesian algorithm to compute the posterior probability distribution for k given a target point within a data-set, efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or simulation – alongside an exact solution for distributions within the exponential family. The central idea is that data points around our target are generated by the same probability distribution, extending outwards over the appropriate, though unknown, number of neighbours. Once the data is projected onto a distance metric of choice, we can transform the choice of k into a change-point detection problem, for which there is an efficient solution: we recursively compute the probability of the last change-point as we move towards our target, and thus de facto compute the posterior probability distribution over k. Applying this approach to both a classification and a regression UCI data-set, we compare favourably and, most importantly, by removing the need for simulation, we are able to compute the posterior probability of k exactly and rapidly. As an example, the computational time for the Ripley data-set is a few milliseconds compared to a few hours when using an MCMC approach.


Learning to Optimize: Training Deep Neural Networks for Wireless Resource Management

For decades, optimization has played a central role in addressing wireless resource management problems such as power control and beamformer design. However, these algorithms often require a considerable number of iterations for convergence, which poses challenges for real-time processing. In this work, we propose a new learning-based approach for wireless resource management. The key idea is to treat the input and output of a resource allocation algorithm as an unknown non-linear mapping and to use a deep neural network (DNN) to approximate it. If the non-linear mapping can be learned accurately and effectively by a DNN of moderate size, then such a DNN can be used for resource allocation in almost real time, since passing the input through a DNN to get the output only requires a small number of simple operations. In this work, we first characterize a class of ‘learnable algorithms’ and then design DNNs to approximate some algorithms of interest in wireless communications. We use extensive numerical simulations to demonstrate the superior ability of DNNs for approximating two considerably complex algorithms that are designed for power allocation in wireless transmit signal design, while giving orders of magnitude speedup in computational time.


A Sampling Theory Perspective of Graph-based Semi-supervised Learning

Graph-based methods have been quite successful in solving unsupervised and semi-supervised learning problems, as they provide a means to capture the underlying geometry of the dataset. It is often desirable for the constructed graph to satisfy two properties: first, data points that are similar in the feature space should be strongly connected on the graph, and second, the class label information should vary smoothly with respect to the graph, where smoothness is measured using the spectral properties of the graph Laplacian matrix. Recent works have justified some of these smoothness conditions by showing that they are strongly linked to the semi-supervised smoothness assumption and its variants. In this work, we reinforce this connection by viewing the problem from a graph sampling theoretic perspective, where class indicator functions are treated as bandlimited graph signals (in the eigenvector basis of the graph Laplacian) and label prediction as a bandlimited reconstruction problem. Our approach involves analyzing the bandwidth of class indicator signals generated from statistical data models with separable and nonseparable classes. These models are quite general and mimic the nature of most real-world datasets. Our results show that in the asymptotic limit, the bandwidth of any class indicator is also closely related to the geometry of the dataset. This allows one to theoretically justify the assumption of bandlimitedness of class indicator signals, thereby providing a sampling theoretic interpretation of graph-based semi-supervised classification.
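The smoothness notion above can be illustrated with the graph Laplacian quadratic form: for L = D - W, the value x^T L x equals the sum over edges of w_ij (x_i - x_j)^2, so a class indicator that respects the cluster structure gets a small value. The sketch below is a toy illustration under assumed data (a hypothetical six-node graph with hand-picked weights); it does not reproduce the bandwidth analysis described in the abstract.

```python
def laplacian_quadratic_form(edges, signal):
    """Compute x^T L x = sum over edges of w_ij * (x_i - x_j)^2 for L = D - W."""
    return sum(w * (signal[i] - signal[j]) ** 2 for i, j, w in edges)


# Hypothetical graph: two tight clusters {0, 1, 2} and {3, 4, 5}
# joined by a single weak edge between nodes 2 and 3.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 0.1),
]

# A class indicator aligned with the clusters, and one that ignores them.
cluster_indicator = [1, 1, 1, 0, 0, 0]
scattered_indicator = [1, 0, 1, 0, 1, 0]

smooth = laplacian_quadratic_form(edges, cluster_indicator)
rough = laplacian_quadratic_form(edges, scattered_indicator)
print(smooth, rough)
```

The cluster-aligned indicator only pays for the one weak edge it cuts, while the scattered indicator cuts strong edges inside both clusters, which is the sense in which the former varies smoothly with respect to the graph.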


Classification regions of deep neural networks

The goal of this paper is to analyze the geometric properties of deep neural network classifiers in the input space. We specifically study the topology of classification regions created by deep networks, as well as their associated decision boundary. Through a systematic empirical investigation, we show that state-of-the-art deep nets learn connected classification regions, and that the decision boundary in the vicinity of datapoints is flat along most directions. We further draw an essential connection between two seemingly unrelated properties of deep networks: their sensitivity to additive perturbations in the inputs, and the curvature of their decision boundary. The directions where the decision boundary is curved in fact remarkably characterize the directions to which the classifier is the most vulnerable. We finally leverage a fundamental asymmetry in the curvature of the decision boundary of deep nets, and propose a method to discriminate between original images, and images perturbed with small adversarial examples. We show the effectiveness of this purely geometric approach for detecting small adversarial perturbations in images, and for recovering the labels of perturbed images.


Direct Multitype Cardiac Indices Estimation via Joint Representation and Regression Learning

Plan3D: Viewpoint and Trajectory Optimization for Aerial Multi-View Stereo Reconstruction

Diagonal Rescaling For Neural Networks

Convergent Tree-Backup and Retrace with Function Approximation

Smoothing Method for Approximate Extensive-Form Perfect Equilibrium

Operation Frames and Clubs in Kidney Exchange

Overcommitment in Cloud Services — Bin packing with Chance Constraints

On Wideness and Stability

Real-Time Background Subtraction Using Adaptive Sampling and Cascade of Gaussians

Rates of convergence for inexact Krasnosel’skii-Mann iterations in Banach spaces

Together We Know How to Achieve: An Epistemic Logic of Know-How

A central limit theorem for an omnibus embedding of random dot product graphs

On Star Coloring of Splitting Graphs

Shared Memory Parallel Subgraph Enumeration

Parallel Space-Time Kernel Density Estimation

Pose Guided Person Image Generation

Unsupervised Feature Learning for Writer Identification and Writer Retrieval

Covering complete graphs by monochromatically bounded sets

Centralized vs Decentralized Multi-Agent Guesswork

Capacity Scaling of Cellular Networks: Impact of Bandwidth, Infrastructure Density and Number of Antennas

Analog Beam Tracking in Linear Antenna Arrays: Convergence, Optimality, and Performance

Tensor rank is not multiplicative under the tensor product

Distributed Robust Subspace Recovery

Reconfiguration graphs of shortest paths

Discovering Reliable Approximate Functional Dependencies

Optimal Experimental Design Using A Consistent Bayesian Approach

Approximate and Stochastic Greedy Optimization

Approximation of Ruin Probabilities via Erlangized Scale Mixtures

Dual Based DSP Bidding Strategy and its Application

Tractable Post-Selection Maximum Likelihood Inference for the Lasso

Text-Independent Speaker Verification Using 3D Convolutional Neural Networks

From dimers to webs

Hierarchical Cellular Automata for Visual Saliency

On Two Unsolved Problems Concerning Matching Covered Graphs

Equivalences Between Network Codes With Link Errors and Index Codes With Side Information Errors

Deep Learning for Lung Cancer Detection: Tackling the Kaggle Data Science Bowl 2017 Challenge

Human Trajectory Prediction using Spatially aware Deep Attention Models

Effective Sampling: Fast Segmentation Using Robust Geometric Model Fitting

Duel and sweep algorithm for order-preserving pattern matching

Taste or Addiction?: Using Play Logs to Infer Song Selection Motivation

Equilibria in Sequential Allocation

Joint Sparse Recovery With Semisupervised MUSIC

Quantum entropy and complexity

Algorithmic clothing: hybrid recommendation, from street-style-to-shop

Discovery of statistical equivalence classes using computer algebra

Spectral Heat Content for Lévy Processes

Graphical model inference with unobserved variable via latent tree aggregation

Predicting Human Interaction via Relative Attention Model

On Time-Bandwidth Product of Multi-Soliton Pulses

Performance Framework for Sparse Random Linear Network Coding in Broadcast Networks

Zero-Shot Learning with Generative Latent Prototype Model

An update on non-Hamiltonian $\frac{5}{4}$-tough maximal planar graphs

Learning Robust Features with Incremental Auto-Encoders

PL-SLAM: a Stereo SLAM System through the Combination of Points and Line Segments

Ancestral distributions in the coalescent

Performance of Viterbi Decoding on Interleaved Rician Fading Channels

Quantile function approximation using regularly varying functions

New Variants of Pattern Matching with Constants and Variables

The geometry of multi-marginal Skorokhod Embedding

BP-LED decoding algorithm for LDPC codes over AWGN channels

On infinite divisibility of a class of two-dimensional vectors in the second Wiener chaos

Two characteristic polynomials corresponding to graphical networks over min-plus algebra

ASR error management for improving spoken language understanding

Biomedical Event Trigger Identification Using Bidirectional Recurrent Neural Network Based Models

Inapproximability of VC Dimension and Littlestone’s Dimension

Towards meaningful physics from generative models

Beyond Gaussian Approximation: Bootstrap for Maxima of Sums of Independent Random Vectors

Fully Automatic Segmentation and Objective Assessment of Atrial Scars for Longstanding Persistent Atrial Fibrillation Patients Using Late Gadolinium-Enhanced MRI

Updated guidelines, updated curriculum: The GAISE College Report and introductory statistics for the modern student

Glass Transition in Supercooled Liquids with Medium Range Crystalline Order

On Two LZ78-style Grammars: Compression Bounds and Compressed-Space Computation

Expansion and contraction functors on matroids

On vertex types of graphs

An Inverse Problem for Infinitely Divisible Moving Average Random Fields

Logical and Inequality Implications for Reducing the Size and Complexity of Quadratic Unconstrained Binary Optimization Problems

Residual Expansion Algorithm: Fast and Effective Optimization for Nonconvex Least Squares Problems

Forbidden induced subposets in the grid

Analysis of universal adversarial perturbations

Bayesian GAN

Differentially private significance tests for regression coefficients

FRAMR-EMR: Framework for Prognostic Predictive Model Development Using Electronic Medical Record Data with a Case Study in Osteoarthritis Risk

Rational Fair Consensus in the GOSSIP Model

A polarity theory for sets of desirable gambles

An alternative ranking for national soccer teams based on strength parameters

The maximal subgroups and the complexity of the flow semigroup of finite (di)graphs

Risk-Sensitive Cooperative Games for Human-Machine Systems

Detecting and Explaining Crisis

Enhancement of SSD by concatenating feature maps for object detection

Fourier Phase Retrieval: Uniqueness and Algorithms

Estimation of Genetic Risk Function with Covariates in the Presence of Missing Genotypes

Random matrix products when the top Lyapunov exponent is simple

Extracting 3D Vascular Structures from Microscopy Images using Convolutional Recurrent Networks

Nearly Semiparametric Efficient Estimation of Quantile Regression

Approximating Constrained Minimum Cost Input-Output Selection for Generic Arbitrary Pole Placement in Structured Systems

Learning a Robust Society of Tracking Parts

Combinatorial Multi-Armed Bandits with Filtered Feedback

End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data

Gossip in a Smartphone Peer-to-Peer Network

Coverage and Spectral Efficiency of Indoor mmWave Networks with Ceiling-Mounted Access Points

New Optimal Binary Sequences with Period $4p$ via Interleaving Ding-Helleseth-Lam Sequences

A General Convergence Result for the Exponentiated Gradient Method

Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration

Multidesigns for a graph pair of order 6

Multiresolution Priority Queues

A greedy approximation algorithm for the minimum (2,2)-connected dominating set problem

Learning Causal Structures Using Regression Invariance

A Demazure crystal construction for Schubert polynomials

The border support rank of two-by-two matrix multiplication is seven

Style Transfer from Non-Parallel Text by Cross-Alignment

Helping News Editors Write Better Headlines: A Recommender to Improve the Keyword Contents & Shareability of News Headlines
