Efficient Online Learning for Optimizing Value of Information: Theory and Application to Interactive Troubleshooting

We consider the optimal value of information (VoI) problem, where the goal is to sequentially select a set of tests with minimal cost, so that one can efficiently make the best decision based on the observed outcomes. Existing algorithms are either heuristics with no guarantees, or scale poorly (with run time exponential in the number of available tests). Moreover, these methods assume a known distribution over the test outcomes, which is often not the case in practice. We propose an efficient sampling-based online learning framework to address these issues. First, assuming the distribution over hypotheses is known, we propose a dynamic hypothesis enumeration strategy, which allows efficient information gathering with strong theoretical guarantees. We show that with a sufficient number of samples, one can identify a near-optimal decision with high probability. Second, when the parameters of the hypothesis distribution are unknown, we propose an algorithm which learns the parameters progressively via posterior sampling in an online fashion. We further establish a rigorous bound on the expected regret. We demonstrate the effectiveness of our approach on a real-world interactive troubleshooting application and show that one can efficiently make high-quality decisions at low cost.


ParaGraphE: A Library for Parallel Knowledge Graph Embedding

Knowledge graph embedding aims at translating the knowledge graph into numerical representations by transforming the entities and relations into continuous low-dimensional vectors. Recently, many methods [1, 5, 3, 2, 6] have been proposed to deal with this problem, but existing single-thread implementations of them are time-consuming for large-scale knowledge graphs. Here, we design a unified parallel framework to parallelize these methods, which achieves a significant time reduction without influencing the accuracy. We name our framework ParaGraphE, which provides a library for parallel knowledge graph embedding. The source code can be downloaded from https://github.com/LIBBLE/LIBBLE-MultiThread/tree/master/ParaGraphE.


Improving Document Clustering by Eliminating Unnatural Language

Technical documents contain a fair amount of unnatural language, such as tables, formulas, and pseudo-code. Unnatural language can be an important source of confusion for existing NLP tools. This paper presents an effective method of distinguishing unnatural language from natural language, and evaluates the impact of unnatural language detection on NLP tasks such as document clustering. We view this problem as an information extraction task and build a multiclass classification model that classifies unnatural language components into four categories. First, we create a new annotated corpus by collecting slides and papers in various formats (PPT, PDF, and HTML), where unnatural language components are annotated into four categories. We then explore features available from plain text to build a statistical model that can handle any format as long as it is converted into plain text. Our experiments show that removing unnatural language components gives an absolute improvement in document clustering of up to 15%. Our corpus and tool are publicly available.


Complexity of sampling as an order parameter

Many-body quantum state tomography with neural networks

Generalised additive mixed models for dynamic analysis in linguistics: a practical introduction

Layered black-box, behavioral interconnection perspective and applications to the problem of communication with fidelity criteria, Part I: i.i.d. sources

Layered black-box, behavioral interconnection perspective and applications to the problem of communication with fidelity criteria, Part II: stationary sources satisfying ψ-mixing criterion

Hadamard Equiangular Tight Frames

Illuminant Estimation using Ensembles of Multivariate Regression Trees

A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

Concentration Bounds for Two Timescale Stochastic Approximation with Applications to Reinforcement Learning

Non-crossing Monotone Paths and Binary Trees in Edge-ordered Complete Geometric Graphs

The Interactive Sum Choice Number of Graphs

3D Vision Guided Robotic Charging Station for Electric and Plug-in Hybrid Vehicles

On the honeycomb conjecture for a class of minimal convex partitions

A distributed primal-dual algorithm for computation of generalized Nash equilibria with shared affine coupling constraints via operator splitting methods

Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

Convolutional Low-Resolution Fine-Grained Classification

Model-Free Based Digital Control for Magnetic Measurements

Conscious and controlling elements in combinatorial group testing problems

Mobile Unmanned Aerial Vehicles (UAVs) for Energy-Efficient Internet of Things Communications

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

On trees with real rooted independence polynomial

Aggregation of Classifiers: A Justifiable Information Granularity Approach

On Perfect Matchings in Matching Covered Graphs

General expression for component-size distribution in infinite configuration networks

A Multitaper, Causal Decomposition for Stochastic, Multivariate Time Series: Application to High-Frequency Calcium Imaging Data

A Local Algorithm for the Sparse Spanning Graph Problem

Large Scale Evolution of Convolutional Neural Networks Using Volunteer Computing

End-to-end optimization of goal-driven and visually grounded dialogue systems

Lower Bounds and Algorithm for Partially Replicated Causally Consistent Shared Memory

Many-Body Localization in Spin Chain Systems with Quasiperiodic Fields

Families in posets minimizing the number of comparable pairs

Cost-complexity pruning of random forests

Proof of Luck: an Efficient Blockchain Consensus Protocol

Legal Question Answering using Ranking SVM and Deep Convolutional Neural Network

On Convergence Rate of a Continuous-Time Distributed Self-Appraisal Model with Time-Varying Relative Interaction Matrices

Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

Minimax Regret Bounds for Reinforcement Learning

Refining Image Categorization by Exploiting Web Images and General Corpus

A New and Practical Design of Cancellable Biometrics: Index-of-Max Hashing

Multiobjective optimization to a TB-HIV/AIDS coinfection optimal control problem

Renormalization of the two-dimensional stochastic nonlinear wave equation

Using Human Brain Activity to Guide Machine Learning

Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model

Products of random walks on finite groups with moderate growth

Global and Local Information Based Deep Network for Skin Lesion Segmentation

Database Learning: Toward a Database that Becomes Smarter Every Time

Model selection and parameter inference in phylogenetics using Nested Sampling

Multi-User Millimeter Wave Channel Estimation Using Generalized Block OMP Algorithm

Using Reinforcement Learning for Demand Response of Domestic Hot Water Buffers: a Real-Life Demonstration

Accelerated and Inexact Soft-Impute for Large-Scale Matrix and Tensor Completion

Pretty $k$-clean monomial ideals and $k$-decomposable multicomplexes

Data Delivery by Mobile Agents with Energy Constraints over a fixed path

Girsanov reweighting for path ensembles and Markov state models

Grüneisen model for melts

Occupation times for the finite buffer fluid queue with phase-type OFF-times

Steganographic Generative Adversarial Networks

Transient sequences in a hypernetwork generated by an adaptive network of spiking neurons

Dynamic Erdős-Rényi graphs

VieM v1.00 — Vienna Mapping and Sparse Quadratic Assignment User Guide

An Induced Natural Selection Heuristic for Evaluating Optimal Bayesian Experimental Designs

Arrovian Aggregation via Pairwise Utilitarianism

A new approximation to the geometric-arithmetic index

New lower bounds for the Geometric-Arithmetic index

Fluctuations in percolation of sparse complex networks

Convolutional Neural Network on Three Orthogonal Planes for Dynamic Texture Classification

Concerning partition regular matrices

Clustering of Gamma-Ray bursts through kernel principal component analysis

Configuration spaces of graphs with certain permitted collisions

A Compressive Method for Centralized PSD Map Construction with Imperfect Reporting Channel

Shift Aggregate Extract Networks

Concatenated LDPC-Polar Codes Decoding Through Belief Propagation

The nature and origin of heavy tails in retweet activity

Maximal rank in matrix spaces via graph matchings

Improving TSP tours using dynamic programming over tree decomposition

Combining Contrast Invariant L1 Data Fidelities with Nonlinear Spectral Image Decomposition

Fraternal Twins: Unifying Attacks on Machine Learning and Digital Watermarking

Betti numbers of complexes with highly connected links

Jante’s law process

Quantum Spectral Clustering through a Biased Phase Estimation Algorithm

From visual words to a visual grammar: using language modelling for image classification

Ultimate Positivity of Diagonals of Quasi-rational Functions

A monodromy graph approach to the piecewise polynomiality of mixed double Hurwitz numbers

Convolutional neural network architecture for geometric matching

Linear-Time Algorithm for Maximum-Cardinality Matching on Cocomparability Graphs

Destruction of Anderson localization in quantum nonlinear Schrödinger lattices

Forbidden Families of Minimal Quadratic and Cubic Configurations

Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval

Nonparametric intensity estimation from indirect point process observations under unknown error distribution

Semantic-level Decentralized Multi-Robot Decision-Making using Probabilistic Macro-Observations

Scalable Accelerated Decentralized Multi-Robot Policy Search in Continuous Observation Spaces

Anisotropic-Scale Junction Detection and Matching for Indoor Images

Modeling and Analysis of Non-Orthogonal MBMS Transmission in Heterogeneous Networks

Two Dimensional Translation-Invariant Probability Distributions: Approximations, Characterizations and No-Go Theorems

Distributed Mechanism Design with Learning Guarantees

Replicable Parallel Branch and Bound Search

Low Complexity Beamforming Training Method for mmWave Communications

End-to-End Learning for Structured Prediction Energy Networks

Upper bounds for the Holevo quantity and their use

Distant total sum distinguishing index of graphs

On Skorokhod Embeddings and Poisson Equations

Segmented and Directional Impact Detection for Parked Vehicles using Mobile Devices

Obstructions for three-coloring and list three-coloring $H$-free graphs

One more remark on the adjoint polynomial

Gaussian process regression for forecasting battery state of health

Enhancing Coexistence in the Unlicensed Band with Massive MIMO

SVDNet for Pedestrian Retrieval

Attitude and Gyro Bias Estimation Using GPS and IMU Measurements

Bayesian Sketch Learning for Program Synthesis

Testing and non-linear preconditioning of the proximal point method

The challenge of decentralized marketplaces

Sizes of Pentagonal Clusters in Fullerenes

Spontaneous symmetry breaking due to the trade-off between attractive and repulsive couplings

Learning Robust Hash Codes for Multiple Instance Image Retrieval

Simultaneous Wireless Information and Power Transfer in MISO Full-Duplex Systems
