A Bayesian Splitotic Theory For Nonparametric Models

We develop a set of scalable Bayesian inference procedures for a general class of nonparametric regression models based on embarrassingly parallel MCMC. Specifically, we first perform independent nonparametric Bayesian inference on each subset split from a massive dataset, and then aggregate those results into global counterparts. By partitioning the dataset carefully, we show that our aggregated inference results attain the oracle rule, in the sense that they are equivalent to those obtained directly from the massive data (which is computationally prohibitive in practice). For example, the aggregated credible sets achieve the credibility level and frequentist coverage of their oracle counterparts, with similar radius. The oracle matching phenomenon occurs due to the nice geometric structure of the infinite-dimensional parameter space. A technical by-product is a new version of a uniformly consistent test that applies to a general regression model under the Sobolev norm.
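
The split-then-aggregate idea can be illustrated with a deliberately simple parametric toy. It is not the nonparametric procedure of the paper: the sketch assumes a Gaussian mean with known noise variance, a flat prior, and equal-size subsets, and all function names are hypothetical. Under those assumptions, averaging the subset posteriors reproduces the full-data ("oracle") posterior.

```python
from statistics import fmean


def split(data, s):
    """Partition data into s equal-size subsets (assumes len(data) % s == 0)."""
    size = len(data) // s
    return [data[i * size:(i + 1) * size] for i in range(s)]


def subset_posterior(y, sigma2=1.0):
    """Posterior (mean, variance) of a Gaussian mean under a flat prior
    and known noise variance sigma2. Toy model, assumed for illustration."""
    return fmean(y), sigma2 / len(y)


def aggregate(posteriors):
    """Combine subset posteriors by averaging independent subset draws:
    the means are averaged and the variances are summed and divided by s^2."""
    s = len(posteriors)
    mean = fmean([m for m, _ in posteriors])
    var = sum(v for _, v in posteriors) / s ** 2
    return mean, var


data = [float(i) for i in range(12)]
aggregated = aggregate([subset_posterior(block) for block in split(data, 3)])
oracle = subset_posterior(data)  # inference on the full data at once
```

In this toy setting `aggregated` and `oracle` agree up to floating-point error. The abstract's claim is that a comparable match holds for nonparametric regression when the number of subsets is chosen carefully.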

A Comprehensive Perspective on the Pilot-Job Abstraction

This paper offers a comprehensive analysis of the Pilot-Job abstraction, assessing its evolution, properties, and implementation as multiple Pilot-Job software systems. Pilot-Job systems play an important role in supporting distributed scientific computing. They are used to consume more than 700 million CPU hours a year by the Open Science Grid communities and to process up to 5 million jobs a week for the ATLAS experiment on the Worldwide LHC Computing Grid. With the increasing importance of task-level parallelism in high-performance computing, Pilot-Job systems are also being adopted beyond their traditional domains. Notwithstanding the growing impact on scientific research, there is no agreed-upon definition of a Pilot-Job system and no clear understanding of the underlying Pilot abstraction and paradigm. This lack of foundational understanding has led to a proliferation of unsustainable Pilot-Job implementations with no shared best practices or interoperability, ultimately hindering a realization of the full impact of Pilot-Jobs. This paper offers the conceptual tools to promote this fundamental understanding while critically reviewing the state of the art of Pilot-Job implementations. The five main contributions of this paper are: (i) an analysis of the motivations and evolution of the Pilot-Job abstraction; (ii) an outline of the minimal set of distinguishing functionalities; (iii) the definition of a core vocabulary to reason consistently about Pilot-Jobs; (iv) the description of core and auxiliary properties of Pilot-Job systems; and (v) a critical review of the current state of the art of their implementations. These contributions are brought together to illustrate the generality of the Pilot-Job paradigm and to discuss some of the challenges in distributed computing that it addresses, along with future opportunities.

Cascade Learning by Optimally Partitioning

The cascaded AdaBoost classifier is a well-known, efficient object detection algorithm. The cascade structure has many parameters to be determined. Most existing cascade learning algorithms are designed by assigning a detection rate and a false positive rate to each stage, either dynamically or statically. Their objective functions are not directly related to minimum computation cost, so these algorithms are not guaranteed to find an optimal solution in the sense of minimizing computation cost. On the assumption that a strong classifier is given, in this paper we propose an optimal cascade learning algorithm (which we call iCascade) that iteratively partitions the strong classifiers into two parts until a predefined number of stages is generated. iCascade searches for the optimal number $r_i$ of weak classifiers in each stage $i$ by directly minimizing the computation cost of the cascade. Theorems are provided to guarantee the existence of the unique optimal solution. Theorems are also given for the proposed efficient algorithm for searching for the optimal parameters $r_i$. Once a new stage is added, the parameter $r_i$ for each stage decreases gradually as the iteration proceeds, which we call the decreasing phenomenon. Moreover, with the goal of minimizing computation cost, we develop an effective algorithm for setting the optimal threshold of each stage classifier. In addition, we prove in theory why more new weak classifiers are required compared to the last stage. Experimental results on face detection demonstrate the effectiveness and efficiency of the proposed algorithm.
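
To make "computation cost of the cascade" concrete, the sketch below uses a common simplified cost model. It is an assumption for illustration, not necessarily the objective defined in the paper: every window pays for all weak classifiers of stage 1, and pays for stage i only if it passed every earlier stage. The function name and the `pass_rates` parameter are hypothetical.

```python
def expected_cost(stage_sizes, pass_rates):
    """Expected number of weak-classifier evaluations per window.

    stage_sizes[i] is the number of weak classifiers in stage i.
    pass_rates[i] is the (assumed known) fraction of windows reaching
    stage i that the stage lets through to the next one.
    """
    cost = 0.0
    reach = 1.0  # probability that a window reaches the current stage
    for r, p in zip(stage_sizes, pass_rates):
        cost += reach * r
        reach *= p
    return cost


# One 12-classifier stage versus the same 12 classifiers split into 2 + 10,
# where the cheap first stage rejects half of the windows.
single = expected_cost([12], [0.1])
cascade = expected_cost([2, 10], [0.5, 0.1])
```

Here `single` is 12.0 and `cascade` is 7.0, which shows why the choice of how many weak classifiers go into each stage directly affects the cost being minimized.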

ESDF: Ensemble Selection using Diversity and Frequency

Recently, ensemble selection for consensus clustering has emerged as a research problem in machine intelligence. Consensus clustering algorithms normally take the entire ensemble of clusterings into account, and there is a tendency to generate a very large ensemble before computing its consensus. One can instead judiciously select a few partitions from the ensemble without compromising the quality of the consensus. This can make the consensus computation more efficient and avoid unnecessary computational overhead. The ensemble selection problem addresses this issue of consensus clustering. In this paper, we propose an efficient method of ensemble selection for a large ensemble. We prioritize the partitions in the ensemble based on diversity and frequency. Our method selects the top K partitions in order of priority, where K is chosen by the user. We observe that considering diversity and frequency jointly helps in identifying a few representative partitions whose consensus is qualitatively better than the consensus of the entire ensemble. Experimental analysis on a large number of datasets shows that our method gives better results than earlier ensemble selection methods.
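
A minimal sketch of top-K selection by a priority that mixes diversity and frequency is given below. The abstract does not define either quantity, so the choices here are assumptions made purely for illustration: frequency is the share of ensemble members equal to a partition (up to relabeling), diversity is the mean of one minus the Rand index against the other distinct partitions, and the two are simply added. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def canonical(labels):
    """Relabel clusters by order of first appearance so that partitions
    differing only in label names compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(label, len(mapping)) for label in labels)


def rand_index(a, b):
    """Fraction of object pairs on which two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def select_top_k(ensemble, k):
    """Return the k distinct partitions with the highest priority."""
    counts = Counter(canonical(p) for p in ensemble)
    distinct = list(counts)

    def priority(p):
        others = [q for q in distinct if q != p]
        diversity = (sum(1 - rand_index(p, q) for q in others) / len(others)
                     if others else 0.0)
        frequency = counts[p] / len(ensemble)
        return diversity + frequency  # assumed combination rule

    return sorted(distinct, key=priority, reverse=True)[:k]


ensemble = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1], [0, 0, 0, 1]]
selected = select_top_k(ensemble, 2)
```

The first two ensemble members describe the same partition under different label names, so they count as one distinct partition with frequency 2/4.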

Learning Word Meta-Embeddings by Using Ensembles of Embedding Sets

Word embeddings — distributed representations for words — in deep learning are beneficial for many tasks in Natural Language Processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured semantics. Instead of relying on a more advanced algorithm for embedding learning, this paper proposes an ensemble approach of combining different public embedding sets with the aim of learning meta-embeddings. Experiments on word similarity and analogy tasks and on part-of-speech (POS) tagging show better performance of meta-embeddings compared to individual embedding sets. One advantage of meta-embeddings is that they have increased coverage of the vocabulary. We will release our meta-embeddings publicly.
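
The coverage advantage can be seen with the simplest conceivable combination scheme, concatenation. The abstract does not say which ensemble method the paper uses, so the sketch below is only an assumed baseline: each vector is L2-normalized, the vectors of a word are concatenated across sets, and words missing from a set are zero-filled (an assumption of this sketch). The function names and the toy vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concat_meta_embeddings(embedding_sets):
    """Concatenate per-set vectors over the union vocabulary.

    embedding_sets is a list of dicts mapping word -> vector, where all
    vectors within one set have the same dimension.
    """
    dims = [len(next(iter(s.values()))) for s in embedding_sets]
    vocab = set().union(*embedding_sets)
    meta = {}
    for word in sorted(vocab):
        parts = []
        for s, d in zip(embedding_sets, dims):
            # Zero-fill when this set does not cover the word.
            parts.extend(l2_normalize(s[word]) if word in s else [0.0] * d)
        meta[word] = parts
    return meta


set_a = {"cat": [3.0, 4.0], "dog": [1.0, 0.0]}
set_b = {"cat": [0.0, 2.0, 0.0], "fish": [1.0, 0.0, 0.0]}
meta = concat_meta_embeddings([set_a, set_b])
```

The resulting vocabulary is the union of the two sets ("cat", "dog", "fish"), larger than either input vocabulary, which is the coverage effect the abstract refers to.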

ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R

We introduce the C++ application and R package ranger. The software is a fast implementation of random forests for high dimensional data. Ensembles of classification, regression and survival trees are supported. We describe the implementation, provide examples, validate the package with a reference implementation, and compare runtime and memory usage with other implementations. The new software proves to scale best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory efficient implementation of random forests to analyze data on the scale of a genome-wide association study.

Reasoning in complex environments with the SelectScript declarative language

SelectScript is an extendable, adaptable, and declarative domain-specific language aimed at information retrieval from simulation environments and robotic world models in an SQL-like manner. In this work we have extended the language in two directions. First, we have implemented hierarchical queries; second, we have improved efficiency by enabling manual design space exploration of different ‘search’ strategies. As depicted in the teaser above, we demonstrate the applicability of these extensions in two application problems: the basic language concepts are explained by solving the classical problem of the Towers of Hanoi, and then a common path planning problem in a complex 3D environment is implemented.
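
For readers unfamiliar with the Towers of Hanoi problem mentioned above, the classical recursive solution is sketched here in plain Python. This is not SelectScript syntax and does not reflect how the paper expresses the problem as hierarchical queries; the peg names are arbitrary.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that transfers n disks
    from source to target, never placing a larger disk on a smaller one."""
    if n == 0:
        return []
    return (hanoi(n - 1, source, spare, target)    # clear the top n-1 disks
            + [(source, target)]                   # move the largest disk
            + hanoi(n - 1, spare, target, source))  # restack the n-1 disks


moves = hanoi(3)
```

The recursion produces 2^n - 1 moves, so `moves` contains 7 entries for three disks.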

Supervised learning of sparse context reconstruction coefficients for data representation and classification

The context of a data point, which is usually defined as the other data points in a data set, has been found to play an important role in data representation and classification. In this paper, we study the problem of using the context of a data point for its classification. Our work is inspired by the observation that only very few data points in the context of a data point are actually critical for its representation and classification. We propose to represent a data point as a sparse linear combination of its context, and to learn the sparse context in a supervised way to increase its discriminative ability. To this end, we propose a novel formulation for context learning that models the learning of the context parameter and the classifier in a unified objective and optimizes it with an alternating strategy in an iterative algorithm. Experiments on three benchmark data sets show its advantage over state-of-the-art context-based data representation and classification methods.

A new lower bound for the Towers of Hanoi problem

Categorifying the tensor product of the Kirillov-Reshetikhin crystal $B^{1,1}$ and a fundamental crystal

Chip-firing on general invertible matrices

Compact Routing Messages in Self-Healing Trees

Completely Positive Tensors and Multi-Hypergraphs

Confounder Adjustment in Multiple Hypotheses Testing

Cramér’s theorem is atypical

Deep clustering: Discriminative embeddings for segmentation and separation

Distributed Deep Q-Learning

End-to-End Attention-based Large Vocabulary Speech Recognition

Equitable colorings of complete multipartite graphs

Ergodic theory of the symmetric inclusion process

Finite phylogenetic complexity of $Z_p$ and invariants for $Z_3$

Generalized quasirandom properties of graphs

Hierarchical mutual information for the comparison of hierarchical community structures in complex networks

High order exactly divergence-free Hybrid Discontinuous Galerkin Methods for unsteady incompressible flows

Implied volatility in strict local martingale models

Interface dynamics of a metastable mass-conserving spatially extended diffusion

Investigating Galaxy-Filament Alignments in Hydrodynamic Simulations using Density Ridges

Issues in the Multiple Try Metropolis mixing

Jump activity estimation for pure-jump semimartingales via self-normalized statistics

Learning Modulo Theories for preference elicitation in hybrid domains

Macroeconomic Forecasting and Variable Selection with a Very Large Number of Predictors: A Penalized Regression Approach

Mechanism Design via Dantzig-Wolfe Decomposition

Mixed mode oscillation suppression states in coupled oscillators

Multifractal analysis of electronic states on random Voronoi-Delaunay lattices

Non-Stationary Gaussian Process Regression with Hamiltonian Monte Carlo

On the Distributed Computation of Fractional Connected Dominating Set Packings

Partitioning Strategies for Load Balancing Subgraph-centric Distributed Graph Processing

Periods of Iterated Rational Functions over a Finite Field

Probabilistic Modelling of Morphologically Rich Languages

Reflective Oracles: A Foundation for Classical Game Theory

Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data

Scalable Out-of-Sample Extension of Graph Embeddings Using Deep Neural Networks

Stability of gas measures under perturbations and discretizations

Stable Degeneracies for Ising Models

State-space models’ dirty little secrets: even simple linear Gaussian models can have estimation problems

Tight Bounds for MIS in Multichannel Radio Networks

Transverse and longitudinal vibrations in amorphous silicon

Traveling cluster pairs in a system of phase oscillators with positive and negative couplings under periodic driving

Uniqueness of the Smith max-stable process: A characterization of the normal distribution

Zero-Truncated Poisson Tensor Factorization for Massive Binary Tensors