On the Complexity of Semantic Integration of OWL Ontologies

We propose a new mechanism for integration of OWL ontologies using semantic import relations. In contrast to the standard OWL importing, we do not require all axioms of the imported ontologies to be taken into account for reasoning tasks, but only their logical implications over a chosen signature. This property comes natural in many ontology integration scenarios, especially when the number of ontologies is large. In this paper, we study the complexity of reasoning over ontologies with semantic import relations and establish a range of tight complexity bounds for various fragments of OWL.

Dykstra’s Algorithm, ADMM, and Coordinate Descent: Connections, Insights, and Extensions

We study connections between Dykstra’s algorithm for projecting onto an intersection of convex sets, the augmented Lagrangian method of multipliers or ADMM, and block coordinate descent. We prove that coordinate descent for a regularized regression problem, in which the (separable) penalty functions are seminorms, is exactly equivalent to Dykstra’s algorithm applied to the dual problem. ADMM on the dual problem is also seen to be equivalent, in the special case of two sets, with one being a linear subspace. These connections, aside from being interesting in their own right, suggest new ways of analyzing and extending coordinate descent. For example, from existing convergence theory on Dykstra’s algorithm over polyhedra, we discern that coordinate descent for the lasso problem converges at an (asymptotically) linear rate. We also develop two parallel versions of coordinate descent, based on the Dykstra and ADMM connections.

Bayesian Group Decisions: Algorithms and Complexity

Many important real-world decision-making problems involve interactions of individuals with purely informational externalities, for example, in jury deliberations, expert committees, etc. We model such interactions of rational agents in a group, where they receive private information and act based on that information while also observing other people’s beliefs or actions. As a Bayesian agent attempts to infer the truth from her sequence of observations of actions of others and her own private signal, she recursively refines her belief on the signals that other players could have observed and actions that they could have taken given that other players are also rational. The existing literature addresses asymptotic properties of Bayesian group decisions (important questions such as convergence to consensus and learning). In this work, we address the computations that the Bayesian agent should undertake to realize the optimal actions at every decision epoch. We use the iterated eliminations of infeasible signals (IEIS) to model the thinking process as well as the calculations of a Bayesian agent in a group decision scenario. We show that IEIS algorithm runs in exponential time; however, when the group structure is a partially ordered set the Bayesian calculations simplify and polynomial-time computation of the Bayesian recommendations is possible. We next shift attention to the case where agents reveal their beliefs (instead of actions) at every decision epoch. We analyze the computational complexity of the Bayesian belief formation in groups and show that it is NP-hard. We also investigate the factors underlying this computational complexity and show how belief calculations simplify in special network structures or cases with strong inherent symmetries. We finally give insights about the statistical efficiency (optimality) of the beliefs and its relations to computational efficiency.

Scalable and Efficient Construction of Suffix Array with MapReduce and In-Memory Data Store System

Suffix Array (SA) is a cardinal data structure in many pattern matching applications, including data compression, plagiarism detection and sequence alignment. However, as the volumes of data increase abruptly, the construction of SA is not amenable to the current large-scale data processing frameworks anymore due to its intrinsic proliferation of suffixes during the construction. That is, ameliorating the performance by just adding the resources to the frameworks becomes less cost- effective, even having the severe diminishing returns. At issue now is whether we can permit SA construction to be more scalable and efficient for the everlasting accretion of data by creating a radical shift in perspective. Regarding TeraSort [1] as our baseline, we first demonstrate the fragile scalability of TeraSort and investigate what causes it through the experiments on the sequence alignment of a grouper (i.e., the SA construc- tion used in bioinformatics). As such, we propose a scheme that amalgamates the distributed key-value store system into MapReduce to leverage the in-memory queries about suffixes. Rather than handling the communication of suffixes, MapReduce is in charge of the communication of their indexes, which means better capacity for more data. It significantly abates the required disk space for constructing SA and better utilizes the memory, which in turn improves the scalability radically. We also examine the efficiency of our scheme in terms of memory and show it outperforms TeraSort. At last, our scheme can complete the pair- end sequencing and alignment with two input files without any degradation on scalability, and can accommodate the suffixes of nearly 6.7 TB in a small cluster composed of 16 nodes and Gigabit Ethernet without any compression.

Efficient Parallel Methods for Deep Reinforcement Learning

We propose a novel framework for efficient parallelization of deep reinforcement learning algorithms, enabling these algorithms to learn from multiple actors on a single machine. The framework is algorithm agnostic and can be applied to on-policy, off-policy, value based and policy gradient based algorithms. Given its inherent parallelism, the framework can be efficiently implemented on a GPU, allowing the usage of powerful models while significantly reducing training time. We demonstrate the effectiveness of our framework by implementing an advantage actor-critic algorithm on a GPU, using on-policy experiences and employing synchronous updates. Our algorithm achieves state-of-the-art performance on the Atari domain after only a few hours of training. Our framework thus opens the door for much faster experimentation on demanding problem domains. Our implementation is open-source and is made public at https://…/paac.

Machine learning methods for multimedia information retrieval

In this thesis we examined several multimodal feature extraction and learning methods for retrieval and classification purposes. We reread briefly some theoretical results of learning in Section 2 and reviewed several generative and discriminative models in Section 3 while we described the similarity kernel in Section 4. We examined different aspects of the multimodal image retrieval and classification in Section 5 and suggested methods for identifying quality assessments of Web documents in Section 6. In our last problem we proposed similarity kernel for time-series based classification. The experiments were carried over publicly available datasets and source codes for the most essential parts are either open source or released. Since the used similarity graphs (Section 4.2) are greatly constrained for computational purposes, we would like to continue work with more complex, evolving and capable graphs and apply for different problems such as capturing the rapid change in the distribution (e.g. session based recommendation) or complex graphs of the literature work. The similarity kernel with the proper metrics reaches and in many cases improves over the state-of-the-art. Hence we may conclude generative models based on instance similarities with multiple modes is a generally applicable model for classification and regression tasks ranging over various domains, including but not limited to the ones presented in this thesis. More generally, the Fisher kernel is not only unique in many ways but one of the most powerful kernel functions. Therefore we may exploit the Fisher kernel in the future over widely used generative models, such as Boltzmann Machines [Hinton et al., 1984], a particular subset, the Restricted Boltzmann Machines and Deep Belief Networks [Hinton et al., 2006]), Latent Dirichlet Allocation [Blei et al., 2003] or Hidden Markov Models [Baum and Petrie, 1966] to name a few.

Detecting Statistical Interactions from Neural Network Weights

Interpreting deep neural networks can enable new applications for predictive modeling where both accuracy and interpretability are required. In this paper, we examine the underlying structure of a deep neural network to interpret the statistical interactions it captures. Our key observation is that any input features that interact with each other must follow strongly weighted connections to a common hidden unit before the final output. We propose a novel framework for detecting feature interactions of arbitrary order by interpreting neural network weights. Our framework, which we call Neural Interaction Detector (NID), accurately identifies meaningful interactions without an exhaustive search on an exponential solution space of interaction candidates. Empirical evaluation on both synthetic and real-world data showed the effectiveness of NID, which can uncover interactions omitted by other methods in orders of magnitude less time.

Online Learning Via Regularized Frequent Directions

Online Newton step algorithms usually achieve good performance with less training samples than first order methods, but require higher space and time complexity in each iteration. In this paper, we develop a new sketching strategy called regularized frequent direction (RFD) to improve the performance of online Newton algorithms. Unlike the standard frequent direction (FD) which only maintains a sketching matrix, the RFD introduces a regularization term additionally. The regularization provides an adaptive stepsize for update, which makes the algorithm more stable. The RFD also reduces the approximation error of FD with almost the same cost and makes the online learning more robust to hyperparameters. Empirical studies demonstrate that our approach outperforms sate-of-the-art second order online learning algorithms.

Active Learning for Graph Embedding

Graph embedding provides an efficient solution for graph analysis by converting the graph into a low-dimensional space which preserves the structure information. In contrast to the graph structure data, the i.i.d. node embedding can be processed efficiently in terms of both time and space. Current semi-supervised graph embedding algorithms assume the labelled nodes are given, which may not be always true in the real world. While manually label all training data is inapplicable, how to select the subset of training data to label so as to maximize the graph analysis task performance is of great importance. This motivates our proposed active graph embedding (AGE) framework, in which we design a general active learning query strategy for any semi-supervised graph embedding algorithm. AGE selects the most informative nodes as the training labelled nodes based on the graphical information (i.e., node centrality) as well as the learnt node embedding (i.e., node classification uncertainty and node embedding representativeness). Different query criteria are combined with the time-sensitive parameters which shift the focus from graph based query criteria to embedding based criteria as the learning progresses. Experiments have been conducted on three public data sets and the results verified the effectiveness of each component of our query strategy and the power of combining them using time-sensitive parameters. Our code is available online at: https://…/AGE.

Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond

For Markov chain Monte Carlo methods, one of the greatest discrepancies between theory and system is the scan order – while most theoretical development on the mixing time analysis deals with random updates, real-world systems are implemented with systematic scans. We bridge this gap for models that exhibit a bipartite structure, including, most notably, the Restricted/Deep Boltzmann Machine. The de facto implementation for these models scans variables in a layerwise fashion. We show that the Gibbs sampler with a layerwise alternating scan order has its relaxation time (in terms of epochs) no larger than that of a random-update Gibbs sampler (in terms of variable updates). We also construct examples to show that this bound is asymptotically tight. Through standard inequalities, our result also implies a comparison on the mixing times.

Emotion in Reinforcement Learning Agents and Robots: A Survey

This article provides the first survey of computational models of emotion in reinforcement learning (RL) agents. The survey focuses on agent/robot emotions, and mostly ignores human user emotions. Emotions are recognized as functional in decision-making by influencing motivation and action selection. Therefore, computational emotion models are usually grounded in the agent’s decision making architecture, of which RL is an important subclass. Studying emotions in RL-based agents is useful for three research fields. For machine learning (ML) researchers, emotion models may improve learning efficiency. For the interactive ML and human-robot interaction (HRI) community, emotions can communicate state and enhance user investment. Lastly, it allows affective modelling (AM) researchers to investigate their emotion theories in a successful AI agent class. This survey provides background on emotion theory and RL. It systematically addresses 1) from what underlying dimensions (e.g., homeostasis, appraisal) emotions can be derived and how these can be modelled in RL-agents, 2) what types of emotions have been derived from these dimensions, and 3) how these emotions may either influence the learning efficiency of the agent or be useful as social signals. We also systematically compare evaluation criteria, and draw connections to important RL sub-domains like (intrinsic) motivation and model-based RL. In short, this survey provides both a practical overview for engineers wanting to implement emotions in their RL agents, and identifies challenges and directions for future emotion-RL research.

Mosquito Detection with Neural Networks: The Buzz of Deep Learning

Many real-world time-series analysis problems are characterised by scarce data. Solutions typically rely on hand-crafted features extracted from the time or frequency domain allied with classification or regression engines which condition on this (often low-dimensional) feature vector. The huge advances enjoyed by many application domains in recent years have been fuelled by the use of deep learning architectures trained on large data sets. This paper presents an application of deep learning for acoustic event detection in a challenging, data-scarce, real-world problem. Our candidate challenge is to accurately detect the presence of a mosquito from its acoustic signature. We develop convolutional neural networks (CNNs) operating on wavelet transformations of audio recordings. Furthermore, we interrogate the network’s predictive power by visualising statistics of network-excitatory samples. These visualisations offer a deep insight into the relative informativeness of components in the detection problem. We include comparisons with conventional classifiers, conditioned on both hand-tuned and generic features, to stress the strength of automatic deep feature learning. Detection is achieved with performance metrics significantly surpassing those of existing algorithmic methods, as well as marginally exceeding those attained by individual human experts.

Constrained Bayesian Networks: Theory, Optimization, and Applications

We develop the theory and practice of an approach to modelling and probabilistic inference in causal networks that is suitable when application-specific or analysis-specific constraints should inform such inference or when little or no data for the learning of causal network structure or probability values at nodes are available. Constrained Bayesian Networks generalize a Bayesian Network such that probabilities can be symbolic, arithmetic expressions and where the meaning of the network is constrained by finitely many formulas from the theory of the reals. A formal semantics for constrained Bayesian Networks over first-order logic of the reals is given, which enables non-linear and non-convex optimisation algorithms that rely on decision procedures for this logic, and supports the composition of several constrained Bayesian Networks. A non-trivial case study in arms control, where few or no data are available to assess the effectiveness of an arms inspection process, evaluates our approach. An open-access prototype implementation of these foundations and their algorithms uses the SMT solver Z3 as decision procedure, leverages an open-source package for Bayesian inference to symbolic computation, and is evaluated experimentally.

Exploiting the Structure via Sketched Gradient Algorithms

Sketched gradient algorithms have been recently introduced for efficiently solving the large-scale constrained Least-squares regressions. In this paper we provide novel convergence analysis for the basic method {\it Gradient Projection Classical Sketch} (GPCS) to reveal the fast linear convergence rate of GPCS towards a vicinity of the solution thanks to the intrinsic low-dimensional geometric structure of the solution prompted by constraint set. Similar to our analysis we observe computational and sketch size trade-offs in numerical experiments. Hence we justify that the combination of gradient methods and the sketching technique is a way of designing efficient algorithms which can actively exploit the low-dimensional structure to accelerate computation in large scale data regression and signal processing applications.

Probabilistic Matrix Factorization for Automated Machine Learning

In order to achieve state-of-the-art performance, modern machine learning techniques require careful data pre-processing and hyperparameter tuning. Moreover, given the ever increasing number of machine learning models being developed, model selection is becoming increasingly important. Automating the selection and tuning of machine learning pipelines consisting of data pre-processing methods and machine learning models, has long been one of the goals of the machine learning community. In this paper, we tackle this meta-learning task by combining ideas from collaborative filtering and Bayesian optimization. Using probabilistic matrix factorization techniques and acquisition functions from Bayesian optimization, we exploit experiments performed in hundreds of different datasets to guide the exploration of the space of possible pipelines. In our experiments, we show that our approach quickly identifies high-performing pipelines across a wide range of datasets, significantly outperforming the current state-of-the-art.

Curiosity-driven Exploration by Self-supervised Prediction

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be useful later in its life. We formulate curiosity as the error in an agent’s ability to predict the consequence of its own actions in a visual feature space learned by a self-supervised inverse dynamics model. Our formulation scales to high-dimensional continuous state spaces like images, bypasses the difficulties of directly predicting pixels, and, critically, ignores the aspects of the environment that cannot affect the agent. The proposed approach is evaluated in two environments: VizDoom and Super Mario Bros. Three broad settings are investigated: 1) sparse extrinsic reward, where curiosity allows for far fewer interactions with the environment to reach the goal; 2) exploration with no extrinsic reward, where curiosity pushes the agent to explore more efficiently; and 3) generalization to unseen scenarios (e.g. new levels of the same game) where the knowledge gained from earlier experience helps the agent explore new places much faster than starting from scratch. Demo video and code available at https://…/noreward-rl

Deep Learning Microscopy

Progression of Decomposed Local-Effect Action Theories

A catalogue of 4-regular matchstick graphs with 63 – 70 vertices

Hadamard partitioned difference families

Person Re-Identification by Deep Joint Learning of Multi-Loss Classification

Fundamental Limits of DNA Storage Systems

On the weak Roman domination number of lexicographic product graphs

Graphs with $α_1$ and $τ_1$ both large

Gabor Filter Assisted Energy Efficient Fast Learning Convolutional Neural Networks

cpr: An R Package For Finding Parsimonious B-Spline Regression Models via Control Polygon Reduction and Control Net Reduction

Multiple Target, Multiple Type Filtering in RFS Framework

Inference on Breakdown Frontiers

Advancing Consumer Adoption of Blockchain Applications

Characterizing Time Series via Complexity-Entropy Curves

On structure testing for component covariance matrices of a high-dimensional mixture

A hierarchical renormalization model: some properties and open questions

ShortFuse: Biomedical Time Series Representations in the Presence of Structured Information

Asymptotic distribution of fixed points of pattern-avoiding involutions

Benchmark for Complex Answer Retrieval

Automatically Redundant Features Removal for Unsupervised Feature Selection via Sparse Feature Graph

Recurrence Analysis of Vegetation Time Series and Phase Transitions in Mediterranean Rangelands

Learning Semantic Correspondences in Technical Documentation

Tests for comparing time-invariant and time-varying spectra based on the Anderson-Darling statistic

Combination of Hidden Markov Random Field and Conjugate Gradient for Brain Image Segmentation

Extracting urban impervious surface from GF-1 imagery using one-class classifiers

Deep neural networks on graph signals for brain imaging analysis

Which Broadcast Abstraction Captures $k$-Set Agreement?

Revisiting IM2GPS in the Deep Learning Era

Annotating and Modeling Empathy in Spoken Conversations

Sublogarithmic Distributed Algorithms for Lóvasz Local lemma, and the Complexity Hierarchy

On disjoint $(v,k,k-1)$ difference families

Entanglement production by evolution operator

A note on intrinsic Conditional Autoregressive models for disconnected graphs

Blind Regression via Nearest Neighbors under Latent Variable Models

A shortcut for IMEX methods: integrate the residual explicitly

Fast GPU-Based Seismogram Simulation from Microseismic Events in Marine Environments Using Heterogeneous Velocity Models

On differences between DP-coloring and list coloring

Singular Spectrum and Recent Results on Hierarchical Operators

Awareness improves problem-solving performance

Learning task structure via sparsity grouped multitask learning

Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks

Evaluation complexity bounds for smooth constrained nonlinear optimisation using scaled KKT conditions, high-order models and the criticality measure $χ$

Faster and Simpler Distributed Algorithms for Testing and Correcting Graph Properties in the CONGEST-Model

Enumerative properties of Grid-Associahedra

The Kuramoto model on power law graphs

Full-Duplex Massive MIMO Relaying Systems with Low-Resolution ADCs

Spatial-Temporal Union of Subspaces for Multi-body Non-rigid Structure-from-Motion

Discovery and visualization of structural biomarkers from MRI using transport-based morphometry

Gland Segmentation in Histopathology Images Using Random Forest Guided Boundary Construction

Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization

A Closed-Form Model for Image-Based Distant Lighting

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

Locally finite trees and the topological minor relation

Analog Beamsteering for Flexible Hybrid Beamforming Design in Mmwave Communications

Impact of Major RF Impairments on mm-wave Communications using OFDM Waveforms

Sequential Hybrid Beamforming Design for Multi-Link mmwave Communication

Vizing-type bounds for graphs with induced subgraph restrictions

On the subsemigroup complex of an aperiodic Brandt semigroup

A simplex-type algorithm for continuous linear programs with constant coefficients

Interpolated Schur multiple zeta values

Relaxation heuristics for the set multicover problem with generalized upper bound constraints

Musical Instrument Recognition Using Their Distinctive Characteristics in Artificial Neural Networks

On the maximal halfspace depth of permutation-invariant distributions on the simplex

Small ball probabilities for certain gaussian fields

Irregular Recovery and Unequal Locality for Locally Recoverable Codes with Availability

Minimax Risk for Missing Mass Estimation

Laplacian Spectra of Regular Graph Transformations

A Correspondence Relaxation Approach for 3D Shape Reconstruction

Discrete-Continuous Splitting for Weakly Supervised Learning

Acyclic edge-coloring of planar graphs: $Δ$ colors suffice when $Δ$ is large

Max-min Fair Beamforming for SWIPT Systems with Non-linear EH Model

Information Leakage Games

Discrete Sequential Prediction of Continuous Actions for Deep RL

Joint Modeling of Content and Discourse Relations in Dialogues

Winning on the Merits: The Joint Effects of Content and Style on Debate Outcomes

Generalizing a partition theorem of Andrews

May-Wigner transition in large random dynamical systems

The variance of the $\ell_p^n$-norm of the Gaussian vector, and Dvoretzky’s theorem

Learning-aided Stochastic Network Optimization with Imperfect State Prediction

Capacity of Some Index Coding Problems with Symmetric Neighboring Interference

A matrix approach to Yang multiplication theorem

Interior polynomial for signed bipartite graphs and the HOMFLY polynomial

AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles

Statistical Physics and Information Theory Perspectives on Linear Inverse Problems

Berthil Cepstrum: a Novel Vibration Analysis Method based on Marginal Hilbert Spectrum Applied to Artificial Motor Aging

Single Image Super-Resolution Using Multi-Scale Convolutional Neural Network

Simulated Penetration Testing and Mitigation Analysis

A novel conductivity mechanism of highly disordered carbon systems based on an investigation of graph zeta function

Bandit Regret Scaling with the Effective Loss Range

Quantifying Aspect Bias in Ordinal Ratings using a Bayesian Approach

Learning Semantics for Image Annotation

Generative Adversarial Networks for Multimodal Representation Learning in Video Hyperlinking

Assembling sequences of DNA using an on-line algorithm based on DeBruijn graphs

Kernel Truncated Regression Representation for Robust Subspace Clustering

Tuning Modular Networks with Weighted Losses for Hand-Eye Coordination

Extended T-process Regression Models

A Perceptually Weighted Rank Correlation Indicator for Objective Image Quality Assessment

Poincaré and logarithmic Sobolev constants for metastable Markov chains via capacitary inequalities

Joint Design of Multi-Tap Analog Cancellation and Digital Beamforming for Reduced Complexity Full Duplex MIMO Systems

Cherlin’s conjecture for sporadic simple groups

Sensitivity of directed networks to the addition and pruning of edges and vertices

An Alternative Globalization Strategy for Unconstrained Optimization

Diffusion in time-dependent random media and the Kardar-Parisi-Zhang equation

Effective Capacity of Licensed-Assisted Access in Unlicensed Spectrum for 5G: From Theory to Application

Generalized eigenvalue problems for meet and join matrices on semilattices

Continuous Diffraction of Molecules and Disordered Molecular Crystals

Determinantal point process mixtures via spectral density approach

Representation learning of drug and disease terms for drug repositioning

Enumeration of meanders and Masur-Veech volumes

Total variation distance between stochastic polynomials and invariance principles

Convex Coupled Matrix and Tensor Completion

ResumeVis: A Visual Analytics System to Discover Semantic Information in Semi-structured Resume Data

Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling

New bounds on the number of n-queens configurations

Spatial forecast of aggregated electrical load for long-term horizon

Hanani-Tutte for approximating maps of graphs

A pedestrian hopping model and traffic light scheduling for pedestrian-vehicle mixed-flow networks

Joint Transmission with Limited Backhaul Connectivity

Strategically knowing how

A Novel Transmission Scheme for the $K$-user Broadcast Channel with Delayed CSIT

A Deterministic Sparse FFT for Functions with Structured Fourier Sparsity

Comparison of Maximum Likelihood and GAN-based training of Real NVPs

Extending Defensive Distillation

Non-separable covariance models for spatio-temporal data, with applications to neural encoding analysis

Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis

Gorenstein simplices with a given $δ$-polynomial

View-invariant Gait Recognition through Genetic Template Segmentation

Unimodal probability distributions for ordinal classification

Convergence analysis of a family of robust Kalman filters based on the contraction principle

Understanding Information Transmission in Complex Networks

Non-universality in the erosion of tilted landscapes

Constructing a Consensus Phylogeny from a Leaf-Removal Distance

Equidistributions of Mahonian statistics over pattern avoiding permutations

Quantitative stochastic homogenization and large-scale regularity

Back to RGB: 3D tracking of hands and hand-object interactions based on short-baseline stereo

Optimal hypothesis testing for stochastic block models with growing degrees

Comparing Titles vs. Full-text for Multi-Label Classification of Scientific Papers and News Articles

Single-cluster PHD filter methods for joint multi-object filtering and parameter estimation

Exploiting the Pruning Power of Strong Local Consistencies Through Parallelization

Conflict-free connection numbers of line graphs

Scalable Mobile Crowdsensing via Peer-to-Peer Data Sharing

Boundary regularity of stochastic PDEs

Maximum Selection and Ranking under Noisy Comparisons