Learning a Policy for Opportunistic Active Learning

Active learning identifies data points to label that are expected to be the most useful in improving a supervised model. Opportunistic active learning incorporates active learning into interactive tasks that constrain possible queries during interactions. Prior work has shown that opportunistic active learning can be used to improve grounding of natural language descriptions in an interactive object retrieval task. In this work, we use reinforcement learning for such an object retrieval task, to learn a policy that effectively trades off task completion with model improvement that would benefit future tasks.

Differentially Private Change-Point Detection

The change-point detection problem seeks to identify distributional changes at an unknown change-point k* in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results.
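
For illustration only, one way to make an offline change-point estimate private is to add Laplace noise to every candidate's log-likelihood-ratio score and report the noisy maximizer. The abstract does not describe the algorithms, so the following is a minimal sketch under stated assumptions rather than the paper's method: Bernoulli data, known pre- and post-change parameters `p0` and `p1`, a hypothetical function name, and a noise scale of 2A/epsilon taken from the standard report-noisy-max argument.

```python
import math
import random


def private_change_point(xs, p0, p1, epsilon, rng):
    """Noisy estimate of the index where Bernoulli(p0) data switches to Bernoulli(p1)."""
    # Per-sample log-likelihood ratio log(P1(x) / P0(x)) for x in {0, 1}.
    llr = {1: math.log(p1 / p0), 0: math.log((1 - p1) / (1 - p0))}
    # Changing one sample moves any candidate score by at most this amount.
    sensitivity = abs(llr[1] - llr[0])
    # Assumed noise scale (standard report-noisy-max analysis, not from the abstract).
    scale = 2.0 * sensitivity / epsilon

    best_k, best_score = None, -math.inf
    suffix = 0.0
    # Score of candidate k is the sum of log-likelihood ratios over xs[k:].
    for k in range(len(xs) - 1, -1, -1):
        suffix += llr[xs[k]]
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        if suffix + noise > best_score:
            best_k, best_score = k, suffix + noise
    return best_k


if __name__ == "__main__":
    data = [0] * 50 + [1] * 50
    print(private_change_point(data, 0.2, 0.8, epsilon=1.0, rng=random.Random(0)))
```

Smaller values of `epsilon` give more noise and therefore a less accurate estimate, which is the privacy-accuracy trade-off the abstract refers to.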

Generalize Symbolic Knowledge With Neural Rule Engine

Neural-symbolic learning aims to combine the advantages of neural networks and symbolic knowledge to build better intelligent systems. As neural networks have come to dominate the state-of-the-art results in a wide range of NLP tasks, improving the performance of neural models by integrating symbolic knowledge has attracted considerable attention. Unlike existing work, this paper investigates the combination of these two powerful paradigms from the knowledge-driven side. We propose Neural Rule Engine (NRE), which can learn knowledge explicitly from logic rules and then generalize it implicitly with neural networks. NRE is implemented with neural module networks in which each module represents an action of the logic rule. The experiments show that NRE can greatly improve the generalization abilities of logic rules, with a significant increase in recall, while precision is maintained at a high level.

IEA: Inner Ensemble Average within a convolutional neural network

Ensemble learning is a method of combining multiple trained models to improve model accuracy. We introduce the use of such methods, specifically ensemble averaging, inside Convolutional Neural Network (CNN) architectures. Replacing each single convolutional neural layer (CNL) inside the CNN architecture with an Inner Ensemble Average (IEA) of multiple CNLs increases the accuracy of the CNN. A visual analysis and a similarity score analysis of the features generated by IEA explain why it boosts model performance. Empirical results on different benchmark datasets and well-known deep model architectures show that IEA outperforms the ordinary CNL used in CNNs.
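
The idea of replacing one convolutional layer with an average over several can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses 1-D signals and pure Python, and it assumes that each member layer applies a ReLU before averaging (without a nonlinearity, averaging linear convolutions would collapse to a single convolution with the averaged kernel). The function names are hypothetical.

```python
def conv1d(signal, kernel):
    """Valid (no padding) 1-D cross-correlation of signal with kernel."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) for i in range(n - k + 1)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def inner_ensemble_average(signal, kernels):
    """Average the activations of several convolutional layers applied to one input."""
    # Each kernel plays the role of one member layer of the inner ensemble.
    outputs = [relu(conv1d(signal, kernel)) for kernel in kernels]
    m = len(outputs)
    # Average position by position across the m member outputs.
    return [sum(column) / m for column in zip(*outputs)]


if __name__ == "__main__":
    print(inner_ensemble_average([1, 2, 3, 4], [[1, 0], [0, 1], [-1, -1]]))
```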

Nested multi-instance classification

There are classification tasks that take as inputs groups of images rather than single images. In order to address such situations, we introduce a nested multi-instance deep network. The approach is generic in that it is applicable to general data instances, not just images. The network has several convolutional neural networks grouped together at different stages. This differs from previous work primarily in that we organize instances into relevant groups that are treated differently. We also introduce a method for replacing missing instances that successfully creates neutral input instances and consistently outperforms standard fill-in methods in real-world use cases. In addition, we propose a method for manual dropout when a whole group of instances is missing that allows us to use richer training data and obtain higher accuracy at the end of training. With specific pretraining, we find that the model works to great effect on our real-world and public datasets in comparison to baseline methods, justifying the different treatment among groups of instances.

Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real-world applications, including those where security is of great concern. Such popularity, however, may attract attackers to exploit the vulnerabilities of the deployed deep learning models and launch attacks against security-sensitive applications. In this paper, we focus on a specific type of data poisoning attack, which we refer to as a backdoor injection attack. The main goal of the adversary performing such an attack is to generate and inject a backdoor into a deep learning model that can be triggered to recognize certain embedded patterns with a target label of the attacker’s choice. Additionally, a backdoor injection attack should occur in a stealthy manner, without undermining the efficacy of the victim model. Specifically, we propose two approaches for generating a backdoor that is hardly perceptible yet effective in poisoning the model. We consider two attack settings, with backdoor injection carried out either before model training or during model updating. We carry out extensive experimental evaluations under various assumptions on the adversary model, and demonstrate that such attacks can be effective and achieve a high attack success rate (above 90%) at a small cost of model accuracy loss (below 1%) with a small injection rate (around 1%), even under the weakest assumption wherein the adversary has no knowledge either of the original training data or the classifier model.

Understanding Latent Factors Using a GWAP

Recommender systems relying on latent factor models often appear as black boxes to their users. Semantic descriptions for the factors might help to mitigate this problem. Achieving this automatically is, however, a non-straightforward task due to the models’ statistical nature. We present an output-agreement game that represents factors by means of sample items and motivates players to create such descriptions. A user study shows that the collected output actually reflects real-world characteristics of the factors.

Analyze Unstructured Data Patterns for Conceptual Representation

Online news media provide aggregated news and stories from different sources all over the world, along with up-to-date news coverage. The main goal of this study is to provide a solution that serves as a homogeneous source for the news and represents the news in a new conceptual framework. Furthermore, the user can quickly find different updated news items through the designed interface. The mobile app implementation is based on modeling the multi-level conceptual analysis discipline. The main concepts of a domain are captured from hidden unstructured data that are analyzed by the proposed solution. Concepts are discovered by analyzing data patterns and are structured into a tree-based interface that lets the end user navigate easily through the discovered news concepts. Our final experimental results show that analyzing the news before displaying it to the end user, and restructuring the final output in a conceptual multi-level structure, produces a new display frame in which the end user can find information related to their interests.

Reasoning about Actions and State Changes by Injecting Commonsense Knowledge

Comprehending procedural text, e.g., a paragraph describing photosynthesis, requires modeling actions and the state changes they produce, so that questions about entities at different timepoints can be answered. Although several recent systems have shown impressive progress in this task, their predictions can be globally inconsistent or highly improbable. In this paper, we show how the predicted effects of actions in the context of a paragraph can be improved in two ways: (1) by incorporating global, commonsense constraints (e.g., a non-existent entity cannot be destroyed), and (2) by biasing reading with preferences from large-scale corpora (e.g., trees rarely move). Unlike earlier methods, we treat the problem as a neural structured prediction task, allowing hard and soft constraints to steer the model away from unlikely predictions. We show that the new model significantly outperforms earlier systems on a benchmark dataset for procedural text comprehension (+8% relative gain), and that it also avoids some of the nonsensical predictions that earlier systems make.

Retrieval-Based Neural Code Generation

In models to generate program source code from natural language, representing this code in a tree structure has been a common approach. However, existing methods often fail to generate complex code correctly due to a lack of ability to memorize large and complex structures. We introduce ReCode, a method based on subtree retrieval that makes it possible to explicitly reference existing code examples within a neural code generation model. First, we retrieve sentences that are similar to input sentences using a dynamic-programming-based sentence similarity scoring method. Next, we extract n-grams of action sequences that build the associated abstract syntax tree. Finally, we increase the probability of actions that cause the retrieved n-gram action subtree to be in the predicted code. We show that our approach improves the performance on two code generation tasks by up to +2.6 BLEU.
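
The final step described above, raising the probability of actions that would place a retrieved n-gram in the output, can be sketched as follows. This is a simplified illustration under stated assumptions, not ReCode's actual scoring: actions are plain strings, a fixed additive `bonus` is applied in log space, and an action is boosted when the decoding history ends with the n-gram prefix that precedes it. The function name and parameters are hypothetical.

```python
import math


def boost_actions(log_probs, history, ngrams, bonus=1.0):
    """Add a bonus to actions that continue a retrieved n-gram, then renormalize."""
    to_boost = set()
    for gram in ngrams:
        for i, action in enumerate(gram):
            prefix = tuple(gram[:i])
            # An empty prefix always matches: the action would start the n-gram.
            matches = i == 0 or tuple(history[-i:]) == prefix
            if matches and action in log_probs:
                to_boost.add(action)

    # Each action is boosted at most once, however many n-grams suggest it.
    scores = {a: lp + (bonus if a in to_boost else 0.0) for a, lp in log_probs.items()}

    # Renormalize with a numerically stable log-sum-exp.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {a: s - log_z for a, s in scores.items()}


if __name__ == "__main__":
    dist = {"A": math.log(0.5), "B": math.log(0.25), "C": math.log(0.25)}
    boosted = boost_actions(dist, ["X", "Y"], [("Y", "B")], bonus=math.log(2))
    print({a: round(math.exp(lp), 3) for a, lp in boosted.items()})
```

Working in log space keeps the adjustment compatible with beam search, where hypotheses are usually ranked by summed log-probabilities.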

Modeling OWL with Rules: The ROWL Protege Plugin

In our experience, some ontology users find it much easier to convey logical statements using rules rather than OWL (or description logic) axioms. Based on recent theoretical developments on transformations between rules and description logics, we develop ROWL, a Protege plugin that allows users to enter OWL axioms by way of rules; the plugin then automatically converts these rules into OWL DL axioms if possible, and prompts the user in case such a conversion is not possible without weakening the semantics of the rule.

Rule-based OWL Modeling with ROWLTab Protege Plugin

It has been argued that it is much easier to convey logical statements using rules rather than OWL (or description logic (DL)) axioms. Based on recent theoretical developments on transformations between rules and DLs, we have developed ROWLTab, a Protege plugin that allows users to enter OWL axioms by way of rules; the plugin then automatically converts these rules into OWL 2 DL axioms if possible, and prompts the user in case such a conversion is not possible without weakening the semantics of the rule. In this paper, we present ROWLTab, together with a user evaluation of its effectiveness compared to entering axioms using the standard Protege interface. Our evaluation shows that modeling with ROWLTab is much quicker than the standard interface, while at the same time, also less prone to errors for hard modeling tasks.

An Introduction to Inductive Statistical Inference — from Parameter Estimation to Decision-Making

These lecture notes are aimed at a post-Bachelor audience with a background at an introductory level in Applied Mathematics and Applied Statistics. They discuss the logic and methodology of the Bayes-Laplace approach to inductive statistical inference that places common sense and the guiding lines of the scientific method at the heart of systematic analyses of quantitative-empirical data. Following an exposition of exactly solvable cases of single- and two-parameter estimation, the main focus is laid on Markov Chain Monte Carlo (MCMC) simulations on the basis of Gibbs sampling and Hamiltonian Monte Carlo sampling of posterior joint probability distributions for regression parameters occurring in generalised linear models. The modelling of fixed as well as of varying effects (varying intercepts) is considered, and the simulation of posterior predictive distributions is outlined. The issues of model comparison with Bayes factors and the assessment of models’ relative posterior predictive accuracy with information entropy-based criteria DIC and WAIC are addressed. Concluding, a conceptual link to the behavioural subjective expected utility representation of a single decision-maker’s choice behaviour in static one-shot decision problems is established. Codes for MCMC simulations of multi-dimensional posterior joint probability distributions with the JAGS and Stan packages implemented in the statistical software R are provided. The lecture notes are fully hyperlinked. They direct the reader to original scientific research papers and to pertinent biographical information.

Gaussian Mixture Generative Adversarial Networks for Diverse Datasets, and the Unsupervised Clustering of Images

Generative Adversarial Networks (GANs) have been shown to produce realistic-looking synthetic images with remarkable success, yet their performance seems less impressive when the training set is highly diverse. In order to provide a better fit to the target data distribution when the dataset includes many different classes, we propose a variant of the basic GAN model, called Gaussian Mixture GAN (GM-GAN), where the probability distribution over the latent space is a mixture of Gaussians. We also propose a supervised variant which is capable of conditional sample synthesis. In order to evaluate the model’s performance, we propose a new scoring method which separately takes into account two (typically conflicting) measures – diversity vs. quality of the generated data. Through a series of empirical experiments, using both synthetic and real-world datasets, we quantitatively show that GM-GANs outperform baselines, both when evaluated using the commonly used Inception Score, and when evaluated using our own alternative scoring method. In addition, we qualitatively demonstrate how the unsupervised variant of GM-GAN tends to map latent vectors sampled from different Gaussians in the latent space to samples of different classes in the data space. We show how this phenomenon can be exploited for the task of unsupervised clustering, and provide quantitative evaluation showing the superiority of our method for the unsupervised clustering of image datasets. Finally, we demonstrate a feature which further sets our model apart from other GAN models: the option to control the quality-diversity trade-off by altering, post-training, the probability distribution of the latent space. This allows one to sample higher quality and lower diversity samples, or vice versa, according to one’s needs.
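
Sampling from a mixture-of-Gaussians latent prior can be sketched as below. This is an illustration only: it assumes uniform mixture weights, isotropic components that share one standard deviation `sigma`, and that the post-training quality-diversity control amounts to rescaling that standard deviation with a hypothetical `sigma_scale` parameter. None of these details are stated in the abstract.

```python
import random


def sample_latent(means, sigma, rng, sigma_scale=1.0):
    """Draw one latent vector from a uniform mixture of isotropic Gaussians."""
    # Pick a component; uniform mixture weights are an assumption.
    k = rng.randrange(len(means))
    # Sample each coordinate around the chosen component mean.
    # sigma_scale < 1 concentrates samples near the means (assumed to favor quality),
    # sigma_scale > 1 spreads them out (assumed to favor diversity).
    z = [rng.gauss(mu, sigma * sigma_scale) for mu in means[k]]
    return k, z


if __name__ == "__main__":
    rng = random.Random(0)
    component_means = [[0.0, 0.0], [5.0, 5.0]]
    for _ in range(3):
        print(sample_latent(component_means, 1.0, rng, sigma_scale=0.5))
```

The returned component index `k` is what an unsupervised clustering use would read off: samples generated from the same component are treated as belonging to the same cluster.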

A Unified Analysis of Stochastic Momentum Methods for Deep Learning

Stochastic momentum methods have been widely adopted in training deep neural networks. However, the theoretical analysis of their convergence on the training objective and of their generalization error in prediction remains under-explored. This paper aims to bridge the gap between practice and theory by analyzing the stochastic gradient (SG) method, and the stochastic momentum methods including two famous variants, i.e., the stochastic heavy-ball (SHB) method and the stochastic variant of Nesterov’s accelerated gradient (SNAG) method. We propose a framework that unifies the three variants. We then derive the convergence rates of the norm of gradient for the non-convex optimization problem, and analyze the generalization performance through the uniform stability approach. In particular, the convergence analysis of the training objective shows that SHB and SNAG have no advantage over SG. However, the stability analysis shows that the momentum term can improve the stability of the learned model and hence improve the generalization performance. These theoretical insights verify the common wisdom and are also corroborated by our empirical analysis on deep learning.
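
A single update rule can recover all three methods, as the following sketch shows. It uses one possible parameterization with an interpolation parameter `s`: setting `s = 0` gives the heavy-ball update, `s = 1` gives Nesterov's update, and `beta = 0` gives plain gradient descent. Whether this matches the framework in the paper exactly is an assumption. The example is scalar, and the `grad` callable stands in for a (possibly stochastic) gradient oracle.

```python
def unified_momentum(grad, x0, alpha, beta, s, steps):
    """Run a momentum method that interpolates between heavy-ball and Nesterov.

    y_{k+1}   = x_k - alpha * g(x_k)
    ys_{k+1}  = x_k - s * alpha * g(x_k)
    x_{k+1}   = y_{k+1} + beta * (ys_{k+1} - ys_k)
    """
    x = x0
    ys_prev = x0  # initialization ys_0 = x_0 is an assumption
    for _ in range(steps):
        g = grad(x)
        y = x - alpha * g        # plain gradient step
        ys = x - s * alpha * g   # auxiliary sequence controlled by s
        x = y + beta * (ys - ys_prev)
        ys_prev = ys
    return x


if __name__ == "__main__":
    quadratic_grad = lambda x: x  # gradient of f(x) = x^2 / 2
    for name, s in [("heavy-ball", 0.0), ("Nesterov", 1.0)]:
        print(name, unified_momentum(quadratic_grad, 1.0, 0.1, 0.9, s, 200))
```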

Towards Reproducible Empirical Research in Meta-Learning

Meta-learning is increasingly used to support the recommendation of machine learning algorithms and their configurations. Such recommendations are made based on meta-data, consisting of performance evaluations of algorithms on prior datasets, as well as characterizations of these datasets. These characterizations, also called meta-features, describe properties of the data which are predictive for the performance of machine learning algorithms trained on them. Unfortunately, despite being used in a large number of studies, meta-features are not uniformly described and computed, making many empirical studies irreproducible and hard to compare. This paper aims to remedy this by systematizing and standardizing data characterization measures used in meta-learning, and performing an in-depth analysis of their utility. Moreover, it presents MFE, a new tool for extracting meta-features from datasets, and identifies more subtle reproducibility issues in the literature, proposing guidelines for data characterization that strengthen reproducible empirical research in meta-learning.

Reinforcement Learning Testbed for Power-Consumption Optimization
Learning End-to-End Goal-Oriented Dialog with Multiple Answers
On the Performance of a Relay-Assisted Multi-Hop Asymmetric FSO/RF Communication System over Negative Exponential atmospheric turbulence with the effect of pointing error
Gallai-Ramsey numbers of $C_{10}$ and $C_{12}$
QuasarNET: Human-level spectral classification and redshifting with Deep Neural Networks
On the Wiener Index of Uniform Unicyclic Hypergraphs
A study of integer sorting on multicores
Symbolic regression based genetic approximations of the Colebrook equation for flow friction
Semi-Metrification of the Dynamic Time Warping Distance
Centroid estimation based on symmetric KL divergence for Multinomial text classification problem
ABHY Associahedra and Newton polytopes of $F$-polynomials for finite type cluster algebras
Submodular Maximization with Packing Constraints in Parallel
MemComputing Integer Linear Programming
Grammar Induction with Neural Language Models: An Unusual Replication
Interpretable Intuitive Physics Model
Correcting Length Bias in Neural Machine Translation
Fast and accessible first-principles calculations of vibrational properties of materials
Group calibration is a byproduct of unconstrained learning
Consistent Sampling with Replacement
Note on the group edge irregularity strength of graphs
Adaptative significance levels in normal mean hypothesis testing
Model Predictive Control for Regular Linear Systems
Hard Non-Monotonic Attention for Character-Level Transduction
Physically-inspired Gaussian processes for transcriptional regulation in Drosophila melanogaster
Recommendation Through Mixtures of Heterogeneous Item Relationships
The Impact of Preprocessing on Deep Representations for Iris Recognition on Unconstrained Environments
The Fundamental Morphism Theorem in the Categories of Graphs & Graph Reconstruction
Theoretical Linear Convergence of Unfolded ISTA and its Practical Weights and Thresholds
AAD: Adaptive Anomaly Detection through traffic surveillance videos
Improved Upper Bounds for Gallai-Ramsey Numbers of Odd Cycles
Zero-Shot Adaptive Transfer for Conversational Language Understanding
Quadratic Discriminant Analysis under Moderate Dimension
A polynomial-time algorithm for median-closed semilinear constraints
Super-Resolution for Hyperspectral and Multispectral Image Fusion Accounting for Seasonal Spectral Variability
Rational Neural Networks for Approximating Jump Discontinuities of Graph Convolution Operator
The generalized connectivity of some regular graphs
Towards Effective Deep Embedding for Zero-Shot Learning
Discriminative Learning of Similarity and Group Equivariant Representations
DCSM Protocol for Content Transfer in Deep Space Network
Decentralized Detection with Robust Information Privacy Protection
Differential and integral invariants under Mobius transformation
Artifacts Detection and Error Block Analysis from Broadcasted Videos
Maximum likelihood estimator and its consistency for an $(L,1)$ random walk in a parametric random environment
CNN-PS: CNN-based Photometric Stereo for General Non-Convex Surfaces
Robust Wireless Body Area Networks Coexistence: A Game Theoretic Approach to Time-Division MAC
Profiling and Improving the Duty-Cycling Performance of Linux-based IoT Devices
Optimality conditions for approximate Pareto solutions of a nonsmooth vector optimization problem with an infinite number of constraints
DP-ADMM: ADMM-based Distributed Learning with Differential Privacy
OWLAx: A Protege Plugin to Support Ontology Axiomatization through Diagramming
Geometric Kinematic Control of a Spherical Rolling Robot
Story Ending Generation with Incremental Encoding and Commonsense Knowledge
Space-Time Block Coding Based Beamforming for Beam Squint Compensation
A combinatorial property of flows on a cycle
ExpIt-OOS: Towards Learning from Planning in Imperfect Information Games
A Divergence Proof for Latuszynski’s Counter-Example Approaching Infinity with Probability ‘Near’ One
Learning Neural Templates for Text Generation
Bipartite Ramsey numbers of large cycles
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Reducing post-surgery recovery bed occupancy through an analytical prediction model
The real-time reactive surgical case sequencing problem
Baidu Apollo Auto-Calibration System – An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm
Recognizing Generating Subgraphs in Graphs without Cycles of Lengths 6 and 7
The reactive multiple operating room surgical case sequencing problem
Direct Output Connection for a High-Rank Language Model
Dense Scene Flow from Stereo Disparity and Optical Flow
VirtualIdentity: Privacy-Preserving User Profiling
Optimal Control of the Linear Wave Equation by Time-Depending BV-Controls: A Semi-Smooth Newton Approach
Time-Reversal of Coalescing Diffusive Flows and Weak Convergence of Localized Disturbance Flows
Uncovering intimate and casual relationships from mobile phone communication
A Variational Feature Encoding Method of 3D Object for Probabilistic Semantic SLAM
Minimal inference from incomplete 2×2-tables
Optimal shrinkage covariance matrix estimation under random sampling from elliptical distributions
Sensitivity, Affine Transforms and Quantum Communication Complexity
Towards a Better Metric for Evaluating Question Generation Systems
Pronoun Translation in English-French Machine Translation: An Analysis of Error Types
Leadership in Singleton Congestion Games: What is Hard and What is Easy
Outage Probability of Millimeter Wave Cellular Uplink with Truncated Power Control
Minimal forward random point attractors need not exist
A List of Problems on the Reverse Mathematics of Ramsey Theory on the Rado Graph and on Infinite, Finitely Branching Trees
Automated Scene Flow Data Generation for Training and Verification
Learning to adapt: a meta-learning approach for speaker adaptation
Deciding Robust Feasibility and Infeasibility Using a Set Containment Approach: An Application to Stationary Passive Gas Network Operations
Comparative Studies of Detecting Abusive Language on Twitter
Capacity of Locally Recoverable Codes
Multi-Source Syntactic Neural Machine Translation
Acquiring Annotated Data with Cross-lingual Explicitation for Implicit Discourse Relation Classification
Hybrid Joint Diagonalization Algorithms
Self-stabilizing Overlays for high-dimensional Monotonic Searchability
Diagrammatic proof of the large $N$ melonic dominance in the SYK model
Fully Dynamic MIS in Uniformly Sparse Graphs
PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors
Asymptotically Optimal Codes Correcting Fixed-Length Duplication Errors in DNA Storage Systems
A Coordinate-Free Construction of Scalable Natural Gradient
A categorification of biclosed sets of strings
Large-Scale Cover Song Detection in Digital Music Libraries Using Metadata, Lyrics and Audio Features
A structure theorem for stochastic processes indexed by the discrete hypercube
An Exponential Cox-Ingersoll-Ross Process as Discounting Factor
Asymptotic optimality of degree-greedy discovering of independent sets in Configuration Model graphs
Pathwise Uniqueness for SDEs with Singular Drift and Nonconstant Diffusion: A simple proof
Algorithms and Bounds for Drawing Directed Graphs
Parametric Topology Optimization with Multi-Resolution Finite Element Models
Robot_gym: accelerated robot training through simulation in the cloud with ROS and Gazebo
Improved approximation algorithms for hitting 3-vertex paths
Metallic glasses for spintronics: anomalous temperature dependence and giant enhancement of inverse spin Hall effect
High-Performance Multi-Mode Ptychography Reconstruction on Distributed GPUs
Asymptotic analysis of the Friedkin-Johnsen model when the matrix of the susceptibility weights approaches the identity matrix
$K_4$-subdivisions have the edge-Erdös-Pósa property
Deep Chronnectome Learning via Full Bidirectional Long Short-Term Memory Networks for MCI Diagnosis
Learning End-to-end Autonomous Driving using Guided Auxiliary Supervision
Lyashko-Looijenga morphisms and primitive factorizations of the Coxeter element
Modeling Empathy and Distress in Reaction to News Stories
A Radix-M Construction for Complementary Sets
Local bounds for stochastic reaction diffusion equations
Accelerating Parallel Tempering: Quantile Tempering Algorithm (QuanTA)
On Subadditive Duality for Conic Mixed-Integer Programs
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation
Ramsey problems for Berge hypergraphs
Geometry of $\ell_p^n$-balls: Classical results and recent developments
Bifurcations in the time-delayed Kuramoto model of coupled oscillators: Exact results
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection