Event Schema Induction using Tensor Factorization with Back-off

The goal of Event Schema Induction (ESI) is to identify schemas of events from a corpus of documents. For example, given documents from the sports domain, we would like to infer that win(WinningPlayer, Trophy, OpponentPlayer, Location) is an important event schema for this domain. Automatic discovery of such event schemas is an important first step towards building domain-specific Knowledge Graphs (KGs). ESI has been the focus of some prior research, with generative models achieving the best performance. In this paper, we propose TFB, a tensor factorization-based method with back-off for ESI. TFB solves a novel objective to factorize Open Information Extraction (OpenIE) tuples for inducing binary schemas. Event schemas are induced out of this set of binary schemas by solving a constrained clique problem. To the best of our knowledge, this is the first application of tensor factorization to the ESI problem. TFB outperforms the current state of the art by 52 absolute points in accuracy, while achieving a 90x speedup on average. We hope to make all the code and datasets used in the paper publicly available upon publication of the paper.

A causal framework for explaining the predictions of black-box sequence-to-sequence models

We interpret the predictions of any black-box structured input-structured output model around a specific input-output pair. Our method returns an “explanation” consisting of groups of input-output tokens that are causally related. Our method infers these dependencies by querying the model with perturbed inputs, generating a graph over tokens from the responses, and solving a partitioning problem to select the most relevant components. We focus the general approach on sequence-to-sequence problems, adopting a variational autoencoder to yield meaningful input perturbations. We test our method across several NLP sequence generation tasks.

Significance of Disk Failure Prediction in Datacenters

Modern datacenters assemble a very large number of disk drives under a single roof. Even if economic and technical factors were to make individual drives more reliable (which is not at all clear, given the commoditization of the technology), their sheer numbers, combined with their ever-increasing utilization in a well-balanced design, make achieving storage reliability a major challenge. In this paper, we assess the challenge of storage system reliability in the modern datacenter, and demonstrate how good disk failure prediction models can significantly improve the reliability of such systems.
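
The scale argument can be illustrated with a short calculation. This is only a sketch, not the paper's model: it assumes drives fail independently with the same annual failure rate, and the fleet size and rate used in the usage note are hypothetical values chosen for illustration.

```python
def prob_any_failure(n_drives, annual_failure_rate):
    """Probability that at least one of n_drives fails within a year.

    Assumes independent failures with an identical rate (a simplification).
    """
    return 1.0 - (1.0 - annual_failure_rate) ** n_drives


def expected_failures(n_drives, annual_failure_rate):
    """Expected number of drive failures per year under the same assumption."""
    return n_drives * annual_failure_rate


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 drives, 2% annual failure rate per drive.
    print(prob_any_failure(10_000, 0.02))
    print(expected_failures(10_000, 0.02))
```

Under these assumptions a single drive is unlikely to fail in a given year, yet a failure somewhere in a large fleet is close to certain, which is why reliability has to be handled at the system level.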

A Tutorial on Thompson Sampling

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, dynamic pricing, recommendation, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.
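
As a concrete illustration of the Bernoulli bandit case, the following is a minimal sketch of Thompson sampling with independent Beta(1, 1) priors on each arm. The function name `thompson_bernoulli`, the uniform prior, and the arm probabilities in the usage note are assumptions made for this example and are not taken from the tutorial itself.

```python
import random


def thompson_bernoulli(true_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: success probability of each arm (unknown to the algorithm).
    Returns (pulls, successes, failures), one list entry per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one sample per arm from its Beta posterior.
        # Beta(1 + successes, 1 + failures) follows from a uniform prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled parameters.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Observe a simulated Bernoulli reward and update the posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    # Hypothetical three-armed bandit.
    pulls, successes, failures = thompson_bernoulli([0.1, 0.5, 0.9], rounds=2000)
    print(pulls)
```

Sampling from the posterior, rather than using its mean, is what drives exploration here: an arm with few observations has a wide posterior and is occasionally sampled high enough to be tried again.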

Image Segmentation Algorithms Overview

The technology of image segmentation is widely used in medical image processing, face recognition, pedestrian detection, etc. The current image segmentation techniques include region-based segmentation, edge detection segmentation, segmentation based on clustering, segmentation based on weakly-supervised learning in CNN, etc. This paper analyzes and summarizes these image segmentation algorithms, and compares the advantages and disadvantages of the different algorithms. Finally, we predict the development trend of image segmentation through the combination of these algorithms.

Sampling of Temporal Networks: Methods and Biases

Temporal networks have been increasingly used to model a diversity of systems that evolve in time; for example, human contact structures over which dynamic processes such as epidemics take place. A fundamental aspect of real-life networks is that they are sampled within temporal and spatial frames. Furthermore, one might wish to subsample networks to reduce their size for better visualization or to perform computationally intensive simulations. The sampling method may affect the network structure, and thus caution is necessary when generalizing results based on samples. In this paper, we study four sampling strategies applied to a variety of real-life temporal networks. We quantify the biases generated by each sampling strategy on a number of relevant statistics such as link activity, temporal paths and epidemic spread. We find that some biases are common in a variety of networks and statistics, but one strategy, uniform sampling of nodes, shows improved performance in most scenarios. Our results help researchers to better design network data collection protocols and to understand the limitations of sampled temporal network data.
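
One plausible reading of uniform node sampling can be sketched as follows. This is an assumption-laden illustration rather than the paper's procedure: it represents a temporal network as a list of (u, v, t) contacts, draws a fixed fraction of nodes uniformly at random, and keeps only contacts whose two endpoints were both drawn. The function name and the toy contact list are hypothetical.

```python
import random


def sample_nodes_uniform(contacts, fraction, seed=0):
    """Subsample a temporal network by drawing nodes uniformly at random.

    contacts: list of (u, v, t) tuples, one per timestamped contact.
    fraction: share of nodes to keep, between 0 and 1.
    Returns (kept_contacts, kept_nodes).
    """
    rng = random.Random(seed)
    # Collect the node set; sorting makes the draw reproducible for a seed.
    nodes = sorted({n for u, v, _ in contacts for n in (u, v)})
    k = max(1, round(fraction * len(nodes)))
    kept_nodes = set(rng.sample(nodes, k))
    # Keep a contact only if both endpoints survived (induced subnetwork).
    kept_contacts = [
        (u, v, t) for (u, v, t) in contacts
        if u in kept_nodes and v in kept_nodes
    ]
    return kept_contacts, kept_nodes


if __name__ == "__main__":
    # Hypothetical toy network with four nodes and five contacts.
    toy = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3), ("c", "d", 4), ("a", "b", 5)]
    sub, kept = sample_nodes_uniform(toy, fraction=0.5)
    print(kept, sub)
```

Statistics such as link activity can then be computed on both the full and the sampled contact lists and compared to estimate the bias introduced by the sampling step.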

Object-Oriented Software for Functional Data

This paper introduces the funData R package as an object-oriented implementation of functional data. It implements a unified framework for dense univariate and multivariate functional data on one- and higher-dimensional domains as well as for irregularly sampled functional data. The aim of this package is to provide a user-friendly, self-contained core toolbox for functional data, including important functionalities for creating, accessing and modifying functional data objects, that can serve as a basis for other packages. The package further contains a full simulation toolbox, which is a useful feature when implementing and testing new methodological developments. Based on the theory of object-oriented data analysis, it is shown why it is natural to implement functional data in an object-oriented manner. The classes and methods provided by funData are illustrated in many examples using two freely available datasets. The MFPCA package, which implements multivariate functional principal component analysis, is presented as an example of an advanced methodological package that uses the funData package as a basis, including a case study with real data. Both packages are publicly available on GitHub and CRAN.

Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks

We propose discriminative adversarial networks (DAN) for semi-supervised learning and loss function learning. Our DAN approach builds upon generative adversarial networks (GANs) and conditional GANs but includes the key differentiator of using two discriminators instead of a generator and a discriminator. DAN can be seen as a framework to learn loss functions for predictors that also implements semi-supervised learning in a straightforward manner. We propose instantiations of DAN for two different prediction tasks: classification and ranking. Our experimental results on three datasets of different tasks demonstrate that DAN is a promising framework for both semi-supervised learning and learning loss functions for predictors. For all tasks, the semi-supervised capability of DAN can significantly boost the predictor performance for small labeled sets with minor architecture changes across tasks. Moreover, the loss functions automatically learned by DANs are very competitive and usually outperform the standard pairwise and negative log-likelihood loss functions for both semi-supervised and supervised learning.

Text Summarization Techniques: A Brief Survey

In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.

Zero-Shot Deep Domain Adaptation
Facilitated exclusion process
Graph Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
A tale of stars and cliques
Lattice paths with catastrophes
End-to-End Learning of Semantic Grasping
High-Performance FPGA Implementation of Equivariant Adaptive Separation via Independence Algorithm for Independent Component Analysis
Simple Classification using Binary Data
Robust Doubly Protected Estimators for Quantiles with Missing Data
Limit Theorem for Non-Linear Langevin Equations Driven by Lévy Noise
Well-Founded Operators for Normal Hybrid MKNF Knowledge Bases
Relaxation time and critical slowing down of a spin-torque oscillator
Long-Term Memory Networks for Question Answering
Controllability and Stabilizability Analysis of Signed Consensus Networks
An ADMM Approach to the Problem of Nash Equilibrium Seeking
Signed graphs and the freeness of the Weyl subarrangements of type $B_{\ell}$
Load Balancing in the Non-Degenerate Slowdown Regime
Higher-order congruence relations on affine moment graphs I
A Generalised Seizure Prediction with Convolutional Neural Networks for Intracranial and Scalp Electroencephalogram Data Analysis
Local Large deviation: A McMillian Theorem for Coloured Random Graph Processes
Phase transitions in the $q$-coloring of random hypergraphs
Dynamical pruning of binary trees with applications to inviscid Burgers equation
On the Compactness, Efficiency, and Representation of 3D Convolutional Networks: Brain Parcellation as a Pretext Task
Capacity of Wireless Distributed Storage Systems with Broadcast Repair
Shared-memory Graph Truss Decomposition
Further results on the expected hitting time, the cover cost and the related invariants of graphs
Infinite families of 2-designs from GA_1(q) actions
The hydrodynamic limit of a randomized load balancing network
The totally nonnegative Grassmannian is a ball
A Robust t-process Regression Model with Independent Errors
Efficient Adjoint Computation for Wavelet and Convolution Operators
Automatic Classification of Bright Retinal Lesions via Deep Network Features
A Nested Attention Neural Hybrid Model for Grammatical Error Correction
Data-Driven Loop Invariant Inference with Automatic Feature Synthesis
Matrix-Based Characterization of the Motion and Wrench Uncertainties in Robotic Manipulators
Networked Fairness in Cake Cutting
Stanley sequences with odd character
A note on some variations of the $γ$-graph
DroneCells: Improving 5G Spectral Efficiency using Drone-mounted Flying Base Stations
Weakly distance-regular digraphs whose attached association schemes are regular
Testing Forecast Accuracy of Expectiles and Quantiles with the Extremal Consistent Loss Functions
Exhaustive search for sparse variable selection in linear regression
Structured H-infinity control of infinite dimensional systems
Redundancy implies robustness for bang-bang strategies
Treatment Effects for Which Population?: Sampling Design and External Validity
Controlling a Population
External Evaluation of Event Extraction Classifiers for Automatic Pathway Curation: An extended study of the mTOR pathway
A spatiotemporal model with visual attention for video classification
Structured Matrix Estimation and Completion
The geometry of hyperbolic lines in polar spaces
Enumerating Lambda Terms by Weighted Length of Their De Bruijn Representation
AE regularity of interval matrices
Deep Discrete Hashing with Self-supervised Pairwise Labels
Sparse Approximation of 3D Meshes using the Spectral Geometry of the Hamiltonian Operator
A moderate deviation principle for 2D stochastic primitive equations
A Technical Note: Two-Step PECE Methods for Approximating Solutions To First- and Second-Order ODEs
Global Optimization with Orthogonality Constraints via Stochastic Diffusion on Manifold
Critical link of self-similarity and visualisation for jump-diffusions driven by $α$-stable noise
SigNet: Convolutional Siamese Network for Writer Independent Offline Signature Verification
On the shape of random Pólya structures
Classification of geometrical objects by integrating currents and functional data analysis. An application to a 3D database of Spanish child population
Change of Measures for Compound Renewal Processes with Applications to Premium Calculation Principles
Secure Symmetric Private Information Retrieval from Colluding Databases with Adversaries
Deep Character-Level Click-Through Rate Prediction for Sponsored Search
Circular-shift Linear Network Coding
Interpreting and using CPDAGs with background knowledge
Methods for finding leader–follower equilibria with multiple followers
Applying Parabolic Peterson: Affine Algebras and the Quantum Cohomology of the Grassmannian
Real eigenvalues in the non-Hermitian Anderson model
A bi-dimensional finite mixture model for longitudinal data subject to dropout
Threshold driven contagion on weighted networks
Comptage probabiliste sur la frontière de Furstenberg
Binary strings of length $n$ with $x$ zeros and longest $k$-runs of zeros
On subexponential parameterized algorithms for Steiner Tree and Directed Subset TSP on planar graphs
Design and Processing of Invertible Orientation Scores of 3D Images for Enhancement of Complex Vasculature
A multi-layer image representation using Regularized Residual Quantization: application to compression and denoising
Infinite-server queues with Hawkes arrival processes
Gröbner Bases for Schubert Codes
Few Non-derogatory Directed Graphs from Directed Cycles
Learning human behaviors from motion capture by adversarial imitation
Analysis and Control of a Non-Standard Hyperbolic PDE Traffic Flow Model
The Stellar tree: a Compact Representation for Simplicial Complexes and Beyond
Radio-flaring Ultracool Dwarf Population Synthesis
Mendelian randomization with fine-mapped genetic data: choosing from large numbers of correlated instrumental variables
Interference Mitigation via Relaying
A Lower Bound Technique for Communication in BSP
Computational Models of Tutor Feedback in Language Acquisition
The Impact of Model Assumptions in Scalar-on-Image Regression
The 2017 Hands in the Million Challenge on 3D Hand Pose Estimation
Excluded minors for the class of split matroids
Generative Adversarial Models for People Attribute Recognition in Surveillance
Repairing Multiple Failures for Scalar MDS Codes
GPU-Accelerated Algorithms for Compressed Signals Recovery with Application to Astronomical Imagery Deblurring
Hidden Truncation Hyperbolic Distributions, Finite Mixtures Thereof, and Their Application for Clustering
Stability Diagram and Critical Time Delay for the Kuramoto Model with Heterogeneous Interaction Delays
Expected intrinsic volumes and facet numbers of random beta-polytopes
Covering and 2-packing numbers in graphs
Fair Personalization
Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task
Optimisation of the lowest Robin eigenvalue in the exterior of a compact set, II: non-convex domains and higher dimensions
Strong Uniqueness of Singular Stochastic Delay Equations
A parallel corpus of Python functions and documentation strings for automated code documentation and code generation
Non-smooth Non-convex Bregman Minimization: Unification and new Algorithms
Emergence of Locomotion Behaviours in Rich Environments
Orderly generation of Butson Hadamard matrices