We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize. The system architecture consists of several components including spoken language processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. We also share insights gained from large-scale online logs based on 160,000 conversations with real-world users.
Social media for news consumption is becoming increasingly popular due to its easy access, fast dissemination, and low cost. However, social media also enable the wide propagation of ‘fake news’, i.e., news with intentionally false information. Fake news on social media poses significant negative societal effects, and also presents unique challenges. To tackle the challenges, many existing works exploit various features, from a network perspective, to detect and mitigate fake news. In essence, news dissemination ecosystem involves three dimensions on social media, i.e., a content dimension, a social dimension, and a temporal dimension. In this chapter, we will review network properties for studying fake news, introduce popular network types and how these networks can be used to detect and mitigation fake news on social media.
Statistical analysis on object data presents many challenges. Basic summaries such as means and variances are difficult to compute. We apply ideas from topology to study object data. We present a framework for using persistence landscapes to vectorize object data and perform statistical analysis. We apply to this pipeline to some biological images that were previously shown to be challenging to study using shape theory. Surprisingly, the most persistent features are shown to be ‘topological noise’ and the statistical analysis depends on the less persistent features which we refer to as the ‘geometric signal’. We also describe the first steps to a new approach to using topology for object data analysis, which applies topology to distributions on object spaces.
In the low rank matrix completion (LRMC) problem, the low rank assumption means that the columns (or rows) of the matrix to be completed are points on a low-dimensional linear algebraic variety. This paper extends this thinking to cases where the columns are points on a low-dimensional nonlinear algebraic variety, a problem we call Low Algebraic Dimension Matrix Completion (LADMC). Matrices whose columns belong to a union of subspaces (UoS) are an important special case. We propose a LADMC algorithm that leverages existing LRMC methods on a tensorized representation of the data. For example, a second-order tensorization representation is formed by taking the outer product of each column with itself, and we consider higher order tensorizations as well. This approach will succeed in many cases where traditional LRMC is guaranteed to fail because the data are low-rank in the tensorized representation but not in the original representation. We also provide a formal mathematical justification for the success of our method. In particular, we show bounds of the rank of these data in the tensorized representation, and we prove sampling requirements to guarantee uniqueness of the solution. Interestingly, the sampling requirements of our LADMC algorithm nearly match the information theoretic lower bounds for matrix completion under a UoS model. We also provide experimental results showing that the new approach significantly outperforms existing state-of-the-art methods for matrix completion in many situations.
This paper focuses on a novel problem, i.e., transplanting a category-and-task-specific neural network to a generic, distributed network without strong supervision. Like playing LEGO blocks, incrementally constructing a generic network by asynchronously merging specific neural networks is a crucial bottleneck for deep learning. Suppose that the pre-trained specific network contains a module $f$ to extract features of the target category, and the generic network has a module $g$ for a target task, which is trained using other categories except for the target category. Instead of using numerous training samples to teach the generic network a new category, we aim to learn a small adapter module to connect $f$ and $g$ to accomplish the task on a target category in a weakly-supervised manner. The core challenge is to efficiently learn feature projections between the two connected modules. We propose a new distillation algorithm, which exhibited superior performance. Our method without training samples even significantly outperformed the baseline with 100 training samples.
Modern information society depends on reliable functionality of information systems infrastructure, while at the same time the number of cyber-attacks has been increasing over the years and damages have been caused. Furthermore, graphs can be used to show paths than can be exploited by attackers to intrude into systems and gain unauthorized access through vulnerability exploitation. This paper presents a method that builds attack graphs using data supplied from the maritime supply chain infrastructure. The method delivers all possible paths that can be exploited to gain access. Then, a recommendation system is utilized to make predictions about future attack steps within the network. We show that recommender systems can be used in cyber defense by predicting attacks. The goal of this paper is to identify attack paths and show how a recommendation method can be used to classify future cyber-attacks in terms of risk management. The proposed method has been experimentally evaluated and validated, with the results showing that it is both practical and effective.
Most environmental phenomena, such as wind profiles, ozone concentration and sunlight distribution under a forest canopy, exhibit nonstationary dynamics i.e. phenomenon variation change depending on the location and time of occurrence. Non-stationary dynamics pose both theoretical and practical challenges to statistical machine learning algorithms aiming to accurately capture the complexities governing the evolution of such processes. In this paper, we address the sampling aspects of the problem of learning nonstationary spatio-temporal models, and propose an efficient yet simple algorithm – LISAL. The core idea in LISAL is to learn two models using Gaussian processes (GPs) wherein the first is a nonstationary GP directly modeling the phenomenon. The second model uses a stationary GP representing a latent space corresponding to changes in dynamics, or the nonstationarity characteristics of the first model. LISAL involves adaptively sampling the latent space dynamics using information theory quantities to reduce the computational cost during the learning phase. The relevance of LISAL is extensively validated using multiple real world datasets.
In many signal processing and machine learning applications, datasets containing private information are held at different locations, requiring the development of distributed privacy-preserving algorithms. Tensor and matrix factorizations are key components of many processing pipelines. In the distributed setting, differentially private algorithms suffer because they introduce noise to guarantee privacy. This paper designs new and improved distributed and differentially private algorithms for two popular matrix and tensor factorization methods: principal component analysis (PCA) and orthogonal tensor decomposition (OTD). The new algorithms employ a correlated noise design scheme to alleviate the effects of noise and can achieve the same noise level as the centralized scenario. Experiments on synthetic and real data illustrate the regimes in which the correlated noise allows performance matching with the centralized setting, outperforming previous methods and demonstrating that meaningful utility is possible while guaranteeing differential privacy.
A regularized vector autoregressive hidden semi-Markov model is developed to analyze multivariate financial time series with switching data generating regimes. Furthermore, an augmented EM algorithm is proposed for parameter estimation by embedding regularized estimators for the state-dependent covariance matrices and autoregression matrices in the M-step. The performance of the proposed regularized estimators is evaluated both in the simulation experiments and on the New York Stock Exchange financial portfolio data.
Most real world phenomena such as sunlight distribution under a forest canopy, minerals concentration, stock valuation, exhibit nonstationary dynamics i.e. phenomenon variation changes depending on the locality. Nonstationary dynamics pose both theoretical and practical challenges to statistical machine learning algorithms that aim to accurately capture the complexities governing the evolution of such processes. Typically the nonstationary dynamics are modeled using nonstationary Gaussian Process models (NGPS) that employ local latent dynamics parameterization to correspondingly model the nonstationary real observable dynamics. Recently, an approach based on most likely induced latent dynamics representation attracted research community’s attention for a while. The approach could not be employed for large scale real world applications because learning a most likely latent dynamics representation involves maximization of marginal likelihood of the observed real dynamics that becomes intractable as the number of induced latent points grows with problem size. We have established a direct relationship between informativeness of the induced latent dynamics and the marginal likelihood of the observed real dynamics. This opens up the possibility of maximizing marginal likelihood of observed real dynamics indirectly by near optimally maximizing entropy or mutual information gain on the induced latent dynamics using greedy algorithms. Therefore, for an efficient yet accurate inference, we propose to build an induced latent dynamics representation using a novel algorithm LISAL that adaptively maximizes entropy or mutual information on the induced latent dynamics and marginal likelihood of observed real dynamics in an iterative manner. The relevance of LISAL is validated using real world sensing datasets.
We present a system for online probabilistic event forecasting. We assume that a user is interested in detecting and forecasting event patterns, given in the form of regular expressions. Our system can consume streams of events and forecast when the pattern is expected to be fully matched. As more events are consumed, the system revises its forecasts to reflect possible changes in the state of the pattern. The framework of Pattern Markov Chains is used in order to learn a probabilistic model for the pattern, with which forecasts with guaranteed precision may be produced, in the form of intervals within which a full match is expected. Experimental results from real-world datasets are shown and the quality of the produced forecasts is explored, using both precision scores and two other metrics: spread, which refers to the ‘focusing resolution’ of a forecast (interval length), and distance, which captures how early a forecast is reported.