Network Clocks: Detecting the Temporal Scale of Information Diffusion

Information diffusion models typically assume a discrete timeline in which an information token spreads in the network. Since users in real-world networks vary significantly in their intensity and periods of activity, our objective in this work is to answer: How to determine a temporal scale that best agrees with the observed information propagation within a network? A key limitation of existing approaches is that they aggregate the timeline into fixed-size windows, which may not fit all network nodes’ activity periods. We propose the notion of a heterogeneous network clock: a mapping of events to discrete timestamps that best explains their occurrence according to a given cascade propagation model. We focus on the widely-adopted independent cascade (IC) model and formalize the optimal clock as the one that maximizes the likelihood of all observed cascades. The single optimal clock (OC) problem can be solved exactly in polynomial time. However, we prove that learning multiple optimal clocks(kOC), corresponding to temporal patterns of groups of network nodes, is NP-hard. We propose scalable solutions that run in almost linear time in the total number of cascade activations and discuss approximation guarantees for each variant. Our algorithms and their detected clocks enable improved cascade size classification (up to 8 percent F1 lift) and improved missing cascade data inference (0.15 better recall). We also demonstrate that the network clocks exhibit consistency within the type of content diffusing in the network and are robust with respect to the propagation probability parameters of the IC model.

Probability Reversal and the Disjunction Effect in Reasoning Systems

Data based judgments go into artificial intelligence applications but they undergo paradoxical reversal when seemingly unnecessary additional data is provided. Examples of this are Simpson’s reversal and the disjunction effect where the beliefs about the data change once it is presented or aggregated differently. Sometimes the significance of the difference can be evaluated using statistical tests such as Pearson’s chi-squared or Fisher’s exact test, but this may not be helpful in threshold-based decision systems that operate with incomplete information. To mitigate risks in the use of algorithms in decision-making, we consider the question of modeling of beliefs. We argue that evidence supports that beliefs are not classical statistical variables and they should, in the general case, be considered as superposition states of disjoint or polar outcomes. We analyze the disjunction effect from the perspective of the belief as a quantum vector.

Streamlined Deployment for Quantized Neural Networks

Running Deep Neural Network (DNN) models on devices with limited computational capability is a challenge due to large compute and memory requirements. Quantized Neural Networks (QNNs) have emerged as a potential solution to this problem, promising to offer most of the DNN accuracy benefits with much lower computational cost. However, harvesting these benefits on existing mobile CPUs is a challenge since operations on highly quantized datatypes are not natively supported in most instruction set architectures (ISAs). In this work, we first describe a streamlining flow to convert all QNN inference operations to integer ones. Afterwards, we provide techniques based on processing one bit position at a time (bit-serial) to show how QNNs can be efficiently deployed using common bitwise operations. We demonstrate the potential of QNNs on mobile CPUs with microbenchmarks and on a quantized AlexNet, which is 3.5x faster than an optimized 8-bit baseline.

Variational Reasoning for Question Answering with Knowledge Graph

Knowledge graph is known to be helpful for the task of question answering (QA), since it provides well-structured relational information between entities, and allows one to further infer indirect facts. However, it is challenging to build QA systems which can learn to reason over knowledge graphs based on question-answer pairs alone. First, when people ask questions, their expressions are noisy (for example, typos in texts, or variations in pronunciations), which is non- trivial for the QA system to match those mentioned entities to the knowledge graph. Second, many questions require multi-hop logic reasoning over the knowledge graph to retrieve the answers. To address these challenges, we propose a novel and unified deep learning architecture, and an end-to-end variational learning algorithm which can handle noise in questions, and learn multi-hop reasoning simultaneously. Our method achieves state-of-the-art performance on a recent benchmark dataset in the literature. We also derive a series of new benchmark datasets, including questions for multi-hop reasoning, questions paraphrased by neural translation model, and questions in human voice. Our method yields very promising results on all these challenging datasets.

Query Completion Using Bandits for Engines Aggregation

Assisting users by suggesting completed queries as they type is a common feature of search systems known as query auto-completion. A query auto-completion engine may use prior signals and available information (e.g., user is anonymous, user has a history, user visited the site before the search or not, etc.) in order to improve its recommendations. There are many possible strategies for query auto-completion and a challenge is to design one optimal engine that considers and uses all available information. When different strategies are used to produce the suggestions, it becomes hard to rank these heterogeneous suggestions. An alternative strategy could be to aggregate several engines in order to enhance the diversity of recommendations by combining the capacity of each engine to digest available information differently, while keeping the simplicity of each engine. The main objective of this research is therefore to find such mixture of query completion engines that would beat any engine taken alone. We tackle this problem under the bandits setting and evaluate four strategies to overcome this challenge. Experiments conducted on three real datasets show that a mixture of engines can outperform a single engine.

HitFraud: A Broad Learning Approach for Collective Fraud Detection in Heterogeneous Information Networks

On electronic game platforms, different payment transactions have different levels of risk. Risk is generally higher for digital goods in e-commerce. However, it differs based on product and its popularity, the offer type (packaged game, virtual currency to a game or subscription service), storefront and geography. Existing fraud policies and models make decisions independently for each transaction based on transaction attributes, payment velocities, user characteristics, and other relevant information. However, suspicious transactions may still evade detection and hence we propose a broad learning approach leveraging a graph based perspective to uncover relationships among suspicious transactions, i.e., inter-transaction dependency. Our focus is to detect suspicious transactions by capturing common fraudulent behaviors that would not be considered suspicious when being considered in isolation. In this paper, we present HitFraud that leverages heterogeneous information networks for collective fraud detection by exploring correlated and fast evolving fraudulent behaviors. First, a heterogeneous information network is designed to link entities of interest in the transaction database via different semantics. Then, graph based features are efficiently discovered from the network exploiting the concept of meta-paths, and decisions on frauds are made collectively on test instances. Experiments on real-world payment transaction data from Electronic Arts demonstrate that the prediction performance is effectively boosted by HitFraud with fast convergence where the computation of meta-path based features is largely optimized. Notably, recall can be improved up to 7.93% and F-score 4.62% compared to baselines.

Weighted Orthogonal Components Regression Analysis

In the multiple linear regression setting, we propose a general framework, termed weighted orthogonal components regression (WOCR), which encompasses many known methods as special cases, including ridge regression and principal components regression. WOCR makes use of the monotonicity inherent in orthogonal components to parameterize the weight function. The formulation allows for efficient determination of tuning parameters and hence is computationally advantageous. Moreover, WOCR offers insights for deriving new better variants. Specifically, we advocate weighting components based on their correlations with the response, which leads to enhanced predictive performance. Both simulated studies and real data examples are provided to assess and illustrate the advantages of the proposed methods.

Recursive Exponential Weighting for Online Non-convex Optimization

In this paper, we investigate the online non-convex optimization problem which generalizes the classic {online convex optimization problem by relaxing the convexity assumption on the cost function. For this type of problem, the classic exponential weighting online algorithm has recently been shown to attain a sub-linear regret of O(\sqrt{T\log T}). In this paper, we introduce a novel recursive structure to the online algorithm to define a recursive exponential weighting algorithm that attains a regret of O(\sqrt{T}), matching the well-known regret lower bound. To the best of our knowledge, this is the first online algorithm with provable O(\sqrt{T}) regret for the online non-convex optimization problem.

Conflict management in information fusion with belief functions

In Information fusion, the conflict is an important concept. Indeed, combining several imperfect experts or sources allows conflict. In the theory of belief functions, this notion has been discussed a lot. The mass appearing on the empty set during the conjunctive combination rule is generally considered as conflict, but that is not really a conflict. Some measures of conflict have been proposed and some approaches have been proposed in order to manage this conflict or to decide with conflicting mass functions. We recall in this chapter some of them and we propose a discussion to consider the conflict in information fusion with the theory of belief functions.

Learning from a lot: Empirical Bayes in high-dimensional prediction settings

Empirical Bayes is a versatile approach to `learn from a lot’ in two ways: first, from a large number of variables and second, from a potentially large amount of prior information, e.g. stored in public repositories. We review applications of a variety of empirical Bayes methods to a broad spectrum of prediction methods including penalized regression, random forest, linear discriminant analysis, and Bayesian models with sparse or dense priors. We discuss `formal’ empirical Bayes methods which maximize the marginal likelihood, but also more informal approaches based on other data summaries. We contrast empirical Bayes to cross-validation and full Bayes, and discuss hybrid approaches. To study the relation between the quality of an empirical Bayes estimator and p, the number of variables, we derive the expected mean squared error of a simple empirical Bayes estimator in a linear model setting. We argue that empirical Bayes is particularly useful when the prior contains multiple parameters, modeling a priori information on variables, termed `co-data’. In particular, we present two novel examples that allow for co-data. First, a Bayesian spike-and-slab setting that facilitates inclusion of multiple co-data sources and types; second, a hybrid empirical Bayes-full Bayes ridge regression approach for estimation of the posterior predictive interval.

Action Schema Networks: Generalised Policies with Deep Learning

In this paper, we introduce the Action Schema Network (ASNet): a neural network architecture for learning generalised policies for probabilistic planning problems. By mimicking the relational structure of planning problems, ASNets are able to adopt a weight-sharing scheme which allows the network to be applied to any problem from a given planning domain. This allows the cost of training the network to be amortised over all problems in that domain. Further, we propose a training method which balances exploration and supervised training on small problems to produce a policy which remains robust when evaluated on larger problems. In experiments, we show that ASNet’s learning capability allows it to significantly outperform traditional non-learning planners in several challenging domains.

Approximate Integration of streaming data

We approximate analytic queries on streaming data with a weighted reservoir sampling. For a stream of tuples of a Datawarehouse we show how to approximate some OLAP queries. For a stream of graph edges from a Social Network, we approximate the communities as the large connected components of the edges in the reservoir. We show that for a model of random graphs which follow a power law degree distribution, the community detection algorithm is a good approximation. Given two streams of graph edges from two Sources, we define the {\em Community Correlation} as the fraction of the nodes in communities in both streams. Although we do not store the edges of the streams, we can approximate the Community Correlation and define the {\em Integration of two streams}. We illustrate this approach with Twitter streams, associated with TV programs.

Learning with Opponent-Learning Awareness

Multi-agent settings are quickly gathering importance in machine learning. Beyond a plethora of recent work on deep multi-agent reinforcement learning, hierarchical reinforcement learning, generative adversarial networks and decentralized optimization can all be seen as instances of this setting. However, the presence of multiple learning agents in these settings renders the training problem non-stationary and often leads to unstable training or undesired final results. We present Learning with Opponent-Learning Awareness (LOLA), a method that reasons about the anticipated learning of the other agents. The LOLA learning rule includes an additional term that accounts for the impact of the agent’s policy on the anticipated parameter update of the other agents. We show that the LOLA update rule can be efficiently calculated using an extension of the likelihood ratio policy gradient update, making the method suitable for model-free reinforcement learning. This method thus scales to large parameter and input spaces and nonlinear function approximators. Preliminary results show that the encounter of two LOLA agents leads to the emergence of tit-for-tat and therefore cooperation in the infinitely iterated prisoners’ dilemma, while independent learning does not. In this domain, LOLA also receives higher payouts compared to a naive learner, and is robust against exploitation by higher order gradient-based methods. Applied to infinitely repeated matching pennies, only LOLA agents converge to the Nash equilibrium. We also apply LOLA to a grid world task with an embedded social dilemma using deep recurrent policies. Again, by considering the learning of the other agent, LOLA agents learn to cooperate out of selfish interests.

Tight Semi-Nonnegative Matrix Factorization

The nonnegative matrix factorization is a widely used, flexible matrix decomposition, finding applications in biology, image and signal processing and information retrieval, among other areas. Here we present a related matrix factorization. A multi-objective optimization problem finds conical combinations of templates that approximate a given data matrix. The templates are chosen so that as far as possible only the initial data set can be represented this way. However, the templates are not required to be nonnegative nor convex combinations of the original data.

A Review of Evaluation Techniques for Social Dialogue Systems

In contrast with goal-oriented dialogue, social dialogue has no clear measure of task success. Consequently, evaluation of these systems is notoriously hard. In this paper, we review current evaluation methods, focusing on automatic metrics. We conclude that turn-based metrics often ignore the context and do not account for the fact that several replies are valid, while end-of-dialogue rewards are mainly hand-crafted. Both lack grounding in human perceptions.

Merge and Select: Visualization of a likelihood based k-sample adaptive fusing and model selection

In this article we introduce Merge and Select – a methodology – and factorMerger – an R package – for exploration and visualization of k-group comparisons. Comparison of k-groups is one of the most important issues in exploratory analyses and it has zillions of applications. The classical solution is to test a null hypothesis that observations from all groups come from the same distribution. If the global null hypothesis is rejected a more detailed analysis of differences among pairs of groups is performed. The traditional approach is to use pairwise post hoc tests in order to verify which groups differ significantly. However, this approach fails with large number of groups in both interpretation and visualization layer. The Merge and Select methodology solves this problem by using easy to understand description of LRT based similarity among groups.

A Learning Approach to Secure Learning

Deep Neural Networks (DNNs) have been shown to be vulnerable against adversarial examples, which are data points cleverly constructed to fool the classifier. Such attacks can be devastating in practice, especially as DNNs are being applied to ever increasing critical tasks like image recognition in autonomous driving. In this paper, we introduce a new perspective on the problem. We do so by first defining robustness of a classifier to adversarial exploitation. Next, we show that the problem of adversarial example generation and defense both can be posed as learning problems, which are duals of each other. We also show formally that our defense aims to increase robustness of the classifier. We demonstrate the efficacy of our techniques by experimenting with the MNIST and CIFAR-10 datasets.

Combinatorics of cyclic shifts in plactic, hypoplactic, sylvester, Baxter, and related monoids
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits
Addressee and Response Selection in Multi-Party Conversations with Speaker Interaction RNNs
Conjectured Nordhaus-Gaddum bound for the inertia of a graph
Oriented Hypergraphic Matrix-tree Type Theorems and Bidirected Minors via Boolean Order Ideals
On the number of equilibria with a given number of unstable directions
Discovering Potential Correlations via Hypercontractivity
Edge-colouring planar graphs with precoloured edges
Enemy At the Gateways: A Game Theoretic Approach to Proxy Distribution
SSE Lossless Compression Method for the Text of the Insignificance of the Lines Order
Induced 2-degenerate Subgraphs of Triangle-free Planar Graphs
A Visualization of the Classical Musical Tradition
Elementary proof of congruences involving sum of binomial coefficients
Winding of simple walks on the square lattice
Data Sketches for Disaggregated Subset Sum and Frequent Item Estimation
Information Design in Crowdfunding under Thresholding Policies
Shifting Mean Activation Towards Zero with Bipolar Activation Functions
Multi-scale Forest Species Recognition Systems for Reduced Cost
Parallelizing Linear Recurrent Neural Nets Over Sequence Length
ENORM: A Framework For Edge NOde Resource Management
How does a locally constrained quantum system localize?
On the stochastic decision problems with backward stochastic viability property
Multivariate Density Modeling for Retirement Finance
A convergence frame for inexact nonconvex and nonsmooth algorithms and its applications to several iterations
Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging
Nonstandard Methods in Ramsey Theory and Combinatorial Number Theory
Setpoint Tracking with Partially Observed Loads
Performance of Test Supermartingale Confidence Intervals for the Success Probability of Bernoulli Trials
On the number of linear hypergraphs of large girth
Promotion on Generalized Oscillating Tableaux and Web Rotation
Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning
The infinite Atlas process: Convergence to equilibrium
Microscopic description of Log and Coulomb gases
A Constrained, Weighted-L1 Minimization Approach for Joint Discovery of Heterogeneous Neural Connectivity Graphs
Robust Transmission for Massive MIMO Downlink with Imperfect CSI
Joint Learning of Set Cardinality and State Distribution
Pole Placement Approach to Coherent Passive Reservoir Engineering for Storing Quantum Information
Delay, memory, and messaging tradeoffs in distributed service systems
Nonsubsampled Graph Filter Banks and Distributed Implementation
Co-training for Demographic Classification Using Deep Learning from Label Proportions
Empower Sequence Labeling with Task-Aware Neural Language Model
On the rarity of several disjoint polymers in Brownian last passage percolation
Meta Networks for Neural Style Transfer
A patchwork quilt sewn from Brownian fabric: regularity of polymer weight profiles in Brownian last passage percolation
EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
Modulus of continuity of polymer weight profiles in Brownian last passage percolation
Sketch-pix2seq: a Model to Generate Sketches of Multiple Categories
Mitigating Overexposure in Viral Marketing
cqrReg: An R Package for Quantile and Composite Quantile Regression and Variable Selection
Models and Framework for Adversarial Attacks on Complex Adaptive Systems
Data offloading in mobile edge computing: A coalitional game based pricing approach
On the decay of correlations in the random field Ising model
Notes on Discrete Compound Poisson Point Process and Its Concentration Inequalities
Scheduling Two Agents on a Single Machine: A Parameterized Analysis of NP-hard Problems
One-Bit Sphere Decoding for Uplink Massive MIMO Systems with One-Bit ADCs
On The Parameterized Tractability of the Just-In-Time Flow-Shop Scheduling Problem
Computing the Shapley Value in Allocation Problems: Approximations and Bounds, with an Application to the Italian VQR Research Assessment Program
On labeling Android malware signatures using minhashing and further classification with Structural Equation Models
The stochastic order of probability measures on ordered metric spaces
Fractional matching preclusion number of graphs
The Graovac-Pisanski index of a connected bipartite graph is an integer number
On the Clar Number of Benzenoid Graphs
Particle Filters and Data Assimilation
The Bouchaud-Anderson model with double-exponential potential
Upper Bound of Bayesian Generalization Error in Stochastic Matrix Factorization
Long time behavior of the master equation in mean-field game theory
Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets
Breaking Bellman’s Curse of Dimensionality: Efficient Kernel Gradient Temporal Difference
Numerical Study of Polynomial Feedback Laws for a Bilinear Control Problem
Proceedings of the 25th International Symposium on Graph Drawing and Network Visualization (GD 2017)
Power-Efficient and Secure WPCNs with Hardware Impairments and Non-Linear EH Circuit
Two-step benchmarking: Setting more realistically achievable targets in DEA
A Comparison of Public Causal Search Packages on Linear, Gaussian Data with No Latent Variables
Spectral Efficiency of Multi-User Adaptive Cognitive Radio Networks
Dialogue Act Sequence Labeling using Hierarchical encoder with CRF
Generation and properties of nut graphs
On the Generation of Initial Contexts for Effective Deadlock Detection
$L^\infty$-admissibility for analytic semigroups and applications to input-to-state stability
Lower Bounds for Approximating Graph Parameters via Communication Complexity
Flexible End-to-End Dialogue System for Knowledge Grounded Conversation
Strong local optimality for generalized L 1 optimal control problems
Moderate deviations for critical curie-weiss model
Delay-robust control design for two heterodirectional linear coupled hyperbolic PDEs
Enumeration of permutations avoiding a triple of 4-letter patterns is all done
Finite connected components in infinite directed and multiplex networks with arbitrary degree distributions
Accelerating Analytical Processing in MVCC using Fine-Granular High-Frequency Virtual Snapshotting
Estimation of the marginal expected shortfall under asymptotic independence
Disagreement percolation for marked Gibbs point processes
Area anomaly and generalized drift of iterated sums for hidden Markov walks
Densely tracking sequences of 3D face scans
A Survey of Calibration Methods for Optical See-Through Head-Mounted Displays
Reading Scene Text with Attention Convolutional Sequence Modeling
Automated Cloud Provisioning on AWS using Deep Reinforcement Learning
Equilibration and diffusion for a dynamical Lorentz gas
A short-term planning model for high-speed train assignment and maintenance scheduling
Generating OWA weights using truncated distributions
GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval
Bayesian Sparse Global-Local Shrinkage Regression for Grouped Variables
Dipolar-glass behaviour of an insulating film containing nanogranular Fe particles
Model Selection Confidence Sets by Likelihood Ratio Testing
Flexible Network Binarization with Layer-wise Priority
Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
Natural Language Inference over Interaction Space
Gaussian multiplicative chaos through the lens of the 2D Gaussian free field
Zeros of the deformed exponential function
Linguistic Features of Genre and Method Variation in Translation: A Computational Perspective
Impact of User Height on the Coverage of 3D Beamforming-Enabled Massive MIMO Systems
Multi-ordered Lasserre hierarchy for large scale polynomial optimization in real and complex variables
Random cover times using the Poisson cylinder process
Layered Space-Time Index Coding
Neural Network Based Nonlinear Weighted Finite Automata
Solving nonlinear optimal control problems with state and control delays by shooting methods combined with numerical continuation on the delays
Similarity Embedding Network for Unsupervised Sequential Pattern Learning by Playing Music Puzzle Games
The Power of Synchronisation: Formal Analysis of Power Consumption in Networks of Pulse-Coupled Oscillators
Scalable and Efficient Statistical Inference with Estimating Functions in the MapReduce Paradigm for Big Data
An Efficient Evolutionary Based Method For Image Segmentation
Monte Carlo methods for massively parallel computers
A Tutorial on Deep Learning for Music Information Retrieval
Identifiability of dynamical networks: which nodes need be measured?
On Early-stage Debunking Rumors on Twitter: Leveraging the Wisdom of Weak Learners
Maximum matchings and minimum dominating sets in Apollonian networks and extended Tower of Hanoi graphs
An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots with Non-Minimum Phase Dynamics
Exploiting skeletal structure in computer vision annotation with Benders decomposition
Controllability and lack of controllability with smooth controls in viscoelasticity via moment methods
Power in High-Dimensional Testing Problems
Waring’s Problem in Finite Rings
Local spectral expansion approach to high dimensional expanders part I: Descent of spectral gaps
A polynomial bound for the arithmetic $k$-cycle removal lemma in vector spaces
Directed migration of microscale swimmers by an array of shaped obstacles: modeling and shape optimization
Transmit Antenna Selection for Physical-Layer Network Coding Based on Euclidean Distance
Alternating minimization and alternating descent over nonconvex sets