**varbvs: Fast Variable Selection for Large-scale Regression**

We introduce varbvs, a suite of functions written in R and MATLAB for regression analysis of large-scale data sets using Bayesian variable selection methods. We have developed numerical optimization algorithms based on variational approximation methods that make it feasible to apply Bayesian variable selection to very large data sets. With a focus on examples from genome-wide association studies, we demonstrate that varbvs scales well to data sets with hundreds of thousands of variables and thousands of samples, and has features that facilitate rapid data analyses. Moreover, varbvs allows for extensive model customization, which can be used to incorporate external information into the analysis. We expect that the combination of an easy-to-use interface and robust, scalable algorithms for posterior computation will encourage broader use of Bayesian variable selection in areas of applied statistics and computational biology. The most recent R and MATLAB source code is available for download at GitHub (https://…/varbvs), and the R package can be installed from CRAN (https://…/package=varbvs).

**Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges**

While machine learning and artificial intelligence have long been applied in networking research, the bulk of such work has focused on supervised learning. Recently there has been a rising trend of employing unsupervised machine learning using unstructured raw network data to improve network performance and provide services such as traffic engineering, anomaly detection, Internet traffic classification, and quality of service optimization. The interest in applying unsupervised learning techniques in networking emerges from their great success in other fields such as computer vision, natural language processing, speech recognition, and optimal control (e.g., for developing autonomous self-driving cars). Unsupervised learning is interesting since it frees us from the need for labeled data and manual handcrafted feature engineering, thereby facilitating flexible, general, and automated methods of machine learning. The focus of this survey paper is to provide an overview of the applications of unsupervised learning in the domain of networking. We provide a comprehensive survey highlighting the recent advancements in unsupervised learning techniques and describe their applications for various learning tasks in the context of networking. We also provide a discussion on future directions and open research issues, while also identifying potential pitfalls. While a few survey papers focusing on the applications of machine learning in networking have previously been published, a survey of similar scope and breadth is missing in the literature. Through this paper, we advance the state of knowledge by carefully synthesizing the insights from these survey papers while also providing contemporary coverage of recent advances.

**On Collaborative Compressive Sensing Systems: The Framework, Design and Algorithm**

We propose a collaborative compressive sensing (CCS) framework consisting of a bank of compressive sensing (CS) systems that share the same sensing matrix but have different sparsifying dictionaries. This CCS system is guaranteed to yield better performance than each individual CS system in a statistical sense, while with the parallel computing strategy, it requires the same time as that needed for each individual CS system to conduct compression and signal recovery. We then provide an approach to designing optimal CCS systems by utilizing a measure that involves both the sensing matrix and dictionaries and hence allows us to simultaneously optimize the sensing matrix and all the dictionaries under the same scheme. An alternating minimization-based algorithm is derived for solving the corresponding optimal design problem. We provide a rigorous convergence analysis to show that the proposed algorithm is convergent. Experiments with real images are carried out and show that the proposed CCS system significantly improves on existing CS systems in terms of the signal recovery accuracy.

**Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes**

The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies Generalized Advantage Estimation for temporally extended actions, allowing a state-of-the-art policy optimization algorithm to optimize policies in Dec-POMDPs in which agents act asynchronously. We show that our algorithm is capable of learning optimal policies in two cooperative domains, one involving real-time bus holding control and one involving wildfire fighting with unmanned aircraft. Our algorithm works by framing problems as ‘event-driven decision processes,’ which are scenarios where the sequence and timing of actions and events are random and governed by an underlying stochastic process. In addition to optimizing policies with continuous state and action spaces, our algorithm also facilitates the use of event-driven simulators, which do not require time to be discretized into time-steps. We demonstrate the benefit of using event-driven simulation in the context of multiple agents taking asynchronous actions. We show that fixed time-step simulation risks obfuscating the sequence in which closely-separated events occur, adversely affecting the policies learned. Additionally, we show that arbitrarily shrinking the time-step scales poorly with the number of agents.

**Curriculum Learning of Visual Attribute Clusters for Multi-Task Classification**

Visual attributes, from simple objects (e.g., backpacks, hats) to soft-biometrics (e.g., gender, height, clothing), have proven to be a powerful representational approach for many applications such as image description and human identification. In this paper, we introduce a novel method to combine the advantages of both multi-task and curriculum learning in a visual attribute classification framework. Individual tasks are grouped after performing hierarchical clustering based on their correlation. The clusters of tasks are learned in a curriculum learning setup by transferring knowledge between clusters. The learning process within each cluster is performed in a multi-task classification setup. By leveraging the acquired knowledge, we speed up the process and improve performance. We demonstrate the effectiveness of our method via ablation studies and a detailed analysis of the covariates, on a variety of publicly available datasets of humans standing with their full body visible. Extensive experimentation has shown that the proposed approach boosts the performance by 4% to 10%.

**A textual transform of multivariate time-series for prognostics**

Prognostics or early detection of incipient faults is an important industrial challenge for condition-based and preventive maintenance. Physics-based approaches to modeling fault progression are infeasible due to multiple interacting components, uncontrolled environmental factors and observability constraints. Moreover, such approaches to prognostics do not generalize to new domains. Consequently, domain-agnostic data-driven machine learning approaches to prognostics are desirable. Damage progression is a path-dependent process and explicitly modeling the temporal patterns is critical for accurate estimation of both the current damage state and its progression leading to total failure. In this paper, we present a novel data-driven approach to prognostics that employs a novel textual representation of multivariate temporal sensor observations for predicting the future health state of the monitored equipment early in its life. This representation enables us to utilize well-understood concepts from text-mining for modeling, prediction and understanding distress patterns in a domain agnostic way. The approach has been deployed and successfully tested on large scale multivariate time-series data from commercial aircraft engines. We report experiments on well-known publicly available benchmark datasets and simulation datasets. The proposed approach is shown to be superior in terms of prediction accuracy, lead time to prediction and interpretability.

**OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning**

Reinforcement learning has shown promise in learning policies that can solve complex problems. However, manually specifying a good reward function can be difficult, especially for intricate tasks. Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations. Yet in reality, the corpus of demonstrations may contain trajectories arising from a diverse set of underlying reward functions rather than a single one. Thus, in inverse reinforcement learning, it is useful to consider such a decomposition. The options framework in reinforcement learning is specifically designed to decompose policies in a similar light. We therefore extend the options framework and propose a method to simultaneously recover reward options in addition to policy options. We leverage adversarial methods to learn joint reward-policy options using only observed expert states. We show that this approach works well in both simple and complex continuous control tasks and shows significant performance increases in one-shot transfer learning.

**Contrastive Principal Component Analysis**

We present a new technique called contrastive principal component analysis (cPCA) that is designed to discover low-dimensional structure that is unique to a dataset, or enriched in one dataset relative to other data. The technique is a generalization of standard PCA, for the setting where multiple datasets are available (e.g., a treatment and a control group, or a mixed versus a homogeneous population) and the goal is to explore patterns that are specific to one of the datasets. We conduct a wide variety of experiments in which cPCA identifies important dataset-specific patterns that are missed by PCA, demonstrating that it is useful for many applications: subgroup discovery, visualizing trends, feature selection, denoising, and data-dependent standardization. We provide geometrical interpretations of cPCA and show that it satisfies desirable theoretical guarantees. We also extend cPCA to nonlinear settings in the form of kernel cPCA. We have released our code as a Python package, and documentation is on GitHub.
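To make the idea concrete, here is a minimal sketch of the core computation as it is usually described: take the top eigenvector of the contrast matrix C_target − α·C_background, where α = 0 recovers ordinary PCA on the target data. This is not the authors' released package. The function names (`covariance`, `top_eigenvector`, `cpca_direction`), the toy data, and the use of shifted power iteration in place of a full eigendecomposition are all assumptions made for illustration.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def top_eigenvector(m, iters=500):
    """Eigenvector for the largest (algebraic) eigenvalue of symmetric m.

    The contrast matrix can have negative eigenvalues, so we shift it by a
    Gershgorin bound to make it positive semidefinite before power iteration.
    """
    d = len(m)
    shift = max(sum(abs(x) for x in row) for row in m)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(d)) + shift * v[i]
             for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix: every direction is equivalent
            return v
        v = [x / norm for x in w]
    return v


def cpca_direction(target, background, alpha):
    """First contrastive direction: top eigenvector of C_t - alpha * C_b."""
    ct, cb = covariance(target), covariance(background)
    d = len(ct)
    contrast = [[ct[i][j] - alpha * cb[i][j] for j in range(d)]
                for i in range(d)]
    return top_eigenvector(contrast)


# Toy data (hypothetical): both sets vary strongly along x, but only the
# target set has appreciable variance along y.
target = [(10.0, 2.0), (10.0, -2.0), (-10.0, 2.0), (-10.0, -2.0)]
background = [(10.0, 0.1), (10.0, -0.1), (-10.0, 0.1), (-10.0, -0.1)]

pca_dir = cpca_direction(target, background, alpha=0.0)   # dominated by x
cpca_dir = cpca_direction(target, background, alpha=1.0)  # dominated by y
```

In this toy case plain PCA picks the x-axis, which both datasets share, while the contrastive direction picks the y-axis, where only the target data varies. The choice of α trades off target variance against background variance.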

**SBG-Sketch: A Self-Balanced Sketch for Labeled-Graph Stream Summarization**

Applications in various domains rely on processing graph streams, e.g., communication logs of a cloud-troubleshooting system, road-network traffic updates, and interactions on a social network. A labeled-graph stream refers to a sequence of streamed edges that form a labeled graph. Label-aware applications need to filter the graph stream before performing a graph operation. Due to the large volume and high velocity of these streams, it is often more practical to incrementally build a lossy-compressed version of the graph, and use this lossy version to approximately evaluate graph queries. Challenges arise when the queries are unknown in advance but are associated with filtering predicates based on edge labels. Surprisingly common, and especially challenging, are labeled-graph streams that have highly skewed label distributions that might also vary over time. This paper introduces Self-Balanced Graph Sketch (SBG-Sketch, for short), a graphical sketch for summarizing and querying labeled-graph streams that can cope with all these challenges. SBG-Sketch maintains synopses of both the edge attributes (e.g., edge weight) and the topology of the streamed graph. SBG-Sketch allows efficient processing of graph-traversal queries, e.g., reachability queries. Experimental results over a variety of real graph streams show that SBG-Sketch reduces the estimation errors of state-of-the-art methods by up to 99%.

**VCExplorer: A Interactive Graph Exploration Framework Based on Hub Vertices with Graph Consolidation**

Graphs have been widely used to model different information networks, such as the Web, biological networks and social networks (e.g., Twitter). Due to the size and complexity of these graphs, how to explore and utilize them has become a very challenging problem. In this paper, we propose VCExplorer, a new interactive graph exploration framework that integrates the strengths of graph visualization and graph summarization. Unlike existing graph visualization tools where vertices of a graph may be clustered into a smaller collection of super/virtual vertices, VCExplorer displays a small number of actual source graph vertices (called hubs) and summaries of the information between these vertices. We refer to such a graph as an HA-graph (Hub-based Aggregation Graph). This allows users to appreciate the relationship between the hubs, rather than super/virtual vertices. Users can navigate through the HA-graph by ‘drilling down’ into the summaries between hubs to display more hubs. We illustrate how graph aggregation techniques can be integrated into the exploration framework to present consolidated information to users. In addition, we propose efficient graph aggregation algorithms over multiple subgraphs via computation sharing. Extensive experimental evaluations have been conducted using both real and synthetic datasets and the results indicate the effectiveness and efficiency of VCExplorer for exploration.

**Temporal Pattern Mining from Evolving Networks**

Evolving networks have recently become a suitable form for modeling many real-world complex systems, owing to their capacity to represent the systems and their constituent entities, the interactions between the entities, and the time-variability of their structure and properties. Designing computational models able to analyze evolving networks is therefore relevant in many applications. The goal of this research project is to evaluate the possible contribution of temporal pattern mining techniques to the analysis of evolving networks. In particular, we aim at exploiting available snapshots to recognize valuable and potentially useful knowledge about the temporal dynamics exhibited by the network over time, without making any prior assumption about the underlying evolutionary schema. Pattern-based approaches to temporal pattern mining can be exploited to detect and characterize changes exhibited by a network over time, starting from observed snapshots.

**Distributed Lance-William Clustering Algorithm**

One important tool is the optimal clustering of data into useful categories. Dividing similar objects into a smaller number of clusters is of importance in many applications. These include search engines, monitoring of academic performance, biology and wireless networks. We first discuss a number of clustering methods. We present a parallel algorithm for the efficient clustering of objects into groups based on their similarity to each other. The input consists of an n by n distance matrix. This matrix holds a distance ranking for each pair of objects. The smaller the number, the more similar the two objects are to each other. We utilize parallel processors to calculate a hierarchical clustering of these n items based on this matrix. Another advantage of our method is the distribution of the large n by n matrix. We have implemented our algorithm and have found it to be scalable both in terms of processing speed and storage.
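For readers unfamiliar with the Lance-Williams scheme named in the title, the sketch below shows the standard serial version: repeatedly merge the closest pair of clusters and update the remaining distances with the recurrence d(k, i∪j) = αᵢ·d(k,i) + αⱼ·d(k,j) + β·d(i,j) + γ·|d(k,i) − d(k,j)|. It takes an n by n distance matrix as input, as in the abstract, but it is not the paper's parallel or distributed algorithm. The function name `lance_williams` and the three linkage options are assumptions made for illustration.

```python
def lance_williams(dist, linkage="single"):
    """Serial agglomerative clustering on a symmetric n x n distance matrix.

    Returns the merge history as (cluster_a, cluster_b, distance) triples,
    where each cluster is a tuple of original item indices.
    """
    n = len(dist)
    members = {i: (i,) for i in range(n)}  # active cluster id -> items
    # Distances between active clusters, keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d[(a, b)] if a < b else d[(b, a)]

    merges = []
    next_id = n
    while len(members) > 1:
        # Pick the closest pair of active clusters.
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        ni, nj = len(members[i]), len(members[j])
        # Lance-Williams coefficients for the chosen linkage.
        if linkage == "single":
            ai, aj, beta, gamma = 0.5, 0.5, 0.0, -0.5
        elif linkage == "complete":
            ai, aj, beta, gamma = 0.5, 0.5, 0.0, 0.5
        elif linkage == "average":
            ai, aj, beta, gamma = ni / (ni + nj), nj / (ni + nj), 0.0, 0.0
        else:
            raise ValueError("unknown linkage: " + linkage)
        # Distance from every other cluster k to the merged cluster.
        updated = {}
        for k in members:
            if k in (i, j):
                continue
            dki, dkj = get(k, i), get(k, j)
            updated[k] = ai * dki + aj * dkj + beta * dij + gamma * abs(dki - dkj)
        # Drop entries involving i or j, then add the merged cluster.
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        for k, v in updated.items():
            d[(k, next_id)] = v  # k < next_id always holds
        merges.append((members[i], members[j], dij))
        members[next_id] = tuple(sorted(members.pop(i) + members.pop(j)))
        next_id += 1
    return merges


# Four points on a line at 0, 1, 5 and 6.5 (hypothetical data).
points = [0.0, 1.0, 5.0, 6.5]
dist = [[abs(a - b) for b in points] for a in points]
history = lance_williams(dist, linkage="single")
```

Because each update needs only the distances to the two clusters being merged, rows of the matrix can be updated independently, which is what makes this family of methods a natural candidate for parallelization.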

**Doctoral Advisor or Medical Condition: Towards Entity-specific Rankings of Knowledge Base Properties [Extended Version]**

In knowledge bases such as Wikidata, it is possible to assert a large set of properties for entities, ranging from generic ones such as name and place of birth to highly profession-specific or background-specific ones such as doctoral advisor or medical condition. Determining a preference or ranking in this large set is a challenge in tasks such as prioritisation of edits or natural-language generation. Most previous approaches to ranking knowledge base properties are purely data-driven and, as we show, mistake frequency for interestingness. In this work, we have developed a human-annotated dataset of 350 preference judgments among pairs of knowledge base properties for fixed entities. From this set, we isolate a subset of pairs for which humans show a high level of agreement (87.5% on average). We show, however, that baseline and state-of-the-art techniques achieve only 61.3% precision in predicting human preferences for this subset. We then analyze what contributes to one property being rated as more important than another one, and identify that at least three factors play a role, namely (i) general frequency, (ii) applicability to similar entities and (iii) semantic similarity between property and entity. We experimentally analyze the contribution of each factor and show that a combination of techniques addressing all three factors achieves 74% precision on the task. The dataset is available at http://www.kaggle.com/srazniewski/wikidatapropertyranking.

**ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic Graphs**

Single-source and top-k SimRank queries are two important types of similarity search in graphs with numerous applications in web mining, social network analysis, spam detection, etc. A plethora of techniques have been proposed for these two types of queries, but very few can efficiently support similarity search over large dynamic graphs, due to either significant preprocessing time or large space overheads. This paper presents ProbeSim, an index-free algorithm for single-source and top-k SimRank queries that provides a non-trivial theoretical guarantee in the absolute error of query results. ProbeSim estimates SimRank similarities without precomputing any indexing structures, and thus can naturally support real-time SimRank queries on dynamic graphs. Besides the theoretical guarantee, ProbeSim also offers satisfying practical efficiency and effectiveness due to several non-trivial optimizations. We conduct extensive experiments on a number of benchmark datasets, which demonstrate that our solutions significantly outperform the existing methods in terms of efficiency and effectiveness. Notably, our experiments include the first empirical study that evaluates the effectiveness of SimRank algorithms on billion-edge graphs, using the idea of pooling.
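As background for what a single-source or top-k SimRank query returns, the sketch below implements the textbook all-pairs SimRank iteration: s(a, a) = 1, and for a ≠ b, s(a, b) = c / (|I(a)|·|I(b)|) · Σ s(x, y) over in-neighbors x of a and y of b. This naive baseline is quadratic in the number of nodes and is not ProbeSim, whose probing strategy is not described in the abstract. The function names `simrank` and `top_k`, the decay factor c = 0.6, and the toy graph are assumptions made for illustration.

```python
def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration.

    in_neighbors maps every node to the list of nodes with an edge into it.
    """
    nodes = list(in_neighbors)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # A node with no in-neighbors is similar only to itself.
                    new[a][b] = 0.0
                    continue
                total = sum(sim[x][y] for x in ia for y in ib)
                new[a][b] = c * total / (len(ia) * len(ib))
        sim = new
    return sim


def top_k(sim, source, k):
    """Top-k query: the k nodes most similar to source, excluding itself."""
    scores = [(node, s) for node, s in sim[source].items() if node != source]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy graph (hypothetical): u -> a, u -> b, a -> d, b -> e.
graph = {"u": [], "a": ["u"], "b": ["u"], "d": ["a"], "e": ["b"]}
scores = simrank(graph)
closest_to_a = top_k(scores, "a", 1)
```

Here a and b share their only in-neighbor u, so their similarity is c; d and e inherit a discounted similarity of c² through a and b. A single-source query corresponds to reading one row of `scores`, which is the part that index-free methods aim to estimate without computing the full matrix.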

• Interplay of Coulomb interactions and disorder in three dimensional quadratic band crossings without time-reversal or particle-hole symmetry

• Orbits for eighteen visual binaries and two double-line spectroscopic binaries observed with HRCAM on the CTIO SOAR 4m telescope, using a new Bayesian orbit code based on Markov Chain Monte Carlo

• A Driven Tagged Particle in Asymmetric Simple Exclusion Processes

• Orthogonal Series Density Estimation for Complex Surveys

• Time-Optimal Collaborative Guidance using the Generalized Hopf Formula

• On Upper Approximations of Pareto Fronts

• Queuing with Heterogeneous Users: Block Probability and Sojourn times

• A Destroying Driven Tagged Particle in Symmetric Simple Exclusion Processes

• Derivation of Network Reprogramming Protocol with Z3

• Optimal projection of observations in a Bayesian setting

• High-dimensional posterior consistency for hierarchical non-local priors in regression

• Yaglom limits for R-transient chains with non-trivial Martin boundary

• A Server-based Approach for Predictable GPU Access with Improved Analysis

• A Memristive Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT)

• A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent

• Learning of Coordination Policies for Robotic Swarms

• Localization in the Disordered Holstein model

• Secure Beamforming in Full-Duplex SWIPT Systems

• Dynamic Cross-Layer Beamforming in Hybrid Powered Communication Systems With Harvest-Use-Trade Strategy

• Multilevel mixed effects parametric survival analysis

• Estimating model evidence using ensemble-based data assimilation with localization – The model selection problem

• An Attention-based Collaboration Framework for Multi-View Network Representation Learning

• Construction C*: an improved version of Construction C

• On Graphs and the Gotsman-Linial Conjecture for d = 2

• Distributed event-triggered control for multi-agent formation stabilization and tracking

• Unique Information via Dependency Constraints

• Verifying Properties of Binarized Deep Neural Networks

• Think Globally, Embed Locally — Locally Linear Meta-embedding of Words

• An Optimality Proof for the PairDiff operator for Representing Relations between Words

• Controllability and data-driven identification of bipartite consensus on nonlinear signed networks

• Deep Lattice Networks and Partial Monotonic Functions

• Random matrices: repulsion in spectrum

• Concentration of distances in Wigner matrices

• Sieve: Actionable Insights from Monitored Metrics in Microservices

• Property Testing in High Dimensional Ising models

• A Voting-Based System for Ethical Decision Making

• Higher Distance Energies and Expanders with Structure

• Blind Estimation of Sparse Broadband Massive MIMO Channels with Ideal and One-bit ADCs

• Subset Testing and Analysis of Multiple Phenotypes (STAMP)

• Reversible Joint Hilbert and Linear Canonical Transform Without Distortion

• Online Learning of a Memory for Learning Rates

• Empowering In-Memory Relational Database Engines with Native Graph Processing

• Covering Numbers for Semicontinuous Functions

• Some inequalities for $k$-colored partition functions

• The cohomology of abelian Hessenberg varieties and the Stanley-Stembridge conjecture

• Measuring Player Retention and Monetization using the Mean Cumulative Function

• Equilibrium fluctuations for the weakly asymmetric discrete Atlas model

• SegFlow: Joint Learning for Video Object Segmentation and Optical Flow

• The Fourth Characteristic of a Semimartingale

• A shared latent space matrix factorisation method for recommending new trial evidence for systematic review updates

• Exponential concentration for zeroes of stationary Gaussian processes

• Transfer learning from synthetic to real images using variational autoencoders for robotic applications

• Real-time Semantic Segmentation of Crop and Weed for Precision Agriculture Robots Leveraging Background Knowledge in CNNs

• Latent Embeddings for Collective Activity Recognition

• A Central Limit Theorem for Fleming-Viot Particle Systems with Hard Killing

• Information-Coupled Turbo Codes for LTE Systems

• Careful prior specification avoids incautious inference for log-Gaussian Cox point processes

• Stochastic Channel Modeling for Diffusive Mobile Molecular Communication Systems

• A stencil scaling approach for accelerating matrix-free finite element implementations

• Complexity of Finding Perfect Bipartite Matchings Minimizing the Number of Intersecting Edges

• The Life in 1-Consensus

• Block-Diagonal Solutions to Lyapunov Inequalities and Generalisations of Diagonal Dominance

• Efficient Graph Edit Distance Computation and Verification via Anchor-aware Lower Bound Estimation

• On $2$-chains inside thin subsets of $\mathbb{R}^d$ and product of distances

• Affordable and Energy-Efficient Cloud Computing Clusters: The Bolzano Raspberry Pi Cloud Cluster Experiment

• Updating the silent speech challenge benchmark with deep learning

• Miscorrection-free Decoding of Staircase Codes

• Time-dependent reflection at the localization transition

• Differential transcendence & algebraicity criteria for the series counting weighted quadrant walks

• Atomic Norm Denoising-Based Joint Channel Estimation and Faulty Antenna Detection for Massive MIMO

• Higher Order Concentration of Measure

• UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning

• Towards a better understanding of the matrix product function approximation algorithm in application to quantum physics

• Berry-Esseen Bounds for typical weighted sums

• Bandits with Delayed Anonymous Feedback

• A note on the 4-girth-thickness of K_{n,n,n}

• Specification tests in semiparametric transformation models

• On Energy Efficient Uplink Multi-User MIMO with Shared LNA Control

• Using marginal structural models to adjust for treatment drop-in when developing clinical prediction models

• Learning quadrangulated patches for 3D shape parameterization and completion

• Open Source Dataset and Deep Learning Models for Online Digit Gesture Recognition on Touchscreens

• Integrating hyper-parameter uncertainties in a multi-fidelity Bayesian model for the estimation of a probability of failure

• Forbidden Subgraphs for Chorded Pancyclicity

• The random pinning model with correlated disorder given by a renewal set

• De-identification of medical records using conditional random fields and long short-term memory networks

• EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning

• Linear Quadratic Games with Costly Measurements

• Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

• Constructing a Hierarchical User Interest Structure based on User Profiles

• Bayesian Optimization with Automatic Prior Selection for Data-Efficient Direct Policy Search

• A Byzantine Fault-Tolerant Ordering Service for the Hyperledger Fabric Blockchain Platform

• Stock-out Prediction in Multi-echelon Networks

• Spatial features of synaptic adaptation affecting learning performance

• NOMA Assisted Wireless Caching: Strategies and Performance Analysis

• Iterated Stochastic Integrals in Infinite Dimensions – Approximation and Error Estimates

• Stochastic Burgers’ Equation on the Real Line: Regularity and Moment Estimates

• Synchronization in Kuramoto-Sakaguchi ensembles with competing influence of common noise and global coupling

• An Expectation Conditional Maximization approach for Gaussian graphical models

• New Examples of Dimension Zero Categories

• Deep Reinforcement Learning for Dexterous Manipulation with Concept Networks

• Characterization and enumeration of 3-regular permutation graphs

• Error-tolerant Multisecant Method for Nonlinearly Constrained Optimization

• Equilibrium-Independent Dissipativity with Quadratic Supply Rates

• Text Compression for Sentiment Analysis via Evolutionary Algorithms