**Learning Causally-Generated Stationary Time Series**

We present the Causal Gaussian Process Convolution Model (CGPCM), a doubly nonparametric model for causal, spectrally complex dynamical phenomena. The CGPCM is a generative model in which white noise is passed through a causal, nonparametric-window moving-average filter, a construction that we show to be equivalent to a Gaussian process with a nonparametric kernel that is biased towards causally-generated signals. We develop enhanced variational inference and learning schemes for the CGPCM and its previous acausal variant, the GPCM (Tobar et al., 2015b), that significantly improve statistical accuracy. These modelling and inferential contributions are demonstrated on a range of synthetic and real-world signals.
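The generative construction can be sketched numerically: white noise driven through a window that vanishes at negative lags yields a causally-generated signal, since each output sample depends only on past noise. The decaying-exponential window below is an illustrative choice, not the paper's learned nonparametric filter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
noise = rng.standard_normal(n)

lags = np.arange(100)
h = np.exp(-0.05 * lags)            # causal window: support only on lags >= 0
signal = np.convolve(noise, h)[:n]  # y[t] = sum_{s>=0} h[s] * noise[t-s]
```

Because `h` has no mass at negative lags, perturbing future noise leaves past outputs untouched, which is the causality bias the model encodes.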

**Structured low-rank matrix completion for forecasting in time series analysis**

In this paper we consider the low-rank matrix completion problem with specific application to forecasting in time series analysis. Briefly, the low-rank matrix completion problem is the problem of imputing missing values of a matrix under a rank constraint. We consider a matrix completion problem for Hankel matrices and a convex relaxation based on the nuclear norm. Based on new theoretical results and a number of numerical and real examples, we investigate the cases when the proposed approach can work. Our results highlight the importance of choosing a proper weighting scheme for the known observations.
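The link between Hankel structure and forecasting can be seen in a minimal example (my construction, not the paper's algorithm): a series obeying a low-order linear recurrence produces a low-rank Hankel matrix, so rank-constrained completion of the matrix's missing (future) entries amounts to forecasting. The window length is an arbitrary illustrative choice.

```python
import numpy as np

t = np.arange(12)
series = 0.9 ** t      # obeys x[t+1] = 0.9 * x[t], hence a rank-1 signal
L = 5                  # Hankel window length (arbitrary)
hankel = np.array([series[i:i + L] for i in range(len(series) - L + 1)])
rank = np.linalg.matrix_rank(hankel)
```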

**Artificial Intelligence and Legal Liability**

A recent issue of a popular computing journal asked which laws would apply if a self-driving car killed a pedestrian. This paper considers the question of legal liability for artificially intelligent computer systems. It discusses whether criminal liability could ever apply; to whom it might apply; and, under civil law, whether an AI program is a product that is subject to product design legislation or a service to which the tort of negligence applies. The issue of sales warranties is also considered. A discussion of some of the practical limitations that AI systems are subject to is also included.

**Manipulating and Measuring Model Interpretability**

Despite a growing body of research focused on creating interpretable machine learning methods, there have been few empirical studies verifying whether interpretable methods achieve their intended effects on end users. We present a framework for assessing the effects of model interpretability on users via pre-registered experiments in which participants are shown functionally identical models that vary in factors thought to influence interpretability. Using this framework, we ran a sequence of large-scale randomized experiments, varying two putative drivers of interpretability: the number of features and the model transparency (clear or black-box). We measured how these factors impact trust in model predictions, the ability to simulate a model, and the ability to detect a model’s mistakes. We found that participants who were shown a clear model with a small number of features were better able to simulate the model’s predictions. However, we found no difference in multiple measures of trust and found that clear models did not improve the ability to correct mistakes. These findings suggest that interpretability research could benefit from more emphasis on empirically verifying that interpretable models achieve all their intended effects.

**Learning to Explain: An Information-Theoretic Perspective on Model Interpretation**

We introduce instancewise feature selection as a methodology for model interpretation. Our method is based on learning a function to extract a subset of features that are most informative for each given example. This feature selector is trained to maximize the mutual information between selected features and the response variable, where the conditional distribution of the response variable given the input is the model to be explained. We develop an efficient variational approximation to the mutual information, and show that the resulting method compares favorably to other model explanation methods on a variety of synthetic and real data sets using both quantitative metrics and human evaluation.
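A toy version of instancewise selection, as a deliberate simplification: the paper trains a neural selector against a variational mutual-information bound, whereas here the per-example score is just each feature's effect on a fixed linear model. The weights `w` and input `x` are made up for illustration.

```python
import numpy as np

w = np.array([3.0, -0.1, 2.0, 0.05])   # hypothetical model to be explained
x = np.array([1.0, 5.0, -1.0, 4.0])    # one example to explain
k = 2

scores = np.abs(w * x)                 # per-feature contribution to w @ x
selected = np.argsort(scores)[-k:]     # indices of the k most informative features
```

The selection is per-example: a different `x` can yield a different feature subset, which is exactly what distinguishes instancewise from global feature selection.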

**Data Privacy for a $\rho$-Recoverable Function**

A user’s data is represented by a finite-valued random variable. Given a function of the data, a querier is required to recover, with at least a prescribed probability, the value of the function based on a query response provided by the user. The user devises the query response, subject to the recoverability requirement, so as to maximize privacy of the data from the querier. Privacy is measured by the probability of error incurred by the querier in estimating the data from the query response. We analyze single and multiple independent query responses, with each response satisfying the recoverability requirement, that provide maximum privacy to the user. Achievability schemes with explicit randomization mechanisms for query responses are given and their privacy compared with converse upper bounds.
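A toy mechanism illustrating the recoverability constraint (not the paper's optimal scheme): the data is uniform on {0,1,2,3}, the queried function is its parity, and the response reports the parity truthfully with probability rho, so the querier recovers the function value with probability rho while the data itself stays 2-to-1 ambiguous even given the true parity.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.9
trials = 200_000

x = rng.integers(0, 4, size=trials)            # private data, uniform on {0,1,2,3}
truth = x % 2                                  # queried function f(X) = parity
flip = (rng.random(trials) > rho).astype(int)  # flip response with prob 1 - rho
response = (truth + flip) % 2
recovery_rate = float(np.mean(response == truth))
```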

**Federated Meta-Learning for Recommendation**

Recommender systems have been widely studied from the machine learning perspective, where it is crucial to share information among users while preserving user privacy. In this work, we present a federated meta-learning framework for recommendation in which user information is shared at the level of the algorithm, rather than at the level of the model or the data as in previous approaches. In this framework, user-specific recommendation models are locally trained by a shared parameterized algorithm, which preserves user privacy and at the same time utilizes information from other users to help model training. Interestingly, the model thus trained exhibits high capacity at a small scale, which makes it energy- and communication-efficient. Experimental results show that recommendation models trained by meta-learning algorithms in the proposed framework outperform the state-of-the-art in accuracy and scale. For example, on a production dataset, a shared model under Google Federated Learning (McMahan et al., 2017) with 900,000 parameters achieves a prediction accuracy of 76.72%, while a shared algorithm under federated meta-learning with fewer than 30,000 parameters achieves an accuracy of 86.23%.
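The "share the algorithm, not the model or data" idea can be sketched with a Reptile-style meta-update, a stand-in for whatever meta-learner the paper actually uses: the server maintains a shared initialization, each user adapts it locally on private data, and only the adaptation direction is sent back. The least-squares tasks below are invented for illustration.

```python
import numpy as np

def local_adapt(theta, x, y, lr=0.1, steps=20):
    # User-side: a few gradient steps on a private (x, y) pair; data never leaves.
    for _ in range(steps):
        theta = theta - lr * 2 * x * (theta * x - y)
    return theta

theta = 0.0                                      # shared initialization
users = [(1.0, 2.0), (1.0, 2.2), (1.0, 1.8)]     # hypothetical private data
for _ in range(50):                              # server meta-rounds
    deltas = [local_adapt(theta, x, y) - theta for x, y in users]
    theta += 0.5 * np.mean(deltas)               # meta-update from directions only
```

The server ends up near the average of the users' solutions without ever seeing raw data, only adaptation directions.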

**Asynchronous Byzantine Machine Learning**

Asynchronous distributed machine learning solutions have proven very effective so far, but they always assume perfectly functioning workers. In practice, some workers can exhibit Byzantine behavior, caused by hardware failures, software bugs, corrupt data, or even malicious attacks. We introduce \emph{Kardam}, the first distributed asynchronous stochastic gradient descent (SGD) algorithm that copes with Byzantine workers. Kardam consists of two complementary components: a filtering component and a dampening component. The first is scalar-based and ensures resilience against Byzantine workers: essentially, this filter leverages the Lipschitzness of cost functions and acts as a self-stabilizer against Byzantine workers that would attempt to corrupt the progress of SGD. The dampening component bounds the convergence rate by adjusting to stale information through a generic gradient weighting scheme. We prove that Kardam guarantees almost sure convergence in the presence of asynchrony and Byzantine behavior, and we derive its convergence rate. We evaluate Kardam on the CIFAR-100 and EMNIST datasets and measure its overhead with respect to non-Byzantine-resilient solutions. We empirically show that Kardam does not introduce additional noise to the learning procedure but does induce a slowdown (the cost of Byzantine resilience) that we both theoretically and empirically show to be less than $f/n$, where $f$ is the number of Byzantine failures tolerated and $n$ is the total number of workers. Interestingly, we also observe empirically that the dampening component is interesting in its own right, for it enables building an SGD algorithm that outperforms alternative staleness-aware asynchronous competitors in environments with honest workers.
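The filtering component's use of Lipschitzness can be sketched as follows (a simplification of Kardam's actual criterion): a submitted gradient is accepted only if its growth rate relative to the parameter change is consistent with the empirical Lipschitz constants observed so far.

```python
import numpy as np

def lipschitz_filter(history, grad, prev_grad, theta, prev_theta, quantile=0.9):
    # Empirical Lipschitz ratio of the newly submitted gradient.
    ratio = np.linalg.norm(grad - prev_grad) / (np.linalg.norm(theta - prev_theta) + 1e-12)
    if history and ratio > np.quantile(history, quantile):
        return False, history              # inconsistent growth: likely Byzantine, reject
    return True, history + [ratio]         # consistent: accept and record the ratio
```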

**SparCML: High-Performance Sparse Communication for Machine Learning**

One of the main drivers behind the rapid recent advances in machine learning has been the availability of efficient system support, both through hardware acceleration and through efficient software frameworks and programming models. Despite significant progress, scaling compute-intensive machine learning workloads to a large number of compute nodes is still a challenging task, with significant latency and bandwidth demands. In this paper, we address this challenge by proposing SPARCML, a general, scalable communication layer for machine learning applications. SPARCML is built on the observation that many distributed machine learning algorithms either have naturally sparse communication patterns, or have updates which can be sparsified in a structured way for improved performance, without any convergence or accuracy loss. To exploit this insight, we design and implement a set of communication-efficient protocols for sparse input data, in conjunction with efficient machine learning algorithms which can leverage these primitives. Our communication protocols generalize standard collective operations by allowing processes to contribute sparse input data vectors of heterogeneous sizes. We call these operations sparse-input collectives, and present efficient practical algorithms with strong theoretical bounds on their running time and communication cost. Our generic communication layer is enriched with additional features, such as support for non-blocking (asynchronous) operations and support for low-precision data representations. We validate our algorithmic results experimentally on a range of large-scale machine learning applications and target architectures, showing that we can leverage sparsity for order-of-magnitude runtime savings compared to state-of-the-art methods and frameworks.
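The semantics of a sparse-input collective can be illustrated with a single-process toy (not SparCML's actual distributed protocol): each worker contributes an index/value list of heterogeneous size, and the reduction sums overlapping indices without ever materializing dense vectors.

```python
def sparse_allreduce(contributions):
    # contributions: one (index, value) list per worker, sizes may differ.
    out = {}
    for pairs in contributions:
        for i, v in pairs:
            out[i] = out.get(i, 0.0) + v
    return sorted(out.items())        # result is still sparse: only touched indices

workers = [[(0, 1.0), (7, 2.0)], [(7, 3.0)], [(2, 4.0), (0, 1.0)]]
reduced = sparse_allreduce(workers)
```

When the union of touched indices is small relative to the vector dimension, the communicated volume scales with the sparsity rather than the dense size, which is the source of the reported savings.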

**The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets**

Machine learning models based on neural networks and deep learning are being rapidly adopted for many purposes. What those models learn, and what they may share, is a significant concern when the training data may contain secrets and the models are public — e.g., when a model trained on all users’ messages helps users compose text messages. This paper presents exposure: a simple-to-compute metric that can be applied to any deep learning model for measuring the memorization of secrets. Using this metric, we show how to extract those secrets efficiently using black-box API access. Further, we show that unintended memorization occurs early, is not due to over-fitting, and is a persistent issue across different types of models, hyperparameters, and training strategies. We experiment with both real-world models (e.g., a state-of-the-art translation model) and datasets (e.g., the Enron email dataset, which contains users’ credit card numbers) to demonstrate both the utility of measuring exposure and the ability to extract secrets. Finally, we consider many defenses, finding some (like regularization) ineffective and others lacking guarantees. However, by instantiating our own differentially-private recurrent model, we validate that by appropriately investing in the use of state-of-the-art techniques, the problem can be resolved, with high utility.
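The rank-based form of the exposure metric, as I read it, is log2 of the number of candidate secrets minus log2 of the true secret's rank by model log-perplexity, so a top-ranked (most likely) secret attains maximal exposure:

```python
import math

def exposure(candidate_logppl, secret_logppl):
    # Rank of the true secret among all candidates (lower log-perplexity = more likely).
    rank = 1 + sum(1 for v in candidate_logppl if v < secret_logppl)
    return math.log2(len(candidate_logppl)) - math.log2(rank)
```

A secret ranked in the middle of the candidate list gets exposure near zero, matching the intuition that the model reveals nothing about it.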

**Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics**

Learning speaker-specific features is vital in many applications like speaker recognition, diarization, and speech recognition. This paper presents a novel approach, which we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data that may even contain multi-speaker audio streams. The NPC framework exploits the proposed short-term active-speaker stationarity hypothesis, which assumes that two temporally close short speech segments belong to the same speaker; hence a common representation that encodes the commonalities of both segments should capture the vocal characteristics of that speaker. We train a convolutional deep siamese network to produce ‘speaker embeddings’ by optimizing a loss function that increases between-speaker variability and decreases within-speaker variability. The trained NPC model can produce these embeddings by projecting any test audio stream into a high-dimensional manifold where speech frames of the same speaker come closer than they do in the raw feature space. Results in a frame-level speaker classification experiment, along with visualization of the embeddings, demonstrate the distinctive ability of the NPC model to learn short-term speaker-specific features compared to raw MFCC features and i-vectors. Utterance-level speaker classification experiments show that concatenating simple statistics of the short-term NPC embeddings over the whole utterance with the utterance-level i-vectors provides useful complementary information and boosts classification accuracy. The results also show the efficacy of this technique in learning those characteristics from a large unlabeled training set that carries no prior information about the environment of the test set.
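The training objective described can be sketched with a standard contrastive loss (a stand-in for the paper's exact loss function): embeddings of same-speaker segment pairs are pulled together, while different-speaker pairs are pushed at least a margin apart.

```python
import numpy as np

def contrastive_loss(e1, e2, same_speaker, margin=1.0):
    d = np.linalg.norm(e1 - e2)
    if same_speaker:
        return d ** 2                        # decrease within-speaker variability
    return max(0.0, margin - d) ** 2         # increase between-speaker variability
```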

**Multimodal Named Entity Recognition for Short Social Media Posts**

We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images. These social media posts often come with inconsistent or incomplete syntax and lexical notation and very limited surrounding textual context, bringing significant challenges for NER. To this end, we create a new dataset for MNER called SnapCaptions (Snapchat image-caption pairs submitted to public and crowd-sourced stories, with fully annotated named entities). We then build upon state-of-the-art Bi-LSTM word/character-based NER models with 1) a deep image network which incorporates relevant visual context to augment textual information, and 2) a generic modality-attention module which learns to attenuate irrelevant modalities while amplifying the most informative ones to extract contexts from, adaptive to each sample and token. The proposed MNER model with modality attention significantly outperforms state-of-the-art text-only NER models by successfully leveraging the provided visual contexts, opening up potential applications of MNER on a myriad of social media platforms.
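The modality-attention module can be sketched as generic softmax gating (my reading of the mechanism, not the paper's exact architecture): per token, scalar scores for the text and visual modalities are normalized and used to reweight each modality's context vector, attenuating the less informative one.

```python
import numpy as np

def modality_attention(scores, contexts):
    # scores: one scalar per modality; contexts: one vector per modality.
    scores = np.asarray(scores, dtype=float)
    w = np.exp(scores - scores.max())
    w = w / w.sum()                              # softmax over modalities
    fused = sum(wi * c for wi, c in zip(w, contexts))
    return fused, w

fused, w = modality_attention([2.0, 2.0],
                              [np.ones(4), np.zeros(4)])  # e.g. text vs. visual
```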

**L2-Nonexpansive Neural Networks**

This paper proposes a class of well-conditioned neural networks in which a unit amount of change in the inputs causes at most a unit amount of change in the outputs or any of the internal layers. We develop the known methodology of controlling Lipschitz constants to realize its full potential in maximizing robustness: our linear and convolution layers subsume those in the previous Parseval networks as a special case and allow greater degrees of freedom; aggregation, pooling, splitting and other operators are adapted in new ways, and a new loss function is proposed, all for the purpose of improving robustness. With MNIST and CIFAR-10 classifiers, we demonstrate a number of advantages. Without needing any adversarial training, the proposed classifiers exceed the state of the art in robustness against white-box L2-bounded adversarial attacks. Their outputs are quantitatively more meaningful than ordinary networks and indicate levels of confidence. They are also free of exploding gradients, among other desirable properties.
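The basic constraint can be shown minimally (this is not the paper's layer designs, just the underlying fact they build on): dividing a weight matrix by its largest singular value makes the linear map 1-Lipschitz in the L2 norm, so a unit change in the input produces at most a unit change in the output.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 6))
W = W / np.linalg.norm(W, 2)     # divide by spectral norm: now nonexpansive

x = rng.standard_normal(6)
dx = rng.standard_normal(6)
amplification = np.linalg.norm(W @ (x + dx) - W @ x) / np.linalg.norm(dx)
```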

**The Clever Shopper Problem**

We investigate a variant of the so-called ‘Internet Shopping Problem’ introduced by Blazewicz et al. (2010), where a customer wants to buy a list of products at the lowest possible total cost from shops which offer discounts when purchases exceed a certain threshold. Although the problem is NP-hard, we provide exact algorithms for several cases, e.g. when each shop sells only two items, and a fixed-parameter tractable (FPT) algorithm parameterized by the number of items, or by the number of shops when all prices are equal. We complement each result with hardness proofs in order to draw a tight boundary between tractable and intractable cases. Finally, we give an approximation algorithm and hardness results for the problem of maximising the sum of discounts.
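The cost model can be pinned down with an exponential brute force (illustration of the objective only; the prices, discounts, and thresholds are made up): every assignment of items to shops is scored as the price paid per shop, minus that shop's discount whenever its subtotal meets the threshold.

```python
from itertools import product

def best_cost(prices, discounts, thresholds):
    # prices[i][j]: price of item i at shop j.
    n_items, n_shops = len(prices), len(prices[0])
    best = float("inf")
    for assign in product(range(n_shops), repeat=n_items):
        per_shop = [0.0] * n_shops
        for item, shop in enumerate(assign):
            per_shop[shop] += prices[item][shop]
        total = sum(s - (discounts[j] if s > 0 and s >= thresholds[j] else 0.0)
                    for j, s in enumerate(per_shop))
        best = min(best, total)
    return best
```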

**The State of the Art in Integrating Machine Learning into Visual Analytics**

Visual analytics systems combine machine learning or other analytic techniques with interactive data visualization to promote sensemaking and analytical reasoning. It is through such techniques that people can make sense of large, complex data. While progress has been made, the tactful combination of machine learning and data visualization is still under-explored. This state-of-the-art report presents a summary of the progress that has been made by highlighting and synthesizing select research advances. Further, it presents opportunities and challenges to enhance the synergy between machine learning and visual analytics for impactful future research directions.

**Learning Topic Models by Neighborhood Aggregation**

Topic models are among the most frequently used models in machine learning due to their high interpretability and modular structure. However, extending a topic model to include a supervisory signal, incorporate pre-trained word embedding vectors, or add a nonlinear output function is not an easy task, because one has to resort to a highly intricate approximate inference procedure. In this paper, we show that topic models can be viewed as performing a neighborhood aggregation algorithm in which messages are passed through a network defined over words. Under this network view, nodes correspond to words in a document, and edges correspond either to a relationship between co-occurring words in a document or to a relationship between occurrences of the same word across the corpus. The network view allows us to extend the model to include supervisory signals, incorporate pre-trained word embedding vectors, and add a nonlinear output function in a simple manner. Moreover, we describe a simple way to train the model that is well suited to the semi-supervised setting, where we have supervisory signals for only a portion of the corpus and the goal is to improve prediction performance on held-out data. Through careful experiments, we show that our approach outperforms a state-of-the-art supervised Latent Dirichlet Allocation implementation in both held-out document classification tasks and topic coherence.
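One aggregation round under the network view might look like the following toy (my reading of the mechanism, not the paper's model): each word-node's topic vector is replaced by the normalized average of its neighbors' vectors in the co-occurrence graph.

```python
import numpy as np

adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]], dtype=float)   # word co-occurrence edges (toy graph)
topics = np.array([[1.0, 0.0],
                   [0.5, 0.5],
                   [0.0, 1.0]])            # per-word topic proportions (made up)

agg = adj @ topics / adj.sum(axis=1, keepdims=True)  # average neighbor messages
agg = agg / agg.sum(axis=1, keepdims=True)           # renormalize to proportions
```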

**Finding Top-k Optimal Sequenced Routes — Full Version**

Motivated by many practical applications in logistics and mobility-as-a-service, we study top-k optimal sequenced routes (KOSR) querying on large, general graphs where the edge weights may not satisfy the triangle inequality, e.g., road network graphs with travel times as edge weights. The KOSR querying strives to find the top-k optimal routes (i.e., those with the top-k minimal total costs) from a given source to a given destination, which must visit a number of vertices with specific vertex categories (e.g., gas stations, restaurants, and shopping malls) in a particular order (e.g., visiting gas stations before restaurants and then shopping malls). To efficiently find the top-k optimal sequenced routes, we propose two algorithms, PruningKOSR and StarKOSR. In PruningKOSR, we define a dominance relationship between two partially-explored routes. Partially-explored routes that are dominated by others are postponed from being extended, which leads to a smaller search space and thus improves efficiency. In StarKOSR, we further improve efficiency by extending routes in an A* manner. With the help of a judiciously designed heuristic estimation that works for general graphs, the cost from partially-explored routes to the destination can be estimated so that qualified complete routes can be found early. In addition, we demonstrate the high extensibility of the proposed algorithms by incorporating Hop Labeling, an effective label indexing technique for shortest path queries, to further improve efficiency. Extensive experiments on multiple real-world graphs demonstrate that the proposed methods significantly outperform the baseline method. Furthermore, when k=1, StarKOSR also outperforms the state-of-the-art method for optimal sequenced route queries.
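For k=1, the problem statement itself admits a brute-force baseline (shown only to fix the objective; PruningKOSR and StarKOSR are the paper's efficient algorithms, and the tiny graph below is invented): choose one vertex per required category, in the required order, minimizing total path cost.

```python
import itertools

def osr_cost(dist, source, dest, categories):
    # categories: ordered list of candidate-vertex lists, one per required category.
    best = float("inf")
    for picks in itertools.product(*categories):
        route = [source, *picks, dest]
        cost = sum(dist[a][b] for a, b in zip(route, route[1:]))
        best = min(best, cost)
    return best

# source -> {gas station g1 or g2} -> {restaurant r1} -> destination
dist = {"s": {"g1": 1, "g2": 4}, "g1": {"r1": 2}, "g2": {"r1": 1}, "r1": {"d": 1}}
```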

**Diversity regularization in deep ensembles**

Calibrating the confidence of supervised learning models is important in a variety of contexts where the certainty of predictions must be reliable. However, it has been reported that deep neural network models are often poorly calibrated on complex tasks that require reliable uncertainty estimates. In this work, we propose a strategy for training deep ensembles with a diversity-function regularization, which improves calibration while maintaining similar prediction accuracy.
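The objective can be sketched with a negative-correlation-style penalty (a stand-in for the paper's diversity function, which the abstract does not specify): total loss is each member's fit error minus a reward for disagreeing with the ensemble mean.

```python
import numpy as np

def ensemble_loss(preds, target, lam=0.1):
    # preds: one prediction per ensemble member for the same example.
    preds = np.asarray(preds, dtype=float)
    mean = preds.mean(axis=0)
    fit = np.mean((preds - target) ** 2)        # members should be accurate...
    diversity = np.mean((preds - mean) ** 2)    # ...but also disagree with each other
    return fit - lam * diversity
```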

**An Analysis of Categorical Distributional Reinforcement Learning**

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.
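The categorical projection step at the heart of CDRL, in its commonly stated form (details may differ from any particular implementation): return mass that lands off the fixed support after applying `r + gamma * z` is split linearly between the two nearest atoms.

```python
import numpy as np

def project(probs, atoms, r, gamma):
    # Project the shifted/scaled distribution back onto the fixed support `atoms`.
    proj = np.zeros_like(probs)
    vmin, vmax = atoms[0], atoms[-1]
    dz = atoms[1] - atoms[0]
    for p, z in zip(probs, atoms):
        tz = np.clip(r + gamma * z, vmin, vmax)   # Bellman-updated atom, clipped
        b = (tz - vmin) / dz                      # fractional index on the support
        lo, hi = int(np.floor(b)), int(np.ceil(b))
        if lo == hi:
            proj[lo] += p                         # landed exactly on an atom
        else:
            proj[lo] += p * (hi - b)              # linear split between neighbors
            proj[hi] += p * (b - lo)
    return proj
```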

**Vector Field Based Neural Networks**

A novel neural network architecture is proposed that uses the mathematically and physically rich idea of vector fields as hidden layers to perform nonlinear transformations on the data. The data points are interpreted as particles moving along a flow defined by the vector field, which intuitively represents the desired movement to enable classification. The architecture moves the data points from their original configuration to a new one following the streamlines of the vector field, with the objective of reaching a final configuration where the classes are separable. An optimization problem is solved through gradient descent to learn this vector field.
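The core idea can be sketched with a fixed, hand-chosen field (nothing learned here): points flow along the field by Euler steps, and a field pointing away from the origin stretches the gap between two radius-separated points, making them easier to split linearly.

```python
import numpy as np

def flow(points, field, step=0.1, n_steps=10):
    # Move each data point along the vector field's streamlines by Euler steps.
    for _ in range(n_steps):
        points = points + step * field(points)
    return points

radial = lambda p: p   # hypothetical field V(x) = x, pointing away from origin
moved = flow(np.array([[1.0, 0.0], [0.1, 0.0]]), radial)
```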

• Facilitated quantum cellular automata as simple models with nonthermal eigenstates and dynamics

• Machine Theory of Mind

• Determining the best classifier for predicting the value of a boolean field on a blood donor database

• Aggregating the response in time series regression models, applied to weather-related cardiovascular mortality

• Lossless Compression of Angiogram Foreground with Visual Quality Preservation of Background

• Generalizable Adversarial Examples Detection Based on Bi-model Decision Mismatch

• The Lattice of subracks is atomic

• Counting Motifs with Graph Sampling

• Left Ventricle Segmentation in Cardiac MR Images Using Fully Convolutional Network

• Proving ergodicity via divergence of ergodic sums

• Lossless Image Compression Algorithm for Wireless Capsule Endoscopy by Content-Based Classification of Image Blocks

• Reversible Image Watermarking for Health Informatics Systems Using Distortion Compensation in Wavelet Domain

• Segmentation of Bleeding Regions in Wireless Capsule Endoscopy Images an Approach for inside Capsule Video Summarization

• Semantic Segmentation Refinement by Monte Carlo Region Growing of High Confidence Detections

• Optimal Multi-User Scheduling of Buffer-Aided Relay Systems

• Liver Segmentation in Abdominal CT Images by Adaptive 3D Region Growing

• Communication Complexity of One-Shot Remote State Preparation

• Continuous Relaxation of MAP Inference: A Nonconvex Perspective

• Liver segmentation in CT images using three dimensional to two dimensional fully connected network

• A New Hybrid Half-Duplex/Full-Duplex Relaying System with Antenna Diversity

• Protecting Sensory Data against Sensitive Inferences

• Low complexity convolutional neural network for vessel segmentation in portable retinal diagnostic devices

• A Guide to Comparing the Performance of VA Algorithms

• Communication Melting in Graphs and Complex Networks

• Permanental processes with kernels that are not equivalent to a symmetric matrix

• Formalizing and Implementing Distributed Ledger Objects

• Phase transition for infinite systems of spiking neurons

• Variational Inference for Policy Gradient

• Learning to Gather without Communication

• Equivelar toroids with few flag-orbits

• Mutual Assent or Unilateral Nomination? A Performance Comparison of Intersection and Union Rules for Integrating Self-reports of Social Relationships

• CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions

• Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation

• Concise Complexity Analyses for Trust-Region Methods

• Detecting Small, Densely Distributed Objects with Filter-Amplifier Networks and Loss Boosting

• Cross-Modality Synthesis from CT to PET using FCN and GAN Networks for Improved Automated Lesion Detection

• Driver Hand Localization and Grasp Analysis: A Vision-based Real-time Approach

• xView: Objects in Context in Overhead Imagery

• MPST: A Corpus of Movie Plot Synopses with Tags

• Modelling spatiotemporal variation of positive and negative sentiment on Twitter to improve the identification of localised deviations

• Efficient Enumeration of Dominating Sets for Sparse Graphs

• Multi-Sensor Integration for Indoor 3D Reconstruction

• End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching

• Regularity of biased 1D random walks in random environment

• Improved Techniques For Weakly-Supervised Object Localization

• Entropy Rate Estimation for Markov Chains with Large State Space

• A New Design of Binary MDS Array Codes with Asymptotically Weak-Optimal Repair

• Learning Mixtures of Linear Regressions with Nearly Optimal Complexity

• Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points

• On the implementation of a primal-dual algorithm for second order time-dependent mean field games with local couplings

• Safety-Aware Optimal Control of Stochastic Systems Using Conditional Value-at-Risk

• Regional Multi-Armed Bandits

• Video Person Re-identification by Temporal Residual Learning

• Magnetoresistance in organic semiconductors: including pair correlations in the kinetic equations for hopping transport

• Dynamic Output Feedback Guaranteed-Cost Synchronization for Multiagent Networks with Given Cost Budgets

• Exploiting Inter-User Interference for Secure Massive Non-Orthogonal Multiple Access

• The Hidden Vulnerability of Distributed Learning in Byzantium

• Graph-Based Blind Image Deblurring From a Single Photograph

• Where’s YOUR focus: Personalized Attention

• Faster integer multiplication using short lattice vectors

• Adversarial Learning for Semi-Supervised Semantic Segmentation

• Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning

• Two theorems on distribution of Gaussian quadratic forms

• Aspect-Aware Latent Factor Model: Rating Prediction with Ratings and Reviews

• On detection of Gaussian stochastic sequences

• Numerical integration in arbitrary-precision ball arithmetic

• Actigraphy-based Sleep/Wake Pattern Detection using Convolutional Neural Networks

• Stereo obstacle detection for unmanned surface vehicles by IMU-assisted semantic segmentation

• Non-rigid Object Tracking via Deep Multi-scale Spatial-Temporal Discriminative Saliency Maps

• Topological phases of non-Hermitian systems

• Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples

• Near Isometric Terminal Embeddings for Doubling Metrics

• Robustness of classifiers to uniform $\ell_p$ and Gaussian noise

• Cambrian acyclic domains: counting $c$-singletons

• Learning to Route with Sparse Trajectory Sets—Extended Version

• Joint Antenna Selection and Phase-Only Beamforming Using Mixed-Integer Nonlinear Programming

• Decomposition of a graph into two disjoint odd subgraphs

• Multidimensional multiscale scanning in Exponential Families: Limit theory and statistical consequences

• Generating High-Quality Query Suggestion Candidates for Task-Based Search

• Robust estimators in a generalized partly linear regression model under monotony constraints

• On the permanent of Sylvester-Hadamard matrices

• The use of sampling weights in the M-quantile random-effects regression: an application to PISA mathematics scores

• Sounderfeit: Cloning a Physical Model with Conditional Adversarial Autoencoders

• Iterate averaging as regularization for stochastic gradient descent

• Towards an Understanding of Entity-Oriented Search Intents

• Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks

• Spanned lines and Langer’s inequality

• Structure and Supersaturation for Intersecting Families

• On Rational Delegations in Liquid Democracy

• The Best of Both Worlds: Asymptotically Efficient Mechanisms with a Guarantee on the Expected Gains-From-Trade

• Complex-valued Neural Networks with Non-parametric Activation Functions

• Stabilizing discrete-time linear systems

• Synchronizing the Smallest Possible System

• Are Two (Samples) Really Better Than One? On the Non-Asymptotic Performance of Empirical Revenue Maximization

• 2VRP: a benchmark problem for small but rich VRPs

• Data Consistency Simulation Tool for NoSQL Database Systems

• MagnifyMe: Aiding Cross Resolution Face Recognition via Identity Aware Synthesis

• Classification of Breast Cancer Histology using Deep Learning

• Applications of Optimal Control of a Nonconvex Sweeping Process to Optimization of the Planar Crowd Motion Model

• Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem

• Understanding the Performance of Ceph Block Storage for Hyper-Converged Cloud with All Flash Storage

• Adaptive synchronisation of unknown nonlinear networked systems with prescribed performance

• Sparse Bayesian dynamic network models, with genomics applications

• Harmonious Attention Network for Person Re-Identification

• A novel incentive-based demand response model for Cournot competition in electricity markets

• Scaling limits of discrete snakes with stable branching

• Reliable Intersection Control in Non-cooperative Environments

• Path-Specific Counterfactual Fairness

• Stability and Optimal Control of Switching PDE-Dynamical Systems

• A note on friezes of type $\Lambda_4$ and $\Lambda_6$

• LIDIOMS: A Multilingual Linked Idioms Data Set

• RDF2PT: Generating Brazilian Portuguese Texts from RDF Data

• Collaboratively Learning the Best Option, Using Bounded Memory

• Correlation-Adjusted Survival Scores for High-Dimensional Variable Selection

• Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity

• The spatial Lambda-Fleming-Viot process with fluctuating selection

• Large and realistic models of Amorphous Silicon

• Large-scale limit of interface fluctuation models

• Seeing the forest for the trees? An investigation of network knowledge

• Adversarial Examples that Fool both Human and Computer Vision

• A Polynomial Time Subsumption Algorithm for Nominal Safe $\mathcal{ELO}_\bot$ under Rational Closure

• Half-space Macdonald processes

• ChatPainter: Improving Text to Image Generation using Dialogue

• A new model for Cerebellar computation

• VizWiz Grand Challenge: Answering Visual Questions from Blind People

• Tensor Field Networks: Rotation- and Translation-Equivariant Neural Networks for 3D Point Clouds

• Achievable Rate of Private Function Retrieval from MDS Coded Databases

• Thresholds for vanishing of ‘Isolated’ faces in random Čech and Vietoris-Rips complexes

• Quantum linear systems algorithms: a primer

• Energy Transfer and Spectra in Simulations of Two-dimensional Compressible Turbulence

• A Better (Bayesian) Interval Estimate for Within-Subject Designs

• Pattern-based Modeling of Multiresilience Solutions for High-Performance Computing

• NetChain: Scale-Free Sub-RTT Coordination (Extended Version)

• Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover

• What are the most important factors that influence the changes in London Real Estate Prices? How to quantify them?

• Hessian-based Analysis of Large Batch Training and Robustness to Adversaries

• Arbitrarily Substantial Number Representation for Complex Number

• Characterizing Implicit Bias in Terms of Optimization Geometry