Communication Scheduling as a First-Class Citizen in Distributed Machine Learning Systems

State-of-the-art machine learning systems rely on graph-based models, with the distributed training of these models being the norm in AI-powered production pipelines. The performance of these communication-heavy systems depends on the effective overlap of communication and computation. While the overlap challenge has been addressed in systems with simpler model representations, it remains an open problem in graph-based models. In this work, we develop a system for communication scheduling which realizes near-optimal overlap of communication and computation in graph-based models. Our system is implemented on top of TensorFlow and requires no changes to the model or developer inputs. Our system improves throughput by up to 82% in inference and 20% in training, while also reducing the straggler effect by up to 2.8x. Part of our implementation has already been merged into the TensorFlow codebase; the rest is publicly available.

Ensuring referential integrity under causal consistency

Referential integrity (RI) is an important correctness property of a shared, distributed object storage system. It is sometimes thought that enforcing RI requires a strong form of consistency. In this paper, we argue that causal consistency suffices to maintain RI. We support this argument with pseudocode for a reference CRDT data type that maintains RI under causal consistency. QuickCheck has not found any errors in the model.

Competitive Machine Learning: Best Theoretical Prediction vs Optimization

Machine learning is often used in competitive scenarios: participants learn and fit static models, and those models compete on a shared platform. The common assumption is that in order to win a competition one has to have the best predictive model, i.e., the model with the smallest out-of-sample error. Is that necessarily true? Does the best theoretical predictive model for a target always yield the best reward in a competition? If not, can one take the best model and purposefully change it into a theoretically inferior model which in practice results in a higher competitive edge? What does that modification look like? And finally, if all participants modify their prediction models towards the best practical performance, who benefits the most: players with inferior models, or those with theoretical superiority? The main theme of this paper is to raise these important questions and propose a theoretical model to answer them. We consider a case study where two linear predictive models compete over a shared target. The model with the closest estimate gets the whole reward, which is equal to the absolute value of the target. We characterize the reward function of each model and, using a basic game-theoretic approach, demonstrate that the inferior competitor can significantly improve his performance by choosing optimal model coefficients that are different from the best theoretical prediction. This is a preliminary study that emphasizes the fact that in many applications where predictive machine learning is at the service of competition, much can be gained from practical (back-testing) optimization of the model compared to static prediction improvement.
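
The winner-takes-all setup described above can be explored numerically. The following is a minimal sketch, not the paper's model: it assumes a standard normal target, two players who each observe the target plus independent Gaussian noise, and predictions of the form coefficient times observation. The noise levels, the coefficient grid, and all names (`average_rewards`, `obs_a`, `obs_b`) are illustrative assumptions.

```python
import numpy as np

def average_rewards(coef_a, coef_b, obs_a, obs_b, target):
    """Mean per-round reward for players A and B under winner-takes-all."""
    err_a = np.abs(coef_a * obs_a - target)
    err_b = np.abs(coef_b * obs_b - target)
    a_wins = err_a < err_b            # ties go to B (probability zero here)
    reward = np.abs(target)           # reward equals |target|
    return float(np.mean(reward * a_wins)), float(np.mean(reward * ~a_wins))

rng = np.random.default_rng(0)
n = 200_000
target = rng.normal(size=n)
obs_a = target + 0.5 * rng.normal(size=n)   # assumed: stronger player, less noise
obs_b = target + 1.0 * rng.normal(size=n)   # assumed: weaker player, more noise

# Least-squares coefficients for target ~ c * obs with a unit-variance target.
mse_a = 1.0 / (1.0 + 0.5 ** 2)
mse_b = 1.0 / (1.0 + 1.0 ** 2)

# Player B scans alternative coefficients while A keeps its least-squares one.
grid = np.unique(np.append(np.linspace(0.0, 2.0, 41), mse_b))
b_rewards = [average_rewards(mse_a, c, obs_a, obs_b, target)[1] for c in grid]
best_b = float(grid[int(np.argmax(b_rewards))])
baseline_b = average_rewards(mse_a, mse_b, obs_a, obs_b, target)[1]

print(f"B reward at least-squares coefficient {mse_b:.2f}: {baseline_b:.4f}")
print(f"B reward at best grid coefficient {best_b:.2f}: {max(b_rewards):.4f}")
```

Because the grid contains the least-squares coefficient, the best grid reward can never be lower than the baseline. Whether it is strictly higher, and by how much, depends on the assumed noise levels.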

Sequential Outlier Detection based on Incremental Decision Trees

We introduce an online outlier detection algorithm to detect outliers in a sequentially observed data stream. For this purpose, we use a two-stage filtering and hedging approach. In the first stage, we construct a multi-modal probability density function to model the normal samples. In the second stage, given a new observation, we label it as an anomaly if the value of the aforementioned density function is below a specified threshold at the newly observed point. In order to construct our multi-modal density function, we use an incremental decision tree to construct a set of subspaces of the observation space. We train a single-component density function of the exponential family using the observations that fall inside each subspace represented on the tree. These single-component density functions are then adaptively combined to produce our multi-modal density function, which is shown to achieve the performance of the best convex combination of the density functions defined on the subspaces. As we observe more samples, our tree grows and produces more subspaces. As a result, our modeling power increases in time, while mitigating overfitting issues. In order to choose our threshold level to label the observations, we use an adaptive thresholding scheme. We show that our adaptive threshold level achieves the performance of the optimal pre-fixed threshold level, which knows the observation labels in hindsight. Our algorithm provides significant performance improvements over the state of the art in our wide set of experiments involving both synthetic and real data.

TRAJEDI: Trajectory Dissimilarity

The vast increase in our ability to obtain and store trajectory data necessitates trajectory analytics techniques to extract useful information from this data. Pair-wise distance functions are a foundational building block for common operations on trajectory datasets including constrained SELECT queries, k-nearest neighbors, and similarity and diversity algorithms. The accuracy and performance of these operations depend heavily on the speed and accuracy of the underlying trajectory distance function, which is in turn affected by trajectory calibration. Current methods either require calibrated data, or perform calibration of the entire relevant dataset first, which is expensive and time-consuming for large datasets. We present TRAJEDI, a calibration-aware pair-wise distance calculation scheme that outperforms naive approaches while preserving accuracy. We also provide analyses of parameter tuning to trade off between speed and accuracy. Our scheme is usable with any diversity, similarity or k-nearest neighbor algorithm.

Attention-based Graph Neural Network for Semi-supervised Learning

Recently popularized graph neural networks achieve state-of-the-art accuracy on a number of standard benchmark datasets for graph-based semi-supervised learning, improving significantly over existing approaches. These architectures alternate between a propagation layer that aggregates the hidden states of the local neighborhood and a fully-connected layer. Perhaps surprisingly, we show that a linear model that removes all the intermediate fully-connected layers is still able to achieve performance comparable to the state-of-the-art models. This significantly reduces the number of parameters, which is critical for semi-supervised learning where the number of labeled examples is small. This in turn leaves room for designing more innovative propagation layers. Based on this insight, we propose a novel graph neural network that removes all the intermediate fully-connected layers, and replaces the propagation layers with attention mechanisms that respect the structure of the graph. The attention mechanism allows us to learn a dynamic and adaptive local summary of the neighborhood to achieve more accurate predictions. In a number of experiments on benchmark citation network datasets, we demonstrate that our approach outperforms competing methods. By examining the attention weights among neighbors, we show that our model provides some interesting insights on how neighbors influence each other.
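
One way to picture a propagation layer whose attention respects the graph structure is a softmax restricted to each node's neighborhood. The sketch below is an illustration under stated assumptions rather than the architecture from the abstract: the cosine-similarity scoring, the scale `beta`, the added self-loops, and the function name `attention_propagate` are all hypothetical choices.

```python
import numpy as np

def attention_propagate(H, adj, beta=1.0):
    """One propagation step: each node takes a softmax-weighted
    average of the hidden states of its neighbors (and itself)."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    unit = H / np.maximum(norms, 1e-12)
    scores = beta * (unit @ unit.T)               # assumed scoring: scaled cosine similarity
    mask = (adj + np.eye(len(adj))) > 0           # neighbors plus an assumed self-loop
    scores = np.where(mask, scores, -np.inf)      # non-edges receive zero attention
    scores -= scores.max(axis=1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ H, weights

# Path graph 0 - 1 - 2 with random 4-dimensional hidden states.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(3, 4))
H_next, weights = attention_propagate(H, adj)
print(weights.round(3))
```

Each row of `weights` sums to one, and the entries for nodes 0 and 2, which share no edge, are exactly zero, so the attention never mixes information across non-adjacent nodes in a single step.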

Evolutionary Architecture Search For Deep Multitask Networks

Multitask learning, i.e. learning several tasks at once with the same neural network, can improve performance in each of the tasks. Designing deep neural network architectures for multitask learning is a challenge: there are many ways to tie the tasks together, and the design choices matter. The size and complexity of this problem exceed human design ability, making it a compelling domain for evolutionary optimization. Using the existing state-of-the-art soft ordering architecture as the starting point, methods for evolving the modules of this architecture and for evolving the overall topology or routing between modules are evaluated in this paper. A synergetic approach of evolving custom routings with evolved, shared modules for each task is found to be very powerful, significantly improving the state of the art in the Omniglot multitask, multialphabet character recognition domain. This result demonstrates how evolution can be instrumental in advancing deep neural network and complex system design in general.

Variance Networks: When Expectation Does Not Meet Your Expectations

In this paper, we propose variance networks, a new model that stores the learned information in the variances of the network weights. Surprisingly, no information is stored in the expectations of the weights; if we replaced the weights with their expectations, we would obtain predictions of random-guess quality. We provide a numerical criterion that uses the loss curvature to determine which random variables can be replaced with their expected values, and find that only a small fraction of weights is needed for ensembling. Variance networks represent a diverse ensemble that is more robust to adversarial attacks than conventional low-variance ensembles. The success of this model raises several counter-intuitive implications for the training and application of Deep Learning models.

ARMDN: Associative and Recurrent Mixture Density Networks for eRetail Demand Forecasting

Accurate demand forecasts can help on-line retail organizations better plan their supply-chain processes. The challenge, however, is the large number of associative factors that result in large, non-stationary shifts in demand, which traditional time series and regression approaches fail to model. In this paper, we propose a Neural Network architecture called AR-MDN that simultaneously models associative factors, time-series trends and the variance in the demand. We first identify several causal features and use a combination of feature embeddings, MLP and LSTM to represent them. We then model the output density as a learned mixture of Gaussian distributions. The AR-MDN can be trained end-to-end without the need for additional supervision. We experiment on a dataset of a year’s worth of data over tens of thousands of products from Flipkart. The proposed architecture yields a significant improvement in forecasting accuracy when compared with existing alternatives.
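
Modeling the output density as a mixture of Gaussians usually means training against the mixture's negative log-likelihood. The sketch below shows that loss for a single scalar observation. It is a generic mixture-density formulation, not the AR-MDN implementation: the parameterization by `logits`, `means` and `log_stds` and the function name `mixture_nll` are assumptions.

```python
import numpy as np

def mixture_nll(y, logits, means, log_stds):
    """Negative log-likelihood of scalar y under a K-component Gaussian mixture.

    logits, means and log_stds are length-K arrays; in a mixture density
    network they would be produced by the final layer."""
    log_pi = logits - np.logaddexp.reduce(logits)          # log of softmax weights
    stds = np.exp(log_stds)
    log_comp = (-0.5 * ((y - means) / stds) ** 2
                - log_stds - 0.5 * np.log(2.0 * np.pi))    # per-component log-density
    return -np.logaddexp.reduce(log_pi + log_comp)         # log-sum-exp over components

# Two identical standard-normal components collapse to one Gaussian.
nll = mixture_nll(0.0, np.zeros(2), np.zeros(2), np.zeros(2))
print(nll)
```

Working in log space with a log-sum-exp reduction avoids underflow when an observation lies far from every component, which matters for demand series with large shifts.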

CIoTA: Collaborative IoT Anomaly Detection via Blockchain

Due to their rapid growth and deployment, Internet of things (IoT) devices have become a central aspect of our daily lives. However, they tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can help us secure the IoT devices. However, an anomaly detection model must be trained for a long time in order to capture all benign behaviors. This approach is vulnerable to adversarial attacks since all observations are assumed to be benign while training the anomaly detection model. In this paper, we propose CIoTA, a lightweight framework that utilizes the blockchain concept to perform distributed and collaborative anomaly detection for devices with limited resources. CIoTA uses blockchain to incrementally update a trusted anomaly detection model via self-attestation and consensus among IoT devices. We evaluate CIoTA on our own distributed IoT simulation platform, which consists of 48 Raspberry Pis, to demonstrate CIoTA’s ability to enhance the security of each device and the security of the network as a whole.

Kickstarting Deep Reinforcement Learning

We present a method for using previously-trained ‘teacher’ agents to kickstart the training of a new ‘student’ agent. To this end, we leverage ideas from policy distillation and population based training. Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance. We show that, on a challenging and computationally-intensive multi-task benchmark (DMLab-30), kickstarted training improves the data efficiency of new agents, making it significantly easier to iterate on their design. We also show that the same kickstarting pipeline can allow a single student agent to leverage multiple ‘expert’ teachers which specialize on individual tasks. In this setting kickstarting yields surprisingly large gains, with the kickstarted agent matching the performance of an agent trained from scratch in almost 10x fewer steps, and surpassing its final performance by 42 percent. Kickstarting is conceptually simple and can easily be incorporated into reinforcement learning experiments.

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models

We propose a nonparametric method for detecting nonlinear causal relationships within a set of multidimensional discrete time series, by using sparse additive models (SpAMs). We show that, when the input to the SpAM is a \beta-mixing time series, the model can be fitted by first approximating each unknown function with a linear combination of a set of B-spline bases, and then solving a group-lasso-type optimization problem with nonconvex regularization. Theoretically, we characterize the oracle statistical properties of the proposed sparse estimator in function estimation and model selection. Numerically, we propose an efficient pathwise iterative shrinkage thresholding algorithm (PISTA), which tames the nonconvexity and guarantees linear convergence towards the desired sparse estimator with high probability.

NVIDIA Tensor Core Programmability, Performance & Precision

The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called a ‘Tensor Core’, that performs one matrix-multiply-and-accumulate on 4×4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to programming NVIDIA Tensor Cores, their performance, and the precision loss due to computation in mixed precision. Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API; CUTLASS, a templated library based on WMMA; and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflops/s. While precision loss due to matrix multiplication with half precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using NVIDIA Tensor Cores.
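
The precision trade-off can be emulated on a CPU. The sketch below is a NumPy emulation, not Tensor Core code: it rounds the inputs to half precision, multiplies with single-precision accumulation, and then applies one possible refinement that spends two extra multiplications on the half-precision rounding residuals. The refinement scheme, the matrix size, and the helper names are assumptions for illustration. The last lines check the quoted peak figure under an assumed clock of 1.53 GHz, which is not stated in the abstract.

```python
import numpy as np

def to_half(x):
    """Round to float16, then widen to float32 for the multiplication."""
    return x.astype(np.float16).astype(np.float32)

def mixed_matmul(a, b):
    """Half-precision inputs, single-precision accumulation."""
    return to_half(a) @ to_half(b)

def refined_matmul(a, b):
    """Assumed refinement: also multiply the rounding residuals,
    costing three products instead of one."""
    a_hi, b_hi = to_half(a), to_half(b)
    a_lo, b_lo = to_half(a - a_hi), to_half(b - b_hi)
    return a_hi @ b_hi + a_lo @ b_hi + a_hi @ b_lo

rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, size=(64, 64)).astype(np.float32)
b = rng.uniform(-1.0, 1.0, size=(64, 64)).astype(np.float32)
exact = a.astype(np.float64) @ b.astype(np.float64)   # double-precision reference

err_mixed = float(np.max(np.abs(mixed_matmul(a, b) - exact)))
err_refined = float(np.max(np.abs(refined_matmul(a, b) - exact)))
print(f"max error, half inputs: {err_mixed:.2e}")
print(f"max error, with residual refinement: {err_refined:.2e}")

# Peak estimate: 640 cores, 4*4*4 fused multiply-adds per clock, 2 flops each.
ASSUMED_CLOCK_HZ = 1.53e9   # assumption, not taken from the abstract
peak_tflops = 640 * (4 * 4 * 4) * 2 * ASSUMED_CLOCK_HZ / 1e12
print(f"estimated peak: {peak_tflops:.1f} Tflops/s")
```

The refinement triples the multiplication work, which is one concrete reading of reducing precision loss at the cost of increased computation.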

Deep Dictionary Learning: A PARametric NETwork Approach

Deep dictionary learning seeks multiple dictionaries at different image scales to capture complementary coherent characteristics. We propose a method for learning a hierarchy of synthesis dictionaries with an image classification goal. The dictionaries and classification parameters are trained by a classification objective, and the sparse features are extracted by reducing a reconstruction loss in each layer. The reconstruction objectives in some sense regularize the classification problem and inject source signal information into the extracted features. The performance of the proposed hierarchical method increases as more layers are added, which consequently makes this model easier to tune and adapt. Furthermore, the proposed algorithm shows a remarkably lower fooling rate in the presence of adversarial perturbations. The validation of the proposed approach is based on its classification performance on four benchmark datasets and is compared to a CNN of similar size.

Partially Linear Spatial Probit Models

A partially linear probit model for spatially dependent data is considered. A triangular array setting is used to cover various patterns of spatial data. Conditional spatial heteroscedasticity and non-identically distributed observations and a linear process for disturbances are assumed, allowing various spatial dependencies. The estimation procedure is a combination of a weighted likelihood and a generalized method of moments. The procedure first fixes the parametric components of the model and then estimates the non-parametric part using weighted likelihood; the obtained estimate is then used to construct a GMM parametric component estimate. The consistency and asymptotic distribution of the estimators are established under sufficient conditions. Some simulation experiments are provided to investigate the finite sample performance of the estimators.

Directing Chemotaxis-Based Spatial Self-Organization via Biased, Random Initial Conditions
Disconnected Cuts in Claw-free Graphs
Standing Wave Decomposition Gaussian Process
Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping
Nonparametric Risk Assessment and Density Estimation for Persistence Landscapes
Network Traffic Driven Storage Repair
Scoring Formulation for Multi-Condition Joint PLDA
Data Driven Stability Analysis of Black-box Switched Linear Systems
Bit-Tactical: Exploiting Ineffectual Computations in Convolutional Neural Networks: Which, Why, and How
Three colour bipartite Ramsey number of cycles and paths
On the information in spike timing: neural codes derived from polychronous groups
Markov chains under nonlinear expectation
Minimum $T$-Joins and Signed-Circuit Covering
Community Interaction and Conflict on the Web
Semimartingales and Shrinkage of Filtration
Edge-decomposing graphs into coprime forests
Quasi-Equilibrium Problems with Non-self Constraint Map
Computational Complexity of Generalized Push Fight
Local Kernels that Approximate Bayesian Regularization and Proximal Operators
Accelerated Wirtinger Flow for Multiplexed Fourier Ptychographic Microscopy
Random Partitions and Cohen-Lenstra Heuristics
Contour Parametrization via Anisotropic Mean Curvature Flows
Geodesic nets with three boundary vertices
A Large-Scale Multi-Institutional Evaluation of Advanced Discrimination Algorithms for Buried Threat Detection in Ground Penetrating Radar
Mobile Edge Computing for Cellular-Connected UAV: Computation Offloading and Trajectory Optimization
Enhancing Evolutionary Optimization in Uncertain Environments by Allocating Evaluations via Multi-armed Bandit Algorithms
Cluster Size Optimization in Cooperative Spectrum Sensing
Tokunaga self-similarity arises naturally from time invariance
Enhanced Optimization with Composite Objectives and Novelty Selection
Counting trees in a graph
Optimum Linear Codes with Support Constraints over Small Fields
Influence of the Event Rate on Discrimination Abilities of Bankruptcy Prediction Models
Speech Recognition: Keyword Spotting Through Image Recognition
How to solve the stochastic partial differential equation that gives a Matérn random field using the finite element method
Multi-Agent Submodular Optimization
A Minimax Surrogate Loss Approach to Conditional Difference Estimation
Generalization and Expressivity for Deep Nets
Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation
We Built a Fake News & Click-bait Filter: What Happened Next Will Blow Your Mind!
The Maker-Breaker Rado game on a random set of integers
Approximation schemes for mixed optimal stopping and control problems with nonlinear expectations and jumps
Efficient FPGA Implementation of Conjugate Gradient Methods for Laplacian System using HLS
Estimating fast mean-reverting jumps in electricity market models
Analog of Anderson theorem for the polar phase of liquid 3He in nematic aerogel
Two comments on balls in vertex transitive graphs
Distinct collective states due to the trade-off between attractive and repulsive couplings
ShuffleSeg: Real-time Semantic Segmentation Network
Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Fire detection in a still image using colour information
Graph-based Clustering under Differential Privacy
Viscosity Solution for Optimal Stopping Problems of Feller Processes
Submodular Hypergraphs: p-Laplacians, Cheeger Inequalities and Spectral Clustering
Learning and analyzing vector encoding of symbolic representations
Sample-Relaxed Two-Dimensional Color Principal Component Analysis for Face Recognition and Image Reconstruction
Efficient Enumeration of Bipartite Subgraphs in Graphs
Gradient estimates for SDEs without monotonicity type conditions
Learning to Localize Sound Source in Visual Scenes
Revisiting Decomposable Submodular Function Minimization with Incidence Relations
A Deep Learning Approach for Pose Estimation from Volumetric OCT Data
Determination of the 4-genus of a complete graph (with an appendix)
Webly Supervised Learning with Category-level Semantic Information
Testing One Hypothesis Multiple Times: The Multidimensional Case
Language Identification of Bengali-English Code-Mixed data using Character & Phonetic based LSTM Models
A tight $\sin Θ$ theorem for empirical covariance operators
Jamming in Perspective
Detecting Adversarial Examples via Neural Fingerprinting
Empirical Likelihood Based Summary ROC Curve for Meta-Analysis of Diagnostic Studies
On dynamic ensemble selection and data preprocessing for multi-class imbalance learning
Knowledge Aided Consistency for Weakly Supervised Phrase Grounding
Combating Adversarial Attacks Using Sparse Representations
Parallel FPGA Router using Sub-Gradient method and Steiner tree
Path of Vowel Raising in Chengdu Dialect of Mandarin
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction
Reduction of Restricted Maximum Likelihood for Random Coefficient Models
Improved Asymptotics for Zeros of Kernel Estimates via a Reformulation of the Leadbetter-Cryer Integral
Optimal Data-based Kernel Estimation of Evolutionary Spectra
Posterior Contraction and Credible Sets for Filaments of Regression Functions
Piecewise Convex Function Estimation: Pilot Estimators
Piecewise Convex Function Estimation: Representations, Duality and Model Selection
Piecewise Convex Function Estimation and Model Selection
Banded Matrix Fraction Representation of Triangular Input Normal Pairs
Adaptive Kernel Estimation of the Spectral Density with Boundary Kernel Analysis
Pickup and Delivery Problem with Transfers
Fast Adaptive Identification of Stable Innovation Filters
A pathway-based kernel boosting method for sample classification using genomic data
Optimal Estimation of Dynamically Evolving Diffusivities
$k$-Error linear complexity for multidimensional arrays
Forbidden subgraphs for constant domination number
Deep reinforcement learning for time series: playing idealized trading games
Generating Bilingual Pragmatic Color References
Scalable Breadth-First Search on a GPU Cluster
Translation Deformations of Matroid Representation
Cubic Range Error Model for Stereo Vision with Illuminators
Empirical bounds for functions with weak interactions
Nonconvex weak sharp minima on Riemannian manifolds
Exact uniform sampling over catalan structures
Paths between colourings of sparse graphs
The Secure Machine: Efficient Secure Execution On Untrusted Platforms
Fractional L-intersecting families
Deeply supervised neural network with short connections for retinal vessel segmentation
BEBP: An Poisoning Method Against Machine Learning Based IDSs
Hybrid Beamforming for 5G Millimeter-Wave Multi-Cell Networks
Contextualizing selection bias in Mendelian randomization: how bad is it likely to be?
On Trade in Bilateral Oligopolies with Altruistic and Spiteful Agents
Adaptive Smoothing of the Log-Spectrum with Multiple Tapering
Kernels by rainbow paths in arc-colored tournaments
Function Estimation Using Data Adaptive Kernel Estimation – How Much Smoothing?
Preparing Bengali-English Code-Mixed Corpus for Sentiment Analysis of Indian Languages
Incentives in the Dark: Multi-armed Bandits for Evolving Users with Unknown Type
Multi-objective Contextual Bandit Problem with Similarity Information
Data-Augmented Contact Model for Rigid Body Simulation
Maximum Weight Spectrum Codes
The shortness of human life constitutes its limit
Reproducibility and Pseudo-Determinism in Log-Space
A Linear Algebraic Approach to Subfield Subcodes of GRS Codes
Upper bounds for domination numbers of graphs using Turán’s Theorem and Lovász local lemma
Cascade context encoder for improved inpainting
Entity Resolution and Federated Learning get a Federated Resolution
Sales forecasting using WaveNet within the framework of the Kaggle competition
Updating Beamformers to Respond to Changes in Users
Combinatorial Multi-Objective Multi-Armed Bandit Problem
Hard-core configurations on a triangular lattice and Eisenstein primes
Interpreting Deep Classifier by Visual Distillation of Dark Knowledge
Coxeter groups and quiver representations
Some adaptive analog of Yu. E. Nesterov’s method for variational inequalities with a strongly monotone field
Exponential Condition Number of Solutions of the Discrete Lyapunov Equation
Multiple Instance Choquet Integral Classifier Fusion and Regression for Remote Sensing Applications
PCA by Determinant Optimization has no Spurious Local Optima
Representation Learning over Dynamic Graphs
On an alternative sequence comparison statistic of Steele
Learning Local Distortion Visibility From Image Quality Data-sets
Two-Stage Convolutional Neural Network for Breast Cancer Histology Image Classification
Cache-Assisted Broadcast-Relay Wireless Networks: A Delivery-Time Cache-Memory Tradeoff
Delivery Time Minimization in Cache-Assisted Broadcast-Relay Wireless Networks with Imperfect CSI
Pseudo-task Augmentation: From Deep Multitask Learning to Intratask Sharing—and Back
Blind Identification of Invertible Graph Filters with Multiple Sparse Inputs
Kernel estimation of the instantaneous frequency
Statistical tests for evaluating an earthquake prediction method
Minimum bias multiple taper spectral estimation
Link prediction for egocentrically sampled networks
Learning Binary Bayesian Networks in Polynomial Time and Sample Complexity
Algorithmic Trading with Partial Information: A Mean Field Game Approach
Structure-preserving $H^2$ optimal model reduction based on Riemannian trust-region method
Full Reference Objective Quality Assessment for Reconstructed Background Images
Dedekind Zeta Zeroes and Faster Complex Dimension Computation
Style Aggregated Network for Facial Landmark Detection
A Deep Learning Based Behavioral Approach to Indoor Autonomous Navigation
GPU Accelerated Self-join for the Distance Similarity Metric
Innovative Texture Database Collecting Approach and Feature Extraction Method based on Combination of Gray Tone Difference Matrixes, Local Binary Patterns, and K-means Clustering
Malliavin Calculus for Non-colliding Particle Systems
Deep Class-Wise Hashing: Semantics-Preserving Hashing via Class-wise Loss
Influence in systems with convex decisions
A Modular Design for Geo-Distributed Querying
Solving Markov decision processes for network-level post-hazard recovery via simulation optimization and rollout
Poisson Local Limit Theorems for Poisson’s Binomial in the Case of Infinite Expectation
A Twisted Burnside Lemma and size-independent statistics on finite linear groups
Scaled penalization of Brownian motion with drift and the Brownian ascent
On Overcoming the Impact of Doppler Spectrum in Millimeter-Wave V2I Communications
Confined spatial networks with wireless applications
Entanglement-assisted quantum MDS codes from constacyclic codes with large minimum distance
On the first-passage area of a Lévy process
R3Net: Random Weights, Rectifier Linear Units and Robustness for Artificial Neural Network
Noise2Noise: Learning Image Restoration without Clean Data
Extreme Learning Machine for Graph Signal Processing
Wireless Energy Transfer to a Pair of Energy Receivers using Signal Strength Feedback
Multi-kernel Regression For Graph Signal Processing
Automated non-mass enhancing lesion detection and segmentation in breast DCE-MRI
Semiparametric Contextual Bandits
A.s. convergence for infinite colour Pólya urns associated with random walks
High Throughput Synchronous Distributed Stochastic Gradient Descent
Increasing the Degree of Parallelism Using Speculative Execution in Task-based Runtime Systems
Hybrid interconnection of iterative bidding and power network dynamics for frequency regulation and optimal dispatch
Extremal dependence of random scale constructions
Leveraging Crowdsourcing Data For Deep Active Learning – An Application: Learning Intents in Alexa
Omnidirectional CNN for Visual Place Recognition and Navigation
Variational Inference for Gaussian Process with Panel Count Data
Approximate Bayesian Computation in controlled branching processes: the role of summary statistics
Causal Consistency and Latency Optimality: Friend or Foe?
FeTa: A DCA Pruning Algorithm with Generalization Error Guarantees
Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation
Bayesian inference for a partially observed birth-death process using data on proportions
SO-Net: Self-Organizing Network for Point Cloud Analysis
Efficient construction of Bayes optimal designs for stochastic process models
FDRC: Flow-Driven Rule Caching Optimization in Software Defined Networking
Super-Resolution of Sentinel-2 Images: Learning a Globally Applicable Deep Neural Network
Size of a minimal cutset in supercritical first passage percolation
Quadratic and symmetric bilinear forms over finite fields and their association schemes
SDPMN: Privacy Preserving MapReduce Network Using SDN
In-depth Assessment of an Interactive Graph-based Approach for the Segmentation for Pancreatic Metastasis in Ultrasound Acquisitions of the Liver with two Specialists in Internal Medicine
Linear-Time In-Place DFS and BFS in the Restore Model
Many-body localization transition with power-law interactions: Statistics of eigenstates
Entity-Aware Language Model as an Unsupervised Reranker
Geodabs: Trajectory Indexing Meets Fingerprinting at Scale
Error estimates for the approximation of a discrete-valued optimal control problem
Neural Conditional Gradients
Learning unknown ODE models with Gaussian processes
Representation Learning and Recovery in the ReLU Model
The Everlasting Database: Statistical Validity at a Fair Price
Deep Learning in Mobile and Wireless Networking: A Survey
Theoretical Bounds and Constructions of Codes in the Generalized Cayley Metric
Power-Efficient Deployment of UAVs as Relays
New Algorithms for Weighted $k$-Domination and Total $k$-Domination Problems in Proper Interval Graphs
Toric varieties associated to root systems
Semantic Parsing Natural Language into SPARQL: Improving Target Language Representation with Neural Attention
Replication study: Development and validation of deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
A Hybrid Quantum-Classical Paradigm to Mitigate Embedding Costs in Quantum Annealing
On the chromatic number of a subgraph of the Kneser graph
Classifying Online Dating Profiles on Tinder using FaceNet Facial Embeddings
Tracking all individuals in large collectives of unmarked animals
Partial Identifiability of Restricted Latent Class Models
Topical Community Detection in Event-based Social Network
Learning the Base Distribution in Implicit Generative Models
Beyond Gröbner Bases: Basis Selection for Minimal Solvers
M-estimation in high-dimensional linear model
The stochastic Cauchy problem driven by a cylindrical Levy process
Differential Equations Driven by Variable Order Hölder Noise, and the Regularizing Effect of Delay
Optimal Rates of Sketched-regularized Algorithms for Least-Squares Regression over Hilbert Spaces
A Feature-Rich Vietnamese Named-Entity Recognition Model
Discriminability objective for training descriptive captions
Effective Implementation of GPU-based Revised Simplex algorithm applying new memory management and cycle avoidance strategies
Synchronization of stochastic mean field networks of Hodgkin-Huxley neurons with noisy channels
Delayed Impact of Fair Machine Learning
Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches
Partitioning a graph into degenerate subgraphs
An information-theoretic Phase I/II design for molecularly targeted agents that does not require an assumption of monotonicity
Quantum Supremacy and the Complexity of Random Circuit Sampling