Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution

Current machine learning systems operate, almost exclusively, in a statistical, or model-free mode, which entails severe theoretical limits on their power and performance. Such systems cannot reason about interventions and retrospection and, therefore, cannot serve as the basis for strong AI. To achieve human level intelligence, learning machines need the guidance of a model of reality, similar to the ones used in causal inference tasks. To demonstrate the essential role of such models, I will present a summary of seven tasks which are beyond reach of current machine learning systems and which have been accomplished using the tools of causal modeling.

Cost-Sensitive Convolution based Neural Networks for Imbalanced Time-Series Classification

Some deep convolutional neural networks have been proposed for time-series classification and class-imbalanced problems. However, those models performed poorly and even failed to recognize the minority class of an imbalanced temporal sequence dataset. Minority samples cause trouble for temporal deep learning classifiers because the majority and minority classes are treated equally. Until recently, few works applied deep learning to imbalanced time-series classification (ITSC) tasks. This paper aims to tackle ITSC problems with deep learning. An adaptive cost-sensitive learning strategy was proposed to modify temporal deep learning models. Through the proposed strategy, classifiers could automatically assign misclassification penalties to each class. In the experimental section, the proposed method was used to modify five neural networks. They were evaluated on a large-volume, real-life, imbalanced time-series dataset with six metrics. Each network was also tested alone and combined with several mainstream data samplers. Experimental results showed that the proposed cost-sensitive modified networks worked well on ITSC tasks. Compared to other methods, the cost-sensitive convolutional neural network and residual network performed best on all metrics. Consequently, the proposed cost-sensitive learning strategy can be used to modify deep learning classifiers from cost-insensitive to cost-sensitive. Those cost-sensitive convolutional networks can be effectively applied to address ITSC issues.
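
As a rough illustration of what a cost-sensitive loss looks like, the sketch below weights each class's cross-entropy term by a fixed inverse-frequency weight. This is an assumption chosen for illustration, not the adaptive strategy of the paper, whose details the abstract does not give. The function names are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted negative log-likelihood, normalized by the total weight.

    probs[i][c] is the predicted probability of class c for sample i.
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Three majority samples (class 0) and one minority sample (class 1).
labels = [0, 0, 0, 1]
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
weights = inverse_frequency_weights(labels)
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights, a mistake on the single minority sample costs three times as much as a mistake on a majority sample, which is the basic effect a cost-sensitive classifier relies on.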

Multivariate LSTM-FCNs for Time Series Classification

Over the past decade, multivariate time series classification has been receiving a lot of attention. We propose augmenting the existing univariate time series classification models, LSTM-FCN and ALSTM-FCN, with a squeeze and excitation block to further improve performance. Our proposed models outperform most of the state-of-the-art models while requiring minimal preprocessing. The proposed models work efficiently on various complex multivariate time series classification tasks such as activity recognition or action recognition. Furthermore, the proposed models are highly efficient at test time and small enough to deploy on memory-constrained systems.

Brain EEG Time Series Selection: A Novel Graph-Based Approach for Classification

Brain Electroencephalography (EEG) classification has been widely applied to analyze cerebral diseases in recent years. Unfortunately, invalid/noisy EEGs degrade the diagnosis performance, and most previously developed methods ignore the necessity of EEG selection for classification. To this end, this paper proposes a novel maximum weight clique-based EEG selection approach, named mwcEEGs, to map EEG selection to searching maximum similarity-weighted cliques from an improved Fréchet distance-weighted undirected EEG graph, simultaneously considering edge weights and vertex weights. Our mwcEEGs improves the classification performance by selecting intra-clique pairwise similar and inter-clique discriminative EEGs with similarity threshold δ. Experimental results demonstrate the effectiveness of the algorithm compared with state-of-the-art time series selection algorithms on real-world EEG datasets.

A Semi-Parametric Binning Approach to Quickest Change Detection

The problem of quickest detection of a change in distribution is considered under the assumption that the pre-change distribution is known, and the post-change distribution is only known to belong to a family of distributions distinguishable from a discretized version of the pre-change distribution. A sequential change detection procedure is proposed that partitions the sample space into a finite number of bins, and monitors the number of samples falling into each of these bins to detect the change. A test statistic that approximates the generalized likelihood ratio test is developed. It is shown that the proposed test statistic can be efficiently computed using a recursive update scheme, and a procedure for choosing the number of bins in the scheme is provided. Various asymptotic properties of the test statistic are derived to offer insights into its performance trade-off between average detection delay and average run length to a false alarm. Testing on synthetic and real data demonstrates that our approach is comparable to or better than existing non-parametric change detection methods in performance.

Evaluation of Machine Learning Frameworks on Finis Terrae II

Machine Learning (ML) and Deep Learning (DL) are two technologies used to extract representations of the data for a specific purpose. ML algorithms take a set of data as input to generate one or several predictions. To define the final version of a model, there is usually an initial step devoted to training the algorithm (obtaining the right final values of the model's parameters). There are several techniques, from supervised learning to reinforcement learning, which have different requirements. On the market, there are some frameworks or APIs that reduce the effort of designing a new ML model. In this report, using the benchmark DLBENCH, we will analyse the performance and the execution modes of some well-known ML frameworks on the Finis Terrae II supercomputer when supervised learning is used. The report will show that placement of data and allocated hardware can have a large influence on the final time-to-solution.

Some techniques in density estimation

Density estimation is an interdisciplinary topic at the intersection of statistics, theoretical computer science and machine learning. We review some old and new techniques for bounding sample complexity of estimating densities of continuous distributions, focusing on the class of mixtures of Gaussians and its subclasses.

Comparative Study on Generative Adversarial Networks

In recent years, there have been tremendous advancements in the field of machine learning. These advancements have been made through both academic as well as industrial research. Lately, a fair amount of research has been dedicated to the usage of generative models in the field of computer vision and image classification. These generative models have been popularized through a new framework called Generative Adversarial Networks. Moreover, many modified versions of this framework have been proposed in the last two years. We study the original model proposed by Goodfellow et al. as well as modifications over the original model and provide a comparative analysis of these models.

Noisy Expectation-Maximization: Applications and Generalizations

We present a noise-injected version of the Expectation-Maximization (EM) algorithm: the Noisy Expectation-Maximization (NEM) algorithm. The NEM algorithm uses noise to speed up the convergence of the EM algorithm. The NEM theorem shows that injected noise speeds up the average convergence of the EM algorithm to a local maximum of the likelihood surface if a positivity condition holds. The generalized form of the NEM algorithm allows for arbitrary modes of noise injection, including adding and multiplying noise to the data. We demonstrate these noise benefits on EM algorithms for the Gaussian mixture model (GMM) with both additive and multiplicative NEM noise injection. A separate theorem (not presented here) shows that the noise benefit for independent identically distributed additive noise decreases with sample size in mixture models. This theorem implies that the noise benefit is most pronounced if the data is sparse. Injecting blind noise only slowed convergence.

Multiple Imputation: A Review of Practical and Theoretical Findings

Multiple imputation is a straightforward method for handling missing data in a principled fashion. This paper presents an overview of multiple imputation, including important theoretical results and their practical implications for generating and using multiple imputations. A review of strategies for generating imputations follows, including recent developments in flexible joint modeling and sequential regression/chained equations/fully conditional specification approaches. Finally, we compare and contrast different methods for generating imputations on a range of criteria before identifying promising avenues for future research.

MINE: Mutual Information Neural Estimation

We argue that the estimation of the mutual information between high dimensional continuous random variables is achievable by gradient descent over neural networks. This paper presents a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size. MINE is back-propable and we prove that it is strongly consistent. We illustrate a handful of applications in which MINE is successfully applied to enhance the properties of generative models in both unsupervised and supervised settings. We apply our framework to estimate the information bottleneck, and apply it in tasks related to supervised classification problems. Our results demonstrate substantial added flexibility and improvement in these settings.
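
The abstract does not state the objective being optimized. A standard way to turn mutual information estimation into a gradient-descent problem is the Donsker-Varadhan lower bound on the KL divergence, sketched below. The symbols $T_\theta$ (a neural network with parameters $\theta$), $X$ and $Z$ are notation assumed here for illustration.

```latex
\begin{align*}
I(X;Z) &= D_{\mathrm{KL}}\!\left(P_{XZ} \,\|\, P_X \otimes P_Z\right) \\
       &\ge \sup_{\theta}\; \mathbb{E}_{P_{XZ}}\!\left[T_\theta(x,z)\right]
          - \log \mathbb{E}_{P_X \otimes P_Z}\!\left[e^{T_\theta(x,z)}\right]
\end{align*}
```

Both expectations can be approximated with minibatch averages, using paired samples for the joint distribution and shuffled pairs for the product of marginals, so the right-hand side can be maximized by backpropagation.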

How Many Samples Required in Big Data Collection: A Differential Message Importance Measure

Information collection is a fundamental problem in big data, where the size of sampling sets plays a very important role. This work considers the information collection process by taking message importance into account. Similar to differential entropy, we define the differential message importance measure (DMIM) as a measure of message importance for a continuous random variable. It is proved that the change in DMIM can describe the gap between the distribution of a set of sample values and a theoretical distribution. In fact, the deviation of DMIM is equivalent to the Kolmogorov-Smirnov statistic, but it offers a new way to characterize the distribution goodness-of-fit. Numerical results show some basic properties of DMIM and the accuracy of the proposed approximate values. Furthermore, it is shown that the empirical distribution approaches the real distribution as the DMIM deviation decreases, which contributes to the selection of suitable sampling points in actual systems.

State Variation Mining: On Information Divergence with Message Importance in Big Data

Information transfer, which reveals the state variation of variables, can play a vital role in big data analytics and processing. In fact, the measure for information transfer can reflect the system change from the statistics by using the variable distributions, similar to KL divergence and Renyi divergence. Furthermore, in terms of the information transfer in big data, small probability events dominate the importance of the total message to some degree. Therefore, it is significant to design an information transfer measure based on the message importance which emphasizes the small probability events. In this paper, we propose the message importance divergence (MID) and investigate its characteristics and applications in three aspects. First, the message importance transfer capacity based on MID is presented to offer an upper bound for the information transfer with disturbance. Then, we utilize the MID to guide the queue length selection, which is the fundamental problem considered to have higher social or academic value in the caching operation of mobile edge computing. Finally, we extend the MID to the continuous case and discuss its robustness by using it to measure information distance.

MSDNN: Multi-Scale Deep Neural Network for Salient Object Detection

Salient object detection is a fundamental problem and has received a great deal of attention in computer vision. Recently, deep learning models have become a powerful tool for image feature extraction. In this paper, we propose a multi-scale deep neural network (MSDNN) for salient object detection. The proposed model first extracts global high-level features and context information over the whole source image with a recurrent convolutional neural network (RCNN). Then several stacked deconvolutional layers are adopted to get the multi-scale feature representation and obtain a series of saliency maps. Finally, we investigate a fusion convolution module (FCM) to build a final pixel-level saliency map. The proposed model is extensively evaluated on four salient object detection benchmark datasets. Results show that our deep model significantly outperforms 12 other state-of-the-art approaches.

Deep Learning for Sampling from Arbitrary Probability Distributions

This paper proposes a fully connected neural network model to map samples from a uniform distribution to samples of any explicitly known probability density function. During the training, the Jensen-Shannon divergence between the distribution of the model’s output and the target distribution is minimized. We experimentally demonstrate that our model converges towards the desired state. It provides an alternative to existing sampling methods such as inversion sampling, rejection sampling, Gaussian mixture models and Markov-Chain-Monte-Carlo. Our model has high sampling efficiency and is easily applied to any probability distribution, without the need of further analytical or numerical calculations. It can produce correlated samples, such that the output distribution converges faster towards the target than for independent samples. But it is also able to produce independent samples, if single values are fed into the network and the input values are independent as well. We focus on one-dimensional sampling, but additionally illustrate a two-dimensional example with a target distribution of dependent variables.

Fairness in Supervised Learning: An Information Theoretic Approach

Automated decision making systems are increasingly being used in real-world applications. In these systems for the most part, the decision rules are derived by minimizing the training error on the available historical data. Therefore, if there is a bias related to a sensitive attribute such as gender, race, religion, etc. in the data, say, due to cultural/historical discriminatory practices against a certain demographic, the system could continue discrimination in decisions by including the said bias in its decision rule. We present an information theoretic framework for designing fair predictors from data, which aim to prevent discrimination against a specified sensitive attribute in a supervised learning setting. We use equalized odds as the criterion for discrimination, which demands that the prediction should be independent of the protected attribute conditioned on the actual label. To ensure fairness and generalization simultaneously, we compress the data to an auxiliary variable, which is used for the prediction task. This auxiliary variable is chosen such that it is decontaminated from the discriminatory attribute in the sense of equalized odds. The final predictor is obtained by applying a Bayesian decision rule to the auxiliary variable.
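
To make the equalized odds criterion concrete, the sketch below measures how far a binary predictor is from satisfying it by comparing true-positive and false-positive rates across groups of the sensitive attribute. This only checks the criterion; it is not the information-theoretic design procedure of the paper. The function names and the toy data are hypothetical.

```python
def group_rates(y_true, y_pred, group):
    """Return {group: (true-positive rate, false-positive rate)}."""
    counts = {}
    for y, p, g in zip(y_true, y_pred, group):
        c = counts.setdefault(g, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if y == 1:
            c["tp" if p == 1 else "fn"] += 1
        else:
            c["fp" if p == 1 else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        if positives == 0 or negatives == 0:
            # Assumption of this sketch: every group contains both labels.
            raise ValueError(f"group {g!r} lacks positive or negative samples")
        rates[g] = (c["tp"] / positives, c["fp"] / negatives)
    return rates


def equalized_odds_gap(y_true, y_pred, group):
    """Largest across-group difference in TPR or FPR (0 means equalized odds)."""
    rates = group_rates(y_true, y_pred, group)
    tprs = [r[0] for r in rates.values()]
    fprs = [r[1] for r in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = equalized_odds_gap(y_true, y_pred, group)
```

A gap of zero means the prediction is independent of the sensitive attribute once the actual label is fixed, which is exactly the conditional independence that equalized odds demands.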

SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

Going deeper and wider in neural architectures improves accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners need either to change to less desired network architectures or to nontrivially dissect a network across multiple GPUs. These distract DL practitioners from concentrating on their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime to enable network training far beyond the GPU DRAM capacity. SuperNeurons features 3 memory optimizations: Liveness Analysis, Unified Tensor Pool, and Cost-Aware Recomputation. Together they effectively reduce the network-wide peak memory usage down to the maximal memory usage among layers. We also address the performance issues in those memory saving techniques. Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for the training, but also dynamically allocates the memory for convolution workspaces to achieve high performance. Evaluations against Caffe, Torch, MXNet and TensorFlow have demonstrated that SuperNeurons trains networks at least 3.2432 times deeper than current ones with the leading performance. In particular, SuperNeurons can train ResNet2500, which has 10^4 basic network layers, on a 12GB K40c.

Non-Parametric Transformation Networks

ConvNets have been very effective in many applications where it is required to learn invariances to within-class nuisance transformations. However, through their architecture, ConvNets only enforce invariance to translation. In this paper, we introduce a new class of convolutional architectures called Non-Parametric Transformation Networks (NPTNs) which can learn general invariances and symmetries directly from data. NPTNs are a direct and natural generalization of ConvNets and can be optimized directly using gradient descent. They make no assumption regarding the structure of the invariances present in the data and in that respect are very flexible and powerful. We also model ConvNets and NPTNs under a unified framework called Transformation Networks, which establishes the natural connection between the two. We demonstrate the efficacy of NPTNs on natural data such as MNIST and CIFAR-10, where they outperform ConvNet baselines with the same number of parameters. We show they are effective in learning invariances unknown a priori directly from data, from scratch. Finally, we apply NPTNs to Capsule Networks and show that they enable them to perform even better.

DCDistance: A Supervised Text Document Feature extraction based on class labels

Text Mining is a field that aims at extracting information from textual data. One of the challenges of this field of study comes from the pre-processing stage, in which a vector (and structured) representation should be extracted from unstructured data. The common extraction creates large and sparse vectors representing the importance of each term to a document. As such, this usually leads to the curse of dimensionality that plagues most machine learning algorithms. To cope with this issue, in this paper we propose a new supervised feature extraction and reduction algorithm, named DCDistance, that creates features based on the distance between a document and a representative of each class label. As such, the proposed technique can reduce the feature set by more than 99% of the original set. Additionally, this algorithm was also capable of improving the classification accuracy over a set of benchmark datasets when compared to traditional and state-of-the-art feature selection algorithms.

tau-FPL: Tolerance-Constrained Learning in Linear Time

Learning a classifier with control on the false-positive rate plays a critical role in many machine learning applications. Existing approaches either introduce prior knowledge dependent label cost or tune parameters based on traditional classifiers, which lack consistency in methodology because they do not strictly adhere to the false-positive rate constraint. In this paper, we propose a novel scoring-thresholding approach, tau-False Positive Learning (tau-FPL), to address this problem. We show that the scoring problem, which takes the false-positive rate tolerance into account, can be efficiently solved in linear time, and that an out-of-bootstrap thresholding method can transform the learned ranking function into a low false-positive classifier. Both theoretical analysis and experimental results show superior performance of the proposed tau-FPL over existing approaches.

SPIN: A Fast and Scalable Matrix Inversion Method in Apache Spark

The growth of big data in domains such as Earth Sciences, Social Networks, Physical Sciences, etc. has led to an immense need for efficient and scalable linear algebra operations, e.g. matrix inversion. Existing methods for efficient and distributed matrix inversion using big data platforms rely on LU decomposition based block-recursive algorithms. However, these algorithms are complex and require a lot of side calculations, e.g. matrix multiplication, at various levels of recursion. In this paper, we propose a different scheme based on Strassen’s matrix inversion algorithm (mentioned in Strassen’s original paper in 1969), which uses far fewer operations at each level of recursion. We implement the proposed algorithm, and through extensive experimentation, show that it is more efficient than the state-of-the-art methods. Furthermore, we provide a detailed theoretical analysis of the proposed algorithm, and derive theoretical running times which match closely with the empirically observed wall clock running times, thus explaining the U-shaped behaviour w.r.t. block sizes.
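
For orientation, the block-recursive inversion scheme from Strassen's 1969 paper can be sketched on a single machine as below. This is a minimal pure-Python illustration, not the Spark implementation described in the abstract. It assumes every leading block and every Schur complement met during recursion is invertible (true, for example, for symmetric positive definite matrices) and does no pivoting. The helper names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sub(a, b):
    """Element-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def strassen_inverse(a):
    """Invert a square matrix by recursive 2x2 block partitioning."""
    n = len(a)
    if n == 1:
        return [[1.0 / a[0][0]]]
    h = n // 2
    a11 = [row[:h] for row in a[:h]]
    a12 = [row[h:] for row in a[:h]]
    a21 = [row[:h] for row in a[h:]]
    a22 = [row[h:] for row in a[h:]]

    r1 = strassen_inverse(a11)   # I   = A11^-1
    r2 = matmul(a21, r1)         # II  = A21 * I
    r3 = matmul(r1, a12)         # III = I * A12
    r4 = matmul(a21, r3)         # IV  = A21 * III
    r5 = sub(r4, a22)            # V   = IV - A22 (negated Schur complement)
    r6 = strassen_inverse(r5)    # VI  = V^-1
    c12 = matmul(r3, r6)         # C12 = III * VI
    c21 = matmul(r6, r2)         # C21 = VI * II
    r7 = matmul(r3, c21)         # VII = III * C21
    c11 = sub(r1, r7)            # C11 = I - VII
    c22 = [[-x for x in row] for row in r6]  # C22 = -VI

    # Reassemble the four result blocks into one matrix.
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


# A small symmetric positive definite example.
a = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]]
a_inv = strassen_inverse(a)
```

Each level of recursion needs two half-size inversions and six block multiplications, and these multiplications are the operations a distributed implementation would parallelize.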

An Interpretable Reasoning Network for Multi-Relation Question Answering

Multi-relation Question Answering is a challenging task, due to the requirement of elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.

Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions which increase the diversity in our ensemble. These loss functions can be applied either for weight initialization or during training. Together, our contributions leverage large embedding sizes more effectively by significantly reducing correlation of the embedding and consequently increase retrieval accuracy of the embedding. Our method works with any differentiable loss function and does not introduce any additional parameters during test time. We evaluate our metric learning method on image retrieval tasks and show that it improves over state-of-the-art methods on the CUB 200-2011, Cars-196, Stanford Online Products, In-Shop Clothes Retrieval and VehicleID datasets.

Building a Conversational Agent Overnight with Dialogue Self-Play

We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz approach for data collection, M2M achieves greater diversity and coverage of salient dialogue flows while maintaining the naturalness of individual utterances. In the first phase, a simulated user bot and a domain-agnostic system bot converse to exhaustively generate dialogue ‘outlines’, i.e. sequences of template utterances and their semantic parses. In the second phase, crowd workers provide contextual rewrites of the dialogues to make the utterances more natural while preserving their meaning. The entire process can finish within a few hours. We propose a new corpus of 3,000 dialogues spanning 2 domains collected with M2M, and present comparisons with popular dialogue datasets on the quality and diversity of the surface forms and dialogue flows.

Cobra: A Framework for Cost Based Rewriting of Database Applications

Database applications are typically written using a mixture of imperative languages and declarative frameworks for data processing. Application logic gets distributed across the declarative and imperative parts of a program. Often, there is more than one way to implement the same program, whose efficiency may depend on a number of parameters. In this paper, we propose a framework that automatically generates all equivalent alternatives of a given program using a given set of program transformations, and chooses the least cost alternative. We use the concept of program regions as an algebraic abstraction of a program and extend the Volcano/Cascades framework for optimization of algebraic expressions, to optimize programs. We illustrate the use of our framework for optimizing database applications. We show through experimental results, that our framework has wide applicability in real world applications and provides significant performance benefits.

Formal Dependability Modeling and Optimization of Scrubbed-Partitioned TMR for SRAM-based FPGAs
LDPC Codes with Local and Global Decoding
Model-Based Action Exploration
Resolvability on Continuous Alphabets
Interactive Learning of Acyclic Conditional Preference Networks
Extremal $G$-free induced subgraphs of Kneser graphs
Influence of topology in the mobility enhancement of pulse-coupled oscillator synchronization
Timely Status Update in Massive IoT Systems: Decentralized Scheduling for Wireless Uplinks
Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition
A Brain-Inspired Trust Management Model to Assure Security in a Cloud based IoT Framework for Neuroscience Applications
On the roots of Wiener polynomials of graphs
Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction
Average Power and $λ$-power in Multiple Testing Scenarios when the Benjamini-Hochberg False Discovery Rate Procedure is Used
Noisy Feedback and Loss Unlimited Private Communication
Efficient C-RAN Random Access for IoT Devices: Learning Links via Recommendation Systems
Regularly varying non-stationary Galton–Watson processes with immigration
Minimax Optimality of Sign Test for Paired Heterogeneous Data
Adaptive Bit Allocation for OFDM Cognitive Radio Systems with Imperfect Channel Estimation
Enhancing Underwater Imagery using Generative Adversarial Networks
Non-Rigid Image Registration Using Self-Supervised Fully Convolutional Networks without Training Data
Brain Age Prediction Based on Resting-State Functional Connectivity Patterns Using Convolutional Neural Networks
A Hardware-Friendly Algorithm for Scalable Training and Deployment of Dimensionality Reduction Models on FPGA
On Partially Overlapping Coexistence for Dynamic Spectrum Access in Cognitive Radio
Spatio-Temporal Pricing for Ridesharing Platforms
Did William Shakespeare and Thomas Kyd Write Edward III?
Application of a semantic segmentation convolutional neural network for accurate automatic detection and mapping of solar photovoltaic arrays in aerial imagery
Cognitive Non-Orthogonal Multiple Access with Cooperative Relaying: A New Wireless Frontier for 5G Spectrum Sharing
A Simplified Coding Scheme for the Broadcast Channel with Complementary Receiver Side Information under Individual Secrecy Constraints
Asymptotic Static Hedge via Symmetrization
Communication Optimality Trade-offs For Distributed Estimation
A3T: Adversarially Augmented Adversarial Training
Emergent memory in cell signaling: Persistent adaptive dynamics in cascades can arise from the diversity of relaxation time-scales
Deep Stereo Matching with Explicit Cost Aggregation Sub-Architecture
Content Based Status Updates
Status Updates in a multi-stream M/G/1/1 preemptive queue
Controller Synthesis for Safety of Physically-Viable Data-Driven Models
How to augment a small learning set for improving the performances of a CNN-based steganalyzer?
Optimal control of an evolution equation with non-smooth dissipation
How should a fixed budget of dwell time be spent in scanning electron microscopy to optimize image quality?
On notions of Q-independence and Q-identical distributiveness
Multivariate stochastic delay differential equations and CAR representations of CARMA processes
Sensitivity indices for independent groups of variables
Hierarchical Motion Consistency Constraint for Efficient Geometrical Verification in UAV Image Matching
Exceptional and modern intervals of the Tamari lattice
Combinatorics of compactified universal Jacobians
Planning with Trust for Human-Robot Collaboration
Spatio-Temporal Linkage over Location Enhanced Services
Generative Single Image Reflection Separation
Self-Predicting Boolean Functions
Multiple Antennas Secure Transmission under Pilot Spoofing and Jamming Attack
Active repositioning of storage units in Robotic Mobile Fulfillment Systems
Perfect codes in generalized Fibonacci cubes
Couplings in L^p distance of two Brownian motions and their Lévy area
Clinical and Non-clinical Effects on Surgery Duration: Statistical Modeling and Analysis
On the goodness-of-fit of generalized linear geostatistical models
A Game Theoretic Approach to Hyperbolic Consensus Problems
Improved bounds on the multicolor Ramsey numbers of paths and even cycles
Interpretation of the vibrational spectra of glassy polymers using coarse-grained simulations
On Partly Overloaded Spreading Sequences with Variable Spreading Factor
Deep Episodic Memory: Encoding, Recalling, and Predicting Episodic Experiences for Robot Action Execution
First-passage times over moving boundaries for asymptotically stable walks
Cosmic String Detection with Tree-Based Machine Learning
Local asymptotic self-similarity for heavy tailed harmonizable fractional Lévy motions
Second order models for optimal transport and cubic splines on the Wasserstein space
Variational Second-Order Interpolation on the Group of Diffeomorphisms with a Right-Invariant Metric
Bayesian Quadrature for Multiple Related Integrals
Can Who-Edits-What Predict Edit Survival?
QuickNAT: Segmenting MRI Neuroanatomy in 20 seconds
Arhuaco: Deep Learning and Isolation Based Security for Distributed High-Throughput Computing
Youla Coding and Computation of Gaussian Feedback Capacity
Computing permanents of complex diagonally dominant matrices and tensors
A Simple and Efficient Estimation Method for Models with Nonignorable Missing Data
Determining Projection Constants of Univariate Polynomial Spaces
Multinomial logistic model for coinfection diagnosis between arbovirus and malaria in Kedougou
A unifying Perron-Frobenius theorem for nonnegative tensors via multi-homogeneous maps
Machine Intelligence Techniques for Next-Generation Context-Aware Wireless Networks
List Decoding of Locally Repairable Codes
Optimal Streaming Codes for Channels with Burst and Arbitrary Erasures
Development of Energy Models for Design Space Exploration of Embedded Many-Core Systems
Inexact cuts in Deterministic and Stochastic Dual Dynamic Programming applied to linear optimization problems
Safe Privatization in Transactional Memory
Graph domination-saturation
On projective and affine equivalence of sub-Riemannian metrics
Conditional Probability Models for Deep Image Compression
Deep saliency: What is learnt by a deep network about saliency?
A note on Herglotz’s theorem for time series on function spaces
Real-world Anomaly Detection in Surveillance Videos
Asynchronous Stochastic Variational Inference
The Control Toolbox – An Open-Source C++ Library for Robotics, Optimal and Model Predictive Control
Generalization Error Bounds for Noisy, Iterative Algorithms
Belief Propagation Decoding of Polar Codes on Permuted Factor Graphs
A Family of Tractable Graph Distances
A Workload Analysis of NSF’s Innovative HPC Resources Using XDMoD
TFisher Tests: Optimal and Adaptive Thresholding for Combining $p$-Values
A Multi-Hop Framework for Multi-Source, Multi-Relay, All-Cast Channels
Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks
Corner cases, singularities, and dynamic factoring
Not All Ops Are Created Equal!
Prototypicality effects in global semantic description of objects
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
The DCS Theorem
Estimating the Number of Connected Components in a Graph via Subgraph Sampling
Predicting Future Lane Changes of Other Highway Vehicles using RNN-based Deep Models
Combining Symbolic and Function Evaluation Expressions In Neural Programs
Susceptibility of power grids to input fluctuations
Engineering Cooperative Smart Things based on Embodied Cognition
A Computational Model of Commonsense Moral Decision Making
Comprehensive Optimization of Parametric Kernels for Graphics Processing Units
On the Capacity Region of the Deterministic Y-Channel with Common and Private Messages
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Feature Space Transfer for Data Augmentation
Coded Cooperative Computation for Internet of Things
Queue-aware Energy Efficient Control for Dense Wireless Networks
Estimation in the group action channel
Is profile likelihood a true likelihood? An argument in favor
Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation
Cohomology rings of compactifications of toric arrangements
Distributed Multi-User Secret Sharing
Phase diagrams of Weyl semimetals with competing diagonal and off-diagonal disorders
A Context-free Grammar for Peaks and Double Descents of Permutations
Social Advantage with Mixed Entangled States
A Survey on Compiler Autotuning using Machine Learning
On the convergence properties of GAN training
Towards a more efficient representation of imputation operators in TPOT
Tight Bounds for $\ell_p$ Oblivious Subspace Embeddings
LDPC Codes over Gaussian Multiple Access Wiretap Channel
Asymptotic Distribution of Multilevel Channel Polarization for a Certain Class of Erasure Channels
Longest Common Prefixes with $k$-Errors and Applications
Sparse NOMA: A Closed-Form Characterization
Detecting Offensive Language in Tweets Using Deep Learning
Aperiodic Sampled-Data Control via Explicit Transmission Mapping: A Set Invariance Approach
Semi-supervised Fisher vector network
Variable-Length Resolvability for Mixed Sources and its Application to Variable-Length Source Coding
Secure Communications in NOMA System: Subcarrier Assignment and Power Allocation
Scalable De Novo Genome Assembly Using Pregel
Size-to-depth: A New Perspective for Single Image Depth Estimation
Boolean functions: noise stability, non-interactive correlation, and mutual information
A Scalable Belief Propagation Algorithm for Radio Signal Based SLAM
Lattice Erasure Codes of Low Rank with Noise Margins
On the Measurement Uncertainty in a Reverberation Chamber Including Frequency Stirring
EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings
Not-All-Equal and 1-in-Degree Decompositions: Algorithmic Complexity and Applications
Channel Whispering: a Protocol for Physical Layer Group Key Generation. Application to IR-UWB through Deconvolution
Waring’s Theorem for Binary Powers
Persistence of one-dimensional AR(1)-sequences
Can Computers Create Art?
Better Runtime Guarantees Via Stochastic Domination
A Stochastic Singular Vector Based MIMO Channel Model for MAC Layer Tracking
On a statistical approach to mate choices in reproduction
Irreversible investment with fixed adjustment costs: a stochastic impulse control approach
An Explicit Convergence Rate for Nesterov’s Method from SDP
Model Predictive Control in Spacecraft Rendezvous and Soft Docking
Near-optimal approximation algorithm for simultaneous Max-Cut
Fast Methods for Solving the Cluster Containment Problem for Phylogenetic Networks
Extinction time of a CB-processes with competition in a Lévy random environment
Saturated equiangular lines in Euclidean spaces
Non-Orthogonal Multiple Access for mmWave Drone Networks with Limited Feedback
Polynomial stability of exact solution and a numerical method for stochastic differential equations with time-dependent delay
Shrink or Substitute: Handling Process Failures in HPC Systems using In-situ Recovery
A Bio-inspired Collision Detector for Small Quadcopter
Regularity of stochastic nonlocal diffusion equations
Compressed Neighbour Discovery using Sparse Kerdock Matrices
Fix your classifier: the marginal value of training the last weight layer
Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication
Hire the Experts: Combinatorial Auction Based Scheme for Experts Selection in E-Healthcare
Throughput Maximization for UAV-Enabled Wireless Powered Communication Networks
Frame Moments and Welch Bound with Erasures
Properties of non-symmetric Macdonald polynomials at $q=1$ and $q=0$
Energy-Efficient Resource Allocation in NOMA Heterogeneous Networks
Remarks on Graphons
Poisson Cox Point Processes for Vehicular Networks
On the effect of blockage objects in dense MIMO SWIPT networks
Asymptotic Enumeration of Graph Classes with Many Components
Towards Realistic Threat Modeling: Attack Commodification, Irrelevant Vulnerabilities, and Unrealistic Assumptions
Fully Quantum Arbitrarily Varying Channels: Random Coding Capacity and Capacity Dichotomy
Distributed dynamic load balancing for task parallel programming
The method of hypergraph containers
A Bayesian Evidence Synthesis Approach to Estimate Disease Prevalence in Hard-To-Reach Populations: Hepatitis C in New York City
Deep Reinforcement Fuzzing
Frame-Recurrent Video Super-Resolution
On the shape factor of interaction laws for a non-local approximation of the Sobolev norm and the total variation
On Identifying a Massive Number of Distributions
Stochastic quantization of an Abelian gauge theory
New Perspectives on Multi-Prover Interactive Proofs
Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis
PACER: Peripheral Activity Completion Estimation and Recognition
A functional limit theorem for the profile of random recursive trees
Algorithmic Polynomials
Some Generalizations of Good Integers and Their Applications in the Study of Self-Dual Negacyclic Codes
Some remarks on biased recursive trees
Hierarchical Memory Management for Mutable State
Top k Memory Candidates in Memory Networks for Common Sense Reasoning
Theorems About Integration Order Replacement in Multiple Ito Stochastic Integrals
An Elementary Dyadic Riemann Hypothesis
Strategies for Stable Merge Sorting
Generalized Lambert Series Identities and Applications in Rank Differences
Renewal in Hawkes processes with self-excitation and inhibition
Non-Orthogonal Multiple Access For Cooperative Communications: Challenges, Opportunities, And Trends
Deep Net Triage: Assessing the Criticality of Network Layers by Structural Compression
Hyperspectral recovery from RGB images using Gaussian Processes
Information Geometric Approach to Bayesian Lower Error Bounds
The Circular Law for Random Matrices with Intra-row Dependence
Efficient Trimmed Convolutional Arithmetic Encoding for Lossless Image Compression
The decoding failure probability of MDPC codes
Fault-Tolerant Hotelling Games
Partial geodesics on symmetric groups endowed with breakpoint distance
Approximation of Excessive Backlog Probabilities of Two Tandem Queues
Efficient arithmetic regularity and removal lemmas for induced bipartite patterns
Hierarchical Coding for Distributed Computing
Asymptotic Correlation Structure of Discounted Incurred But Not Reported Claims under Fractional Poisson Arrival Process
Towards Imperceptible and Robust Adversarial Example Attacks against Neural Networks
Sparsity-based Defense against Adversarial Attacks on Linear Classifiers
Robust capacitated trees and networks with uniform demands
Searching for Maximum Out-Degree Vertices in Tournaments
Inclusion-exclusion by ordering-free cancellation
Sensitivity analysis for multiscale stochastic reaction networks using hybrid approximations
Robust Inference for Seemingly Unrelated Regression Models
Combining Stereo Disparity and Optical Flow for Basic Scene Flow
Spectral engineering and tunable thermoelectric behavior in a quasiperiodic ladder network
The Communication-Hiding Conjugate Gradient Method with Deep Pipelines
Full Wafer Redistribution and Wafer Embedding as Key Technologies for a Multi-Scale Neuromorphic Hardware Cluster
Secure Adaptive Group Testing
Mixing Time on the Kagome Lattice
Distributionally Robust Optimization for Sequential Decision Making
Directed Strongly Regular Cayley Graphs on Dihedral groups
SAR Image Despeckling Using Quadratic-Linear Approximated L1-Norm
On the Distribution of Random Geometric Graphs
A Tight Converse to the Spectral Resolution Limit via Convex Programming
Two High-performance Schemes of Transmit Antenna Selection for Secure Spatial Modulation
Detecting dynamic spatial correlation patterns with generalized wavelet coherence and non-stationary surrogate data
Block-coordinate primal-dual method for the nonsmooth minimization over linear constraints
Subpolynomial trace reconstruction for random strings and arbitrary deletion probability
Approximating the Incremental Knapsack Problem
New LMRD bounds for constant dimension codes and improved constructions
A partial order on Motzkin paths
Predicting Movie Genres Based on Plot Summaries
Robots as Powerful Allies for the Study of Embodied Cognition from the Bottom Up
Improving Communication Patterns in Polyhedral Process Networks
Mixing Time for Square Tilings
Empirical $L^2$-distance test statistics for ergodic diffusions
System-Aware Compression
Improving Orbit Prediction Accuracy through Supervised Machine Learning
Randomized projection methods for convex feasibility problems: conditioning and convergence rates
Classification of histopathological breast cancer images using iterative VMD aided Zernike moments & textural signatures
Coding over Sets for DNA Storage
Unsupervised Cipher Cracking Using Discrete GANs
Non-Orthogonal Multiple Access for Mobile VLC Networks with Random Receiver Orientation
Sending Information Through Status Updates