Averages of Unlabeled Networks: Geometric Characterization and Asymptotic Behavior

It is becoming increasingly common to see large collections of network data objects — that is, data sets in which a network is viewed as a fundamental unit of observation. As a result, there is a pressing need to develop network-based analogues of even many of the most basic tools already standard for scalar and vector data. In this paper, our focus is on averages of unlabeled, undirected networks with edge weights. Specifically, we (i) characterize a certain notion of the space of all such networks, (ii) describe key topological and geometric properties of this space relevant to doing probability and statistics thereupon, and (iii) use these properties to establish the asymptotic behavior of a generalized notion of an empirical mean under sampling from a distribution supported on this space. Our results rely on a combination of tools from geometry, probability theory, and statistical shape analysis. In particular, the lack of vertex labeling necessitates working with a quotient space modding out permutations of labels. This results in a nontrivial geometry for the space of unlabeled networks, which in turn is found to have important implications on the types of probabilistic and statistical results that may be obtained and the techniques needed to obtain them.

A Brief Introduction to Machine Learning for Engineers

This monograph aims at providing an introduction to key concepts, algorithms, and theoretical frameworks in machine learning, including supervised and unsupervised learning, statistical learning theory, probabilistic graphical models and approximate inference. The intended readership consists of electrical engineers with a background in probability and linear algebra. The treatment builds on first principles, and organizes the main ideas according to clearly defined categories, such as discriminative and generative models, frequentist and Bayesian approaches, exact and approximate inference, directed and undirected models, and convex and non-convex optimization. The mathematical framework uses information-theoretic measures as a unifying tool. The text offers simple and reproducible numerical examples providing insights into key motivations and conclusions. Rather than providing exhaustive details on the existing myriad solutions in each specific category, for which the reader is referred to textbooks and papers, this monograph is meant as an entry point for an engineer into the literature on machine learning.

Combining LSTM and Latent Topic Modeling for Mortality Prediction

There is a great need for technologies that can predict the mortality of patients in intensive care units with both high accuracy and accountability. We present joint end-to-end neural network architectures that combine long short-term memory (LSTM) and a latent topic model to simultaneously train a classifier for mortality prediction and learn latent topics indicative of mortality from textual clinical notes. For topic interpretability, the topic modeling layer has been carefully designed as a single-layer network with constraints inspired by LDA. Experiments on the MIMIC-III dataset show that our models significantly outperform prior models that are based on LDA topics in mortality prediction. However, we achieve limited success with our method for interpreting topics from the trained models by looking at the neural network weights.

TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow

We introduce TensorFlow Agents, an efficient infrastructure paradigm for building parallel reinforcement learning algorithms in TensorFlow. We simulate multiple environments in parallel, and group them to perform the neural network computation on a batch rather than individual observations. This allows the TensorFlow execution engine to parallelize computation, without the need for manual synchronization. Environments are stepped in separate Python processes to progress them in parallel without interference of the global interpreter lock. As part of this project, we introduce BatchPPO, an efficient implementation of the proximal policy optimization algorithm. By open sourcing TensorFlow Agents, we hope to provide a flexible starting point for future projects that accelerates future research in the field.

Convolutional Dictionary Learning

Convolutional sparse representations are a form of sparse representation with a dictionary that has a structure that is equivalent to convolution with a set of linear filters. While effective algorithms have recently been developed for the convolutional sparse coding problem, the corresponding dictionary learning problem is substantially more challenging. Furthermore, although a number of different approaches have been proposed, the absence of thorough comparisons between them makes it difficult to determine which of them represents the current state of the art. The present work both addresses this deficiency and proposes some new approaches that outperform existing ones in certain contexts. A thorough set of performance comparisons indicates a very wide range of performance differences among the existing and proposed methods, and clearly identifies those that are the most effective.

Less Is More: A Comprehensive Framework for the Number of Components of Ensemble Classifiers

The number of component classifiers chosen for an ensemble has a great impact on its prediction ability. In this paper, we use a geometric framework for a priori determining the ensemble size, applicable to most of the existing batch and online ensemble classifiers. There are only a limited number of studies on the ensemble size considering Majority Voting (MV) and Weighted Majority Voting (WMV). Almost all of them are designed for batch-mode, barely addressing online environments. The big data dimensions and resource limitations in terms of time and memory make the determination of the ensemble size crucial, especially for online environments. Our framework proves, for the MV aggregation rule, that the more strong components we can add to the ensemble the more accurate predictions we can achieve. On the other hand, for the WMV aggregation rule, we prove the existence of an ideal number of components equal to the number of class labels, with the premise that components are completely independent of each other and strong enough. While giving the exact definition for a strong and independent classifier in the context of an ensemble is a challenging task, our proposed geometric framework provides a theoretical explanation of diversity and its impact on the accuracy of predictions. We conduct an experimental evaluation with two different scenarios to show the practical value of our theorems.

Approximate Stream Analytics in Apache Flink and Apache Spark Streaming

Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing – based on the chosen sample size – can make a systematic trade-off between the output accuracy and computation efficiency. Unfortunately, the state-of-the-art systems for approximate computing primarily target batch analytics, where the input data remains unchanged during the course of sampling. Thus, they are not well-suited for stream analytics. This motivated the design of StreamApprox – a stream analytics system for approximate computing. To realize this idea, we designed an online stratified reservoir sampling algorithm to produce approximate output with rigorous error bounds. Importantly, our proposed algorithm is generic and can be applied to two prominent types of stream processing systems: (1) batched stream processing such as Apache Spark Streaming, and (2) pipelined stream processing such as Apache Flink. We evaluated StreamApprox using a set of microbenchmarks and real-world case studies. Our results show that Spark- and Flink-based StreamApprox systems achieve a speedup of 1.15\times to 3\times compared to the respective native Spark Streaming and Flink executions, with sampling fractions varying from 80\% to 10\%. Furthermore, we have also implemented an improved baseline in addition to the native execution baseline – a Spark-based approximate computing system leveraging the existing sampling modules in Apache Spark. Compared to the improved baseline, our results show that StreamApprox achieves a speedup of 1.1\times to 2.4\times while maintaining the same accuracy level.
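The idea of stratified reservoir sampling over a stream can be illustrated with a minimal sketch. This is not the StreamApprox algorithm itself: the class name `StratifiedReservoir`, the per-stratum capacity, and the scaled-sum estimator are assumptions made for illustration, and the sketch does not compute error bounds.

```python
import random
from collections import defaultdict


class StratifiedReservoir:
    """Keep a fixed-size uniform sample per stratum (e.g. per sub-stream).

    Hypothetical illustration; names and estimator are assumptions.
    """

    def __init__(self, capacity_per_stratum, seed=None):
        self.capacity = capacity_per_stratum
        self.rng = random.Random(seed)
        self.samples = defaultdict(list)  # stratum -> sampled items
        self.seen = defaultdict(int)      # stratum -> items observed so far

    def add(self, stratum, item):
        """Offer one stream item to the reservoir of its stratum."""
        self.seen[stratum] += 1
        reservoir = self.samples[stratum]
        if len(reservoir) < self.capacity:
            reservoir.append(item)
        else:
            # Standard reservoir step: the new item is kept with
            # probability capacity / seen, replacing a random slot.
            j = self.rng.randrange(self.seen[stratum])
            if j < self.capacity:
                reservoir[j] = item

    def estimate_sum(self):
        """Estimate the sum over all items by scaling each stratum's sample."""
        total = 0.0
        for stratum, reservoir in self.samples.items():
            if reservoir:
                weight = self.seen[stratum] / len(reservoir)
                total += weight * sum(reservoir)
        return total
```

Sampling each stratum separately keeps rare sub-streams represented even when another sub-stream dominates the input, which is the usual motivation for stratification.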

Deep Residual Networks and Weight Initialization

Residual Network (ResNet) is the state-of-the-art architecture that enables successful training of very deep neural networks. It is also known that good weight initialization of a neural network avoids the problem of vanishing/exploding gradients. In this paper, simplified models of ResNets are analyzed. We argue that the good performance of ResNets is correlated with the fact that ResNets are relatively insensitive to the choice of initial weights. We also demonstrate how batch normalization improves backpropagation in deep ResNets without tuning the initial values of the weights.

RDeepSense: Reliable Deep Mobile Computing Models with Uncertainty Estimations

Recent advances in deep learning have led various applications to unprecedented achievements, which could potentially bring higher intelligence to a broad spectrum of mobile and ubiquitous applications. Although existing studies have demonstrated the effectiveness and feasibility of running deep neural network inference operations on mobile and embedded devices, they overlooked the reliability of mobile computing models. Reliability measurements such as predictive uncertainty estimations are key factors for improving the decision accuracy and user experience. In this work, we propose RDeepSense, the first deep learning model that provides well-calibrated uncertainty estimations for resource-constrained mobile and embedded devices. RDeepSense enables the predictive uncertainty by adopting a tunable proper scoring rule as the training criterion and dropout as the implicit Bayesian approximation, which theoretically proves its correctness. To reduce the computational complexity, RDeepSense employs efficient dropout and predictive distribution estimation instead of model ensemble or sampling-based method for inference operations. We evaluate RDeepSense with four mobile sensing applications using Intel Edison devices. Results show that RDeepSense can reduce around 90% of the energy consumption while producing superior uncertainty estimations and preserving at least the same model accuracy compared with other state-of-the-art methods.

Sentiment Polarity Detection for Software Development

The role of sentiment analysis is increasingly emerging to study software developers’ emotions by mining crowd-generated content within social software engineering tools. However, off-the-shelf sentiment analysis tools have been trained on non-technical domains and general-purpose social media, thus resulting in misclassifications of technical jargon and problem reports. Here, we present Senti4SD, a classifier specifically trained to support sentiment analysis in developers’ communication channels. Senti4SD is trained and validated using a gold standard of Stack Overflow questions, answers, and comments manually annotated for sentiment polarity. It exploits a suite of both lexicon- and keyword-based features, as well as semantic features based on word embedding. With respect to a mainstream off-the-shelf tool, which we use as a baseline, Senti4SD reduces the misclassifications of neutral and positive posts as emotionally negative. To encourage replications, we release a lab package including the classifier, the word embedding space, and the gold standard with annotation guidelines.

Optimal Transport for Deep Joint Transfer Learning

Training a Deep Neural Network (DNN) from scratch requires a large amount of labeled data. For a classification task where only a small amount of training data is available, a common solution is to fine-tune a DNN that is pre-trained with related source data. This consecutive training process is time consuming and does not explicitly consider the relatedness between different source and target tasks. In this paper, we propose a novel method to jointly fine-tune a Deep Neural Network with source data and target data. By adding an Optimal Transport loss (OT loss) between source and target classifier predictions as a constraint on the source classifier, the proposed Joint Transfer Learning Network (JTLN) can effectively learn useful knowledge for target classification from source data. Furthermore, by using different kinds of metrics as the cost matrix for the OT loss, JTLN can incorporate different prior knowledge about the relatedness between target categories and source categories. We carried out experiments with JTLN based on Alexnet on image classification datasets and the results verify the effectiveness of the proposed JTLN in comparison with standard consecutive fine-tuning. This Joint Transfer Learning with OT loss is general and can also be applied to other kinds of neural networks.

Classifying Unordered Feature Sets with Convolutional Deep Averaging Networks

Unordered feature sets are a nonstandard data structure that traditional neural networks are incapable of addressing in a principled manner. Providing a concatenation of features in an arbitrary order may lead to the learning of spurious patterns or biases that do not actually exist. Another complication is introduced if the number of features varies between each set. We propose convolutional deep averaging networks (CDANs) for classifying and learning representations of datasets whose instances comprise variable-size, unordered feature sets. CDANs are efficient, permutation-invariant, and capable of accepting sets of arbitrary size. We emphasize the importance of nonlinear feature embeddings for obtaining effective CDAN classifiers and illustrate their advantages in experiments versus linear embeddings and alternative permutation-invariant and -equivariant architectures.

Robust Sparse Coding via Self-Paced Learning

Sparse coding (SC) is attracting more and more attention due to its comprehensive theoretical studies and its excellent performance in many signal processing applications. However, most existing sparse coding algorithms are nonconvex and are thus prone to becoming stuck in bad local minima, especially when there are outliers and noisy data. To enhance the learning robustness, in this paper, we propose a unified framework named Self-Paced Sparse Coding (SPSC), which gradually includes matrix elements into SC learning from easy to complex. We also generalize the self-paced learning schema into different levels of dynamic selection on samples, features and elements respectively. Experimental results on real-world data demonstrate the efficacy of the proposed algorithms.

Robustness of Interdependent Random Geometric Networks

We propose an interdependent random geometric graph (RGG) model for interdependent networks. Based on this model, we study the robustness of two interdependent spatially embedded networks where interdependence exists between geographically nearby nodes in the two networks. We study the emergence of the giant mutual component in two interdependent RGGs as node densities increase, and define the percolation threshold as a pair of node densities above which the giant mutual component first appears. In contrast to the case for a single RGG, where the percolation threshold is a unique scalar for a given connection distance, for two interdependent RGGs, multiple pairs of percolation thresholds may exist, given that a smaller node density in one RGG may increase the minimum node density in the other RGG in order for a giant mutual component to exist. We derive analytical upper bounds on the percolation thresholds of two interdependent RGGs by discretization, and obtain 99\% confidence intervals for the percolation thresholds by simulation. Based on these results, we derive conditions for the interdependent RGGs to be robust under random failures and geographical attacks.

Robust Routing in Interdependent Networks

We consider a model of two interdependent networks, where every node in one network depends on one or more supply nodes in the other network and a node fails if it loses all of its supply nodes. We develop algorithms to compute the failure probability of a path, and obtain the most reliable path between a pair of nodes in a network, under the condition that each supply node fails independently with a given probability. Our work generalizes the classical shared risk group model, by considering multiple risks associated with a node and letting a node fail if all the risks occur. Moreover, we study the diverse routing problem by considering two paths between a pair of nodes. We define two paths to be d-failure resilient if at least one path survives after removing d or fewer supply nodes, which generalizes the concept of disjoint paths in a single network, and risk-disjoint paths in a classical shared risk group model. We compute the probability that both paths fail, and develop algorithms to compute the most reliable pair of paths.
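The failure model described above (a node fails only when all of its supply nodes fail, and supply nodes fail independently) can be made concrete with a brute-force sketch. This is not the algorithm developed in the paper: it enumerates every supply-node state and is therefore exponential in the number of supply nodes. The function name and the input layout are assumptions made for illustration.

```python
from itertools import product


def path_failure_probability(path, supply, p_fail):
    """Exact failure probability of a path by enumerating supply-node states.

    Hypothetical helper for illustration only.
    path:   list of node names on the path
    supply: dict mapping each node to the set of supply nodes it depends on
    p_fail: dict mapping each supply node to its independent failure probability
    """
    relevant = sorted(set().union(*(supply[v] for v in path)))
    prob_survive = 0.0
    for states in product([False, True], repeat=len(relevant)):
        failed = {s for s, is_failed in zip(relevant, states) if is_failed}
        # The path survives only if every node keeps at least one supply node.
        if all(supply[v] - failed for v in path):
            weight = 1.0
            for s, is_failed in zip(relevant, states):
                weight *= p_fail[s] if is_failed else 1.0 - p_fail[s]
            prob_survive += weight
    return 1.0 - prob_survive
```

For a path of two nodes with supply sets {a, b} and {b, c} and a failure probability of 0.5 for each supply node, the function returns 0.375. Treating the two node failures as independent would give 1 - 0.75 * 0.75 = 0.4375, which is wrong here because both nodes share supply node b.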

A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data

Gated Recurrent Unit (GRU) is a recently published variant of the Long Short-Term Memory (LSTM) network, designed to solve the vanishing gradient and exploding gradient problems. However, its main objective is to solve the long-term dependency problem in Recurrent Neural Networks (RNNs), which prevents the network from connecting information from a previous iteration with the current iteration. This study proposes a modification of the GRU model, using a Support Vector Machine (SVM) as its classifier instead of the Softmax function. The classifier is responsible for the output of a network in a classification problem. SVM was chosen over Softmax for its computational efficiency. To evaluate the proposed model, it will be used for intrusion detection, with the 2013 dataset from Kyoto University’s honeypot system serving as both its training and testing data.

Computational Machines in a Coexistence with Concrete Universals and Data Streams

We discuss how the majority of traditional modeling approaches follow the idealism point of view in scientific modeling, which follows the set theoretical notions of models based on abstract universals. We show that, while successful in many classical modeling domains, set theoretical models face fundamental limits when applied to complex systems with many potential aspects or properties depending on the perspective. As an alternative to abstract universals, we propose a conceptual modeling framework based on concrete universals that can be interpreted as a category theoretical approach to modeling. We call this modeling framework pre-specific modeling. We further discuss how a certain group of mathematical and computational methods, along with ever-growing data streams, are able to operationalize the concept of pre-specific modeling.

WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams

If we cannot store all edges in a graph stream, which edges should we store to estimate the triangle count accurately? Counting triangles (i.e., cycles of length three) is a fundamental graph problem with many applications in social network analysis, web mining, anomaly detection, etc. Recently, much effort has been made to accurately estimate global and local triangle counts in streaming settings with limited space. Although existing methods use sampling techniques without considering temporal dependencies in edges, we observe temporal locality in real dynamic graphs. That is, future edges are more likely to form triangles with recent edges than with older edges. In this work, we propose a single-pass streaming algorithm called Waiting-Room Sampling (WRS) for global and local triangle counting. WRS exploits the temporal locality by always storing the most recent edges, which future edges are more likely to form triangles with, in the waiting room, while it uses reservoir sampling for the remaining edges. Our theoretical and empirical analyses show that WRS is: (a) Fast and ‘any time’: runs in linear time, always maintaining and updating estimates while new edges arrive, (b) Effective: yields up to 47% smaller estimation error than its best competitors, and (c) Theoretically sound: gives unbiased estimates with small variances under the temporal locality.
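The storage policy described above (a waiting room that always holds the most recent edges, plus reservoir sampling for edges that leave it) can be sketched as follows. This covers only which edges are kept, not the triangle-count estimator or its unbiasedness correction. The class name, method names, and size parameters are assumptions made for illustration rather than the authors' implementation.

```python
import random
from collections import deque


class WaitingRoomSampler:
    """Hypothetical sketch of waiting-room plus reservoir edge storage."""

    def __init__(self, waiting_room_size, reservoir_size, seed=None):
        self.waiting_room = deque()  # most recent edges, in arrival order
        self.waiting_room_size = waiting_room_size
        self.reservoir = []          # uniform sample of older edges
        self.reservoir_size = reservoir_size
        self.evicted = 0             # edges that have left the waiting room
        self.rng = random.Random(seed)

    def add_edge(self, edge):
        """Store a new edge; the oldest waiting edge moves on if the room is full."""
        self.waiting_room.append(edge)
        if len(self.waiting_room) > self.waiting_room_size:
            oldest = self.waiting_room.popleft()
            self._offer_to_reservoir(oldest)

    def _offer_to_reservoir(self, edge):
        """Standard reservoir sampling over the edges evicted so far."""
        self.evicted += 1
        if len(self.reservoir) < self.reservoir_size:
            self.reservoir.append(edge)
        else:
            j = self.rng.randrange(self.evicted)
            if j < self.reservoir_size:
                self.reservoir[j] = edge

    def stored_edges(self):
        """All edges currently kept in memory."""
        return list(self.waiting_room) + self.reservoir
```

Under this policy the newest edges are stored with certainty, while each older edge is kept with equal probability, which matches the temporal-locality argument in the abstract.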

MBMF: Model-Based Priors for Model-Free Reinforcement Learning

Reinforcement Learning is divided into two main paradigms: model-free and model-based. Each of these two paradigms has strengths and limitations, and has been successfully applied to real world domains that are appropriate to its corresponding strengths. In this paper, we present a new approach aimed at bridging the gap between these two paradigms. We aim to take the best of the two paradigms and combine them in an approach that is at the same time data-efficient and cost-savvy. We do so by learning a probabilistic dynamics model and leveraging it as a prior for the intertwined model-free optimization. As a result, our approach can exploit the generality and structure of the dynamics model, but is also capable of ignoring its inevitable inaccuracies, by directly incorporating the evidence provided by the direct observation of the cost. As a proof-of-concept, we demonstrate on simulated tasks that our approach outperforms purely model-based and model-free approaches, as well as the approach of simply switching from a model-based to a model-free setting.

R2N2: Residual Recurrent Neural Networks for Multivariate Time Series Forecasting

Multivariate time-series modeling and forecasting is an important problem with numerous applications. Traditional approaches such as VAR (vector auto-regressive) models and more recent approaches such as RNNs (recurrent neural networks) are indispensable tools in modeling time-series data. In many multivariate time series modeling problems, there is usually a significant linear dependency component, for which VARs are suitable, and a nonlinear component, for which RNNs are suitable. Modeling such time series with only VAR or only RNNs can lead to poor predictive performance or complex models with large training times. In this work, we propose a hybrid model called R2N2 (Residual RNN), which first models the time series with a simple linear model (like VAR) and then models its residual errors using RNNs. R2N2s can be trained using existing algorithms for VARs and RNNs. Through an extensive empirical evaluation on two real world datasets (aviation and climate domains), we show that R2N2 is competitive with, and usually better than, VAR or RNN used alone. We also show that R2N2 is faster to train than an RNN, while requiring fewer hidden units.

Debbie, the Debate Bot of the Future

Chatbots are a rapidly expanding application of dialogue systems with companies switching to bot services for customer support, and new applications for users interested in casual conversation. One style of casual conversation is argument; many people love nothing more than a good argument. Moreover, there are a number of existing corpora of argumentative dialogues, annotated for agreement and disagreement, stance, sarcasm and argument quality. This paper introduces Debbie, a novel arguing bot that selects arguments from conversational corpora and aims to use them appropriately in context. We present an initial working prototype of Debbie with some preliminary evaluation, and describe future work.

The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: A User Survey

Graph processing is becoming increasingly prevalent across many application domains. In spite of this prevalence, there is little research about how graphs are actually used in practice. We conducted an online survey aimed at understanding: (i) the types of graphs users have; (ii) the graph computations users run; (iii) the types of graph software users use; and (iv) the major challenges users face when processing their graphs. We describe the responses of the participants to our questions, highlighting common patterns and challenges. The participants’ responses revealed surprising facts about graph processing in practice, which we hope can guide future research.

Data Discovery and Anomaly Detection Using Atypicality: Theory

A central question in the era of ‘big data’ is what to do with the enormous amount of information. One possibility is to characterize it through statistics, e.g., averages, or classify it using machine learning, in order to understand the general structure of the overall data. The perspective in this paper is the opposite, namely that most of the value in the information in some applications is in the parts that deviate from the average, that are unusual, atypical. We define what we mean by ‘atypical’ in an axiomatic way as data that can be encoded with fewer bits in itself rather than using the code for the typical data. We show that this definition has good theoretical properties. We then develop an implementation based on universal source coding, and apply this to a number of real world data sets.

Semi-Supervised Active Clustering with Weak Oracles

Semi-supervised active clustering (SSAC) utilizes the knowledge of a domain expert to cluster data points by interactively making pairwise ‘same-cluster’ queries. However, it is impractical to ask human oracles to answer every pairwise query. In this paper, we study the influence of allowing ‘not-sure’ answers from a weak oracle and propose algorithms to efficiently handle uncertainties. Different types of model assumptions are analyzed to cover realistic scenarios of oracle abstraction. In the first model, random-weak oracle, an oracle randomly abstains with a certain probability. We also propose two distance-weak oracle models which simulate the case of getting confused based on the distance between two points in a pairwise query. For each weak oracle model, we show that a small query complexity is adequate for effective k-means clustering with high probability. Sufficient conditions for the guarantee include a \gamma-margin property of the data, and the existence of a point close to each cluster center. Furthermore, we provide a sample complexity with a reduced effect of the cluster’s margin and only a logarithmic dependency on the data dimension. Our results allow significantly fewer same-cluster queries if the margin of the clusters is tight, i.e. \gamma \approx 1. Experimental results on synthetic data show the effective performance of our approach in overcoming uncertainties.

Evolution of Convolutional Highway Networks

Convolutional highways are deep networks based on multiple stacked convolutional layers for feature preprocessing. We introduce an evolutionary algorithm (EA) for optimization of the structure and hyperparameters of convolutional highways and demonstrate the potential of this optimization setting on the well-known MNIST data set. The (1+1)-EA employs Rechenberg’s mutation rate control, and a niching mechanism for overcoming local optima adapts the optimization approach. An experimental study shows that the EA is capable of improving the state-of-the-art network contribution and of evolving highway networks from scratch.

Collaborative Reuse of Streaming Dataflows in IoT Applications

Distributed Stream Processing Systems (DSPS) like Apache Storm and Spark Streaming enable composition of continuous dataflows that execute persistently over data streams. They are used by Internet of Things (IoT) applications to analyze sensor data from Smart City cyber-infrastructure, and make active utility management decisions. As the ecosystem of such IoT applications that leverage shared urban sensor streams continues to grow, applications will perform duplicate pre-processing and analytics tasks. This offers the opportunity to collaboratively reuse the outputs of overlapping dataflows, thereby improving the resource efficiency. In this paper, we propose dataflow reuse algorithms that, given a submitted dataflow, identify the intersection of reusable tasks and streams from a collection of running dataflows to form a merged dataflow. Similar algorithms to unmerge dataflows when they are removed are also proposed. We implement these algorithms for the popular Apache Storm DSPS, and validate their performance and resource savings for 35 synthetic dataflows based on public OPMW workflows with diverse arrival and departure distributions, and on 21 real IoT dataflows from RIoTBench.

Simultaneous Dimension Reduction and Clustering via the NMF-EM Algorithm

Mixture models are among the most popular tools for model-based clustering. However, when the dimension and the number of clusters are large, the estimation as well as the interpretation of the clusters become challenging. We propose a reduced-dimension mixture model, where the parameters of the K components are combinations of words from a small dictionary – say H words with H \ll K. Including a Nonnegative Matrix Factorization (NMF) in the EM algorithm allows simultaneous estimation of the dictionary and the parameters of the mixture. We propose the acronym NMF-EM for this algorithm. This original approach is motivated by passenger clustering from ticketing data: we apply NMF-EM to ticketing data from two Transdev public transport networks. In this case, the words are easily interpreted as typical slots in a timetable.

Why Do Deep Neural Networks Still Not Recognize These Images?: A Qualitative Analysis on Failure Cases of ImageNet Classification

Over the past decade, ImageNet has become the most notable and powerful benchmark database in the computer vision and machine learning communities. As ImageNet has emerged as a representative benchmark for evaluating the performance of novel deep learning models, its evaluation tends to include only quantitative measures such as error rate, rather than qualitative analysis. Thus, there are few studies that analyze the failure cases of deep learning models on ImageNet, though there are numerous works analyzing the networks themselves and visualizing them. In this abstract, we qualitatively analyze the failure cases of ImageNet classification results from a recent deep learning model, and categorize these cases according to certain image patterns. We believe this failure analysis can reveal the remaining challenges in the ImageNet database to which current deep learning models are still vulnerable.

Cosmological Polytopes and the Wavefunction of the Universe
Globally Normalized Reader
Diversity of uniform intersecting families
Analysis of Unobserved Heterogeneity via Accelerated Failure Time Models Under Bayesian and Classical Approaches
Heat kernel estimates for non-symmetric stable-like processes
Obstructions to a small hyperbolicity in Helly graphs
Cosmic Divergence, Weak Cosmic Convergence, and Fixed Points at Infinity
Reversible Coalescing-Fragmentating Wasserstein Dynamics on the Real Line
CLaC at SemEval-2016 Task 11: Exploring linguistic and psycho-linguistic Features for Complex Word Identification
Uncertainty measurement with belief entropy on interference effect in Quantum-Like Bayesian Networks
Improving Heterogeneous Face Recognition with Conditional Adversarial Networks
Mixed Integer Programming with Convex/Concave Constraints: Fixed-Parameter Tractability and Applications to Multicovering and Voting
Roll-back Hamiltonian Monte Carlo
Towards information optimal simulation of partial differential equations
Privacy in Feedback: The Differentially Private LQG
Prosocial learning agents solve generalized Stag Hunts better than selfish ones
Dimension reduction in the context of structured deformations
On a class of quaternary complex Hadamard matrices
Variable Annealing Length and Parallelism in Simulated Annealing
Dynamic mode decomposition for interconnected control systems
Degrees of Freedom of the Broadcast Channel with Hybrid CSI at Transmitter and Receivers
Optimization assisted MCMC
Simultaneously Learning Neighborship and Projection Matrix for Supervised Dimensionality Reduction
Learning a Dilated Residual Network for SAR Image Despeckling
Estimating the theoretical error rate for prediction
Inverse Mapping for Rainfall-Runoff Models using History Matching Approach
Image Processing Operations Identification via Convolutional Neural Network
A Simple Analysis for Exp-concave Empirical Minimization with Arbitrary Convex Regularizer
A New Approximation Guarantee for Monotone Submodular Function Maximization via Discrete Convexity
Semi-Supervised Instance Population of an Ontology using Word Vector Embeddings
Sublinear-Time Algorithms for Compressive Phase Retrieval
On Low-Risk Heavy Hitters and Sparse Recovery Schemes
Graph Scaling Cut with L1-Norm for Classification of Hyperspectral Images
Joint Calibration of Panoramic Camera and Lidar Based on Supervised Learning
Model Distillation with Knowledge Transfer in Face Classification, Alignment and Verification
Identifying combinatorially symmetric Hidden Markov Models
Expected number of real zeros of random Taylor Series
Urban morphology meets deep learning: Exploring urban forms in one million cities, town and villages across the planet
How to Train Triplet Networks with 100K Identities?
Multigroup discrimination based on weighted local projections
Spectral Efficiency of Multipair Massive MIMO Two-Way Relaying with Hardware Impairments
Spectral and Energy Efficiency of Cell-Free Massive MIMO Systems with Hardware Impairments
Nonexistence of generalized strong external difference families
Sequential 3D U-Nets for Biologically-Informed Brain Tumor Segmentation
Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database
Optimization of Massive Full-Dimensional MIMO for Positioning and Communication
On exponential type Orlicz spaces of random variables
(Co)monads in Free Probability Theory
A Deep Structured Learning Approach Towards Automating Connectome Reconstruction from 3D Electron Micrographs
Energy Trade-off in Ground-to-UAV Communication via Trajectory Design
Optimal Detection for Diffusion-Based Molecular Timing Channels
A recursion on maximal chains in the Tamari lattices
Extremal $k$-forcing sets in oriented graphs
Large monochromatic components and long monochromatic cycles in random hypergraphs
Can you tell a face from a HEVC bitstream?
Balancing Communication and Computation in Distributed Optimization
Scaled Rate Optimization for Beta-Binomial Models
Identifying Irregular Power Usage by Turning Predictions into Holographic Spatial Visualizations
How to Train a CAT: Learning Canonical Appearance Transformations for Robust Direct Localization Under Illumination Change
Steering Output Style and Topic in Neural Response Generation
Support Equalities Among Ribbon Schur Functions
Global Convergence of Arbitrary-Block Gradient Methods for Generalized Polyak-Łojasiewicz Functions
One-sample aggregate data meta-analysis of medians
The Capacity of Private Information Retrieval with Private Side Information
Optimal Sensor Design and Zero-Delay Source Coding for Continuous-Time Vector Gauss-Markov Processes
Convolutional Neural Networks: Ensemble Modeling, Fine-Tuning and Unsupervised Semantic Localization
Abductive Matching in Question Answering
Distributed Block-diagonal Approximation Methods for Regularized Empirical Risk Minimization
Complete Classification of Generalized Santha-Vazirani Sources
Improving average ranking precision in user searches for biomedical research datasets
A DC Programming Approach for Solving Multicast Network Design Problems via the Nesterov Smoothing Technique
AppTechMiner: Mining Applications and Techniques from Scientific Articles
Quasi-polynomial Hitting Sets for Circuits with Restricted Parse Trees
Transversals, plexes, and multiplexes in iterated quasigroups
Constructing Strata to solve Sample Allocation Problems by Grouping Genetic Algorithm
Regularity of symbolic powers of cover ideals of graphs
A Product Shape Congruity Measure via Entropy in Shape Scale Space
Efficient Online Linear Optimization with Approximation Algorithms
The Golden Quantizer: The Complex Gaussian Random Variable Case
G-thinker: Big Graph Mining Made Easier and Faster
Cognitive networks: brains, internet, and civilizations
A Detail Based Method for Linear Full Reference Image Quality Prediction
Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach
DPC-Net: Deep Pose Correction for Visual Localization
Expectation thinning operators based on linear fractional probability generating functions
Location Privacy in Mobile Edge Clouds: A Chaff-based Approach
Methods in Estimation of Convex Sets
Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps (Masters Thesis)
Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps
Quiver mutation and combinatorial DT-invariants
Stable super-resolution limit and smallest singular value of restricted Fourier matrices
Recent progress in log-concave density estimation
Bayesian bandits: balancing the exploration-exploitation tradeoff via double sampling
Variational inference for the multi-armed contextual bandit
On portfolios generated by optimal transport
An Iterative Regression Approach for Face Pose Estimation from RGB Images
Hyperfinite graphings and combinatorial optimization
Rates of Convergence of Spectral Methods for Graphon Estimation
Applying ACO To Large Scale TSP Instances
Data-Driven Dialogue Systems for Social Agents
Data Discovery and Anomaly Detection Using Atypicality: Signal Processing Methods
Deep multi-frame face hallucination for face identification
A Note on Property Testing Sum of Squares and Multivariate Polynomial Interpolation
3D Densely Convolution Networks for Volumetric Segmentation
Recurrent neural networks based Indic word-wise script identification using character-wise training
Double-line rigid origami
New characterization and parametrization of LCD Codes
Enumeration of Labelled and Unlabelled Hamiltonian Cycles in Complete $k$-partite Graphs
Fairness Testing: Testing Software for Discrimination
On Revenue Monotonicity in Combinatorial Auctions
Connecting thermodynamic and dynamical anomalies of water-like liquid-liquid phase transition in the Fermi-Jagla model
Enumeration of $r$-regular Maps on the Torus. Part I: Enumeration of Rooted and Sensed Maps
The bounded derived category of a poset
Enumeration of $r$-regular Maps on the Torus. Part II: Enumeration of Unsensed Maps
Fast construction of efficient composite likelihood equations
On better training the infinite restricted Boltzmann machines
Centralized Recursive Optimal Scheduling of Parallel Buck Regulated Battery Modules
Evaluation of Classical Features and Classifiers in Brain-Computer Interface Tasks
A Short Note on Proximity-based Scoring of Documents with Multiple Fields
Report: Performance comparison between C2075 and P100 GPU cards using cosmological correlation functions
Secure and Trustable Distributed Aggregation based on Kademlia
Local null controllability of the control-affine nonlinear systems with time-varying disturbances. Direct calculation of the null controllable region
Mining relevant interval rules
Expert Opinion Extraction from a Biomedical Database
Beyond Empirical Models: Pattern Formation Driven Placement of UAV Base Stations
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
Uncertainty quantification in urban drainage simulation: fast surrogates for sensitivity analysis and model calibration
Note on list star edge-coloring of subcubic graphs
Root Separation for Trinomials
Cellular Automaton Based Simulation of Large Pedestrian Facilities – A Case Study on the Staten Island Ferry Terminals
Pilot Optimization and Power Allocation for OFDM-based Full-duplex Relay Networks with IQ-imbalances
Discriminant chronicles mining: Application to care pathways analytics
Additive energy forward curves in a Heath-Jarrow-Morton framework
A determinant-free method to simulate the parameters of large Gaussian fields
What does fault tolerant Deep Learning need from MPI?
Odd length in Weyl groups
Performance Analysis of Massive MIMO Networks with Random Unitary Pilot Matrices
weedNet: Dense Semantic Weed Classification Using Multispectral Images and MAV for Smart Farming
Twin subgraphs and core-semiperiphery-periphery structures
Coherence resonance in a network of FitzHugh-Nagumo systems: interplay of noise, time-delay and topology
Autonomous Quadrotor Landing using Deep Reinforcement Learning
Optimal non-asymptotic bound of the Ruppert-Polyak averaging without strong convexity
Strong convergence of the Euler–Maruyama approximation for a class of Lévy-driven SDEs
The Ramsey-Turán problem for cliques
Chromatic symmetric functions and H-free graphs
Crossover from impurity-controlled to granular superconductivity in (TMTSF)2ClO4
A Planning Approach to Monitoring Behavior of Computer Programs
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning
Ghost Penalties in Nonconvex Constrained Optimization: Diminishing Stepsizes and Iteration Complexity
Automated Identification of Trampoline Skills Using Computer Vision Extracted Pose Estimation
A Domain-specific Language for High-reliability Software used in the JUICE SWI Instrument – The hO Language Manual
Social Media Text Processing and Semantic Analysis for Smart Cities
Asymptotic normality of Laplacian coefficients of graphs
Generic Sketch-Based Retrieval Learned without Drawing a Single Sketch
One-Shot Learning for Semantic Segmentation
The optimal exponential rate for acute sets
Ruin Probability of Mixed fractional Brownian motion
Ensemble Methods as a Defense to Adversarial Perturbations Against Deep Neural Networks
Constant-Weight Array Codes
A Longitudinal Diagnostic Classification Model
Positive polynomials on unbounded domains
Coin-flipping, ball-dropping, and grass-hopping for generating random graphs from matrices of edge probabilities
The Diverse Cohort Selection Problem: Multi-Armed Bandits with Varied Pulls
Optimal Control Problem in a Stochastic Model with Periodic Hits on the Boundary of a Given Subset of the State Set (Tuning Problem)
Control of a single-particle localization in open quantum systems
UI-Net: Interactive Artificial Neural Networks for Iterative Image Segmentation Based on a User Model
Lattice size of 2D and 3D polytopes with respect to the cube
On the use of the Edgeworth expansion in cosmology I: how to foresee and evade its pitfalls
Lattice size of polygons with respect to the standard simplex
CLAD: A Complex and Long Activities Dataset with Rich Crowdsourced Annotations
Online Learning in Weakly Coupled Markov Decision Processes: A Convergence Time Study
Optimal subgraph structures in scale-free configuration models
On the TAP free energy in the mixed $p$-spin models
Bayesian inference, model selection and likelihood estimation using fast rejection sampling: the Conway-Maxwell-Poisson distribution
Is completeness necessary? Penalized estimation in non-identified models
Cutoff for biased transpositions
Exploring the Single-Particle Mobility Edge in a One-Dimensional Quasiperiodic Optical Lattice
Combining Strategic Learning and Tactical Search in Real-Time Strategy Games
Deep Generative Filter for Motion Deblurring
NiftyNet: a deep-learning platform for medical imaging
On compact packings of the plane with circles of three radii
Energy Harvesting Communications under Explicit and Implicit Temperature Constraints