Online Dynamic Programming

We consider the problem of repeatedly solving a variant of the same dynamic programming problem in successive trials. An instance of the type of problems we consider is to find the optimal binary search tree. At the beginning of each trial, the learner probabilistically chooses a tree with the n keys at the internal nodes and the n + 1 gaps between keys at the leaves. It is then told the frequencies of the keys and gaps and is charged by the average search cost for the chosen tree. The problem is online because the frequencies can change between trials. The goal is to develop algorithms with the property that their total average search cost (loss) in all trials is close to the total loss of the best tree chosen in hind sight for all trials. The challenge, of course, is that the algorithm has to deal with exponential number of trees. We develop a methodology for tackling such problems for a wide class of dynamic programming algorithms. Our framework allows us to extend online learning algorithms like Hedge and Component Hedge to a significantly wider class of combinatorial objects than was possible before.

MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU

In this paper, we explore optimizations to run Recurrent Neural Network (RNN) models locally on mobile devices. RNN models are widely used for Natural Language Processing, Machine Translation, and other tasks. However, existing mobile applications that use RNN models do so on the cloud. To address privacy and efficiency concerns, we show how RNN models can be run locally on mobile devices. Existing work on porting deep learning models to mobile devices focus on Convolution Neural Networks (CNNs) and cannot be applied directly to RNN models. In response, we present MobiRNN, a mobile-specific optimization framework that implements GPU offloading specifically for mobile GPUs. Evaluations using an RNN model for activity recognition shows that MobiRNN does significantly decrease the latency of running RNN models on phones.

Task-specific Word Identification from Short Texts Using a Convolutional Neural Network

Task-specific word identification aims to choose the task-related words that best describe a short text. Existing approaches require well-defined seed words or lexical dictionaries (e.g., WordNet), which are often unavailable for many applications such as social discrimination detection and fake review detection. However, we often have a set of labeled short texts where each short text has a task-related class label, e.g., discriminatory or non-discriminatory, specified by users or learned by classification algorithms. In this paper, we focus on identifying task-specific words and phrases from short texts by exploiting their class labels rather than using seed words or lexical dictionaries. We consider the task-specific word and phrase identification as feature learning. We train a convolutional neural network over a set of labeled texts and use score vectors to localize the task-specific words and phrases. Experimental results on sentiment word identification show that our approach significantly outperforms existing methods. We further conduct two case studies to show the effectiveness of our approach. One case study on a crawled tweets dataset demonstrates that our approach can successfully capture the discrimination-related words/phrases. The other case study on fake review detection shows that our approach can identify the fake-review words/phrases.

IDK Cascades: Fast Deep Learning by Learning not to Overthink

Advances in deep learning have led to substantial increases in prediction accuracy as well as the cost of rendering predictions. We conjecture that for a majority of real-world inputs, the recent advances in deep learning have created models that effectively ‘over-think’ on simple inputs. In this paper we revisit the classic idea of prediction cascades to reduce prediction costs. We introduce the ‘I Don’t Know’ (IDK) prediction cascades framework, a general framework for constructing prediction cascades for arbitrary multi-class prediction tasks. We propose two baseline methods for constructing cascades as well as a new objective within this framework and evaluate these techniques on a range of benchmark and real-world datasets to demonstrate the prediction cascades can achieve 1.7-10.5x speedups in image classification tasks while maintaining comparable accuracy to state-of-the-art models. When combined with human experts, prediction cascades can achieve nearly perfect accuracy(within 5%) while requiring human intervention on less than 30% of the queries.

Learning by Association – A versatile semi-supervised training method for neural networks

In many real-world scenarios, labeled data for a specific machine learning task is costly to obtain. Semi-supervised training methods make use of abundantly available unlabeled data and a smaller number of labeled examples. We propose a new framework for semi-supervised training of deep neural networks inspired by learning in humans. ‘Associations’ are made from embeddings of labeled samples to those of unlabeled ones and back. The optimization schedule encourages correct association cycles that end up at the same class from which the association was started and penalizes wrong associations ending at a different class. The implementation is easy to use and can be added to any existing end-to-end training setup. We demonstrate the capabilities of learning by association on several data sets and show that it can improve performance on classification tasks tremendously by making use of additionally available unlabeled data. In particular, for cases with few labeled data, our training scheme outperforms the current state of the art on SVHN.

Financial Series Prediction: Comparison Between Precision of Time Series Models and Machine Learning Methods

Precise financial series predicting has long been a difficult problem because of unstableness and many noises within the series. Although Traditional time series models like ARIMA and GARCH have been researched and proved to be effective in predicting, their performances are still far from satisfying. Machine Learning, as an emerging research field in recent years, has brought about many incredible improvements in tasks such as regressing and classifying, and it’s also promising to exploit the methodology in financial time series predicting. In this paper, the predicting precision of financial time series between traditional time series models and mainstream machine learning models including some state-of-the-art ones of deep learning are compared through experiment using real stock index data from history. The result shows that machine learning as a modern method far surpasses traditional models in precision.

Semantic Vector Encoding and Similarity Search Using Fulltext Search Engines

Vector representations and vector space modeling (VSM) play a central role in modern machine learning. We propose a novel approach to `vector similarity searching’ over dense semantic representations of words and documents that can be deployed on top of traditional inverted-index-based fulltext engines, taking advantage of their robustness, stability, scalability and ubiquity. We show that this approach allows the indexing and querying of dense vectors in text domains. This opens up exciting avenues for major efficiency gains, along with simpler deployment, scaling and monitoring. The end result is a fast and scalable vector database with a tunable trade-off between vector search performance and quality, backed by a standard fulltext engine such as Elasticsearch. We empirically demonstrate its querying performance and quality by applying this solution to the task of semantic searching over a dense vector representation of the entire English Wikipedia.

Semi-supervised Classification: Cluster and Label Approach using Particle Swarm Optimization

Classification predicts classes of objects using the knowledge learned during the training phase. This process requires learning from labeled samples. However, the labeled samples usually limited. Annotation process is annoying, tedious, expensive, and requires human experts. Meanwhile, unlabeled data is available and almost free. Semi-supervised learning approaches make use of both labeled and unlabeled data. This paper introduces cluster and label approach using PSO for semi-supervised classification. PSO is competitive to traditional clustering algorithms. A new local best PSO is presented to cluster the unlabeled data. The available labeled data guides the learning process. The experiments are conducted using four state-of-the-art datasets from different domains. The results compared with Label Propagation a popular semi-supervised classifier and two state-of-the-art supervised classification models, namely k-nearest neighbors and decision trees. The experiments show the efficiency of the proposed model.

Swarm Intelligence in Semi-supervised Classification

This Paper represents a literature review of Swarm intelligence algorithm in the area of semi-supervised classification. There are many research papers for applying swarm intelligence algorithms in the area of machine learning. Some algorithms of SI are applied in the area of ML either solely or hybrid with other ML algorithms. SI algorithms are also used for tuning parameters of ML algorithm, or as a backbone for ML algorithms. This paper introduces a brief literature review for applying swarm intelligence algorithms in the field of semi-supervised learning

Planning with Multiple Biases

Recent work has considered theoretical models for the behavior of agents with specific behavioral biases: rather than making decisions that optimize a given payoff function, the agent behaves inefficiently because its decisions suffer from an underlying bias. These approaches have generally considered an agent who experiences a single behavioral bias, studying the effect of this bias on the outcome. In general, however, decision-making can and will be affected by multiple biases operating at the same time. How do multiple biases interact to produce the overall outcome? Here we consider decisions in the presence of a pair of biases exhibiting an intuitively natural interaction: present bias — the tendency to value costs incurred in the present too highly — and sunk-cost bias — the tendency to incorporate costs experienced in the past into one’s plans for the future. We propose a theoretical model for planning with this pair of biases, and we show how certain natural behavioral phenomena can arise in our model only when agents exhibit both biases. As part of our model we differentiate between agents that are aware of their biases (sophisticated) and agents that are unaware of them (naive). Interestingly, we show that the interaction between the two biases is quite complex: in some cases, they mitigate each other’s effects while in other cases they might amplify each other. We obtain a number of further results as well, including the fact that the planning problem in our model for an agent experiencing and aware of both biases is computationally hard in general, though tractable under more relaxed assumptions.

CRNN: A Joint Neural Network for Redundancy Detection

This paper proposes a novel framework for detecting redundancy in supervised sentence categorisation. Unlike traditional singleton neural network, our model incorporates character-aware convolutional neural network (Char-CNN) with character-aware recurrent neural network (Char-RNN) to form a convolutional recurrent neural network (CRNN). Our model benefits from Char-CNN in that only salient features are selected and fed into the integrated Char-RNN. Char-RNN effectively learns long sequence semantics via sophisticated update mechanism. We compare our framework against the state-of-the-art text classification algorithms on four popular benchmarking corpus. For instance, our model achieves competing precision rate, recall ratio, and F1 score on the Google-news data-set. For twenty-news-groups data stream, our algorithm obtains the optimum on precision rate, recall ratio, and F1 score. For Brown Corpus, our framework obtains the best F1 score and almost equivalent precision rate and recall ratio over the top competitor. For the question classification collection, CRNN produces the optimal recall rate and F1 score and comparable precision rate. We also analyse three different RNN hidden recurrent cells’ impact on performance and their runtime efficiency. We observe that MGU achieves the optimal runtime and comparable performance against GRU and LSTM. For TFIDF based algorithms, we experiment with word2vec, GloVe, and sent2vec embeddings and report their performance differences.

Stochastic Reformulations of Linear Systems: Algorithms and Convergence Theory

We develop a family of reformulations of an arbitrary consistent linear system into a stochastic problem. The reformulations are governed by two user-defined parameters: a positive definite matrix defining a norm, and an arbitrary discrete or continuous distribution over random matrices. Our reformulation has several equivalent interpretations, allowing for researchers from various communities to leverage their domain specific insights. In particular, our reformulation can be equivalently seen as a stochastic optimization problem, stochastic linear system, stochastic fixed point problem and a probabilistic intersection problem. We prove sufficient, and necessary and sufficient conditions for the reformulation to be exact. Further, we propose and analyze three stochastic algorithms for solving the reformulated problem—basic, parallel and accelerated methods—with global linear convergence rates. The rates can be interpreted as condition numbers of a matrix which depends on the system matrix and on the reformulation parameters. This gives rise to a new phenomenon which we call stochastic preconditioning, and which refers to the problem of finding parameters (matrix and distribution) leading to a sufficiently small condition number. Our basic method can be equivalently interpreted as stochastic gradient descent, stochastic Newton method, stochastic proximal point method, stochastic fixed point method, and stochastic projection method, with fixed stepsize (relaxation parameter), applied to the reformulations.

InfiniteBoost: building infinite ensembles with gradient descent

In machine learning ensemble methods have demonstrated high accuracy for the variety of problems in different areas. The most known algorithms intensively used in practice are random forests and gradient boosting. In this paper we present InfiniteBoost – a novel algorithm, which combines the best properties of these two approaches. The algorithm constructs the ensemble of trees for which two properties hold: trees of the ensemble incorporate the mistakes done by others; at the same time the ensemble could contain the infinite number of trees without the over-fitting effect. The proposed algorithm is evaluated on the regression, classification, and ranking tasks using large scale, publicly available datasets.

Deep MIMO Detection

In this paper, we consider the use of deep neural networks in the context of Multiple-Input-Multiple-Output (MIMO) detection. We give a brief introduction to deep learning and propose a modern neural network architecture suitable for this detection task. First, we consider the case in which the MIMO channel is constant, and we learn a detector for a specific system. Next, we consider the harder case in which the parameters are known yet changing and a single detector must be learned for all multiple varying channels. We demonstrate the performance of our deep MIMO detector using numerical simulations in comparison to competing methods including approximate message passing and semidefinite relaxation. The results show that deep networks can achieve state of the art accuracy with significantly lower complexity while providing robustness against ill conditioned channels and mis-specified noise variance.

Deep Frame Interpolation

This work presents a supervised learning based approach to the computer vision problem of frame interpolation. The presented technique could also be used in the cartoon animations since drawing each individual frame consumes a noticeable amount of time. The most existing solutions to this problem use unsupervised methods and focus only on real life videos with already high frame rate. However, the experiments show that such methods do not work as well when the frame rate becomes low and object displacements between frames becomes large. This is due to the fact that interpolation of the large displacement motion requires knowledge of the motion structure thus the simple techniques such as frame averaging start to fail. In this work the deep convolutional neural network is used to solve the frame interpolation problem. In addition, it is shown that incorporating the prior information such as optical flow improves the interpolation quality significantly.

Empirical Bayes Matrix Completion

We develop an empirical Bayes (EB) algorithm for the matrix completion problems. The EB algorithm is motivated from the singular value shrinkage estimator for matrix means by Efron and Morris (1972). Unlike existing methods, the EB algorithm does not require heuristic parameter tuning other than tolerance. Numerical results demonstrated that the EB algorithm achieves a good trade-off between accuracy and efficiency compared to existing algorithms and that it works particularly well when the difference between the number of rows and columns is large. Application to real data also shows the practical utility of the EB algorithm.

Learning Neural Programs To Parse Programs

In this work, we study an important problem: learning programs from input-output examples. We propose a novel method to learn a neural program operating a domain-specific non-differentiable machine, and demonstrate that this method can be applied to learn programs that are significantly more complex than the ones synthesized before: programming language parsers from input-output pairs without knowing the underlying grammar. The main challenge is to train the neural program without supervision on execution traces. To tackle it, we propose: (1) LL machines and neural programs operating them to effectively regularize the space of the learned programs; and (2) a two-phase reinforcement learning-based search technique to train the model. Our evaluation demonstrates that our approach can successfully learn to parse programs in both an imperative language and a functional language, and achieve 100% test accuracy, while existing approaches’ accuracies are almost 0%. This is the first successful demonstration of applying reinforcement learning to train a neural program operating a non-differentiable machine that can fully generalize to test sets on a non-trivial task.

Submanifold Sparse Convolutional Networks

Convolutional network are the de-facto standard for analysing spatio-temporal data such as images, videos, 3D shapes, etc. Whilst some of this data is naturally dense (for instance, photos), many other data sources are inherently sparse. Examples include pen-strokes forming on a piece of paper, or (colored) 3D point clouds that were obtained using a LiDAR scanner or RGB-D camera. Standard ‘dense’ implementations of convolutional networks are very inefficient when applied on such sparse data. We introduce a sparse convolutional operation tailored to processing sparse data that differs from prior work on sparse convolutional networks in that it operates strictly on submanifolds, rather than ‘dilating’ the observation with every layer in the network. Our empirical analysis of the resulting submanifold sparse convolutional networks shows that they perform on par with state-of-the-art methods whilst requiring substantially less computation.

Deep learning evaluation using deep linguistic processing

We discuss problems with the standard approaches to evaluation for tasks like visual question answering, and argue that artificial data can be used to address these as a complement to current practice. We demonstrate that with the help of existing ‘deep’ linguistic processing technology we are able to create challenging abstract datasets, which enable us to investigate the language understanding abilities of multimodal deep learning models in detail.

Event Representations for Automated Story Generation with Deep Neural Nets

Automated story generation is the problem of automatically selecting a sequence of events, actions, or words that can be told as a story. We seek to develop a system that can generate stories by learning everything it needs to know from textual story corpora. To date, recurrent neural networks that learn language models at character, word, or sentence levels have had little success generating coherent stories. We explore the question of event representations that provide a mid-level of abstraction between words and sentences in order to retain the semantic information of the original data while minimizing event sparsity. We present a technique for preprocessing textual story data into event sequences. We then present a technique for automated story generation whereby we decompose the problem into the generation of successive events (event2event) and the generation of natural language sentences from events (event2sentence). We give empirical results comparing different event representations and their effects on event successor generation and the translation of events to natural language.

ToPs: Ensemble Learning with Trees of Predictors

We present a new approach to ensemble learning. Our approach constructs a tree of subsets of the feature space and associates a predictor (predictive model) – determined by training one of a given family of base learners on an endogenously determined training set – to each node of the tree; we call the resulting object a tree of predictors. The (locally) optimal tree of predictors is derived recursively; each step involves jointly optimizing the split of the terminal nodes of the previous tree and the choice of learner and training set (hence predictor) for each set in the split. The feature vector of a new instance determines a unique path through the optimal tree of predictors; the final prediction aggregates the predictions of the predictors along this path. We derive loss bounds for the final predictor in terms of the Rademacher complexity of the base learners. We report the results of a number of experiments on a variety of datasets, showing that our approach provides statistically significant improvements over state-of-the-art machine learning algorithms, including various ensemble learning methods. Our approach works because it allows us to endogenously create more complex learners – when needed – and endogenously match both the learner and the training set to the characteristics of the dataset while still avoiding over-fitting.

NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphical Processing Units (GPUs) are most often used in training and deploying CNNs, their power consumption becomes a problem for real time mobile applications. We propose a flexible and efficient CNN accelerator architecture which can support the implementation of SOA CNNs in low-power and low-latency application scenarios. This architecture exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across a wide range of convolutional network kernel sizes; and numbers of input and output feature maps. We implemented the proposed architecture on an FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. We show how in RTL simulations in a 28nm process with a clock frequency of 500MHz, the NullHop core is able to reach over 450 GOp/s and efficiency of 368%, maintaining over 98% utilization of the MAC units and achieving a power efficiency of over 3TOp/s/W in a core area of 5.8mm2

Double Sparsity Kernel Learning with Automatic Variable Selection and Data Extraction

Learning with Reproducing Kernel Hilbert Spaces (RKHS) has been widely used in many scientific disciplines. Because a RKHS can be very flexible, it is common to impose a regularization term in the optimization to prevent overfitting. Standard RKHS learning employs the squared norm penalty of the learning function. Despite its success, many challenges remain. In particular, one cannot directly use the squared norm penalty for variable selection or data extraction. Therefore, when there exists noise predictors, or the underlying function has a sparse representation in the dual space, the performance of standard RKHS learning can be suboptimal. In the literature,work has been proposed on how to perform variable selection in RKHS learning, and a data sparsity constraint was considered for data extraction. However, how to learn in a RKHS with both variable selection and data extraction simultaneously remains unclear. In this paper, we propose a unified RKHS learning method, namely, DOuble Sparsity Kernel (DOSK) learning, to overcome this challenge. An efficient algorithm is provided to solve the corresponding optimization problem. We prove that under certain conditions, our new method can asymptotically achieve variable selection consistency. Simulated and real data results demonstrate that DOSK is highly competitive among existing approaches for RKHS learning.

Visual Interaction Networks

From just a glance, humans can make rich predictions about the future state of a wide range of physical systems. On the other hand, modern approaches from engineering, robotics, and graphics are often restricted to narrow domains and require direct measurements of the underlying states. We introduce the Visual Interaction Network, a general-purpose model for learning the dynamics of a physical system from raw visual observations. Our model consists of a perceptual front-end based on convolutional neural networks and a dynamics predictor based on interaction networks. Through joint training, the perceptual front-end learns to parse a dynamic visual scene into a set of factored latent object representations. The dynamics predictor learns to roll these states forward in time by computing their interactions and dynamics, producing a predicted physical trajectory of arbitrary length. We found that from just six input video frames the Visual Interaction Network can generate accurate future trajectories of hundreds of time steps on a wide range of physical systems. Our model can also be applied to scenes with invisible objects, inferring their future states from their effects on the visible objects, and can implicitly infer the unknown mass of objects. Our results demonstrate that the perceptual module and the object-based dynamics predictor module can induce factored latent representations that support accurate dynamical predictions. This work opens new opportunities for model-based decision-making and planning from raw sensory observations in complex physical environments.

Ensemble Bayesian Optimization

Bayesian Optimization (BO) has been shown to be a very effective paradigm for tackling hard black-box and non-convex optimization problems encountered in Machine Learning. Despite these successes, the computational complexity of the underlying function approximation has restricted the use of BO to problems that can be handled with less than a few thousand function evaluations. Harder problems like those involving functions operating in very high dimensional spaces may require hundreds of thousands or millions of evaluations or more and become computationally intractable to handle using standard Bayesian Optimization methods. In this paper, we propose Ensemble Bayesian Optimization (EBO) to overcome this problem. Unlike conventional BO methods that operate on a single posterior GP model, EBO works with an ensemble of posterior GP models. Further, we represent each GP model using tile coding random features and an additive function structure. Our approach generates speedups by parallelizing the time consuming hyper-parameter posterior inference and functional evaluations on hundreds of cores and aggregating the models in every iteration of BO. Our extensive experimental evaluation shows that EBO can speed up the posterior inference between 2-3 orders of magnitude (400 times in one experiment) compared to the state-of-the-art by putting data into Mondrian bins without sacrificing the sample quality. We demonstrate the ability of EBO to handle sample-intensive hard optimization problems by applying it to a rover navigation problem with tens of thousands of observations.

SimDex: Exploiting Model Similarity in Exact Matrix Factorization Recommendations

We present SimDex, a new technique for serving exact top-K recommendations on matrix factorization models that measures and optimizes for the similarity between users in the model. Previous serving techniques presume a high degree of similarity (e.g., L2 or cosine distance) among users and/or items in MF models; however, as we demonstrate, the most accurate models are not guaranteed to exhibit high similarity. As a result, brute-force matrix multiply outperforms recent proposals for top-K serving on several collaborative filtering tasks. Based on this observation, we develop SimDex, a new technique for serving matrix factorization models that automatically optimizes serving based on the degree of similarity between users, and outperforms existing methods in both the high-similarity and low-similarity regimes. SimDexfirst measures the degree of similarity among users via clustering and uses a cost-based optimizer to either construct an index on the model or defer to blocked matrix multiply. It leverages highly efficient linear algebra primitives in both cases to deliver predictions either from its index or from brute-force multiply. Overall, SimDex runs an average of 2x and up to 6x faster than highly optimized baselines for the most accurate models on several popular collaborative filtering datasets.

Dynamics of many-body localization in the presence of particle loss
Zero forcing number, Grundy domination number, and their variants
Two-Point Codes for the Generalized GK curve
A watershed-based algorithm to segment and classify cells in fluorescence microscopy images
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine
Information, Privacy and Stability in Adaptive Data Analysis
Comparative Performance Analysis of the Cumulative Sum Chart and the Shiryaev-Roberts Procedure for Detecting Changes in Autocorrelated Data
One-Sided Unsupervised Domain Mapping
Multi-Class Model Fitting by Energy Minimization and Mode-Seeking
The star sequence and the general first Zagreb index
On the fast convergence of random perturbations of the gradient flow
Higher-order meshing of implicit geometries – part II: Approximations on manifolds
Construction of q-ary Constant Weight Sequences using a Knuth-like Approach
Neureal Network-Based Automatic Liver Tumor Segmentation With Random Forest-Based Candidate Filtering
A Game of Nontransitive Dice
Minimax Optimal Rates of Estimation In Functional ANOVA Models When Data On Derivatives Are Available
A combinatorial proof of Bass’s determinant formula for the zeta function of regular graphs
A Construction for Balancing Non-Binary Sequences Based on Gray Code Prefixes
Multivariate initial sequence estimators in Markov chain Monte Carlo
Multiple Kernel Learning and Automatic Subspace Relevance Determination for High-dimensional Neuroimaging Data
Homogeneity Pursuit in Single Index Models based Panel Data Analysis
Efficient Textual Representation of Structure
On the reduced Euler characteristic of independence complexes of circulant graphs
Inference for penalized spline regression: Improving confidence intervals by reducing the penalty
Active learning machine learns to create new quantum experiments
Spectral Simplicity of Apparent Complexity, Part II: Exact Complexities and Complexity Spectra
Controlling Sources of Inaccuracy in Stochastic Kriging
Wikipedia Vandal Early Detection: from User Behavior to User Embedding
Spectrum-based deep neural networks for fraud detection
Learning Person Trajectory Representations for Team Activity Analysis
Using Negative Curvature in Solving Nonlinear Programs
Optimal Envelope Approximation in Fourier Basis with Applications in TV White Space
Generalized Fit for Asymptotic Predictions of Heavy-Tail Experimental Transients
X-TCP: A Cross Layer Approach for TCP Uplink Flows in mmWave Networks
Heterogeneous Face Attribute Estimation: A Deep Multi-Task Learning Approach
Iterative Particle Approximation for McKean-Vlasov SDEs with application to Multilevel Monte Carlo estimation
Analyzing Random Permutations for Cyclic Coordinate Descent
Bilinear log n – log p relation and critical power-law grain size distribution of crushable aggregates under compression and shear
Inferring protein-protein interaction and protein-DNA interaction directions based on cause-effect pairs in undirected and mixed networks
Deep-Learning Convolutional Neural Networks for scattered shrub detection with Google Earth Imagery
Higher-order meshing of implicit geometries – part III: Conformal Decomposition FEM (CDFEM)
Lorden’s inequality and coupling method for backward renewal process
Simultaneous Inference of User Representations and Trust
Concept Transfer Learning for Adaptive Language Understanding
Concurrence-Aware Long Short-Term Sub-Memories for Person-Person Action Recognition
See, Hear, and Read: Deep Aligned Representations
Non-flat regular polytopes and restrictions on chiral polytopes
Context-aware, Adaptive and Scalable Android Malware Detection through Online Learning (extended version)
On singularity of distribution of random variables with independent symbols of Oppenheim expansions
Experiment Software and Projects on the Web with VISPA
Design and Execution of make-like, distributed Analyses based on Spotify’s Pipelining Package Luigi
Local systems on arrangements of smooth, complex algebraic hypersurfaces
Fermionic approach to weighted Hurwitz numbers and topological recursion
A Bayesian General Linear Modeling Approach to Cortical Surface fMRI Data Analysis
Rates of estimation for determinantal point processes
M/G/c/c state dependent queuing model for a road traffic system of two sections in tandem
$L^1$ solutions of non-reflected BSDEs and reflected BSDEs with one and two continuous barriers under general assumptions
Precious Time: Understanding Social Stratification in the Knowledge Society Through Time Allocation
Flip-distance between α-orientations of graphs embedded on plane and sphere
More Accurate Entity Ranking Using Knowledge Graph and Web Corpus
Thompson Sampling for the MNL-Bandit
Graph-Cut RANSAC
Isolated partial Hadamard matrices, and related topics
Visuospatial Skill Learning for Robots
A Fast x86 Implementation of Select
Linear response and moderate deviations: hierarchical approach. II
Block patterns in generalized Euler Permutations
Center of Gravity PSO for Partitioning Clustering
Order embeddings and character-level convolutions for multimodal alignment
Image Compression Based on Compressive Sensing: End-to-End Comparison with JPEG
A spectral analysis of discrete-time quantum walks with related to birth and death chains
DeepSF: deep convolutional neural network for mapping protein sequences to folds
Path Integral Quantization of Volume
Non-convex Penalties with Analytical Solutions for One-bit Compressive Sensing
The split-and-drift random graph, a null model for speciation
Where and Who? Automatic Semantic-Aware Person Composition
Distributed Contingency Analysis over Wide Area Network among Dispatch Centers
Adaptive Multiple-Arm Identification
Pfaffian Formulas and Schur Q-Function Identities
A note on conditional versus joint unconditional weak convergence in bootstrap consistency results
Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities
Personalized Age Progression with Bi-level Aging Dictionary Learning
Brain Intelligence: Go Beyond Artificial Intelligence
Why do launch trajectories end downwards
Slow Spin Dynamics and Self-Sustained Clusters in Sparsely Connected Systems
Load Balancing in Large-Scale Systems with Multiple Dispatchers
Face R-CNN
Minimal Dominating Set problem studied by simulated annealing and cavity method: Analytics and population dynamics
Asymptotic Goodness-of-Fit Tests for Point Processes Based on Scaled Empirical K-Functions
Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics
Geometry of the Gibbs measure for the discrete 2D Gaussian free field with scale dependent variance
Nearly Optimal Sampling Algorithms for Combinatorial Pure Exploration
In elections, irrelevant alternatives provide relevant data
Joint Text Embedding for Personalized Content-based Recommendation
Latent Estimation of GDP, GDP per capita, and Population from Historic and Contemporary Sources
SocioSense: Robot Navigation Amongst Pedestrians with Social and Psychological Constraints
A Random-Fern based Feature Approach for Image Matching
Evolving imputation strategies for missing data in classification problems with TPOT
Optimal learning via local entropies and sample compression
Virtual Constraints and Hybrid Zero Dynamics for Realizing Underactuated Bipedal Locomotion
Strategic Dynamic Pricing with Network Externalities
Segmentation of Intracranial Arterial Calcification with Deeply Supervised Residual Dropout Networks
Linear Capacity of Networks over Ring Alphabets
High-dimensional GARCH process segmentation with an application to Value-at-Risk
On the game total domination number
Graphical Nonconvex Optimization for Optimal Estimation in Gaussian Graphical Models
On the properties of new families of generalized Fibonacci numbers
Distributional Compatibility for Change of Measures
Greedy Approaches to Symmetric Orthogonal Tensor Decomposition
Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Improved Consistent Weighted Sampling Revisited
Maximum Likelihood Signal Amplitude Estimation Based on Permuted Blocks of Differently Binary Quantized Observations of a Signal in Noise
Optimal Rates for Community Estimation in the Weighted Stochastic Block Model
A weighted global GMRES algorithm with deflation for solving large Sylvester matrix equations
PReP: Path-Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks
The Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square
Formation Control of Rigid Graphs with a Flex Node Addition
One-step and Two-step Classification for Abusive Language Detection on Twitter
The number of hypergraphs without linear cycles
A Kind of Affine Weighted Moment Invariants
Pile-up Reduction, Bayesian Decomposition and Applications of Silicon Drift Detectors at LCLS
Forbidden subposet problems for traces of set families
Inconsistent Node Flattening for Improving Top-down Hierarchical Classification
Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework
Computing a minimal partition of partial orders into heapable subsets
Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning
Learning Structured Semantic Embeddings for Visual Recognition
One look at the rating of scientific publications and corresponding toy-model
On the Identifiability of Diagnostic Classification Models
Bayesian LSTMs in medicine
Moral hazard in welfare economics: on the advantage of Planner’s advices to manage employees’ actions
The Classical Complexity of Boson Sampling
Performance evaluation of coherent Ising machines against classical neural networks
Modelling the evaporation of nanoparticle suspensions from heterogeneous surfaces
On the bias caused by interim analyses
The Smith group and the critical group of the Grassmann graph of lines in finite projective space and of its complement
Hybrid Beamforming with Reduced Number of Phase Shifters for Massive MIMO Systems
Spin-charge conversion in disordered two-dimensional electron gases lacking inversion symmetry
3D Pathfinding and Collision Avoidance Using Uneven Search-space Quantization and Visual Cone Search
Power series, the Riordan group and Hopf algebras
Neuroevolution on the Edge of Chaos
Solver composition across the PDE/linear algebra barrier
Balanced Facilities on Random Graphs
On the Emergence of Invariance and Disentangling in Deep Representations
The Schrödinger equation with spatial white noise: the average wave function
Characterization of multivariate Bernoulli distributions with given margins
The Geometry of Nodal Sets and Outlier Detection
Stochastic Integration and Stochastic PDEs Driven by Jumps on the Dual of a Nuclear Space
Sparse Tableau Formulation for Optimal Power Flow Applications
Optimal Design Method of MIMO Antenna Directivities and Corresponding Current Distributions by Using Spherical Mode Expansion
Heat transport coefficients from optimally short molecular dynamics simulations
Neuro-RAM Unit with Applications to Similarity Testing and Compression in Spiking Neural Networks
Sparse Stochastic Bandits
Mendelian Randomization when Many Instruments are Invalid: Hierarchical Empirical Bayes Estimation
Jump Type Stochastic Differential Equations with Non-Lipschitz Coefficients and Feller and Strong Feller Properties
Multi-Observation Elicitation
Language Generation with Recurrent Generative Adversarial Networks without Pre-training
Decay Estimates for 1-D Parabolic PDEs with Boundary Disturbances
A method for the online construction of the set of states of a Markov Decision Process using Answer Set Programming
Learning Whenever Learning is Possible: Universal Learning under General Stochastic Processes
Asymptotically normal estimators for Zipf’s law
Ipanema-β: tools and examples for HEP analysis on GPU
A simple neural network module for relational reasoning
A correspondence between thermodynamics and inference
Strategic Equilibria in Queues with Dynamic Service Rate and Full Information
Hamiltonian Monte Carlo Methods for Subset Simulation in Reliability Analysis
The Capacity of Private Information Retrieval from Byzantine and Colluding Databases
Types of Cognition and its Implications for future High-Level Cognitive Machines
A Joint Model for Question Answering and Question Generation