If you did not already know

Internet Shopping Problem
The problem was introduced by Blazewicz et al. (2010): a customer wants to buy a list of products at the lowest possible total cost from shops that offer discounts when purchases exceed a certain threshold. The problem is NP-hard. …
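To make the problem statement concrete, here is a minimal brute-force sketch in Python. The discount rule (a percentage off a shop's whole basket once it exceeds that shop's threshold), the price table, and all function names are assumptions made for illustration; they are not taken from Blazewicz et al. The search is exponential in the number of products, so it is only usable on toy instances.

```python
from itertools import product

def shop_cost(subtotal, threshold, rate):
    """Hypothetical rule: take `rate` off the whole basket once it exceeds `threshold`."""
    return subtotal * (1 - rate) if subtotal > threshold else subtotal

def cheapest_assignment(prices, discounts):
    """Brute-force search over every product-to-shop assignment.

    prices[s][p]  -- price of product p in shop s (None if not stocked)
    discounts[s]  -- (threshold, rate) pair for shop s
    """
    n_shops, n_products = len(prices), len(prices[0])
    best_cost, best_plan = float("inf"), None
    for plan in product(range(n_shops), repeat=n_products):
        # Skip plans that buy a product from a shop that does not stock it.
        if any(prices[s][p] is None for p, s in enumerate(plan)):
            continue
        subtotals = [0.0] * n_shops
        for p, s in enumerate(plan):
            subtotals[s] += prices[s][p]
        total = sum(shop_cost(subtotals[s], *discounts[s]) for s in range(n_shops))
        if total < best_cost:
            best_cost, best_plan = total, plan
    return best_cost, best_plan

# Made-up instance: two shops, three products.
prices = [[10.0, 20.0, 30.0],
          [9.0, 22.0, 28.0]]
discounts = [(50.0, 0.2), (100.0, 0.1)]
best_cost, best_plan = cheapest_assignment(prices, discounts)
```

In this toy instance, picking the cheapest shop for each product separately misses the discount, while buying everything from the first shop triggers it. Interactions of this kind are what rule out a simple per-product greedy choice.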

EigenPro 2.0
In recent years machine learning methods that nearly interpolate the data have achieved remarkable success. In many settings achieving near-zero training error leads to excellent test results. In this work we show how the mathematical and conceptual simplicity of interpolation can be harnessed to construct a framework for very efficient, scalable and accurate kernel machines. Our main innovation is in constructing kernel machines that output solutions mathematically equivalent to those obtained using standard kernels, yet capable of fully utilizing the available computing power of a parallel computational resource, such as a GPU. Such utilization is key to strong performance since much of the computational resource capability is wasted by the standard iterative methods. The computational resource and data adaptivity of our learned kernels is based on theoretical convergence bounds. The resulting algorithm, which we call EigenPro 2.0, is accurate, principled and very fast. For example, using a single GPU, training on ImageNet with $1.3\times 10^6$ data points and $1000$ labels takes under an hour, while smaller datasets, such as MNIST, take seconds. Moreover, as the parameters are chosen analytically, based on the theory, little tuning beyond selecting the kernel and kernel parameter is needed, further facilitating the practical use of these methods. …

Customer Experience Management (CEM)
Customer experience management (CEM or CXM) is the process that companies use to oversee and track all interactions with a customer over the course of their relationship. This involves the strategy of building around the needs of individual customers. According to Jeananne Rae, companies are realizing that ‘building great consumer experiences is a complex enterprise, involving strategy, integration of technology, orchestrating business models, brand management and CEO commitment.’ …


Document worth reading: “Deep Probabilistic Programming Languages: A Qualitative Study”

Deep probabilistic programming languages try to combine the advantages of deep learning with those of probabilistic programming languages. If successful, this would be a big step forward in machine learning and programming languages. Unfortunately, as of now, this new crop of languages is hard to use and understand. This paper addresses this problem directly by explaining deep probabilistic programming languages and indirectly by characterizing their current strengths and weaknesses.

Distilled News

Granger Causality Online Visualization Tool

This tool finds the Granger causality relationship among the input time series and visualizes the results in a directed causal graph and a directed adjacency matrix. It applies the Lasso-Granger and Copula-Granger algorithms with length of lag l=1.
For more information, please see the following papers:
• Andrew Arnold, Yan Liu, and Naoki Abe. Temporal Causal Modeling with Graphical Granger Methods, KDD 2007.
• Taha Bahadori, Yan Liu. An Examination of Practical Granger Causality Inference, SDM 2013.
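The Lasso-Granger idea with lag 1 can be sketched in a few lines of plain Python: regress each series on the previous values of all series under an L1 penalty, and draw an edge wherever a coefficient stays non-zero. This is only an illustration under stated assumptions, not the tool's implementation. The penalty `lam=0.3`, the synthetic data, and every function name are hypothetical, and the Copula-Granger variant is not shown.

```python
import random

def soft_threshold(value, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that leaves column j out.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta

def lasso_granger(series, lam=0.3):
    """Return directed edges (cause, effect) with a non-zero lag-1 Lasso coefficient."""
    names = list(series)
    length = len(series[names[0]])
    # Centre each series so that no intercept term is needed.
    centred = {}
    for name in names:
        mean = sum(series[name]) / length
        centred[name] = [v - mean for v in series[name]]
    # Row t-1 holds the value of every series at time t-1.
    lagged = [[centred[name][t - 1] for name in names] for t in range(1, length)]
    edges = set()
    for effect in names:
        beta = lasso(lagged, centred[effect][1:], lam)
        edges |= {(cause, effect) for cause, b in zip(names, beta)
                  if cause != effect and b != 0.0}
    return edges

# Synthetic data: y follows x with a one-step delay, z is unrelated noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(400)]
z = [rng.gauss(0, 1) for _ in range(400)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 400)]
edges = lasso_granger({"x": x, "y": y, "z": z})
```

The returned set of `(cause, effect)` pairs corresponds to the non-zero off-diagonal entries of a directed adjacency matrix. Self-loops are dropped here for readability.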

Measuring Levels of Alignment

In my most recent blog, I discussed the idea of aligning the supply of services to market demand. My conceptualization of ‘alignment’ specifically relates to time intervals: i.e. having people at the right place and at the right time – for example, to take advantage of opportunities – is a sign of alignment. Alignment for me is often about the relationship between capacity and incapacity: the ability to supply services versus the inability to satisfy the market demand for those services. In this blog I will be considering the interpretation of charts to evaluate the effectiveness of strategic allocation.

2018 World Cup Predictions using decision trees

In this study, we predict the outcome of the football matches in the FIFA World Cup 2018 to be held in Russia this summer. We do this using classification models over a dataset of historic football results that includes attributes of the playing teams, rating them in attack, midfield, defence, aggression, pressure, chance creation and building ability. The training data was the result of merging international match results with EA games ratings of the teams, matching the timeline of the matches with their respective statistics. Final predictions show the four countries with the best chances of reaching the semifinals as France, Brazil, Spain and Germany, with Spain as the winner.

Intro To Time Series Analysis Part 2: Exercises

In the exercises below, we will explore more of time series analysis. The previous exercise set is here; please follow them in sequence.

Simple Solution to Feature Selection Problems

We discuss a new approach for selecting features from a large set of features, in an unsupervised machine learning framework. In supervised learning such as linear regression or supervised clustering, it is possible to test the predicting power of a set of features (also called independent variables by statisticians, or predictors) using metrics such as goodness of fit with the response (the dependent variable), for instance using the R-squared coefficient. This makes the process of feature selection rather easy. Here this is not feasible. The context could be pure clustering, with no training sets available, for instance in a fraud detection problem. We are also dealing with discrete and continuous variables, possibly including dummy variables that represent categories, such as gender. We assume that no simple statistical model explains the data, so the framework here is model-free, data-driven. In this context, traditional methods are based on information theory metrics to determine which subset of features brings the largest amount of information.

On Deep Tensor Networks and the Nature of Non-Linearity

Teaser: Tensor Networks can be seen as a higher-order generalization of traditional deep neural networks, and yet they lack an explicit non-linearity such as applying the ReLU or sigmoid function as we do with neural nets. A deeper understanding of what nonlinearity actually means, however, reveals that tensor networks can indeed learn non-linear functions. The non-linearity of tensor networks arises solely from the architecture and topology of the network itself.

The 5 Clustering Algorithms Data Scientists Need to Know

In Data Science, we can use clustering analysis to gain some valuable insights from our data by seeing what groups the data points fall into when we apply a clustering algorithm. Today, we're going to look at 5 popular clustering algorithms that data scientists need to know and their pros and cons!
• K-Means Clustering
• Mean-Shift Clustering
• Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
• Expectation-Maximization (EM) Clustering using Gaussian Mixture Models (GMM)
• Agglomerative Hierarchical Clustering
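As an illustration of the first algorithm in the list, here is a minimal K-means sketch (Lloyd's algorithm) in plain Python. The toy points and the starting centroids are made up for this example. A practical implementation would also choose its initial centroids with more care and stop once the assignments no longer change.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster where it is
        centroids = new_centroids
    return labels, centroids

# Two well-separated toy groups.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
labels, centroids = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The two alternating steps (assign, then update) are the whole algorithm. The other four methods in the list differ mainly in how they define a cluster, for example by density or by a probabilistic model.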

MNIST For Machine Learning Beginners With Softmax Regression

This is a tutorial for beginners interested in learning about MNIST and Softmax regression using machine learning (ML) and TensorFlow. When we start learning programming, the first thing we learn to do is print ‘Hello World.’ Just as Hello World is the entry point to programming, MNIST is the starting point for machine learning.
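For readers who have not met softmax regression before, the prediction step can be sketched in plain Python (deliberately without TensorFlow). The weights, biases and the four-value "image" below are toy assumptions; for MNIST the input would have 784 pixel values and there would be 10 classes.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, biases, pixels):
    """Softmax regression: one linear score per class, followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, pixels)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

# Toy stand-in for a flattened image: 4 "pixels" and 3 classes.
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
probs = predict(weights, biases, [0.1, 0.2, 0.9, 0.8])
predicted_class = probs.index(max(probs))
```

Training then amounts to adjusting `weights` and `biases` so that the probability assigned to the correct digit goes up, typically by minimising cross-entropy with gradient descent.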

Technical Implementation of Content Personalization

At first glance, this reminds us of AI, when a machine decides how to manage a task based on statistical data. In fact, this concept is part of the AI phenomenon and makes it possible to develop machine intelligence and improve the decision-making process. According to NVidia, ‘Machine Learning at its most basic is the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world’. In a nutshell, a machine analyzes and recommends information without human participation. This work can be done manually but takes plenty of time and effort. Thanks to huge computing power, modern machines perform data mining, data analytics and predictive modeling more effectively than people do. The next section will be dedicated to a recommender technique based on the machine learning approach.

What's new on arXiv

Learning Equations for Extrapolation and Control

We present an approach to identify concise equations from data using a shallow neural network approach. In contrast to ordinary black-box regression, this approach allows understanding functional relations and generalizing them from observed data to unseen parts of the parameter space. We show how to extend the class of learnable equations for a recently proposed equation learning network to include divisions, and we improve the learning and model selection strategy to be useful for challenging real-world data. For systems governed by analytical expressions, our method can in many cases identify the true underlying equation and extrapolate to unseen domains. We demonstrate its effectiveness by experiments on a cart-pendulum system, where only 2 random rollouts are required to learn the forward dynamics and successfully achieve the swing-up task.

Identifying Causal Effects with the R Package causaleffect

Do-calculus is concerned with estimating the interventional distribution of an action from the observed joint probability distribution of the variables in a given causal structure. All identifiable causal effects can be derived using the rules of do-calculus, but the rules themselves do not give any direct indication whether the effect in question is identifiable or not. Shpitser and Pearl constructed an algorithm for identifying joint interventional distributions in causal models, which contain unobserved variables and induce directed acyclic graphs. This algorithm can be seen as a repeated application of the rules of do-calculus and known properties of probabilities, and it ultimately either derives an expression for the causal distribution, or fails to identify the effect, in which case the effect is non-identifiable. In this paper, the R package causaleffect is presented, which provides an implementation of this algorithm. Functionality of causaleffect is also demonstrated through examples.

Simplifying Probabilistic Expressions in Causal Inference

Obtaining a non-parametric expression for an interventional distribution is one of the most fundamental tasks in causal inference. Such an expression can be obtained for an identifiable causal effect by an algorithm or by manual application of do-calculus. Often we are left with a complicated expression which can lead to biased or inefficient estimates when missing data or measurement errors are involved. We present an automatic simplification algorithm that seeks to eliminate symbolically unnecessary variables from these expressions by taking advantage of the structure of the underlying graphical model. Our method is applicable to all causal effect formulas and is readily available in the R package causaleffect.

Evaluating Ex Ante Counterfactual Predictions Using Ex Post Causal Inference

We derive a formal, decision-based method for comparing the performance of counterfactual treatment regime predictions using the results of experiments that give relevant information on the distribution of treated outcomes. Our approach allows us to quantify and assess the statistical significance of differential performance for optimal treatment regimes estimated from structural models, extrapolated treatment effects, expert opinion, and other methods. We apply our method to evaluate optimal treatment regimes for conditional cash transfer programs across countries where predictions are generated using data from experimental evaluations in other countries and pre-program data in the country of interest.

Neural Ordinary Differential Equations

We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a blackbox differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
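The core idea, a hidden state that evolves according to a learned derivative, can be sketched in plain Python. The sketch below swaps in a fixed-step Euler integrator for the paper's black-box solver and does not show training or backpropagation through the solver. The one-parameter `dynamics` function and all names are hypothetical stand-ins for a real neural network.

```python
import math

def dynamics(h, t, params):
    """Hypothetical stand-in for the network that outputs dh/dt."""
    w, b = params
    return [math.tanh(w * v + b) for v in h]

def odeint_euler(f, h0, t0, t1, params, n_steps=100):
    """Integrate dh/dt = f(h, t) from t0 to t1 with fixed-step Euler."""
    h, t = list(h0), t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        dh = f(h, t, params)
        h = [v + dt * d for v, d in zip(h, dh)]
        t += dt
    return h

# "Forward pass": the output is the hidden state at time t1.
h_out = odeint_euler(dynamics, [0.5, -0.5], 0.0, 1.0, params=(-1.0, 0.0))

# With a single step of size 1, the update is h + f(h), the form of a residual block.
one_step = odeint_euler(dynamics, [0.5], 0.0, 1.0, params=(-1.0, 0.0), n_steps=1)
```

Raising `n_steps` trades speed for numerical precision. An adaptive solver makes that trade-off per input rather than through a fixed step count.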

Neural Code Comprehension: A Learnable Representation of Code Semantics

With the recent success of embeddings in natural language processing, research has been conducted into applying similar methods to code analysis. Most works attempt to process the code directly or use a syntactic tree representation, treating it like sentences written in a natural language. However, none of the existing methods are sufficient to comprehend program semantics robustly, due to structural features such as function calls, branching, and interchangeable order of statements. In this paper, we propose a novel processing technique to learn code semantics, and apply it to a variety of program analysis tasks. In particular, we stipulate that a robust distributional hypothesis of code applies to both human- and machine-generated programs. Following this hypothesis, we define an embedding space, inst2vec, based on an Intermediate Representation (IR) of the code that is independent of the source programming language. We provide a novel definition of contextual flow for this IR, leveraging both the underlying data- and control-flow of the program. We then analyze the embeddings qualitatively using analogies and clustering, and evaluate the learned representation on three different high-level tasks. We show that with a single RNN architecture and pre-trained fixed embeddings, inst2vec outperforms specialized approaches for performance prediction (compute device mapping, optimal thread coarsening); and algorithm classification from raw code (104 classes), where we set a new state-of-the-art.

Forest Packing: Fast, Parallel Decision Forests

Machine learning has an emerging critical role in high-performance computing to modulate simulations, extract knowledge from massive data, and replace numerical models with efficient approximations. Decision forests are a critical tool because they provide insight into model operation that is critical to interpreting learned results. While decision forests are trivially parallelizable, the traversals of tree data structures incur many random memory accesses and are very slow. We present memory packing techniques that reorganize learned forests to minimize cache misses during classification. The resulting layout is hierarchical. At low levels, we pack the nodes of multiple trees into contiguous memory blocks so that each memory access fetches data for multiple trees. At higher levels, we use leaf cardinality to identify the most popular paths through a tree and collocate those paths in cache lines. We extend this layout with out-of-order execution and cache-line prefetching to increase memory throughput. Together, these optimizations increase the performance of classification in ensembles by a factor of four over an optimized C++ implementation and a factor of 50 over a popular R language implementation.

Learning from Chunk-based Feedback in Neural Machine Translation

We empirically investigate learning from partial feedback in neural machine translation (NMT), when partial feedback is collected by asking users to highlight a correct chunk of a translation. We propose a simple and effective way of utilizing such feedback in NMT training. We demonstrate how the common machine translation problem of domain mismatch between training and deployment can be reduced solely based on chunk-level user feedback. We conduct a series of simulation experiments to test the effectiveness of the proposed method. Our results show that chunk-level feedback outperforms sentence-based feedback by up to 2.61% BLEU absolute.

SMarTplan: a Task Planner for Smart Factories

Smart factories are on the verge of becoming the new industrial paradigm, wherein optimization permeates all aspects of production, from concept generation to sales. To fully pursue this paradigm, flexibility in the production means as well as in their timely organization is of paramount importance. AI is playing a major role in this transition, but the scenarios encountered in practice might be challenging for current tools. Task planning is one example where AI enables more efficient and flexible operation through an online automated adaptation and rescheduling of the activities to cope with new operational constraints and demands. In this paper we present SMarTplan, a task planner specifically conceived to deal with real-world scenarios in the emerging smart factory paradigm. Including both special-purpose and general-purpose algorithms, SMarTplan is based on current automated reasoning technology and it is designed to tackle complex application domains. In particular, we show its effectiveness on a logistic scenario, by comparing its specialized version with the general purpose one, and extending the comparison to other state-of-the-art task planners.

Instance-Level Explanations for Fraud Detection: A Case Study

Fraud detection is a difficult problem that can benefit from predictive modeling. However, the verification of a prediction is challenging; for a single insurance policy, the model only provides a prediction score. We present a case study where we reflect on different instance-level model explanation techniques to aid a fraud detection team in their work. To this end, we designed two novel dashboards combining various state-of-the-art explanation techniques. These enable the domain expert to analyze and understand predictions, dramatically speeding up the process of filtering potential fraud cases. Finally, we discuss the lessons learned and outline open research issues.

Restricted Boltzmann Machines: Introduction and Review

The restricted Boltzmann machine is a network of stochastic units with undirected interactions between pairs of visible and hidden units. This model was popularized as a building block of deep learning architectures and has continued to play an important role in applied and theoretical machine learning. Restricted Boltzmann machines carry a rich structure, with connections to geometry, applied algebra, probability, statistics, machine learning, and other areas. The analysis of these models is attractive in its own right and also as a platform to combine and generalize mathematical tools for graphical models with hidden variables. This article gives an introduction to the mathematical analysis of restricted Boltzmann machines, reviews recent results on the geometry of the sets of probability distributions representable by these models, and suggests a few directions for further investigation.

Deep Neural Decision Trees

Deep neural networks have been proven powerful at processing perceptual data, such as images and audio. However, for tabular data, tree-based models are more popular. A nice property of tree-based models is their natural interpretability. In this work, we present Deep Neural Decision Trees (DNDT): tree models realised by neural networks. A DNDT is intrinsically interpretable, as it is a tree. Yet as it is also a neural network (NN), it can be easily implemented in NN toolkits, and trained with gradient descent rather than greedy splitting. We evaluate DNDT on several tabular datasets, verify its efficacy, and investigate similarities and differences between DNDT and vanilla decision trees. Interestingly, DNDT self-prunes at both split and feature-level.

A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress

Inverse reinforcement learning is the problem of inferring the reward function of an observed agent, given its policy or behavior. Researchers perceive IRL both as a problem and as a class of methods. By categorically surveying the current literature in IRL, this article serves as a reference for researchers and practitioners in machine learning to understand the challenges of IRL and select the approaches best suited for the problem at hand. The survey formally introduces the IRL problem along with its central challenges which include accurate inference, generalizability, correctness of prior knowledge, and growth in solution complexity with problem size. The article elaborates how the current methods mitigate these challenges. We further discuss the extensions of traditional IRL methods: (i) inaccurate and incomplete perception, (ii) incomplete model, (iii) multiple rewards, and (iv) non-linear reward functions. This discussion concludes with some broad advances in the research area and currently open research questions.

Tensor-Tensor Product Toolbox

Tensors are higher-order extensions of matrices. In recent work [Kilmer and Martin, 2011], the authors introduced the notion of the t-product, a generalization of matrix multiplication for tensors of order three. The multiplication is based on a convolution-like operation, which can be implemented efficiently using the Fast Fourier Transform (FFT). Based on the t-product, tensors have a linear algebraic structure similar to that of matrices. For example, there is a tensor SVD (t-SVD), which is computable. Using some properties of the FFT, a more efficient way of computing the t-product and t-SVD is given in [C. Lu, et al., 2018]. We develop a Matlab toolbox to implement several basic operations on tensors based on the t-product. The toolbox is available at https://…/tproduct.
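The convolution-like structure of the t-product can be shown directly from its definition: the k-th frontal slice of the result is a circular convolution, over the third index, of the frontal slices of the two inputs. The Python sketch below follows that definition literally rather than using the FFT, so it illustrates the operation and is not the toolbox's method. Tensors are represented as lists of frontal slices, and all function names are assumptions.

```python
def matmul(A, B):
    """Ordinary matrix product for list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matadd(A, B):
    """Entry-wise matrix sum."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def t_product(A, B):
    """t-product of two third-order tensors given as lists of n3 frontal slices.

    C[k] = sum over j of A[j] @ B[(k - j) mod n3]  (circular convolution of slices).
    """
    n3 = len(A)
    rows, cols = len(A[0]), len(B[0][0])
    C = []
    for k in range(n3):
        acc = [[0.0] * cols for _ in range(rows)]
        for j in range(n3):
            acc = matadd(acc, matmul(A[j], B[(k - j) % n3]))
        C.append(acc)
    return C

# A 2x2x2 tensor and the identity tensor (identity first slice, zero elsewhere).
A = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
I = [[[1, 0], [0, 1]],
     [[0, 0], [0, 0]]]
C = t_product(A, I)
```

Multiplying by the identity tensor returns `A` unchanged, mirroring the matrix case. With a single frontal slice the operation reduces to ordinary matrix multiplication.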

In situ TensorView: In situ Visualization of Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are complex systems. They are trained so they can adapt their internal connections to recognize images, texts and more. It is both interesting and helpful to visualize the dynamics within such deep artificial neural networks so that people can understand how these artificial networks are learning and making predictions. In the field of scientific simulations, visualization tools like Paraview have long been utilized to provide insights and understandings. We present in situ TensorView to visualize the training and functioning of CNNs as if they are systems of scientific simulations. In situ TensorView is a loosely coupled in situ visualization open framework that provides multiple viewers to help users to visualize and understand their networks. It leverages the capability of co-processing from Paraview to provide real-time visualization during training and predicting phases. This avoids heavy I/O overhead for visualizing large dynamic systems. Only a small number of lines of code are injected into the TensorFlow framework. The visualization can provide guidance to adjust the architecture of networks, or compress the pre-trained networks. We showcase visualizing the training of LeNet-5 and VGG16 using in situ TensorView.

Meta Continual Learning

Using neural networks in practical settings would benefit from the ability of the networks to learn new tasks throughout their lifetimes without forgetting the previous tasks. This ability is limited in the current deep neural networks by a problem called catastrophic forgetting, where training on new tasks tends to severely degrade performance on previous tasks. One way to lessen the impact of the forgetting problem is to constrain parameters that are important to previous tasks to stay close to the optimal parameters. Recently, multiple competitive approaches for computing the importance of the parameters with respect to the previous tasks have been presented. In this paper, we propose a learning to optimize algorithm for mitigating catastrophic forgetting. Instead of trying to formulate a new constraint function ourselves, we propose to train another neural network to predict parameter update steps that respect the importance of parameters to the previous tasks. In the proposed meta-training scheme, the update predictor is trained to minimize loss on a combination of current and past tasks. We show experimentally that the proposed approach works in the continual learning setting.

A Graph-Theoretic Analysis of Distributed Replicator Dynamic
Relating the cut distance and the weak* topology for graphons
State equation from the spectral structure of human brain activity
GOE Statistics for Levy Matrices
Quadratic Approximation of Generalized Tribonacci Sequences
No Threshold graphs are cospectral
Records from partial comparisons and discrete approximations
Deterministic $O(1)$-Approximation Algorithms to 1-Center Clustering with Outliers
Faster SGD training by minibatch persistency
Opportunistic Scheduling in Underlay Cognitive Radio based Systems: User Selection Probability Analysis
Statistical Optimal Transport via Geodesic Hubs
Couplings for determinantal point processes and their reduced Palm distributions with a view to quantifying repulsiveness
Reducing Property Graph Queries to Relational Algebra for Incremental View Maintenance
Quantum Nash equilibrium in the thermodynamic limit
A Reputation System for Artificial Societies
Movement-efficient Sensor Deployment in Wireless Sensor Networks with Limited Communication Range
Rate-Memory Trade-Off for Caching and Delivery of Correlated Sources
Hybrid Coordination and Control for Multiagent Systems with Input Constraints
Simultaneous Signal Subspace Rank and Model Selection with an Application to Single-snapshot Source Localization
Cluster-robust Standard Errors for Linear Regression Models with Many Controls
Recommending Scientific Videos based on Metadata Enrichment using Linked Open Data
A Novel Mobile Data Contract Design with Time Flexibility
Estimation from Non-Linear Observations via Convex Programming with Application to Bilinear Regression
A variational approach to Data Assimilation in the Solar Wind
Dynamic Multi-Level Multi-Task Learning for Sentence Simplification
Canonical Tensor Decomposition for Knowledge Base Completion
End-to-End Neural Ranking for eCommerce Product Search: an application of task models and textual embeddings
Variance Reduced Three Operator Splitting
On pathwise quadratic variation for cadlag functions
A one-shot quantum joint typicality lemma
Inner bounds via simultaneous decoding in quantum network information theory
Efficient data augmentation for multivariate probit models with panel data: An application to general practitioner decision-making about contraceptives
Unsupervised Deep Multi-focus Image Fusion
COUNTDOWN – three, two, one, low power! A Run-time Library for Energy Saving in MPI Communication Primitives
vsgoftest: An R Package for Goodness-of-Fit Testing Based on Kullback-Leibler Divergence
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
NISQ circuit compilers: search space structure and heuristics
PaMpeR: Proof Method Recommendation System for Isabelle/HOL
Magnetic Resonance Spectroscopy Quantification using Deep Learning
ASIC Implementation of Time-Domain Digital Backpropagation with Deep-Learned Chromatic Dispersion Filters
Self-adaptive Privacy Concern Detection for User-generated Content
Solving Fractional Polynomial Problems by Polynomial Optimization Theory
Painting and Correspondence Coloring of Squares of Planar Graphs with no 4-cycles
Unsupervised Imitation Learning
Agent-Mediated Social Choice
Independent graph of the finite group
Stable Gaussian Process based Tracking Control of Euler-Lagrange Systems
Recurrent DNNs and its Ensembles on the TIMIT Phone Recognition Task
Mixed batches and symmetric discriminators for GAN training
LIL type behaviour of multivariate Levy processes at zero
Belousov-Zhabotinsky reaction in liquid marbles
When Is the Achievable Rate Region Convex in Two-User Massive MIMO Systems
Letter to the Editor
FRnet-DTI: Convolutional Neural Networks for Drug-Target Interaction
Surrogate Outcomes and Transportability
Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings
Markov chains with heavy-tailed increments and asymptotically zero drift
Approximation Strategies for Incomplete MaxSAT
The determinant of the second additive compound of a square matrix: a formula and applications
Semi-supervised Hashing for Semi-Paired Cross-View Retrieval
Automatic segmentation of prostate zones
Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models
Large-Scale Stochastic Sampling from the Probability Simplex
Feature learning based on visual similarity triplets in medical image analysis: A case study of emphysema in chest CT scans
FineTag: Multi-label Retrieval of Attributes at Fine-grained Level in Images
Cooperative Queuing Policies for Effective Human-Multi-Robot Interaction
Gradient flow approach to local mean-field spin systems
Infrared and Visible Image Fusion with ResNet and zero-phase component analysis
Positioning Data-Rate Trade-off in mm-Wave Small Cells and Service Differentiation for 5G Networks
ConFusion: Sensor Fusion for Complex Robotic Systems using Nonlinear Optimization
Facing Multiple Attacks in Adversarial Patrolling Games with Alarmed Targets
Modality Distillation with Multiple Stream Networks for Action Recognition
Diffeomorphic brain shape modelling using Gauss-Newton optimisation
Improving brain computer interface performance by data augmentation with conditional Deep Convolutional Generative Adversarial Networks
Nivat’s Conjecture and Pattern Complexity in Algebraic Subshifts
Online Linear Quadratic Control
End-to-End Speech Recognition From the Raw Waveform
Breaking the 6/5 threshold for sums and products modulo a prime
Enhancing Identification of Causal Effects by Pruning
Itemsets of interest for negative association rules
Distributed Optimization over Directed Graphs with Row Stochasticity and Constraint Regularity
Learning to Update for Object Tracking
Transfer Learning with Human Corneal Tissues: An Analysis of Optimal Cut-Off Layer
A New COLD Feature based Handwriting Analysis for Ethnicity/Nationality Identification
Optimizing Leader Influence in Networks through Selection of Direct Followers
A new distance-regular graph of diameter $3$ on $1024$ vertices
Cancer Metastasis Detection With Neural Conditional Random Field
A model-driven approach for a new generation of adaptive libraries
Effect of Hyper-Parameter Optimization on the Deep Learning Model Proposed for Distributed Attack Detection in Internet of Things Environment
Capacitor Based Activity Sensing for Kinetic Powered Wearable IoTs
Impact of Building-Level Motor Protection on Power System Transient Behaviors
MoE-SPNet: A Mixture-of-Experts Scene Parsing Network
Bayesian Sequential Inference in Dynamic Survival Models
Fast Mixing of Metropolis-Hastings with Unimodal Targets
Matrix valued inverse problems on graphs with application to elastodynamic networks
Response Generation by Context-aware Prototype Editing
Defective and Clustered Colouring of Sparse Graphs
EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue
Translating MFM into FOL: towards plant operation planning
Deep neural network based sparse measurement matrix for image compressed sensing
On the Metric Distortion of Embedding Persistence Diagrams into Reproducing Kernel Hilbert Spaces
Complete regular dessins and skew-morphisms of cyclic groups
Maximum average degree and relaxed coloring
On the Cauchy problem for parabolic integro-differential equations in generalized Hölder spaces
The strong chromatic index of $(3,Δ)$-bipartite graphs
Covering 2-connected 3-regular graphs with disjoint paths
Strong chromatic index of graphs with maximum degree four
VirtualHome: Simulating Household Activities via Programs
Thermodynamics of the Minimum Description Length on Community Detection
Maximally Invariant Data Perturbation as Explanation
Theoretical Analysis of Image-to-Image Translation with Adversarial Learning
Emotional Conversation Generation Orientated Syntactically Constrained Bidirectional-asynchronous Framework
Private Text Classification
Optimization over Nonnegative and Convex Polynomials With and Without Semidefinite Programming
Smoothed SVD-based Beamforming for FBMC/OQAM Systems Based on Frequency Spreading
Fast Multiple Landmark Localisation Using a Patch-based Iterative Network
Soft Sampling for Robust Object Detection
Classification of remote sensing images using attribute profiles and feature profiles from different trees: a comparative study
Repetition Estimation
GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking
A Web of Blocks
Using Mode Connectivity for Loss Landscape Analysis
Towards Gene Expression Convolutions using Gene Interaction Graphs
Bayesian monotonic errors-in-variables models with applications to pathogen susceptibility testing
On the Bias of Reed-Muller Codes over Odd Prime Fields
Comparative Analysis of Neural QA models on SQuAD
Deconvolving convolution neural network for cell detection
Proportional Choosability: A New List Analogue of Equitable Coloring
High-frequency analysis of parabolic stochastic PDEs
A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation
Overlapping Clustering Models, and One (class) SVM to Bind Them All
Reconstruction methods for networks: the case of economic and financial systems
Bayesian Prediction of Future Street Scenes through Importance Sampling based Optimization
Delegated Search Approximates Efficient Search
The domination number of plane triangulations
Paths in ordered trees
Implementation of Peridynamics utilizing HPX — the C++ standard library for parallelism and concurrency
Designing Optimal Binary Rating Systems
Cyclic triangle factors in regular tournaments
Some remarks on the bias distribution analysis of discrete-time identification algorithms based on pseudo-linear regressions
Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images
Learning to Decode 7T-like MR Image Reconstruction from 3T MR Images
The Minimax Learning Rate of Normal and Ising Undirected Graphical Models
Manifold Learning & Stacked Sparse Autoencoder for Robust Breast Cancer Classification from Histopathological Images
Learning Distributed Representations from Reviews for Collaborative Filtering
Combining Word Feature Vector Method with the Convolutional Neural Network for Slot Filling in Spoken Language Understanding
Continuous-variable quantum neural networks
The Off-Topic Memento Toolkit
Strong coupling limit of the Polaron measure and the Pekar process
Age-Minimal Transmission for Energy Harvesting Sensors with Finite Batteries: Online Policies
Beyond Local Nash Equilibria for Adversarial Networks
Coupled Fluid Density and Motion from Single Views
The graphs with all but two eigenvalues equal to $2$ or $-1$
A Hybrid Fuzzy Regression Model for Optimal Loss Reserving in Insurance
Emergent Open-Endedness from Contagion of the Fittest
On the relation between Sion’s minimax theorem and existence of Nash equilibrium in asymmetric multi-players zero-sum game with only one alien
Two Stream Self-Supervised Learning for Action Recognition
G2D: from GTA to Data
A Proof of Delta Conjecture
A Scalable Machine Learning Approach for Inferring Probabilistic US-LI-RADS Categorization
Semantic Image Retrieval by Uniting Deep Neural Networks and Cognitive Architectures
Implicit Quantile Networks for Distributional Reinforcement Learning
Maximum a Posteriori Policy Optimisation
Qualitative Measurements of Policy Discrepancy for Return-based Deep Q-Network
Deep Sequence Learning with Auxiliary Information for Traffic Prediction
Deep Learning based Estimation of Weaving Target Maneuvers
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control
A One-Sided Classification Toolkit with Applications in the Analysis of Spectroscopy Data
DeepTerramechanics: Terrain Classification and Slip Estimation for Ground Robots via Deep Learning
A Graph Transduction Game for Multi-target Tracking
Pressure Predictions of Turbine Blades with Deep Learning
Understanding Patch-Based Learning by Explaining Predictions
Task Driven Generative Modeling for Unsupervised Domain Adaptation: Application to X-ray Image Segmentation
DropBack: Continuous Pruning During Training
Multilingual Scene Character Recognition System using Sparse Auto-Encoder for Efficient Local Features Representation in Bag of Features
An optimized system to solve text-based CAPTCHA
DFNet: Semantic Segmentation on Panoramic Images with Dynamic Loss Weights and Residual Fusion Block
Auto-Meta: Automated Gradient Based Meta Learner Search
Distributional Advantage Actor-Critic
Localizing and Quantifying Damage in Social Media Images
A maximal energy pointset configuration problem

R Packages worth a look

Facilities for Simulating from ODE-Based Models (RxODE)
Facilities for running simulations from ordinary differential equation (ODE) models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: This package requires both C and Fortran compilers; for details on their use with R, see Section 6.3, Appendix A, and Appendix D in the ‘R Installation and Administration’ manual. The code is mostly released under the GPL; the VODE and LSODA solvers are in the public domain. Licensing details are available in inst/COPYRIGHTS.

Stanford ‘ATLAS’ Search Engine API (atlas)
Stanford ‘ATLAS’ (Advanced Temporal Search Engine) is a powerful tool that allows constructing cohorts of patients extremely quickly and efficiently. This package is designed to interface directly with an instance of the ‘ATLAS’ search engine and facilitates API queries and data dumps. A good knowledge of the temporal language is a prerequisite for constructing queries efficiently. More information is available at <https://…/start>.

In-place Operators for R (inplace)
Provides in-place operators for R that are equivalent to ‘+=’, ‘-=’, ‘*=’, ‘/=’ in C++. They can be applied to integer or double vectors and matrices. In-place sweep operations are also available.

Simulation Extrapolation Inverse Probability Weighted Generalized Estimating Equations (swgee)
Simulation extrapolation and inverse probability weighted generalized estimating equations method for longitudinal data with missing observations and measurement error in covariates. References: Yi, G. Y. (2008) <doi:10.1093/biostatistics/kxm054>; Cook, J. R. and Stefanski, L. A. (1994) <doi:10.1080/01621459.1994.10476871>; Little, R. J. A. and Rubin, D. B. (2002, ISBN:978-0-471-18386-0).

A User-Oriented Statistical Toolkit for Analytical Variance Estimation (gustave)
Provides a toolkit for analytical variance estimation in survey sampling. Apart from the implementation of standard variance estimators, its main feature is to help the sampling expert produce easy-to-use variance estimation ‘wrappers’, where systematic operations (linearization, domain estimation) are handled in a consistent and transparent way for the end user.

Book Memo: “Practical Text Analytics”

Maximizing the Value of Your Text Data
This book explores the process of text analytics in order to increase the accessibility of information available in unstructured text data. Unlike other books available in the text analytics field, Practical Text Analytics opens the door to business analysts and practitioners that may not have extensive coding experience or knowledge of the area. This allows readers without a programming background to take advantage of the nearly limitless information currently shrouded by text. Text analytics can help organizations derive insights for their business from text-based content like emails, documents, or social media posts. This book covers the elements involved in creating a text-mining pipeline. While analysts will not use every element in every project, each tool provides a potential segment in the final pipeline. Understanding the options is key to choosing the appropriate elements in designing and conducting text analysis.

Document worth reading: “Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review”

The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable. While the general form of the reinforcement learning problem enables effective reasoning about uncertainty, the connection between reinforcement learning and inference in probabilistic models is not immediately obvious. However, such a connection has considerable value when it comes to algorithm design: formalizing a problem as probabilistic inference in principle allows us to bring to bear a wide array of approximate inference tools, extend the model in flexible and powerful ways, and reason about compositionality and partial observability. In this article, we will discuss how a generalization of the reinforcement learning or optimal control problem, which is sometimes termed maximum entropy reinforcement learning, is equivalent to exact probabilistic inference in the case of deterministic dynamics, and variational inference in the case of stochastic dynamics. We will present a detailed derivation of this framework, overview prior work that has drawn on this and related ideas to propose new reinforcement learning and control algorithms, and describe perspectives on future research.
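
To make the maximum entropy formulation more concrete, here is a minimal sketch of the standard "soft" Bellman backup for a finite-horizon problem with deterministic dynamics. It is an illustration under stated assumptions, not the tutorial's own code: the function names, the tabular `rewards[s][a]` and `next_state[s][a]` layout, and the implicit uniform action prior are all choices made for this example.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def soft_backward_pass(rewards, next_state, horizon):
    """Soft value iteration for a deterministic, tabular, finite-horizon MDP.

    rewards[s][a]    : reward for taking action a in state s (assumed layout)
    next_state[s][a] : deterministic successor state index (assumed layout)
    Returns the soft value at the first time step and one stochastic
    policy per time step, ordered from first to last.
    """
    n_states = len(rewards)
    value = [0.0] * n_states  # soft value after the final step is zero
    policies = []
    for _ in range(horizon):
        # Soft Q-values: immediate reward plus soft value of the successor.
        q = [
            [rewards[s][a] + value[next_state[s][a]] for a in range(len(rewards[s]))]
            for s in range(n_states)
        ]
        # Soft maximum (log-sum-exp) replaces the hard max of standard control.
        value = [logsumexp(q_s) for q_s in q]
        # The policy is a softmax over Q-values: pi(a|s) = exp(Q(s,a) - V(s)).
        policies.append(
            [[math.exp(q_sa - value[s]) for q_sa in q[s]] for s in range(n_states)]
        )
    policies.reverse()  # computed backward in time, returned forward in time
    return value, policies
```

Replacing the hard maximum with a log-sum-exp is what turns the backup into a message-passing computation: higher-reward actions receive more probability mass, but every action keeps nonzero probability.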

If you did not already know

CYCLOSA
By regularly querying Web search engines, users (unconsciously) disclose large amounts of their personal data as part of their search queries, among which some might reveal sensitive information (e.g. health issues, sexual, political or religious preferences). Several solutions exist to allow users to query search engines while improving privacy protection. However, these solutions suffer from a number of limitations: some are subject to user re-identification attacks, while others lack scalability or are unable to provide accurate results. This paper presents CYCLOSA, a secure, scalable and accurate private Web search solution. CYCLOSA improves security by relying on trusted execution environments (TEEs) as provided by Intel SGX. Further, CYCLOSA proposes a novel adaptive privacy protection solution that reduces the risk of user re-identification. CYCLOSA sends fake queries to the search engine and dynamically adapts their count according to the sensitivity of the user query. In addition, CYCLOSA achieves scalability as it is fully decentralized, spreading the load for distributing fake queries among other nodes. Finally, CYCLOSA achieves accuracy of Web search as it handles the real query and the fake queries separately, in contrast to other existing solutions that mix fake and real query results. …
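
The adaptive idea described above (more fake queries for more sensitive real queries, with real and fake traffic kept apart) can be sketched roughly as follows. Everything here is hypothetical and chosen for illustration: the keyword-based `sensitivity` score, the `SENSITIVE_TERMS` list, the `max_fakes` cap, and the decoy pool are assumptions, not CYCLOSA's actual mechanism.

```python
import math
import random

# Hypothetical term list; a real system would need a far richer sensitivity model.
SENSITIVE_TERMS = {"cancer", "hiv", "religion", "election"}


def sensitivity(query):
    """Toy sensitivity score in [0, 1]: fraction of words that are sensitive terms."""
    words = query.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in SENSITIVE_TERMS)
    return hits / len(words)


def fake_query_count(score, max_fakes=7):
    """Scale the number of fake queries with the sensitivity score."""
    return math.ceil(score * max_fakes)


def protect(query, decoy_pool, max_fakes=7, rng=None):
    """Pick fake queries for one real query, keeping the two clearly separated.

    The real query and the fakes are returned under different keys so that
    their results are never mixed.
    """
    rng = rng or random.Random()
    k = min(fake_query_count(sensitivity(query), max_fakes), len(decoy_pool))
    return {"real": query, "fakes": rng.sample(decoy_pool, k)}
```

A query with no sensitive terms is sent without cover traffic in this sketch, while a fully sensitive one gets the maximum number of decoys.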

Stochastic Decorrelation Loss (SDL)
Multi-view learning aims to learn an embedding space where multiple views are either maximally correlated for cross-view recognition, or decorrelated for latent factor disentanglement. A key challenge for deep multi-view representation learning is scalability. To correlate or decorrelate multi-view signals, the covariance of the whole training set must be computed, which does not fit well with mini-batch based training. Moreover, (de)correlation should be done without SVD-based computation in order to scale to contemporary layer sizes. In this work, a unified approach is proposed for efficient and scalable deep multi-view learning. Specifically, a mini-batch based Stochastic Decorrelation Loss (SDL) is proposed which can be applied to any network layer to provide soft decorrelation of the layer’s activations. This reveals the connection between deep multi-view learning models such as Deep Canonical Correlation Analysis (DCCA) and Factorisation Autoencoder (FAE), and allows them to be easily implemented. We further show that SDL is superior to other decorrelation losses in terms of efficacy and scalability. …
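
One way to picture a mini-batch decorrelation penalty is the sketch below. It assumes an exponential moving average of the batch covariance and an L1 penalty on its off-diagonal entries; the class name, the `momentum` parameter, and these specific choices are assumptions for illustration and may differ from the paper's exact formulation. It uses NumPy and computes only the loss value, not gradients.

```python
import numpy as np


class RunningDecorrelationLoss:
    """Soft decorrelation penalty estimated from mini-batches (illustrative sketch)."""

    def __init__(self, dim, momentum=0.9):
        self.cov = np.zeros((dim, dim))  # running covariance estimate
        self.momentum = momentum

    def __call__(self, activations):
        # activations: array of shape (batch_size, dim)
        centered = activations - activations.mean(axis=0, keepdims=True)
        batch_cov = centered.T @ centered / max(len(activations) - 1, 1)
        # Blend the batch estimate into the running estimate, so the full
        # training-set covariance is never computed explicitly.
        self.cov = self.momentum * self.cov + (1 - self.momentum) * batch_cov
        # Penalize only off-diagonal entries: variances are left untouched.
        off_diagonal = self.cov - np.diag(np.diag(self.cov))
        return float(np.abs(off_diagonal).sum())
```

No eigendecomposition or SVD appears anywhere in this computation, which is the property the abstract highlights as necessary for scaling to large layers.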

Markov Chain Las Vegas (MCLV)
We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has a maximum number of Markov chain steps K (referred to as MCLV-K). We present an MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondence and differences between LVS-K and Contrastive Divergence (CD-K), with LVS-K significantly outperforming CD-K when training RBMs on the MNIST dataset, indicating MCLV to be a promising direction in learning generative models. …
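
The control flow behind a stopping set with a step cap K can be sketched as below. This is only the outer loop, shown on a toy random walk: the `run_until_stop` and `lazy_walk` names are hypothetical, and the sketch does not implement RBM Gibbs sampling or the LVS-K gradient estimator.

```python
import random


def run_until_stop(start, step, stopping_set, max_steps, rng):
    """Run a Markov chain until it enters the stopping set or K steps elapse.

    Returns (final_state, steps_taken, hit), where hit tells whether the
    chain actually reached the stopping set. The running time is random,
    which is the Las Vegas aspect; max_steps bounds it.
    """
    state = start
    for t in range(1, max_steps + 1):
        state = step(state, rng)
        if state in stopping_set:
            return state, t, True
    return state, max_steps, False


def lazy_walk(state, rng, n=10):
    """Toy transition kernel: lazy random walk on a cycle of n states."""
    return (state + rng.choice((-1, 0, 1))) % n
```

For example, `run_until_stop(5, lazy_walk, {0}, max_steps=100, rng=random.Random(1))` runs the walk from state 5 until it first reaches state 0 or exhausts 100 steps. In the paper's setting the stopping set would be built from the training data rather than fixed by hand.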