Efficient Structure Learning and Sampling of Bayesian Networks

Bayesian networks are probabilistic graphical models widely employed to understand dependencies in high-dimensional data, and even to facilitate causal discovery. Learning the underlying network structure, which is encoded as a directed acyclic graph (DAG), is highly challenging, mainly due to the vast number of possible networks. Efforts have focussed on two fronts: constraint-based methods that perform conditional independence tests to exclude edges, and score-and-search approaches that explore the DAG space with greedy or MCMC schemes. Here we synthesise these two fields in a novel hybrid method which reduces the complexity of MCMC approaches to that of a constraint-based method. Individual steps in the MCMC scheme only require simple table lookups, so that very long chains can be efficiently obtained. Furthermore, the scheme includes an iterative procedure to correct for errors from the conditional independence tests. The algorithm not only offers markedly superior performance to alternatives, but also allows DAGs to be sampled from the posterior distribution, enabling full Bayesian model averaging for much larger Bayesian networks.

PaaS Cloud: The Business Perspective

The next generation of PaaS technology delivers on the promise of object-oriented and 4GL development with less effort. PaaS is now becoming one of the core technical services for application development organizations. It offers a resourceful and agile approach to developing, operating and deploying applications in a cost-effective manner, and is turning out to be one of the preferred choices throughout the world, especially for globally distributed development environments. However, it still lacks the popularity and acceptance that Software-as-a-Service (SaaS) and Infrastructure-as-a-Service (IaaS) have attained. PaaS offers a promising future with a novel technology architecture and an evolutionary development approach. In this article, we identify the strengths, weaknesses, opportunities and threats for the PaaS industry. We then identify the various issues that will affect the different stakeholders of the PaaS industry. This research outlines a set of recommendations for PaaS practitioners to better manage this technology. For PaaS technology researchers, we also outline a number of research areas that need attention in the near future. Finally, we include an online survey to identify PaaS technology market leaders. This will give PaaS technology practitioners deeper insight into market trends and technologies.

Causal Inference on Discrete Data via Estimating Distance Correlations

In this paper, we deal with the problem of inferring causal directions when the data lie on a discrete domain. By considering the distribution of the cause P(X) and the conditional distribution mapping cause to effect P(Y|X) as independent random variables, we propose to infer the causal direction by comparing the distance correlation between P(X) and P(Y|X) with the distance correlation between P(Y) and P(X|Y). We infer ‘X causes Y’ if the dependence coefficient between P(X) and P(Y|X) is the smaller of the two. Experiments are performed to show the performance of the proposed method.
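The decision rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the joint distribution is given as a table with strictly positive marginals, treats each value x as one sample pairing the scalar P(x) with the vector P(Y|x), and uses the standard sample distance correlation. The function names `distance_correlation` and `infer_direction` are hypothetical.

```python
import math


def _centered_distances(points):
    """Pairwise Euclidean distance matrix, double-centered."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row = [sum(r) / n for r in d]  # the matrix is symmetric, so column means equal row means
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def _mean_product(a, b):
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(xs, ys):
    """Sample distance correlation between two paired lists of points."""
    a = _centered_distances(xs)
    b = _centered_distances(ys)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0:
        return 0.0  # one side is constant; treat as no dependence
    return math.sqrt(max(_mean_product(a, b), 0.0) / denom)


def infer_direction(joint):
    """joint[i][j] is proportional to P(X=i, Y=j); marginals are assumed positive."""
    total = sum(map(sum, joint))
    joint = [[v / total for v in row] for row in joint]
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    # One row per value of the conditioning variable.
    y_given_x = [[v / px[i] for v in row] for i, row in enumerate(joint)]
    x_given_y = [[v / py[j] for v in col] for j, col in enumerate(zip(*joint))]
    d_xy = distance_correlation([(p,) for p in px], y_given_x)
    d_yx = distance_correlation([(p,) for p in py], x_given_y)
    # The smaller dependence coefficient marks the hypothesised causal direction.
    direction = "X causes Y" if d_xy < d_yx else "Y causes X"
    return direction, d_xy, d_yx


if __name__ == "__main__":
    table = [[0.20, 0.05, 0.05], [0.05, 0.10, 0.25], [0.02, 0.18, 0.10]]
    print(infer_direction(table))
```

In practice the joint table would be estimated from observed counts, and ties between the two coefficients (broken here in favour of "Y causes X") would need a more careful treatment.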

Online Learning: Sufficient Statistics and the Burkholder Method

We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain ‘sufficient statistics’ for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is only required to keep the sufficient statistics in memory. This characterization is achieved by bringing the full power of the Burkholder Method — originally developed for certifying probabilistic martingale inequalities — to bear on the online learning setting. To demonstrate the scope and effectiveness of the Burkholder method, we develop a novel online strategy for matrix prediction that attains a regret bound corresponding to the variance term in matrix concentration inequalities. We also present a linear-time/space prediction strategy for parameter free supervised learning with linear classes and general smooth norms.

Dynamic Sampling Convolutional Neural Networks

We present Dynamic Sampling Convolutional Neural Networks (DSCNN), where the position-specific kernels learn not only from the current position but also from multiple sampled neighbour regions. During sampling, residual learning is introduced to ease training and an attention mechanism is applied to fuse features from different samples. The kernels are further factorized to reduce parameters. The multiple sampling strategy enlarges the effective receptive fields significantly without requiring more parameters. While DSCNNs inherit the advantages of DFN, namely avoiding feature map blurring by position-specific kernels while keeping translation invariance, they also efficiently alleviate the overfitting issue caused by having many more parameters than normal CNNs. Our model is efficient and can be trained end-to-end via standard back-propagation. We demonstrate the merits of our DSCNNs on both sparse and dense prediction tasks involving object detection and flow estimation. Our results show that DSCNNs enjoy stronger recognition abilities and achieve 81.7% on the VOC2012 detection dataset. DSCNNs also obtain much sharper responses in flow estimation on the FlyingChairs dataset compared to multiple FlowNet baselines.

Domain Adaptation with Randomized Expectation Maximization

Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space. Despite their success, state-of-the-art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed by using a very simple randomized expectation maximization (EM) method. We consider two instances of the method, which involve logistic regression and a support vector machine, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both the source and target domains. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods.

AllenNLP: A Deep Semantic Natural Language Processing Platform

This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily. It is built on top of PyTorch, allowing for dynamic computation graphs, and provides (1) a flexible data API that handles intelligent batching and padding, (2) high-level abstractions for common operations in working with text, and (3) a modular and extensible experiment framework that makes doing good science easy. It also includes reference implementations of high quality approaches for both core semantic problems (e.g. semantic role labeling (Palmer et al., 2005)) and language understanding applications (e.g. machine comprehension (Rajpurkar et al., 2016)). AllenNLP is an ongoing open-source effort maintained by engineers and researchers at the Allen Institute for Artificial Intelligence.

Inference in Probabilistic Graphical Models by Graph Neural Networks

A useful computation when acting in a complex environment is to infer the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops. Here we use Graph Neural Networks (GNNs) to learn a message-passing algorithm that solves these inference tasks. We first show that the architecture of GNNs is well-matched to inference tasks. We then demonstrate the efficacy of this inference approach by training GNNs on an ensemble of graphical models and showing that they substantially outperform belief propagation on loopy graphs. Our message-passing algorithms generalize out of the training set to larger graphs and graphs with different structure.

On-demand Relational Concept Analysis

Formal Concept Analysis and its associated conceptual structures have been used to support exploratory search through conceptual navigation. Relational Concept Analysis (RCA) is an extension of Formal Concept Analysis to process relational datasets. RCA and its multiple interconnected structures are good candidates to support exploratory search in relational datasets, as they enable navigation within a structure as well as between the connected structures. However, building the entire structures is not an efficient way to explore a small localised area of the dataset, for instance to retrieve the closest alternatives to a given query. In these cases, generating only a concept and its neighbour concepts at each navigation step appears to be a less costly alternative. In this paper, we propose an algorithm to compute a concept and its neighbourhood in extended concept lattices. The concepts are generated directly from the relational context family, and possess both formal and relational attributes. The algorithm takes into account two RCA scaling operators. We illustrate it on an example.

Scalable Generalized Dynamic Topic Models

Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular topic models, and also limit scalability. In this paper, we present several new results around DTMs. First, we extend the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs). This allows us to explore topics that develop smoothly over time, that have a long-term memory, or that are temporally concentrated (for event detection). Second, we show how to perform scalable approximate inference in these models based on ideas around stochastic variational inference and sparse Gaussian processes. This way we can train a rich family of DTMs on massive data. Our experiments on several large-scale datasets show that our generalized model allows us to find interesting patterns that were not accessible by previous approaches.

Wasserstein Distance, Fourier Series and Applications

We study the Wasserstein metric $W_p$, a notion of distance between two probability distributions, from the perspective of Fourier Analysis and discuss applications. In particular, we bound the Earth Mover Distance $W_1$ between the distribution of quadratic residues in a finite field $\mathbb{F}_p$ and the uniform distribution by $\lesssim p^{-1/2}$ (the Polya-Vinogradov inequality implies $\lesssim p^{-1/2} \log{p}$). We also show that, for continuous $f:\mathbb{T} \rightarrow \mathbb{R}$ with mean value 0, $(\mbox{number of roots of}~f) \cdot \left( \sum_{k=1}^{\infty}{ \frac{ |\widehat{f}(k)|^2}{k^2}}\right)^{\frac{1}{2}} \gtrsim \frac{\|f\|^{2}_{L^1(\mathbb{T})}}{\|f\|_{L^{\infty}(\mathbb{T})}}$. Moreover, we show that for a Laplacian eigenfunction $-\Delta_g \phi_{\lambda} = \lambda \phi_{\lambda}$ on a compact Riemannian manifold, $W_p\left(\max\left\{\phi_{\lambda}, 0\right\}dx, \max\left\{-\phi_{\lambda}, 0\right\} dx\right) \lesssim_p \sqrt{\log{\lambda}/\lambda} \|\phi_{\lambda}\|_{L^1}^{1/p}$, which is at most a factor $\sqrt{\log{\lambda}}$ away from sharp. Several other problems are discussed.
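To make the quadratic-residue statement concrete, the following sketch computes one instance of such a distance numerically. It rests on assumptions that the abstract does not confirm: the nonzero residues k are placed at the points k/p on the unit interval (not the torus), the reference measure is uniform on {1/p, ..., (p-1)/p}, and W_1 is evaluated with the one-dimensional formula that integrates the absolute difference of the two cumulative distribution functions. The function names are hypothetical, and the normalisation used in the paper may differ.

```python
def quadratic_residues(p):
    """Nonzero quadratic residues modulo an odd prime p, in increasing order."""
    return sorted({(x * x) % p for x in range(1, p)})


def w1_residues_vs_uniform(p):
    """W_1 on the line between the empirical measure of the residues (at k/p)
    and the uniform measure on {1/p, ..., (p-1)/p}."""
    residues = set(quadratic_residues(p))
    mass_res = 1.0 / len(residues)  # equal mass on each residue
    mass_uni = 1.0 / (p - 1)        # equal mass on each nonzero element
    cdf_res = 0.0
    cdf_uni = 0.0
    total = 0.0
    for k in range(1, p):
        if k in residues:
            cdf_res += mass_res
        cdf_uni += mass_uni
        # |F - G| is constant on [k/p, (k+1)/p), an interval of length 1/p.
        total += abs(cdf_res - cdf_uni) / p
    return total


if __name__ == "__main__":
    for p in (101, 1009, 10007):
        print(p, w1_residues_vs_uniform(p))
```

Running the loop for increasing primes gives a rough feel for how the distance shrinks with p under these assumptions; it is not a check of the stated bound.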

Indirect Influences, Links Ranking, and Deconstruction of Networks
HINT: A Toolbox for Hierarchical Modeling of Neuroimaging Data
A Push-Pull Gradient Method for Distributed Optimization in Networks
A probabilistic variant of Sperner’s theorem and of maximal $r$-cover free families
Thermal to Visible Synthesis of Face Images using Multiple Regions
UnibucKernel: A kernel-based learning method for complex word identification
A Survey of Deep Learning Techniques for Mobile Robot Applications
Generative Multi-Agent Behavioral Cloning
Fog Massive MIMO: A User-Centric Seamless Hot-Spot Architecture
IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning
Long term availability of raw experimental data in experimental fracture mechanics
Efficient treatment of bilinear forms in global optimization
Learning Robotic Assembly from CAD
An analysis of the Greedy Algorithm for Stochastic Set Cover
On the Dynamics of Distributed Energy Adoption: Equilibrium, Stability, and Limiting Capacity
V-Splines and Bayes Estimate
The bunkbed conjecture on the complete graph
Quantifying resilience to recurrent ecosystem disturbances using flow-kick dynamics
Distributed Model Predictive Control for Linear Systems with Adaptive Terminal Sets
On Multi-Server Coded Caching in the Low Memory Regime
Graph-based regularization for regression problems with highly-correlated designs
Decentralized decision making for networks of uncertain systems
Efficient Recurrent Neural Networks using Structured Matrices in FPGAs
Edgeworth expansions for weakly dependent random variables
Product Characterisation towards Personalisation: Learning Attributes from Unstructured Data to Recommend Fashion Products
Pathwise approximation of Feynman path integrals using simple random walks
A Feature-Driven Active Framework for Ultrasound-Based Brain Shift Compensation
On the Complexity of Testing Attainment of the Optimal Value in Nonlinear Optimization
Point process models for quasi-periodic volcanic earthquakes
Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System
Balanced Black and White Coloring Problem on knights chessboards
Defective and Clustered Graph Colouring
Robust Depth Estimation from Auto Bracketed Images
Weakly Supervised Medical Diagnosis and Localization from Multiple Resolutions
Markov Chains with Maximum Return Time Entropy for Robotic Surveillance
Data-Driven Computational Methods: Parameter and Operator Estimations (Chapter 1)
SurvBoost: An R Package for High-Dimensional Variable Selection in the Stratified Proportional Hazards Model via Gradient Boosting
Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network
InfyNLP at SMM4H Task 2: Stacked Ensemble of Shallow Convolutional Neural Networks for Identifying Personal Medication Intake from Twitter
Modeling Camera Effects to Improve Deep Vision for Real and Synthetic Data
A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems
Attention on Attention: Architectures for Visual Question Answering (VQA)
Semidefinite Outer Approximation of the Backward Reachable Set of Discrete-time Autonomous Polynomial Systems
Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval
A family of Bell transformations
Unsupervised Representation Learning by Predicting Image Rotations
Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation
Adaptive Sequential MCMC for Combined State and Parameter Estimation
PyramidBox: A Context-assisted Single Shot Face Detector
Speech Emotion Recognition Considering Local Dynamic Features
Assessing Shape Bias Property of Convolutional Neural Networks
A Distributed Stochastic Gradient Tracking Method
Fast Semantic Segmentation on Video Using Motion Vector-Based Feature Interpolation
A Supplementary Condition for the Convergence of the Control Policy during Adaptive Dynamic Programming
Passivity and Evolutionary Game Dynamics
Activation cross-section data for alpha-particle induced nuclear reactions on natural ytterbium for some longer lived radioisotopes
Data-Driven Sparse System Identification
From Gauss to Kolmogorov: Localized Measures of Complexity for Ellipses
Statistical Properties and Variations of LOS MIMO Channels at Millimeter Wave Frequencies
Emergence of grid-like representations by training recurrent neural networks to perform spatial localization
$ρ$-hot Lexicon Embedding-based Two-level LSTM for Sentiment Analysis
Cross-Layer Energy Efficient Resource Allocation in PD-NOMA based H-CRANs: Implementation via GPU
Learning and Recognizing Human Action from Skeleton Movement with Deep Residual Neural Networks
Exploiting deep residual networks for human action recognition from skeletal data
High-dimensional covariance matrices in elliptical distributions with application to spherical test
Domain Adaptation for Ear Recognition Using Deep Convolutional Neural Networks
Covert Wireless Communications with Channel Inversion Power Control in Rayleigh Fading
Hiding higher order cross-correlations of multivariate data using Archimedean copulas
Modeling the distribution of the arrival angle based on transmitter antenna pattern
Patch-based Fake Fingerprint Detection Using a Fully Convolutional Neural Network with a Small Number of Parameters and an Optimal Threshold
Phase Retrieval via Sensor Network Localization
Some Theoretical Properties of GANs
Multi-view Metric Learning in Vector-valued Kernel Spaces
A Quantum-Secure Niederreiter Cryptosystem using Quasi-Cyclic Codes
Expeditious Generation of Knowledge Graph Embeddings
End-to-End Fingerprints Liveness Detection using Convolutional Networks with Gram module
Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network
Convergence rates for distributed stochastic optimization over random networks
Optimal Dynamic Contract for Spectrum Reservation in Mission-Critical UNB-IoT Systems
Distributed Zeroth Order Optimization Over Random Networks: A Kiefer-Wolfowitz Stochastic Approximation Approach
Non-normality, reactivity, and intrinsic stochasticity in neural dynamics: a non-equilibrium potential approach
Downlink Non-Orthogonal Multiple Access (NOMA) in Poisson Networks
Reservoir computing approaches for representation and classification of multivariate time series
A differential game with exit costs
Joint Power and Trajectory Design for Physical-Layer Secrecy in the UAV-Aided Mobile Relaying System
A Case Study for Grain Quality Assurance Tracking based on a Blockchain Business Network
An Unsupervised Multivariate Time Series Kernel Approach for Identifying Patients with Surgical Site Infection from Blood Samples
Parabolic equations with rough coefficients and singular forcing
On the necessary and sufficient conditions to solve a heat equation with general additive Gaussian noise
On Enumeration of Dyck Paths with colored hills
Multiple Models for Recommending Temporal Aspects of Entities
On scaling limits of planar maps with stable face-degrees
HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification
Modelling the Influence of Cultural Information on Vision-Based Human Home Activity Recognition
The Augustin Capacity and Center
Crowd-Machine Collaboration for Item Screening
End-to-End Video Captioning with Multitask Reinforcement Learning
Resilient Monotone Sequential Maximization
A Cascaded Convolutional Neural Network for Single Image Dehazing
Stability and optimality of multi-scale transportation networks with distributed dynamic tolls
Configurational stability for the Kuramoto-Sakaguchi model
A conjecture on Gallai-Ramsey numbers of even cycles and paths
Stochastic Learning under Random Reshuffling
Consistent Adaptive Multiple Importance Sampling and Controlled Diffusions
Non-rigid 3D Shape Registration using an Adaptive Template
A Survey on Application of Machine Learning Techniques in Optical Networks
Information Theoretic Interpretation of Deep learning
Online data assimilation in distributionally robust optimization
BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking
Quantification of Lung Abnormalities in Cystic Fibrosis using Deep Networks
Age of Information in a Network of Preemptive Servers
Adversarial Defense based on Structure-to-Signal Autoencoders
Boosting Random Forests to Reduce Bias; One-Step Boosted Forest and its Variance Estimate
Video Object Segmentation with Language Referring Expressions
Social Media Would Not Lie: Prediction of the 2016 Taiwan Election via Online Heterogeneous Data
Monocular Depth Estimation by Learning from Heterogeneous Datasets
Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap
Stacked Cross Attention for Image-Text Matching
Efficient Bandwidth Estimation in Two-dimensional Filtered Backprojection PET Reconstruction
Primal-Dual Algorithm for Distributed Reinforcement Learning: Distributed GTD2
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
Similar Elements and Metric Labeling on Complete Graphs
On Non-localization of Eigenvectors of High Girth Graphs