
What's new on arXiv

A Time-Power Series Based Semi-Analytical Approach for Power System Simulation

Time-domain simulation is the basis of dynamic security assessment for power systems. Traditionally, simulation software uses numerical integration methods to solve the nonlinear power system differential-algebraic equations (DAEs) for a given contingency under a specific operating condition. An alternative approach, promising for online simulation, is to derive a semi-analytical solution (SAS) offline and then evaluate the SAS online over consecutive time windows, for the operating condition and contingency of interest, until the simulation result over the desired period is obtained. This paper proposes a general semi-analytical approach that derives and evaluates an SAS in the form of a power series in time to approximate the solutions of power system differential equations. An upper bound on the error rate of the SAS is also proposed to guarantee the reliable use of adaptive time windows when evaluating the SAS. A dynamic bus method is proposed to extend the semi-analytical approach to general power system DAEs by efficiently linking the SASs for dynamic components through the numerical solution of the network algebraic equations. Case studies performed on the New England 39-bus system and the Polish 2383-bus system test the performance of the proposed semi-analytical approach and compare it with existing methods. The results show that the SAS-based approach has potential for online simulation.
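The window-by-window evaluation of a power-series SAS can be sketched on a scalar toy problem. The following is an illustrative sketch only, not the paper's method: it derives series coefficients for x' = -x^2 by matching powers of t, then restarts the truncated series at the end of each (here fixed-length, rather than adaptive) time window.

```python
# Illustrative sketch only (not the paper's implementation): a power-series
# SAS for the scalar ODE x' = -x^2, evaluated over consecutive time windows.
def sas_coefficients(x0, order=10):
    """Coefficients a_k of x(t) = sum_k a_k t^k, found by matching powers
    of t in x' = -x^2: (k+1) a_{k+1} = -sum_{i=0}^{k} a_i a_{k-i}."""
    a = [x0]
    for k in range(order):
        conv = sum(a[i] * a[k - i] for i in range(k + 1))
        a.append(-conv / (k + 1))
    return a

def evaluate_sas(a, t):
    """Evaluate the truncated power series at time t."""
    return sum(c * t**k for k, c in enumerate(a))

def simulate(x0, t_end, window=0.1, order=10):
    """Restart the truncated series at the end of each time window
    (fixed-length here; the paper uses adaptive windows with an error bound)."""
    t, x = 0.0, x0
    while t < t_end - 1e-12:
        h = min(window, t_end - t)
        x = evaluate_sas(sas_coefficients(x, order), h)
        t += h
    return x
```

For x0 = 1 the exact solution is x(t) = 1/(1 + t), so simulate(1.0, 1.0) should return approximately 0.5.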


Unsupervised Representation Adversarial Learning Network: from Reconstruction to Generation

A good representation for arbitrarily complicated data should have the capability of semantic generation, clustering, and reconstruction. Previous research has achieved impressive performance on individual tasks among these, but rarely on all of them. This paper aims at learning a disentangled representation effective for all three in an unsupervised way. To achieve the three tasks together, we learn the forward and inverse mappings between data and representation on the basis of a symmetric adversarial process. In theory, we minimize the upper bound of the two conditional entropy losses between the latent variables and the observations together to achieve cycle consistency. The newly proposed RepGAN is tested on the MNIST, fashionMNIST, CelebA, and SVHN datasets on unsupervised or semi-supervised classification, generation and reconstruction tasks. The results demonstrate that RepGAN is able to learn a useful and competitive representation. To the authors' knowledge, our work is the first to achieve both high unsupervised classification accuracy and low reconstruction error on MNIST.


The Role-Relevance Model for Enhanced Semantic Targeting in Unstructured Text

Personalized search is a potentially powerful tool; however, it is limited by the large number of roles that a person has: parent, employee, consumer, etc. We present the role-relevance algorithm, a search technique that favors search results relevant to the user's current role. The role-relevance algorithm uses three factors to score documents: (1) the number of query keywords each document contains; (2) each document's geographic relevance to the user's role (if applicable); and (3) each document's topical relevance to the user's role (if applicable). Topical relevance is assessed using a novel extension to Latent Dirichlet Allocation (LDA) that allows standard LDA to score document relevance to user-defined topics. Overall results on a pre-labeled corpus show an average improvement in search precision of approximately 20% over keyword search alone.
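A minimal sketch of how the three factors might be combined into a single document score; the weights, field names, and linear combination below are illustrative assumptions, not the paper's actual algorithm.

```python
# Hedged sketch: the weights, field names, and linear combination are
# illustrative assumptions, not the paper's scoring function.
def role_relevance_score(doc, query_keywords, role,
                         w_kw=1.0, w_geo=0.5, w_topic=0.5):
    """Combine the three factors: (1) keyword count, (2) geographic
    relevance to the role, (3) topical relevance to the role."""
    kw = sum(doc["text"].lower().count(k.lower()) for k in query_keywords)
    geo = 1.0 if role.get("region") in doc.get("regions", []) else 0.0
    topic = doc.get("topic_scores", {}).get(role.get("topic"), 0.0)
    return w_kw * kw + w_geo * geo + w_topic * topic
```

A "parent" role searching for "lunch" would then rank a school-lunch page above an equally keyword-rich but role-irrelevant page.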


Review of methods for assessing the causal effect of binary interventions from aggregate time-series observational data

Researchers are often interested in assessing the impact of an intervention on an outcome of interest in situations where the intervention is non-randomised, information is available at an aggregate level, the intervention is applied to only one or a few units, the intervention is binary, and there are outcome measurements at multiple time points. In this paper, we review existing methods for causal inference in the setup just outlined. We detail the assumptions underlying each method, emphasise connections between the different approaches and provide guidelines regarding their practical implementation. Several open problems are identified, thus highlighting the need for future research.


GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems.


ADef: an Iterative Algorithm to Construct Adversarial Deformations

While deep neural networks have proven to be a powerful tool for many recognition and classification tasks, their stability properties are still not well understood. In the past, image classifiers have been shown to be vulnerable to so-called adversarial attacks, which are created by additively perturbing the correctly classified image. In this paper, we propose the ADef algorithm to construct a different kind of adversarial attack created by iteratively applying small deformations to the image, found through a gradient descent step. We demonstrate our results on MNIST with a convolutional neural network and on ImageNet with Inception-v3 and ResNet-101.


Generating Music using an LSTM Network

A model of music needs the ability to recall past details and a clear, coherent understanding of musical structure. This paper details a neural network architecture that predicts and generates polyphonic music consistent with musical rules. The probabilistic model presented is a bi-axial LSTM trained with a kernel reminiscent of a convolutional kernel. Analyzed quantitatively and qualitatively, the approach performs well in composing polyphonic music. A link to the code is provided.


Juniper: An Open-Source Nonlinear Branch-and-Bound Solver in Julia

Nonconvex mixed-integer nonlinear programs (MINLPs) represent a challenging class of optimization problems that often arise in engineering and scientific applications. Because of nonconvexities, these programs are typically solved with global optimization algorithms, which have limited scalability. However, nonlinear branch-and-bound has recently been shown to be an effective heuristic for quickly finding high-quality solutions to large-scale nonconvex MINLPs, such as those arising in infrastructure network optimization. This work proposes Juniper, a Julia-based open-source solver for nonlinear branch-and-bound. Leveraging the high-level Julia programming language makes it easy to modify Juniper’s algorithm and explore extensions, such as branching heuristics, feasibility pumps, and parallelization. Detailed numerical experiments demonstrate that the initial release of Juniper is comparable with other nonlinear branch-and-bound solvers, such as Bonmin, Minotaur, and Knitro, illustrating that Juniper provides a strong foundation for further exploration in utilizing nonlinear branch-and-bound algorithms as heuristics for nonconvex MINLPs.


Sentence Simplification with Memory-Augmented Neural Networks

Sentence simplification aims to simplify the content and structure of complex sentences, and thus make them easier to interpret for human readers, and easier to process for downstream NLP applications. Recent advances in neural machine translation have paved the way for novel approaches to the task. In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification. Our experiments demonstrate the effectiveness of our approach on different simplification datasets, both in terms of automatic evaluation measures and human judgments.


A Complementary Tracking Model with Multiple Features

Discriminative Correlation Filter (DCF)-based tracking algorithms exploiting conventional handcrafted features have achieved impressive results in terms of both accuracy and robustness. Template handcrafted features show excellent performance, but they perform poorly when the target's appearance changes rapidly, as under fast motion and fast deformation. In contrast, statistical handcrafted features are insensitive to fast state changes, but they yield inferior performance under illumination variation and background clutter. In this work, to achieve efficient tracking performance, we propose a novel visual tracking algorithm, named MFCMT, based on a complementary ensemble model with multiple features, including Histograms of Oriented Gradients (HOG), Color Names (CN) and Color Histograms (CH). Additionally, to improve tracking results and prevent target drift, we introduce an effective fusion method that exploits relative entropy to coalesce all basic response maps into an optimal response. Furthermore, we suggest a simple but efficient update strategy to boost tracking performance. Comprehensive evaluations conducted on two tracking benchmarks demonstrate that our method is competitive with numerous state-of-the-art trackers, achieving impressive performance at faster speed.
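The relative-entropy fusion idea can be illustrated roughly as follows: weight each basic response map by the inverse of its KL divergence to the average map, so maps that agree with the consensus contribute more. This is a hedged sketch of the general idea, not the MFCMT formulation.

```python
import numpy as np

# Hedged sketch of the idea, not the MFCMT formulation: weight each basic
# response map by the inverse of its relative entropy (KL divergence) to
# the average map, so maps agreeing with the consensus contribute more.
def fuse_response_maps(maps, eps=1e-12):
    probs = []
    for m in maps:
        p = m - m.min() + eps          # shift to strictly positive values
        probs.append(p / p.sum())      # normalize into a distribution
    mean = sum(probs) / len(probs)
    weights = np.array([1.0 / (float(np.sum(p * np.log(p / mean))) + eps)
                        for p in probs])
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, maps))
```

The fused map keeps its peak where the individual responses agree, which is the behaviour one wants before taking the argmax as the target location.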


Streaming Active Learning Strategies for Real-Life Credit Card Fraud Detection: Assessment and Visualization

Credit card fraud detection is a very challenging problem because of the specific nature of transaction data and of the labeling process. Transaction data are peculiar because they arrive in a streaming fashion, are strongly imbalanced, and are prone to non-stationarity. The labels are the outcome of an active learning process: every day, human investigators contact only a small number of cardholders (those associated with the riskiest transactions) and obtain the class (fraud or genuine) of the related transactions. An adequate selection of the set of cardholders to contact is therefore crucial for an efficient fraud detection process. In this paper, we present a number of active learning strategies and investigate their fraud detection accuracy. We compare different criteria (supervised, semi-supervised and unsupervised) for querying unlabeled transactions. Finally, we highlight the existence of an exploitation/exploration trade-off for active learning in the context of fraud detection, which has so far been overlooked in the literature.
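One of the simplest supervised querying criteria in this family is uncertainty sampling: contact the cardholders whose transactions the current model is least sure about. A minimal sketch (the interface is an assumption):

```python
# A minimal version of one supervised querying criterion in this line of
# work -- uncertainty sampling; the interface is an assumption.
def select_queries(fraud_probs, budget):
    """Return indices of the `budget` unlabeled transactions whose
    predicted fraud probability is closest to 0.5, i.e. those the
    current model is least certain about."""
    ranked = sorted(range(len(fraud_probs)),
                    key=lambda i: abs(fraud_probs[i] - 0.5))
    return ranked[:budget]
```

An exploitation-style alternative would rank by the probability itself, querying the riskiest transactions; the tension between the two is exactly the exploitation/exploration trade-off the paper highlights.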


The FactChecker: Verifying Text Summaries of Relational Data Sets

We present a novel natural language query interface, the FactChecker, aimed at text summaries of relational data sets. The tool focuses on natural language claims that translate into an SQL query and a claimed query result. Similar in spirit to a spell checker, the FactChecker marks up text passages that seem to be inconsistent with the actual data. At the heart of the system is a probabilistic model that reasons about the input document in a holistic fashion. Based on claim keywords and the document structure, it maps each text claim to a probability distribution over associated query translations. By efficiently executing tens to hundreds of thousands of candidate translations for a typical input document, the system maps text claims to correctness probabilities. This process becomes practical via a specialized processing backend, avoiding redundant work via query merging and result caching. Verification is an interactive process in which users are shown tentative results, enabling them to take corrective actions if necessary. Our system was tested on a set of 53 public articles containing 392 claims. Our test cases include articles from major newspapers, summaries of survey results, and Wikipedia articles. Our tool revealed erroneous claims in roughly a third of test cases. A detailed user study shows that users are on average six times faster at checking text summaries with our tool than with generic SQL interfaces. In fully automated verification, our tool achieves significantly higher recall and precision than baselines from the areas of natural language query interfaces and fact checking.
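The core check, comparing a claimed number against the result of its candidate query translation, can be illustrated with a toy table (the schema and claim below are invented for the example):

```python
import sqlite3

# Toy illustration of the core check -- compare a claimed value with the
# result of its candidate SQL translation. Table and claim are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 120.0), ("north", 80.0), ("south", 50.0)])

def check_claim(sql, claimed_value, tol=1e-9):
    """True if the query result matches the number stated in the text."""
    (actual,) = conn.execute(sql).fetchone()
    return abs(actual - claimed_value) <= tol

# Claim in the summary: "Total sales in the north region were 200."
north_total_ok = check_claim(
    "SELECT SUM(amount) FROM sales WHERE region = 'north'", 200.0)
```

The FactChecker's contribution is doing this at scale over a distribution of candidate translations per claim, with merging and caching to keep it practical.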


A new regression model for positive data

In this paper, we propose a regression model where the response variable is beta prime distributed using a new parameterization of this distribution that is indexed by mean and precision parameters. The proposed regression model is useful for situations where the variable of interest is continuous and restricted to the positive real line and is related to other variables through the mean and precision parameters. The variance function of the proposed model has a quadratic form. In addition, the beta prime model has properties that its competitor distributions of the exponential family do not have. Estimation is performed by maximum likelihood. Furthermore, we discuss residuals and influence diagnostic tools. Finally, we also carry out an application to real data that demonstrates the usefulness of the proposed model.
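The mean-precision indexing can be made concrete. Assuming the usual reparameterization of the standard beta prime law with shapes alpha = mu(1 + phi) and beta = 2 + phi (this specific mapping is our assumption, not quoted from the paper), the mean is mu and the variance is mu(1 + mu)/phi, which is the quadratic variance function mentioned above:

```python
# Hedged sketch: one standard mean-precision parameterization of the beta
# prime distribution; the exact mapping used by the paper may differ.
def beta_prime_params(mu, phi):
    """Map mean mu > 0 and precision phi > 0 to the shapes (alpha, beta)
    of the standard beta prime distribution."""
    return mu * (1.0 + phi), 2.0 + phi

def beta_prime_moments(alpha, beta):
    """Mean and variance of a standard beta prime law (requires beta > 2):
    mean = alpha/(beta-1), var = alpha(alpha+beta-1)/((beta-2)(beta-1)^2)."""
    mean = alpha / (beta - 1.0)
    var = alpha * (alpha + beta - 1.0) / ((beta - 2.0) * (beta - 1.0) ** 2)
    return mean, var
```

With mu = 2 and phi = 5 this gives mean 2 and variance 2·3/5 = 1.2, matching mu(1 + mu)/phi.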


Distributed, Private, and Derandomized Allocation Algorithm for EV Charging
One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach
Analyzing Solar Irradiance Variation From GPS and Cameras
On a monotone scheme for nonconvex nonsmooth optimization with applications to fracture mechanics
n-Dimensional Optical Orthogonal Codes, Bounds and Optimal Constructions
Symmetry breaking and localization in a random Schwinger model with commensuration
Squared Bessel processes of positive and negative dimension embedded in Brownian local times
Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems
Assessing Language Proficiency from Eye Movements in Reading
Stylistic Variation in Social Media Part-of-Speech Tagging
A Rank-Preserving Generalized Matrix Inverse for Consistency with Respect to Similarity
Effects of sampling skewness of the importance-weighted risk estimator on model selection
Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events
Randomized ICA and LDA Dimensionality Reduction Methods for Hyperspectral Image Classification
Meromorphic continuation of the mean signature of fractional Brownian motion
Invitación al estudio estadístico del lenguaje (An Invitation to the Statistical Study of Language)
Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families
The Aftermath of Disbanding an Online Hateful Community
Challenges and pitfalls of partitioning blockchains
Survey of Face Detection on Low-quality Images
Stanley-Reisner rings for symmetric simplicial complexes, G-semimatroids and Abelian arrangements
Connectivity of Ad Hoc Wireless Networks with Node Faults
Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression
A genome-wide design and an empirical partially Bayes approach to increase the power of Mendelian randomization, with application to the effect of blood lipids on cardiovascular disease
Accelerated Affine Scaling Algorithms for Linear Programming Problems
A Predictive Model for Notional Anaphora in English
Vehicle Security: Risk Assessment in Transportation
A Suboptimality Approach to Distributed $\mathcal{H}_2$ Optimal Control
The minimum size of a linear set
Identity Aging: Efficient Blockchain Consensus
QoS Provisioning in Large Wireless Networks
Dynamic Power Splitting for SWIPT with Nonlinear Energy Harvesting in Ergodic Fading Channel
Video based Contextual Question Answering
Preference-Guided Planning: An Active Elicitation Approach
GritNet: Student Performance Prediction with Deep Learning
Tipping Points for Norm Change in Human Cultures
Identification of Switched ARX Systems from Large Noisy Data Sets
Variable Selection via Adaptive False Negative Control in High-Dimensional Regression
An Ensemble Generation Method Based on Instance Hardness
High Dynamic Range SLAM with Map-Aware Exposure Time Control
Design of Ad Hoc Wireless Mesh Networks Formed by Unmanned Aerial Vehicles with Advanced Mechanical Automation
Volterra Kernel Identification using Regularized Orthonormal Basis Functions
Empirical-likelihood-based criteria for model selection on marginal analysis of longitudinal data with dropout missingness
Finding Cliques in Social Networks: A New Distribution-Free Model
Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction
Calibration-free B0 correction of EPI data using structured low rank matrix recovery
Vision Meets Drones: A Challenge
DFT-Based Hybrid Beamforming Multiuser Systems: Rate Analysis and Beam Selection
Orbital Angular Momentum for Wireless Communications
Model reduction for Kuramoto models with complex topologies
Personal vs. Know-How Contacts: Which Matter More in Wiki Elections?
Bayesian Auctions with Efficient Queries
View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition
Generating a Fusion Image: One’s Identity and Another’s Shape
Light Spanners for High Dimensional Norms via Stochastic Decompositions
Online Vertex-Weighted Bipartite Matching: Beating 1-1/e with Random Arrivals
Growth Mechanism and Origin of High $sp^3$ Content in Tetrahedral Amorphous Carbon
Delegating via Quitting Games
Stochastic Linear Quadratic Stackelberg Differential Game with Overlapping Information
Climb on the Bandwagon: Consensus and periodicity in a lifetime utility model with strategic interactions
Accurate Deep Direct Geo-Localization from Ground Imagery and Phone-Grade GPS
Discrete Total Variation with Finite Elements and Applications to Imaging
An efficient particle-based method for maximum likelihood estimation in nonlinear state-space models
MIMO Channel Hardening: A Physical Model based Analysis
Graph-based Hypothesis Generation for Parallax-tolerant Image Stitching
Residual-Guide Feature Fusion Network for Single Image Deraining
Parallel Quicksort without Pairwise Element Exchange
Planar Steiner Orientation is NP-complete
Phase diagram of dipolar-coupled XY moments on disordered square lattices
A general inequality for packings of boxes
Consensusability of Multi-agent Systems with Delay and Packet Dropout Under Predictor-like Protocols
Analyzing astronomical data with Apache Spark
Stabilizing predictive control with persistence of excitation for constrained linear systems
Extending the Best Linear Approximation Framework to the Process Noise Case
An Approximate Shading Model with Detail Decomposition for Object Relighting
Curing Braess’ Paradox by Secondary Control in Power Grids
Benchmarking Top-K Keyword and Top-K Document Processing with T${}^2$K${}^2$ and T${}^2$K${}^2$D${}^2$
Bias-variance tradeoff in MIMO channel estimation
In defence of the simple: Euclidean distance for comparing complex networks
A Bayesian Framework for Assessing the Strength Distribution of Composite Structures with Random Defects
Specialty-Aware Task Assignment in Spatial Crowdsourcing
Reliable Low Latency Wireless Communication Enabling Industrial Mobile Control and Safety Applications
Modelling the Time-dependent VRP through Open Data
Affine processes beyond stochastic continuity
Bound entangled states fit for robust experimental verification
Moments and convex optimization for analysis and control of nonlinear partial differential equations
On the Post Selection Inference constant under Restricted Isometry Properties
Mpemba effect in spin glasses: a persistent memory effect
The Power of Machine Learning and Market Design for Cloud Computing Admission Control
MobileFaceNets: Efficient CNNs for Accurate Real-time Face Verification on Mobile Devices
Optimal Sorting with Persistent Comparison Errors
Robust and scalable learning of data manifolds with complex topologies via ElPiGraph
Automatic Stance Detection Using End-to-End Memory Networks
Approaches for Enriching and Improving Textual Knowledge Bases
ClaimRank: Detecting Check-Worthy Claims in Arabic and English
OpenFPM: A scalable open framework for particle and particle-mesh codes on parallel computers
Conditional Maximum Lq-Likelihood Estimation for Regression Model with Autoregressive Error Terms
A Rigorous Analysis of Least Squares Sine Fitting Using Quantized Data: the Random Phase Case
Revisiting Small Batch Training for Deep Neural Networks
A Simple Quantum Neural Net with a Periodic Activation Function
On the Effects of Subpacketization in Content-Centric Mobile Networks
Assimilation of semi-qualitative observations with a stochastic Ensemble Kalman Filter
Evolution of a Functionally Diverse Swarm via a Novel Decentralised Quality-Diversity Algorithm
Acquisition of Phrase Correspondences using Natural Deduction Proofs
Super-resolution Ultrasound Localization Microscopy through Deep Learning
An Investigation of Environmental Influence on the Benefits of Adaptation Mechanisms in Evolutionary Swarm Robotics
Rethinking the Faster R-CNN Architecture for Temporal Action Localization
Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting
Unsupervised learning of the brain connectivity dynamic using residual D-net
Turán’s Theorem for the Fano plane
Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning
CUDA Support in GNA Data Analysis Framework
Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment
Central limit theorems from the roots of probability generating functions
On the Location of the Minimizer of the Sum of Strongly Convex Functions
Practical Issues in the Synthesis of Ternary Sequences
Lightweight Adaptive Mixture of Neural and N-gram Language Models
Generating syntactically varied realisations from AMR graphs
Infinite geodesics in hyperbolic random triangulations
Design of High-Order Decoupled Multirate GARK Schemes
Topology-driven Diversity for Targeted Influence Maximization with Application to User Engagement in Social Networks
Image Inpainting for Irregular Holes Using Partial Convolutions
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
Topological data analysis of continuum percolation with disks
Fick law and sticky Brownian motions
Synthesizing Images of Humans in Unseen Poses
The Identifiable Elicitation Complexity of the Mode is Infinite
Correlated Random Matrices: Band Rigidity and Edge Universality
Improving Supervised Bilingual Mapping of Word Embeddings
Cut to Fit: Tailoring the Partitioning to the Computation
twAwler: A lightweight twitter crawler
The Dyson equation with linear self-energy: spectral bands, edges and cusps
Learning Semantic Textual Similarity from Conversations
Phrase-Based & Neural Unsupervised Machine Translation
Mobile Edge Computing-Enabled Heterogeneous Networks


Magister Dixit

“What was once just a figment of the imagination of some of our most famous science fiction writers, artificial intelligence (AI) is taking root in our everyday lives. We’re still a few years away from having robots at our beck and call, but AI has already had a profound impact in more subtle ways. Weather forecasts, email spam filtering, Google’s search predictions, and voice recognition, such as Apple’s Siri, are all examples. What these technologies have in common are machine-learning algorithms that enable them to react and respond in real time. There will be growing pains as AI technology evolves, but the positive effect it will have on society in terms of efficiency is immeasurable.” Or Shani (January 27, 2015)

Document worth reading: “Learning from the machine: interpreting machine learning algorithms for point- and extended- source classification”

We investigate star-galaxy classification for astronomical surveys in the context of four methods that enable the interpretation of black-box machine learning systems. The first is outputting and exploring the decision boundaries given by decision-tree-based methods, which enables the visualization of the classification categories. Second, we investigate how the Mutual Information based Transductive Feature Selection (MINT) algorithm can be used to perform feature pre-selection: if one would like to provide only a small number of input features to a machine learning classification algorithm, feature pre-selection provides a method to determine which of the many possible input properties should be selected. Third is the use of the tree-interpreter package to enable popular decision-tree-based ensemble methods to be opened, visualized, and understood, through additional analysis of the tree-based model that determines not only which features are important to the model, but how important a feature is for a particular classification given its value. Lastly, we use decision boundaries from the model to revise an already existing classification method, essentially asking the tree-based method where decision boundaries are best placed and defining a new classification method. We showcase these techniques by applying them to the problem of star-galaxy separation using data from the Sloan Digital Sky Survey (hereafter SDSS). We use the output of MINT and the ensemble methods to demonstrate how more complex decision boundaries improve star-galaxy classification accuracy over the standard SDSS frames approach (reducing misclassifications by up to $\approx33\%$). We then show how tree-interpreter can be used to explore how relevant each photometric feature is when making a classification on an object-by-object basis.

R Packages worth a look

Spatial Association Between Regionalizations (sabre)
Calculates a degree of spatial association between regionalizations or categorical maps using the information-theoretical V-measure (Nowosad and Stepinski (2018) <doi:10.17605/OSF.IO/RCJH7>). It also offers an R implementation of the MapCurve method (Hargrove et al. (2006) <doi:10.1007/s10109-006-0025-x>).

Basic and Advanced Statistical Power Analysis (WebPower)
This is a collection of tools for conducting both basic and advanced statistical power analysis, including correlation, proportion, t-test, one-way ANOVA, two-way ANOVA, linear regression, logistic regression, Poisson regression, mediation analysis, longitudinal data analysis, structural equation modeling and multilevel modeling. It also serves as the engine for conducting power analysis online at <https://webpower.psychstat.org>.
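As a flavor of what such calculations involve, here is a hand-rolled power computation for the simplest design, a two-sided one-sample z-test, using only the normal CDF (WebPower itself covers far more designs; this is an illustrative analogue, not its code):

```python
from math import erf, sqrt

# Illustrative analogue, not WebPower's code: power of a two-sided
# one-sample z-test at alpha = 0.05, computed from the normal CDF alone.
def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def z_test_power(effect_size, n, z_crit=1.959963984540054):
    """Power to detect a mean shift of `effect_size` standard deviations
    with sample size n (z_crit defaults to the 97.5% normal quantile)."""
    shift = effect_size * sqrt(n)
    return norm_cdf(shift - z_crit) + norm_cdf(-shift - z_crit)
```

For a medium effect (0.5 sd), roughly 32 to 34 observations already give about 80% power, and power grows monotonically with n.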

Interface to the ‘Libraries.io’ API (rbraries)
Interface to the ‘Libraries.io’ API (<https://…/api>). ‘Libraries.io’ indexes data from 36 different package managers for programming languages.

‘AWS Comprehend’ Client Package (aws.comprehend)
Client for ‘AWS Comprehend’ <https://…/comprehend>, a cloud natural language processing service that can perform a number of quantitative text analyses, including language detection, sentiment analysis, and feature extraction.

Entropy-Based Segregation Indices (segregation)
Computes entropy-based segregation indices, as developed by Theil (1971) <isbn:978-0471858454>, with a focus on the Mutual Information Index (M). The M, further described by Mora and Ruiz-Castillo (2011) <doi:10.1111/j.1467-9531.2011.01237.x> and Frankel and Volij (2011) <doi:10.1016/j.jet.2010.10.008>, is a measure of segregation that is highly decomposable. The package provides tools to decompose the index by units and groups (local segregation), and by within and between terms. Includes standard error estimation by bootstrapping.
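The core quantity, the Mutual Information index M, is straightforward to compute from a unit-by-group count table; the package's added value lies in the decompositions and bootstrap standard errors built on top of it. A bare-bones sketch:

```python
from math import log

# Bare-bones computation of the Mutual Information index M from a
# unit-by-group count table: M = sum_{u,g} p_ug * log(p_ug / (p_u * p_g)).
def m_index(counts):
    """counts[u][g] = number of people of group g living in unit u."""
    total = sum(sum(row) for row in counts)
    unit_p = [sum(row) / total for row in counts]
    group_p = [sum(row[g] for row in counts) / total
               for g in range(len(counts[0]))]
    m = 0.0
    for u, row in enumerate(counts):
        for g, n in enumerate(row):
            if n > 0:
                p = n / total
                m += p * log(p / (unit_p[u] * group_p[g]))
    return m
```

Complete segregation of two equal-sized groups gives M = log 2; a perfectly even mix gives M = 0.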

Book Memo: “Visual Pattern Discovery and Recognition”

This book presents a systematic study of visual pattern discovery, from unsupervised to semi-supervised approaches, and from dealing with a single feature to handling multiple types of features. Furthermore, it discusses the potential applications of discovering visual patterns for visual data analytics, including visual search and object and scene recognition. It is intended as a reference book for advanced undergraduate or postgraduate students who are interested in visual data analytics, enabling them to quickly access the research world and acquire a systematic methodology, rather than a few isolated techniques, for analyzing visual data with large variations. It is also inspiring for researchers working in the computer vision and pattern recognition fields. Basic knowledge of linear algebra, computer vision and pattern recognition would be helpful to readers.

Book Memo: “Kernel Mean Embedding of Distributions”

A Review and Beyond
A Hilbert space embedding of a distribution – in short, a kernel mean embedding – has recently emerged as a powerful tool for machine learning and statistical inference. The basic idea behind this framework is to map distributions into a reproducing kernel Hilbert space (RKHS) in which the whole arsenal of kernel methods can be extended to probability measures. It can be viewed as a generalization of the original “feature map” common to support vector machines (SVMs) and other kernel methods. In addition to the classical applications of kernel methods, the kernel mean embedding has found novel applications in fields ranging from probabilistic modeling to statistical inference, causal discovery, and deep learning. This survey aims to give a comprehensive review of existing work and recent advances in this research area, and to discuss challenging issues and open problems that could potentially lead to new research directions. The survey begins with a brief introduction to the RKHS and positive definite kernels, which form the backbone of this survey, followed by a thorough discussion of the Hilbert space embedding of marginal distributions, theoretical guarantees, and a review of its applications. The embedding of distributions enables us to apply RKHS methods to probability measures, which prompts a wide range of applications such as kernel two-sample testing, independence testing, and learning on distributional data. Next, we discuss the Hilbert space embedding for conditional distributions, give theoretical insights, and review some applications. The conditional mean embedding enables us to perform sum, product, and Bayes’ rules—which are ubiquitous in graphical models, probabilistic inference, and reinforcement learning—in a non-parametric way using this new representation of distributions. We then discuss relationships between this framework and other related areas. Lastly, we give some suggestions on future research directions.
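The kernel two-sample testing mentioned above reduces, at its core, to the squared Maximum Mean Discrepancy: the RKHS distance between the two empirical mean embeddings. A minimal sketch (biased estimator, scalar data, Gaussian kernel):

```python
from math import exp

# Minimal (biased) MMD^2 estimator with a Gaussian kernel on scalar data:
# the squared RKHS distance between two empirical mean embeddings.
def gaussian_kernel(x, y, bandwidth=1.0):
    return exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)] over the samples."""
    k = lambda a, b: gaussian_kernel(a, b, bandwidth)
    kxx = sum(k(a, b) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(k(a, b) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(k(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy
```

Identical samples give an MMD of zero; well-separated samples give a value near the kernel's maximum, which is what makes the quantity usable as a two-sample test statistic.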

If you did not already know

Deep Policy Inference Q-Network (DPIQN) google
We present DPIQN, a deep policy inference Q-network that targets multi-agent systems composed of controllable agents, collaborators, and opponents that interact with each other. We focus on one challenging issue in such systems—modeling agents with varying strategies—and propose to employ ‘policy features’ learned from raw observations (e.g., raw images) of collaborators and opponents by inferring their policies. DPIQN incorporates the learned policy features as a hidden vector into its own deep Q-network (DQN), such that it is able to predict better Q values for the controllable agents than the state-of-the-art deep reinforcement learning models. We further propose an enhanced version of DPIQN, called deep recurrent policy inference Q-network (DRPIQN), for handling partial observability. Both DPIQN and DRPIQN are trained by an adaptive training procedure, which adjusts the network’s attention to learn the policy features and its own Q-values at different phases of the training process. We present a comprehensive analysis of DPIQN and DRPIQN, and highlight their effectiveness and generalizability in various multi-agent settings. Our models are evaluated in a classic soccer game involving both competitive and collaborative scenarios. Experiments on 1 vs. 1 and 2 vs. 2 games show that DPIQN and DRPIQN demonstrate superior performance to the baseline DQN and deep recurrent Q-network (DRQN) models. We also explore scenarios in which collaborators or opponents dynamically change their policies, and show that DPIQN and DRPIQN do lead to better overall performance in terms of stability and mean scores. …

Multiregression Dynamic Models (MDM) google
Multiregression dynamic models are defined to preserve certain conditional independence structures over time across a multivariate time series. They are non-Gaussian and yet they can often be updated in closed form. The first two moments of their one-step-ahead forecast distribution can be easily calculated. Furthermore, they can be built to contain all the features of the univariate dynamic linear model and promise more efficient identification of causal structures in a time series than has been possible in the past …
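The closed-form updates inherit from the univariate dynamic linear model that an MDM applies per series. A toy scalar version of one observation update, in standard DLM notation (this is the building block only, not an MDM implementation):

```python
def dlm_step(m, C, y, F=1.0, G=1.0, V=1.0, W=1.0):
    # One update of a scalar dynamic linear model:
    #   state:       theta_t = G * theta_{t-1} + w_t,   w_t ~ N(0, W)
    #   observation: y_t     = F * theta_t + v_t,       v_t ~ N(0, V)
    a = G * m                 # prior mean for theta_t
    R = G * C * G + W         # prior variance
    f = F * a                 # one-step-ahead forecast mean (closed form)
    Q = F * R * F + V         # one-step-ahead forecast variance (closed form)
    A = R * F / Q             # adaptive gain
    m_new = a + A * (y - f)   # posterior mean given y_t
    C_new = R - A * Q * A     # posterior variance
    return m_new, C_new, f, Q
```

The first two forecast moments (f, Q) fall out of the recursion directly, which is what makes sequential updating of each node in the MDM cheap.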

Extremal Depth (ED) google
We propose a new notion called ‘extremal depth’ (ED) for functional data, discuss its properties, and compare its performance with existing concepts. The proposed notion is based on a measure of extreme ‘outlyingness’. ED has several desirable properties that are not shared by other notions and is especially well suited for obtaining central regions of functional data and function spaces. In particular: a) the central region achieves the nominal (desired) simultaneous coverage probability; b) there is a correspondence between ED-based (simultaneous) central regions and appropriate point-wise central regions; and c) the method is resistant to certain classes of functional outliers. The paper examines the performance of ED and compares it with other depth notions. Its usefulness is demonstrated through applications to constructing central regions, functional boxplots, outlier detection, and simultaneous confidence bands in regression problems. …
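A simplified numpy sketch of the ingredients: pointwise depth of each curve in a sample, and a central region built from the most central curves. Note the ordering used here (worst-case minimum pointwise depth) is a crude stand-in for ED's actual left-tail ordering of the depth distribution:

```python
import numpy as np

def pointwise_depth(curves):
    # curves: (n_curves, n_timepoints). Pointwise depth of curve i at time t is
    # 1 - |prop. of curves below - prop. above|, so extreme curves score low.
    n = curves.shape[0]
    below = (curves[None, :, :] < curves[:, None, :]).sum(axis=1)
    above = (curves[None, :, :] > curves[:, None, :]).sum(axis=1)
    return 1.0 - np.abs(below - above) / n

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
curves = np.sin(2 * np.pi * t) + rng.normal(0, 0.2, (30, 50))

depth = pointwise_depth(curves)            # (30, 50) pointwise depths
score = depth.min(axis=1)                  # penalize extreme outlyingness anywhere
central = curves[np.argsort(score)[-15:]]  # the 50% most central curves
band_lo, band_hi = central.min(axis=0), central.max(axis=0)  # central band
```

The envelope (band_lo, band_hi) is the kind of simultaneous central region the abstract refers to; ED's contribution is an ordering that makes such bands attain the nominal simultaneous coverage.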

Book Memo: “Analysis of Repeated Measures Data”

This book presents a broad range of statistical techniques to address emerging needs in the field of repeated measures. It also provides a comprehensive overview of extensions of generalized linear models for the bivariate exponential family of distributions, which represent a new development in analysing repeated measures data. The demand for statistical models for correlated outcomes has grown rapidly recently, mainly due to the presence of two types of underlying associations: associations between outcomes, and associations between explanatory variables and outcomes. The book systematically addresses key problems arising in the modelling of repeated measures data, bearing in mind those factors that play a major role in estimating the underlying relationships between covariates and outcome variables for correlated outcome data. In addition, it presents new approaches to addressing current challenges in the field of repeated measures and models based on conditional and joint probabilities. Markov models of first and higher orders are used for conditional models in addition to conditional probabilities as a function of covariates. Similarly, joint models are developed using both marginal-conditional probabilities as well as joint probabilities as a function of covariates. In addition to generalized linear models for bivariate outcomes, it highlights extended semi-parametric models for continuous failure time data and their applications in order to include models for a broader range of outcome variables that researchers encounter in various fields. The book further discusses the problem of analysing repeated measures data for failure time in the competing risk framework, which is now taking on an increasingly important role in the field of survival analysis, reliability and actuarial science. Details on how to perform the analyses are included in each chapter and supplemented with newly developed R packages and functions along with SAS codes and macro/IML.
It is a valuable resource for researchers, graduate students and other users of statistical techniques for analysing repeated measures data.

Book Memo: “Iterative Learning Control for Multi-agent Systems Coordination”

A timely guide using iterative learning control (ILC) as a solution for multi-agent systems (MAS) challenges, showcasing recent advances and industrially relevant applications
• Explores the synergy between the important topics of iterative learning control (ILC) and multi-agent systems (MAS)
• Concisely summarizes recent advances and significant applications in ILC methods for power grids, sensor networks and control processes
• Covers basic theory, rigorous mathematics as well as engineering practice

If you did not already know

Convolutional Neural Network – Support Vector Machine (CNN-SVM) google
Convolutional neural networks (CNNs) are similar to ‘ordinary’ neural networks in the sense that they are made up of hidden layers consisting of neurons with ‘learnable’ parameters. These neurons receive inputs, perform a dot product, and then follow it with a non-linearity. The whole network expresses the mapping between raw image pixels and their class scores. Conventionally, the Softmax function is the classifier used at the last layer of this network. However, there have been studies (Alalshekmubarak and Smith, 2013; Agarap, 2017; Tang, 2013) conducted to challenge this norm. The cited studies introduce the usage of a linear support vector machine (SVM) in an artificial neural network architecture. This project is yet another take on the subject, and is inspired by (Tang, 2013). Empirical data has shown that the CNN-SVM model was able to achieve a test accuracy of ~99.04% using the MNIST dataset (LeCun, Cortes, and Burges, 2010). On the other hand, the CNN-Softmax was able to achieve a test accuracy of ~99.23% using the same dataset. Both models were also tested on the recently-published Fashion-MNIST dataset (Xiao, Rasul, and Vollgraf, 2017), which is supposed to be a more difficult image classification dataset than MNIST (Zalandoresearch, 2017). This proved to be the case as CNN-SVM reached a test accuracy of ~90.72%, while the CNN-Softmax reached a test accuracy of ~91.86%. The said results may be improved if data preprocessing techniques were employed on the datasets, and if the base CNN model was relatively more sophisticated than the one used in this study. …
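The substitution the cited studies make lives entirely in the loss: the last layer is trained with a (squared) hinge loss instead of softmax cross-entropy, as in Tang (2013). A numpy sketch of the multiclass L2-SVM objective on the network's output scores (the margin value and shapes are illustrative):

```python
import numpy as np

def l2svm_loss_and_grad(scores, y, delta=1.0):
    # Multiclass L2-SVM (squared hinge) loss replacing softmax cross-entropy.
    # scores: (n, n_classes) raw outputs of the last layer; y: (n,) labels.
    n = scores.shape[0]
    correct = scores[np.arange(n), y][:, None]
    margins = np.maximum(0.0, scores - correct + delta)  # hinge margins
    margins[np.arange(n), y] = 0.0                       # no self-margin
    loss = (margins ** 2).sum() / n
    grad = 2.0 * margins                                 # d loss / d scores
    grad[np.arange(n), y] = -grad.sum(axis=1)            # correct-class gradient
    return loss, grad / n
```

The gradient with respect to the scores is then backpropagated through the CNN exactly as with the softmax head; only the head's objective changes.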

Dynamically Routed Network (SkipNet) google
Increasing depth and complexity in convolutional neural networks has enabled significant progress in visual perception tasks. However, incremental improvements in accuracy are often accompanied by exponentially deeper models that push the computational limits of modern hardware. These incremental improvements in accuracy imply that only a small fraction of the inputs require the additional model complexity. As a consequence, for any given image it is possible to bypass multiple stages of computation to reduce the cost of forward inference without affecting accuracy. We exploit this simple observation by learning to dynamically route computation through a convolutional network. We introduce dynamically routed networks (SkipNets) by adding gating layers that route images through existing convolutional networks and formulate the routing problem in the context of sequential decision making. We propose a hybrid learning algorithm which combines supervised learning and reinforcement learning to address the challenges of inherently non-differentiable routing decisions. We show SkipNet reduces computation by 30 – 90% while preserving the accuracy of the original model on four benchmark datasets. We compare SkipNet with SACT and ACT to show SkipNet achieves better accuracy with lower computation. …
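The mechanism is a tiny gate attached to each residual block: when the gate closes, the block's computation is bypassed entirely via the identity shortcut. A minimal numpy sketch with a hypothetical linear gate (SkipNet's actual gates are small convolutional or recurrent networks trained with the hybrid algorithm):

```python
import numpy as np

def gated_block(x, W_block, w_gate):
    # Residual block whose execution is controlled by a scalar gate.
    # At inference the gate is hardened to 0/1; a closed gate skips the
    # block's computation and just forwards the input.
    g = 1.0 / (1.0 + np.exp(-x @ w_gate))  # gate probability in (0, 1)
    if g <= 0.5:
        return x                            # skip: identity shortcut only
    return x + np.maximum(0.0, x @ W_block) # execute: residual update

x = np.ones(4)
skipped = gated_block(x, np.eye(4), -np.ones(4))  # gate input -4 -> closed
executed = gated_block(x, np.eye(4), np.ones(4))  # gate input +4 -> open
```

Since the hard skip decision is non-differentiable, the paper relaxes it during supervised pre-training and then refines it with reinforcement learning, which is the hybrid scheme the abstract mentions.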

Out-of-Distribution Detector for Neural Networks (ODIN) google
We consider the problem of detecting out-of-distribution examples in neural networks. We propose ODIN, a simple and effective out-of-distribution detector for neural networks, that does not require any change to a pre-trained model. Our method is based on the observation that using temperature scaling and adding small perturbations to the input can separate the softmax score distributions of in- and out-of-distribution samples, allowing for more effective detection. We show in a series of experiments that our approach is compatible with diverse network architectures and datasets. It consistently outperforms the baseline approach [1] by a large margin, establishing a new state-of-the-art performance on this task. For example, ODIN reduces the false positive rate from the baseline 34.7% to 4.3% on the DenseNet (applied to CIFAR-10) when the true positive rate is 95%. We theoretically analyze the method and prove that performance improvement is guaranteed under mild conditions on the image distributions. …
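Both ingredients fit in a few lines: divide the logits by a large temperature, nudge the input in the direction that increases the max softmax probability, and threshold the resulting score. A numpy sketch on a toy linear "network" (the finite-difference gradient and all constants are illustrative; the paper backpropagates through the real network):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def odin_score(x, logits_fn, T=1000.0, eps=0.01):
    # ODIN score: temperature-scale the logits, step along the sign of the
    # gradient that increases the max softmax probability, then re-score.
    s = lambda v: softmax(logits_fn(v) / T).max()
    grad = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = 1e-5
        grad[i] = (s(x + d) - s(x - d)) / 2e-5  # central finite difference
    return s(x + eps * np.sign(grad))

# In-distribution inputs align with a class prototype; out-of-distribution
# inputs sit between classes, so the perturbation helps them less.
W = np.array([[4.0, 0.0], [0.0, 4.0]])
score_in = odin_score(np.array([1.0, 0.0]), lambda v: W @ v)
score_out = odin_score(np.array([0.1, 0.1]), lambda v: W @ v)
```

Inputs whose score falls below a calibrated threshold are flagged as out-of-distribution; the temperature and perturbation magnitude are tuned on a validation set.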