Automated Problem Identification: Regression vs Classification via Evolutionary Deep Networks

Regression or classification? This is perhaps the most basic question faced when tackling a new supervised learning problem. We present an Evolutionary Deep Learning (EDL) algorithm that answers this question automatically, identifying the problem type with high accuracy and proposing a deep architecture. Typically, a significant amount of human insight and preparation is required before executing machine learning algorithms. For example, when creating deep neural networks, the number of parameters must be selected in advance, and many of these choices, such as the use of a categorical cross-entropy loss function, rely on pre-existing knowledge of the data. Humans are able to study a dataset and decide whether it represents a classification or a regression problem, and consequently make decisions that are applied when running the neural network. We propose the Automated Problem Identification (API) algorithm, which uses an evolutionary algorithm interface to TensorFlow to manipulate a deep neural network to decide whether a dataset represents a classification or a regression problem. We test API on 16 different classification, regression and sentiment analysis datasets with up to 10,000 features and up to 17,000 unique target values. API achieves an average accuracy of 96.3% in identifying the problem type without hardcoding any insights about the general characteristics of regression or classification problems. For example, API successfully identifies classification problems even with 1000 target values. Furthermore, the algorithm recommends which loss function to use and also recommends a neural network architecture. Our work is therefore a step towards fully automated machine learning.

Deep Jointly-Informed Neural Networks

In this work a novel, automated process for determining an appropriate deep neural network architecture and weight initialization based on decision trees is presented. The method maps a collection of decision trees trained on the data into a collection of initialized neural networks, with the structure of the network determined by the structure of the tree. These models, referred to as ‘deep jointly-informed neural networks’, demonstrate high predictive performance for a variety of datasets. Furthermore, the algorithm is readily cast into a Bayesian framework, resulting in accurate and scalable models that provide quantified uncertainties on predictions.

OPEB: Open Physical Environment Benchmark for Artificial Intelligence

Artificial Intelligence methods to solve continuous-control tasks have made significant progress in recent years. However, these algorithms have important limitations and still need significant improvement to be used in industry and real-world applications. This means that this area is still in an active research phase. To involve a large number of research groups, standard benchmarks are needed to evaluate and compare proposed algorithms. In this paper, we propose a physical environment benchmark framework to facilitate collaborative research in this area by enabling different research groups to integrate their designed benchmarks in a unified cloud-based repository and also share their actual implemented benchmarks via the cloud. We demonstrate the proposed framework using an actual implementation of the classical mountain-car example and present the results obtained using a Reinforcement Learning algorithm.
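
For readers unfamiliar with the classical mountain-car example, the following is a minimal simulation sketch of its textbook formulation. It is not the physical benchmark described in the abstract: the constants follow the standard simulated task, and the names `step`, `run_episode` and `energy_pumping` are hypothetical helpers introduced here for illustration only.

```python
import math

# Textbook mountain-car constants (simulated task, not the physical rig).
X_MIN, X_MAX = -1.2, 0.5   # position bounds; the goal is at X_MAX
V_MAX = 0.07               # speed limit


def step(x, v, action):
    """Advance one time step. action is -1 (reverse), 0 (coast) or +1 (forward)."""
    v += 0.001 * action - 0.0025 * math.cos(3 * x)
    v = max(-V_MAX, min(V_MAX, v))
    x += v
    if x < X_MIN:           # inelastic collision with the left wall
        x, v = X_MIN, 0.0
    return x, v, x >= X_MAX


def run_episode(policy, x=-0.5, v=0.0, max_steps=1000):
    """Return the number of steps needed to reach the goal, or None on timeout."""
    for t in range(1, max_steps + 1):
        x, v, done = step(x, v, policy(x, v))
        if done:
            return t
    return None


def energy_pumping(x, v):
    """Hand-written baseline (not a learned policy): push in the direction of motion."""
    return 1 if v >= 0 else -1


if __name__ == "__main__":
    print("steps to goal:", run_episode(energy_pumping))
```

The engine is too weak to climb the hill directly, so the car has to build momentum by rocking back and forth. That property is what makes the task a common test case for reinforcement learning algorithms.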

Visualizing the Consequences of Evidence in Bayesian Networks

This paper addresses the challenge of viewing and navigating Bayesian networks as their structural size and complexity grow. Starting with a review of the state of the art of visualizing Bayesian networks, an area which has largely been passed over, we improve upon existing visualizations in three ways. First, we apply a disciplined approach to the graphic design of the basic elements of the Bayesian network. Second, we propose a technique for direct, visual comparison of posterior distributions resulting from alternative evidence sets. Third, we leverage a central mathematical tool in information theory to assist the user in finding variables of interest in the network and to reduce visual complexity where it is unimportant. We present our methods applied to two modestly large Bayesian networks constructed from real-world data sets. Results suggest the new techniques can be a useful tool for discovering information flow phenomena, and also for qualitative comparisons of different evidence configurations, especially in large probabilistic networks.

Modeling the Internet of Things: a simulation perspective

This paper deals with the problem of properly simulating the Internet of Things (IoT). Simulating an IoT allows evaluating strategies that can be employed to deploy smart services over different kinds of territories. However, the heterogeneity of scenarios seriously complicates this task and calls for sophisticated modeling and simulation techniques. We discuss novel approaches for the provision of scalable simulation scenarios that enable the real-time execution of massively populated IoT environments. Attention is given to novel hybrid and multi-level simulation techniques that, when combined with agent-based, adaptive Parallel and Distributed Simulation (PADS) approaches, can provide means to perform highly detailed simulations on demand. To support this claim, we detail a use case concerned with the simulation of vehicular transportation systems.

Anomaly Detection and Modeling in 802.11 Wireless Networks

IEEE 802.11 wireless networks are becoming increasingly popular at university campuses, enterprises, shopping centers, airports and many other public places, providing open and quick Internet access to large crowds. Wireless users are also becoming more dependent on WiFi technology and therefore demand more reliability and higher performance from it. However, due to unstable radio conditions, faulty equipment, and dynamic user behavior, among other reasons, there are always unpredictable performance problems in a wireless covered area. Detection and prediction of such problems is of great significance to network managers if they are to alleviate the connectivity issues of mobile users and provide a higher-quality wireless service. This paper aims to improve the management of 802.11 wireless networks by characterizing and modeling wireless usage patterns in a set of anomalous scenarios that can occur in such networks. We apply time-invariant (Gaussian Mixture Models, GMM) and time-variant (Hidden Markov Models, HMM) modeling approaches to a dataset generated from a large production network and describe how we use these models for anomaly detection. We then generate several common anomalies on a testbed network and evaluate the proposed anomaly detection methodologies in a controlled environment. The experimental results on the testbed show that HMM outperforms GMM, yielding a higher anomaly detection ratio and a lower false alarm rate.

How Noisy Data Affects Geometric Semantic Genetic Programming

Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources, e.g., from sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper performs a deep analysis of GSGP performance in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the generalization performance degradation is more pronounced in GSGP than in GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed.
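
The abstract does not specify how noise was injected, so the following is only a minimal sketch of one plausible setup for controlling the percentage of noisy instances. It assumes that noise is additive Gaussian noise on the regression targets; the function name `add_target_noise` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def add_target_noise(targets, noise_ratio, noise_std, seed=0):
    """Perturb a fixed fraction of targets with Gaussian noise.

    Assumption for illustration only: the paper's actual noise model
    is not described in the abstract.
    Returns the noisy targets and the sorted indices that were perturbed.
    """
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_noisy = round(noise_ratio * len(targets))
    noisy_idx = set(rng.sample(range(len(targets)), n_noisy))
    noisy = [
        t + rng.gauss(0.0, noise_std) if i in noisy_idx else t
        for i, t in enumerate(targets)
    ]
    return noisy, sorted(noisy_idx)


if __name__ == "__main__":
    clean = [float(i) for i in range(20)]
    noisy, idx = add_target_noise(clean, noise_ratio=0.10, noise_std=1.0)
    print("perturbed instances:", idx)
```

Fixing the seed keeps the set of perturbed instances reproducible, so two learners can be compared on the same corrupted data at each noise ratio.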

Structured Black Box Variational Inference for Latent Time Series Models

Continuous latent time series models are prevalent in Bayesian modeling; examples include the Kalman filter, dynamic collaborative filtering, and dynamic topic models. These models often benefit from structured, non-mean-field variational approximations that capture correlations between time steps. Black box variational inference with reparameterization gradients (BBVI) allows us to explore a rich new class of Bayesian non-conjugate latent time series models; however, a naive application of BBVI to such structured variational models would scale quadratically in the number of time steps. We describe a BBVI algorithm analogous to the forward-backward algorithm which instead scales linearly in time. It allows us to efficiently sample from the variational distribution and estimate the gradients of the ELBO. Finally, we show results on the recently proposed dynamic word embedding model, which was trained using our method.

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

We introduce an extremely computation-efficient CNN architecture named ShuffleNet, designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two proposed operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g. lower top-1 error (absolute 6.7%) than the recent MobileNet system on ImageNet classification under the computation budget of 40 MFLOPs. On an ARM-based mobile device, ShuffleNet achieves ~13× actual speedup over AlexNet while maintaining comparable accuracy.
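
The abstract names channel shuffle without spelling it out. As commonly described, the operation views the channel axis as a (groups, channels per group) grid, transposes it, and flattens it back, so that the next group convolution sees channels from every group. The sketch below applies that permutation to a plain list of channel labels; the function name `channel_shuffle` and the list-based representation are illustrative assumptions, not the authors' code.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing to (per_group, groups), and flattening.
    `channels` is any list with one entry per channel.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


if __name__ == "__main__":
    # Two groups of three channels: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive block of output channels mixes channels from different input groups. Without it, stacked group convolutions would never exchange information between groups.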

The Nu Class of Low-Degree-Truncated Rational Multifunctions. Ib. Integrals of Matern-correlation functions for all odd-half-integer class parameters
Hyperbolic Geometry of Kuramoto Oscillator Networks
Modeling preference time in middle distance triathlons
Improving LSTM-CTC based ASR performance in domains with limited training data
Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning
Regression Phalanxes
Data Fusion Reconstruction of Spatially Embedded Complex Networks
Probability tilting of compensated fragmentations
On the Extremal Graphs with Respect to Bond Incident Degree Indices
Polar Codes for SCMA Systems
Robust Prediction and Control of Continuous-time Epidemic Processes
Structure Optimization for Deep Multimodal Fusion Networks using Graph-Induced Kernels
Appearance invariance in convolutional networks with neighborhood similarity
Multiscale sequence modeling with a learned dictionary
Dynamic Shrinkage Processes
Learning to Avoid Errors in GANs by Manipulating Input Spaces
Discriminatory Transfer
The Fall of the Empire: The Americanization of English
On Symmetric But Not Cyclotomic Numerical Semigroups
A simple efficient density estimator that enables fast systematic search
Zero-Shot Fine-Grained Classification by Deep Feature Learning with Semantics
Diagonal sum of infinite image partition regular matrices
Lifeguard : SWIM-ing with Situational Awareness
Efficient sensor network planning method using approximate potential game
Learning Deep Energy Models: Contrastive Divergence vs. Amortized MLE
Deep Representation Learning with Part Loss for Person Re-Identification
Hydrodynamics of the $N$-BBM process
Arabic Character Segmentation Using Projection Based Approach with Profile’s Amplitude Filter
A curious identity and its applications to partitions with bounded part differences
PBODL : Parallel Bayesian Online Deep Learning for Click-Through Rate Prediction in Tencent Advertising System
Aggregating Frame-level Features for Large-Scale Video Classification
General Price Bounds for Guaranteed Annuity Options
Deconvolution of Point Sources: A Sampling Theorem and Robustness Guarantees
Selective Deep Convolutional Features for Image Retrieval
High-Quality Face Image SR Using Conditional Generative Adversarial Networks
Rényi Resolvability and Its Applications to the Wiretap Channel
One-Shot Fine-Grained Instance Retrieval
Inexact decomposition methods for solving deterministic and stochastic convex dynamic programming equations
Spatial and Angular Resolution Enhancement of Light Fields Using Convolutional Neural Networks
Causal Consistency of Structural Equation Models
The Maximum Cosine Framework for Deriving Perceptron Based Linear Classifiers
Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition
Supporting Ruled Polygons
A functional limit theorem for random processes with immigration in the case of heavy tails
Two-sample Tests for Random Graphs
Face Recognition with Machine Learning in OpenCV_ Fusion of the results with the Localization Data of an Acoustic Camera for Speaker Identification
DeepStory: Video Story QA by Deep Embedded Memory Networks
Discussions of the paper ‘Sparse graphs using exchangeable random measures’ by F. Caron and E. B. Fox
A Complete Classification of Partial-MDS (Maximally Recoverable) Codes with One Global Parity
Automated Modal Parameter Estimation Using Correlation Analysis and Bootstrap Sampling
Equivariant Euler characteristics of the unitary building
Conditional generation of multi-modal data using constrained embedding space mapping
A semiparametric approach for bivariate extreme exceedances
Space-Time Analysis of Movements in Basketball using Sensor Data
Identification of non-linear behavior models with restricted or redundant data
ECHO: An Adaptive Orchestration Platform for Hybrid Dataflows across Cloud and Edge
A sparse linear algebra algorithm for fast computation of prediction variances with Gaussian Markov random fields
Multilingual Hierarchical Attention Networks for Document Classification
Factorizations of symmetric Macdonald polynomials
Asymptotics for the Euler-Discretized Hull-White Stochastic Volatility Model
Short-Range-Order for fcc-based binary alloys Revisited from Microscopic Geometry
Sequential Checking: Reallocation-Free Data-Distribution Algorithm for Scale-out Storage
The Candidate Multi-Cut for Cell Segmentation
Quantifying and estimating additive measures of interaction from case-control data
Bonus–malus systems with different claim types and varying deductibles
Hook formulas for skew shapes III. Multivariate and product formulas
Window-of-interest based Multi-objective Evolutionary Search for Satisficing Concepts
The sample complexity of multi-reference alignment
On The Brownian Loop Measure
Automatic estimation of harmonic tension by distributed representation of chords
Empirical optimal transport on countable metric spaces: Distributional limits and statistical applications
Signed generating functions for odd inversions on descent classes
An empirical study on the effectiveness of images in Multimodal Neural Machine Translation
Proof of a conjecture of Klopsch-Voll on Weyl groups of type $A$
Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation
Ins-Robust Primitive Words
LED-based Photometric Stereo: Modeling, Calibration and Numerical Solution
Mixingales on Riesz spaces
Distance Properties of Short LDPC Codes and their Impact on the BP, ML and Near-ML Decoding Performance
Polyhedra and parameter spaces for matroids over valuation rings
Packing Cycles Faster Than Erdős-Pósa
Massive MIMO for Communications with Drone Swarms
Convex regularization of discrete-valued inverse problems
Robust Optimization for Non-Convex Objectives
Improving Estimations in Quantile Regression Model with Autoregressive Errors
Markov processes on Riesz spaces
Skeleton-aided Articulated Motion Generation
On the relaxed maximum-likelihood blind MIMO channel estimation for orthogonal space-time block codes
Physical Layer Service Integration in 5G: Potentials and Challenges
Zero-Shot Transfer Learning for Event Extraction
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Maintaining cooperation in complex social dilemmas using deep reinforcement learning
Improving Slot Filling Performance with Attentive Neural Networks on Dependency Structures
Optimal Littlewood-Offord inequalities in groups
Discriminative Localization in CNNs for Weakly-Supervised Segmentation of Pulmonary Nodules