Regression or classification? This is perhaps the most basic question faced when tackling a new supervised learning problem. We present an Evolutionary Deep Learning (EDL) algorithm that automatically answers this question by identifying the problem type with high accuracy, along with a proposed deep architecture. Typically, a significant amount of human insight and preparation is required prior to executing machine learning algorithms. For example, when creating deep neural networks, the number of parameters must be selected in advance; furthermore, many of these choices, such as the use of a categorical cross-entropy loss function, are based on pre-existing knowledge of the data. Humans are able to study a dataset and decide whether it represents a classification or a regression problem, and consequently make decisions that are applied when running the neural network. We propose the Automated Problem Identification (API) algorithm, which uses an evolutionary algorithm interface to TensorFlow to manipulate a deep neural network to decide if a dataset represents a classification or a regression problem. We test API on 16 different classification, regression and sentiment analysis datasets with up to 10,000 features and up to 17,000 unique target values. API achieves an average accuracy of $96.3\%$ in identifying the problem type without hardcoding any insights about the general characteristics of regression or classification problems. For example, API successfully identifies classification problems even with 1000 target values. Furthermore, the algorithm recommends which loss function to use and also recommends a neural network architecture. Our work is therefore a step towards fully automated machine learning.
In this work a novel, automated process for determining an appropriate deep neural network architecture and weight initialization based on decision trees is presented. The method maps a collection of decision trees trained on the data into a collection of initialized neural networks, with the structure of the network determined by the structure of the tree. These models, referred to as ‘deep jointly-informed neural networks’, demonstrate high predictive performance for a variety of datasets. Furthermore, the algorithm is readily cast into a Bayesian framework, resulting in accurate and scalable models that provide quantified uncertainties on predictions.
Artificial Intelligence methods to solve continuous-control tasks have made significant progress in recent years. However, these algorithms have important limitations and still need significant improvement to be used in industry and real-world applications. This means that this area is still in an active research phase. To involve a large number of research groups, standard benchmarks are needed to evaluate and compare proposed algorithms. In this paper, we propose a physical environment benchmark framework to facilitate collaborative research in this area by enabling different research groups to integrate their designed benchmarks in a unified cloud-based repository and also share their actual implemented benchmarks via the cloud. We demonstrate the proposed framework using an actual implementation of the classical mountain-car example and present the results obtained using a Reinforcement Learning algorithm.
This paper addresses the challenge of viewing and navigating Bayesian networks as their structural size and complexity grow. Starting with a review of the state of the art of visualizing Bayesian networks, an area which has largely been passed over, we improve upon existing visualizations in three ways. First, we apply a disciplined approach to the graphic design of the basic elements of the Bayesian network. Second, we propose a technique for direct, visual comparison of posterior distributions resulting from alternative evidence sets. Third, we leverage a central mathematical tool in information theory to assist the user in finding variables of interest in the network and to reduce visual complexity where it is unimportant. We present our methods applied to two modestly large Bayesian networks constructed from real-world data sets. Results suggest the new techniques can be a useful tool for discovering information flow phenomena, and also for qualitative comparisons of different evidence configurations, especially in large probabilistic networks.
This paper deals with the problem of properly simulating the Internet of Things (IoT). Simulating the IoT makes it possible to evaluate strategies for deploying smart services over different kinds of territories. However, the heterogeneity of scenarios seriously complicates this task and requires the use of sophisticated modeling and simulation techniques. We discuss novel approaches for the provision of scalable simulation scenarios that enable the real-time execution of massively populated IoT environments. Attention is given to novel hybrid and multi-level simulation techniques that, when combined with agent-based, adaptive Parallel and Distributed Simulation (PADS) approaches, can provide means to perform highly detailed simulations on demand. To support this claim, we detail a use case concerned with the simulation of vehicular transportation systems.
IEEE 802.11 wireless networks are becoming increasingly popular at university campuses, enterprises, shopping centers, airports and many other public places, providing open and fast Internet access to large crowds. Wireless users are also becoming more dependent on WiFi technology and therefore demand more reliability and higher performance from it. However, due to unstable radio conditions, faulty equipment, and dynamic user behavior, among other reasons, there are always unpredictable performance problems in a wireless covered area. Detection and prediction of such problems is of great significance to network managers if they are to alleviate the connectivity issues of mobile users and provide a higher quality wireless service. This paper aims to improve the management of 802.11 wireless networks by characterizing and modeling wireless usage patterns in a set of anomalous scenarios that can occur in such networks. We apply time-invariant (Gaussian Mixture Models, GMM) and time-variant (Hidden Markov Models, HMM) modeling approaches to a dataset generated from a large production network and describe how we use these models for anomaly detection. We then generate several common anomalies on a testbed network and evaluate the proposed anomaly detection methodologies in a controlled environment. The experimental results on the testbed show that HMM outperforms GMM, yielding a higher anomaly detection ratio and a lower false alarm rate.
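As a rough illustration of the time-invariant approach mentioned above, the following minimal sketch scores observations under a one-dimensional Gaussian mixture and flags those whose log-likelihood falls below a threshold. The mixture parameters, the threshold, and the function names are hypothetical and chosen only for illustration; the abstract does not specify the features, the fitted models, or the decision rule actually used.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gmm_log_likelihood(x, components):
    """Log-density of x under a mixture given as (weight, mean, std) tuples."""
    density = sum(w * gaussian_pdf(x, m, s) for w, m, s in components)
    return math.log(density) if density > 0.0 else float("-inf")


def flag_anomalies(samples, components, threshold):
    """Mark a sample as anomalous when its log-likelihood is below threshold."""
    return [gmm_log_likelihood(x, components) < threshold for x in samples]


if __name__ == "__main__":
    # Hypothetical mixture, e.g. over the number of users seen by an access point.
    mixture = [(0.7, 20.0, 5.0), (0.3, 60.0, 10.0)]
    print(flag_anomalies([18.0, 55.0, 150.0], mixture, threshold=-10.0))
```

In practice the mixture would be fitted to data from normal operation, and the threshold would be tuned to trade off detection ratio against false alarm rate.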
Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources, e.g., from sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper performs a deep analysis of GSGP performance in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the generalization performance degradation is more pronounced in GSGP than in GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed.
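The idea of corrupting a controlled percentage of instances can be sketched as follows. This is only an assumed setup: the abstract does not say how noise is generated, so perturbing target values with zero-mean Gaussian noise, and the names `add_target_noise`, `noise_ratio` and `noise_std`, are illustrative choices rather than the authors' procedure.

```python
import random


def add_target_noise(targets, noise_ratio, noise_std, seed=0):
    """Return a copy of targets with a fraction of entries perturbed.

    noise_ratio is the fraction of instances to corrupt (e.g. 0.10 for 10%),
    and noise_std is the standard deviation of the added Gaussian noise
    (an assumed noise model). Also returns the sorted corrupted indices.
    """
    rng = random.Random(seed)
    n_noisy = round(noise_ratio * len(targets))
    noisy_idx = set(rng.sample(range(len(targets)), n_noisy))
    noisy = [y + rng.gauss(0.0, noise_std) if i in noisy_idx else y
             for i, y in enumerate(targets)]
    return noisy, sorted(noisy_idx)


if __name__ == "__main__":
    clean = [float(i) for i in range(20)]
    noisy, idx = add_target_noise(clean, noise_ratio=0.10, noise_std=1.0)
    print(len(idx), "of", len(clean), "instances were perturbed")
```

Fixing the seed keeps the corrupted subset reproducible, which makes it easier to compare two learners on exactly the same noisy data.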
Continuous latent time series models are prevalent in Bayesian modeling; examples include the Kalman filter, dynamic collaborative filtering, and dynamic topic models. These models often benefit from structured, non-mean-field variational approximations that capture correlations between time steps. Black box variational inference with reparameterization gradients (BBVI) allows us to explore a rich new class of Bayesian non-conjugate latent time series models; however, a naive application of BBVI to such structured variational models would scale quadratically in the number of time steps. We describe a BBVI algorithm, analogous to the forward-backward algorithm, which instead scales linearly in time. It allows us to efficiently sample from the variational distribution and estimate the gradients of the ELBO. Finally, we show results on the recently proposed dynamic word embedding model, which was trained using our method.
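One way linear-time sampling from a structured approximation can arise is sketched below. It assumes, purely for illustration, a Gaussian variational distribution whose precision matrix factorizes as $B^\top B$ with $B$ upper-bidiagonal; the abstract does not state the form of the variational family, so this should not be read as the authors' algorithm. Under that assumption, a reparameterized sample $z = \mu + B^{-1}\epsilon$ is obtained by back-substitution in $O(T)$ time.

```python
import random


def sample_structured_gaussian(mean, diag, off_diag, rng):
    """Draw z = mean + B^{-1} eps for an upper-bidiagonal matrix B.

    B has diag[t] at position (t, t) and off_diag[t] at position (t, t + 1).
    The implied covariance is (B^T B)^{-1}. Solving B x = eps by
    back-substitution costs O(T) instead of the O(T^2) of a dense factor.
    Returns the sample z and the noise eps that generated it.
    """
    T = len(mean)
    eps = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [0.0] * T
    x[T - 1] = eps[T - 1] / diag[T - 1]
    for t in range(T - 2, -1, -1):
        x[t] = (eps[t] - off_diag[t] * x[t + 1]) / diag[t]
    z = [m + xi for m, xi in zip(mean, x)]
    return z, eps


if __name__ == "__main__":
    rng = random.Random(0)
    z, _ = sample_structured_gaussian(
        mean=[0.0, 0.0, 0.0], diag=[1.0, 1.0, 1.0], off_diag=[-0.5, -0.5], rng=rng)
    print(z)
```

For fixed `eps`, the sample is a differentiable function of `mean`, `diag` and `off_diag`, which is what reparameterization gradients rely on.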
We introduce an extremely computation-efficient CNN architecture named ShuffleNet, designed specifically for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two proposed operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g., lower top-1 error (absolute 6.7\%) than the recent MobileNet system on ImageNet classification under the computation budget of 40 MFLOPs. On an ARM-based mobile device, ShuffleNet achieves $\sim$13$\times$ actual speedup over AlexNet while maintaining comparable accuracy.
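The channel shuffle operation named above can be illustrated on a plain list of channel labels. The sketch assumes the usual formulation, in which the channel axis is viewed as a (groups, channels per group) grid, transposed, and flattened again, so that each group in the next layer receives channels from every previous group. The function name `channel_shuffle` is illustrative, and a real implementation would apply the same permutation to the channel dimension of a feature tensor.

```python
def channel_shuffle(channels, groups):
    """Permute channels by reshaping to (groups, n), transposing, flattening."""
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    # Output position j * groups + i takes the input channel at i * per_group + j.
    return [channels[i * per_group + j]
            for j in range(per_group)
            for i in range(groups)]


if __name__ == "__main__":
    # Two groups of three channels: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle(list(range(6)), groups=2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, every consecutive block of `groups` channels contains one channel from each original group, which is what lets information flow between groups in stacked group convolutions.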