Practical Machine Learning for Cloud Intrusion Detection: Challenges and the Way Forward

Operationalizing machine-learning-based security detections is extremely challenging, especially in a continuously evolving cloud environment. Conventional anomaly detection does not produce satisfactory results for analysts who are investigating security incidents in the cloud. Model evaluation alone presents its own set of problems due to a lack of benchmark datasets. When deploying these detections, we must deal with model compliance, localization, and data silo issues, among many others. We pose the problem of ‘attack disruption’ as a way forward in the security data science space. In this paper, we describe the framework, challenges, and open questions surrounding the successful operationalization of machine-learning-based security detections in a cloud environment and provide some insights on how we have addressed them.

Deconvolutional Latent-Variable Model for Text Sequence Matching

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization. Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long Short-Term Memory (LSTM), with fewer parameters and considerably faster training. Further, we apply it to text sequence-matching problems. The proposed model significantly outperforms several strong sentence-encoding baselines, especially in the semi-supervised setting.

Feature Engineering for Predictive Modeling using Reinforcement Learning

Feature engineering is a crucial step in the process of predictive modeling. It involves the transformation of a given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target. However, there is no well-defined basis for performing effective feature engineering. It involves domain knowledge, intuition, and most of all, a lengthy process of trial and error. The human attention involved in overseeing this process significantly influences the cost of model generation. We present a new framework to automate feature engineering. It is based on performance-driven exploration of a transformation graph, which systematically and compactly enumerates the space of given options. A highly efficient exploration strategy is derived through reinforcement learning on past examples.

Lazy stochastic principal component analysis

Stochastic principal component analysis (SPCA) has become a popular dimensionality reduction strategy for large, high-dimensional datasets. We derive a simplified algorithm, called Lazy SPCA, which has reduced computational complexity and is better suited for large-scale distributed computation. We prove that SPCA and Lazy SPCA find the same approximations to the principal subspace, and that the pairwise distances between samples in the lower-dimensional space are invariant to whether SPCA is executed lazily or not. Empirical studies find downstream predictive performance to be identical for both methods, and superior to random projections, across a range of predictive models (linear regression, logistic lasso, and random forests). In our largest experiment with 4.6 million samples, Lazy SPCA reduced 43.7 hours of computation to 9.9 hours. Overall, Lazy SPCA relies exclusively on matrix multiplications, besides an operation on a small square matrix whose size depends only on the target dimensionality.

Handling Factors in Variable Selection Problems

Factors are categorical variables, and the values which these variables assume are called levels. In this paper, we consider the variable selection problem where the set of potential predictors contains both factors and numerical variables. Formally, this problem is a particular case of the standard variable selection problem where factors are coded using dummy variables. As such, the Bayesian solution would be straightforward and, possibly because of this, the problem, despite its importance, has not received much attention in the literature. Nevertheless, we show that this perception is illusory and that in fact several inputs like the assignment of prior probabilities over the model space or the parameterization adopted for factors may have a large (and difficult to anticipate) impact on the results. We provide a solution to these issues that extends the proposals in the standard variable selection problem and does not depend on how the factors are coded using dummy variables. Our approach is illustrated with a real example concerning a childhood obesity study in Spain.
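
To make the point about parameterization concrete, the following is a minimal sketch (not the paper's method) of two common ways to code a three-level factor with dummy variables. The helper names `treatment_coding`, `sum_coding`, and `fitted_values`, the toy data, and the choice of the first level as reference are all assumptions made for illustration. Both codings give the same least-squares fit, but the coefficients differ, so a prior placed on the coefficients means something different under each coding.

```python
import numpy as np

def treatment_coding(levels, values):
    """One indicator column per non-reference level (first level is the reference)."""
    return np.array([[1.0 if v == lev else 0.0 for lev in levels[1:]]
                     for v in values])

def sum_coding(levels, values):
    """Like treatment coding, but reference-level rows are -1 in every column."""
    rows = []
    for v in values:
        if v == levels[0]:
            rows.append([-1.0] * (len(levels) - 1))
        else:
            rows.append([1.0 if v == lev else 0.0 for lev in levels[1:]])
    return np.array(rows)

def fitted_values(X, y):
    """Ordinary least squares with an intercept; returns (fitted values, coefficients)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ beta, beta

# Hypothetical toy data: one factor with levels a, b, c and a numeric response.
levels = ["a", "b", "c"]
values = ["a", "b", "c", "a", "b", "c"]
y = np.array([1.0, 2.0, 4.0, 1.5, 2.5, 3.5])

fit_t, beta_t = fitted_values(treatment_coding(levels, values), y)
fit_s, beta_s = fitted_values(sum_coding(levels, values), y)

print("same fitted values:", np.allclose(fit_t, fit_s))
print("treatment coefficients:", beta_t)
print("sum-coding coefficients:", beta_s)
```

Under treatment coding the intercept is the mean of the reference level, while under this sum coding it is the average of the level means, which is one simple way to see why results can depend on the coding unless the method is designed to be invariant to it.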

Class-Splitting Generative Adversarial Networks

Generative Adversarial Networks (GANs) produce systematically better quality samples when class label information is provided, i.e., in the conditional GAN setup. This is still observed for the recently proposed Wasserstein GAN formulation, which stabilizes adversarial training and allows considering high-capacity network architectures such as ResNet. In this work we show how to boost conditional GAN by augmenting available class labels. The new classes come from clustering in the representation space learned by the same GAN model. The proposed strategy is also feasible when no class information is available, i.e., in the unsupervised setup. Our generated samples reach state-of-the-art Inception scores for CIFAR-10 and STL-10 datasets in both supervised and unsupervised setups.

Neural Optimizer Search with Reinforcement Learning

We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain-specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient and the running average of the gradient. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google’s neural machine translation system.
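
The abstract does not give the update equations, so the following is only a sketch under stated assumptions. It uses the form in which these rules are commonly written: the gradient g is scaled by base^(sign(g)·sign(m)) for PowerSign and by 1 + sign(g)·sign(m) for AddSign, where m is a running average of the gradient. The function name `sign_agreement_step`, the default hyperparameters, the order in which m is updated, and the toy quadratic objective are assumptions for illustration, and the optional decay schedules are omitted.

```python
import numpy as np

def sign_agreement_step(w, g, m, lr=0.1, beta=0.9, rule="powersign", base=np.e):
    """One parameter update; returns (new_w, new_m).

    m is an exponential running average of the gradient. The step is larger
    when the current gradient and the running average agree in sign, and
    smaller (or zero, for AddSign) when they disagree.
    """
    m = beta * m + (1.0 - beta) * g          # update the running average first (assumption)
    agree = np.sign(g) * np.sign(m)          # +1 agree, -1 disagree, 0 if either is zero
    if rule == "powersign":
        scale = base ** agree
    elif rule == "addsign":
        scale = 1.0 + agree
    else:
        raise ValueError(f"unknown rule: {rule}")
    return w - lr * scale * g, m

# Toy usage: minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
for rule in ("powersign", "addsign"):
    w = np.array([1.0, -2.0])
    m = np.zeros_like(w)
    for _ in range(50):
        w, m = sign_agreement_step(w, w, m, lr=0.1, rule=rule)
    print(rule, "final norm:", np.linalg.norm(w))
```

Both rules reduce to a rescaled SGD step: the sign agreement only changes the step size per coordinate, never the direction of the gradient term.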

Analyzing users’ sentiment towards popular consumer industries and brands on Twitter

Social media serves as a unified platform for users to express their thoughts on subjects ranging from their daily lives to their opinion on consumer brands and products. These users wield an enormous influence in shaping the opinions of other consumers and influence brand perception, brand loyalty and brand advocacy. In this paper, we analyze the opinion of 19M Twitter users towards 62 popular industries, encompassing 12,898 enterprise and consumer brands, as well as associated subject matter topics, via sentiment analysis of 330M tweets over a period spanning a month. We find that users tend to be most positive towards manufacturing and most negative towards service industries. In addition, they tend to be more positive or negative when interacting with brands than generally on Twitter. We also find that sentiment towards brands within an industry varies greatly and we demonstrate this using two industries as use cases. In addition, we discover that there is no strong correlation between topic sentiments of different industries, demonstrating that topic sentiments are highly dependent on the context of the industry that they are mentioned in. We demonstrate the value of such an analysis in order to assess the impact of brands on social media. We hope that this initial study will prove valuable for both researchers and companies in understanding users’ perception of industries, brands and associated topics and encourage more research in this field.

Uniquely labelled geodesics of Coxeter groups
Anisotropic Functional Fourier Deconvolution from indirect long-memory observations
Numerical reconstruction of the first band(s) in an inverse Hill’s problem
Extreme Value Estimation for Discretely Sampled Continuous Processes
Data-Driven Model Predictive Control of Autonomous Mobility-on-Demand Systems
Inter-Subject Analysis: Inferring Sparse Interactions with Dense Intra-Graphs
Minimum Covariance Determinant and Extensions
Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments
Multi-camera Multi-Object Tracking
A Unified Approach to the Global Exactness of Penalty and Augmented Lagrangian Functions I: Parametric Exactness
Estimated Depth Map Helps Image Classification
A Deep-Reinforcement Learning Approach for Software-Defined Networking Routing Optimization
A Flocking-based Approach for Distributed Stochastic Optimization
On the Design of LQR Kernels for Efficient Controller Learning
On Compiling DNNFs without Determinism
Near Optimal Sketching of Low-Rank Tensor Regression
Covert Wireless Communication with Artificial Noise Generation
Persistence Flamelets: multiscale Persistent Homology for kernel density exploration
Talagrand Concentration Inequalities for Stochastic Partial Differential Equations
Supervised Learning with Indefinite Topological Kernels
On the Use of Machine Translation-Based Approaches for Vietnamese Diacritic Restoration
Statistical Methods for Ecological Breakpoints and Prediction Intervals
Cost Adaptation for Robust Decentralized Swarm Behaviour
Variational Memory Addressing in Generative Models
Irreversibility of mechanical and hydrodynamic instabilities
Discrete-Time Polar Opinion Dynamics with Susceptibility
Accelerating PageRank using Partition-Centric Processing
Deep Recurrent NMF for Speech Separation by Unfolding Iterative Thresholding
Hypergraph Theory: Applications in 5G Heterogeneous Ultra-Dense Networks
Maximal Moments and Uniform Modulus of Continuity for Stable Random Fields
The k-tacnode process
Fractional iterated Ornstein-Uhlenbeck Processes
Learning RBM with a DC programming Approach
Large Vocabulary Automatic Chord Estimation Using Deep Neural Nets: Design Framework, System Variations and Limitations
Local Private Hypothesis Testing: Chi-Square Tests
SceneCut: Joint Geometric and Object Segmentation for Indoor Scenes
Chromatic number, Clique number, and Lovász’s bound: In a comparison
Semi-Automated Nasal PAP Mask Sizing using Facial Photographs
SpectralFPL: Online Spectral Learning for Single Topic Models
Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization
Analysis of Wireless-Powered Device-to-Device Communications with Ambient Backscattering
Convergence characteristics of the generalized residual cutting method
Visual Question Generation as Dual Task of Visual Question Answering
Temporal Multimodal Fusion for Video Emotion Classification in the Wild
The size of $3$-uniform hypergraphs with given matching number and codegree
A First Derivative Potts Model for Segmentation and Denoising Using MILP
3D Deformable Object Manipulation using Fast Online Gaussian Process Regression
Human Pose Estimation using Global and Local Normalization
Self-Dual Codes better than the Gilbert–Varshamov bound
Convolutional neural networks that teach microscopes how to image
Learning Complex Swarm Behaviors by Exploiting Local Communication Protocols with Deep Reinforcement Learning
Bayesian nonparametric inference for the M/G/1 queueing systems based on the marked departure process
Neural network identification of people hidden from view with a single-pixel, single-photon detector
Sorting with Recurrent Comparison Errors
Real-time predictive maintenance for wind turbines using Big Data frameworks
Assumption-Based Approaches to Reasoning with Priorities
Hysteretic percolation from locally optimal decisions
A Communication-Efficient Distributed Data Structure for Top-k and k-Select Queries
The power of big data sparse signal detection tests on nonparametric detection boundaries
Yet Another ADNI Machine Learning Paper? Paving The Way Towards Fully-reproducible Research on Classification of Alzheimer’s Disease
On Composite Quantum Hypothesis Testing
A New Framework for $\mathcal{H}_2$-Optimal Model Reduction
Hybrid Beamforming Based on Implicit Channel State Information for Millimeter Wave Links
Speech Recognition Challenge in the Wild: Arabic MGB-3
Secure Energy Efficiency Optimization for MISO Cognitive Radio Network with Energy Harvesting
Blood-based metabolic signatures in Alzheimer’s disease
Alternating least squares as moving subspace correction
Spectral Asymptotics for Krein-Feller-Operators with respect to Random Recursive Cantor Measures
Connectedness of random set attractors
On the distribution of monochromatic complete subgraphs and arithmetic progressions
Influence of Clustering on Cascading Failures in Interdependent Systems
Down the Large Rabbit Hole
Playing for Benchmarks
AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection
Stochastic parameterization identification using ensemble Kalman filtering combined with expectation-maximization and Newton-Raphson maximum likelihood methods
Density of the set of probability measures with the martingale representation property
H-DenseUNet: Hybrid Densely Connected UNet for Liver and Liver Tumor Segmentation from CT Volumes
Symbolic Optimal Control
Efficient Column Generation for Cell Detection and Segmentation
Beyond the Sharp Null: Randomization Inference, Bounded Null Hypotheses, and Confidence Intervals for Maximum Effects
On the multi-dimensional elephant random walk
Extended-Alphabet Finite-Context Models
Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness
Non-Depth-First Search against Independent Distributions on an AND-OR Tree
Stable-like fluctuations of Biggins’ martingales
Multi-label Pixelwise Classification for Reconstruction of Large-scale Urban Areas
On the precise determination of the Tsallis parameters in proton – proton collisions at LHC energies
Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTE
Distributed Submodular Minimization And Motion Coordination Over Discrete State Space
Berezinskii-Kosterlitz-Thouless transition in disordered multi-channel Luttinger liquids
If and When a Driver or Passenger is Returning to Vehicle: Framework to Infer Intent and Arrival Time
Urban Land Cover Classification with Missing Data Using Deep Convolutional Neural Networks
On Andrews–Warnaar’s identities of partial theta functions
A new ‘3D Calorimetry’ of hot nuclei
Inducing Distant Supervision in Suggestion Mining through Part-of-Speech Embeddings
Quantum Autoencoders via Quantum Adders with Genetic Algorithms
Bidirected Graphs I: Signed General Kotzig-Lovász Decomposition
On the $l^p$-norm of the Discrete Hilbert transform
Learned Features are better for Ethnicity Classification
Dynamic Evaluation of Neural Sequence Models
Perturbative Black Box Variational Inference