• Self-organized critical behavior in Ising spin glasses
• A universal tree balancing theorem
• Finding the Size of a Radio Network with Short Labels
• The phase transition in bounded-size Achlioptas processes
• Action Understanding with Multiple Classes of Actors
• Structured Sparse Modelling with Hierarchical GP
• Calibration of a two-state pitch-wise HMM method for note segmentation in Automatic Music Transcription systems
• Splittability and 1-amalgamability of permutation classes
• Conserved quantities of Q-systems from dimer integrable systems
• Portfolio-driven Resource Management for Transient Cloud Servers
• Signed graphs: from modulo flows to integer-valued flows
• Improving Facial Attribute Prediction using Semantic Segmentation
• Efficient Feature Screening for Lasso-Type Problems via Hybrid Safe-Strong Rules
• Bifurcation Mechanism Design — From Optimal Flat Taxes to Improved Cancer Treatments
• A Network Perspective on Stratification of Multi-Label Data
• Obstacle Avoidance through Deep Networks based Intermediate Perception
• Computational complexity of the initial value problem for the three body problem
• GazeDirector: Fully Articulated Eye Gaze Redirection in Video
• Data Based Identification and Prediction of Nonlinear and Complex Dynamical Systems
• Prediction of Daytime Hypoglycemic Events Using Continuous Glucose Monitoring Data and Classification Technique
• Strong Coordination over Noisy Channels: Is Separation Sufficient?
• Deep Face Deblurring
• Genealogical Distance as a Diversity Estimate in Evolutionary Algorithms
• Partially Occluded Leaf Recognition via Beta-Spline Curve Matching and Energy Minimization
• Learning Quadratic Variance Function (QVF) DAG models via OverDispersion Scoring (ODS)
• One-Dimensional Packing: Maximality Implies Rationality
• Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
• Generating Simple Near-Bipartite Bricks
• Risk Stratification of Lung Nodules Using 3D CNN-Based Multi-task Learning
• Word Affect Intensities
• The spectral symmetry of weakly irreducible nonnegative tensors and connected hypergraphs
• Neural Ranking Models with Weak Supervision
• Performance Assessment of High-dimensional Variable Identification
• Automatic Real-time Background Cut for Portrait Videos
• Generator polynomials and generator matrix for quasi cyclic codes
• A Tribe Competition-Based Genetic Algorithm for Feature Selection in Pattern Classification
• Disorder-protected topological entropy after a quantum quench
• Active Collaborative Ensemble Tracking
• AKS method: a new image compression by gradient Haar wavelet
• Spectral-Efficient Analog Precoding for Generalized Spatial Modulation Aided MmWave MIMO
• Generalized Spatial Modulation Aided MmWave MIMO with Sub-Connected Hybrid Precoding Scheme
• Classical Widely Linear Estimation of Real Valued Parameter Vectors in Complex Valued Environments
• On partitioning the edges of an infinite digraph into directed cycles
• Outline Colorization through Tandem Adversarial Networks
• Relaxing the Irrevocability Requirement for Online Graph Algorithms
• On consecutive pattern-avoiding permutations of length 4, 5 and beyond
• Image reconstruction by domain transform manifold learning
• The speed of biased random walk among random conductances
• The right tool for the right question — beyond the encoding versus decoding dichotomy
• On the 1-factorizations of Middle Level Graph: Inner structure, Algorithm, and Application
• Learning Spatiotemporal-Aware Representation for POI Recommendation
• Multi-antenna Wireless Legitimate Surveillance Systems: Design and Performance Analysis
• Structural Parameters, Tight Bounds, and Approximation for (k,r)-Center
• Deterministic Gathering with Crash Faults
• Improving Small Object Proposals for Company Logo Detection
• Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning
• Finite-state Strategies in Delay Games
• Stochastic Proximal Gradient Algorithms for Penalized Mixed Models
• How consistent are our discourse annotations? Insights from mapping RST-DT and PDTB annotations
• Quaternion Gaussian matrices satisfy the RIP
• A Hida-Malliavin white noise calculus approach to optimal control
• Unbiased Shape Compactness for Segmentation
• Adaptation and learning over networks for nonlinear system modeling
• Interference Exploitation for Radar and Cellular Coexistence: The Power-Efficient Approach
• Necessary conditions for linear convergence of Picard iterations and application to alternating projections
• A Framework for Rate Efficient Control of Distributed Discrete Systems
• Dynamic disorder in simple enzymatic reactions induces stochastic amplification of substrate
• A lower bound on CNF encodings of the at-most-one constraint
• Object Discovery via Cohesion Measurement
• Expressing Facial Structure and Appearance Information in Frequency Domain for Face Recognition
• Neural Word Segmentation with Rich Pretraining
• Dependent Microstructure Noise and Integrated Volatility Estimation from High-Frequency Data
• Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models
• Phase retrieval with a multivariate Von Mises prior: from a Bayesian formulation to a lifting solution
• Exact extremal statistics in the classical $1d$ Coulomb gas
• When is the mode functional the Bayes classifier?
• Brownian disks and the Brownian snake
• The topological face of recommendation: models and application to bias detection
• A Unified Approach of Multi-scale Deep and Hand-crafted Features for Defocus Estimation
• Distribution System Voltage Control under Uncertainties
• Entropy of Independent Experiments, Revisited
• Exploiting the Natural Exploration In Contextual Bandits
• A robust parallel algorithm for combinatorial compressed sensing
• Unimodular hierarchical models and their Graver bases
• Parameter Estimation in Computational Biology by Approximate Bayesian Computation coupled with Sensitivity Analysis
• Time-Sensitive Bandit Learning and Satisficing Thompson Sampling
A Siamese Deep Forest (SDF) is proposed in the paper. It is based on the Deep Forest or gcForest proposed by Zhou and Feng and can be viewed as a gcForest modification. It can also be regarded as an alternative to the well-known Siamese neural networks. The SDF uses a modified training set consisting of concatenated pairs of vectors. Moreover, it defines the class distributions in the deep forest as the weighted sum of the tree class probabilities, where the weights are chosen to reduce distances between similar pairs and to increase them between dissimilar points. We show that the weights can be obtained by solving a quadratic optimization problem. The SDF aims to prevent the overfitting that takes place in neural networks when only limited training data are available. The numerical experiments illustrate the proposed distance metric method.
Artificial intelligence methods have often been applied to perform specific functions or tasks in the cyber-defense realm. However, as adversary methods become more complex and difficult to divine, piecemeal efforts to understand cyber-attacks, and malware-based attacks in particular, are not providing sufficient means for malware analysts to understand the past, present and future characteristics of malware. In this paper, we present the Malware Analysis and Attribution using Genetic Information (MAAGI) system. The underlying idea behind the MAAGI system is that there are strong similarities between malware behavior and biological organism behavior, and applying biologically inspired methods to corpora of malware can help analysts better understand the ecosystem of malware attacks. Due to the sophistication of the malware and the analysis, the MAAGI system relies heavily on artificial intelligence techniques to provide this capability. It has already yielded promising results over its development life, and will hopefully inspire more integration between the artificial intelligence and cyber-defense communities.
We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. To achieve this, we adapt neural sequence models to map utterances directly to SQL with its full expressivity, bypassing any intermediate meaning representations. These models are immediately deployed online to solicit feedback from real users to flag incorrect queries. Finally, the popularity of SQL facilitates gathering annotations for incorrect predictions using the crowd, which is directly used to improve our models. This complete feedback loop, without intermediate representations or database specific engineering, opens up new ways of building high quality semantic parsers. Experiments suggest that this approach can be deployed quickly for any new target domain, as we show by learning a semantic parser for an online academic database from scratch.
In deep learning, performance is strongly affected by the choice of architecture and hyperparameters. While there has been extensive work on automatic hyperparameter optimization for simple spaces, complex spaces such as the space of deep architectures remain largely unexplored. As a result, the choice of architecture is done manually by the human expert through a slow trial and error process guided mainly by intuition. In this paper we describe a framework for automatically designing and training deep models. We propose an extensible and modular language that allows the human expert to compactly represent complex search spaces over architectures and their hyperparameters. The resulting search spaces are tree-structured and therefore easy to traverse. Models can be automatically compiled to computational graphs once values for all hyperparameters have been chosen. We can leverage the structure of the search space to introduce different model search algorithms, such as random search, Monte Carlo tree search (MCTS), and sequential model-based optimization (SMBO). We present experiments comparing the different algorithms on CIFAR-10 and show that MCTS and SMBO outperform random search. In addition, these experiments show that our framework can be used effectively for model discovery, as it is possible to describe expressive search spaces and discover competitive models without much effort from the human expert. Code for our framework and experiments has been made publicly available.
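The idea of a tree-structured search space explored by random search can be sketched in a few lines. This is only an illustrative toy, not the framework's actual language: the node constructors `choice`, `sequence` and `repeat`, the layer strings, and `toy_score` (a stand-in for validation accuracy after training) are all hypothetical names introduced here.

```python
import random

# Hypothetical mini-language: a search space is a tree of three node types.
def choice(*options):        # pick exactly one option
    return ("choice", options)

def sequence(*parts):        # apply sub-spaces in order
    return ("sequence", parts)

def repeat(part, counts):    # repeat a sub-space a chosen number of times
    return ("repeat", part, counts)

def sample(space, rng):
    """Walk the tree top-down, resolving every open decision at random."""
    if not isinstance(space, tuple):
        return [space]       # leaf: a concrete layer description (a string)
    kind = space[0]
    if kind == "choice":
        return sample(rng.choice(space[1]), rng)
    if kind == "sequence":
        return [layer for part in space[1] for layer in sample(part, rng)]
    if kind == "repeat":
        n = rng.choice(space[2])
        return [layer for _ in range(n) for layer in sample(space[1], rng)]
    raise ValueError(f"unknown node type: {kind}")

def random_search(space, score, trials, seed=0):
    """Sample `trials` fully specified models and keep the best-scoring one."""
    rng = random.Random(seed)
    candidates = [sample(space, rng) for _ in range(trials)]
    return max(candidates, key=score)

# A small space: 1 to 3 conv/activation blocks, then dropout, then a classifier.
space = sequence(
    repeat(
        sequence(choice("conv3x3-32", "conv5x5-32"), choice("relu", "tanh")),
        (1, 2, 3),
    ),
    choice("dropout-0.25", "dropout-0.5"),
    "dense-10",
)

def toy_score(model):
    # Placeholder objective (assumption): prefers models with six layers.
    return -abs(len(model) - 6)

best = random_search(space, toy_score, trials=50)
print(best)
```

Because every open decision sits at a node of the tree, replacing the uniform `rng.choice` calls with a learned or tree-search policy is the natural place where MCTS or SMBO would plug in.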
This paper presents a general graph representation learning framework called DeepGL for learning deep node and edge representations from large (attributed) graphs. In particular, DeepGL begins by deriving a set of base features (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from the previous layer to learn higher-order features. Contrary to previous work, DeepGL learns relational functions (each representing a feature) that generalize across networks and are therefore useful for graph-based transfer learning tasks. Moreover, DeepGL naturally supports attributed graphs, learns interpretable features, and is space-efficient (by learning sparse feature vectors). In addition, DeepGL is expressive, flexible with many interchangeable components, efficient with a time complexity of , and scalable for large networks via an efficient parallel implementation. Compared with the state-of-the-art method, DeepGL is (1) effective for across-network transfer learning tasks and attributed graph representation learning, (2) space-efficient requiring up to 6x less memory, (3) fast with up to 182x speedup in runtime performance, and (4) accurate with an average improvement of 20% or more on many learning tasks.
We introduce Parseval networks, a form of deep neural networks in which the Lipschitz constant of linear, convolutional and aggregation layers is constrained to be smaller than 1. Parseval networks are empirically and theoretically motivated by an analysis of the robustness of the predictions made by deep neural networks when their input is subject to an adversarial perturbation. The most important feature of Parseval networks is that the weight matrices of linear and convolutional layers are kept (approximately) Parseval tight frames, which are extensions of orthogonal matrices to non-square matrices. We describe how these constraints can be maintained efficiently during SGD. We show that Parseval networks match the state-of-the-art in terms of accuracy on CIFAR-10/100 and Street View House Numbers (SVHN) while being more robust than their vanilla counterpart against adversarial examples. Incidentally, Parseval networks also tend to train faster and make better use of the full capacity of the networks.
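One way to keep a weight matrix close to a Parseval tight frame is a retraction of the form W ← (1 + β)W − β W Wᵀ W, applied after each gradient step. The sketch below illustrates that update on a small tall matrix in pure Python. The abstract does not spell out the update, so the exact form used here, the step size `beta = 0.5`, the iteration count and the example matrix are assumptions chosen for illustration; a training loop would typically use a much smaller step interleaved with SGD.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def parseval_step(w, beta):
    """One retraction step: W <- (1 + beta) * W - beta * W W^T W."""
    wwtw = matmul(matmul(w, transpose(w)), w)
    return [
        [(1 + beta) * wij - beta * cij for wij, cij in zip(w_row, c_row)]
        for w_row, c_row in zip(w, wwtw)
    ]

def frame_error(w):
    """Largest entry-wise deviation of W^T W from the identity."""
    gram = matmul(transpose(w), w)
    n = len(gram)
    return max(
        abs(gram[i][j] - (1.0 if i == j else 0.0))
        for i in range(n)
        for j in range(n)
    )

# Illustrative 3x2 weight matrix (arbitrary values, not from the paper).
w0 = [[0.9, 0.2], [0.1, 0.8], [0.3, -0.4]]

w = w0
for _ in range(20):
    w = parseval_step(w, beta=0.5)

print(frame_error(w0), frame_error(w))
```

Each step moves every singular value σ of W to σ(1 + β − βσ²), whose fixed point is σ = 1, so the columns of W drift toward an orthonormal set and the layer's Lipschitz constant toward 1.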
A proper initialization of the weights in a neural network is critical to its convergence. Current insights into weight initialization come primarily from linear activation functions. In this paper, I develop a theory for weight initializations with non-linear activations. First, I derive a general weight initialization strategy for any neural network using activation functions differentiable at 0. Next, I derive the weight initialization strategy for the Rectified Linear Unit (RELU), and provide theoretical insights into why the Xavier initialization is a poor choice with RELU activations. My analysis provides a clear demonstration of the role of non-linearities in determining the proper weight initializations.
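The effect of the initialization scale under ReLU can be illustrated with a small forward-pass experiment. The sketch below compares a weight variance of 1/n (what Xavier initialization gives for square layers) with the widely used 2/n scaling. The 2/n value is an assumption standing in for a ReLU-aware choice, not the constant derived in the paper, and the width, depth and function name are hypothetical.

```python
import math
import random

def second_moment_after_relu_stack(width, depth, var_scale, seed=0):
    """Push a random input through `depth` square ReLU layers whose weights
    are drawn i.i.d. from N(0, var_scale / width), and return the mean
    squared activation of the final layer."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    std = math.sqrt(var_scale / width)
    for _ in range(depth):
        # Each output unit is a fresh random linear combination of x.
        pre = [sum(rng.gauss(0.0, std) * xi for xi in x) for _ in range(width)]
        x = [max(0.0, p) for p in pre]   # ReLU
    return sum(v * v for v in x) / width

# Weight variance 1/n: Xavier for a square layer (fan_in == fan_out == n).
xavier_style = second_moment_after_relu_stack(width=64, depth=10, var_scale=1.0)
# Weight variance 2/n: compensates for ReLU zeroing about half the units.
relu_aware = second_moment_after_relu_stack(width=64, depth=10, var_scale=2.0)

print(xavier_style, relu_aware)
```

ReLU discards roughly half of the pre-activation energy at every layer, so with variance 1/n the signal shrinks by about a factor of two per layer, while 2/n keeps it at a roughly constant scale. The gap between the two printed values therefore grows geometrically with depth.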
We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., in a corpus that contains an order of magnitude more languages than parallel corpora currently in use. We show that SuperPivot performs well for the crosslingual analysis of the linguistic phenomenon of tense. We produce analysis results for more than 1000 languages, conducting – to the best of our knowledge – the largest crosslingual computational study performed to date. We extend existing methodology for leveraging parallel corpora for typological analysis by overcoming a limiting assumption of earlier work: We only require that a linguistic feature is overtly marked in a few of thousands of languages as opposed to requiring that it be marked in all languages under investigation.
An Intelligent Personal Agent (IPA) is an agent whose purpose is to help the user gain information from reliable resources using knowledge navigation techniques, saving the time spent searching for the best content. The agent is also responsible for responding to chat-based queries with the help of a Conversation Corpus. We will be testing different methods for optimal query generation. To facilitate ease of use of the application, the agent will be able to accept input through Text (Keyboard), Voice (Speech Recognition) and Server (Facebook) and output responses using the same method. Existing chat bots reply by making changes in the input, but we will give responses based on multiple SRT files. The model will learn from the human dialogs dataset and will be able to respond in a human-like way. Responses to queries about famous things (places, people, and words) can be provided using web scraping, which will enable the bot to have knowledge navigation features. The agent will even learn from its past experiences, supporting semi-supervised learning.
In this paper we introduce and formalize Substochastic Monte Carlo (SSMC) algorithms. These algorithms, originally intended to be a better classical foil to quantum annealing than simulated annealing, prove to be worthy optimization algorithms in their own right. In SSMC, a population of walkers is initialized according to a known distribution on an arbitrary search space and varied into the solution of some optimization problem of interest. The first argument of this paper shows how an existing classical algorithm, ‘Go-With-The-Winners’ (GWW), is a limiting case of SSMC when restricted to binary search and particular driving dynamics. Although limiting to GWW, SSMC is more general. We show that (1) GWW can be efficiently simulated within the SSMC framework, (2) SSMC can be exponentially faster than GWW, (3) by naturally incorporating structural information, SSMC can exponentially outperform the quantum algorithm that first inspired it, and (4) SSMC exhibits desirable search features in general spaces. Our approach combines ideas from genetic algorithms (GWW), theoretical probability (Fleming-Viot processes), and quantum computing. Not only do we demonstrate that SSMC is often more efficient than competing algorithms, but we also hope that our results connecting these disciplines will impact each independently. An implemented version of SSMC has previously enjoyed some success as a competitive optimization algorithm for Max--SAT.