CrescendoNet: A Simple Deep Convolutional Neural Network with Ensemble Behavior

We introduce a new deep convolutional neural network, CrescendoNet, built by stacking simple building blocks without residual connections. Each Crescendo block contains independent convolution paths of increasing depth. The numbers of convolution layers and parameters increase only linearly in Crescendo blocks. In experiments, CrescendoNet with only 15 layers outperforms almost all networks without residual connections on the benchmark datasets CIFAR10, CIFAR100, and SVHN. Given a sufficient amount of data, as in the SVHN dataset, CrescendoNet with 15 layers and 4.1M parameters can match the performance of DenseNet-BC with 250 layers and 15.3M parameters. CrescendoNet provides a new way to construct high-performance deep convolutional neural networks without residual connections. Moreover, by investigating the behavior and performance of subnetworks in CrescendoNet, we note that its high performance may come from its implicit ensemble behavior, which differs from that of FractalNet, another deep convolutional neural network without residual connections. Furthermore, the independence between paths in CrescendoNet allows us to introduce a new path-wise training procedure, which can reduce the memory needed for training.


How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility

Recommendation systems occupy an expanding role in everyday decision making, from choice of movies and household goods to consequential medical and legal decisions. The data used to train and test these systems is algorithmically confounded in that it is the result of a feedback loop between human choices and an existing algorithmic recommendation system. Using simulations, we demonstrate that algorithmic confounding can disadvantage algorithms in training, bias held-out evaluation, and amplify homogenization of user behavior without gains in utility.


Machine Learning and Cognitive Technology for Intelligent Wireless Networks

The ability to dynamically and efficiently allocate resources to meet the needs of a growing diversity of services and user behaviors marks the future of wireless networks. This gives rise to intelligent processing, which aims at enabling the system to perceive and assess the available resources, to autonomously learn to adapt to the perceived wireless environment, and to reconfigure its operating mode to maximize the utility of the available resources. Perception capability and reconfigurability are the essential features of cognitive technology, while modern machine learning techniques show promise for system adaptation. In this paper, we discuss the development of cognitive technology and machine learning techniques and emphasize their roles in improving both the spectrum and energy efficiency of future wireless networks. We describe in detail the state of the art in cognitive technology, covering spectrum sensing and access approaches that may enhance spectrum utilization and curtail energy consumption. We discuss powerful machine learning algorithms that enable spectrum- and energy-efficient communications in dynamic wireless environments. We also present practical applications of these techniques to existing and future wireless communication systems, such as heterogeneous networks and device-to-device communications, and identify some research opportunities and challenges in cognitive technology and machine learning as applied to future wireless networks.


Consistency of Generalized Dynamic Principal Components in Dynamic Factor Models

We study the theoretical properties of the generalized dynamic principal components introduced in Peña and Yohai (2016). In particular, we prove that when the data follows a dynamic factor model, the reconstruction provided by the procedure converges in mean square to the common part of the model as the number of series and periods diverge to infinity. The results of a simulation study support our findings.


Bayesian Learning of Random Graphs & Correlation Structure of Multivariate Data, with Distance between Graphs

We present a method for the simultaneous Bayesian learning of the correlation matrix and graphical model of a multivariate dataset, using Metropolis-within-Gibbs inference. Here, the data comprise measurements of a vector-valued observable that we model using a high-dimensional Gaussian Process (GP), such that the likelihood of the GP parameters given the data is Matrix-Normal, defined by a mean matrix and between-rows and between-columns covariance matrices. We marginalise over the between-rows matrices to achieve a closed-form likelihood of the between-columns correlation matrix given the data. This correlation matrix is updated in the first block of an iteration, given the data, and the (generalised Binomial) graph is updated in the second block, at the partial correlation matrix that is computed given the updated correlation. We also learn the 95% Highest Probability Density credible regions of the correlation matrix as well as the graphical model of the data. The difference made by acknowledging measurement errors in learning the graphical model is demonstrated on a small simulated dataset, while the large human disease-symptom network, with more than 8,000 nodes, is learnt using real data. Data on the vino-chemical attributes of Portuguese red and white wine samples are employed to learn the correlation structure and graphical model of each dataset, and then to compute the distance between the learnt graphical models.


Generating Natural Adversarial Examples

Due to their complex nature, it is hard to characterize the ways in which machine learning models can misbehave or be exploited when deployed. Recent work on adversarial examples, i.e. inputs with minor perturbations that result in substantially different model predictions, is helpful in evaluating the robustness of these models by exposing the adversarial scenarios where they fail. However, these malicious perturbations are often unnatural, not semantically meaningful, and not applicable to complicated domains such as language. In this paper, we propose a framework to generate natural and legible adversarial examples by searching in the semantic space of dense and continuous data representations, utilizing recent advances in generative adversarial networks. We present generated adversaries to demonstrate the potential of the proposed approach for black-box classifiers in a wide range of applications such as image classification, textual entailment, and machine translation. We include experiments to show that the generated adversaries are natural, legible to humans, and useful in evaluating and analyzing black-box classifiers.


Tensor Regression Meets Gaussian Processes

Low-rank tensor regression, a new model class that learns high-order correlation from data, has recently received considerable attention. At the same time, Gaussian processes (GP) are well-studied machine learning models for structure learning. In this paper, we demonstrate interesting connections between the two, especially for multi-way data analysis. We show that low-rank tensor regression is essentially learning a multi-linear kernel in Gaussian processes, and the low-rank assumption translates to the constrained Bayesian inference problem. We prove the oracle inequality and derive the average case learning curve for the equivalent GP model. Our finding implies that low-rank tensor regression, though empirically successful, is highly dependent on the eigenvalues of covariance functions as well as variable correlations.


SemTK: An Ontology-first, Open Source Semantic Toolkit for Managing and Querying Knowledge Graphs

The relatively recent adoption of Knowledge Graphs as an enabling technology in multiple high-profile artificial intelligence and cognitive applications has led to growing interest in the Semantic Web technology stack. Many semantics-related tools, however, are focused on serving experts with a deep understanding of semantic technologies. For example, triplification of relational data is available, but there is no open source tool that allows a user unfamiliar with OWL/RDF to import data into a semantic triple store in an intuitive manner. Further, many tools require users to have a working understanding of SPARQL to query data. Casual users interested in benefiting from the power of Knowledge Graphs have few tools available for exploring, querying, and managing semantic data. We present SemTK, the Semantics Toolkit, a user-friendly suite of tools that offers both expert and non-expert semantics users convenient ingestion of relational data, simplified query generation, and more. The exploration of ontologies and instance data is performed through SPARQLgraph, an intuitive web-based user interface in SemTK that a lay user can understand and navigate. The open source version of SemTK is available at http://semtk.research.ge.com.


TF Boosted Trees: A scalable TensorFlow based framework for gradient boosting

TF Boosted Trees (TFBT) is a new open-sourced framework for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to prevent overfitting.


Partial Least Squares Random Forest Ensemble Regression as a Soft Sensor

Six simple, dynamic soft sensor methodologies with two update conditions were compared on two experimentally obtained datasets and one simulated dataset. The soft sensors investigated were: moving window partial least squares regression (and a recursive variant), moving window random forest regression, feedforward neural networks, mean moving window, and a novel random forest partial least squares regression ensemble (RF-PLS). We found that, on two of the datasets studied, very small window sizes (4 samples) led to the lowest prediction errors. The RF-PLS method offered the lowest one-step-ahead prediction errors compared to those of the other methods, and demonstrated greater stability at larger time lags than moving window PLS alone. We found that this method most adequately modeled the datasets that did not feature purely monotonic increases in property values. In general, we observed that linear models deteriorated most rapidly at more delayed model update conditions, while nonlinear methods tended to provide predictions that approached those from a simple mean moving window. Other data-dependent findings are presented and discussed.
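
The abstract above does not spell out the "mean moving window" baseline. As an illustration only, here is a minimal Python sketch that assumes the baseline predicts each next measurement as the mean of the most recent `window` observations. The function names and the default window of 4 samples (taken from the window size the abstract mentions) are assumptions, not the authors' implementation.

```python
from collections import deque


def moving_window_mean_predictions(values, window=4):
    """One-step-ahead predictions from the mean of the last `window` values.

    Assumed baseline: the prediction for each sample is made before that
    sample is observed. The first prediction is None because no history
    exists yet.
    """
    history = deque(maxlen=window)  # keeps only the most recent samples
    predictions = []
    for y in values:
        predictions.append(sum(history) / len(history) if history else None)
        history.append(y)  # the model "update" step: slide the window
    return predictions


def mean_absolute_error(values, predictions):
    """Mean absolute error over the samples that have a prediction."""
    pairs = [(y, p) for y, p in zip(values, predictions) if p is not None]
    return sum(abs(y - p) for y, p in pairs) / len(pairs)


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]  # made-up, steadily increasing data
    preds = moving_window_mean_predictions(series, window=4)
    print(preds)
    print(mean_absolute_error(series, preds))
```

On a steadily increasing series such as this made-up one, the window mean always trails the true value, which shows why such a baseline suits data without a strong trend better.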


Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm

Learning to learn is a powerful paradigm for enabling models to learn from data more effectively and efficiently. A popular approach to meta-learning is to train a recurrent model to read in a training dataset as input and output the parameters of a learned model, or output predictions for new test inputs. Alternatively, a more recent approach to meta-learning aims to acquire deep representations that can be effectively fine-tuned, via standard gradient descent, to new tasks. In this paper, we consider the meta-learning problem from the perspective of universality, formalizing the notion of learning algorithm approximation and comparing the expressive power of the aforementioned recurrent models to the more recent approaches that embed gradient descent into the meta-learner. In particular, we seek to answer the following question: does deep representation combined with standard gradient descent have sufficient capacity to approximate any learning algorithm? We find that this is indeed true, and further find, in our experiments, that gradient-based meta-learning consistently leads to learning strategies that generalize more widely compared to those represented by recurrent models.


The Capacity of Private Computation
(Quasi)Periodic revivals in periodically driven interacting quantum systems
Analysis, Identification, and Validation of Discrete-Time Epidemic Processes
A stochastic model for evolution with mass extinction on $\mathbb{T}_d^+$
High efficiency compression for object detection
Onsets and Frames: Dual-Objective Piano Transcription
Creation of an Annotated Corpus of Spanish Radiology Reports
Super-polynomial separations for quantum-enhanced reinforcement learning
Indirect Supervision for Relation Extraction using Question-Answer Pairs
Adjusted quantile residual for generalized linear models
Sedentary quantum walks
Sample-efficient Policy Optimization with Stein Control Variate
VLSI Computational Architectures for the Arithmetic Cosine Transform
Deep word embeddings for visual speech recognition
Scaling Limits of Processes with Fast Nonlinear Mean Reversion
Improve SAT-solving with Machine Learning
Critical Points of Neural Networks: Analytical Forms and Landscape Properties
Prophet Secretary for Combinatorial Auctions and Matroids
Deep Learning and Conditional Random Fields-based Depth Estimation and Topographical Reconstruction from Conventional Endoscopy
Location-adjusted Wald statistic for scalar parameters
Fast and Scalable Learning of Sparse Changes in High-Dimensional Gaussian Graphical Model Structure
Bibliometric-Enhanced Information Retrieval: 5th International BIR Workshop
Prototype Matching Networks for Large-Scale Multi-label Genomic Sequence Classification
Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics
Theoretical properties of the global optimizer of two layer neural network
Integer polygons of given perimeter
A Dynamic Hash Table for the GPU
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
Rock-Paper-Scissors, Differential Games and Biological Diversity
Reachability Preservers: New Extremal Bounds and Approximation Algorithms
Stochastic Variational Video Prediction
Approximation Algorithms for $\ell_0$-Low Rank Approximation
Adaptive Sampling Strategies for Stochastic Optimization
Implicit Manifold Learning on Generative Adversarial Networks
Some network conditions for positive recurrence of stochastically modeled reaction networks
Theoretical and Computational Guarantees of Mean Field Variational Inference for Community Detection
Empirical analysis of non-linear activation functions for Deep Neural Networks in classification tasks
Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning
Approximating Continuous Functions by ReLU Nets of Minimal Width
Notes on Cops and Robber game on graphs
Macroeconomics and FinTech: Uncovering Latent Macroeconomic Effects on Peer-to-Peer Lending
Optimal Control of Connected and Automated Vehicles at Roundabouts: An Investigation in a Mixed-Traffic Environment
Sequential Adaptive Detection for In-Situ Transmission Electron Microscopy (TEM)
Tensor Sketching: Sparsification and Rank-One Projection
A generalized parsing framework for Abstract Grammars
Stochastic Linear Quadratic Optimal Control with General Control Domain
Algorithmic learning of probability distributions from random data in the limit
Characterizing the structural diversity of complex networks across domains
Critical behaviour of a probabilistic cellular automaton model for the dynamics of a population driven by logistic growth and weak Allee effect
The Exact Solution to Rank-1 L1-norm TUCKER2 Decomposition
Generalized Forward-Backward Splitting with Penalization for Monotone Inclusion Problems
Tumor Classification and Segmentation of MR Brain Images
An Innovations Approach to Viterbi Decoding of Convolutional Codes
Deep Forward and Inverse Perceptual Models for Tracking and Prediction
Rate-optimal Meta Learning of Classification Error
Emergence and Relevance of Criticality in Deep Learning
Gaussian Approximation of the Distribution of Strongly Repelling Particles on the Unit Circle
A quenched variational principle for discrete random maps
Improving Social Media Text Summarization by Learning Sentence Weight Distribution
Shallow Discourse Parsing with Maximum Entropy Model
Coarse-Graining Open Markov Processes
A Sequential Matching Framework for Multi-turn Response Selection in Retrieval-based Chatbots
Mildly context sensitive grammar induction and variational bayesian inference
ChainerMN: Scalable Distributed Deep Learning Framework
Variations of the cop and robber game on graphs
Spatio-temporal interaction model for crowd video analysis
Image Patch Matching Using Convolutional Descriptors with Euclidean Distance
Capacity-Achieving PIR Schemes with Optimal Sub-Packetization
Intermittent quasistatic dynamical systems: weak convergence of fluctuations
A Computer Vision System to Localize and Classify Wastes on the Streets
Latent Space Oddity: on the Curvature of Deep Generative Models
Semantic Interpolation in Implicit Models
Flexible Prior Distributions for Deep Generative Models
Parametrizing filters of a CNN with a GAN
Updating the VESICLE-CNN Synapse Detector
Boolean convolutions and regular variation
Reshaping Cellular Networks for the Sky: The Major Factors and Feasibility
Continuum percolation for Cox point processes
A Scaled Smart City for Experimental Validation of Connected and Automated Vehicles
Improved Bounds for Online Dominating Sets of Trees
Two extensions of the Erdős–Szekeres problem
TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning
Joint Cooperative Computation and Interactive Communication for Relay-Assisted Mobile Edge Computing
Reconnecting statistical physics and combinatorics beyond ensemble equivalence
Regret Minimization for Partially Observable Deep Reinforcement Learning
SVSGAN: Singing Voice Separation via Generative Adversarial Network
Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling
Stochastic Maximum Principle under Probability Distortion
Learning Neural Representations of Human Cognition across Many fMRI Studies
Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization
Deep Hashing with Triplet Quantization Loss
Clothing Retrieval with Visual Attention Model
Breaking the Interference Barrier in Dense Wireless Networks with Interference Alignment
Marginal false discovery rates for penalized likelihood methods
Investigating the effect of social groups in uni-directional pedestrian flow
Manipulation Strategies for the Rank Maximal Matching Problem
Guarding Against Adversarial Domain Shifts with Counterfactual Regularization
Immersion of transitive tournaments in digraphs with large minimum outdegree
Optimal Control of Endo-Atmospheric Launch Vehicle Systems: Geometric and Computational Issues
Asymptotically Distribution-Free Goodness-of-Fit Testing for Copulas
A multi-layer network based on Sparse Ternary Codes for universal vector compression
Designing RNA Secondary Structures is Hard
On the List-Decodability of Random Linear Rank-Metric Codes
Energy Efficiency of Multi-user Multi-antenna Random Cellular Networks with Minimum Distance Constraints
Extracting Syntactic Patterns from Databases
A 4D-Var Method with Flow-Dependent Background Covariances for the Shallow-Water Equations
Parameter Estimation in Mean Reversion Processes with Periodic Functional Tendency
Compact Multi-Class Boosted Trees
Modelo de Tratamiento para Tumores en Presencia de Radiación (Treatment Model for Tumors in the Presence of Radiation)
Discussion of ‘Data-driven confounder selection via Markov and Bayesian networks’ by Jenny Häggström
Deep Learning as a Mixed Convex-Combinatorial Optimization Problem
Bypass rewiring and extreme robustness of Eulerian networks
Learning Graph Convolution Filters from Data Manifold
Energy-Aware Virtual Network Embedding Approach for Distributed Cloud
On Learning Mixtures of Well-Separated Gaussians
Universal Constraints on the Location of Extrema of Eigenfunctions of Non-Local Schrödinger Operators
Multiple Instance Hybrid Estimator for Hyperspectral Target Characterization and Sub-pixel Target Detection
Whodunnit? Crime Drama as a Case for Natural Language Understanding
Lower Bounds for Finding Stationary Points I
Quasisymmetric Power Sums
Space-filling design for nonlinear models
Cellular-Enabled UAV Communication: Trajectory Optimization Under Connectivity Constraint
Delocalization of Polymers in Lower Tail Large Deviation
