Statistical Inference on Panel Data Models: A Kernel Ridge Regression Method

We propose statistical inferential procedures for panel data models with interactive fixed effects in a kernel ridge regression framework. Compared with traditional sieve methods, our method is automatic in the sense that it does not require the choice of basis functions and truncation parameters. Model complexity is controlled by a continuous regularization parameter which can be automatically selected by generalized cross validation. Based on empirical processes theory and functional analysis tools, we derive joint asymptotic distributions for the estimators in the heterogeneous setting. These joint asymptotic results are then used to construct confidence intervals for the regression means and prediction intervals for the future observations, both being the first provably valid intervals in the literature. Marginal asymptotic normality of the functional estimators in the homogeneous setting is also obtained. Simulation and real data analysis demonstrate the advantages of our method.
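For orientation, the generic (single-equation) kernel ridge regression estimator and the generalized cross validation criterion can be written as below. This is a textbook sketch, not the paper's panel estimator with interactive fixed effects; the symbols \mathcal{H}, \mathbf{K}, A(\lambda) and the 1/n scaling of the loss are assumptions made here for illustration.

```latex
\hat f_\lambda = \arg\min_{f \in \mathcal{H}} \; \frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 + \lambda \|f\|_{\mathcal{H}}^2,
\qquad
\hat f_\lambda(x) = \sum_{i=1}^{n} \hat\alpha_i K(x, x_i),
\quad
\hat\alpha = (\mathbf{K} + n\lambda I)^{-1} y,

\mathrm{GCV}(\lambda) = \frac{n^{-1}\,\bigl\| (I - A(\lambda))\, y \bigr\|^2}{\bigl[\, n^{-1} \operatorname{tr}\bigl(I - A(\lambda)\bigr) \bigr]^2},
\qquad
A(\lambda) = \mathbf{K}(\mathbf{K} + n\lambda I)^{-1}.
```

Here \mathbf{K} is the n x n kernel matrix with entries K(x_i, x_j), and \lambda would be chosen to minimize GCV(\lambda) over a grid, which is what makes the tuning automatic.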

Deep Learning applied to NLP

Convolutional Neural Networks (CNNs) are typically associated with Computer Vision. CNNs are responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today. More recently, CNNs have been applied to problems in Natural Language Processing and have produced some interesting results. In this paper, we will try to explain the basics of CNNs, their different variations and how they have been applied to NLP.

Dynamic Intention-Aware Recommendation System

Recommender systems have been actively and extensively studied over the past decades. Meanwhile, the boom of Big Data is driving fundamental changes in the development of recommender systems. In this paper, we propose a dynamic intention-aware recommender system to better help users find desirable products and services. Compared to prior work, our proposal possesses the following advantages: (1) it takes user intentions and demands into account through intention mining techniques. By unearthing user intentions from the historical user-item interactions, and various user digital traces harvested from social media and the Internet of Things, it is capable of delivering more satisfactory recommendations by leveraging rich online and offline user data; (2) it embraces the benefits of embedding heterogeneous source information and shared representations of multiple domains to provide accurate and effective recommendations comprehensively; (3) it recommends products or services proactively and in a timely manner by capturing the dynamic influences, which can significantly reduce user involvement and effort.

Perturbation Bootstrap in Adaptive Lasso

The Adaptive LASSO (ALASSO) was proposed by Zou [J. Amer. Statist. Assoc. 101 (2006) 1418-1429] as a modification of the LASSO for the purpose of simultaneous variable selection and estimation of the parameters in a linear regression model. Zou (2006) established that the ALASSO estimator is variable-selection consistent as well as asymptotically Normal in the indices corresponding to the nonzero regression coefficients in certain fixed-dimensional settings. In an influential paper, Minnier, Tian and Cai [J. Amer. Statist. Assoc. 106 (2011) 1371-1382] proposed a perturbation bootstrap method and established its distributional consistency for the ALASSO estimator in the fixed-dimensional setting. In this paper, however, we show that this (naive) perturbation bootstrap fails to achieve second order correctness in approximating the distribution of the ALASSO estimator. We propose a modification to the perturbation bootstrap objective function and show that a suitably studentized version of our modified perturbation bootstrap ALASSO estimator achieves second-order correctness even when the dimension of the model is allowed to grow to infinity with the sample size. As a consequence, inferences based on the modified perturbation bootstrap will be more accurate than the inferences based on the oracle Normal approximation. We give simulation studies demonstrating good finite-sample properties of our modified perturbation bootstrap method as well as an illustration of our method on a real data set.


This text is a survey on cross-validation. We define all classical cross-validation procedures, and we study their properties for two different goals: estimating the risk of a given estimator, and selecting the best estimator among a given family. For the risk estimation problem, we compute the bias (which can also be corrected) and the variance of cross-validation methods. For estimator selection, we first provide a first-order analysis (based on expectations). Then, we explain how to take into account second-order terms (from variance computations, and by taking into account the usefulness of overpenalization). This allows us, in the end, to provide some guidelines for choosing the best cross-validation method for a given learning problem.
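The two goals named above (risk estimation and estimator selection) can be illustrated with a minimal K-fold sketch. It is not taken from the survey: the function names, the two toy estimators (`fit_mean`, `fit_line`) and the squared loss are all assumptions chosen for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_risk(xs, ys, fit, loss, k=5, seed=0):
    """K-fold estimate of the risk of the estimator `fit` (goal 1)."""
    fold_risks = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        predictor = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [loss(predictor(xs[i]), ys[i]) for i in held_out]
        fold_risks.append(sum(errors) / len(errors))
    return sum(fold_risks) / k


def select_by_cv(xs, ys, candidates, loss, k=5, seed=0):
    """Pick the candidate with the smallest cross-validated risk (goal 2)."""
    scores = {name: cv_risk(xs, ys, fit, loss, k, seed)
              for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


def squared_loss(prediction, target):
    return (prediction - target) ** 2


def fit_mean(xs, ys):
    """Toy estimator: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Toy estimator: ordinary least squares with one covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


if __name__ == "__main__":
    xs = [i / 10 for i in range(40)]
    ys = [2 * x + 1 for x in xs]
    print(select_by_cv(xs, ys, {"mean": fit_mean, "line": fit_line}, squared_loss))
```

The same folds are reused for every candidate (shared `seed`), so the comparison between estimators is not confounded by different splits.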

Robustness in Highly Dynamic Networks

We investigate a special case of hereditary property that we refer to as {\em robustness}. A property is {\em robust} in a given graph if it is inherited by all connected spanning subgraphs of this graph. We motivate this definition in different contexts, showing that it plays a central role in highly dynamic networks, although the problem is defined in terms of classical (static) graph theory. In this paper, we focus on the robustness of {\em maximal independent sets} (MIS). Following the above definition, a MIS is said to be {\em robust} (RMIS) if it remains a valid MIS in all connected spanning subgraphs of the original graph. We characterize the class of graphs in which {\em all} possible MISs are robust. We show that, in these particular graphs, the problem of finding a robust MIS is {\em local}; that is, we present an RMIS algorithm using only a sublogarithmic number of rounds (in the number of nodes n) in the {\cal LOCAL} model. On the negative side, we show that, in general graphs, the problem is not local. Precisely, we prove an \Omega(n) lower bound on the number of rounds required for the nodes to decide consistently in some graphs. This result implies a separation between the RMIS problem and the MIS problem in general graphs. It also implies that any strategy in this case is asymptotically (in order) as bad as collecting all the network information at one node and solving the problem in a centralized manner. Motivated by this observation, we present a centralized algorithm that computes a robust MIS in a given graph, if one exists, and rejects otherwise. Significantly, this algorithm requires only a polynomial amount of local computation time, despite the fact that exponentially many MISs and exponentially many connected spanning subgraphs may exist.
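The definition of a robust MIS can be checked directly on very small graphs. The sketch below is a brute-force test that enumerates every connected spanning subgraph, so it takes exponential time and is not the polynomial-time algorithm of the paper; the function names are hypothetical and the input graph is assumed to be connected.

```python
from itertools import combinations


def adjacency(nodes, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_connected(nodes, edges):
    """Depth-first search reachability test on the subgraph (nodes, edges)."""
    adj = adjacency(nodes, edges)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == len(nodes)


def is_mis(nodes, edges, candidate):
    """True if `candidate` is independent and every other node has a neighbour in it."""
    adj = adjacency(nodes, edges)
    independent = all(not (adj[u] & candidate) for u in candidate)
    maximal = all(adj[v] & candidate for v in nodes if v not in candidate)
    return independent and maximal


def is_robust_mis(nodes, edges, candidate):
    """Check that `candidate` is a MIS in every connected spanning subgraph."""
    candidate = set(candidate)
    edges = list(edges)
    # A connected spanning subgraph needs at least len(nodes) - 1 edges.
    for size in range(len(nodes) - 1, len(edges) + 1):
        for subset in combinations(edges, size):
            if is_connected(nodes, subset) and not is_mis(nodes, subset, candidate):
                return False
    return True


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    square = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(is_robust_mis([0, 1, 2], triangle, {0}))
    print(is_robust_mis([0, 1, 2, 3], square, {0, 2}))
```

In the triangle, removing the edge between node 0 and node 1 leaves node 1 without a neighbour in {0}, so that set stops being maximal; in the 4-cycle, {0, 2} still dominates both remaining nodes after any single edge removal.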

Compressed Sensing using Generative Models

The goal of compressed sensing is to estimate a vector from an underdetermined system of noisy linear measurements, by making use of prior knowledge on the structure of vectors in the relevant domain. For almost all results in this literature, the structure is represented by sparsity in a well-chosen basis. We show how to achieve guarantees similar to standard compressed sensing but without employing sparsity at all. Instead, we suppose that vectors lie near the range of a generative model G: \mathbb{R}^k \to \mathbb{R}^n. Our main theorem is that, if G is L-Lipschitz, then roughly O(k \log L) random Gaussian measurements suffice for an \ell_2/\ell_2 recovery guarantee. We demonstrate our results using generative models from published variational autoencoder and generative adversarial networks. Our method can use 5-10x fewer measurements than Lasso for the same accuracy.

Anomaly Detection and Redundancy Elimination of Big Sensor Data in Internet of Things

In the era of big data and the Internet of Things, massive amounts of sensor data are gathered through the Internet of Things. The data captured by sensor networks are considered to contain highly useful and valuable information. However, for a variety of reasons, received sensor data often appear abnormal. Therefore, effective anomaly detection methods are required to guarantee the quality of data collected by those sensor nodes. Since sensor data are usually correlated in time and space, not all the gathered data are valuable for further data processing and analysis. Preprocessing is necessary for eliminating the redundancy in the gathered massive sensor data. This paper defines a sensor data preprocessing framework. It is mainly composed of two parts, i.e., sensor data anomaly detection and sensor data redundancy elimination. In the first part, methods based on principal statistic analysis and Bayesian networks are proposed for sensor data anomaly detection. Then, approaches based on static Bayesian networks (SBNs) and dynamic Bayesian networks (DBNs) are proposed for sensor data redundancy elimination. A static sensor data redundancy detection algorithm (SSDRDA) for eliminating redundant data in static datasets and a real-time sensor data redundancy detection algorithm (RSDRDA) for eliminating redundant sensor data in real time are proposed. The efficiency and effectiveness of the proposed methods are validated using real-world gathered sensor datasets.

Boosted KZ and LLL Algorithms

There exist two issues among popular lattice reduction (LR) algorithms that should cause concern. The first is that the Korkine-Zolotarev (KZ) and Lenstra-Lenstra-Lovász (LLL) algorithms may increase the lengths of basis vectors. The other is that KZ reduction suffers much worse performance than Minkowski reduction in terms of providing short basis vectors, despite its superior theoretical upper bounds. To address these limitations, we improve the size reduction steps in KZ and LLL to set up two new efficient algorithms, referred to as boosted KZ and LLL, for solving the shortest basis problem (SBP) with exponential and polynomial complexity, respectively. Both of them offer better actual performance than their classic counterparts, and the performance bounds for KZ are also improved. We apply them to designing integer-forcing (IF) linear receivers for multi-input multi-output (MIMO) communications. Our simulations confirm their rate and complexity advantages.

Fast Genetic Algorithms

For genetic algorithms using a bit-string representation of length n, the general recommendation is to take 1/n as mutation rate. In this work, we discuss whether this is really justified for multimodal functions. Taking jump functions and the (1+1) evolutionary algorithm as the simplest example, we observe that larger mutation rates give significantly better runtimes. For the \mathrm{Jump}_{m,n} function, any mutation rate between 2/n and m \ln(m/2) / n leads to a speed-up at least exponential in m compared to the standard choice. The asymptotically best runtime, obtained from using the mutation rate m/n and leading to a speed-up super-exponential in m, is very sensitive to small changes of the mutation rate. Any deviation by a small (1 \pm \epsilon) factor leads to a slow-down exponential in m. Consequently, any fixed mutation rate gives strongly sub-optimal results for most jump functions. Building on this observation, we propose to use a random mutation rate \alpha/n, where \alpha is chosen from a power-law distribution. We prove that the (1+1) EA with this heavy-tailed mutation rate optimizes any \mathrm{Jump}_{m,n} function in a time that is only a small polynomial (in m) factor above the one stemming from the optimal rate for this m. Our heavy-tailed mutation operator yields similar speed-ups (over the best known performance guarantees) for the vertex cover problem in bipartite graphs and the matching problem in general graphs. Following the example of fast simulated annealing, fast evolution strategies, and fast evolutionary programming, we propose to call genetic algorithms using a heavy-tailed mutation operator \emph{fast genetic algorithms}.
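A minimal sketch of the heavy-tailed mutation idea follows. The abstract only says that \alpha is drawn from a power-law distribution; the support {1, ..., n/2}, the exponent `beta = 1.5`, the function names and the standard definition of the jump function used here are assumptions made for illustration.

```python
import random


def sample_power_law(upper, beta, rng):
    """Draw alpha in {1, ..., upper} with P(alpha = i) proportional to i**(-beta)."""
    support = range(1, upper + 1)
    weights = [i ** (-beta) for i in support]
    return rng.choices(support, weights=weights, k=1)[0]


def heavy_tailed_mutation(bits, beta=1.5, rng=random):
    """Flip each bit independently with rate alpha/n, alpha redrawn on every call."""
    n = len(bits)
    alpha = sample_power_law(max(1, n // 2), beta, rng)
    rate = alpha / n
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def jump(bits, m):
    """Jump function: OneMax plus m, except for a gap of width m below the optimum."""
    n, ones = len(bits), sum(bits)
    if ones <= n - m or ones == n:
        return m + ones
    return n - ones


def one_plus_one_ea(n, m, beta=1.5, max_iters=200_000, seed=0):
    """(1+1) EA with heavy-tailed mutation; returns the iteration count or None."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    for t in range(max_iters):
        if sum(x) == n:
            return t
        y = heavy_tailed_mutation(x, beta, rng)
        if jump(y, m) >= jump(x, m):  # keep the offspring if it is not worse
            x = y
    return None


if __name__ == "__main__":
    print(one_plus_one_ea(n=20, m=2))
```

Because \alpha is redrawn for every offspring, the operator occasionally uses a rate close to m/n without having to know m in advance, which is the intuition behind avoiding a single fixed rate.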

Compressibility and probabilistic proofs

We consider several examples of probabilistic existence proofs using compressibility arguments, including some results that involve the Lovász local lemma.

A Note on Bayesian Model Selection for Discrete Data Using Proper Scoring Rules

We consider the problem of choosing between parametric models for a discrete observable, taking a Bayesian approach in which the within-model prior distributions are allowed to be improper. In order to avoid the ambiguity in the marginal likelihood function in such a case, we apply a homogeneous scoring rule. For the particular case of distinguishing between Poisson and Negative Binomial models, we conduct simulations that indicate that, applied prequentially, the method will consistently select the true model.

Learning Active Learning from Real and Synthetic Data

In this paper, we suggest a novel data-driven approach to active learning: Learning Active Learning (LAL). The key idea behind LAL is to train a regressor that predicts the expected error reduction for a potential sample in a particular learning state. By treating the query selection procedure as a regression problem we are not restricted to dealing with existing AL heuristics; instead, we learn strategies based on experience from previous active learning experiments. We show that LAL can be learnt from a simple artificial 2D dataset and yields strategies that work well on real data from a wide range of domains. Moreover, if some domain-specific samples are available to bootstrap active learning, the LAL strategy can be tailored for a particular problem.

Loyalty in Online Communities

Loyalty is an essential component of multi-community engagement. When users have the choice to engage with a variety of different communities, they often become loyal to just one, focusing on that community at the expense of others. However, it is unclear how loyalty is manifested in user behavior, or whether loyalty is encouraged by certain community characteristics. In this paper we operationalize loyalty as a user-community relation: users loyal to a community consistently prefer it over all others; loyal communities retain their loyal users over time. By exploring this relation using a large dataset of discussion communities from Reddit, we reveal that loyalty is manifested in remarkably consistent behaviors across a wide spectrum of communities. Loyal users employ language that signals collective identity and engage with more esoteric, less popular content, indicating they may play a curational role in surfacing new material. Loyal communities have denser user-user interaction networks and lower rates of triadic closure, suggesting that community-level loyalty is associated with more cohesive interactions and less fragmentation into subgroups. We exploit these general patterns to predict future rates of loyalty. Our results show that a user’s propensity to become loyal is apparent from their first interactions with a community, suggesting that some users are intrinsically loyal from the very beginning.

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on a few-shot image classification benchmark, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.
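The training scheme described above (an inner gradient step on a task, then an outer update through that step) can be sketched on a toy problem. Everything here is an assumption for illustration: the model is a single scalar weight w predicting y = w * x, the tasks are lines y = a * x with slope a drawn from [1, 3], and the derivative through the inner step is written analytically rather than by automatic differentiation.

```python
import random


def mse_grad_hess(w, xs, ys):
    """Loss, gradient and second derivative of mean((w*x - y)^2) with respect to w."""
    n = len(xs)
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    hess = sum(2 * x * x for x in xs) / n
    return loss, grad, hess


def adapt(w, xs, ys, inner_lr):
    """One inner gradient step on a task's support set."""
    _, grad, _ = mse_grad_hess(w, xs, ys)
    return w - inner_lr * grad


def sample_task(rng, shots):
    """Hypothetical task family: y = a * x with slope a uniform on [1, 3]."""
    a = rng.uniform(1.0, 3.0)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(shots)]
        return xs, [a * x for x in xs]

    return draw(), draw()  # (support set, query set)


def maml_train(meta_steps=2000, inner_lr=0.1, outer_lr=0.01, shots=5, seed=0):
    """Meta-train the initial weight so that one inner step adapts well."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(meta_steps):
        (xs_s, ys_s), (xs_q, ys_q) = sample_task(rng, shots)
        _, _, hess_s = mse_grad_hess(w, xs_s, ys_s)
        w_adapted = adapt(w, xs_s, ys_s, inner_lr)
        _, grad_q, _ = mse_grad_hess(w_adapted, xs_q, ys_q)
        # Chain rule through the inner step: d(w_adapted)/dw = 1 - inner_lr * hess_s.
        meta_grad = grad_q * (1.0 - inner_lr * hess_s)
        w -= outer_lr * meta_grad
    return w


if __name__ == "__main__":
    w0 = maml_train()
    print("meta-learned initial weight:", w0)
```

The factor `1.0 - inner_lr * hess_s` is the second-order term that comes from differentiating through the inner update; dropping it would give a first-order approximation instead.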

A Manifold Approach to Learning Mutually Orthogonal Subspaces

A note on quickly sampling a sparse matrix with low rank expectation

New approximation for GARCH parameters estimate

Spectral Graph Convolutions on Population Graphs for Disease Prediction

A dichotomy theorem for nonuniform CSPs

A New Capture-Recapture Model in Dual-record System

Elicitation, measuring bias, checking for prior-data conflict and inference with a Dirichlet prior

Lot sizing problem integrated into cutting stock problem in a paper industry: a multiobjective approach

Beamspace Aware Adaptive Channel Estimation for Single-Carrier Time-varying Massive MIMO Channels

Moderate deviations for the Langevin equation with strong damping

Parallel Implementation of Efficient Search Schemes for the Inference of Cancer Progression Models

Combining Bayesian Approaches and Evolutionary Techniques for the Inference of Breast Cancer Networks

Bootstrap with Clustering in Two or More Dimensions

A GAMP Based Low Complexity Sparse Bayesian Learning Algorithm

Quickest Visibility Queries in Polygonal Domains

Matrix Minor Reformulation and SOCP-based Spatial Branch-and-Cut Method for the AC Optimal Power Flow Problem

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

Interpretable Structure-Evolving LSTM

The hierarchical Cannings process in random environment

Upper semismooth functions and the subdifferential determination property

Bi-Boolean independence for pairs of algebras

Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations

Learning the Probabilistic Structure of Cumulative Phenomena with Suppes-Bayes Causal Networks

Efficient Simulation of Financial Stress Testing Scenarios with Suppes-Bayes Causal Networks

Typical structure of oriented graphs and digraphs with forbidden blow-up transitive triangle

A major-index preserving map on fillings

Fitting the Linear Preferential Attachment Model

Information Extraction in Illicit Domains

DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks

Predictive and Recommendatory Spectrum Decision for Cognitive Radio

Image Classification of Melanoma, Nevus and Seborrheic Keratosis by Deep Neural Network Ensemble

Long quasi-polycyclic $t-$CIS codes

Statistical Cost Sharing

Performance of Proportional Fair Scheduling for Downlink Non-Orthogonal Multiple Access Systems

On linear complementary-dual multinegacirculant codes

Coordinated Multi-Agent Imitation Learning

Calibrated Data Augmentation for Scalable Markov Chain Monte Carlo

DeepSD: Generating High Resolution Climate Change Projections through Single Image Super-Resolution

Learning to Remember Rare Events

A Structured Self-attentive Sentence Embedding

Technical Report: Outage Performance of Full-Duplex MIMO DF Relaying using Zero-Forcing Beamforming

Characterisation of Optimal Responses to Pulse Inputs in the Bergman Minimal Model

Approaching Channel Capacity without Error Correction Coding through Nonlinear Transformation of OFDM Signals

Juggling Functions Inside a Database

On Hamilton Cycle Decompositions of Tensor Products of Graphs

Detecting Sockpuppets in Deceptive Opinion Spam

A Normalization Model for Analyzing Multi-Tier Millimeter Wave Cellular Networks

New approaches to coding information using inverse scattering transform

Conic relaxation approaches for equal deployment problems

Face-to-BMI: Using Computer Vision to Infer Body Mass Index on Social Media

A note on permutation polynomials over finite fields

Behavior-based Navigation of Mobile Robot in Unknown Environments Using Fuzzy Logic and Multi-Objective Optimization

A Generalized Zero-Forcing Precoder with Successive Dirty-Paper Coding in MISO Broadcast Channels

Stationary solutions to the compressible Navier-Stokes system driven by stochastic forces

AG codes and AG quantum codes from the GGS curve

Achievable Rate Region of Non-Orthogonal Multiple Access Systems with Wireless Powered Decoder

Segmenting Dermoscopic Images

Embedding Tarskian Semantics in Vector Spaces

Prior-based Hierarchical Segmentation Highlighting Structures of Interest

Turkish PoS Tagging by Reducing Sparsity with Morpheme Tags in Small Datasets

Kernel intensity estimation, bootstrapping and bandwidth selection for inhomogeneous point processes depending on spatial covariates

Robust Density Ratio Estimation: Trimming the Likelihood Ratio

Maximum Likelihood Decoder for Index Coded PSK Modulation for Priority Ordered Receivers

WebCaricature: a benchmark for caricature face recognition

Modeling the Ellsberg Paradox by Argument Strength

Fuzzy Authentication using Rank Distance

Fractional compound Poisson processes with multiple internal states

Reflected stochastic differential equations driven by $G$-Brownian motion in non-convex domains

The equivalence between the categories of Giry-algebras and convex spaces

On minimal additive complements of integers

Abductive, Causal, and Counterfactual Conditionals Under Incomplete Probabilistic Knowledge

Counterfactuals, indicative conditionals, and negation under uncertainty: Are there cross-cultural differences?

Does Nash Envy Immunity

Lower Bounds on Nonnegative Signed Domination Parameters in Graphs

Heteroclinic switching between chimeras

Confidence intervals in high-dimensional regressions based on regularized pseudoinverses

Preorder Construct on Simple Undirected Graphs

On low rank-width colorings

End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks

Conditional expanding bounds for two-variable functions over arbitrary fields

Split Sample Empirical Likelihood

On a Class of Polynomials Generated by F(xt - R(t))

Self-Stabilizing Disconnected Components Detection and Rooted Shortest-Path Tree Maintenance in Polynomial Steps

Robust Control Policies given Formal Specifications in Uncertain Environments

Independence-Domination duality in weighted graphs

UntrimmedNets for Weakly Supervised Action Recognition and Detection

Relay Pair Selection with Source Precoding in Buffer-Aided Successive Opportunistic Relaying

Adaptive Non-uniform Compressive Sampling for Time-varying Signals

A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation

Fast and Robust Detection of Fallen People from a Mobile Robot

A log-linear time algorithm for constrained changepoint detection

Laplacian, on the graph of the Weierstrass function

LesionSeg: Semantic segmentation of skin lesions using Deep Convolutional Neural Network

mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions

Visual-Interactive Similarity Search for Complex Objects by Example of Soccer Player Analysis

Faster Greedy MAP Inference for Determinantal Point Processes