The key issue of few-shot learning is learning to generalize. In this paper, we propose a large margin principle to improve the generalization capacity of metric based methods for few-shot learning. To realize it, we develop a unified framework to learn a more discriminative metric space by augmenting the softmax classification loss function with a large margin distance loss function for training. Extensive experiments on two state-of-the-art few-shot learning models, graph neural networks and prototypical networks, show that our method can improve the performance of existing models substantially with very little computational overhead, demonstrating the effectiveness of the large margin principle and the potential of our method.
Pairwise network models such as the Gaussian Graphical Model (GGM) are a powerful and intuitive way to analyze dependencies in multivariate data. A key assumption of the GGM is that each pairwise interaction is independent of the values of all other variables. However, in psychological research this is often implausible. In this paper, we extend the GGM by allowing each pairwise interaction between two variables to be moderated by (a subset of) all other variables in the model, and thereby introduce a Moderated Network Model (MNM). We show how to construct the MNW and propose an L1-regularized nodewise regression approach to estimate it. We provide performance results in a simulation study and show that MNMs outperform the split-sample based methods Network Comparison Test (NCT) and Fused Graphical Lasso (FGL) in detecting moderation effects. Finally, we provide a fully reproducible tutorial on how to estimate MNMs with the R-package mgm and discuss possible issues with model misspecification.
Neural network decoding algorithms are recently introduced by Nachmani et al. to decode high-density parity-check (HDPC) codes. In contrast with iterative decoding algorithms such as sum-product or min-sum algorithms in which the weight of each edge is set to $1$, in the neural network decoding algorithms, the weight of every edge depends on its impact in the transmitted codeword. In this paper, we provide a novel \emph{feed-forward neural network lattice decoding algorithm} suitable to decode lattices constructed based on Construction A, whose underlying codes have HDPC matrices. We first establish the concept of feed-forward neural network for HDPC codes and improve their decoding algorithms compared to Nachmani et al. We then apply our proposed decoder for a Construction A lattice with HDPC underlying code, for which the well-known iterative decoding algorithms show poor performances. The main advantage of our proposed algorithm is that instead of assigning and training weights for all edges, which turns out to be time-consuming especially for high-density parity-check matrices, we concentrate on edges which are present in most of $4$-cycles and removing them gives a girth-$6$ Tanner graph. This approach, by slight modifications using updated LLRs instead of initial ones, simultaneously accelerates the training process and improves the error performance of our proposed decoding algorithm.
We address the problem of domain generalization where a decision function is learned from the data of several related domains, and the goal is to apply it on an unseen domain successfully. It is assumed that there is plenty of labeled data available in source domains (also called as training domain), but no labeled data is available for the unseen domain (also called a target domain or test domain). We propose a novel neural network architecture, Domain2Vec (D2V) that learns domain-specific embedding and then uses this embedding to generalize the learning across related domains. The proposed algorithm, D2V extends the idea of distribution regression and kernelized domain generalization to the neural networks setting. We propose a neural network architecture to learn domain-specific embedding and then use this embedding along with the data point specific features to label it. We show the effectiveness of the architecture by accurately estimating domain to domain similarity. We evaluate our algorithm against standard domain generalization datasets for image classification and outperform other state of the art algorithms.
In this study, we present a novel ranking model based on learning the nearest neighbor relationships embedded in the index space. Given a query point, a conventional nearest neighbor search approach calculates the distances to the cluster centroids, before ranking the clusters from near to far based on the distances. The data indexed in the top-ranked clusters are retrieved and treated as the nearest neighbor candidates for the query. However, the loss of quantization between the data and cluster centroids will inevitably harm the search accuracy. To address this problem, the proposed model ranks clusters based on their nearest neighbor probabilities rather than the query-centroid distances to the query. The nearest neighbor probabilities are estimated by employing neural networks to characterize the neighborhood relationships as a nonlinear function, i.e., the density distribution of nearest neighbors with respect to the query. The proposed probability-based ranking model can replace the conventional distance-based ranking model as a coarse filter for candidate clusters, and the nearest neighbor probability can be used to determine the data quantity to be retrieved from the candidate cluster. Our experimental results demonstrated that implementation of the proposed ranking model for two state-of-the-art nearest neighbor quantization and search methods could boost the search performance effectively in billion-scale datasets.