Detecting non-causal artifacts in multivariate linear regression models

We consider linear models where d potential causes X_1,...,X_d are correlated with one target quantity Y and propose a method to infer whether the association is causal or whether it is an artifact caused by overfitting or hidden common causes. We employ the idea that in the former case the vector of regression coefficients has a ‘generic’ orientation relative to the covariance matrix \Sigma_{XX} of X. Using an ICA-based model for confounding, we show that both confounding and overfitting yield regression vectors that concentrate mainly in the space of low eigenvalues of \Sigma_{XX}.
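
The statement about regression vectors concentrating in the low-eigenvalue space of \Sigma_{XX} can be made concrete with a small diagnostic. The following is a minimal sketch, not the paper's test: it fits ordinary least squares and reports how the squared norm of the coefficient vector is distributed over the eigenvectors of the sample covariance. The function name `spectral_weights` and the use of plain OLS are assumptions made for illustration.

```python
import numpy as np

def spectral_weights(x, y):
    """Distribute the squared norm of the OLS coefficient vector over the
    eigenvectors of the sample covariance of x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean(axis=0)                      # center the predictors
    y = y - y.mean()                            # center the target
    cov = x.T @ x / len(x)                      # sample covariance Sigma_XX
    beta = np.linalg.lstsq(x, y, rcond=None)[0] # OLS regression vector
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    proj = eigvecs.T @ beta                     # coordinates of beta in the eigenbasis
    weights = proj ** 2 / (beta @ beta)         # fractions of ||beta||^2, summing to 1
    return eigvals, weights
```

Under these assumptions, a large share of weight on the smallest eigenvalues would be the pattern the abstract associates with confounding or overfitting; what counts as "large" is left open here.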

On Statistical Non-Significance

Significance tests are probably the most widespread form of inference in empirical research, and significance is often interpreted as providing greater informational content than non-significance. In this article we show, however, that rejection of a point null often carries very little information, while failure to reject may be highly informative. This is particularly true in empirical contexts where data sets are large and where there are rarely reasons to put substantial prior probability on a point null. Our results challenge the usual practice of conferring on point null rejections a higher level of scientific significance than on non-rejections. In consequence, we advocate visible reporting and discussion of non-significant results in empirical practice.

Static and Dynamic Robust PCA via Low-Rank + Sparse Matrix Decomposition: A Review

Principal Components Analysis (PCA) is one of the most widely used dimension reduction techniques. Robust PCA (RPCA) refers to the problem of PCA when the data may be corrupted by outliers. Recent work by Candes, Wright, Li, and Ma defined RPCA as a problem of decomposing a given data matrix into the sum of a low-rank matrix (true data) and a sparse matrix (outliers). The column space of the low-rank matrix then gives the PCA solution. This simple definition has led to a large amount of interesting new work on provably correct, fast, and practically useful solutions to the RPCA problem. More recently, the dynamic (time-varying) version of the RPCA problem has been studied and a series of provably correct, fast, and memory-efficient tracking solutions have been proposed. Dynamic RPCA (or robust subspace tracking) is the problem of tracking data lying in a (slowly) changing subspace while being robust to sparse outliers. This article provides an exhaustive review of the last decade of literature on RPCA and its dynamic counterpart (robust subspace tracking), describes their theoretical guarantees, discusses the pros and cons of various approaches, and provides empirical comparisons of performance and speed.
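
The low-rank plus sparse decomposition described above can be sketched with a standard convex approach: minimize the nuclear norm of the low-rank part plus a weighted l1 norm of the sparse part, solved by alternating singular value thresholding and entrywise soft thresholding. The code below is a minimal illustration of that idea, not one of the specific solvers the review compares; the function names and the default choices of `lam` and `mu` are assumptions based on commonly used heuristics.

```python
import numpy as np

def soft_threshold(x, tau):
    """Entrywise shrinkage toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (proximal map of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt

def robust_pca(m, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split m into low + sparse with an augmented Lagrangian scheme.
    lam and mu defaults are heuristic assumptions, not tuned values."""
    m = np.asarray(m, dtype=float)
    n1, n2 = m.shape
    lam = 1.0 / np.sqrt(max(n1, n2)) if lam is None else lam
    mu = n1 * n2 / (4.0 * np.abs(m).sum()) if mu is None else mu
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(n_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        residual = m - low - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * np.linalg.norm(m):
            break
    return low, sparse
```

Calling `robust_pca(m)` on a data matrix returns the two components; the column space of `low` then plays the role of the PCA solution mentioned in the abstract.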

Evolutionary Generative Adversarial Networks

Generative adversarial networks (GANs) have been effective for learning generative models for real-world data. However, existing GANs (GAN and its variants) tend to suffer from training problems such as instability and mode collapse. In this paper, we propose a novel GAN framework called evolutionary generative adversarial networks (E-GAN) for stable GAN training and improved generative performance. Unlike existing GANs, which alternately train a generator and a discriminator with a pre-defined adversarial objective function, we utilize different adversarial training objectives as mutation operations and evolve a population of generators to adapt to the environment (i.e., the discriminator). We also utilize an evaluation mechanism to measure the quality and diversity of generated samples, such that only well-performing generator(s) are preserved and used for further training. In this way, E-GAN overcomes the limitations of an individual adversarial training objective and always preserves the best offspring, contributing to the progress and success of GANs. Experiments on several datasets demonstrate that E-GAN achieves convincing generative performance and reduces the training problems inherent in existing GANs.

Meta-Learning for Semi-Supervised Few-Shot Classification

In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its corresponding test set. In this work, we advance this few-shot classification paradigm towards a scenario where unlabeled examples are also available within each episode. We consider two situations: one where all unlabeled examples are assumed to belong to the same set of classes as the labeled examples of the episode, and a more challenging one where examples from other distractor classes are also provided. To address this paradigm, we propose novel extensions of Prototypical Networks (Snell et al., 2017) that are augmented with the ability to use unlabeled examples when producing prototypes. These models are trained in an end-to-end way on episodes, to learn to leverage the unlabeled examples successfully. We evaluate these methods on versions of the Omniglot and miniImageNet benchmarks, adapted to this new framework augmented with unlabeled examples. We also propose a new split of ImageNet, consisting of a large set of classes, with a hierarchical structure. Our experiments confirm that our Prototypical Networks can learn to improve their predictions by using unlabeled examples, much like a semi-supervised algorithm would.
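
One way to picture "using unlabeled examples when producing prototypes" is a single soft k-means step on top of the usual class means. The sketch below is one plausible reading under that assumption, operating on precomputed embeddings; the abstract does not specify the exact update, and the function names are hypothetical. It does not handle distractor classes.

```python
import numpy as np

def class_prototypes(embeddings, labels, n_classes):
    """Prototype of each class = mean embedding of its labeled examples."""
    return np.stack([embeddings[labels == c].mean(axis=0) for c in range(n_classes)])

def refine_prototypes(prototypes, labeled, labels, unlabeled):
    """One soft k-means step: unlabeled points contribute to every prototype
    in proportion to a softmax over negative squared distances."""
    n_classes = prototypes.shape[0]
    # Squared Euclidean distances, shape (n_unlabeled, n_classes).
    d2 = ((unlabeled[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    soft = np.exp(logits)
    soft = soft / soft.sum(axis=1, keepdims=True)
    one_hot = np.eye(n_classes)[labels]                  # hard weights for labeled points
    weights = np.vstack([one_hot, soft])
    points = np.vstack([labeled, unlabeled])
    # Weighted mean of all points for each class.
    return (weights.T @ points) / weights.sum(axis=0)[:, None]
```

In this sketch an unlabeled point close to one prototype pulls that prototype toward itself and leaves the others almost unchanged.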

Learning Decorrelated Hashing Codes for Multimodal Retrieval

In social networks, heterogeneous multimedia data are correlated with each other, such as videos and their corresponding tags on YouTube and image-text pairs on Facebook. Nearest neighbor retrieval across multiple modalities on large data sets has become a hot yet challenging problem. Hashing is expected to be an efficient solution, since it represents data as binary codes. Because bit-wise XOR operations can be performed quickly, the retrieval time is greatly reduced. Few existing multimodal hashing methods consider the correlation among hashing bits. This correlation has a negative impact on hashing codes: as the hashing code length grows, the improvement in retrieval performance slows down. In this paper, we propose a minimum correlation regularization (MCR) for multimodal hashing. First, the sigmoid function is used to embed the data matrices. Then, the MCR is applied to the output of the sigmoid function. As the output of the sigmoid function approximates a binary code matrix, the proposed MCR can efficiently decorrelate the hashing codes. Experiments show that the superiority of the proposed method becomes greater as the code length increases.
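
The retrieval speed-up from binary codes comes from comparing codes with XOR followed by a bit count. The following is a minimal sketch of that lookup step only, with codes stored as Python integers; it does not implement MCR or the sigmoid embedding, and the function names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

def nearest_neighbors(query, database, k=1):
    """Indices of the k database codes closest to the query in Hamming distance.
    Ties are broken by database order because sorted() is stable."""
    dists = [hamming_distance(query, code) for code in database]
    return sorted(range(len(database)), key=lambda i: dists[i])[:k]
```

For example, with `database = [0b1010, 0b0101, 0b0000]` the query `0b1011` differs from the first code in a single bit, so index 0 is returned first.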

Autostacker: A Compositional Evolutionary Learning System

We introduce an automatic machine learning (AutoML) modeling architecture called Autostacker, which combines an innovative hierarchical stacking architecture and an Evolutionary Algorithm (EA) to perform efficient parameter search. Neither prior domain knowledge about the data nor feature preprocessing is needed. Using EA, Autostacker quickly evolves candidate pipelines with high predictive accuracy. These pipelines can be used as is or as a starting point for human experts to build on. Autostacker finds innovative combinations and structures of machine learning models, rather than selecting a single model and optimizing its hyperparameters. Compared with other AutoML systems on fifteen datasets, Autostacker achieves state-of-the-art or competitive performance in terms of both test accuracy and time cost.

Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

In e-commerce platforms such as Amazon and TaoBao, ranking items in a search session is a typical multi-step decision-making problem. Learning to rank (LTR) methods have been widely applied to ranking problems. However, such methods often treat the different ranking steps in a session as independent, although they may in fact be highly correlated with each other. To better utilize the correlation between different ranking steps, we propose to use reinforcement learning (RL) to learn an optimal ranking policy which maximizes the expected accumulative rewards in a search session. Firstly, we formally define the concept of search session Markov decision process (SSMDP) to formulate the multi-step ranking problem. Secondly, we analyze the properties of SSMDP and theoretically prove the necessity of maximizing accumulative rewards. Lastly, we propose a novel policy gradient algorithm for learning an optimal ranking policy, which is able to deal with the problem of high reward variance and unbalanced reward distribution of an SSMDP. Experiments are conducted in simulation and on the TaoBao search engine. The results demonstrate that our algorithm performs much better than online LTR methods, with more than 40% and 30% growth of total transaction amount in the simulation and the real application, respectively.

Markov Switch Smooth Transition HYGARCH Model: Stability and Estimation

The HYGARCH model is primarily used to model long-range dependence in volatility. We propose a Markov switch smooth-transition HYGARCH model, where the volatility in each state is a time-dependent convex combination of GARCH and FIGARCH. This model provides a flexible structure to capture different levels of volatility as well as short and long memory effects. The necessary and sufficient condition for asymptotic stability is derived. Forecasting of the conditional variance is studied by using all past information in a parsimonious way. Bayesian estimates based on Gibbs sampling are provided. A simulation study is given to evaluate the estimates and model stability. The competitive performance of the proposed model is shown by comparing it with the HYGARCH and smooth-transition HYGARCH models for some period of the S&P 500 indices based on volatility and value-at-risk forecasts.

Convolutional Geometric Matrix Completion

Geometric matrix completion (GMC) has been proposed for recommendation by integrating the relationship (link) graphs among users/items into matrix completion (MC). Traditional GMC methods typically adopt graph regularization to impose smoothness priors for MC. Recently, geometric deep learning on graphs (GDLG) has been proposed to solve the GMC problem, showing better performance than existing GMC methods, including traditional graph-regularization-based methods. To the best of our knowledge, there exists only one GDLG method for GMC, which is called RMGCNN. RMGCNN combines a graph convolutional network (GCN) and a recurrent neural network (RNN) for GMC. In the original work, RMGCNN demonstrates better performance than a pure GCN-based method. In this paper, we propose a new GMC method, called convolutional geometric matrix completion (CGMC), for recommendation with graphs among users/items. CGMC is a pure GCN-based method with a newly designed graph convolutional network. Experimental results on real datasets show that CGMC can outperform other state-of-the-art methods, including RMGCNN.

Sparse Multiple Kernel Learning: Support Identification via Mirror Stratifiability

In statistical machine learning, kernel methods make it possible to consider infinite-dimensional feature spaces with a computational cost that only depends on the number of observations. This is usually done by solving an optimization problem depending on a data fit term and a suitable regularizer. In this paper we consider feature maps which are the concatenation of a fixed, possibly large, set of simpler feature maps. The penalty is a sparsity-inducing one, promoting solutions depending only on a small subset of the features. The group lasso problem is a special case of this more general setting. We show that one of the most popular optimization algorithms to solve the regularized objective function, the forward-backward splitting method, allows feature selection to be performed in a stable manner. In particular, we prove that the set of relevant features is identified by the algorithm after a finite number of iterations if a suitable qualification condition holds. The main tools used in the proofs are the notions of stratification and mirror stratifiability.
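
For the group lasso special case, forward-backward splitting alternates a gradient step on the data fit term with a group-wise soft-thresholding step, and the groups that stay nonzero are the identified support. The sketch below assumes a finite-dimensional least-squares loss, (1/2n)||Xw - y||^2 + lam * sum_g ||w_g||, and a fixed step size 1/L; the function names are hypothetical and the kernel setting of the paper is not reproduced.

```python
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Proximal map of tau * sum_g ||w_g||: shrink each group, or zero it out."""
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > tau:
            out[g] = (1.0 - tau / norm) * w[g]
    return out

def forward_backward_group_lasso(x, y, groups, lam, n_iter=500):
    """Proximal gradient iterations for the group lasso objective above."""
    n = x.shape[0]
    step = n / np.linalg.norm(x, 2) ** 2      # 1/L with L = ||X||_2^2 / n
    w = np.zeros(x.shape[1])
    for _ in range(n_iter):
        grad = x.T @ (x @ w - y) / n          # forward (gradient) step
        w = group_soft_threshold(w - step * grad, groups, step * lam)  # backward step
    return w

def active_groups(w, groups):
    """Indices of groups with at least one nonzero coefficient."""
    return [i for i, g in enumerate(groups) if np.any(w[g] != 0)]
```

Because the thresholding step sets whole groups exactly to zero, `active_groups` can be read off the iterate directly, which is the sense in which the algorithm performs feature selection.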

DEMorphy, German Language Morphological Analyzer

DEMorphy is a morphological analyzer for German. It is built on large, compactified lexicons from the German Morphological Dictionary. A guesser based on German declension suffixes is also provided. For German, we provide a state-of-the-art morphological analyzer. DEMorphy is implemented in Python with an emphasis on ease of use and comes with accompanying documentation. The package is suitable for both academic and commercial purposes, with a permissive licence.

Recovering quantum gates from few average gate fidelities
Conductance relaxation in GeBiTe – slow thermalization in an open quantum system
The information and wave-theoretic limits of analog beamforming
Hierarchical Imitation and Reinforcement Learning
On Polynomial Time PAC Reinforcement Learning with Rich Observations
Algorithm for Evolutionarily Stable Strategies Against Pure Mutations
Knowledge Base Relation Detection via Multi-View Matching
Effects of CSI Knowledge on Secrecy of Threshold-Selection Decode-and-Forward Relaying
Compositional Analysis of Hybrid Systems Defined Over Finite Alphabets
Fast and accurate computation of orthogonal moments for texture analysis
Re-examination of Bregman functions and new properties of their divergences
How strong are correlations in strongly recurrent neuronal networks?
Uniform large deviation principles for Banach space valued stochastic differential equations
Kernel Embedding Approaches to Orbit Determination of Spacecraft Clusters
Semi-parametric Topological Memory for Navigation
Memoryless Determinacy of Infinite Parity Games: Another Simple Proof
On the number of generalized Sidon sets
Optimal Distributed Energy Resources Sizing for Commercial Building Hybrid Microgrids
SD-CNN: a Shallow-Deep CNN for Improved Breast Cancer Diagnosis
Can cut generating functions be good and effective?
An efficient algorithm to test forcibly-connectedness of graphical degree sequences
Mirror-Prox SCA Algorithm for Multicast Beamforming and Antenna Selection
Random perturbation and matrix sparsification and completion
A Tutorial on UAVs for Wireless Networks: Applications, Challenges, and Open Problems
Contained Neural Style Transfer for Decorated Logo Generation
Throughput Maximization for Laser-Powered UAV Wireless Communication Systems
Accelerating E-Commerce Search Engine Ranking by Contextual Factor Selection
Deep-neural-network based sinogram synthesis for sparse-view CT image reconstruction
Next Steps for the Colorado Risk-Limiting Audit (CORLA) Program
Unifacta: Profiling-driven String Pattern Standardization
Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders
A Factoid Question Answering System for Vietnamese
Robust Multivariate Nonparametric Tests via Projection-Pursuit
Optimal Smoothed Variable Sample-size Accelerated Proximal Methods for Structured Nonsmooth Stochastic Convex Programs
RankDCG: Rank-Ordering Evaluation Measure
Model Predictive Climate Control of Connected and Automated Vehicles for Improved Energy Efficiency
Age Group Classification with Speech and Metadata Multimodality Fusion
Proceedings 6th International Workshop on Theorem proving components for Educational software
Representing Verbs as Argument Concepts
On stability properties over powers of polymatroidal ideals
Continuous-time GARCH process driven by semi-Lévy process
Stable amplitude chimera states in a network of locally coupled Stuart-Landau oscillators
Fusion of multispectral satellite imagery using a cluster of graphics processing unit
Clinically Meaningful Comparisons Over Time: An Approach to Measuring Patient Similarity based on Subsequence Alignment
Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks
Asplünd’s metric defined in the Logarithmic Image Processing (LIP) framework for colour and multivariate images
Optimality of 1-norm regularization among weighted 1-norms for sparse recovery: a case study on how to find optimal regularizations
Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration
Simple and Local Independent Set Approximation
Automated Map Reading: Image Based Localisation in 2-D Maps Using Binary Semantic Descriptors
A pathwise construction of Birth-Death-Swap systems leading to an averaging result in the presence of two timescales
A microscopic model for a one parameter class of fractional Laplacians with Dirichlet boundary conditions
Equivalence of some subcritical properties in continuum percolation
Perceptual decision making: Biases in post-error reaction times explained by attractor network dynamics
Fine-Grained Complexity of Analyzing Compressed Data: Quantifying Improvements over Decompress-And-Solve
Permutation Tests for Equality of Distributions of Functional Data
Jointly Controlled Lotteries with Biased Coins
Proofs of Technical Results Justifying an Algorithm of Reactive 3D Navigation of a Mobile Robot through an Unknown Tunnel
Clique-Based Lower Bounds for Parsing Tree-Adjoining Grammars
Deep Unsupervised Intrinsic Image Decomposition by Siamese Training
On the Relation of Strong Triadic Closure and Cluster Deletion
An easy proof of Polya’s theorem on random walks
Lifetime of flatband states
NetGAN: Generating Graphs via Random Walks
Robustness against Disturbances in Power Systems under Frequency Constraints
Convex Restriction of Power Flow Feasibility Set
The ‘No Justice in the Universe’ phenomenon: why honesty of effort may not be rewarded in tournaments
Independence number and the number of maximum independent sets in pseudofractal scale-free web and Sierpiński gasket
Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift
Lexico-acoustic Neural-based Models for Dialog Act Classification
Towards a Question Answering System over the Semantic Web
A multi-instance deep neural network classifier: application to Higgs boson CP measurement
Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares
Maximum Volume Subset Selection for Anchored Boxes
Quantum distance-based classifier with constant size memory, distributed knowledge and state recycling
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data
On zero-sum generalized Schur Numbers
A Fast Interior Point Method for Atomic Norm Soft Thresholding
Game-theoretical model of cooperation between producers in a production process: 3-agent interaction case
Nash equilibria in routing games with edge priorities
Bifurcation analysis of a mean field laser equation
The last zero crossing of an iterated Brownian motion with drift
Sunspot Equilibrium in General Quitting Games
Probabilistic design of a molybdenum-base alloy using a neural network
A comparative study of stochastic resonance for a model with two pathways by escape times, linear response, invariant measures and the conditional Kolmogorov-Smirnov Test
Physical Layer Security for RF Satellite Channels in the Finite-length Regime
Essentially No Barriers in Neural Network Energy Landscape
Scalable Bayesian uncertainty quantification in imaging inverse problems via convex optimization
Sparse Identification of Nonlinear Dynamics for Rapid Model Recovery
Impact of Biases in Big Data
Hardness of Approximate Nearest Neighbor Search
Goldberg’s Conjecture is True for Random Multigraphs
Optional projection in duality
Experimental Evaluation of Parameterized Algorithms for Feedback Vertex Set
Semi-Supervised Algorithms for Approximately Optimal and Accurate Clustering
Finding Hamiltonian Cycle in Graphs of Bounded Treewidth: Experimental Evaluation
Estimating model bias over the complete nuclide chart with sparse Gaussian processes at the example of INCL/ABLA and double-differential neutron spectra
Beyond black-boxes in Bayesian inverse problems and model validation: applications in solid mechanics of elastography
Distributed Prioritized Experience Replay
An improved FPT algorithm for Independent Feedback Vertex Set
Multivariate Fine-Grained Complexity of Longest Common Subsequence
Protecting JPEG Images Against Adversarial Attacks
Not All Samples Are Created Equal: Deep Learning with Importance Sampling
Tree Species Identification from Bark Images Using Convolutional Neural Networks
Multimodal Registration of Retinal Images Using Domain-Specific Landmarks and Vessel Enhancement
Optimization with Gradient-Boosted Trees and Risk Control
A measure theoretic approach to traffic flow optimization on networks
A study in $\mathbb{G}_{\mathbb{R}, \geq 0}$: from the geometric case book of Wilson loop diagrams and SYM $N=4$
Fitting and Analysis Technique for Inconsistent Nuclear Data
Energy Efficiency of Opportunistic Device-to-Device Relaying Under Lognormal Shadowing
Hashing with Mutual Information
Sparse power-law network model for reliable statistical sampling
Estimation of Poisson arrival processes under linear models
Power Control and Channel Allocation for D2D Underlaid Cellular Networks
Hybrid Model For Word Prediction Using Naive Bayes and Latent Information
Secure and Privacy-Aware Data Dissemination for Cloud-Based Applications
Label Sanitization against Label Flipping Poisoning Attacks
A computational perspective of the role of Thalamus in cognition