Tensor Networks in a Nutshell

Tensor network methods are taking a central role in modern quantum physics and beyond. They can provide an efficient approximation to certain classes of quantum states, and the associated graphical language makes it easy to describe and pictorially reason about quantum circuits, channels, protocols, open systems and more. Our goal is to explain tensor networks and some associated methods as quickly and as painlessly as possible. Beginning with the key definitions, the graphical tensor network language is presented through examples. We then provide an introduction to matrix product states. We conclude the tutorial with tensor contractions evaluating combinatorial counting problems. The first one counts the number of solutions for Boolean formulae, whereas the second is Penrose’s tensor contraction algorithm, returning the number of 3-edge-colorings of 3-regular planar graphs.
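The Boolean counting idea mentioned above can be illustrated with a small sketch. It assumes a CNF encoding that is not taken from the tutorial: each clause is a list of signed integers (`+i` for variable i, `-i` for its negation), and the names `clause_tensor` and `count_solutions` are hypothetical. Each clause acts as a 0/1 tensor, and summing the product of all clause tensors over every index value is the full contraction of the network. The contraction is done here by brute force, so it shows the principle rather than an efficient contraction order.

```python
from itertools import product

def clause_tensor(clause, assignment):
    """Entry of the clause tensor: 1 if the clause is satisfied, else 0."""
    return int(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause))

def count_solutions(clauses, num_vars):
    """Contract the network: sum the product of clause tensors over all indices."""
    total = 0
    for assignment in product((False, True), repeat=num_vars):
        term = 1
        for clause in clauses:
            term *= clause_tensor(clause, assignment)
        total += term
    return total

# Hypothetical example formula: (x1 OR x2) AND (NOT x1 OR x3)
example = [[1, 2], [-1, 3]]
print(count_solutions(example, 3))
```

The loop visits all 2^n assignments, so it is only practical for tiny formulae; it does not attempt a good contraction order.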

Interpretable Active Learning

Active learning has long been a topic of study in machine learning. However, as increasingly complex and opaque models have become standard practice, the process of active learning, too, has become more opaque. There has been little investigation into interpreting what specific trends and patterns an active learning strategy may be exploring. This work expands on the Local Interpretable Model-agnostic Explanations framework (LIME) to provide explanations for active learning recommendations. We demonstrate how LIME can be used to generate locally faithful explanations for an active learning strategy, and how these explanations can be used to understand how different models and datasets explore a problem space over time. In order to quantify the per-subgroup differences in how an active learning strategy queries spatial regions, we introduce a notion of uncertainty bias (based on disparate impact) to measure the discrepancy in the confidence for a model’s predictions between one subgroup and another. Using the uncertainty bias measure, we show that our query explanations accurately reflect the subgroup focus of the active learning queries, allowing for an interpretable explanation of what is being learned as points with similar sources of uncertainty have their uncertainty bias resolved. We demonstrate that this technique can be applied to track uncertainty bias over user-defined clusters or automatically generated clusters based on the source of uncertainty.

Time-Dependent Representation for Neural Event Sequence Prediction

Existing sequence prediction methods are mostly concerned with time-independent sequences, in which the actual time span between events is irrelevant and the distance between events is simply the difference between their order positions in the sequence. While this time-independent view of sequences is applicable for data such as natural languages, e.g., dealing with words in a sentence, it is inappropriate and inefficient for many real-world events that are observed and collected at unequally spaced points of time as they naturally arise, e.g., when a person goes to a grocery store or makes a phone call. The time span between events can carry important information about the sequence dependence of human behaviors. To leverage continuous time in sequence prediction, we propose two methods for integrating time into event representation, based on the intuition about how time is tokenized in everyday life and on previous work on embedding contextualization. We particularly focus on using these methods in recurrent neural networks, which have gained popularity in many sequence prediction tasks. We evaluated these methods as well as baseline models on two learning tasks: mobile app usage prediction and music recommendation. The experiments revealed that the proposed methods for time-dependent representation offer consistent gains in accuracy compared to baseline models that either directly use the continuous time value in a recurrent neural network or do not use time.

Bayesian Sparsification of Recurrent Neural Networks

Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights. The recently proposed Sparse Variational Dropout eliminates the majority of the weights in a feed-forward neural network without significant loss of quality. We apply this technique to sparsify recurrent neural networks. To account for recurrent specifics, we also rely on Binary Variational Dropout for RNNs. We report a 99.5% sparsity level on a sentiment analysis task without a quality drop and up to an 87% sparsity level on a language modeling task with a slight loss of accuracy.

Learning Algorithms for Active Learning

We introduce a model that learns active learning algorithms via metalearning. For a distribution of related tasks, our model jointly learns: a data representation, an item selection heuristic, and a method for constructing prediction functions from labeled training sets. Our model uses the item selection heuristic to gather labeled training sets from which to construct prediction functions. Using the Omniglot and MovieLens datasets, we test our model in synthetic and practical settings.

The Code2Text Challenge: Text Generation in Source Code Libraries

We propose a new shared task for tactical data-to-text generation in the domain of source code libraries. Specifically, we focus on text generation of function descriptions from example software projects. Data is drawn from existing resources used for studying the related problem of semantic parser induction (Richardson and Kuhn, 2017b; Richardson and Kuhn, 2017a), and spans a wide variety of both natural languages and programming languages. In this paper, we describe these existing resources, which will serve as training and development data for the task, and discuss plans for building new independent test sets.

Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning

One question central to Reinforcement Learning is how to learn a feature representation that supports algorithm scaling and re-use of learned information from different tasks. Successor Features approach this problem by learning a feature representation that satisfies a temporal constraint. We present an implementation of an approach that decouples the feature representation from the reward function, making it suitable for transferring knowledge between domains. We then assess the advantages and limitations of using Successor Features for transfer.

A Labelling Framework for Probabilistic Argumentation

The combination of argumentation and probability paves the way to new accounts of qualitative and quantitative uncertainty, thereby offering new theoretical and applicative opportunities. Due to a variety of interests, probabilistic argumentation is approached in the literature with different frameworks, pertaining to structured and abstract argumentation, and with respect to diverse types of uncertainty, in particular the uncertainty on the credibility of the premises, the uncertainty about which arguments to consider, and the uncertainty on the acceptance status of arguments or statements. Towards a general framework for probabilistic argumentation, we investigate a labelling-oriented framework encompassing a basic setting for rule-based argumentation and its (semi-) abstract account, along with diverse types of uncertainty. Our framework provides a systematic treatment of various kinds of uncertainty and of their relationships and allows us to retrieve (by derivation) multiple statements (sometimes assumed) or results from the literature.

Predicting Session Length in Media Streaming

Session length is an important factor in determining a user’s satisfaction with a media streaming service. Being able to predict how long a session will last can be of great use for various downstream tasks, such as recommendations and ad scheduling. Most of the related literature on user interaction duration has focused on dwell time for websites, usually in the context of approximating post-click satisfaction in either search results or display ads. In this work we present the first analysis of session length in a mobile-focused online service, using a real-world dataset from a major music streaming service. We use survival analysis techniques to show that the characteristics of the length distributions can differ significantly between users, and use gradient boosted trees with appropriate objectives to predict the length of a session using only information available at its beginning. Our evaluation on real-world data illustrates that our proposed technique outperforms the considered baseline.

On Tensor Train Rank Minimization: Statistical Efficiency and Scalable Algorithm

Tensor train (TT) decomposition provides a space-efficient representation for higher-order tensors. Despite this advantage, we face two crucial limitations when we apply the TT decomposition to machine learning problems: the lack of statistical theory and the lack of scalable algorithms. In this paper, we address both limitations. First, we introduce a convex relaxation of the TT decomposition problem and derive its error bound for the tensor completion task. Next, we develop an alternating optimization method with a randomization technique, whose time complexity is as efficient as its space complexity. In experiments, we numerically confirm the derived bounds and empirically demonstrate the performance of our method with a real higher-order tensor.
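To make the space-efficiency claim concrete, the standard TT format and its storage cost can be written out as follows. The symbols (order $d$, mode sizes $n_k$, TT ranks $r_k$, cores $G_k$) are notation assumed for this sketch and do not appear in the abstract.

```latex
\begin{align*}
\mathcal{X}(i_1, \dots, i_d) &= G_1(i_1)\, G_2(i_2) \cdots G_d(i_d),
  \qquad G_k(i_k) \in \mathbb{R}^{r_{k-1} \times r_k}, \quad r_0 = r_d = 1, \\
\underbrace{\sum_{k=1}^{d} n_k\, r_{k-1}\, r_k}_{\text{TT storage}} &\le d\, n\, r^2
  \qquad \text{versus} \qquad
  \underbrace{\prod_{k=1}^{d} n_k}_{\text{dense storage}} \le n^d,
  \qquad n = \max_k n_k, \; r = \max_k r_k .
\end{align*}
```

For fixed ranks, the TT storage grows linearly in the order $d$, whereas the dense storage grows exponentially.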

Tensorial Recurrent Neural Networks for Longitudinal Data Analysis

Traditional Recurrent Neural Networks assume vectorized data as inputs. However, much data from modern science and technology comes in structured forms such as tensorial time series. Applying recurrent neural networks to this type of data requires a vectorisation step, which loses the precise information in the spatial or longitudinal dimensions. In addition, vectorized data is not optimal for learning representations of longitudinal data. In this paper, we propose a new variant of tensorial neural networks that takes tensorial time series data directly as input. We call this new variant the Tensorial Recurrent Neural Network (TRNN). The proposed TRNN is based on the tensor Tucker decomposition.

A Survey on Visual Query Systems in the Web Era (extended version)

As more and more collections of data become available to everyone on the web, non-expert users demand easy ways to retrieve data from these collections. One solution is the so-called Visual Query System (VQS), in which queries are represented visually and users do not have to understand query languages such as SQL or XQuery. In 1996, a paper by Catarci reviewed the Visual Query Systems available until that year. In this paper, we review VQSs from 1997 until now and try to determine whether they have been the solution for non-expert users. The short answer is no, because very few systems have in fact been used in real environments or as commercial tools. We have also gathered basic features of VQSs, such as the visual representation adopted to present the reality of interest or the visual representation adopted to express queries.

Natural Language Processing with Small Feed-Forward Networks

We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural network models, and investigate different tradeoffs when deciding how to allocate a small memory budget.

Query Expansion Techniques for Information Retrieval: a Survey

With the ever-increasing size of the web, extracting relevant information from the Internet with a query formed by a few keywords has become a big challenge. To overcome this, query expansion (QE) plays a crucial role in improving Internet searches: the user’s initial query is reformulated into a new query by adding new meaningful terms with similar significance. QE, as part of information retrieval (IR), has long attracted researchers’ attention. It has also become very influential in the fields of personalized social documents, Question Answering over Linked Data (QALD), and the Text Retrieval Conference (TREC) and REAL sets. This paper surveys QE techniques in IR from 1960 to 2017 with respect to core techniques, data sources used, weighting and ranking methodologies, user participation, and applications of QE techniques, bringing out similarities and differences.

Deep Asymmetric Multi-task Feature Learning

We propose Deep Asymmetric Multitask Feature Learning (Deep-AMTFL) which can learn deep representations shared across multiple tasks while effectively preventing negative transfer that may happen in the feature sharing process. Specifically, we introduce an asymmetric autoencoder term that allows predictors for the confident tasks to have high contribution to the feature learning while suppressing the influences of less confident task predictors. This allows learning less noisy representations, and allows weak predictors to exploit knowledge from the strong predictors via the shared latent features. Such asymmetric knowledge transfer through shared features is also more scalable and efficient than inter-task asymmetric transfer. We validate our Deep-AMTFL model on multiple benchmark datasets for multitask learning and image classification, on which it significantly outperforms existing symmetric and asymmetric multitask learning models, by effectively preventing negative transfer in deep feature learning.

SenGen: Sentence Generating Neural Variational Topic Model

We present a new topic model that generates documents by sampling a topic for one whole sentence at a time, and generating the words in the sentence using an RNN decoder that is conditioned on the topic of the sentence. We argue that this novel formalism will help us not only visualize and model the topical discourse structure in a document better, but also potentially lead to more interpretable topics since we can now illustrate topics by sampling representative sentences instead of bag of words or phrases. We present a variational auto-encoder approach for learning in which we use a factorized variational encoder that independently models the posterior over topical mixture vectors of documents using a feed-forward network, and the posterior over topic assignments to sentences using an RNN. Our preliminary experiments on two different datasets indicate early promise, but also expose many challenges that remain to be addressed.

Generative Semantic Manipulation with Contrasting GAN

Generative Adversarial Networks (GANs) have recently achieved significant improvement on paired/unpaired image-to-image translation, such as photo $\rightarrow$ sketch and artist painting style transfer. However, existing models can only transfer low-level information (e.g., color or texture changes) and fail to edit the high-level semantic meaning (e.g., geometric structure or content) of objects. On the other hand, while some studies can synthesize compelling real-world images given a class label or caption, they cannot condition on arbitrary shapes or structures, which largely limits their application scenarios and the interpretability of model results. In this work, we focus on a more challenging semantic manipulation task, which aims to modify the semantic meaning of an object while preserving its own characteristics (e.g., viewpoints and shapes), such as cow $\rightarrow$ sheep, motor $\rightarrow$ bicycle, and cat $\rightarrow$ dog. To tackle such large semantic changes, we introduce a contrasting GAN (contrast-GAN) with a novel adversarial contrasting objective. Instead of directly making the synthesized samples close to target data as previous GANs did, our adversarial contrasting objective optimizes over the distance comparisons between samples, that is, it enforces the manipulated data to be semantically closer to the real data of the target category than the input data is. Equipped with the new contrasting objective, a novel mask-conditional contrast-GAN architecture is proposed to disentangle the image background from object semantic changes. Experiments on several semantic manipulation tasks on the ImageNet and MSCOCO datasets show considerable performance gains by our contrast-GAN over other conditional GANs. Quantitative results further demonstrate the superiority of our model in generating manipulated results with high visual fidelity and reasonable object semantics.

A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data

This paper proposes a clustering procedure for samples of multivariate functions in $(L^2(I))^{J}$, with $J \geq 1$. The method is based on a k-means algorithm in which the distance between curves is measured with a metric that generalizes the Mahalanobis distance in Hilbert spaces, considering the correlation and the variability along all the components of the functional data. The proposed procedure has been studied in simulation and compared with k-means based on other distances typically adopted for clustering multivariate functional data. In these simulations, the k-means algorithm with the generalized Mahalanobis distance provides the best clustering performance, both in terms of mean and standard deviation of the number of misclassified curves. Finally, the proposed method has been applied to two real case studies, concerning ECG signals and growth curves, where the results obtained in simulation are confirmed and strengthened.
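For orientation, the classical finite-dimensional Mahalanobis distance that the paper's metric generalizes can be sketched as below. This is not the functional (Hilbert-space) distance proposed by the authors: it is the ordinary two-dimensional version, and the function name `mahalanobis_2d` and the tuple-based covariance layout are assumptions made for illustration.

```python
def mahalanobis_2d(x, y, cov):
    """Classical Mahalanobis distance between 2-D points x and y.

    cov is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix must be invertible")
    dx, dy = x[0] - y[0], x[1] - y[1]
    # Quadratic form diff^T * inverse(cov) * diff, with
    # inverse(cov) = (1 / det) * ((d, -b), (-c, a)).
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return q ** 0.5

# With the identity covariance the distance reduces to the Euclidean one.
print(mahalanobis_2d((0.0, 0.0), (3.0, 4.0), ((1.0, 0.0), (0.0, 1.0))))
```

In a k-means loop, a distance of this kind would replace the Euclidean distance in the assignment step, so that directions with high variance count for less.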

Factor analysis with finite data

Factor analysis aims to describe high dimensional random vectors by means of a small number of unknown common factors. In mathematical terms, it is required to decompose the covariance matrix $\Sigma$ of the random vector as the sum of a diagonal matrix $D$ (accounting for the idiosyncratic noise in the data) and a low rank matrix $R$ (accounting for the variance of the common factors) in such a way that the rank of $R$ is as small as possible so that the number of common factors is minimal. In practice, however, the matrix $\Sigma$ is unknown and must be replaced by its estimate, i.e. the sample covariance, which comes from a finite amount of data. This paper provides a strategy to account for the uncertainty in the estimation of $\Sigma$ in the factor analysis problem.
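The decomposition described above can be stated as a rank-minimization problem. The positive-semidefiniteness constraints are the usual convention for covariance components and are an assumption here, since the abstract does not spell them out.

```latex
\begin{align*}
\min_{D,\, R} \quad & \operatorname{rank}(R) \\
\text{subject to} \quad & \Sigma = D + R, \\
& R \succeq 0, \qquad D \succeq 0 \ \text{diagonal}.
\end{align*}
```

When $\Sigma$ is replaced by the sample covariance, the equality constraint can no longer be expected to hold exactly, which is the estimation uncertainty the paper sets out to handle.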

An Investigation on Social Network Recommender Systems and Collaborative Filtering Techniques

Nowadays, with the remarkable expansion of information on the internet, users prefer to receive exactly the information they need through suggestions from their friends or profiles, saving time and money. Recommender systems based on different algorithms have been proposed as a basic way to reach this goal, but each has its own advantages and disadvantages. In this study, we selected and implemented two approaches: Collaborative Filtering (CF) and the Social Network Recommendation System (SNRS). Because of the difficulty of finding a dataset that covers friendship, ratings, and item categories, we generated one with 10 categories, 10 items, and 100 users and compared the two approaches on it. We used Mean Absolute Error (MAE) and accuracy to compare the results of the two approaches and found that the SNRS method, which is claimed to be an improved version of CF, works more efficiently.
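The MAE metric used in this comparison is the mean of the absolute differences between predicted and actual ratings. A minimal sketch follows; the function name and the sample ratings are made up for illustration and are not data from the study.

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over paired ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical ratings on a 1-5 scale.
actual_ratings = [4, 3, 5]
predicted_ratings = [3, 3, 2]
print(mean_absolute_error(actual_ratings, predicted_ratings))
```

A lower MAE means the predicted ratings are closer to the actual ones on average.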

Which Distribution Distances are Sublinearly Testable?
On the topology of no $k$-equal spaces
An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor
Two theorems on point-flat incidences
Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation
Pricing for Online Resource Allocation: Beyond Subadditive Values
Statistics on the (compact) Stiefel manifold: Theory and Applications
Nonconvex piecewise linear functions: Advanced formulations and simple modeling tools
Gaussian Behavior of Quadratic Irrationals
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform
SemEval-2017 Task 1: Semantic Textual Similarity – Multilingual and Cross-lingual Focused Evaluation
Lectures on the Spin and Loop $O(n)$ Models
Asymptotically optimal private estimation under mean square loss
The inverse eigenvalue problem of a graph: Multiplicities and minors
Learning Robust Representations for Computer Vision
Efficient Regret Minimization in Non-Convex Games
Bayesian Dyadic Trees and Histograms for Regression
Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection
On the zeros of the spectrogram of white noise
Hybrid Beamforming with Selection for Multi-user Massive MIMO Systems
Sparse Autoregressive Processes for Dynamic Variable Selection
Quantum wireless multihop teleportation via 4-qubit cluster state
Discrete probabilistic and algebraic dynamics: a stochastic Gelfand-Naimark Theorem
Conditional Expectation Bounds with Applications in Cryptography
On facial unique-maximum (edge-)coloring
The Projective Planarity Question for Matroids of $3$-Nets and Biased Graphs
Mixture Data-Dependent Priors
Analysis of the Polya-Gamma block Gibbs sampler for Bayesian logistic linear mixed models
Material Editing Using a Physically Based Rendering Network
Learned in Translation: Contextualized Word Vectors
A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models
Retrofitting Distributional Embeddings to Knowledge Graphs with Functional Relations
Prediction and Generation of Binary Markov Processes: Can a Finite-State Fox Catch a Markov Mouse?
Compiling Deep Learning Models for Custom Hardware Accelerators
Anomaly Detection Using Optimally-Placed Micro-PMU Sensors in Distribution Grids
Multiple Stakeholders in Music Recommender Systems
Computing the Margin of Victory in Preferential Parliamentary Elections
Application of Support Vector Machine Modeling and Graph Theory Metrics for Disease Classification
A Note on Upper Bounds for Some Generalized Folkman Numbers
Deep Generative Adversarial Neural Networks for Realistic Prostate Lesion MRI Synthesis
Deep Transfer in Reinforcement Learning by Language Grounding
Digit Serial Methods with Applications to Division and Square Root (with mechanically checked correctness proofs)
Improved Algorithms for Scheduling Unsplittable Flows on Paths
On A Conjecture Regarding Permutations Which Destroy Arithmetic Progressions
Efficient Estimation in Convex Single Index Models
Large-Scale Low-Rank Matrix Learning with Nonconvex Regularizers
Adaptive Hierarchical Clustering Using Ordinal Queries
Borel-Padé re-summation of the $β$-functions describing Anderson localisation in the Wigner-Dyson symmetry classes
Parallel Tracking and Verifying: A Framework for Real-Time and High Accuracy Visual Tracking
Neural Rating Regression with Abstractive Tips Generation for Recommendation
Image Denoising via CNNs: An Adversarial Approach
Enhancing the Input Representation: From Complexity to Simplicity
Towards Vision-Based Smart Hospitals: A System for Tracking and Monitoring Hand Hygiene Compliance
A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions
PROBE-GK: Predictive Robust Estimation using Generalized Kernels
PROBE: Predictive Robust Estimation for Visual-Inertial Navigation
Fate of topological states and mobility edges in one-dimensional slowly varying incommensurate potentials
An Investigation into the Pedagogical Features of Documents
Model-based learning of local image features for unsupervised texture segmentation
Real-time Deep Video Deinterlacing
Distributed multi-agent Gaussian regression via Karhunen-Loève expansions
Some new sufficient conditions for $2p$-Hamilton-biconnectedness of graphs
Video Object Segmentation with Re-identification
Switching Convolutional Neural Network for Crowd Counting
Dynamic Linear Discriminant Analysis in High Dimensional Space
Sharp vertical Littlewood–Paley inequalities for heat flows in weighted $L^2$ spaces
Linear Volterra backward stochastic differential equations
Energy-Efficient Data Collection in UAV Enabled Wireless Sensor Network
Learning to Hallucinate Face Images via Component Generation and Enhancement
Fast Preprocessing for Robust Face Sketch Synthesis
CREST: Convolutional Residual Learning for Visual Tracking
HMM-based Indic Handwritten Word Recognition using Zone Segmentation
Locating any two vertices on Hamiltonian cycles
Pulse-Based Control Using Koopman Operator Under Parametric Uncertainty
The survival probability of critical and subcritical branching processes in finite state space Markovian environment
Assigning peaks and modeling ETD in top-down mass spectrometry
Evolutionary game of N competing AIMD connections
An Efficient Algorithm for Mixed Domination on Generalized Series-Parallel Graphs
Improving Part-of-Speech Tagging for NLP Pipelines
CNN Cascades for Segmenting Whole Slide Images of the Kidney
Application of machine learning for hematological diagnosis
Robust Principal Component Analysis by Manifold Optimization
Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy
Exhaustive search of convex pentagons which tile the plane
Distributed Approximation of Maximum Independent Set and Maximum Matching
Learning Deep Convolutional Embeddings for Face Representation Using Joint Sample- and Set-based Supervision
Dual Motion GAN for Future-Flow Embedded Video Prediction
On the density of sets avoiding parallelohedron distance 1
Mean and Variance of Phylogenetic Trees
Best Viewpoint Tracking for Camera Mounted on Robotic Arm with Dynamic Obstacles
A tanglegram Kuratowski theorem
Nonlinear Backward Stochastic Evolutionary Equations Driven by a Space-Time White Noise
Exact Approaches for the Travelling Thief Problem
Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin
An Inverse Normal Transformation Solution for the comparison of two samples that contain both paired observations and independent observations
Estimation of population size when capture probability depends on individual states
Exceptional Scattered Polynomials
Experimental Demonstration of Dual Polarization Nonlinear Frequency Division Multiplexed Optical Transmission System
Implementing a fog/cloud architecture for supporting stream data management
A Watershed Delineation Algorithm for 2D Flow Direction Grids
Learning the kernel matrix by resampling
Self-Supervised Learning for Spinal MRIs
Hand2Face: Automatic Synthesis and Recognition of Hand Over Face Occlusions
Segmentation of Glioma Tumors in Brain Using Deep Convolutional Neural Network
A Continuously Growing Dataset of Sentential Paraphrases
On the $E$-polynomial of parabolic $\mathrm{Sp}_{2n}$-character varieties
Self-avoiding walk on $\mathbb{Z}^2$ with Yang-Baxter weights: universality of critical fugacity and 2-point function
Momo: Monocular Motion Estimation on Manifolds
Depth Super-Resolution Meets Uncalibrated Photometric Stereo
Classification of lattice polytopes with small volumes
A Generative Parser with a Discriminative Recognition Algorithm
Deriving Verb Predicates By Clustering Verbs with Arguments
Impact of different time series aggregation methods on optimal energy system design
Wiretap Channels with Causal State Information: Strong Secrecy
Domination and fractional domination in digraphs
Fast Exact Conformalization of Lasso using Piecewise Linear Homotopy
Breaking the curse of dimensionality in regression