Streaming Word Embeddings with the Space-Saving Algorithm

We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring the cosine similarity between word pairs under each algorithm and by applying each algorithm in the downstream task of hashtag prediction on a two-month interval of the Twitter sample stream. We then discuss the results of these experiments, concluding they provide partial validation of our approach as a streaming replacement for word2vec. Finally, we discuss potential failure modes and suggest directions for future work.
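For readers unfamiliar with the Space-Saving algorithm named in the title, the following is a minimal sketch of the classic frequent-items counter, which keeps at most a fixed number of counters over a stream. The abstract does not say how the authors couple it to skip-gram training, so the function name `space_saving` and the `capacity` parameter are illustrative assumptions, not the paper's implementation.

```python
from typing import Dict, Hashable, Iterable


def space_saving(stream: Iterable[Hashable], capacity: int) -> Dict[Hashable, int]:
    """Approximate item counts over a stream using at most `capacity` counters.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts: Dict[Hashable, int] = {}
    for item in stream:
        if item in counts:
            # Item is already monitored: increment its counter.
            counts[item] += 1
        elif len(counts) < capacity:
            # A free counter is available: start monitoring the item.
            counts[item] = 1
        else:
            # Table is full: evict the item with the smallest counter and
            # let the newcomer inherit that count plus one.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts


# Example: a vocabulary bounded to two entries.
counts = space_saving(["a", "b", "a", "c", "a", "b", "a"], capacity=2)
```

Each stored count can overestimate the true frequency but never underestimates it, and the counters always sum to the number of items seen. The linear scan for the minimum keeps the sketch short; a production version would use a structure that finds the minimum in constant time.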

Learning from Ontology Streams with Semantic Concept Drift

Data stream learning has been extensively studied as a way to extract knowledge structures from continuous, rapid data records. In the Semantic Web, data is interpreted through ontologies, and its ordered sequence is represented as an ontology stream. Our work exploits the semantics of such streams to tackle the problem of concept drift, i.e., unexpected changes in data distribution that cause most models to become less accurate over time. To this end, we revisit (i) semantic inference in the context of supervised stream learning and (ii) models with semantic embeddings. Experiments show accurate prediction with data from Dublin and Beijing.

Sufficient Markov Decision Processes with Alternating Deep Neural Networks

Advances in mobile computing technologies have made it possible to monitor and apply data-driven interventions across complex systems in real time. Markov decision processes (MDPs) are the primary model for sequential decision problems with a large or indefinite time horizon. Choosing a representation of the underlying decision process that is both Markov and low-dimensional is non-trivial. We propose a method for constructing a low-dimensional representation of the original decision process for which: 1. the MDP model holds; 2. a decision strategy that maximizes mean utility when applied to the low-dimensional representation also maximizes mean utility when applied to the original process. We use a deep neural network to define a class of potential process representations and estimate the process of lowest dimension within this class. The method is illustrated using data from a mobile study on heavy drinking and smoking among college students.

Decision Stream: Cultivating Deep Decision Trees

Various modifications of decision trees have been used extensively in recent years because of their high efficiency and interpretability. Selecting relevant features for splitting the tree nodes is a key property of their architecture and, at the same time, their major shortcoming: recursive node partitioning leads to a geometric reduction of the data quantity in the leaf nodes, which causes excessive model complexity and overfitting. In this paper, we present a novel architecture, the Decision Stream, that aims to overcome this problem. Instead of building an acyclic tree structure during the training process, we propose merging nodes from different branches based on their similarity, which is estimated with two-sample test statistics. To evaluate the proposed solution, we test it on several common machine learning problems: credit scoring, Twitter sentiment analysis, aircraft flight control, MNIST and CIFAR image classification, and synthetic data classification and regression. Our experimental results reveal that the proposed approach significantly outperforms the standard decision tree method on both regression and classification tasks, yielding a decrease in prediction error of up to 35%.
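The node-merging idea can be illustrated with a small sketch. The abstract does not specify which two-sample statistic is used, so this example assumes Welch's t statistic on the target values that reach each node; the function names and the threshold of 2.0 are hypothetical choices for illustration only.

```python
import math
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's two-sample t statistic for the difference in means.

    Requires at least two values per sample and non-zero pooled variance.
    """
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def should_merge(a: Sequence[float], b: Sequence[float], threshold: float = 2.0) -> bool:
    """Assumed rule: merge two nodes when their samples are statistically similar."""
    return abs(welch_t(a, b)) < threshold


# Target values reaching three hypothetical nodes on different branches.
node_a = [1.0, 1.1, 0.9, 1.0]
node_b = [1.05, 0.95, 1.0, 1.1]
node_c = [5.0, 5.2, 4.8, 5.1]

merge_ab = should_merge(node_a, node_b)  # similar samples
merge_ac = should_merge(node_a, node_c)  # clearly different samples
```

Merging similar nodes pools their data, which counteracts the shrinking sample sizes that the abstract identifies as the cause of overfitting in deep trees.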

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

Computer programs written in one language often need to be ported to other languages to support multiple devices and environments. When programs use language-specific APIs (Application Programming Interfaces), it is very challenging to migrate these APIs to the corresponding APIs written in other languages. Existing approaches mine API mappings from projects that have corresponding versions in two languages. They rely on the sparse availability of bilingual projects and thus produce a limited number of API mappings. In this paper, we propose an intelligent system called DeepAM for automatically mining API mappings from a large-scale code corpus without bilingual projects. The key component of DeepAM is based on the multimodal sequence-to-sequence learning architecture, which aims to learn joint semantic representations of bilingual API sequences from big source code data. Experimental results indicate that DeepAM significantly increases both the accuracy and the number of API mappings compared with state-of-the-art approaches.

Thinking Fast and Slow: Optimization Decomposition Across Timescales

Many real-world control systems, such as the smart grid and human sensorimotor control systems, have decentralized components that react quickly using local information and centralized components that react slowly using a more global view. This paper seeks to provide a theoretical framework for how to design controllers that are decomposed across timescales in this way. The framework is analogous to how the network utility maximization framework uses optimization decomposition to distribute a global control problem across independent controllers, each of which solves a local problem; except our goal is to decompose a global problem temporally, extracting a timescale separation. Our results highlight that decomposition of a multi-timescale controller into a fast timescale, reactive controller and a slow timescale, predictive controller can be near-optimal in a strong sense. In particular, we exhibit such a design, named Multi-timescale Reflexive Predictive Control (MRPC), which maintains a per-timestep cost within a constant factor of the offline optimal in an adversarial setting.

Introspective Generative Modeling: Decide Discriminatively

We study unsupervised learning by developing introspective generative modeling (IGM) that attains a generator using progressively learned deep convolutional neural networks. The generator is itself a discriminator, capable of introspection: being able to self-evaluate the difference between its generated samples and the given training data. When followed by repeated discriminative learning, desirable properties of modern discriminative classifiers are directly inherited by the generator. IGM learns a cascade of CNN classifiers using a synthesis-by-classification algorithm. In the experiments, we observe encouraging results on a number of applications including texture modeling, artistic style transferring, face modeling, and semi-supervised learning.

Localization and transport in a strongly driven Anderson insulator

Multifractal metal in a disordered Josephson Junctions Array

Predicting Native Language from Gaze

Prominent Object Detection and Recognition: A Saliency-based Pipeline

Strictly Balancing Matrices in Polynomial Time Using Osborne’s Iteration

BDSAR: a new package on Bregman divergence for Bayesian simultaneous autoregressive models

Ruminating Reader: Reasoning with Gated Multi-Hop Attention

Destructive Impact of Molecular Noise on Nanoscale Electrochemical Oscillators

Is there a Teichmüller principle in higher dimensions?

Recognizing Descriptive Wikipedia Categories for Historical Figures

A Challenge Set Approach to Evaluating Machine Translation

Active Bias: Training a More Accurate Neural Network by Emphasizing High Variance Samples

Can Saliency Information Benefit Image Captioning Models?

Detecting English Writing Styles For Non Native Speakers

Entanglement between random and clean quantum spin chains

Controllability of the impulsive semi linear beam equation with memory and delay

On Prediction and Tolerance Intervals for Dynamic Treatment Regimes

Dimer models on cylinders over Dynkin diagrams and cluster algebras

A generic approach to nonparametric function estimation with mixed data

Denoising Linear Models with Permuted Data

Polynomial Norms

GaKCo: a Fast GApped k-mer string Kernel using COunting

The plasticity of some mass transportation networks in the three dimensional Euclidean Space

Strong approximation of density dependent Markov chains on bounded domains

Continuously Differentiable Exponential Linear Units

Bootstrapping Graph Convolutional Neural Networks for Autism Spectrum Disorder Classification

Multi-Task Video Captioning with Video and Entailment Generation

A Context Aware and Video-Based Risk Descriptor for Cyclists

How Close Can I Be? – A Comprehensive Analysis of Cellular Interference on ATC Radar

Covering Uncertain Points in a Tree

Leveraging Patient Similarity and Time Series Data in Healthcare Predictive Models

PPMF: A Patient-based Predictive Modeling Framework for Early ICU Mortality Prediction

Persistence in Stochastic Lotka–Volterra food chains with intraspecific competition

A Labeling-Free Approach to Supervising Deep Neural Networks for Retinal Blood Vessel Segmentation

Learning of Human-like Algebraic Reasoning Using Deep Feedforward Neural Networks

Dynamic Model Selection for Prediction Under a Budget

Some Like it Hoax: Automated Fake News Detection in Social Networks

Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

Bayes model selection

Deep Over-sampling Framework for Classifying Imbalanced Data

Stein Variational Gradient Descent as Gradient Flow

Exponential Change of Measure for General Piecewise Deterministic Markov Processes

High dimensional confounding adjustment using continuous spike and slab priors

Abstract Syntax Networks for Code Generation and Semantic Parsing

Path Planning with Kinematic Constraints for Robot Groups

Popular Matching with Lower Quotas

Semi-supervised Bayesian Deep Multi-modal Emotion Recognition

Beyond WYSIWYG: Sharing Contextual Sensing Data Through mmWave V2V Communications

A relevance-scalability-interpretability tradeoff with temporally evolving user personas

Molecular De Novo Design through Deep Reinforcement Learning

Adversarial Multi-Criteria Learning for Chinese Word Segmentation

General thinning characterizations of distributions and point processes

Sharing deep generative representation for perceived image reconstruction from human brain activity

Towards a quality metric for dense light fields

Estimating Sparse Signals Using Integrated Wideband Dictionaries

Skeleton-based Action Recognition with Convolutional Neural Networks

Learning Agents in Black-Scholes Financial Markets: Consensus Dynamics and Volatility Smiles

Mapping and discrimination of networks in the complexity-entropy plane

Surrogate-based artifact removal from single-channel EEG

Benefits of spatio-temporal modelling for short term wind power forecasting at both individual and aggregated levels

Spectral Methods – Part 1: A fast and accurate approach for solving nonlinear diffusive problems

A Survey on MIMO Transmission with Discrete Input Signals: Technical Challenges, Advances, and Future Trends

Joint Transmit and Receive Filter Optimization for Sub-Nyquist Wireless Channel Estimation

Controlling percolation with limited resources

Optimal Demand-Side Management for Joint Privacy-Cost Optimization with Energy Storage

Joint POS Tagging and Dependency Parsing with Transition-based Neural Networks

Optical Non-Orthogonal Multiple Access for Visible Light Communication

Tapping the sensorimotor trajectory

280 Birds with One Stone: Inducing Multilingual Taxonomies from Wikipedia using Character-level Classification

Indexing Weighted Sequences: Neat and Efficient

Taxonomy Induction using Hypernym Subsequences

Wireless Surveillance of Two-Hop Communications

Joint Layout Estimation and Global Multi-View Registration for Indoor Reconstruction

Controlling the Error on Target Motion through Real-time Mesh Adaptation: Applications to Deep Brain Stimulation

Violation of the sphericity assumption and its effect on Type-I error rates in repeated measures ANOVA and multi-level linear models (MLM)

Reply to Hicks et al 2017, Reply to Morrison et al 2016 Refining the relevant population in forensic voice comparison, Reply to Hicks et al 2015 The importance of distinguishing info from evidence/observations when formulating propositions

Bayesian nonparametric estimation of multivariate survival functions

Fast Space Optimal Leader Election in Population Protocols

Graph Sampling for Covariance Estimation

An All-Pair Approach for Big Data Multiclass Classification with Quantum SVM

On the spherical convexity of quadratic functions

Some New Balanced and Almost Balanced Quaternary Sequences with Low Autocorrelation

Single-Pass PCA of Large High-Dimensional Data

Arrangements of pseudocircles on surfaces

Coding for Arbitrarily Varying Remote Sources

Perivascular Spaces Segmentation in Brain MRI Using Optimal 3D Filtering

An efficient data structure for counting all linear extensions of a poset, calculating its jump number, and the likes

Inception Recurrent Convolutional Neural Network for Object Recognition

Succinct Approximate Rank Queries

Masked Signal Decomposition Using Subspace Representation and Its Applications

Residual $q$-Fano Planes and Related Structures

System of unbiased representatives for a collection of bicolorings

Complete diagrammatics of the single ring theorem

Speeding up Convolutional Neural Networks By Exploiting the Sparsity of Rectifier Units

Geometric Nature Rules Structure/Potential-Energy-Surface Correspondence

Lattices and quadratic forms from tight frames in Euclidean spaces

Smoothness-constrained model for nonparametric item response theory

High storage capacity in the Hopfield model with auto-interactions – stability analysis

Secure Transmission with Large Numbers of Antennas and Finite Alphabet Inputs

Fine-Grained Entity Typing with High-Multiplicity Assignments

Alternating Sign Matrices and Hypermatrices, and a Generalization of Latin Square

Joint Sequence Learning and Cross-Modality Convolution for 3D Biomedical Segmentation

User Profile Based Research Paper Recommendation

Email Babel: Does Language Affect Criminal Activity in Compromised Webmail Accounts?

Low-Complexity Robust MISO Downlink Precoder Design With Per-Antenna Power Constraints

A lower bound on the differential entropy of log-concave random vectors with applications

Tractable Dual Optimal Stochastic Model Predictive Control: An Example in Healthcare

Performance of Model Predictive Control of POMDPs

Constructing and Counting Hexaflexagons

Magnetic Field Dependence of Spin Glass Free Energy Barriers

Matching microscopic and macroscopic responses in glasses

Tight bounds on the coefficients of partition functions via stability

Finding Exogenous Variation in Data

On variance estimation for generalizing from a trial to a target population

FWDA: a Fast Wishart Discriminant Analysis with its Application to Electronic Health Records Data Classification

Concave Flow on Small Depth Directed Networks

Arabidopsis roots segmentation based on morphological operations and CRFs

Ribbon graphs and the fundamental group of surfaces

Limitations on Transversal Computation through Quantum Homomorphic Encryption

Invariant Measures for TASEP with a Slow Bond

Variation of ionic conductivity in a plastic-crystalline mixture

SfM-Net: Learning of Structure and Motion from Video

A Note on Experiments and Software For Multidimensional Order Statistics

A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates

Hand Keypoint Detection in Single Images using Multiview Bootstrapping

Unsupervised Learning of Depth and Ego-Motion from Video

The Use of Multidimensional RAS Method in Input-Output Matrix Estimation

Introspective Classifier Learning: Empower Generatively