GPU-based Commonsense Paradigms Reasoning for Real-Time Query Answering and Multimodal Analysis

We utilize commonsense knowledge bases to address the problem of real-time multimodal analysis. In particular, we focus on the problem of multimodal sentiment analysis, which consists in the simultaneous analysis of different modalities, e.g., speech and video, for emotion and polarity detection. Our approach takes advantage of the massively parallel processing power of modern GPUs to enhance the performance of feature extraction from different modalities. In addition, in order to extract important textual features from multimodal sources, we generate domain-specific graphs based on commonsense knowledge and apply GPU-based graph traversal for fast feature detection. Then, powerful ELM classifiers are applied to build the sentiment analysis model based on the extracted features. We conduct our experiments on the YouTube dataset and achieve an accuracy of 78%, which outperforms all previous systems. In terms of processing speed, our method shows improvements of several orders of magnitude for feature extraction compared to CPU-based counterparts.

Talent Flow Analytics in Online Professional Network

Analyzing job hopping behavior is important for the understanding of job preference and career progression of working individuals. When analyzed at the workforce population level, job hop analysis helps to gain insights into talent flow among different jobs and organizations. Traditionally, surveys are conducted on job seekers and employers to study job hop behavior. Beyond surveys, job hop behavior can also be studied in a highly scalable and timely manner using a data-driven approach in response to a fast-changing job landscape. Fortunately, the advent of online professional networks (OPNs) has made it possible to perform a large-scale analysis of talent flow. In this paper, we present a new data analytics framework to analyze the talent flow patterns of close to 1 million working professionals from three different countries/regions using their publicly-accessible profiles in an established OPN. As OPN data are originally generated for professional networking applications, our proposed framework re-purposes the same data for a different analytics task. Prior to performing job hop analysis, we devise a job title normalization procedure to mitigate the amount of noise in the OPN data. We then devise several metrics to measure the amount of work experience required to take up a job, to determine the existence duration of the job (also known as the job age), and to quantify the correlation between the above metrics and the propensity of hopping. We also study how job hop behavior is related to job promotion/demotion. Lastly, we perform connectivity analysis at job and organization levels to derive insights on talent flow as well as job and organizational competitiveness.

Recurrent Neural Networks for Long and Short-Term Sequential Recommendation

Recommender system objectives can be broadly characterized as modeling user preferences over short- or long-term time horizons. A large body of previous research studied long-term recommendation through dimensionality reduction techniques applied to the historical user-item interactions. A recently introduced session-based recommendation setting highlighted the importance of modeling short-term user preferences. In this task, Recurrent Neural Networks (RNN) have been shown to be successful at capturing the nuances of a user’s interactions within a short time window. In this paper, we evaluate RNN-based models on both short-term and long-term recommendation tasks. Our experimental results suggest that RNNs are capable of predicting immediate as well as distant user interactions. We also find the best performing configuration to be a stacked RNN with layer normalization and tied item embeddings.

Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets

To select the best algorithm for a new problem is an expensive and difficult task. However, there are automatic solutions to address this problem: using Metalearning, which takes advantage of problem characteristics (i.e. metafeatures), one is able to predict the relative performance of algorithms. In the Collaborative Filtering scope, recent works have proposed diverse metafeatures describing several dimensions of this problem. Despite interesting and effective findings, it is still unknown whether these are the most effective metafeatures. Hence, this work proposes a new set of graph metafeatures, which approach the Collaborative Filtering problem from a Graph Theory perspective. Furthermore, in order to understand whether metafeatures from multiple dimensions are a better fit, we investigate the effects of comprehensive metafeatures. These metafeatures are a selection of the best metafeatures from all existing Collaborative Filtering metafeatures. The impact of the most representative metafeatures is investigated in a controlled experimental setup. Another contribution we present is the use of a Pareto-Efficient ranking procedure to create multicriteria metatargets. These new rankings of algorithms, which take into account multiple evaluation measures, allow one to explore the algorithm selection problem in a fairer and more detailed way. According to the experimental results, the graph metafeatures are a good alternative to related work metafeatures. However, the results have shown that the feature selection procedure used to create the comprehensive metafeatures is not effective, since there is no gain in predictive performance. Finally, an extensive metaknowledge analysis was conducted to identify the most influential metafeatures.
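
As a rough illustration of how a Pareto-efficient ranking over several evaluation measures can be built, the sketch below peels off successive non-dominated fronts. It is not the authors' procedure: the function names, the algorithm names, and the scores are hypothetical, and every measure is assumed to be "higher is better".

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is at least as good as b on every measure and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranking(scores: Dict[str, Tuple[float, ...]]) -> List[List[str]]:
    """Peel successive Pareto fronts; front 0 holds the non-dominated algorithms."""
    remaining = dict(scores)
    fronts: List[List[str]] = []
    while remaining:
        # An algorithm belongs to the current front if no other remaining one dominates it.
        front = sorted(
            name
            for name, s in remaining.items()
            if not any(dominates(t, s) for other, t in remaining.items() if other != name)
        )
        fronts.append(front)
        for name in front:
            del remaining[name]
    return fronts


# Hypothetical scores on two evaluation measures (both higher-is-better).
example_scores = {
    "algo_a": (0.90, 0.40),
    "algo_b": (0.85, 0.55),
    "algo_c": (0.80, 0.35),
    "algo_d": (0.70, 0.30),
}
fronts = pareto_ranking(example_scores)
```

Algorithms in the same front are treated as tied, which is one way a ranking can respect several measures at once instead of collapsing them into a single score.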

RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data

With the improvement of medical data capturing, vast amounts of continuous patient monitoring data, e.g., electrocardiogram (ECG), real-time vital signs and medications, have become available for clinical decision support at intensive care units (ICUs). However, it becomes increasingly challenging to model such data, due to the high density of the monitoring data, heterogeneous data types and the requirement for interpretable models. Integration of these high-density monitoring data with the discrete clinical events (including diagnosis, medications, labs) is challenging but potentially rewarding, since richness and granularity in such multimodal data increase the possibilities for accurate detection of complex problems and predicting outcomes (e.g., length of stay and mortality). We propose the Recurrent Attentive and Intensive Model (RAIM) for jointly analyzing continuous monitoring data and discrete clinical events. RAIM introduces an efficient attention mechanism for continuous monitoring data (e.g., ECG), which is guided by discrete clinical events (e.g., medication usage). We apply RAIM to predicting physiological decompensation and length of stay for critically ill patients in the ICU. With evaluations on the MIMIC-III Waveform Database Matched Subset, we obtain an AUC-ROC score of 90.18% for predicting decompensation and an accuracy of 86.82% for forecasting length of stay with our final model, which outperforms our six baseline models.

Supporting Very Large Models using Automatic Dataflow Graph Partitioning

There is a trend towards using very large deep neural networks (DNN) to improve the accuracy of complex machine learning tasks. However, the size of DNN models that can be explored today is limited by the amount of GPU device memory. This paper presents Tofu, a system for partitioning very large DNN models across multiple GPU devices. Tofu is designed for a tensor-based dataflow system: for each operator in the dataflow graph, it partitions its input/output tensors and parallelizes its execution across workers. Tofu can automatically discover how each operator can be partitioned by analyzing its semantics expressed in a simple specification language. Tofu uses a search algorithm based on dynamic programming to determine the best partition strategy for each operator in the entire dataflow graph. Our experiments on an 8-GPU machine show that Tofu enables the training of very large CNN and RNN models. It also achieves better performance than alternative approaches to train very large models on multiple GPUs.

A Structured Perspective of Volumes on Active Learning

Active Learning (AL) is a learning task that requires learners to interactively query the labels of sampled unlabeled instances to minimize the training outputs with human supervision. In theoretical study, learners approximate the version space, which covers all possible classification hypotheses, into a bounded convex body and try to shrink its volume into a half-space by a given cut size. However, only the hypersphere with finite VC dimensions has obtained formal approximation guarantees that hold when the classes of Euclidean space are separable with a margin. In this paper, we approximate the version space to a structured hypersphere that covers most of the hypotheses, and then divide the available AL sampling approaches into two kinds of strategies: Outer Volume Sampling and Inner Volume Sampling. After providing provable guarantees for the performance of AL in version space, we aggregate the two kinds of volumes to eliminate their sampling biases via finding the optimal inscribed hyperspheres in the enclosing space of outer volume. To touch the version space from Euclidean space, we propose a theoretical bridge called Volume-based Model that increases the ‘sampling target-independent’. In non-linear feature space, spanned by kernel, we use sequential optimization to globally optimize the original space to a sparse space by halving the size of the kernel space. Then, the EM (Expectation Maximization) model, which returns the local center, helps us to find a local representation. To describe this process, we propose an easy-to-implement algorithm called Volume-based AL (VAL).

Anomaly Detection of Complex Networks Based on Intuitionistic Fuzzy Set Ensemble

Ensemble learning for anomaly detection of data structured into complex networks has been barely studied due to the inconsistent performance of complex network characteristics and the lack of an inherent objective function. In this paper, we propose IFSAD, a new two-phase ensemble method for anomaly detection based on intuitionistic fuzzy sets, and apply it to the abnormal behavior detection problem in temporal complex networks. First, it constructs the intuitionistic fuzzy set (IFS) of a single network characteristic, which quantifies the degree of membership, non-membership and hesitation of each network characteristic to the defined linguistic variables, so that unhelpful or noisy characteristics become part of the detection. To build an objective intuitionistic fuzzy relationship, we propose a Gaussian distribution-based membership function which gives a variable hesitation degree. Then, for the fuzzification of multiple network characteristics, the intuitionistic fuzzy weighted geometric operator is adopted to fuse multiple IFSs and to avoid the inconsistency of multiple characteristics. Finally, the score function and precision function are used to sort the fused IFS. We carried out extensive experiments on several complex network datasets for anomaly detection, and the results demonstrate the superiority of our method to state-of-the-art approaches, validating the effectiveness of our method.
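
The fusion and sorting steps can be sketched as follows. The abstract does not spell out its formulas, so this assumes the textbook definitions from the intuitionistic fuzzy set literature: the weighted geometric operator on (membership, non-membership) pairs, the score function `mu - nu`, and the accuracy function `mu + nu` (taken here as what the abstract calls the precision function). All names and numbers are hypothetical.

```python
from math import prod
from typing import List, Sequence, Tuple

# An intuitionistic fuzzy number: (membership, non-membership), with their sum <= 1.
IFN = Tuple[float, float]


def ifwg(values: Sequence[IFN], weights: Sequence[float]) -> IFN:
    """Intuitionistic fuzzy weighted geometric operator (weights assumed to sum to 1)."""
    mu = prod(m ** w for (m, _), w in zip(values, weights))
    nu = 1.0 - prod((1.0 - n) ** w for (_, n), w in zip(values, weights))
    return mu, nu


def hesitation(value: IFN) -> float:
    """Degree of hesitation: what is left after membership and non-membership."""
    return 1.0 - value[0] - value[1]


def score(value: IFN) -> float:
    return value[0] - value[1]


def accuracy(value: IFN) -> float:
    return value[0] + value[1]


def rank(items: List[Tuple[str, IFN]]) -> List[str]:
    """Sort by score, breaking ties with accuracy (both descending)."""
    ordered = sorted(items, key=lambda kv: (score(kv[1]), accuracy(kv[1])), reverse=True)
    return [name for name, _ in ordered]


# Hypothetical example: fuse two characteristics per node, then rank the nodes.
weights = [0.5, 0.5]
fused = [
    ("node_1", ifwg([(0.8, 0.1), (0.6, 0.2)], weights)),
    ("node_2", ifwg([(0.3, 0.5), (0.4, 0.4)], weights)),
]
ranking = rank(fused)
```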

Anomaly detection in static networks using egonets

Network data has rapidly emerged as an important and active area of statistical methodology. In this paper we consider the problem of anomaly detection in networks. Given a large background network, we seek to detect whether there is a small anomalous subgraph present in the network, and if such a subgraph is present, which nodes constitute the subgraph. We propose an inferential tool based on egonets to answer this question. The proposed method is computationally efficient and naturally amenable to parallel computing, and easily extends to a wide variety of network models. We demonstrate through simulation studies that the egonet method works well under a wide variety of network models. We obtain some fascinating empirical results by applying the egonet method on several well-studied benchmark datasets.
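
For readers unfamiliar with the term, the egonet of a node is the subgraph induced by that node and its neighbors. The sketch below extracts egonets from an adjacency-set representation and computes one simple per-node statistic, the number of edges inside the egonet. The choice of statistic and all names are illustrative assumptions; the abstract does not say which egonet quantity the method uses.

```python
from typing import Dict, Hashable, Set

# Undirected graph as adjacency sets; assumed symmetric and without self-loops.
Graph = Dict[Hashable, Set[Hashable]]


def egonet_nodes(graph: Graph, v: Hashable) -> Set[Hashable]:
    """The node v together with all of its neighbors."""
    return {v} | graph[v]


def egonet_edge_count(graph: Graph, v: Hashable) -> int:
    """Number of edges in the subgraph induced by the egonet of v."""
    nodes = egonet_nodes(graph, v)
    # Each undirected edge inside the egonet is seen once from each endpoint.
    return sum(len(graph[u] & nodes) for u in nodes) // 2


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
toy_graph: Graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2},
}
edge_counts = {v: egonet_edge_count(toy_graph, v) for v in toy_graph}
```

Because each egonet statistic depends only on a node's local neighborhood, the per-node computations are independent of one another, which is consistent with the abstract's remark that the method is amenable to parallel computing.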

A Note on Clustering Aggregation

We consider the clustering aggregation problem in which we are given a set of clusterings and want to find an aggregated clustering which minimizes the sum of mismatches to the input clusterings. In the binary case (each clustering is a bipartition) this problem was known to be NP-hard under Turing reduction. We strengthen this result by providing a polynomial-time many-one reduction. Our result also implies that no $2^{o(n)} \cdot |I|^{O(1)}$-time algorithm exists for any clustering instance $I$ with $n$ elements, unless the Exponential Time Hypothesis fails. On the positive side, we show that the problem is fixed-parameter tractable with respect to the number of input clusterings.
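
To make the objective concrete, the sketch below evaluates the cost of a candidate aggregated clustering. It assumes the usual pairwise-disagreement notion of a mismatch (a pair of elements placed together by one clustering and apart by the other), which the abstract does not define explicitly. The function names and the example data are hypothetical, and the code only scores a candidate; it does not solve the NP-hard minimization.

```python
from itertools import combinations
from typing import List, Sequence


def mismatches(c1: Sequence[int], c2: Sequence[int]) -> int:
    """Count element pairs that one clustering puts together and the other apart.

    A clustering is given as a list of cluster labels, one per element.
    """
    return sum(
        (c1[i] == c1[j]) != (c2[i] == c2[j])
        for i, j in combinations(range(len(c1)), 2)
    )


def aggregation_cost(candidate: Sequence[int], inputs: List[Sequence[int]]) -> int:
    """Sum of mismatches between a candidate clustering and every input clustering."""
    return sum(mismatches(candidate, c) for c in inputs)


# Hypothetical binary instance: three bipartitions of four elements.
input_clusterings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]
cost = aggregation_cost([0, 0, 1, 1], input_clusterings)
```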

Asymptotically Optimal Quickest Change Detection In Multistream Data – Part 1: General Stochastic Models

Assume that there are multiple data streams (channels, sensors) and in each stream the process of interest produces generally dependent and non-identically distributed observations. When the process is in a normal mode (in-control), the (pre-change) distribution is known, but when the process becomes abnormal there is a parametric uncertainty, i.e., the post-change (out-of-control) distribution is known only partially up to a parameter. Both the change point and the post-change parameter are unknown. Moreover, the change affects an unknown subset of streams, so that the number of affected streams and their location are unknown in advance. A good changepoint detection procedure should detect the change as soon as possible after its occurrence while controlling for a risk of false alarms. We consider a Bayesian setup with a given prior distribution of the change point and propose two sequential mixture-based change detection rules: one mixes a Shiryaev-type statistic over both the unknown subset of affected streams and the unknown post-change parameter, and the other mixes a Shiryaev-Roberts-type statistic. These rules generalize the mixture detection procedures studied by Tartakovsky (2018) in a single-stream case. We provide sufficient conditions under which the proposed multistream change detection procedures are first-order asymptotically optimal with respect to moments of the delay to detection as the probability of false alarm approaches zero.

Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!

Argumentation mining (AM) requires the identification of complex discourse structures and has lately been applied with success monolingually. In this work, we show that the existing resources are, however, not adequate for assessing cross-lingual AM, due to their heterogeneity or lack of complexity. We therefore create suitable parallel corpora by (human and machine) translating a popular AM dataset consisting of persuasive student essays into German, French, Spanish, and Chinese. We then compare (i) annotation projection and (ii) bilingual word embeddings based direct transfer strategies for cross-lingual AM, finding that the former performs considerably better and almost eliminates the loss from cross-lingual transfer. Moreover, we find that annotation projection works equally well when using either costly human or cheap machine translations. Our code and data are available at \url{http://…/coling2018-xling_argument_mining}.

Collective Matrix Completion

Matrix completion aims to reconstruct a data matrix based on observations of a small number of its entries. Usually in matrix completion a single matrix is considered, which can be, for example, a rating matrix in a recommendation system. However, in practical situations, data is often obtained from multiple sources, which results in a collection of matrices rather than a single one. In this work, we consider the problem of collective matrix completion with multiple and heterogeneous matrices, which can be count, binary, continuous, etc. We first investigate the setting where, for each source, the matrix entries are sampled from an exponential family distribution. Then, we relax the assumption of exponential family distribution for the noise and we investigate the distribution-free case. In this setting, we do not assume any specific model for the observations. The estimation procedures are based on minimizing the sum of a goodness-of-fit term and the nuclear norm penalization of the whole collective matrix. We prove that the proposed estimators achieve fast rates of convergence under the two considered settings and we corroborate our results with numerical experiments.

Uncertainty Modelling in Deep Networks: Forecasting Short and Noisy Series

Deep Learning is a consolidated, state-of-the-art Machine Learning tool to fit a function when provided with large data sets of examples. However, in regression tasks, the straightforward application of Deep Learning models provides a point estimate of the target. In addition, the model does not take into account the uncertainty of a prediction. This represents a great limitation for tasks where communicating an erroneous prediction carries a risk. In this paper we tackle a real-world problem of forecasting impending financial expenses and incomings of customers, while displaying predictable monetary amounts on a mobile app. In this context, we investigate if we would obtain an advantage by applying Deep Learning models with a Heteroscedastic model of the variance of a network’s output. Experimentally, we achieve a higher accuracy than non-trivial baselines. More importantly, we introduce a mechanism to discard low-confidence predictions, which means that they will not be visible to users. This should help enhance the user experience of our product.
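
A minimal sketch of the two ideas in this abstract, a heteroscedastic model of the output variance and a mechanism for discarding low-confidence predictions, is given below. It assumes a Gaussian negative log-likelihood loss over a predicted mean and log-variance, and a fixed threshold on the predicted standard deviation. Both are common choices rather than details stated in the abstract, and all names and numbers are hypothetical.

```python
import math
from typing import List, Tuple


def gaussian_nll(y: float, mean: float, log_var: float) -> float:
    """Negative log-likelihood of y under N(mean, exp(log_var)).

    A network with two outputs (mean, log_var) can be trained on this loss so
    that the predicted variance depends on the input (heteroscedastic).
    """
    return 0.5 * (log_var + (y - mean) ** 2 / math.exp(log_var) + math.log(2.0 * math.pi))


def filter_confident(
    predictions: List[Tuple[float, float]], max_std: float
) -> List[Tuple[float, float]]:
    """Keep only (mean, log_var) predictions whose standard deviation is at most max_std."""
    return [(m, lv) for m, lv in predictions if math.exp(0.5 * lv) <= max_std]


# Hypothetical predictions: (mean amount, log-variance).
predictions = [
    (10.0, 0.0),              # standard deviation 1
    (20.0, math.log(25.0)),   # standard deviation 5
]
shown_to_user = filter_confident(predictions, max_std=2.0)
```

Raising `max_std` shows more predictions at the cost of including less certain ones, so the threshold acts as a trade-off between coverage and reliability.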

Constraint-Based Visual Generation
Every square can be tiled with T-tetrominos and no more than 5 monominos
The Power of One Clean Qubit in Communication Complexity
$S_{12}$ and $P_{12}$-colorings of cubic graphs
On the Geodetic Hull Number of Complementary Prisms
Finite Time Adaptive Stabilization of LQ Systems
Mitigation of Human RF Exposure in 5G Downlink
Sleep Staging by Modeling Sleep Stage Transitions using Deep CRF
Human peripheral blur is optimal for object recognition
Necessary and Sufficient Topological Conditions for Identifiability of Dynamical Networks
A Cognitive Sub-Nyquist MIMO Radar Prototype
Clearing noisy annotations for computed tomography imaging
Identity Preserving Face Completion for Large Ocular Region Occlusion
Maximum rank-distance codes with maximum left and right idealisers
Peeking Behind Objects: Layered Depth Prediction from a Single Image
Two Algorithms to Find Primes in Patterns
Fast Vessel Segmentation and Tracking in Ultra High-Frequency Ultrasound Images
A Study on the Strong Duality of Conic Relaxation of AC Optimal Power Flow in Radial Networks
Dynamics of Langton’s ant allowed to periodically go straight
PCNNA: A Photonic Convolutional Neural Network Accelerator
Theta-vexillary signed permutations
Runoff on rooted trees
Time and place of the maximum for one-dimensional diffusion bridges and meanders
A Faster Deterministic Distributed Algorithm for Weighted APSP Through Pipelining
Hierarchical Classification using Binary Data
Stable Multiple Time Step Simulation/Prediction from Lagged Dynamic Network Regression Models
Fisher Information and Logarithmic Sobolev Inequality for Matrix Valued Functions
Characterizing health informatics journals by subject-level dependencies: a citation network analysis
Lesion segmentation using U-Net network
The $g$-good neighbor conditional diagnosability of locally exchanged twisted cubes
Complex self-sustained oscillation patterns in modular excitable networks
Exact solution of some quarter plane walks with interacting boundaries
Weak in the NEES?: Auto-tuning Kalman Filters with Bayesian Optimization
Toward a language-theoretic foundation for planning and filtering
StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction
One for All, All for One: A Heterogeneous Data Plane for Flexible P4 Processing
On the number of simultaneous core partitions with $d$-distinct parts
Proximal Averages for Minimization of Entropy Functionals
Top-Down Feedback for Crowd Counting Convolutional Neural Network
Sublinear Algorithms for $(\Delta + 1)$ Vertex Coloring
An Efficient System for Subgraph Discovery
Skin Lesion Segmentation Using Atrous Convolution via DeepLab v3
ClusterNet: Instance Segmentation in RGB-D Images
State-space analysis of an Ising model reveals contributions of pairwise interactions to sparseness, fluctuation, and stimulus coding of monkey V1 neurons
Self-produced Guidance for Weakly-supervised Object Localization
Traffic-Aware Backscatter Communications in Wireless-Powered Heterogeneous Networks
Pilot Spoofing Attack by Multiple Eavesdroppers
Meta-Learning Priors for Efficient Online Bayesian Regression
Variation of a Signal in Schwarzschild Spacetime
Panchromatic Sharpening of Remote Sensing Images Using a Multi-scale Approach
The Variational Homoencoder: Learning to learn high capacity generative models from few examples
Competitive Inner-Imaging Squeeze and Excitation for Residual Network
A decision theoretic approach to model evaluation in computational drug discovery
Bivariate network meta-analysis for surrogate endpoint evaluation
CReaM: Condensed Real-time Models for Depth Prediction using Convolutional Neural Networks
Remark on Barnette’s Conjecture
SAAGs: Biased Stochastic Variance Reduction Methods
Combining Heterogeneously Labeled Datasets For Training Segmentation Networks
Semiparametric Slepian-Bangs Formula for Complex Elliptically Symmetric Distributions
On Brownian exit times from some non-convex domains
A Temporal Difference Reinforcement Learning Theory of Emotion: unifying emotion, cognition and adaptive behavior
Example Mining for Incremental Learning in Medical Imaging
Hyperspectral Images Classification Using Energy Profiles of Spatial and Spectral Features
Otem&Utem: Over- and Under-Translation Evaluation Metric for NMT
Dermoscopic Image Analysis for ISIC Challenge 2018
Weak input-to-state stability: characterizations and counterexamples
Convex computation of extremal invariant measures of nonlinear dynamical systems and Markov processes
The Double Sphere Camera Model
Space-Time Extension of the MEM Approach for Electromagnetic Neuroimaging
Note on the zero-free region of the hard-core model
Contrast function estimation for the drift parameter of ergodic jump diffusion process
Computational speedups using small quantum devices
Statistical Characterization of Second Order Scattering Fading Channels
Asymptotic Optimality of Mixture Rules for Detecting Changes in General Stochastic Models
On the equality of the induced matching number and the uniquely restricted matching number for subcubic graphs
Utility maximization for Lévy switching models
Strong convergence rates of modified truncated EM methods for neutral stochastic differential delay equations
Composite likelihood estimation for a Gaussian process under fixed domain asymptotics
Deep-CLASS at ISIC Machine Learning Challenge 2018
Spatial growth processes with long range dispersion: microscopics, mesoscopics, and discrepancy in spread rate
Stabilization of an unstable wave equation using an infinite dimensional dynamic controller
Speakers account for asymmetries in visual perspective so listeners don’t have to
Rule Based Metadata Extraction Framework from Academic Articles
Optimal control of resources for species survival
The periodic Schur process and free fermions at finite temperature
Exploring Tehran with excitable medium
On critical and maximal digraphs
Bounding the Number of Minimal Transversals in Tripartite 3-Uniform Hypergraphs
Behavior of the empirical Wasserstein distance in R^d under moment conditions
Connected greedy coloring $H$-free graphs
Dynamic Optimization of Thermodynamically Rigorous Models of Multiphase Flow in Porous Subsurface Oil Reservoirs
Likelihood-based meta-analysis with few studies: Empirical and simulation studies
Are RLL Codes Suitable for Simultaneous Energy and Information Transfer?
Mean asymptotics for a Poisson-Voronoi cell on a Riemannian manifold
Shortfall-Minimising Dispatch of Heterogeneous Stores and Application to Adequacy Studies
Transient Performance of Electric Power Networks under Colored Noise
On complexity of post-processing in analyzing GATE-driven X-ray spectrum
CaricatureShop: Personalized and Photorealistic Caricature Sketching
Height and contour processes of Crump-Mode-Jagers forests (II): The Bellman-Harris universality class
Remarks on the transcendence of certain infinite products
Feature Fusion through Multitask CNN for Large-scale Remote Sensing Image Segmentation
Learning Human Poses from Actions
On consistency and inconsistency of nonparametric tests
Optional Stopping with Bayes Factors: a categorization and extension of folklore results, with an application to invariant situations
An entropy minimization approach to second-order variational mean-field games
ISIC 2017 Skin Lesion Segmentation Using Deep Encoder-Decoder Network
The Möbius function of ${\rm PSU}(3,2^{2^n})$
Decision Variance in Online Learning
Revisiting the Challenges of MaxClique
Symplectic Isometries of Stabilizer Codes
Noncoherent Multi-User MIMO Communications using Covariance CSIT
Chromosome Painting
Moderate deviations for a stochastic Burgers equation
Cameron-Liebler line classes of ${\rm PG}(3,q)$ admitting ${\rm PGL}(2,q)$
Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition
Zeros of Holant problems: locations and algorithms
Projected Stochastic Gradients for Convex Constrained Problems in Hilbert Spaces
Minimum supports of functions on the Hamming graphs with spectral constraints
Symmetries in left-invariant optimal control problems
Inexact Variable Metric Stochastic Block-Coordinate Descent for Regularized Optimization
Collaborative double robustness using the $e$-score
The realization problem for discrete Morse functions on trees
Residual Network based Aggregation Model for Skin Lesion Classification
Coagulation-transport equations and the nested coalescents
QUEST: Quadriletral Senary bit Pattern for Facial Expression Recognition
The Soft Multivariate Truncated Normal Distribution
Robust Group Comparison Using Non-Parametric Block-Based Statistics
An argument in favor of strong scaling for deep neural networks with small datasets
Partial Person Re-identification with Alignment and Hallucination
Skin disease identification from dermoscopy images using deep convolutional neural network
Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation
Robot Imitation through Vision, Kinesthetic and Force Features with Online Adaptation to Changing Environments
Strong randomness criticality in the scratched-XY model
PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation
Multicolumn Networks for Face Recognition
Chromatic transitions in the emergence of syntax networks
A convex formulation for Discrete Tomography
Self-Paced Learning with Adaptive Deep Visual Embeddings
Theoretical Perspective of Convergence Complexity of Evolutionary Algorithms Adopting Optimal Mixing
Face Mask Extraction in Video Sequence
Deterministic Fitting of Multiple Structures using Iterative MaxFS with Inlier Scale Estimation and Subset Updating
Hardware-In-The-Loop Vulnerability Analysis of a Single-Machine Infinite-Bus Power System
Multi-Class Lesion Diagnosis with Pixel-wise Classification Network
Combinatorics of the Deodhar decomposition of the Grassmannian
Deep Learning on Retina Images as Screening Tool for Diagnostic Decision Support
Improving pairwise comparison models using Empirical Bayes shrinkage
Hierarchical infinite factor model for improving the prediction of surgical complications for geriatric patients
Markov semi-groups associated with the complex unimodular group $Sl(2,\mathbb{C})$
Unsupervised Learning of Latent Physical Properties Using Perception-Prediction Networks
Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks
Likely equilibria of the stochastic Rivlin cube
GANimation: Anatomically-aware Facial Animation from a Single Image
Learning to Generate and Reconstruct 3D Meshes with only 2D Supervision
Time Correlation Exponents in Last Passage Percolation