An Automated Text Categorization Framework based on Hyperparameter Optimization

The amount of textual data generated in environments such as social media, blogs, and online newspapers has attracted the attention of the scientific community, which seeks to automate and improve several tasks that were previously performed manually, such as sentiment analysis, user profiling, and text categorization, to mention just a few. Fortunately, several of these activities can be posed as a classification problem, i.e., a problem where one is interested in developing a function, from a set of texts with associated labels, capable of predicting a label given an unseen text. In this contribution, we propose a text classifier named \muTC. \muTC is composed of a number of easy-to-implement text transformations, text representations, and a machine learning algorithm that together produce a competitive classifier, even over informally written text, when these parts are correctly configured. We provide a detailed description of \muTC along with an extensive experimental comparison with the relevant state-of-the-art methods. \muTC was compared on 30 different datasets, obtaining the best performance (in terms of accuracy) in 18 of them. The datasets cover several problems, including topic and polarity classification, spam detection, user profiling, and authorship attribution. Furthermore, our approach allows the technology to be used even by users without knowledge of machine learning and natural language processing.

Supervised Deep Hashing for Hierarchical Labeled Data

Recently, hashing methods have been widely used in large-scale image retrieval. However, most existing hashing methods do not consider the hierarchical relation of labels, which means that they ignore the rich information stored in the hierarchy. Moreover, most previous works treat each bit in a hash code equally, which does not suit hierarchical labeled data. In this paper, we propose a novel deep hashing method, called supervised hierarchical deep hashing (SHDH), to perform hash code learning for hierarchical labeled data. Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each layer, and design a deep convolutional neural network to obtain a hash code for each data point. Extensive experiments on several real-world public datasets show that the proposed method outperforms the state-of-the-art baselines in the image retrieval task.

Conceptualization Topic Modeling

Recently, topic modeling has been widely used to discover the abstract topics in text corpora. Most of the existing topic models are based on the assumption of a three-layer hierarchical Bayesian structure, i.e., each document is modeled as a probability distribution over topics, and each topic is a probability distribution over words. However, this assumption is not optimal. Intuitively, it is more reasonable to assume that each topic is a probability distribution over concepts, and that each concept is in turn a probability distribution over words, i.e., to add a latent concept layer between the topic layer and the word layer of the traditional three-layer assumption. In this paper, we verify the proposed assumption by incorporating it into two representative topic models, obtaining two novel topic models. Extensive experiments were conducted on the proposed models and the corresponding baselines, and the results show that the proposed models significantly outperform the baselines in terms of case study and perplexity, which means the new assumption is more reasonable than the traditional one.
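
The layered assumption described in this abstract can be written as a mixture. The following is only a sketch: the symbols d (document), z (topic), c (concept), and w (word) are assumed notation, not taken from the paper. The first line is the traditional three-layer assumption, and the second adds the latent concept layer between topics and words.

```latex
% Traditional three-layer assumption: document -> topic -> word
p(w \mid d) = \sum_{z} p(z \mid d)\, p(w \mid z)

% With a latent concept layer: document -> topic -> concept -> word
p(w \mid d) = \sum_{z} p(z \mid d) \sum_{c} p(c \mid z)\, p(w \mid c)
```

Under this reading, the second form reduces to the first by setting p(w | z) = \sum_c p(c | z) p(w | c), so the concept layer acts as a structured factorization of the topic-word distribution.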

Multivariate Count Autoregression

We study the problems of modeling and inference for multivariate count time series data with Poisson marginals. The focus is on linear and log-linear models. To study the properties of such processes, we develop a novel conceptual framework based on copulas. However, our approach does not impose the copula on a vector of counts; instead, the joint distribution is determined by imposing a copula function on a vector of associated continuous random variables. This specific construction avoids conceptual difficulties resulting from the joint distribution of discrete random variables, yet it keeps the properties of the Poisson process marginally. We employ Markov chain theory and the notion of weak dependence to study ergodicity and stationarity of the models we consider. We obtain easily verifiable conditions for both linear and log-linear models under both theoretical frameworks. Suitable estimating equations are suggested for estimating unknown model parameters. The large sample properties of the resulting estimators are studied in detail. The work concludes with some simulations and a real data example.
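
For orientation, linear and log-linear count autoregressions are commonly written in the form below. This is a sketch under assumptions: the abstract does not give the equations, and the symbols (intensity vector \lambda_t, its logarithm \nu_t, intercept d, coefficient matrices A and B, history \mathcal{F}_{t-1}) are assumed notation for the usual first-order formulation.

```latex
% Marginally Poisson counts, component by component (assumed notation)
Y_{i,t} \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_{i,t}), \qquad i = 1, \dots, p

% Linear model: the intensity is driven by its own past and by past counts
\lambda_t = d + A\,\lambda_{t-1} + B\,Y_{t-1}

% Log-linear model: the same recursion on \nu_t = \log \lambda_t,
% with \mathbf{1}_p added so that the logarithm is defined at zero counts
\nu_t = d + A\,\nu_{t-1} + B\,\log\!\left(Y_{t-1} + \mathbf{1}_p\right)
```

In the linear form, d, A, and B must have nonnegative entries to keep the intensities positive, whereas the log-linear form allows negative coefficients.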

When is Network Lasso Accurate?

The network Lasso is a recently proposed method for clustering and optimization problems arising from massive network-structured datasets, i.e., big data over networks. It is a variant of the well-known least absolute shrinkage and selection operator (Lasso), which underlies many methods in learning and signal processing involving sparse models. While some work has been devoted to studying efficient and scalable implementations of the network Lasso, little is known about the conditions on the underlying network structure required for the network Lasso to be accurate. We address this gap by giving precise conditions on the underlying network topology which guarantee that the network Lasso is accurate.
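
As a reminder of what is being analyzed, the network Lasso is usually stated as the optimization problem below. This is a sketch of the general formulation; the symbols (graph G = (V, E), node variables x_i, local losses f_i, edge weights w_{jk}, regularization strength \lambda) are assumed notation, and the specific variant studied in the paper may differ.

```latex
% Network Lasso over a graph G = (V, E) (assumed notation)
\min_{\{x_i\}_{i \in V}} \;
  \sum_{i \in V} f_i(x_i)
  \;+\; \lambda \sum_{(j,k) \in E} w_{jk}\, \lVert x_j - x_k \rVert_2
```

Because the edge penalty uses an unsquared norm, it drives many differences x_j - x_k exactly to zero, which is what produces clusters of nodes sharing a common value.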

Hierarchical Clustering: Objective Functions and Algorithms

Hierarchical clustering is a recursive partitioning of a dataset into clusters at an increasingly finer granularity. Motivated by the fact that most work on hierarchical clustering was based on providing algorithms, rather than optimizing a specific objective, Dasgupta framed similarity-based hierarchical clustering as a combinatorial optimization problem, where a ‘good’ hierarchical clustering is one that minimizes some cost function. He showed that this cost function has certain desirable properties. We take an axiomatic approach to defining ‘good’ objective functions for both similarity- and dissimilarity-based hierarchical clustering. We characterize a set of ‘admissible’ objective functions (which includes Dasgupta’s) that have the property that when the input admits a ‘natural’ hierarchical clustering, it has an optimal value. Equipped with a suitable objective function, we analyze the performance of practical algorithms, as well as develop better algorithms. For similarity-based hierarchical clustering, Dasgupta showed that the divisive sparsest-cut approach achieves an O(\log^{3/2} n)-approximation. We give a refined analysis of the algorithm and show that it in fact achieves an O(\sqrt{\log n})-approx. (Charikar and Chatziafratis independently proved that it is an O(\sqrt{\log n})-approx.). This improves upon the LP-based O(\log n)-approx. of Roy and Pokutta. For dissimilarity-based hierarchical clustering, we show that the classic average-linkage algorithm gives a factor 2 approx., and provide a simple and better algorithm that gives a factor 3/2 approx. Finally, we consider a ‘beyond-worst-case’ scenario through a generalisation of the stochastic block model for hierarchical clustering. We show that Dasgupta’s cost function has desirable properties for these inputs and we provide a simple 1 + o(1)-approximation in this setting.
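
To make the cost function mentioned in this abstract concrete, here is a minimal sketch of Dasgupta's objective: each pair of leaves contributes its similarity multiplied by the number of leaves under the pair's lowest common ancestor. The tree encoding (nested tuples with integer leaves), the helper names `dasgupta_cost` and `weight`, and the toy similarities are all assumptions made for illustration, not part of the paper.

```python
from itertools import combinations


def dasgupta_cost(tree, weight):
    """Return (leaves, cost) for a tree given as nested tuples of leaf labels.

    Hypothetical encoding: an internal node is a tuple of subtrees,
    anything that is not a tuple is treated as a leaf.
    """
    if not isinstance(tree, tuple):
        return [tree], 0.0
    child_results = [dasgupta_cost(child, weight) for child in tree]
    leaves = [leaf for child_leaves, _ in child_results for leaf in child_leaves]
    cost = sum(child_cost for _, child_cost in child_results)
    # Pairs of leaves in different children are separated at this node,
    # so each such pair pays its similarity times the size of this subtree.
    for (left, _), (right, _) in combinations(child_results, 2):
        for i in left:
            for j in right:
                cost += weight(i, j) * len(leaves)
    return leaves, cost


# Toy similarities (illustrative only): 0-1 and 2-3 are similar pairs.
SIMILAR = {frozenset((0, 1)): 1.0, frozenset((2, 3)): 1.0}


def weight(i, j):
    """Similarity of two leaves; unlisted pairs get a small default."""
    return SIMILAR.get(frozenset((i, j)), 0.1)


if __name__ == "__main__":
    _, good = dasgupta_cost(((0, 1), (2, 3)), weight)
    _, bad = dasgupta_cost(((0, 2), (1, 3)), weight)
    print("similar leaves merged low:", good)
    print("similar leaves split at root:", bad)
```

The tree that keeps similar leaves together until low in the hierarchy receives the smaller cost, which is the behavior a ‘good’ objective is meant to reward.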

Variance Based Moving K-Means Algorithm

Clustering is a useful exploratory data analysis method with wide applicability in multiple fields. However, data clustering relies heavily on the initialization of cluster centers, which can result in large intra-cluster variance and dead centers, and therefore lead to sub-optimal solutions. This paper proposes a novel variance-based version of the conventional Moving K-Means (MKM) algorithm, called Variance Based Moving K-Means (VMKM), that can partition data into optimal homogeneous clusters irrespective of cluster initialization. The algorithm utilizes a novel distance metric and a unique data element selection criterion to transfer the selected elements between clusters, achieving low intra-cluster variance and thereby avoiding dead centers. Quantitative and qualitative comparisons with various clustering techniques are performed on four datasets selected from image processing, bioinformatics, remote sensing, and the stock market, respectively. An extensive analysis highlights the superior performance of the proposed method over other techniques.

TransNets: Learning to Transform for Recommendation

Recently, deep learning methods have been shown to improve the performance of recommender systems over traditional methods, especially when review text is available. For example, a recent model, DeepCoNN, uses neural nets to learn one latent representation for the text of all reviews written by a target user, and a second latent representation for the text of all reviews for a target item, and then combines these latent representations to obtain state-of-the-art performance on recommendation tasks. We show that (unsurprisingly) much of the predictive value of review text comes from reviews of the target user for the target item. We then introduce a way in which this information can be used in recommendation, even when the target user’s review for the target item is not available. Our model, called TransNets, extends the DeepCoNN model by introducing an additional latent layer representing the target user-target item pair. We then regularize this layer, at training time, to be similar to another latent representation of the target user’s review of the target item. We show that TransNets and extensions of it improve substantially over the previous state-of-the-art.

Adversarial Generator-Encoder Networks

We present a new autoencoder-type architecture that is trainable in an unsupervised mode, sustains both generation and inference, and has the quality of conditional and unconditional samples boosted by adversarial learning. Unlike previous hybrids of autoencoders and adversarial networks, the adversarial game in our approach is set up directly between the encoder and the generator, and no external mappings are trained in the process of learning. The game objective compares the divergences of each of the real and the generated data distributions with the canonical distribution in the latent space. We show that the direct generator-vs-encoder game leads to a tight coupling of the two components, resulting in samples and reconstructions of a quality comparable to that of some recently proposed, more complex architectures.

The quasiprobability behind the out-of-time-ordered correlator

A stable and optimally convergent LaTIn-Cut Finite Element Method for multiple unilateral contact problems

Strictly positive models for propensity scores

A Characterization of Undirected Graphs Admitting Optimal Cost Shares

A Delay-Aware Caching Algorithm for Wireless D2D Caching Networks

Random Access Analysis for Massive IoT Networks under A New Spatio-Temporal Model: A Stochastic Geometry Approach

An efficient algorithm for compression-based compressed sensing

Secure Transmission of Delay-Sensitive Data over Wireless Fading Channels

Optimizing Adiabatic Quantum Program Compilation using a Graph-Theoretic Framework

A Comparison of Parallel Graph Processing Benchmarks

DIMM-SC: A Dirichlet mixture model for clustering droplet-based single cell transcriptomic data

A Software-equivalent SNN Hardware using RRAM-array for Asynchronous Real-time Learning

Conflict-Free Coloring of Intersection Graphs of Geometric Objects

Associative content-addressable networks with exponentially many robust stable states

Minimum energy for linear systems with finite horizon: a non-standard Riccati equation

Using stacking to average Bayesian predictive distributions

Detecting optimality and extracting solutions in polynomial optimization with the truncated GNS construction

Treatment-Response Models for Counterfactual Reasoning with Continuous-time, Continuous-valued Interventions

Speech signals frequency modulation decoding via deep neural networks

Link Flow Correction For Inconsistent Traffic Flow Data Via $\ell_1$-Minimization

Optimal Las Vegas Locality Sensitive Data Structures

Angle-Based Joint and Individual Variation Explained

Probabilistic Recurrence Relations for Work and Span of Parallel Algorithms

Tree-based unrooted phylogenetic networks

Scaling limit of random forests with prescribed degree sequences

Computational Approaches for Zero Forcing and Related Problems

Ordering of bicyclic graphs by matching energy

Prediction with Dimension Reduction of Multiple Molecular Data Sources for Patient Survival

Convolutional Neural Pyramid for Image Processing

Distributed Average Tracking for Lipschitz-Type Nonlinear Dynamical Systems

Codes with Unequal Disjoint Local Erasure Correction Constraints

Conversation Modeling on Reddit using a Graph-Structured LSTM

Evolution in Groups: A deeper look at synaptic cluster driven evolution of deep neural networks

Continuous data assimilation for the magnetohydrodynamic equations in 2D using one component of the velocity and magnetic fields

‘RAPID’ Regions-of-Interest Detection In Big Histopathological Images

A second-order PHD filter with mean and variance in target number

A Zero Knowledge Sumcheck and its Applications

Precise Real-Time Navigation of LEO Satellites Using a Single-Frequency GPS Receiver and Ultra-Rapid Ephemerides

Axiomatisability and hardness for universal Horn classes of hypergraphs

Large deviations for i.i.d. replications of the total progeny of a Galton–Watson process

Generalized fractional Brownian motion

Total Variation Minimization in Compressed Sensing

Restricted Isometry Property of Gaussian Random Projection for Finite Set of Subspaces

Non-linear maximum rank distance codes in the cyclic model for the field reduction of finite geometries

Generalized Rank Pooling for Activity Recognition

Improving content marketing processes with the approaches by artificial intelligence

Partial Face Detection in the Mobile Domain

On joint weak convergence of partial sum and maxima processes

Jet Constituents for Deep Neural Network Based Top Quark Tagging

An entropic interpolation problem for incompressible viscid fluids

Modeling and Analysis of HetNets with mm-Wave Multi-RAT Small Cells Deployed Along Roads

$\boldsymbol{\mathbb{L}^p(p\ge2)}$-solutions of generalized BSDEs with jumps and monotone generator in a general filtration

Adposition Supersenses v2

Randomly stopped maximum and maximum of sums with consistently varying distributions

On Approximate Diagnosability of Nonlinear Systems

Asymptotic behaviour of non-isotropic random walks with heavy tails

Quantum ensembles of quantum classifiers

The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing

Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation

Dynamics of the extended Aubry-André-Harper model with localization transition

ReLayNet: Retinal Layer and Fluid Segmentation of Macular Optical Coherence Tomography using Fully Convolutional Network

Locally-adapted convolution-based super-resolution of irregularly-sampled ocean remote sensing data

Egocentric Video Description based on Temporally-Linked Sequences

Exchangeable pairs on Wiener chaos

Semi-Latent GAN: Learning to generate and modify facial images from attributes

Invasion probabilities, hitting times, and some fluctuation theory for the stochastic logistic process

The $\mathfrak{sl}_\infty$-crystal combinatorics of higher level Fock spaces

A backward Kolmogorov equation approach to compute means, moments and correlations of non-smooth stochastic dynamical systems

On the size of $k$-cross-free families

New Subquadratic Approximation Algorithms for the Girth

A Graphical method for simplifying Bayesian Games

Proportional Approval Voting, Harmonic k-median, and Negative Association

Optimality in cellular storage via the Pontryagin Maximum Principle

The (1+$λ$) Evolutionary Algorithm with Self-Adjusting Mutation Rate

Langevin dynamics for ramified structures

Could you guess an interesting movie from the posters?: An evaluation of vision-based features on movie poster database

Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor

Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption

High-Quality Correspondence and Segmentation Estimation for Dual-Lens Smart-Phone Portraits

Introducing Inner Nested Sampling

A Joint Quantile and Expected Shortfall Regression Framework

EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION

OBTAIN: Real-Time Beat Tracking in Audio Signals

Investigating Natural Image Pleasantness Recognition using Deep Features and Eye Tracking for Loosely Controlled Human-computer Interaction

Empirical best prediction for small area estimation using nonparametric maximum likelihood

Hand3D: Hand Pose Estimation using 3D Neural Network

Clothing and People – A Social Signal Processing Perspective

On the number of perfect lattices

On the First-Order Complexity of Induced Subgraph Isomorphism

Echantillonnage de signaux sur graphes via des processus déterminantaux

Modulation in the Air: Backscatter Communication over Ambient OFDM Carrier

Learned Watershed: End-to-End Learning of Seeded Segmentation

Recurrent Environment Simulators

Weak vs. Strong Disorder Superfluid-Bose Glass Transition in One Dimension

Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi

A Converse Bound on Wyner-Ahlswede-Körner Network via Gray-Wyner Network

NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis

Axiomatization of an importance index for $k$-ary games

Deep Unsupervised Similarity Learning using Partially Ordered Sets

A Highly-Efficient Memory-Compression Scheme for GPU-Accelerated Intrusion Detection Systems

Spatial Alignment of Coding and Modulation Helps Content Delivery

GLoP: Enabling Massively Parallel Incident Response Through GPU Log Processing

DeepCoder: Semi-parametric Variational Autoencoders for Facial Action Unit Intensity Estimation

Thresholding Bandits with Augmented UCB

A Bayesian Estimation for the Fractional Order of the Differential Equation that Models Transport in Unconventional Hydrocarbon Reservoirs

Threat analysis of IoT networks Using Artificial Neural Network Intrusion Detection System

Comparison of Global Algorithms in Word Sense Disambiguation

Topological States in the Kuramoto Model

On Optimal Weighted-Delay Scheduling in Input-Queued Switches

Algorithms for Stable Matching and Clustering in a Grid

Matrix Scaling and Balancing via Box Constrained Newton’s Method and Interior Point Methods

A Constrained Sequence-to-Sequence Neural Model for Sentence Simplification

Much Faster Algorithms for Matrix Scaling