Fast Heuristic Algorithm for Multi-scale Hierarchical Community Detection

Complex networks constitute the backbones of many complex systems such as social networks. Detecting the community structure in a complex network is both a challenging and a computationally expensive task. In this paper, we present the HAMUHI-CODE, a novel fast heuristic algorithm for multi-scale hierarchical community detection inspired on an agglomerative hierarchical clustering technique. We define a new structural similarity of vertices based on the classical cosine similarity by removing some vertices in order to increase the probability of identifying inter-cluster edges. Then we use the proposed structural similarity in a new agglomerative hierarchical algorithm that does not merge only clusters with maximal similarity as in the classical approach, but merges any cluster that does not meet a parameterized community definition with its most similar adjacent cluster. The algorithm computes all the similar clusters at the same time is checking if each cluster meets the parameterized community definition. It is done in linear time complexity in terms of the number of cluster in the iteration. Since a complex network is a sparse graph, our approach HAMUHI-CODE has a super-linear time complexity with respect to the size of the input in the worst-case scenario (if the clusters merge in pairs), making it suitable to be applied on large-scale complex networks. To test the properties and the efficiency of our algorithm we have conducted extensive experiments on real world and synthetic benchmark networks by comparing it to several baseline state-of-the-art algorithms.

Learning Mixture of Gaussians with Streaming Data

In this paper, we study the problem of learning a mixture of Gaussians with streaming data: given a stream of N points in d dimensions generated by an unknown mixture of k spherical Gaussians, the goal is to estimate the model parameters using a single pass over the data stream. We analyze a streaming version of the popular Lloyd’s heuristic and show that the algorithm estimates all the unknown centers of the component Gaussians accurately if they are sufficiently separated. Assuming each pair of centers are C\sigma distant with C=\Omega((k\log k)^{1/4}\sigma) and where \sigma^2 is the maximum variance of any Gaussian component, we show that asymptotically the algorithm estimates the centers optimally (up to constants); our center separation requirement matches the best known result for spherical Gaussians \citep{vempalawang}. For finite samples, we show that a bias term based on the initial estimate decreases at O(1/{\rm poly}(N)) rate while variance decreases at nearly optimal rate of \sigma^2 d/N. Our analysis requires seeding the algorithm with a good initial estimate of the true cluster centers for which we provide an online PCA based clustering algorithm. Indeed, the asymptotic per-step time complexity of our algorithm is the optimal d\cdot k while space complexity of our algorithm is O(dk\log k). In addition to the bias and variance terms which tend to 0, the hard-thresholding based updates of streaming Lloyd’s algorithm is agnostic to the data distribution and hence incurs an approximation error that cannot be avoided. However, by using a streaming version of the classical (soft-thresholding-based) EM method that exploits the Gaussian distribution explicitly, we show that for a mixture of two Gaussians the true means can be estimated consistently, with estimation error decreasing at nearly optimal rate, and tending to 0 for N\rightarrow \infty.

Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures

We present a simple dynamic batching approach applicable to a large class of dynamic architectures that consistently yields speedups of over 10x. We provide performance bounds when the architecture is not known a priori and a stronger bound in the special case where the architecture is a predetermined balanced tree. We evaluate our approach on Johnson et al.’s recent visual question answering (VQA) result of his CLEVR dataset by Inferring and Executing Programs (IEP). We also evaluate on sparsely gated mixture of experts layers and achieve speedups of up to 1000x over the naive implementation.

Combining Forecasts Using Ensemble Learning

The problem of combining individual forecasters to produce a forecaster with improved performance is considered. The connections between probability elicitation and classification are used to pose the combining forecaster problem as that of ensemble learning. With this connection in place, a number of theoretically sound ensemble learning methods such as Bagging and Boosting are adapted for combining forecasters. It is shown that the simple yet effective method of averaging the forecasts is equivalent to Bagging. This provides theoretical insight into why the well established averaging of forecasts method works so well. Also, a nonlinear combination of forecasters can be attained through Boosting which is shown to theoretically produce combined forecasters that are both calibrated and highly refined. Finally, the proposed methods of combining forecasters are applied to the Good Judgment Project data set and are shown to outperform the individual forecasters.

Tailoring Artificial Neural Networks for Optimal Learning

As one of the most important paradigms of recurrent neural networks, the echo state network (ESN) has been applied to a wide range of fields, from robotics to medicine to finance, and language processing. A key feature of the ESN paradigm is its reservoir —a directed and weighted network— that represents the connections between neurons and projects the input signals into a high dimensional space. Despite extensive studies, the impact of the reservoir network on the ESN performance remains unclear. Here we systematically address this fundamental question. Through spectral analysis of the reservoir network we reveal a key factor that largely determines the ESN memory capacity and hence affects its performance. Moreover, we find that adding short loops to the reservoir network can tailor ESN for specific tasks and optimal learning. We validate our findings by applying ESN to forecast both synthetic and real benchmark time series. Our results provide a new way to design task-specific recurrent neural networks, as well as new insights in understanding complex networked systems.

Event Stream Processing with Multiple Threads

Current runtime verification tools seldom make use of multi-threading to speed up the evaluation of a property on a large event trace. In this paper, we present an extension to the BeepBeep 3 event stream engine that allows the use of multiple threads during the evaluation of a query. Various parallelization strategies are presented and described on simple examples. The implementation of these strategies is then evaluated empirically on a sample of problems. Compared to the previous, single-threaded version of the BeepBeep engine, the allocation of just a few threads to specific portions of a query provides dramatic improvement in terms of running time.

Deepest Neural Networks

This paper shows that a long chain of perceptrons (that is, a multilayer perceptron, or MLP, with many hidden layers of width one) can be a universal classifier. The classification procedure is not necessarily computationally efficient, but the technique throws some light on the kind of computations possible with narrow and deep MLPs.

Primal-Dual Group Convolutions for Deep Neural Networks

In this paper, we present a simple and modularized neural network architecture, named primal-dual group convolutional neural networks (PDGCNets). The main point lies in a novel building block, a pair of two successive group convolutions: primal group convolution and dual group convolution. The two group convolutions are complementary: (i) the convolution on each primal partition in primal group convolution is a spatial convolution, while on each dual partition in dual group convolution, the convolution is a point-wise convolution; (ii) the channels in the same dual partition come from different primal partitions. We discuss one representative advantage: Wider than a regular convolution with the number of parameters and the computation complexity preserved. We also show that regular convolutions, group convolution with summation fusion (as used in ResNeXt), and the Xception block are special cases of primal-dual group convolutions. Empirical results over standard benchmarks, CIFAR-10, CIFAR-100, SVHN and ImageNet demonstrate that our networks are more efficient in using parameters and computation complexity with similar or higher accuracy.

Unsupervised Learning of Task-Specific Tree Structures with Tree-LSTMs

For years, recursive neural networks (RvNNs) have shown to be suitable for representing text into fixed-length vectors and achieved good performance on several natural language processing tasks. However, the main drawback of RvNN is that it requires explicit tree structure (e.g. parse tree), which makes data preparation and model implementation hard. In this paper, we propose a novel tree-structured long short-term memory (Tree-LSTM) architecture that efficiently learns how to compose task-specific tree structures only from plain text data. To achieve this property, our model uses Straight-Through (ST) Gumbel-Softmax estimator to decide the parent node among candidates and to calculate gradients of the discrete decision. We evaluate the proposed model on natural language interface and sentiment analysis and show that our model outperforms or at least comparable to previous Tree-LSTM-based works. We also find that our model converges significantly faster and needs less memory than other models of complex structures.

A Generalized Recurrent Neural Architecture for Text Classification with Multi-Task Learning

Multi-task learning leverages potential correlations among related tasks to extract common features and yield performance gains. However, most previous works only consider simple or weak interactions, thereby failing to model complex correlations among three or more tasks. In this paper, we propose a multi-task learning architecture with four types of recurrent neural layers to fuse information across multiple related tasks. The architecture is structurally flexible and considers various interactions among tasks, which can be regarded as a generalized case of many previous works. Extensive experiments on five benchmark datasets for text classification show that our model can significantly improve performances of related tasks with additional information from others.

A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques

The amount of text that is generated every day is increasing dramatically. This tremendous volume of mostly unstructured text cannot be simply processed and perceived by computers. Therefore, efficient and effective techniques and algorithms are required to discover useful patterns. Text mining is the task of extracting meaningful information from text, which has gained significant attentions in recent years. In this paper, we describe several of the most fundamental text mining tasks and techniques including text pre-processing, classification and clustering. Additionally, we briefly explain text mining in biomedical and health care domains.

Revisiting Unreasonable Effectiveness of Data in Deep Learning Era

The success of deep learning in vision can be attributed to: (a) models with high capacity; (b) increased computational power; and (c) availability of large-scale labeled data. Since 2012, there have been significant advances in representation capabilities of the models and computational capabilities of GPUs. But the size of the biggest dataset has surprisingly remained constant. What will happen if we increase the dataset size by 10x or 100x? This paper takes a step towards clearing the clouds of mystery surrounding the relationship between `enormous data’ and deep learning. By exploiting the JFT-300M dataset which has more than 375M noisy labels for 300M images, we investigate how the performance of current vision tasks would change if this data was used for representation learning. Our paper delivers some surprising (and some expected) findings. First, we find that the performance on vision tasks still increases linearly with orders of magnitude of training data size. Second, we show that representation learning (or pre-training) still holds a lot of promise. One can improve performance on any vision tasks by just training a better base model. Finally, as expected, we present new state-of-the-art results for different vision tasks including image classification, object detection, semantic segmentation and human pose estimation. Our sincere hope is that this inspires vision community to not undervalue the data and develop collective efforts in building larger datasets.

Adaptive Correlation Filters with Long-Term and Short-Term Memory for Object Tracking
Optimal Binary Constant Weight Codes and Affine Linear Groups over Finite Fields
Learning Efficient Image Representation for Person Re-Identification
The limit point of the pentagram map
Shrinkage Estimation Strategies in Generalized Ridge Regression Models Under Low/High-Dimension Regime
Robust Wald-type tests for non-homogeneous observations based on minimum density power divergence estimator
Fast Stochastic Hierarchical Bayesian MAP for Tomographic Imaging
Limitations on the Achievable Repair Bandwidth of Piggybacking Codes with Low Substriping
Milstein-type Schemes of SDE Driven by Lévy Noise with Super-linear Diffusion Coefficients
Tracy-Widom at each edge of real covariance estimators
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
Pairwise Well-Formed Modes and Transformations
Correlational Dueling Bandits with Application to Clinical Treatment in Large Decision Spaces
Efficient Vector Representation for Documents through Corruption
Enhancing PHY Security of Cooperative Cognitive Radio Multicast Communications
Disjoint cycles on Lichiardopol’s conjecture in tournaments
The confirmation of a conjecture on disjoint cycles in a graph
Representation Learning and Adversarial Generation of 3D Point Clouds
On the Capacity of the Carbon Copy onto Dirty Paper Channel
Embedding Visual Hierarchy with Deep Networks for Large-Scale Visual Recognition
Bijections for inversion sequences, ascent sequences and 3-nonnesting set partitions
Estimation Efficiency Under Privacy Constraints
Translation-based Recommendation
Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition
Efficient Approximate Distance Oracles for Vertex-Labeled Planar Graphs
Stability, Fairness and Random Walks in the Bargaining Problem
Estimating the Spot Covariation of Asset Prices – Statistical Theory and Empirical Evidence
A Similarity Measure for GPU Kernel Subgraph Matching
Deep Learning for Vanishing Point Detection Using an Inverse Gnomonic Projection
Combinatorial Optimization Problems with Interaction Costs: Complexity and Solvable Cases
Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges
Self Adversarial Training for Human Pose Estimation
The Cat and the Noisy Mouse
Global optimality conditions for deep neural networks
The spectrum of the Heisenberg ferromagnet and graph theory
Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping
Subspace Clustering with Missing and Corrupted Data
Adversarial Examples, Uncertainty, and Transfer Testing Robustness in Gaussian Process Hybrid Deep Networks
Hyperspectral Image Restoration via Total Variation Regularized Low-rank Tensor Decomposition
Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection
Estimation of mean residual life
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Optimized Multiuser Computation Offloading with Multi-antenna NOMA
PageRank on inhomogeneous random digraphs
Analysis of Footnote Chasing and Citation Searching in an Academic Search Engine
Faster and more accurate computation of the $\mathcal{H}_\infty$ norm via optimization
Predicting the Quality of Short Narratives from Social Media
Marginalization in nonlinear mixed-effects models with an application to dose-response analysis
Demand Shaping in Cellular Networks
Assouad dimension of random processes
Variational Inference via Transformations on Distributions
Crystals and Schur $P$-positive expansions
An Adaptive, Multivariate Partitioning Algorithm for Global Optimization of Nonconvex Programs
A Fast Integrated Planning and Control Framework for Autonomous Driving via Imitation Learning
Model-free Load Control for High Penetration of Solar Photovoltaic Generation
Some results on optimal stopping problems for one-dimensional regular diffusions
Contact graphs of ball packings
Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data
Optimal Sampling and Remote Estimation of the Wiener Process over a Channel with Random Delay
Exploiting Active Subspaces in Global Optimization: How Complex is your Problem?
Some conditional probabilities in the TASEP with second class particles
Ona relation between classical and free infinitely divisible transforms
Efficient Context Management and Personalized User Recommendations in a Smart Social TV environment
Unified Method for Markov Chain Transition Model Estimation Using Incomplete Survey Data
Counting Numerical Semigroups
Visual Analytics of Movement Pattern Based on Time-Spatial Data: A Neural Net Approach
Asymptotic Theory for the Maximum of an Increasing Sequence of Parametric Functions
GraphMP: An Efficient Semi-External-Memory Big Graph Processing System on a Single Machine
Multiantenna Quantum Backscatter Communications
Holonomic Gradient Method for the Distribution Function of the Largest Root of Complex Non-central Wishart Matrices
Gelfand-Kirillov Dimensions of Highest Weight Harish-Chandra Modules for $SU(p,q)$
Overcoming the curse of dimensionality: Solving high-dimensional partial differential equations using deep learning
On orthogonal tensors and best rank-one approximation ratio
Evaluating race and sex diversity in the world’s largest companies using deep neural networks
Vertical Dependency in Sequences of Categorical Random Variables
Neural Machine Translation between Herbal Prescriptions and Diseases
Dynamic clustering to minimize the sum of radii
Class-Weighted Convolutional Features for Visual Instance Search
The algebraic Bethe Ansatz and combinatorial trees
Dynamic Quantile Function Models
Exploiting the Tradeoff between Program Accuracy and Soft-error Resiliency Overhead for Machine Learning Workloads
Parameter uncertainty in structural equation models: Confidence sets and fungible estimates
Quitting Games and Linear Complementarity Problems
The stringy Euler number of Calabi-Yau hypersurfaces in toric varieties and the Mavlyutov duality
Detection of bimanual gestures everywhere: why it matters, what we need and what is missing
Few-Shot Learning Through an Information Retrieval Lens
Ramsey expansions of metrically homogeneous graphs
Aggregation-Based Datacenter Energy Management in Wholesale Electricity Markets
Controlling Linguistic Style Aspects in Neural Language Generation
The smallest singular value of a shifted $d$-regular random square matrix
Local Activity-tuned Image Filtering for Noise Removal and Image Smoothing
The complexity of independent set reconfiguration on bipartite graphs
Towards a Comprehensive Framework for Telemetry Data in HPC Environments
Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition
Integration of LiDAR and Hyperspectral Data for Land-cover Classification: A Case Study
The Junta Method for Hypergraphs and Chvátal’s Simplex Conjecture
Cappuccino: Efficient Inference Software Synthesis for Mobile System-on-Chips
Analysis of a Finite State Many Player Game Using its Master Equation
Nonlinear Sequential Accepts and Rejects for Identification of Top Arms in Stochastic Bandits
On the Min-Max-Delay Problem: Complexity, Algorithm, and Int-Gap
Generation and analysis of lamplighter programs
A Human and Group Behaviour Simulation Evaluation Framework utilising Composition and Video Analysis
Macdonald cumulants, $G$-inversion polynomials and $G$-parking functions
PELESent: Cross-domain polarity classification using distant supervision
Likelihood Ratio Gradient Estimation for Steady-State Parameters
Toric tableaux and the inhomogeneous two-species TASEP on a ring
Exponential decay for the near-critical scaling limit of the planar Ising model
Accelerated Stochastic Power Iteration
A Necessary Condition for Power Flow Insolvability in Power Distribution Systems with Distributed Generators
Deep Reinforcement Learning for Improving Downlink mmWave Communication Performance
Learning in High-Dimensional Multimedia Data: The State of the Art
A General Framework of Enhancing Sparsity of Generalized Polynomial Chaos Expansions
The speed of sequential asymptotic learning
Anisotropic Diffusion-based Kernel Matrix Model for Face Liveness Detection
Symmetrized importance samplers for stochastic differential equations
Rank Two Non-Commutative Laurent Phenomenon and Pseudo-Positivity
Tridiagonal Models for Dyson Brownian Motion
Composition Properties of Inferential Privacy for Time-Series Data
Topology Reduction in Deep Convolutional Feature Extraction Networks
Regularity and Stability for the Semigroup of Jump Diffusions with State-Dependent Intensity
Transition to Reconstructibility in Weakly Coupled Networks
Stochastic Variance Reduction Gradient for a Non-convex Problem Using Graduated Optimization
Best-Effort Inductive Logic Programming via Fine-grained Cost-based Hypothesis Generation
Synthesis-based Robust Low Resolution Face Recognition
Residual Value Forecasting Using Asymmetric Cost Functions
Efficient Enumeration of Induced Matchings in a Graph without Cycles with Length Four
A Set of Sequences of Complexity $2n+1$
Backpropagation in matrix notation
Robust Imitation of Diverse Behaviors
Confidence biases and learning among intuitive Bayesians
Improving speaker turn embedding by crossmodal transfer learning from face embedding
A Local-Search Algorithm for Steiner Forest
Subdeterminant Maximization via Nonconvex Relaxations and Anti-concentration
Approximating time to extinction for endemic infection models
A succinct data structure for self-indexing ternary relations
A non-linear wave equation with fractional perturbation
Compressed Representation of Dynamic Binary Relations with Applications
Understanding State Preferences With Text As Data: Introducing the UN General Debate Corpus
Block modelling in dynamic networks with non-homogeneous Poisson processes and exact ICL
Deep Reinforcement Learning Attention Selection for Person Re-Identification
Exploiting Parallelism in Optical Network Systems: A Case Study of Random Linear Network Coding (RLNC) in Ethernet-over-Optical Networks
Bipartite bi-Cayley graphs over metacyclic groups of odd prime-power order
Semi-Supervised Haptic Material Recognition for Robots using Generative Adversarial Networks
High Order Random Walks: Beyond Spectral Gap
Marginals, measurable modifications of stochastic processes, and the product lifting problem
Towards Crafting Text Adversarial Samples
Scale-Regularized Filter Learning
Multi-splits and tropical linear spaces from nested matroids
Adaptive Binarization for Weakly Supervised Affordance Segmentation
A Low-Complexity Soft-Output wMD Decoding for Uplink MIMO Systems with One-Bit ADCs
Deep Bilateral Learning for Real-Time Image Enhancement
Beyond Massive-MIMO: The Potential of Data-Transmission with Large Intelligent Surfaces
Locally Feller processes and martingale local problems. Part II: discrete schemes and applications
Entanglement verification protocols for distributed systems based on the Quantum Recursive Network Architecture
On the metric dimension of incidence graphs
An Analysis of Human-centered Geolocation
Equivalent Representations of Max-Stable Processes via $\ell^p$ Norms
Low Dose CT Image Reconstruction With Learned Sparsifying Transform
Counting tree-like graphs in locally dense graphs
Frames, $A$-paths and the Erdős-Pósa property
Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration
Enhanced Deep Residual Networks for Single Image Super-Resolution
Wavelet-based Reflection Symmetry Detection via Textural and Color Histograms
Complexity of eye fixation duration time series in reading of Persian texts: A multifractal detrended fluctuation analysis
Checkerboard artifact free sub-pixel convolution: A note on sub-pixel convolution, resize convolution and convolution resize
Rapid Computational Optimization of Molecular Properties using Genetic Algorithms: Searching Across Millions of Compounds for Organic Photovoltaic Materials
Relaxing Integrity Requirements for Attack-Resilient Cyber-Physical Systems
An Interactive Greedy Approach to Group Sparsity in High Dimension
Counterexample to global convergence of DSOS and SDSOS hierarchies
Upper and Lower Bounds on the Speed of a One Dimensional Excited Random Walk