Associative Domain Adaptation

We propose associative domain adaptation, a novel technique for end-to-end domain adaptation with neural networks, the task of inferring class labels for an unlabeled target domain based on the statistical properties of a labeled source domain. Our training scheme follows the paradigm that in order to effectively derive class labels for the target domain, a network should produce statistically domain invariant embeddings, while minimizing the classification error on the labeled source domain. We accomplish this by reinforcing associations between source and target data directly in embedding space. Our method can easily be added to any existing classification network with no structural and almost no computational overhead. We demonstrate the effectiveness of our approach on various benchmarks and achieve state-of-the-art results across the board with a generic convolutional neural network architecture not specifically tuned to the respective tasks. Finally, we show that the proposed association loss produces embeddings that are more effective for domain adaptation compared to methods employing maximum mean discrepancy as a similarity measure in embedding space.
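
The abstract does not spell out the association loss, so the following is only a rough sketch of one way associations between source and target embeddings could be scored. It assumes dot-product similarities and a softmax "round trip" from a labeled source embedding to the target set and back, penalizing round trips that end in a different class. The function names (`round_trip_probabilities`, `association_loss`) and the toy embeddings are hypothetical.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def round_trip_probabilities(source, target):
    """P(source i -> any target -> source j), from dot-product similarities (assumed)."""
    sim = [[sum(a * b for a, b in zip(s, t)) for t in target] for s in source]
    p_st = [softmax(row) for row in sim]                     # source -> target
    p_ts = [softmax(list(col)) for col in zip(*sim)]         # target -> source
    n, m = len(source), len(target)
    return [[sum(p_st[i][k] * p_ts[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]

def association_loss(source, labels, target):
    """Cross-entropy between round trips and a uniform same-class distribution."""
    probs = round_trip_probabilities(source, target)
    loss = 0.0
    for i, row in enumerate(probs):
        same = [j for j, y in enumerate(labels) if y == labels[i]]
        for j in same:
            loss -= math.log(row[j] + 1e-12) / len(same)
    return loss / len(source)

# Toy data: two labeled source embeddings and two unlabeled target embeddings.
source = [[5.0, 0.0], [0.0, 5.0]]
labels = [0, 1]
aligned_target = [[5.0, 0.0], [0.0, 5.0]]
uninformative_target = [[1.0, 1.0], [1.0, 1.0]]
print(association_loss(source, labels, aligned_target))
print(association_loss(source, labels, uninformative_target))
```

In this toy setting, the loss is lower when the target embeddings line up with the source classes than when they carry no class information, which is the behavior an association-style objective is meant to reward.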

Network Community Detection: A Review and Visual Survey

Community structure is an important area of research that has received considerable attention from the scientific community. Despite its importance, one of the key problems in locating information about community detection is that related articles are spread across many disciplines. To the best of our knowledge, there is no current comprehensive review of recent literature that applies scientometric analysis based on complex network analysis to all relevant articles from the Web of Science (WoS). Here we present a visual survey of key literature using CiteSpace. The idea is to identify emerging trends and to use network techniques to examine the evolution of the domain. To that end, we identify the most influential, central, and active nodes using scientometric analyses. We examine authors, key articles, cited references, core subject categories, key journals, institutions, and countries. The exploration of the scientometric literature of the domain reveals that Yong Wang is a pivot node with the highest centrality. Additionally, we have observed that Mark Newman is the most highly cited author in the network. We have also identified that the journal ‘Reviews of Modern Physics’ has the strongest citation burst. In terms of cited documents, an article by Andrea Lancichinetti has the highest centrality score. We have also discovered that the key publications in this domain originate from the United States, whereas Scotland has the strongest and longest citation burst. Additionally, we have found that the categories ‘Computer Science’ and ‘Engineering’ lead the other categories in frequency and centrality, respectively.

Practical Statistics

Accelerators and detectors are expensive, both in terms of money and human effort. It is thus important to invest effort in performing a good statistical analysis of the data, in order to extract the best information from it. This series of five lectures deals with practical aspects of statistical issues that arise in typical High Energy Physics analyses.

Beyond Low Rank: A Data-Adaptive Tensor Completion Method

Low rank tensor representation underpins much of recent progress in tensor completion. In real applications, however, this approach is confronted with two challenging problems, namely (1) tensor rank determination; (2) handling real tensor data which only approximately fulfils the low-rank requirement. To address these two issues, we develop a data-adaptive tensor completion model which explicitly represents both the low-rank and non-low-rank structures in a latent tensor. Representing the non-low-rank structure separately from the low-rank one allows priors which capture the important distinctions between the two, thus enabling more accurate modelling, and ultimately, completion. Through defining a new tensor rank, we develop a sparsity induced prior for the low-rank structure, with which the tensor rank can be automatically determined. The prior for the non-low-rank structure is established based on a mixture of Gaussians which is shown to be flexible enough, and powerful enough, to inform the completion process for a variety of real tensor data. With these two priors, we develop a Bayesian minimum mean squared error estimate (MMSE) framework for inference which provides the posterior mean of missing entries as well as their uncertainty. Compared with the state-of-the-art methods in various applications, the proposed model produces more accurate completion results.

Revisiting Activation Regularization for Language RNNs

Recurrent neural networks (RNNs) serve as a fundamental building block for many sequence tasks across natural language processing. Recent research has focused on recurrent dropout techniques or custom RNN cells in order to improve performance. Both of these can require substantial modifications to the machine learning model or to the underlying RNN configurations. We revisit traditional regularization techniques, specifically L2 regularization on RNN activations and slowness regularization over successive hidden states, to improve the performance of RNNs on the task of language modeling. Both of these techniques require minimal modification to existing RNN architectures and result in performance improvements comparable or superior to more complicated regularization techniques or custom cell architectures. These regularization techniques can be used without any modification on optimized LSTM implementations such as the NVIDIA cuDNN LSTM.
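
As a minimal sketch of the two penalties named above, the code below computes an L2 penalty on hidden activations and a slowness penalty on the difference between successive hidden states. The abstract does not give the scaling, so averaging over all entries and the coefficient names `alpha` and `beta` are assumptions, and the function names are hypothetical. Hidden states are represented as a plain list of per-time-step vectors for a single sequence.

```python
def activation_penalty(hidden, alpha=1.0):
    """L2 penalty on activations: alpha times the mean squared hidden value."""
    values = [v for step in hidden for v in step]
    return alpha * sum(v * v for v in values) / len(values)

def slowness_penalty(hidden, beta=1.0):
    """Penalty on change between successive hidden states (mean squared difference)."""
    diffs = [b - a
             for prev, nxt in zip(hidden[:-1], hidden[1:])
             for a, b in zip(prev, nxt)]
    return beta * sum(d * d for d in diffs) / len(diffs)

# Three time steps, two hidden units (toy values).
hidden = [[1.0, 1.0], [1.0, 1.0], [3.0, 1.0]]
task_loss = 0.0  # placeholder for the language-modeling loss
total = task_loss + activation_penalty(hidden, alpha=2.0) + slowness_penalty(hidden, beta=1.0)
print(total)
```

Both terms depend only on the hidden states the RNN already outputs, which is consistent with the claim that no change to the cell itself is needed.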

Sensor Transformation Attention Networks

Recent work on encoder-decoder models for sequence-to-sequence mapping has shown that integrating both temporal and spatial attention mechanisms into neural networks increases the performance of the system substantially. In this work, we report on the application of an attentional signal not on temporal and spatial regions of the input, but instead as a method of switching among inputs themselves. We evaluate the particular role of attentional switching in the presence of dynamic noise in the sensors, and demonstrate how the attentional signal responds dynamically to changing noise levels in the environment to achieve increased performance on both audio and visual tasks in three commonly-used datasets: TIDIGITS, Wall Street Journal, and GRID. Moreover, the proposed sensor transformation network architecture naturally introduces a number of advantages that merit exploration, including ease of adding new sensors to existing architectures, attentional interpretability, and increased robustness in a variety of noisy environments not seen during training. Finally, we demonstrate that the sensor selection attention mechanism of a model trained only on the small TIDIGITS dataset can be transferred directly to a pre-existing larger network trained on the Wall Street Journal dataset, maintaining functionality of switching between sensors to yield a dramatic reduction of error in the presence of noise.
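
The idea of attention as a switch among inputs can be illustrated with a small sketch. It assumes that each sensor yields a feature vector and a scalar attention score, that the scores are normalized with a softmax, and that the merged representation is the weighted sum. These details, and the names `merge_sensors` and `scores`, are assumptions for illustration and are not taken from the abstract.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def merge_sensors(features, scores):
    """Weight each sensor's feature vector by its attention weight and sum them."""
    weights = softmax(scores)
    dim = len(features[0])
    merged = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return merged, weights

# Two sensors with equal scores contribute equally.
features = [[1.0, 0.0], [0.0, 1.0]]
print(merge_sensors(features, [0.0, 0.0]))

# A much lower score for the second sensor (e.g. because it is noisy)
# shifts nearly all of the weight to the first sensor.
print(merge_sensors(features, [10.0, -10.0]))
```

Because the weights are explicit per-sensor numbers, they can be inspected directly, which is one way to read the interpretability claim above.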

A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop

The goal of Machine Learning is to automatically learn from data, extract knowledge, and make decisions without any human intervention. Such automatic (aML) approaches show impressive success. Recent results even demonstrate, intriguingly, that deep learning applied to the automatic classification of skin lesions is on par with the performance of dermatologists and even outperforms the average. As human perception is inherently limited, such approaches can discover patterns, e.g. that two objects are similar, in arbitrarily high-dimensional spaces, which no human is able to do. Humans can deal only with limited amounts of data, whilst big data is beneficial for aML; however, in health informatics, we are often confronted with a small number of data sets, where aML suffers from insufficient training samples and many problems are computationally hard. Here, interactive machine learning (iML) may be of help, where a human-in-the-loop contributes to reducing the complexity of NP-hard problems. A further motivation for iML is that standard black-box approaches lack transparency and hence do not foster trust in and acceptance of ML among end-users. Rising legal and privacy concerns, e.g. with the new European General Data Protection Regulation, make black-box approaches difficult to use, because they are often unable to explain why a decision has been made. In this paper, we present some experiments to demonstrate the effectiveness of the human-in-the-loop approach, particularly in opening the black-box to a glass-box and thus enabling a human to interact directly with a learning algorithm. We selected the Ant Colony Optimization framework and applied it to the Traveling Salesman Problem, which is a good example due to its relevance for health informatics, e.g. for the study of protein folding. Fundamental ML research may also benefit from studies of how humans extract so much from so little data.

Preselection via Classification: A Case Study on Evolutionary Multiobjective Optimization

In evolutionary algorithms, a preselection operator aims to select the promising offspring solutions from a candidate offspring set. It is usually based on the estimated or real objective values of the candidate offspring solutions. In a sense, the preselection can be treated as a classification procedure, which classifies the candidate offspring solutions into promising ones and unpromising ones. Following this idea, we propose a classification based preselection (CPS) strategy for evolutionary multiobjective optimization. When applying classification based preselection, an evolutionary algorithm maintains two external populations (training data set) that consist of some selected good and bad solutions found so far; then it trains a classifier based on the training data set in each generation. Finally it uses the classifier to filter the unpromising candidate offspring solutions and choose a promising one from the generated candidate offspring set for each parent solution. In such cases, it is not necessary to estimate or evaluate the objective values of the candidate offspring solutions. The classification based preselection is applied to three state-of-the-art multiobjective evolutionary algorithms (MOEAs) and is empirically studied on two sets of test instances. The experimental results suggest that classification based preselection can successfully improve the performance of these MOEAs.
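
The preselection step described above can be sketched as follows. The abstract does not name the classifier, so a 1-nearest-neighbour rule over the good and bad external populations stands in for it here. The function names and the fallback used when no candidate is classified as promising are likewise assumptions.

```python
import random

def classify(candidate, good, bad):
    """Stand-in classifier: +1 if the nearest stored solution is a good one, else -1."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest_good = min(dist2(candidate, g) for g in good)
    nearest_bad = min(dist2(candidate, b) for b in bad)
    return 1 if nearest_good <= nearest_bad else -1

def preselect(candidates, good, bad, rng):
    """Filter out unpromising candidates and pick one offspring for a parent."""
    promising = [c for c in candidates if classify(c, good, bad) == 1]
    # Assumed fallback: if nothing is classified as promising, pick from all candidates.
    pool = promising if promising else candidates
    return rng.choice(pool)

# External populations (training data) and candidate offspring for one parent.
good = [(0.0, 0.0)]
bad = [(10.0, 10.0)]
candidates = [(9.0, 9.0), (1.0, 1.0), (8.0, 9.0)]
chosen = preselect(candidates, good, bad, random.Random(0))
print(chosen)
```

Note that no objective function is evaluated for the candidates in this sketch; only the classifier is consulted, which matches the point made in the abstract.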

DSOD: Learning Deeply Supervised Object Detectors from Scratch

We present Deeply Supervised Object Detector (DSOD), a framework that can learn object detectors from scratch. State-of-the-art object detectors rely heavily on off-the-shelf networks pre-trained on large-scale classification datasets like ImageNet, which incurs learning bias due to the differences in both the loss functions and the category distributions between classification and detection tasks. Model fine-tuning for the detection task could alleviate this bias to some extent but not fundamentally. Besides, transferring pre-trained models from classification to detection between discrepant domains is even more difficult (e.g. RGB to depth images). A better solution to tackle these two critical problems is to train object detectors from scratch, which motivates our proposed DSOD. Previous efforts in this direction mostly failed due to much more complicated loss functions and limited training data in object detection. In DSOD, we contribute a set of design principles for training object detectors from scratch. One of the key findings is that deep supervision, enabled by dense layer-wise connections, plays a critical role in learning a good detector. Combined with several other principles, we develop DSOD following the single-shot detection (SSD) framework. Experiments on PASCAL VOC 2007, 2012 and MS COCO datasets demonstrate that DSOD can achieve better results than the state-of-the-art solutions with much more compact models. For instance, DSOD outperforms SSD on all three benchmarks with real-time detection speed, while requiring only 1/2 of the parameters of SSD and 1/10 of the parameters of Faster RCNN. Our code and models are available at: https://…/DSOD .

L1-norm Principal-Component Analysis of Complex Data

L1-norm Principal-Component Analysis (L1-PCA) of real-valued data has attracted significant research interest over the past decade. However, L1-PCA of complex-valued data remains to date unexplored despite the many possible applications (e.g., in communication systems). In this work, we establish theoretical and algorithmic foundations of L1-PCA of complex-valued data matrices. Specifically, we first show that, in contrast to the real-valued case for which an optimal polynomial-cost algorithm was recently reported by Markopoulos et al., complex L1-PCA is formally NP-hard in the number of data points. Then, casting complex L1-PCA as a unimodular optimization problem, we present the first two suboptimal algorithms in the literature for its solution. Our experimental studies illustrate the sturdy resistance of complex L1-PCA against faulty measurements/outliers in the processed data.

Push-sum on random graphs
A Periodic Isoperimetric Problem Related to the Unique Games Conjecture
Flat2Sphere: Learning Spherical Convolution for Fast Features from 360° Imagery
Echo State Learning for Wireless Virtual Reality Resource Allocation in UAV-enabled LTE-U Networks
Combining Keystroke Dynamics and Face Recognition for User Verification
Null Decomposition of Trees
An Energy Minimization Approach to 3D Non-Rigid Deformable Surface Estimation Using RGBD Data
Subgroup analysis of treatment effects for misclassified biomarkers with time-to-event data
Predicting Human Activities Using Stochastic Grammar
Semantic Instance Labeling Leveraging Hierarchical Segmentation
Convergence of Glauber dynamic on Ising-like models with Kac interaction to $Φ^{2n}_2$
Mean Estimation from Adaptive One-bit Measurements
Generating High-Quality Crowd Density Maps using Contextual Pyramid CNNs
Hamiltonian Monte Carlo with Energy Conserving Subsampling
Low Dose CT Image Denoising Using a Generative Adversarial Network with Wasserstein Distance and Perceptual Loss
Efficient hybrid search algorithm on ordered datasets
How many eigenvalues of a product of truncated orthogonal matrices are real?
Graphs having extremal monotonic topological indices with bounded vertex $k$-partiteness
Attention Transfer from Web Images for Video Recognition
ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing
New Results on the DMC Capacity and Renyi’s Divergence
Photo-realistic Face Images Synthesis for Learning-based Fine-scale 3D Face Reconstruction
Multi-Planar Deep Segmentation Networks for Cardiac Substructures from MRI and CT
Localization in One-Dimensional Tight-Binding Model with Chaotic Binary Sequences
A study of the morphology, dynamics, and folding pathways of ring polymers with supramolecular topological constraints using molecular simulation and nonlinear manifold learning
Participation of an Energy Storage Aggregator in Electricity Markets
Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning
Reinforcement learning techniques for Outer Loop Link Adaptation in 4G/5G systems
Rank-metric LCD codes
Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning
Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization
Improved Deterministic Distributed Construction of Spanners
On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization
Co-Optimization Scheme for Distributed Energy Resource Planning in Community Microgrids
Modified Viterbi Algorithm Based Distribution System Restoration Strategy for Grid Resiliency
CRF Autoencoder for Unsupervised Dependency Parsing
When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data
Economic Power Capacity Design of Distributed Energy Resources for Reliable Community Microgrids
What Will I Do Next? The Intention from Motion Experiment
Detection of Abnormal Input-Output Associations
Optimal Stopping and the Sufficiency of Randomized Threshold Strategies
Nonmonotonous classical magneto-conductivity of a two-dimensional electron gas in a disordered array of obstacles
New Canonical Decomposition in Matching Theory
The random k-matching-free process
Entropic multipliers method for Langevin diffusion and weighted log Sobolev inequalities
Graph-based Features for Automatic Online Abuse Detection
Reader-Aware Multi-Document Summarization: An Enhanced Model and The First Dataset
Amplitude- and Frequency-based Dispersion Patterns and Entropy
Optimal rate list decoding over bounded alphabets using algebraic-geometric codes
A Sparse Completely Positive Relaxation of the Modularity Maximization for Community Detection
Generalized variational inequalities for maximal monotone operators
The Spatial Shape of Avalanches
The Gram-Schmidt Walk: A Cure for the Banaszczyk Blues
Lower bounds for the measurable chromatic number of the hyperbolic plane
Lorenz curves interpretations of the Bruss-Duerinckx theorem for resource dependent branching processes
Using the SLEUTH urban growth model to simulate the impacts of future policy scenarios on urban land use in the Tehran metropolitan area in Iran
On regular induced subgraphs of generalized polygons
Learning Feature Pyramids for Human Pose Estimation
Numerical properties of Koszul connections
Arc Transitive Maps with underlying Rose Window Graphs
Heden’s bound on the tail of a vector space partition
Asymptotic behaviour of randomised fractional volatility models
A Unified View-Graph Selection Framework for Structure from Motion
Efficient pattern matching in degenerate strings with the Burrows-Wheeler transform
Long range forces in a performance portable Molecular Dynamics framework
Lecture hall partitions and the affine hyperoctahedral group
Automatic Segmentation and Disease Classification Using Cardiac Cine MR Images
Three-dimensional planar model estimation using multi-constraint knowledge based on k-means and RANSAC
Learning Directed Acyclic Graphs with Hidden Variables via Latent Gaussian Graphical Model Selection
Existence and uniqueness results for Itô-SDEs with locally integrable drifts and Sobolev diffusion coefficients
Deep MR to CT Synthesis using Unpaired Data
Using Graph Properties to Speed-up GPU-based Graph Traversal: A Model-driven Approach
Good Applications for Crummy Entity Linkers? The Case of Corpus Selection in Digital Humanities
A Deep Convolutional Neural Network to Analyze Position Averaged Convergent Beam Electron Diffraction Patterns
New Constructions of Permutation Polynomials of the Form $x^rh\left(x^{q-1}\right)$ over $\mathbb{F}_{q^2}$
Applying advanced machine learning models to classify electro-physiological activity of human brain for use in biometric identification
Finite temperature phase transition in the two-dimensional Coulomb glass at low disorders
Topological boundary invariants for Floquet systems and quantum walks
Continuous Association Schemes and Hypergroups
Patch-based adaptive weighting with segmentation and scale (PAWSS) for visual tracking
Optimal Power Allocation Scheme for Non-Orthogonal Multiple Access with $α$-Fairness
Unsupervised Video Understanding by Reconciliation of Posture Similarities
Incorporating genuine prior information about between-study heterogeneity in random effects pairwise and network meta-analyses
Estimating speech from lip dynamics
Detecting early signs of depressive and manic episodes in patients with bipolar disorder using the signature-based model
Semantic Augmented Reality Environment with Material-Aware Physical Interactions
A Ramsey Property of Random Regular and $k$-out Graphs
Polynomial tuning of multiparametric combinatorial samplers
Recent Developments and Future Challenges in Medical Mixed Reality
The LOOP Estimator: Adjusting for Covariates in Randomized Experiments
Optimal constants for a non-local approximation of Sobolev norms and total variation
Equidistant Polarizing Transforms
Real-time Geometry-Aware Augmented Reality in Minimally Invasive Surgery
Multiscale mixing patterns in networks
Image reconstruction with imperfect forward models and applications in deblurring
Unsupervised Representation Learning by Sorting Sequences