Towards String-to-Tree Neural Machine Translation

We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. An experiment on the WMT16 German-English news translation task resulted in an improved BLEU score when compared to a syntax-agnostic NMT baseline trained on the same dataset. An analysis of the translations from the syntax-aware system shows that it performs more reordering during translation than the baseline. A small-scale human evaluation also showed an advantage for the syntax-aware system.
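
The idea of translating into a linearized, lexicalized constituency tree can be illustrated with a minimal sketch. The tuple-based tree representation, the bracket token format, and the function names are assumptions made for illustration, not the format used in the paper.

```python
# Hypothetical tree representation: an internal node is (label, child, ...),
# and a leaf is a plain string holding the word.
def linearize(tree):
    """Flatten a constituency tree into a token list, keeping the words in order."""
    if isinstance(tree, str):
        return [tree]
    label, *children = tree
    tokens = ["(" + label]          # opening bracket token for this constituent
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")" + label)      # closing bracket token for this constituent
    return tokens


def strip_syntax(tokens):
    """Recover the plain sentence by dropping bracket tokens.

    Caveat of this sketch: a word that itself starts with a parenthesis
    would also be dropped.
    """
    return [t for t in tokens if not t.startswith(("(", ")"))]


example = ("S", ("NP", "Jane"), ("VP", "reads", ("NP", "books")))
linear = linearize(example)
sentence = strip_syntax(linear)
```

A sequence-to-sequence decoder trained on such token lists emits the syntax and the words together, and the surface translation is obtained by removing the bracket tokens afterwards.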

AnchorNet: A Weakly Supervised Network to Learn Geometry-sensitive Features For Semantic Matching

Despite significant progress of deep learning in recent years, state-of-the-art semantic matching methods still rely on legacy features such as SIFT or HoG. We argue that the strong invariance properties that are key to the success of recent deep architectures on the classification task make them unfit for dense correspondence tasks, unless a large amount of supervision is used. In this work, we propose a deep network, termed AnchorNet, that produces image representations that are well-suited for semantic matching. It relies on a set of filters whose response is geometrically consistent across different object instances, even in the presence of strong intra-class, scale, or viewpoint variations. Trained only with weak image-level labels, the final representation successfully captures information about the object structure and improves results of state-of-the-art semantic matching methods such as the deformable spatial pyramid or the proposal flow methods. We show positive results on the cross-instance matching task where different instances of the same object category are matched as well as on a new cross-category semantic matching task aligning pairs of instances each from a different object class.

A Hybrid ACO Algorithm for the Next Release Problem

In this paper, we propose a Hybrid Ant Colony Optimization algorithm (HACO) for the Next Release Problem (NRP). NRP, an NP-hard problem in requirements engineering, is to balance customer requests, resource constraints, and requirement dependencies by requirement selection. Inspired by the successes of Ant Colony Optimization (ACO) algorithms in solving NP-hard problems, we design HACO to approximately solve NRP. As in traditional ACO algorithms, multiple artificial ants are employed to construct new solutions. During the solution construction phase, both pheromone trails and neighborhood information are used to determine the choices of every ant. In addition, a local search (first found hill climbing) is incorporated into HACO to improve the solution quality. Extensive experiments on typical NRP test instances show that HACO outperforms the existing algorithms (GRASP and simulated annealing) in terms of both solution quality and running time.
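
The local search step mentioned above (first found hill climbing) can be sketched as follows. This is a simplified illustration, not the HACO implementation: it assumes a knapsack-like view of NRP with one cost and one profit per requirement and a single budget, and it ignores requirement dependencies and customer structure. All names are hypothetical.

```python
def first_improvement_hill_climb(selected, costs, profits, budget):
    """Flip one requirement at a time, accepting the first feasible improving move."""
    selected = list(selected)
    improved = True
    while improved:
        improved = False
        current_cost = sum(c for c, s in zip(costs, selected) if s)
        for i in range(len(selected)):
            # Flipping position i adds the requirement if absent, removes it if present.
            sign = -1 if selected[i] else 1
            new_cost = current_cost + sign * costs[i]
            profit_gain = sign * profits[i]
            if new_cost <= budget and profit_gain > 0:
                selected[i] = not selected[i]
                improved = True
                break  # "first found": restart the scan from the new solution
    return selected


costs = [4, 3, 2]
profits = [5, 4, 3]
result = first_improvement_hill_climb([False, False, False], costs, profits, budget=5)
```

In this toy instance the search stops at a local optimum (only the first requirement selected), which is the kind of limitation that motivates combining local search with a population-based construction phase.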

Random Walk Sampling for Big Data over Networks

It has been shown recently that graph signals with small total variation can be accurately recovered from only a few samples if the sampling set satisfies a certain condition, referred to as the network nullspace property. Based on this recovery condition, we propose a sampling strategy for smooth graph signals based on random walks. Numerical experiments demonstrate the effectiveness of this approach for graph signals obtained from a synthetic random graph model as well as a real-world dataset.
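
A generic way to build a sampling set from a random walk is sketched below. The abstract does not specify the walk's details, so the uniform neighbor choice, the fixed number of steps, and all names here are assumptions for illustration only.

```python
import random


def random_walk_sample(adjacency, start, num_steps, seed=0):
    """Collect the set of nodes visited by a simple random walk on a graph.

    adjacency maps each node to a list of its neighbors (hypothetical format).
    """
    rng = random.Random(seed)       # seeded for reproducibility
    node = start
    sampled = {start}
    for _ in range(num_steps):
        neighbors = adjacency[node]
        if not neighbors:           # dead end: stop the walk early
            break
        node = rng.choice(neighbors)  # uniform step to a neighbor
        sampled.add(node)
    return sampled


graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
samples = random_walk_sample(graph, start=0, num_steps=5)
```

The signal values would then be observed only at the nodes in `samples`, and the remaining values recovered by a smoothness-based reconstruction method.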

k-Means is a Variational EM Approximation of Gaussian Mixture Models

We show that k-means (Lloyd’s algorithm) is equivalent to a variational EM approximation of a Gaussian Mixture Model (GMM) with isotropic Gaussians. The k-means algorithm is obtained if truncated posteriors are used as variational distributions. In contrast to the standard way to relate k-means and GMMs, we show that it is not required to consider the limit case of Gaussians with zero variance. There are a number of consequences following from our observation: (A) k-means can be shown to monotonically increase the free energy associated with truncated distributions; (B) using the free energy, we can derive an explicit and compact formula for a lower bound on the GMM likelihood which uses the k-means objective as its argument; (C) we can generalize k-means using truncated variational EM, and relate such generalizations to other k-means-like algorithms. In general, truncated variational EM provides a natural and quantitative link between k-means-like clustering and GMM clustering algorithms, which may be very relevant for future theoretical as well as empirical studies.
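
For reference, Lloyd's algorithm itself can be sketched in a few lines. The hard assignment step is what the abstract relates to truncated posteriors; the reading in the comments assumes truncation to a single cluster per data point. Function and variable names are hypothetical.

```python
def kmeans(points, centers, num_iters=10):
    """Lloyd's algorithm on points given as tuples; returns (centers, assignments)."""
    centers = [tuple(c) for c in centers]
    assignments = []
    for _ in range(num_iters):
        # Assignment step: each point keeps only its nearest center, i.e. a
        # hard responsibility (assumed here to play the role of a posterior
        # truncated to one cluster).
        assignments = [
            min(
                range(len(centers)),
                key=lambda k: sum((x - m) ** 2 for x, m in zip(p, centers[k])),
            )
            for p in points
        ]
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centers.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        centers = new_centers
    return centers, assignments


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
final_centers, labels = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

On this toy data the two centers settle at the means of the two well-separated groups after the first iteration.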

A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes

We propose a model to automatically describe changes introduced in the source code of a program using natural language. Our method receives as input a set of code commits, which contain both the modifications and the message introduced by a user. These two modalities are used to train an encoder-decoder architecture. We evaluated our approach on twelve real-world open source projects from four different programming languages. Quantitative and qualitative results showed that the proposed approach can generate feasible and semantically sound descriptions not only in standard in-project settings, but also in a cross-project setting.

Gang of GANs: Generative Adversarial Networks with Maximum Margin Ranking

Traditional generative adversarial networks (GANs) and many of their variants are trained by minimizing the KL or JS-divergence loss that measures how close the generated data distribution is to the true data distribution. A recent advance called the WGAN, based on the Wasserstein distance, can improve on the KL and JS-divergence based GANs, and alleviate the gradient vanishing, instability, and mode collapse issues that are common in GAN training. In this work, we aim at improving on the WGAN by first generalizing its discriminator loss to a margin-based one, which leads to a better discriminator, and in turn a better generator, and then carrying out a progressive training paradigm involving multiple GANs to contribute to the maximum margin ranking loss so that the GAN at later stages will improve upon early stages. We call this method Gang of GANs (GoGAN). We have shown theoretically that the proposed GoGAN can reduce the gap between the true data distribution and the generated data distribution by at least half in an optimally trained WGAN. We have also proposed a new way of measuring GAN quality which is based on image completion tasks. We have evaluated our method on four visual datasets: CelebA, LSUN Bedroom, CIFAR-10, and 50K-SSFF, and have seen both visual and quantitative improvement over baseline WGAN.

Introspection: Accelerating Neural Network Training By Learning Weight Evolution

Neural networks are function approximators that have achieved state-of-the-art accuracy in numerous machine learning tasks. In spite of their great success in terms of accuracy, their long training time makes it difficult to use them for various tasks. In this paper, we explore the idea of learning the weight evolution pattern from a simple network to accelerate the training of novel neural networks. We use a neural network to learn the training pattern from MNIST classification and utilize it to accelerate training of neural networks used for CIFAR-10 and ImageNet classification. Our method has a low memory footprint and is computationally efficient. This method can also be used with other optimizers to give faster convergence. The results indicate a general trend in the weight evolution during training of neural networks.

Morpheo: Traceable Machine Learning on Hidden data

Morpheo is a transparent and secure machine learning platform for collecting and analysing large datasets. It aims at building state-of-the-art prediction models in various fields where data are sensitive. It offers strong privacy of data and algorithms by preventing anyone from reading the data, apart from the owner and the chosen algorithms. Computations in Morpheo are orchestrated by a blockchain infrastructure, thus offering total traceability of operations. Morpheo aims at building an attractive economic ecosystem around data prediction by channelling crypto-money from prediction requests to providers of useful data and algorithms. Morpheo is designed to handle multiple data sources in a transfer learning approach in order to pool knowledge acquired from large datasets for applications with smaller but similar datasets.

Trigger for the SoLid Reactor Antineutrino Experiment

Structure and Randomness of Continuous-Time Discrete-Event Processes

Quantization Design and Channel Estimation for Massive MIMO Systems with One-Bit ADCs

Stochastic MPC Design for a Two-Component Granulation Process

Learn-Memorize-Recall-Reduce: A Robotic Cloud Computing Paradigm

Optimal Oil Production and Taxation in Presence of Global Disruptions

Probabilistic boundaries of finite extensions of quantum groups

Deep Learning Based Regression and Multi-class Models for Acute Oral Toxicity Prediction

FML-based Prediction Agent and Its Application to Game of Go

Understanding Norm Change: An Evolutionary Game-Theoretic Approach (Extended Version)

Angle-Domain Doppler Pre-Compensation for High-Mobility OFDM Uplink with a Massive ULA

Stochastic Boundedness of State Trajectories of Stable LTI Systems in the Presence of Nonvanishing Stochastic Perturbation

Ordered and size-biased frequencies in GEM and Gibbs models for species sampling

Extending Owen’s integral table and a new multivariate Bernoulli distribution

Percolation in Media with Columnar Disorder

Majority Is Asymptotically the Most Stable Resilient Function

Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent

In-Datacenter Performance Analysis of a Tensor Processing Unit

Approximating the Backbone in the Weighted Maximum Satisfiability Problem

A Security Monitoring Framework For Virtualization Based HEP Infrastructures

Energy Efficient Adaptive Network Coding Schemes for Satellite Communications

Network Coding Channel Virtualization Schemes for Satellite Multicast Communications

Harvesting Multiple Views for Marker-less 3D Human Pose Annotations

Outward Influence and Cascade Size Estimation in Billion-scale Networks

Mean Square Error of Neural Spike Train Sequence Matching with Optogenetics

Replicator Equation: Applications Revisited

Simultaneous Inference for High Dimensional Mean Vectors

Stationary coupling method for renewal process in continuous time (application to strong bounds for the convergence rate of the distribution of the regenerative process)

A Novel Experimental Platform for In-Vessel Multi-Chemical Molecular Communications

Wireless Communication using Unmanned Aerial Vehicles (UAVs): Optimal Transport Theory for Hover Time Optimization

Linear Transceiver Design for Bidirectional Full-Duplex MIMO OFDM Systems

Shrinking characteristics of precision matrix estimators

A second order equation for Schrödinger bridges with applications to the hot gas experiment and entropic transportation cost

CT Image Reconstruction in a Low Dimensional Manifold

Versatile Robust Clustering of Ad Hoc Cognitive Radio Network

Tight Bounds for Sandpile Transience on the Two-Dimensional Grid up to Polylogarithmic Factors

Boosting with Structural Sparsity: A Differential Inclusion Approach

Mixture modeling on related samples by $ψ$-stick breaking and kernel perturbation

Self-Adaptive Differential Evolution for Bio-Inspired Neuromorphic Collision Avoidance

Fooling intersections of low-weight halfspaces

A Nonparametric Bayesian Methodology for Regression Discontinuity Designs

Learning Character-level Compositionality with Visual Features

Caching Policy Optimization for D2D Communications by Learning User Preference

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Effective Warm Start for the Online Actor-Critic Reinforcement Learning based mHealth Intervention

Coalescing particle systems and applications to nonlinear Fokker-Planck equations

An improved ellipsoid fitting algorithm using iterative random projections

A Sport Tournament Scheduling by Genetic Algorithm with Swapping Method

Monoidal computer III: A coalgebraic view of computability and complexity

Multi-View Image Generation from a Single-View

Envy-Free Matchings with Lower Quotas

Algorithmic releases on spanning trees of Jahangir graphs

Discretization Error of Stochastic Iterated Integrals

Markov-Dubins Path via Optimal Control Theory

Capacity-Achieving Input Distributions in Nondispersive Optical Fibers

Pseudorehearsal in actor-critic agents

A convex approach to differential inclusions with prox-regular sets: stability analysis and observer design

Deep Joint Entity Disambiguation with Local Neural Attention

Exact tests to compare contingency tables under quasi-independence and quasi-symmetry

Site Percolation on a Disordered Triangulation of the Square Lattice

Deep Relaxation: partial differential equations for optimizing deep neural networks

Certificate Transparency with Enhancements and Short Proofs

Shuffle group laws. Applications in free probability

Two point function for critical points of a random plane wave

Covert Communication in Wireless Relay Networks

Space-Optimal Majority in Population Protocols

AMTnet: Action-Micro-Tube regression by end-to-end trainable deep architecture

Adversarial and Clean Data Are Not Twins

Bayesian Hybrid Matrix Factorisation for Data Integration

The Combinatorics of Weighted Vector Compositions

Larger is Better: The Effect of Learning Rates Enjoyed by Stochastic Optimization with Progressive Variance Reduction

Absorption probabilities for Gaussian polytopes, and regular spherical simplices

Probabilistic programs for inferring the goals of autonomous agents

Multimodal Prediction and Personalization of Photo Edits with Deep Generative Models

Interval Arithmetic and Interval-Aware Operators for Genetic Programming

On Strong Determinacy of Countable Stochastic Games

Generalized Projections in Zn

Low Complexity Coefficient Selection Algorithms for Compute-and-Forward

A simple comparison between Skorokhod & Russo-Vallois integration for insider trading

End-to-end 3D face reconstruction with deep neural networks

Sparse Communication for Distributed Gradient Descent

Quivers with additive labelings: classification and algebraic entropy

Optimal Multi-Unit Mechanisms with Private Demands

Distributions-oriented wind forecast verification by a hidden Markov model for multivariate circular-linear data

Spatio-temporal circular models with non-separable covariance structure

The wrapped skew Gaussian process for analyzing spatio-temporal data

Hidden Markov model for discrete circular-linear wind data time series

Fast multi-output relevance vector regression

Counting Process Based Dimension Reduction Method for Censored Outcomes