CortexNet: a Generic Network Family for Robust Visual Temporal Representations

Over the past five years, feed-forward neural networks trained with supervision have come to perform remarkably well on vision-related tasks. These models have achieved super-human performance on object recognition, localisation, and detection in still images. However, it remains unclear how best to apply these networks to temporal visual inputs and obtain a robust and stable representation of video data. Inspired by the human visual system, we propose a deep neural network family, CortexNet, which features not only bottom-up feed-forward connections but also the abundant top-down feedback and lateral connections present in our visual cortex. We introduce two training schemes, the unsupervised MatchNet and the weakly supervised TempoNet modes, in which a network learns to correctly anticipate a subsequent frame in a video clip or the identity of its predominant subject by learning egomotion clues and how to automatically track several objects in the current scene. Find the project website at https://…/.

Avoiding Discrimination through Causal Reasoning

Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively. Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from ‘What is the right fairness criterion?’ to ‘What do we want to assume about the causal data generating process?’ Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them.

Gated Orthogonal Recurrent Units: On Learning to Forget

We present a novel recurrent neural network (RNN) architecture that combines the remembering ability of unitary RNNs with the ability of gated RNNs to effectively forget redundant information in the input sequence. We achieve this by extending unitary RNNs with a gating mechanism. Our model outperforms LSTMs, GRUs and unitary RNNs on several benchmark tasks, as the ability to simultaneously remember long-term dependencies and forget irrelevant information in the input sequence helps with many natural long-term sequential tasks such as language modeling and question answering. We provide competitive results along with an analysis of our model on the bAbI Question Answering task, PennTreeBank, as well as synthetic tasks that involve long-term dependencies such as parenthesis, denoising and copying tasks.

Random projections for linear programming

Random projections are random linear maps, sampled from appropriate distributions, that approximately preserve certain geometrical invariants so that the approximation improves as the dimension of the space grows. The well-known Johnson-Lindenstrauss lemma states that there are random matrices with surprisingly few rows that approximately preserve pairwise Euclidean distances among a set of points. This is commonly used to speed up algorithms based on Euclidean distances. We prove that these matrices also preserve other quantities, such as the distance to a cone. We exploit this result to devise a probabilistic algorithm to solve linear programs approximately. We show that this algorithm can approximately solve very large randomly generated LP instances. We also showcase its application to an error correction coding problem.
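
The distance-preservation property behind the Johnson-Lindenstrauss lemma can be checked numerically. The sketch below is an illustration only and is not taken from the paper: it assumes a Gaussian projection matrix scaled by 1/sqrt(k), and the dimensions, the seed, and all function names are hypothetical choices. It projects a few random points from 500 to 200 dimensions and reports how much the pairwise distances change.

```python
import math
import random


def random_projection(dim_in, dim_out, rng):
    """Return a dim_out x dim_in matrix with i.i.d. N(0, 1/dim_out) entries."""
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, x):
    """Apply the linear map to a single point."""
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) for row in matrix]


def distance(x, y):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Illustrative sizes (assumptions, not values from the abstract).
rng = random.Random(0)
dim_in, dim_out, n_points = 500, 200, 8

points = [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(n_points)]
matrix = random_projection(dim_in, dim_out, rng)
projected = [project(matrix, p) for p in points]

# Ratio of projected to original distance for every pair of points.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(n_points)
    for j in range(i + 1, n_points)
]
print("min ratio:", min(ratios), "max ratio:", max(ratios))
```

Ratios close to 1 indicate that the pairwise distances survive the projection; increasing `dim_out` should tighten the spread.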

Source Forager: A Search Engine for Similar Source Code

Developers spend a significant amount of time searching for code: e.g., to understand how to complete, correct, or adapt their own code for a new context. Unfortunately, the state of the art in code search has not evolved much beyond text search over tokenized source. Code has much richer structure and semantics than normal text, and this property can be exploited to specialize the code-search process for better querying, searching, and ranking of code-search results. We present a new code-search engine named Source Forager. Given a query in the form of a C/C++ function, Source Forager searches a pre-populated code database for similar C/C++ functions. Source Forager preprocesses the database to extract a variety of simple code features that capture different aspects of code. A search returns the k functions in the database that are most similar to the query, based on the various extracted code features. We tested the usefulness of Source Forager using a variety of code-search queries from two domains. Our experiments show that the ranked results returned by Source Forager are accurate, and that query-relevant functions can be reliably retrieved even when searching through a large code database that contains very few query-relevant functions. We believe that Source Forager is a first step towards much-needed tools that provide a better code-search experience.

Granger Causality Networks for Categorical Time Series

We present a new framework for learning Granger causality networks for multivariate categorical time series, based on the mixture transition distribution (MTD) model. Traditionally, MTD is plagued by a nonconvex objective, non-identifiability, and the presence of many local optima. To circumvent these problems, we recast inference in the MTD as a convex problem. The new formulation facilitates the application of MTD to high-dimensional multivariate time series. As a baseline, we also formulate a multi-output logistic autoregressive model (mLTD), which, while a straightforward extension of autoregressive Bernoulli generalized linear models, has not previously been applied to the analysis of multivariate categorical time series. We develop novel identifiability conditions for the MTD model and compare them to those for mLTD. We further devise a novel and efficient optimization algorithm for the MTD based on the new convex formulation, and compare the MTD and mLTD in both simulated and real data experiments. Our approach simultaneously provides a comparison of methods for network inference in categorical time series and opens the door to modern, regularized inference with the MTD model.

The FastMap Algorithm for Shortest Path Computations

We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. In this space, the Euclidean distance between any two nodes approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed using A* search with the Euclidean distances as heuristic estimates. Our preprocessing algorithm, dubbed FastMap, is inspired by the Data Mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using Semidefinite Programming. Our FastMap algorithm also produces admissible and consistent heuristics and therefore guarantees the generation of optimal paths. Moreover, FastMap works on general undirected graphs for which many traditional heuristics, such as the Manhattan Distance heuristic, are not always well defined. Empirically too, we demonstrate that the FastMap heuristic is competitive with other state-of-the-art heuristics like the Differential heuristic.
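
To make the embedding idea concrete, the following sketch computes a single embedding coordinate for a small graph. It is an assumption-laden illustration, not the authors' implementation: the coordinate formula (d(a,v) + d(a,b) - d(v,b)) / 2 for two far-apart pivot nodes a and b, the pivot-selection heuristic, the example graph, and all names are hypothetical. The full algorithm would repeat this on residual distances to obtain further coordinates, which is omitted here. For this one coordinate, the triangle inequality implies that the coordinate difference between two nodes never exceeds their shortest-path distance, so it can serve as an admissible A* estimate.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source in a weighted undirected graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def one_coordinate(graph):
    """Compute one embedding coordinate per node from two pivot nodes.

    Assumes the graph is connected. Pivots are chosen heuristically:
    a is farthest from an arbitrary start node, b is farthest from a.
    """
    start = next(iter(graph))
    from_start = dijkstra(graph, start)
    a = max(from_start, key=from_start.get)
    d_a = dijkstra(graph, a)
    b = max(d_a, key=d_a.get)
    d_b = dijkstra(graph, b)
    d_ab = d_a[b]
    return {v: (d_a[v] + d_ab - d_b[v]) / 2.0 for v in graph}


# Hypothetical example graph: adjacency dict with symmetric edge weights.
graph = {
    "s": {"t": 1.0, "u": 4.0},
    "t": {"s": 1.0, "u": 2.0, "v": 5.0},
    "u": {"s": 4.0, "t": 2.0, "v": 1.0},
    "v": {"t": 5.0, "u": 1.0},
}
coords = one_coordinate(graph)


def heuristic(u, v):
    """Lower bound on the shortest-path distance between u and v."""
    return abs(coords[u] - coords[v])
```

A call such as `heuristic("s", "v")` can then be passed to an A* implementation as the distance estimate toward the goal node.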

A Maximum Matching Algorithm for Basis Selection in Spectral Learning

We present a solution to scale spectral algorithms for learning sequence functions. We are interested in the case where these functions are sparse (that is, for most sequences they return 0). Spectral algorithms reduce the learning problem to the task of computing an SVD of a special type of matrix called the Hankel matrix. This matrix is designed to capture the relevant statistics of the training sequences. Crucially, to capture long-range dependencies we must consider very large Hankel matrices, so the computation of the SVD becomes a critical bottleneck. Our solution finds a subset of rows and columns of the Hankel matrix that realizes a compact and informative Hankel submatrix. The novelty lies in the way that this subset is selected: we exploit a maximal bipartite matching combinatorial algorithm to look for a sub-block with full structural rank, and show how computation of this sub-block can be further improved by exploiting the specific structure of Hankel matrices.
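
The link between bipartite matching and structural rank can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the Hankel entries are taken to be empirical counts of training sequences, the matching routine is a textbook augmenting-path method (Kuhn's algorithm), and the function names and toy sequences are hypothetical. Rows are prefixes, columns are suffixes, and every nonzero entry is an edge; the matched rows and columns then form a square sub-block whose nonzero pattern contains a perfect matching, that is, a sub-block of full structural rank.

```python
from collections import Counter


def hankel_nonzeros(sequences):
    """Sparse Hankel matrix: (prefix, suffix) -> count of the sequence prefix+suffix."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) + 1):
            counts[(seq[:i], seq[i:])] += 1
    return counts


def maximum_matching(nonzeros):
    """Maximum bipartite matching between prefixes and suffixes.

    Uses augmenting paths (Kuhn's algorithm). Returns a dict that maps
    each matched prefix to its matched suffix.
    """
    adj = {}
    for prefix, suffix in nonzeros:
        adj.setdefault(prefix, []).append(suffix)
    match_of_suffix = {}

    def try_augment(prefix, seen):
        for suffix in adj[prefix]:
            if suffix in seen:
                continue
            seen.add(suffix)
            owner = match_of_suffix.get(suffix)
            if owner is None or try_augment(owner, seen):
                match_of_suffix[suffix] = prefix
                return True
        return False

    for prefix in adj:
        try_augment(prefix, set())
    return {prefix: suffix for suffix, prefix in match_of_suffix.items()}


# Toy training set (hypothetical).
sequences = ["ab", "abb", "b"]
nonzeros = hankel_nonzeros(sequences)
matching = maximum_matching(nonzeros)

# Rows and columns of the selected sub-block.
rows = sorted(matching)
cols = sorted(matching.values())
print("rows:", rows)
print("cols:", cols)
```

The size of the matching equals the structural rank of the sparse matrix, so no larger sub-block with a perfect matching exists for this sparsity pattern.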

Symmetry Learning for Function Approximation in Reinforcement Learning

In this paper we explore methods to exploit symmetries for ensuring sample efficiency in reinforcement learning (RL). This problem deserves ever-increasing attention given the recent advances in the use of deep networks for complex RL tasks, which require large amounts of training data. We introduce a novel method to detect symmetries using reward trails observed during episodic experience and prove its completeness. We also provide a framework to incorporate the discovered symmetries for functional approximation. Finally, we show that the use of potential-based reward shaping is especially effective for our symmetry exploitation mechanism. Experiments on various classical problems show that our method improves the learning performance significantly by utilizing symmetry information.

Causes and Corrections for Bimodal Multipath Scanning with Structured Light
Random projections for trust region subproblems
Capacity Comparison between MIMO-NOMA and MIMO-OMA with Multiple Users in a Cluster
Climbing a shaky ladder: Better adaptive risk estimation
Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM
Statistically Characterizing the Electrical Parameters of the Grid Transformers and Transmission Lines
Sympathy Begins with a Smile, Intelligence Begins with a Word: Use of Multimodal Features in Spoken Human-Robot Interaction
Local behavior of local times of super Brownian motion
Evolutionary Multitasking for Multiobjective Continuous Optimization: Benchmark Problems, Performance Metrics and Baseline Results
Optimizing expected word error rate via sampling for speech recognition
Setting Players’ Behaviors in World of Warcraft through Semi-Supervised Learning
Linear Hashing is Awesome
Optimal parameters for bloom-filtered joins in Spark
On the Development of Intelligent Agents for MOBA Games
From orbital measures to Littlewood-Richardson coefficients and hive polytopes
Rapid Randomized Restarts for Multi-Agent Path Finding Solvers
A Deep Causal Inference Approach to Measuring the Effects of Forming Group Loans in Online Non-profit Microfinance Platform
Dynamic Difficulty Adjustment on MOBA Games
Renewal-Theoretic Packet Collision Modeling under Long-Tailed Heterogeneous Traffic
Delocalization in infinite disordered 2D lattices of different geometry
Semipullbacks of labelled Markov processes
Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds
Learning to Embed Words in Context for Syntactic Tasks
A randomized Halton algorithm in R
Asymptotic behaviors in the homology of symmetric group and finite general linear group quandles
Rigorous statistical analysis of HTTPS reachability
From Bayesian Sparsity to Gated Recurrent Nets
TextureGAN: Controlling Deep Image Synthesis with Texture Patches
Patch planting spin-glass solution for benchmarking
On-line Assembling Mitochondrial DNA from de novo transcriptome
Time Series Using Exponential Smoothing Cells
A note on the maximum number of triangles in a $C_5$-free graph
A Tutor Agent for MOBA Games
Affine Type $A$ Geometric Crystal on the Grassmannian
Weakly supervised training of deep convolutional neural networks for overhead pedestrian localization in depth fields
Joint Beamforming and Power Splitting Control in Downlink Cooperative SWIPT NOMA Systems
Efficient Fast-Convolution-Based Waveform Processing for 5G Physical Layer
Reinforced coverage of space by random sets
Assigning personality/identity to a chatting machine for coherent conversation generation
Face Detection through Scale-Friendly Deep Convolutional Networks
Class-specific Poisson denoising by patch-based importance sampling
Adaptive Consensus ADMM for Distributed Optimization
A New Randomized Block-Coordinate Primal-Dual Proximal Algorithm for Distributed Optimization
Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization
Learning to Learn from Noisy Web Videos
CiNCT: Compression and retrieval for massive vehicular trajectories via relative movement labeling
Global Convergence of the (1+1) Evolution Strategy
DCCO: Towards Deformable Continuous Convolution Operators
MirBot, a collaborative object recognition system for smartphones using convolutional neural networks
The structure of ABC-minimal trees with given number of leaves
Embedding Quartic Eulerian Digraphs on the Plane
Assessing the Performance of Deep Learning Algorithms for Newsvendor Problem
An Efficient Manifold Algorithm for Constructive Interference based Constant Envelope Precoding
Multi-Modal Obstacle Detection in Unstructured Environments with Conditional Random Fields
Whitney’s Theorem for 2-Regular Planar Digraphs
Extinction in lower Hessenberg branching processes with countably many types
End-to-End Musical Key Estimation Using a Convolutional Neural Network
Sesqui-arrays, including triple arrays
Unsupervised object learning from dense equivariant image labelling
Common agency dilemma with information asymmetry in continuous time
Wiener integrals with respect to Yeh processes
An Efficient Algorithm for Computing High-Quality Paths amid Polygonal Obstacles
Bayesian nonparametrics for stochastic epidemic models
Backbone scaling limit of the high-dimensional IIC: extended version
Limit theorems for random polytopes with vertices on convex surfaces
On the role of the overall effect in exponential families
TIP: Typifying the Interpretability of Procedures
Some Stability Properties of Parametric Quadratically Constrained Nonconvex Quadratic Programs in Hilbert Spaces
On the Strong Scaling of the Spectral Element Solver Nek5000 on Petascale Systems
Probability, Statistics and Planet Earth, I: Geotemporal covariances
Cell-size distribution and scaling in a one-dimensional KJMA lattice model with continuous nucleation
Monte-Carlo Tree Search by Best Arm Identification
Exactly solvable model of memristive circuits: Lyapunov functional and mean field theory
Learning to Detect Red Lesions in Fundus Photographs: An Ensemble Approach based on Deep Learning
Multi-rubric Models for Ordinal Spatial Data with Application to Online Ratings from Yelp
Manifold Regularized Slow Feature Analysis for Dynamic Texture Recognition
Modular Forms and $k$-colored Generalized Frobenius Partitions
Invariance Pressure for Control Systems
Characterizations of multinormality and corresponding tests of fit, including for Garch models
Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries
Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection
Hypersurfaces in weighted projective spaces over finite fields with applications to coding theory
Existence of an unbounded vacant set for subcritical continuum percolation
Labeled plane binary trees and Schur-positivity
Depthwise Separable Convolutions for Neural Machine Translation