Identifying Spatial Relations in Images using Convolutional Neural Networks

Traditional approaches to building a large-scale knowledge graph have usually relied on extracting information (entities, their properties, and relations between them) from unstructured text (e.g. DBpedia). Recent advances in Convolutional Neural Networks (CNNs) allow us to shift our focus to learning entities and relations from images, as they build robust models that require little or no pre-processing of the images. In this paper, we present an approach to identify and extract spatial relations (e.g., The girl is standing behind the table) from images using CNNs. Our research addresses two specific challenges: providing insight into how spatial relations are learned by the network and which parts of the image are used to predict these relations. We use the pre-trained network VGGNet to extract features from an image and train a Multi-layer Perceptron (MLP) on a set of synthetic images and the SUN09 dataset to extract spatial relations. The MLP predicts spatial relations without a bounding box around the objects or the space in the image depicting the relation. To understand how the spatial relations are represented in the network, a heatmap is overlaid on the image to show the regions that are deemed important by the network. Also, we analyze the MLP to show the relationship between the activation of consistent groups of nodes and the prediction of a spatial relation. We show how the loss of these groups affects the network's ability to identify relations.


On Optimistic versus Randomized Exploration in Reinforcement Learning

We discuss the relative merits of optimistic and randomized approaches to exploration in reinforcement learning. Optimistic approaches presented in the literature apply an optimistic boost to the value estimate at each state-action pair and select actions that are greedy with respect to the resulting optimistic value function. Randomized approaches sample from among statistically plausible value functions and select actions that are greedy with respect to the random sample. Prior computational experience suggests that randomized approaches can lead to far more statistically efficient learning. We present two simple analytic examples that elucidate why this is the case. In principle, there should be optimistic approaches that fare well relative to randomized approaches, but that would require intractable computation. Optimistic approaches that have been proposed in the literature sacrifice statistical efficiency for the sake of computational efficiency. Randomized approaches, on the other hand, may enable simultaneous statistical and computational efficiency.
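To make the two action-selection rules described above concrete, here is a minimal sketch on a Bernoulli multi-armed bandit. This is a deliberate simplification and not one of the paper's analytic examples: the bandit setting, the UCB1-style bonus, the Beta posterior, and all function names (`ucb_action`, `thompson_action`, `run`) are assumptions chosen for illustration. The optimistic rule adds a boost to each estimate and acts greedily; the randomized rule samples one statistically plausible estimate per arm and acts greedily on the sample.

```python
import math
import random


def ucb_action(successes, failures, t):
    """Optimistic rule: boost each empirical mean, then act greedily.

    The bonus sqrt(2 ln t / n) is the UCB1 form, assumed here for illustration.
    """
    best, best_value = 0, -math.inf
    for arm, (s, f) in enumerate(zip(successes, failures)):
        n = s + f
        if n == 0:
            return arm  # pull every arm once before using the bonus
        value = s / n + math.sqrt(2.0 * math.log(t) / n)
        if value > best_value:
            best, best_value = arm, value
    return best


def thompson_action(successes, failures, rng):
    """Randomized rule: sample a plausible mean per arm, then act greedily.

    Each arm's mean is drawn from a Beta(s + 1, f + 1) posterior
    (uniform prior, assumed here for illustration).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run(policy, true_means, horizon, seed=0):
    """Play `horizon` rounds; return (total reward, pulls per arm)."""
    rng = random.Random(seed)
    k = len(true_means)
    successes, failures = [0] * k, [0] * k
    total = 0
    for t in range(1, horizon + 1):
        if policy == "ucb":
            arm = ucb_action(successes, failures, t)
        else:
            arm = thompson_action(successes, failures, rng)
        reward = 1 if rng.random() < true_means[arm] else 0
        total += reward
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    pulls = [s + f for s, f in zip(successes, failures)]
    return total, pulls


if __name__ == "__main__":
    for policy in ("ucb", "thompson"):
        reward, pulls = run(policy, true_means=[0.2, 0.8], horizon=2000)
        print(policy, reward, pulls)
```

A bandit has no state transitions, so this sketch cannot reproduce the efficiency gap the abstract refers to; it only shows how the two rules differ in the way they turn uncertainty into an action.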


Online Convolutional Dictionary Learning for Multimodal Imaging

Computational imaging methods that can exploit multiple modalities have the potential to enhance the capabilities of traditional sensing systems. In this paper, we propose a new method that reconstructs multimodal images from their linear measurements by exploiting redundancies across different modalities. Our method combines a convolutional group-sparse representation of images with total variation (TV) regularization for high-quality multimodal imaging. We develop an online algorithm that enables the unsupervised learning of convolutional dictionaries on large-scale datasets that are typical in such applications. We illustrate the benefit of our approach in the context of joint intensity-depth imaging.


Teaching Compositionality to CNNs

Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting and training CNNs so that their learned features are compositional. It encourages networks to form representations that disentangle objects from their surroundings and from each other, thereby promoting better generalization. Our method is agnostic to the specific details of the underlying CNN to which it is applied and can in principle be used with any CNN. As we show in our experiments, the learned representations lead to feature activations that are more localized and improve performance over non-compositional baselines in object recognition tasks.


SEARNN: Training RNNs with Global-Local Losses

We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the ‘learning to search’ (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropriate surrogate for the test error: by only maximizing the ground truth probability, it fails to exploit the wealth of information offered by structured losses. Further, it introduces discrepancies between training and predicting (such as exposure bias) that may hurt test performance. Instead, SEARNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error. We demonstrate improved performance over MLE on three different tasks: OCR, spelling correction and text chunking. Finally, we propose a subsampling strategy to enable SEARNN to scale to large vocabulary sizes.


On Calibration of Modern Neural Networks

Confidence calibration — the problem of predicting probability estimates representative of the true correctness likelihood — is important for classification models in many applications. We discover that modern neural networks, unlike those from a decade ago, are poorly calibrated. Through extensive experiments, we observe that depth, width, weight decay, and Batch Normalization are important factors influencing calibration. We evaluate the performance of various post-processing calibration methods on state-of-the-art architectures with image and document classification datasets. Our analysis and experiments not only offer insights into neural network learning, but also provide a simple and straightforward recipe for practical settings: on most datasets, temperature scaling — a single-parameter variant of Platt Scaling — is surprisingly effective at calibrating predictions.
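As a rough illustration of temperature scaling, the sketch below divides the logits by a single scalar T before the softmax and picks T by minimizing the negative log-likelihood on held-out data. The fitting routine (a golden-section search over log T), the search bounds, the toy logits, and the function names are all assumptions made for this example; the abstract does not specify how T is optimized.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / T, computed stably by subtracting the maximum."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at temperature T."""
    loss = 0.0
    for logits, y in zip(logit_rows, labels):
        scaled = [z / temperature for z in logits]
        m = max(scaled)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
        loss -= scaled[y] - log_norm  # log-softmax of the true class
    return loss / len(labels)


def fit_temperature(logit_rows, labels, lo=0.05, hi=20.0, iters=60):
    """Golden-section search over log T for the T that minimizes the NLL.

    The NLL is convex in 1/T, hence unimodal in log T, so a bracketing
    search suffices. The bounds lo and hi are illustrative assumptions.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc = nll(logit_rows, labels, math.exp(c))
    fd = nll(logit_rows, labels, math.exp(d))
    for _ in range(iters):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = nll(logit_rows, labels, math.exp(c))
        else:
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = nll(logit_rows, labels, math.exp(d))
    return math.exp((a + b) / 2.0)


if __name__ == "__main__":
    # Toy held-out set: the model is about 98% confident but only 75% accurate.
    val_logits = [[4.0, 0.0], [4.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
    val_labels = [0, 0, 1, 1]
    T = fit_temperature(val_logits, val_labels)
    print("fitted T:", T)
    print("confidence before:", max(softmax(val_logits[0])))
    print("confidence after: ", max(softmax(val_logits[0], T)))
```

Because every logit is divided by the same positive T, the predicted class never changes; only the confidence attached to it does.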


Identifying Condition-Action Statements in Medical Guidelines Using Domain-Independent Features
Hybrid Reward Architecture for Reinforcement Learning
Adversarially Regularized Autoencoders for Generating Discrete Structures
Inner Rank and Lower Bounds for Matrix Multiplication
Picking Winners: A Framework For Venture Capital Investment
A Hybrid Observer for a Distributed Linear System with a Changing Neighbor Graph
Boundary Controllability Of Two Vibrating Strings Connected By A Point Mass With Variable Coefficients
Star of David and other patterns in the Hosoya-like polynomials triangles
Turán numbers for Berge-hypergraphs and related extremal problems
Online Estimation and Adaptive Control for a Class of History Dependent Functional Differential Equations
Complex Contagions with Timers
Automatic Localization of Deep Stimulation Electrodes Using Trajectory-based Segmentation Approach
Structured Connectivity Augmentation
The ‘something something’ video database for learning and evaluating visual common sense
Optimization by a quantum reinforcement algorithm
von Mises-Fisher Mixture Model-based Deep learning: Application to Face Verification
Transfer entropy-based feedback improves performance in artificial neural networks
Stochastic Optimal Power Flow Based on Data-Driven Distributionally Robust Optimization
Action Search: Learning to Search for Human Activities in Untrimmed Videos
On the risk of convex-constrained least squares estimators under misspecification
AFIF4: Deep Gender Classification based on AdaBoost-based Fusion of Isolated Facial Features and Foggy Faces
Structure and Interpretation of Dual-Feasible Functions
When Image Denoising Meets High-Level Vision Tasks: A Deep Learning Approach
Saliency detection by aggregating complementary background template with optimization framework
Leveraging Node Attributes for Incomplete Relational Data
A general method for lower bounds on fluctuations of random variables
Compressed Secret Key Agreement: Maximizing Multivariate Mutual Information Per Bit
Network Simplex Algorithm associated with the Maximum Flow Problem
Accurate Pulmonary Nodule Detection in Computed Tomography Images Using Deep Convolutional Neural Networks
Dueling Bandits With Weak Regret
Photo-realistic Facial Texture Transfer
Emergent Bistability in 2D Dusty Plasma Crystals
RoboCup 2D Soccer Simulation League: Evaluation Challenges
A Class of Discrete-time Mean-field Stochastic Linear-quadratic Optimal Control Problems with Financial Application
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
Hierarchical Gaussian Descriptors with Application to Person Re-Identification
Assessing Economic Outcomes in Simulated Reverse Clock Auctions for Radio Spectrum
Transfer Learning for Neural Semantic Parsing
Stochastic Kuramoto oscillators with discrete phase states
MATIC: Adaptation and In-situ Canaries for Energy-Efficient Neural Network Acceleration
On Gallai’s and Hajós’ Conjectures for graphs with treewidth at most 3
Predictive modelling of training loads and injury in Australian football
Anonymization of System Logs for Privacy and Storage Benefits
Towards Adaptive Resilience in High Performance Computing
WLS-Based Self-Localization Using Perturbed Anchor Positions and RSSI Measurements
Sequential Channel Estimation in the Presence of Random Phase Noise in NB-IoT Systems
Magnetic properties of nanoparticles compacts with controlled broadening of the particle size distribution
Effects of parametric uncertainties in cascaded open quantum harmonic oscillators and robust generation of Gaussian invariant states
Recommending links through influence maximization
From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles
A survey of dimensionality reduction techniques based on random projection
Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection
Dissipativity Theory for Nesterov’s Accelerated Method
Shape-Color Differential Moment Invariants under Affine Transformations
Alignment Distances on Systems of Bags
Fine-grained human evaluation of neural versus phrase-based machine translation
Spatio-Temporal Forecasting by Coupled Stochastic Differential Equations: Applications to Solar Power
$ν$-net: Deep Learning for Generalized Biventricular Cardiac Mass and Function Parameters
Enhanced discrete particle swarm optimization path planning for UAV vision-based surface inspection
Runtime Verification for Business Processes Utilizing the Bitcoin Blockchain
Strong converse bounds for high-dimensional estimation
Revisiting the Hamiltonian Theme in the Square of a Block: The Case of DT-Graphs
The ratio of normalizing constants for Bayesian graphical Gaussian model selection
Is Natural Language Strongly Nonergodic? A Stronger Theorem about Facts and Words
On Distributed Power Control for Uncoordinated Dual Energy Harvesting Links: Performance Bounds and Near-Optimal Policies
Time-optimal control strategies in SIR epidemic models
Correlations between thresholds and degrees: An analytic approach to model attacks and failure cascades
Hybrid Collaborative Recommendation via Semi-AutoEncoder
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
Positivity of Cylindric skew Schur functions
Simultaneous merging multiple grid maps using the robust motion averaging
Statistical properties of coronal hole rotation rates: Are they linked to the solar interior?
Tropical Kraus maps for optimal control of switched systems
SalProp: Salient object proposals via aggregated edge cues
Idea density for predicting Alzheimer’s disease from transcribed speech
A conditional greedy algorithm for edge-coloring
Learning and Evaluating Musical Features with Deep Autoencoders
Large-Scale YouTube-8M Video Understanding with Deep Neural Networks
Learning local shape descriptors with view-based convolutional networks
Comparison results for highly degenerate parabolic equations with univariate convex data and optimal strategies for options on trading accounts
Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification
On sojourn of Brownian motion inside moving boundaries
Evaluating Personal Assistants on Mobile devices
Free Energy of the Cauchy Directed Polymer Model at High Temperature
Edge-Erasures and Chordal Graphs
On Error Detection in Asymmetric Channels
Accelerated Reinforcement Learning Algorithms with Nonparametric Function Approximation for Opportunistic Spectrum Access
Block-space GPU Mapping for Embedded Sierpiński Gasket Fractals
Graphs with degree complete labeling
Neural Models for Key Phrase Detection and Question Generation
Quantifying genuine multipartite correlations and their pattern complexity
Realized volatility and parametric estimation of Heston SDEs
A Fast Foveated Fully Convolutional Network Model for Human Peripheral Vision
Deep Learning Methods for Efficient Large Scale Video Labeling
The Opacity of Backbones and Backdoors Under a Weak Assumption
Hyperscaling breakdown and Ising Spin Glasses: the Binder cumulant
Learning without Prejudice: Avoiding Bias in Webly-Supervised Action Recognition
Provable benefits of representation learning
Nudged elastic band calculations accelerated with Gaussian process regression
