The Effectiveness of Data Augmentation in Image Classification using Deep Learning

In this paper, we explore and compare multiple solutions to the problem of data augmentation in image classification. Previous work has demonstrated the effectiveness of data augmentation through simple techniques, such as cropping, rotating, and flipping input images. We artificially constrain our access to data to a small subset of the ImageNet dataset, and compare each data augmentation technique in turn. One of the more successful data augmentation strategies is the set of traditional transformations mentioned above. We also experiment with GANs to generate images of different styles. Finally, we propose a method to allow a neural net to learn augmentations that best improve the classifier, which we call neural augmentation. We discuss the successes and shortcomings of this method on various datasets.
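
The traditional transformations named in the abstract (cropping, rotating, flipping) can be sketched as follows. This is a minimal illustration on images stored as nested lists, not the paper's pipeline; the function names, the 50% flip probability, and the restriction to 90-degree rotations are assumptions made for the example.

```python
import random


def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def random_crop(image, height, width, rng):
    """Cut a height x width window at a random position."""
    top = rng.randint(0, len(image) - height)
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image, crop_size, rng):
    """Apply a random square crop, an optional flip, and 0 to 3 quarter turns.

    The probabilities and the order of operations are illustrative choices.
    """
    out = random_crop(image, crop_size, crop_size, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    for _ in range(rng.randint(0, 3)):
        out = rotate90(out)
    return out
```

Calling `augment(image, 2, random.Random(0))` on a 4x4 image returns a 2x2 variant; repeated calls with a shared generator yield different variants of the same source image, which is how such transformations enlarge a small training set.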

Ellipsoid Method for Linear Programming made simple

In this paper, the ellipsoid method for linear programming is derived using only minimal knowledge of algebra and matrices. Unfortunately, most authors first describe the algorithm and only later prove its correctness, which requires a good knowledge of linear algebra.

Mathematics of Deep Learning

Recently there has been a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for representation learning and classification. However, the mathematical reasons for this success remain elusive. This tutorial will review recent work that aims to provide a mathematical justification for several properties of deep networks, such as global optimality, geometric stability, and invariance of the learned representations.

A short characterization of relative entropy

We prove characterization theorems for relative entropy (also known as Kullback-Leibler divergence), q-logarithmic entropy (also known as Tsallis entropy), and q-logarithmic relative entropy. All three have been characterized axiomatically before, but we show that earlier proofs can be simplified considerably, at the same time relaxing some of the hypotheses.
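
For readers unfamiliar with the quantities being characterized, the sketch below computes the relative entropy and the q-logarithmic (Tsallis) entropy of finite probability distributions using their standard textbook definitions. The abstract does not state these formulas, so the conventions here (natural logarithm, zero-probability terms contributing nothing) are assumptions, and the function names are hypothetical.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) = sum_i p_i * log(p_i / q_i).

    Terms with p_i == 0 contribute 0; the result is infinite when some
    p_i > 0 has q_i == 0.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def tsallis_entropy(p, q_param):
    """q-logarithmic entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1), for q != 1."""
    if q_param == 1:
        raise ValueError("use Shannon entropy for q = 1")
    return (1.0 - sum(pi ** q_param for pi in p if pi > 0)) / (q_param - 1.0)
```

For example, `relative_entropy([1.0, 0.0], [0.5, 0.5])` equals `math.log(2)`, and the divergence of any distribution from itself is zero.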

Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations

Deep Learning (DL) aims at learning meaningful representations. A meaningful representation is one that gives rise to significant performance improvement on associated Machine Learning (ML) tasks when it replaces the raw data as input. However, optimal architecture design and model parameter estimation in DL algorithms are widely considered to be intractable. Evolutionary algorithms are much preferable for complex and non-convex problems because they are gradient-free and insensitive to local optima. In this paper, we propose a computationally economical algorithm for evolving unsupervised deep neural networks to efficiently learn meaningful representations, which is well suited to the current Big Data era, where sufficient labeled data for training is often expensive to acquire. In the proposed algorithm, finding an appropriate architecture and initial parameter values for the ML task at hand is modeled by a computationally efficient gene encoding approach, which is employed to effectively model tasks with a large number of parameters. In addition, a local search strategy is incorporated to facilitate exploitation and further improve performance. Furthermore, a small proportion of labeled data is utilized during the evolutionary search to guarantee that the learnt representations are meaningful. The performance of the proposed algorithm has been thoroughly investigated on classification tasks. Specifically, the proposed algorithm consistently reaches a classification error rate of 1.15% on MNIST, which is a very promising result compared with state-of-the-art unsupervised DL algorithms.

MentorNet: Regularizing Very Deep Neural Networks on Corrupted Labels

Recent studies have discovered that deep networks are capable of memorizing the entire data even when the labels are completely random. Since deep models are trained on big data where labels are often noisy, the ability to overfit noise can lead to poor performance. To overcome the overfitting on corrupted training data, we propose a novel technique to regularize deep networks in the data dimension. This is achieved by learning a neural network called MentorNet to supervise the training of the base network, namely, StudentNet. Our work is inspired by curriculum learning and advances the theory by learning a curriculum from data by neural networks. We demonstrate the efficacy of MentorNet on several benchmarks. Comprehensive experiments show that it is able to significantly improve the generalization performance of the state-of-the-art deep networks on corrupted training data.

A Two-stage Online Monitoring Procedure for High-Dimensional Data Streams

Advanced computing and data acquisition technologies have made possible the collection of high-dimensional data streams in many fields. Efficient online monitoring tools which can correctly identify any abnormal data stream for such data are highly sought after. However, most of the existing monitoring procedures directly apply the false discovery rate (FDR) controlling procedure to the data at each time point, and the FDR at each time point (the point-wise FDR) is either specified by users or determined by the in-control (IC) average run length (ARL). If the point-wise FDR is specified by users, the resulting procedure lacks control of the global FDR and keeps users in the dark in terms of the IC-ARL. If the point-wise FDR is determined by the IC-ARL, the resulting procedure does not give users the flexibility to choose the number of false alarms (Type-I errors) they can tolerate when identifying abnormal data streams, which often makes the procedure too conservative. To address those limitations, we propose a two-stage monitoring procedure that can control both the IC-ARL and Type-I errors at the levels specified by users. As a result, the proposed procedure allows users to choose not only how often they expect any false alarms when all data streams are IC, but also how many false alarms they can tolerate when identifying abnormal data streams. With this extra flexibility, our proposed two-stage monitoring procedure is shown in the simulation study and real data analysis to outperform the existing methods.
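
To make the point-wise step concrete, here is a rough sketch of applying an FDR controlling procedure to the p-values of all data streams at a single time point. It assumes the Benjamini-Hochberg step-up procedure, which the abstract does not name, and it does not implement the proposed two-stage method; `benjamini_hochberg` and its parameters are hypothetical names.

```python
def benjamini_hochberg(p_values, alpha):
    """Return the indices of streams flagged at FDR level alpha.

    Assumes one p-value per data stream at the current time point.
    Step-up rule: find the largest rank k with p_(k) <= alpha * k / m,
    then flag the k streams with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])
```

With `p_values = [0.02, 0.024, 0.9, 0.95]` and `alpha = 0.05`, the smallest p-value misses its own threshold (0.0125), but the second passes its threshold (0.025), so the step-up rule flags both streams 0 and 1.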

Relation Extraction : A Survey

With the advent of the Internet, a large amount of digital text is generated every day in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as a lot of important information is hidden within them. This extracted information can be used to improve access and management of knowledge hidden in large text corpora. Several applications, such as Question Answering and Information Retrieval, would benefit from this information. Entities like persons and organizations form the most basic unit of the information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of person and organization in a sentence may be linked through relations such as employed at. The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some of the recent trends in RE techniques and possible future research directions. This survey would be useful for three kinds of readers: i) newcomers to the field who want to quickly learn about RE; ii) researchers who want to know how the various RE techniques evolved over time and what the possible future research directions are; and iii) practitioners who just need to know which RE technique works best in various settings.

Point-wise Convolutional Neural Network

Deep learning with 3D data such as reconstructed point clouds and CAD models has received great research interest recently. However, the capability of using point clouds with convolutional neural networks has so far not been fully explored. In this technical report, we present a convolutional neural network for semantic segmentation and object recognition with 3D point clouds. At the core of our network is point-wise convolution, a convolution operator that can be applied at each point of a point cloud. Our fully convolutional network design, while being simple to implement, can yield competitive accuracy in both semantic segmentation and object recognition tasks.

Transfer Adversarial Hashing for Hamming Space Retrieval
Balance and Frustration in Signed Networks under Different Contexts
Stochastic Low-Rank Bandits
Learning Disentangling and Fusing Networks for Face Completion Under Structured Occlusions
Asymptotic properties of expansive Galton-Watson trees
Sixty years of percolation
Variance reduction via empirical variance minimization: convergence and complexity
Everything You Always Wanted to Know About TREC RTS* (*But Were Afraid to Ask)
Convex programming in optimal control and information theory
Adaptation to criticality through organizational invariance in embodied agents
Can Balloons Produce Li-Fi? A Disaster Management Perspective
Stability Selection for Structured Variable Selection
Energy-Efficient Non-Orthogonal Transmission under Reliability and Finite Blocklength Constraints
Duality of optimization problems with gauge functions
Interference Characterization in Downlink Li-Fi Optical Attocell Networks
UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition
The Enhanced Hybrid MobileNet
On Generalized Edge Corona Product of Graphs
Calculus for directional limiting normal cones and subdifferentials
Sensitivity of rough differential equations: an approach through the Omega lemma
Differentiable lower bound for expected BLEU score
A Quantum Extension of Variational Bayes Inference
Regularization and Optimization strategies in Deep Convolutional Neural Network
Efficient Computation of the Stochastic Behavior of Partial Sum Processes
Bayesian graphical compositional regression for microbiome data
Gorenstein liaison for toric ideals of graphs
Limit theorems for the Multiplicative Binomial Distribution (MBD)
On the critical threshold for continuum AB percolation
Random permutations without macroscopic cycles
Error Performance of Wireless Powered Cognitive Relay Networks with Interference Alignment
On the Capacity of Wireless Powered Cognitive Relay Network with Interference Alignment
Ergodic Capacity Analysis of Wireless Powered AF Relaying Systems over $α$-$μ$ Fading Channels
Exponential convergence of testing error for stochastic gradient methods
Self-normalized Cramer type moderate deviations for martingales
Penalty Dual Decomposition Method For Nonsmooth Nonconvex Optimization
Biggins’ Martingale Convergence for Branching Lévy Processes
Approximation of Sojourn Times of Gaussian Processes
Random non-Abelian circulant matrices. Spectrum of random convolution operators on large finite groups
Multiple testing for outlier detection in functional data
GMM-Based Synthetic Samples for Classification of Hyperspectral Images With Limited Training Data
Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform
A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks
Open data, open review and open dialogue in making social sciences plausible
Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments
A duality principle for a semi-linear model in micro-magnetism
The Hyperbolic-type Point Process
Explicit bounds for Lipschitz constant of solution to basic problem in calculus of variations
Ballpark Crowdsourcing: The Wisdom of Rough Group Comparisons
Explicit bounds for solutions to optimal control problems
Symbol detection in online handwritten graphics using Faster R-CNN
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
Optimal Stochastic Desencoring and Applications to Calibration of Market Models
A Permutation Test on Complex Sample Data
Self-Supervised Depth Learning for Urban Scene Understanding
Rethinking Spatiotemporal Feature Learning For Video Understanding
A User-Study on Online Adaptation of Neural Machine Translation to Human Post-Edits
Active phase for activated random walks on $\mathbb{Z}^d$, $ d \geq 3$, with density less than one and arbitrary sleeping rate
Rough Fuzzy Quadratic Minimum Spanning Tree Problem
Spatial-temporal wind field prediction by Artificial Neural Networks
A study of elliptic partial differential equations with jump diffusion coefficients
A combinatorial description of the centralizer algebras connected to the Links-Gould Invariant
Distance magic labelings of product graphs
Geometric ergodicity for some space-time max-stable Markov chains
Closing in on Time and Space Optimal Construction of Compressed Indexes
Refuting the cavity-method threshold for random 3-SAT
The Edge Universality of Correlated Matrices
Performance Analysis of Approximate Message Passing for Distributed Compressed Sensing
Approximate controllability for Navier–Stokes equations in $\mathrm{3D}$ rectangles under Lions boundary conditions
Reasoning in Systems with Elements that Randomly Switch Characteristics
FFT-Based Deep Learning Deployment in Embedded Systems
Statistical physics on a product of trees
Learning Objectives for Treatment Effect Estimation
The trisection genus of standard simply connected PL 4-manifolds
Multiplicative Convolution of Real Asymmetric and Real Antisymmetric Matrices
Recognizing Linked Domain in Polynomial Time
Tensor Sensing for RF Tomographic Imaging
Combination Networks with or without Secrecy Constraints: The Impact of Caching Relays
Localization of Extended Quantum Objects
Real-time Egocentric Gesture Recognition on Mobile Head Mounted Displays
Fractal dimension of interfaces in Edwards-Anderson spin glasses for up to six space dimensions
An Improved Feedback Coding Scheme for the Wire-tap Channel
Persistent Memory Programming Abstractions in Context of Concurrent Applications
Predicting Station-level Hourly Demands in a Large-scale Bike-sharing Network: A Graph Convolutional Neural Network Approach
The List Linear Arboricity of Graphs
Permuted composition tableaux, 0-Hecke algebra and labeled binary trees
QPTAS and Subexponential Algorithm for Maximum Clique on Disk Graphs
Local False Discovery Rate Based Methods for Multiple Testing of One-Way Classified Hypotheses
Learning Low-shot facial representations via 2D warping
Deep Prior
Lock-free B-slack trees: Highly Space Efficient B-trees
Unsupervised Histopathology Image Synthesis
Magnetotransport in a model of a disordered strange metal
Parametrizations of $k$-Nonnegative Matrices: Cluster Algebras and $k$-Positivity Tests
Reservation-Based Federated Scheduling for Parallel Real-Time Tasks
Step bunching with both directions of the current: Vicinal W(110) surfaces versus atomistic scale model
A Particle Swarm Optimization-based Flexible Convolutional Auto-Encoder for Image Classification
Pediatric Bone Age Assessment Using Deep Convolutional Neural Networks
Outcome Based Matching
Statistical Inference in Fractional Poisson Ornstein-Uhlenbeck Process
Neural networks catching up with finite differences in solving partial differential equations in higher dimensions
Nonparametric Adaptive CUSUM Chart for Detecting Arbitrary Distributional Changes
Quantum ergodicity in the SYK model
Weakly Supervised Action Localization by Sparse Temporal Pooling Network
Extreme 3D Face Reconstruction: Looking Past Occlusions
Learning to Navigate by Growing Deep Networks
Optimized Sampling for Multiscale Dynamics
Learning Binary Residual Representations for Domain-specific Video Streaming
DAMPE squib? Significance of the 1.4 TeV DAMPE excess
The central limit theorem for the number of clusters of the Arratia flow
The Sound and the Fury: Hiding Communications in Noisy Wireless Networks with Interference Uncertainty
Range Queries in Non-blocking $k$-ary Search Trees
Optimality Of Community Structure In Complex Networks
Detection and Attention: Diagnosing Pulmonary Lung Cancer from CT by Imitating Physicians
Corrigendum to ‘SPN graphs: when copositive $=$ SPN’
Multi-appearance Segmentation and Extended 0-1 Program for Dense Small Object Tracking
Passing the Brazilian OAB Exam: data preparation and some experiments
An Enhanced Access Reservation Protocol with a Partial Preamble Transmission Mechanism in NB-IoT Systems
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
A Statistical Model with Qualitative Input
Queueing Analysis for Block Fading Rayleigh Channels in the Low SNR Regime
Age of Information in Two-way Updating Systems Using Wireless Power Transfer
Nonlinearity-tolerant 8D modulation formats by set-partitioning PDM-QPSK
$\forall \exists \mathbb{R}$-completeness and area-universality
Optimized Interface Diversity for Ultra-Reliable Low Latency Communication (URLLC)
Fast robust correlation for high dimensional data
Structural and computational results on platypus graphs
Fluctuation Theorem and Thermodynamic Formalism
Analysis of Latency and MAC-layer Performance for Class A LoRaWAN
Rasa: Open Source Language Understanding and Dialogue Management
Rate of Change Analysis for Interestingness Measures
Towards Deep Modeling of Music Semantics using EEG Regularizers
Semi-Automatic Algorithm for Breast MRI Lesion Segmentation Using Marker-Controlled Watershed Transformation
Cellular Automata Applications in Shortest Path Problem
Constrained BSDEs driven by a non quasi-left-continuous random measure and optimal control of PDMPs on bounded domains
Approximation Algorithms for Replenishment Problems with Fixed Turnover Times
Data Structures for Representing Symmetry in Quadratically Constrained Quadratic Programs
Response of entanglement to annealed vis-à-vis quenched disorder in quantum spin models
Isogeometric shape optimization for nonlinear ultrasound focusing
Context-specific independencies for ordinal variables in chain regression models
Robust Estimation of Similarity Transformation for Visual Object Tracking with Correlation Filters
Generalized Degrees of Freedom of the Symmetric Cache-Aided MISO Broadcast Channel with Partial CSIT
Intrinsic Point of Interest Discovery from Trajectory Data
Image Super-resolution via Feature-augmented Random Forest
Proximodistal Exploration in Motor Learning as an Emergent Property of Optimization
The evaluation of geometric Asian power options under time changed mixed fractional Brownian motion
Poisson brackets symmetry from the pentagon-wheel cocycle in the graph complex
A Performance Evaluation of Local Features for Image Based 3D Reconstruction
Strictly proper kernel scores and characteristic kernels on compact spaces
A Bayesian Clearing Mechanism for Combinatorial Auctions
Constraint and Mathematical Programming Models for Integrated Port Container Terminal Operations
A quantum algorithm to train neural networks using low-depth circuits
Quantifying over boolean announcements
Prior Distributions for the Bradley-Terry Model of Paired Comparisons
Deep CNN ensembles and suggestive annotations for infant brain MRI segmentation
The effect of asymmetry of the coil block on self-assembly in ABC coil-rod-coil triblock copolymers
Model comparison for Gibbs random fields using noisy reversible jump Markov chain Monte Carlo
A Probability Monad as the Colimit of Finite Powers
Analysis and calibration of a linear model for structured cell populations with unidirectional motion : Application to the morphogenesis of ovarian follicles
Monotonic Chunkwise Attention
Equilibria in the Tangle
Partisan gerrymandering with geographically compact districts
Systems of BSDEs with oblique reflection and related optimal switching problem
swordfish: Efficient Forecasting of New Physics Searches without Monte Carlo