**Swish: a Self-Gated Activation Function**

**Gradient-free Policy Architecture Search and Adaptation**

**Linear Regression with Sparsely Permuted Data**

**The Bayesian Sorting Hat: A Decision-Theoretic Approach to Size-Constrained Clustering**

**Reply With: Proactive Recommendation of Email Attachments**

**Constrained Factor Models for High-Dimensional Matrix-Variate Time Series**

**On the challenges of learning with inference networks on sparse, high-dimensional data**

**Convolutional Recurrent Neural Networks for Electrocardiogram Classification**

**Deep Gaussian Covariance Network**

**Iterative Supervised Principal Components**

• Embedding an Edge-colored $K(a^{(p)};λ,μ)$ into a Hamiltonian Decomposition of $K(a^{(p+r)};λ,μ)$

• Embedding factorizations for 3-uniform hypergraphs

• Dominating 2-broadcast in graphs: complexity, bounds and extremal graphs

• HyperDense-Net: A hyper-densely connected CNN for multi-modal image semantic segmentation

• Bisected theta series, least $r$-gaps in partitions, and polygonal numbers

• A Short Note on Improved ROSETA

• Asymptotic distribution of least square estimators for linear models with dependent errors: regular designs

• Derivation of the Chapman-Kolmogorov type equation from a stochastic hybrid system

• Selection of calibrated subaction when temperature goes to zero in the discounted problem

• Convolutional Neural Networks for Sentiment Classification on Business Reviews

• Safe Medicine Recommendation via Medical Knowledge Graph Embedding

• SpecWatch: A Framework for Adversarial Spectrum Monitoring with Unknown Statistics

• Pushing the envelope in deep visual recognition for mobile platforms

• An operational characterization of mutual information in algorithmic information theory

• Free mutual information for two projections

• Contributed Discussion to Uncertainty Quantification for the Horseshoe by Stéphanie van der Pas, Botond Szabó and Aad van der Vaart

• Volumetric Data Exploration with Machine Learning-Aided Visualization in Neutron Science

• The Sandpile Group of a Thick Cycle Graph

• When Do Birds of a Feather Flock Together? $K$-Means, Proximity, and Conic Programming

• The quantum adjacency algebra and subconstituent algebra of a graph

• VAMPnets: Deep learning of molecular kinetics

• Estimating reducible stochastic differential equations by conversion to a least-squares problem

• Global exact controllability of the bilinear of Schroedinger potential type models on quantum graphs

• Quantum query complexity of entropy estimation

• Targeting Interventions in Networks

• Large Scale Replication Projects in Contemporary Psychological Research

• Checking the Soundness of Statistical Tests for Random Number Generators by Using a Three-Level Test

• Stochastic Variance Reduction for Policy Gradient Estimation

• On Hamilton Decompositions of Line Graphs of Non-Hamiltonian Graphs and Graphs without Separating Transitions

• Renormalized Solutions to Stochastic Continuity Equations with Rough Coefficients

• Repetition in Colored Sequences of Balls

• Evolution in Virtual Worlds

• Asymptotically Optimal Sequential Design for Rank Aggregation

• CancerLinker: Explorations of Cancer Study Network

• Data analysis recipes: Using Markov Chain Monte Carlo

• PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts

• Incremental Subgradient Methods for Minimizing The Sum of Quasi-convex Functions

• Estimate exponential memory decay in Hidden Markov Model and its applications

• Optimal Actuator Location of the Minimum Norm Controls for Stochastic Heat Equations

• Discovering Adversarial Examples with Momentum

• Box-Cox elliptical distributions with application

• Matroids and Canonical Forms: Theory and Applications

• Hierarchical Fog-Cloud Computing for IoT Systems: A Computation Offloading Game

• Face Transfer with Generative Adversarial Network

• Multi-Tenant C-RAN With Spectrum Pooling: Downlink Optimization Under Privacy Constraints

• Spontaneous Symmetry Breaking in Neural Networks

• Primal-Dual $π$ Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems

• Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55

• CASICT Tibetan Word Segmentation System for MLWS2017

• A tightness property of relatively smooth permutations

• Countable infinitary theories admitting an invariant measure

• Higher Nerves of Simplicial Complexes

• Scalable Dense Monocular Surface Reconstruction

• Saddle representations of positively homogeneous functions by linear functions

• Operational thermal load forecasting in district heating networks using machine learning and expert advice

• Universal-homogeneous structures are generic

• Planck-scale distribution of nodal length of arithmetic random waves

• Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks

• Combining LiDAR Space Clustering and Convolutional Neural Networks for Pedestrian Detection

• Detecting Bias in Black-Box Models Using Transparent Model Distillation

• Learning to Learn Image Classifiers with Informative Visual Analogy

• Accretion-induced spin-wandering effects on the neutron star in Scorpius X-1: Implications for continuous gravitational wave searches

• Bits through queues with feedback

• Strong Consistency of Spectral Clustering for Stochastic Block Models

• Hybrid Precoder and Combiner Design with Low Resolution Phase Shifters in mmWave MIMO Systems

• A New Coherence-Penalized Minimal Path Model with Application to Retinal Vessel Centerline Delineation

• Partition C*-algebras

• Continuants, run lengths, and Barry’s modified Pascal triangle

• Projective reconstruction in algebraic vision

• Learning to Transfer Initializations for Bayesian Hyperparameter Optimization

• Extremes of $2d$ Coulomb gas: universal intermediate deviation regime

• Fusion of LiDAR and Camera Sensor Data for Environment Sensing in Driverless Vehicles

• 3D Object Discovery and Modeling Using Single RGB-D Images Containing Multiple Object Instances

• Analysis of feature detector and descriptor combinations with a localization experiment for various performance metrics

• Nonlinear Interference Mitigation via Deep Neural Networks

• Real-time marker-less multi-person 3D pose estimation in RGB-Depth camera networks

• Single Shot Temporal Action Detection

• Integrated mmWave Access and Backhaul in 5G: Bandwidth Partitioning and Downlink Analysis

• Stochastic reaction networks with input processes: Analysis and applications to reporter gene systems

• Convergence Rate of Riemannian Hamiltonian Monte Carlo and Faster Polytope Volume Computation

• Procedural Modeling and Physically Based Rendering for Synthetic Data Generation in Automotive Applications

• Combinatorial Penalties: Which structures are preserved by convex relaxations?

• Smooth and Sparse Optimal Transport

• Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

• Existence and uniqueness of reflecting diffusions in cusps

• A tight Erdős-Pósa function for wheel minors

• Preliminary steps toward a universal economic dynamics for monetary and fiscal policy

• On the skeleton of the pyramidal tours polytope

• A Deep Learning Approach for Reconstruction Filter Kernel Discretization

• VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition

• A generative model for sparse, evolving digraphs

• Describing Natural Images Containing Novel Objects with Knowledge Guided Assistance

• Towards CT-quality Ultrasound Imaging using Deep Learning

• A group version of stable regularity

• Paying Attention to Multi-Word Expressions in Neural Machine Translation

• DASHMM Accelerated Adaptive Fast Multipole Poisson-Boltzmann Solver on Distributed Memory Architecture

• Beat by Beat: Classifying Cardiac Arrhythmias with Recurrent Neural Networks

• Symbol Erasure Correction Capability of Spread Codes

• Factor Models for High-Dimensional Dynamic Networks: with Application to International Trade Flow Time Series 1981-2015

• Distributed algorithm for empty vehicles management in personal rapid transit (PRT) network

• Hemisystems of the Hermitian Surface

• Understanding the Correlation Gap for Matchings

• Compound Poisson approximation of subgraph counts in stochastic block models with multiple edges

• Reflection local times of diffusions at elastic boundaries

• Multivariate Spatio-temporal Kriging on Latent Low-dimensional Functional Structures with Non-stationarity

• Spectra of Wishart Matrices with size-dependent entries

• Wigner functions for the pair angle and orbital angular momentum: Possible applications in quantum information theories

• Good Arm Identification via Bandit Feedback

• On the spectrum of directed uniform and non-uniform hypergraphs

• Specialising Word Vectors for Lexical Entailment

• The Hard Problems Are Almost Everywhere For Random CNF-XOR Formulas

• Containment problem and combinatorics

• Convergence diagnostics for stochastic gradient descent with constant step size

• Projective planes and set multipartite Ramsey numbers for $C_4$ versus star

• Efficient Neighbor-Finding on Space-Filling Curves

• Fishing for Clickbaits in Social Images and Texts with Linguistically-Infused Neural Network Models

• RETUYT in TASS 2017: Sentiment Analysis for Spanish Tweets using SVM and CNN

• Laying Down the Yellow Brick Road: Development of a Wizard-of-Oz Interface for Collecting Human-Robot Dialogue

• Multi-task Domain Adaptation for Deep Learning of Instance Grasping from Simulation

• Domain Randomization and Generative Models for Robotic Grasping