Robust Estimation of Data-Dependent Causal Effects based on Observing a Single Time-Series

Consider the case that one observes a single time-series, where at each time t one observes a data record O(t) involving treatment nodes A(t), possible covariates L(t) and an outcome node Y(t). The data record at time t carries information for a (potentially causal) effect of the treatment A(t) on the outcome Y(t), in the context defined by a fixed dimensional summary measure Co(t). We are concerned with defining causal effects that can be consistently estimated, with valid inference, for sequentially randomized experiments without further assumptions. More generally, we consider the case when the (possibly causal) effects can be estimated in a double robust manner, analogous to double robust estimation of effects in the i.i.d. causal inference literature. We propose a general class of averages of conditional (context-specific) causal parameters that can be estimated in a double robust manner, therefore fully utilizing the sequential randomization. We propose a targeted maximum likelihood estimator (TMLE) of these causal parameters, and present a general theorem establishing the asymptotic consistency and normality of the TMLE. We extend our general framework to a number of typically studied causal target parameters, including a sequentially adaptive design within a single unit that learns the optimal treatment rule for the unit over time. Our work opens up robust statistical inference for causal questions based on observing a single time-series on a particular unit.

Wavelet estimation of the dimensionality of curve time series

Functional data analysis is ubiquitous in most areas of science and engineering. Several paradigms have been proposed to deal with the dimensionality problem which is inherent to this type of data. Sparseness, penalization, thresholding, among other principles, have been used to tackle this issue. We discuss here a solution based on a finite-dimensional functional space. We employ wavelet representation of the functionals to estimate this finite dimension, and successfully model a time series of curves. The proposed method is shown to have nice asymptotic properties. Moreover, the wavelet representation permits the use of several bootstrap procedures, and it results in faster computing algorithms. Besides the theoretical and computational properties, some simulation studies and an application to real data are provided.

Deep Smoke Segmentation

Inspired by the recent success of fully convolutional networks (FCN) in semantic segmentation, we propose a deep smoke segmentation network to infer high quality segmentation masks from blurry smoke images. To overcome large variations in texture, color and shape of smoke appearance, we divide the proposed network into a coarse path and a fine path. The first path is an encoder-decoder FCN with skip structures, which extracts global context information of smoke and accordingly generates a coarse segmentation mask. To retain fine spatial details of smoke, the second path is also designed as an encoder-decoder FCN with skip structures, but it is shallower than the first path network. Finally, we propose a very small network containing only add, convolution and activation layers to fuse the results of the two paths. Thus, we can easily train the proposed network end to end for simultaneous optimization of network parameters. To avoid the difficulty in manually labelling fuzzy smoke objects, we propose a method to generate synthetic smoke images. According to results of our deep segmentation method, we can easily and accurately perform smoke detection from videos. Experiments on three synthetic smoke datasets and a realistic smoke dataset show that our method achieves much better performance than state-of-the-art segmentation algorithms based on FCNs. Test results of our method on videos are also appealing.

Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text

Open Domain Question Answering (QA) is evolving from complex pipelined systems to end-to-end deep neural networks. Specialized neural models have been developed for extracting answers from either text alone or Knowledge Bases (KBs) alone. In this paper we look at a more practical setting, namely QA over the combination of a KB and entity-linked text, which is appropriate when an incomplete KB is available with a large text corpus. Building on recent advances in graph representation learning we propose a novel model, GRAFT-Net, for extracting answers from a question-specific subgraph containing text and KB entities and relations. We construct a suite of benchmark tasks for this problem, varying the difficulty of questions, the amount of training data, and KB completeness. We show that GRAFT-Net is competitive with the state-of-the-art when tested using either KBs or text alone, and vastly outperforms existing methods in the combined setting. Source code is available at https://…/GraftNet .

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation

We introduce Texar, an open-source toolkit aiming to support the broad set of text generation tasks that transform any inputs into natural language, such as machine translation, summarization, dialog, content manipulation, and so forth. With the design goals of modularity, versatility, and extensibility in mind, Texar extracts common patterns underlying the diverse tasks and methodologies, creates a library of highly reusable modules and functionalities, and allows arbitrary model architectures and algorithmic paradigms. In Texar, model architecture, losses, and learning processes are fully decomposed. Modules at high concept level can be freely assembled or plugged in/swapped out. These features make Texar particularly suitable for researchers and practitioners to do fast prototyping and experimentation, as well as foster technique sharing across different text generation tasks. We provide case studies to demonstrate the use and advantage of the toolkit. Texar is released under Apache license 2.0 at https://…/texar.

A Recurrent Neural Network for Sentiment Quantification

Quantification is a supervised learning task that consists in predicting, given a set of classes C and a set D of unlabelled items, the prevalence (or relative frequency) p(c|D) of each class c in C. Quantification can in principle be solved by classifying all the unlabelled items and counting how many of them have been attributed to each class. However, this ‘classify and count’ approach has been shown to yield suboptimal quantification accuracy; this has established quantification as a task of its own, and given rise to a number of methods specifically devised for it. We propose a recurrent neural network architecture for quantification (that we call QuaNet) that observes the classification predictions to learn higher-order ‘quantification embeddings’, which are then refined by incorporating quantification predictions of simple classify-and-count-like methods. We test QuaNet on sentiment quantification on text, showing that it substantially outperforms several state-of-the-art baselines.
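
The ‘classify and count’ baseline described above can be sketched in a few lines. This is only an illustration of that baseline, not of QuaNet; the function name `classify_and_count` and the toy labels are assumptions made for the example.

```python
from collections import Counter


def classify_and_count(predicted_labels, classes):
    """Estimate p(c|D) as the fraction of items a classifier assigned to c."""
    if not predicted_labels:
        raise ValueError("need at least one prediction")
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    # Classes that were never predicted get prevalence 0.
    return {c: counts.get(c, 0) / total for c in classes}


# Hypothetical classifier outputs for eight unlabelled items.
predictions = ["pos", "neg", "pos", "pos", "neu", "neg", "pos", "pos"]
prevalence = classify_and_count(predictions, ["pos", "neg", "neu"])
print(prevalence)
```

The estimate inherits any systematic bias of the underlying classifier, which is one reason the abstract treats quantification as a task of its own.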

Understanding Regularization in Batch Normalization

Batch Normalization (BN) makes the output of a hidden neuron have zero mean and unit variance, improving convergence and generalization when training neural networks. This work seeks to understand these phenomena theoretically. We analyze BN by using a building block of neural networks, which consists of a weight layer, a BN layer, and a nonlinear activation function. This simple network helps us understand the characteristics of BN, where the results are generalized to deep models in numerical studies. We explore BN in three aspects. First, by viewing BN as a stochastic process, an analytical form of the regularization inherent in BN is derived. Second, the optimization dynamics with this regularization show that BN enables training to converge with large maximum and effective learning rates. Third, BN’s generalization with regularization is explored by using random matrix theory and statistical mechanics. Both simulations and experiments support our analyses.
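
As a minimal sketch of the normalization step itself (not of the paper's analysis), the following standardizes one neuron's activations over a mini-batch. The names `batch_norm`, `gamma`, `beta` and `eps` follow common usage and are assumptions here, as is the toy batch.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one neuron's activations over a mini-batch, then scale and shift."""
    mean = sum(values) / len(values)
    # Biased (population) variance of the mini-batch.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    inv_std = 1.0 / math.sqrt(var + eps)  # eps guards against division by zero
    return [gamma * (v - mean) * inv_std + beta for v in values]


batch = [2.0, 4.0, 6.0, 8.0]  # hypothetical pre-activations for one neuron
normalized = batch_norm(batch)
print(normalized)
```

Because the mean and variance are computed from whichever mini-batch is drawn, the normalized output of a given example changes from batch to batch; this appears to be the randomness the abstract has in mind when it views BN as a stochastic process.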

A Neural Network Model for Determining the Success or Failure of High-tech Projects Development: A Case of Pharmaceutical industry

Financing high-tech projects always entails a great deal of risk. The lack of a systematic method to pinpoint the risk of such projects has been recognized as one of the most salient barriers for evaluating them. So, in order to develop a mechanism for evaluating high-tech projects, an Artificial Neural Network (ANN) has been developed through this study. The structure of this paper encompasses five parts. The first part introduces the paper as a whole. The second part gives a literature review. The third part discusses the collection of risk-related variables and the development of a Risk Assessment Index System (RAIS) through Principal Component Analysis (PCA). The fourth part deals specifically with the pharmaceutical industry. Finally, the fifth part focuses on developing an ANN for pattern recognition of the failure or success of high-tech projects. An analysis of the model’s results and a final conclusion are also presented in this part.

Geometric Operator Convolutional Neural Network

The Convolutional Neural Network (CNN) has been successfully applied in many fields during recent decades; however it lacks the ability to utilize prior domain knowledge when dealing with many realistic problems. We present a framework called Geometric Operator Convolutional Neural Network (GO-CNN) that uses domain knowledge, wherein the kernel of the first convolutional layer is replaced with a kernel generated by a geometric operator function. This framework integrates many conventional geometric operators, which allows it to adapt to a diverse range of problems. Under certain conditions, we theoretically analyze the convergence and the bound of the generalization errors between GO-CNNs and common CNNs. Although the geometric operator convolution kernels have fewer trainable parameters than common convolution kernels, the experimental results indicate that GO-CNN performs more accurately than common CNN on CIFAR-10/100. Furthermore, GO-CNN reduces dependence on the amount of training examples and enhances adversarial stability. In the practical task of medically diagnosing bone fractures, GO-CNN obtains 3% improvement in terms of the recall.

Parameter Transfer Extreme Learning Machine based on Projective Model

In recent years, transfer learning has attracted much attention in the machine learning community. In this paper, we mainly focus on the tasks of parameter transfer under the framework of extreme learning machine (ELM). Unlike the existing parameter transfer approaches, which incorporate the source model information into the target by regularizing the difference between the source and target domain parameters, an intuitively appealing projective model is proposed to bridge the source and target model parameters. Specifically, we formulate the parameter transfer in the ELM networks by means of parameter projection, and train the model by optimizing the projection matrix and classifier parameters jointly. Furthermore, the $\ell_{2,1}$-norm structured sparsity penalty is imposed on the source domain parameters, which encourages joint feature selection and parameter transfer. To evaluate the effectiveness of the proposed method, comprehensive experiments on several commonly used domain adaptation datasets are presented. The results show that the proposed method significantly outperforms the non-transfer ELM networks and other classical transfer learning methods.
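
To make the structured sparsity penalty concrete, here is a small sketch of the L2,1 norm of a matrix under the common row-wise convention: the sum of the Euclidean norms of the rows. Whether the paper groups by rows or by columns is not stated in the abstract, so the convention, the function name `l21_norm` and the toy matrix are all assumptions.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean (L2) norms of the rows of a matrix."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


# Hypothetical 3x2 parameter matrix; the middle row is entirely zero.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [6.0, 8.0]]
value = l21_norm(W)  # 5 + 0 + 10
print(value)
```

Penalizing this quantity tends to drive whole rows to zero rather than individual entries, which is why it is associated with selecting or discarding entire features.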

JobComposer: Career Path Optimization via Multicriteria Utility Learning

With online professional network platforms (OPNs, e.g., LinkedIn, Xing, etc.) becoming popular on the web, people are now turning to these platforms to create and share their professional profiles, to connect with others who share similar professional aspirations and to explore new career opportunities. These platforms however do not offer a long-term roadmap to guide career progression and improve workforce employability. The career trajectories of OPN users can serve as a reference but they are not always optimal. A career plan can also be devised through consultation with career coaches, whose knowledge may however be limited to a few industries. To address the above limitations, we present a novel data-driven approach dubbed JobComposer to automate career path planning and optimization. Its key premise is that the observed career trajectories in OPNs may not necessarily be optimal, and can be improved by learning to maximize the sum of payoffs attainable by following a career path. At its heart, JobComposer features a decomposition-based multicriteria utility learning procedure to achieve the best tradeoff among different payoff criteria in career path planning. Extensive studies using a city state-based OPN dataset demonstrate that JobComposer returns career paths better than other baseline methods and the actual career paths.

Chi-Square Test Neural Network: A New Binary Classifier based on Backpropagation Neural Network

We introduce the chi-square test neural network: a single hidden layer backpropagation neural network that uses the chi-square test theorem to redefine the cost function and the error function. The weights and thresholds are modified using the standard backpropagation algorithm. The proposed approach has the advantage of keeping the data distribution consistent across the training and testing sets. It can be used for binary classification. The experimental results on real-world data sets indicate that the proposed algorithm can significantly improve classification accuracy compared to related approaches.

An outlier-resistant indicator of anomalies among inter-laboratory comparison data with associated uncertainty

A new robust pairwise statistic, the pairwise median scaled difference (MSD), is proposed for the detection of anomalous location/uncertainty pairs in heteroscedastic interlaboratory study data with associated uncertainties. The distribution for the IID case is presented and approximate critical values for routine use are provided. The determination of observation-specific quantiles and p-values for heteroscedastic data, using parametric bootstrapping, is demonstrated by example. It is shown that the statistic has good power for detecting anomalies compared to a previous pairwise statistic, and offers much greater resistance to multiple outlying values.
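
The abstract does not spell out the statistic, so the sketch below rests on an assumed definition: for laboratory i, take the median over all other laboratories j of |x_i - x_j| / sqrt(u_i^2 + u_j^2), where x is the reported value and u its standard uncertainty. The function name and the data are hypothetical, and the paper's exact form may differ.

```python
import math
from statistics import median


def median_scaled_difference(values, uncertainties):
    """Assumed MSD: per-lab median of uncertainty-scaled absolute pairwise differences."""
    n = len(values)
    if n < 2 or n != len(uncertainties):
        raise ValueError("need at least two values with matching uncertainties")
    result = []
    for i in range(n):
        scaled = [
            abs(values[i] - values[j])
            / math.sqrt(uncertainties[i] ** 2 + uncertainties[j] ** 2)
            for j in range(n)
            if j != i
        ]
        result.append(median(scaled))
    return result


# Hypothetical interlaboratory results; the last value is deliberately discrepant.
x = [10.0, 10.2, 9.9, 14.0]
u = [0.3, 0.4, 0.3, 0.3]
msd = median_scaled_difference(x, u)
print(msd)
```

Taking a median over the pairwise comparisons is what would give such a statistic its resistance to several outlying values at once: one laboratory's score is not inflated just because a few of the others are discrepant.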

DeepPINK: reproducible feature selection in deep neural networks

Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.

Causal Explanation Analysis on Social Media

Understanding causal explanations – reasons given for happenings in one’s life – has been found to be an important psychological factor linked to physical and mental health. Causal explanations are often studied through manual identification of phrases over limited samples of personal writing. Automatic identification of causal explanations in social media, while challenging in relying on contextual and sequential cues, offers a larger-scale alternative to expensive manual ratings and opens the door for new applications (e.g. studying prevailing beliefs about causes, such as climate change). Here, we explore automating causal explanation analysis, building on discourse parsing, and presenting two novel subtasks: causality detection (determining whether a causal explanation exists at all) and causal explanation identification (identifying the specific phrase that is the explanation). We achieve strong accuracies for both tasks but find different approaches best: an SVM for causality prediction (F1 = 0.791) and a hierarchy of Bidirectional LSTMs for causal explanation identification (F1 = 0.853). Finally, we explore applications of our complete pipeline (F1 = 0.868), showing demographic differences in mentions of causal explanation and that the association between a word and sentiment can change when it is used within a causal explanation.

Generating More Interesting Responses in Neural Conversation Models with Distributional Constraints

Neural conversation models tend to generate safe, generic responses for most inputs. This is due to the limitations of likelihood-based decoding objectives in generation tasks with diverse outputs, such as conversation. To address this challenge, we propose a simple yet effective approach for incorporating side information in the form of distributional constraints over the generated responses. We propose two constraints that help generate more content rich responses that are based on a model of syntax and topics (Griffiths et al., 2005) and semantic similarity (Arora et al., 2016). We evaluate our approach against a variety of competitive baselines, using both automatic metrics and human judgments, showing that our proposed approach generates responses that are much less generic without sacrificing plausibility. A working demo of our code can be found at https://…/DC-NeuralConversation.

Graph-based Deep-Tree Recursive Neural Network (DTRNN) for Text Classification

A novel graph-to-tree conversion mechanism called the deep-tree generation (DTG) algorithm is first proposed to predict text data represented by graphs. The DTG method can generate a richer and more accurate representation for nodes (or vertices) in graphs. It adds flexibility in exploring the vertex neighborhood information to better reflect the second order proximity and homophily equivalence in a graph. Then, a Deep-Tree Recursive Neural Network (DTRNN) method is presented and used to classify vertices that contain text data in graphs. To demonstrate the effectiveness of the DTRNN method, we apply it to three real-world graph datasets and show that the DTRNN method outperforms several state-of-the-art benchmarking methods.

Compositional Stochastic Average Gradient for Machine Learning and Related Applications

Many machine learning, statistical inference, and portfolio optimization problems require minimization of a composition of expected value functions (CEVF). Of particular interest are the finite-sum versions of such compositional optimization problems (FS-CEVF). Compositional stochastic variance reduced gradient (C-SVRG) methods that combine stochastic compositional gradient descent (SCGD) and stochastic variance reduced gradient descent (SVRG) methods are the state-of-the-art methods for FS-CEVF problems. We introduce compositional stochastic average gradient descent (C-SAG), a novel extension of the stochastic average gradient method (SAG) to minimize compositions of finite-sum functions. C-SAG, like SAG, estimates the gradient by incorporating memory of previous gradient information. We present theoretical analyses of C-SAG which show that C-SAG, like SAG and C-SVRG, achieves a linear convergence rate when the objective function is strongly convex; however, C-SAG achieves lower oracle query complexity per iteration than C-SVRG. Finally, we present results of experiments showing that C-SAG converges substantially faster than full gradient (FG), as well as C-SVRG.
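
The gradient-memory idea that C-SAG takes from SAG can be illustrated with plain SAG on a one-dimensional least-squares problem. This sketch is not C-SAG: the compositional structure is left out entirely, and the function name, step size choice of 1/(16L) and toy data are assumptions for illustration.

```python
import random


def sag_least_squares(a, b, steps=5000, seed=0):
    """Plain SAG for minimizing (1/n) * sum_i 0.5 * (a[i] * w - b[i]) ** 2 over scalar w."""
    n = len(a)
    lipschitz = max(x * x for x in a)  # largest per-term gradient Lipschitz constant
    lr = 1.0 / (16.0 * lipschitz)      # conservative step size (assumed choice)
    w = 0.0
    stored = [0.0] * n                 # memory: last gradient seen for each term
    grad_sum = 0.0                     # running sum of the stored gradients
    rng = random.Random(seed)
    for _ in range(steps):
        i = rng.randrange(n)
        new_grad = a[i] * (a[i] * w - b[i])
        # Swap the stale gradient of term i for the fresh one.
        grad_sum += new_grad - stored[i]
        stored[i] = new_grad
        # Step along the average of all stored gradients.
        w -= lr * grad_sum / n
    return w


# Hypothetical data generated by b = 2 * a, so the minimizer is w = 2.
w_hat = sag_least_squares([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(w_hat)
```

Each iteration evaluates only one term's gradient yet steps along an average over all terms, which is the memory mechanism the abstract refers to.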

Compound Poisson approximation for random fields with application to sequence alignment
A Local Lemma for Focused Stochastic Algorithms
Edit Errors with Block Transpositions: Deterministic Document Exchange Protocols and Almost Optimal Binary Codes
emrQA: A Large Corpus for Question Answering on Electronic Medical Records
An Optimal $χ$-Bound for ($P_6$, diamond)-Free Graphs
‘Read My Lips’: Using Automatic Text Analysis to Classify Politicians by Party and Ideology
Exhaustive generation for permutations avoiding a (colored) regular sets of patterns
Vandermonde Factorization of Hankel Matrix for Complex Exponential Signal Recovery — Application in Fast NMR Spectroscopy
Information Signal Design for Incentivizing Team Formation
Analysis for the Slow Convergence in Arimoto Algorithm
End-to-end Multimodal Emotion and Gender Recognition with Dynamic Weights of Joint Loss
Bounding the number of self-avoiding walks: Hammersley-Welsh with polygon insertion
Adaptive Douglas-Rachford Splitting Algorithm for the Sum of Two Operators
Spatial-Spectral Fusion by Combining Deep Learning and Variation Model
A note on heat kernel estimates, resistance bounds and Poincaré inequality
Robust Iris Segmentation Based on Fully Convolutional Networks and Generative Adversarial Networks
Transferring Deep Reinforcement Learning with Adversarial Objective and Augmentation
Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing
Bounds on the edge-Wiener index of cacti with $n$ vertices and $t$ cycles
Stretched exponential decay of correlations in the quasiperiodic continuum percolation model
PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track
Cycle Ramsey numbers for random graphs
Matrix Infinitely Divisible Series: Tail Inequalities and Applications in Optimization
Lipschitz Networks and Distributional Robustness
Complexity Reduction for Systems of Interacting Orientable Agents: Beyond The Kuramoto Model
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
Adelic Extension Classes, Atiyah Bundles and Non-Commutative Codes
A comparative study of top-k high utility itemset mining methods
Sion’s mini-max theorem and Nash equilibrium in a multi-players game with two groups which is zero-sum and symmetric in each group
Nash equilibrium in asymmetric multi-players zero-sum game with two strategic variables and only one alien
Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions
Constructing a solution of the $(2+1)$-dimensional KPZ equation
Equalization with Expectation Propagation at Smoothing Level
A Novel A Priori Simulation Algorithm for Absorbing Receivers in Diffusion-Based Molecular Communication Systems
A Deep Learning Spatiotemporal Prediction Framework for Mobile Crowdsourced Services
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
Counterexamples to a conjecture of Las Vergnas
Multiplicative random cascades with additional stochastic process in financial markets
High-dimensional varying index coefficient quantile regression model
Multi-species neutron transport equation
Improving the Expressiveness of Deep Learning Frameworks with Recursion
Bounded Rational Decision-Making with Adaptive Neural Network Priors
Metabolize Neural Network
Framework for Discrete Rate Transmission in Buffer-Aided Underlay CRN With Direct Path
Existence, uniqueness and stability of semi-linear rough partial differential equations
Music Sequence Prediction with Mixture Hidden Markov Models
Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories
Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images
Stabilization of port-Hamiltonian systems with discontinuous energy densities
Non-monotonic Reasoning in Deductive Argumentation
Unveiling co-evolutionary patterns in systems of cities: a systematic exploration of the SimpopNet model
Handwriting styles: benchmarks and evaluation metrics
Algebraic matroids in action
Private Information Retrieval From a Cellular Network With Caching at the Edge
On the predictive power of database classifiers formed by a small network of interacting chemical oscillators
Automated bird sound recognition in realistic settings
An elementary proof of de Finetti’s Theorem
MesoNet: a Compact Facial Video Forgery Detection Network
A note on the spectra of some subgraphs of the hypercube
Improving full waveform inversion by wavefield reconstruction with the alternating direction method of multipliers
A Simple and Practical Concurrent Non-blocking Unbounded Graph with Reachability Queries
Image Reassembly Combining Deep Learning and Shortest Path Problem
Parity Crowdsourcing for Cooperative Labeling
Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation
Bangla License Plate Recognition Using Convolutional Neural Networks (CNN)
Treewidth of display graphs: bounds, brambles and applications
Optimal Distributed and Tangential Boundary Control for the Unsteady Stochastic Stokes Equations
OCNet: Object Context Network for Scene Parsing
Segmentation-free compositional $n$-gram embedding
Trees and linear anticomplete pairs
Proof of a Conjecture of Galvin
Planar graphs without cycles of lengths 4 and 5 and close triangles are DP-3-colorable
Lifted Projective Reed-Solomon Codes
Faster Balanced Clusterings in High Dimension
Improving generalization of vocal tract feature reconstruction: from augmented acoustic inversion to articulatory feature reconstruction without articulatory data
From Möbius inversion to renormalisation
Noisy Voronoi: a Simple Framework for Terminal-Clustering Problems
Compressive Hyperspectral Imaging: Fourier Transform Interferometry meets Single Pixel Camera
A class of orders with linear? time sorting algorithm
Cone valuations, Gram’s relation, and flag-angles
How to model fake news
Energy-Efficient Mobile-Edge Computation Offloading for Applications with Shared Data
A simplified proof of weak convergence in Douglas-Rachford method to a solution of the underlying inclusion problem
Equivalence of approximation by convolutional neural networks and fully-connected networks
Optimal Reinsurance for Gerber-Shiu Functions in the Cramer-Lundberg Model
Étude de l’informativité des transcriptions : une approche basée sur le résumé automatique
Several classes of optimal Ferrers diagram rank-metric codes
Aesthetic Discrimination of Graph Layouts
Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization
Computing optimal discrete readout weights in reservoir computing is NP-hard
A Neural Network Aided Approach for LDPC Coded DCO-OFDM with Clipping Distortion
Toric degenerations of Grassmannians from matching fields
Determining the Number of Communities in Degree-corrected Stochastic Block Models
Using SIMD and SIMT vectorization to evaluate sparse chemical kinetic Jacobian matrices and thermochemical source terms
Saving Lives at Sea with UAV-assisted Wireless Networks
A Roadmap for the Value-Loading Problem
Shape-Enforcing Operators for Point and Interval Estimators
Iris recognition in cases of eye pathology
The Effect of Context on Metaphor Paraphrase Aptness Judgments
Guaranteed simulation error bounds for linear time invariant systems identified from data
The Effect of Time Delay on the Average Data Rate and Performance in Networked Control Systems
A Novel Neural Sequence Model with Multiple Attentions for Word Sense Disambiguation
Reasoning in Bayesian Opinion Exchange Networks Is PSPACE-Hard
Energy Efficient Resource Allocation for Mobile-Edge Computation Networks with NOMA
A Quantum Spatial Graph Convolutional Neural Network using Quantum Passing Information
Scaling limits of discrete optimal transport
Adversarial Attacks on Node Embeddings
Accelerating Beam Sweeping in mmWave Standalone 5G New Radios using Recurrent Neural Networks
Distributed Nonconvex Constrained Optimization over Time-Varying Digraphs
Localization of Neumann Eigenfunctions near Irregular Boundaries
Text2Scene: Generating Abstract Scenes from Textual Descriptions
A note on the tight example in On the randomised query complexity of composition
VideoMatch: Matching based Video Object Segmentation
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Unsupervised Video Object Segmentation using Motion Saliency-Guided Spatio-Temporal Propagation
‘This is why we play’: Characterizing Online Fan Communities of the NBA Teams
Quantifier-free description of the solutions set of the generalized interval-quantifier system of linear equations
Challenges of capturing engagement on Facebook for Altmetrics
On the minimal displacement vector of compositions and convex combinations of nonexpansive mappings
Random Language Model: a path to principled complexity
SOS lower bounds with hard constraints: think global, act local
Hybrid Master Equation for Jump-Diffusion Approximation of Biomolecular Reaction Networks
A Primal-Dual Quasi-Newton Method for Exact Consensus Optimization
The Saddle Point Problem of Polynomials
Vulcan: A Monte Carlo Algorithm for Large Chance Constrained MDPs with Risk Bounding Functions
Testing for exponentiality for stationary associated random variables