Use Of Vapnik-Chervonenkis Dimension in Model Selection

In this dissertation, I derive a new method to estimate the Vapnik-Chervonenkis Dimension (VCD) for the class of linear functions. This method is inspired by the technique developed by Vapnik et al. (1994). My contribution rests on the approximation of the expected maximum difference between two empirical losses (EMDBTEL). Specifically, I use a cross-validated form of the error to compute the EMDBTEL, and I tighten the bound on the EMDBTEL by minimizing a constant in its upper bound. I also derive two bounds for the true unknown risk using the additive (ERM1) and the multiplicative (ERM2) Chernoff bounds. These bounds depend on the estimated VCD and the empirical risk. They can be used to perform model selection and to declare that, with high probability, the chosen model will perform better, without making strong assumptions about the data generating process (DG). I measure the accuracy of my technique on simulated datasets and on three real datasets. The model selection provided by VCD was always as good as, if not better than, the other methods under reasonable conditions.
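To illustrate how a bound that depends on the VCD and the empirical risk can drive model selection, here is a minimal sketch. It uses the textbook Vapnik bound for 0-1 loss, not the dissertation's ERM1 or ERM2 bounds, which the abstract does not spell out. The function names, the candidate tuples, and the default confidence level `eta` are assumptions made for illustration.

```python
import math


def vc_risk_bound(empirical_risk, n, h, eta=0.05):
    """Textbook VC upper bound on the true risk (holds with prob. >= 1 - eta).

    Assumption: this is the classical bound
        R <= R_emp + sqrt((h * (ln(2n/h) + 1) - ln(eta/4)) / n),
    used here as a stand-in for the bounds derived in the dissertation.
    """
    if not 0 < h <= n:
        raise ValueError("need 0 < h <= n")
    if not 0 < eta < 1:
        raise ValueError("need 0 < eta < 1")
    confidence_term = math.sqrt(
        (h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n
    )
    return empirical_risk + confidence_term


def select_model(candidates, n, eta=0.05):
    """Pick the candidate (name, empirical_risk, vcd) with the smallest bound."""
    return min(candidates, key=lambda c: vc_risk_bound(c[1], n, c[2], eta))[0]


# Hypothetical candidates: (name, empirical risk, estimated VCD).
candidates = [("small", 0.20, 5), ("medium", 0.10, 20), ("large", 0.09, 400)]
best = select_model(candidates, n=1000)
```

The design point is that the penalty term grows with the estimated VCD, so a model with a slightly lower empirical risk but a much larger VCD can still lose the comparison.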

Loss Data Analytics

Loss Data Analytics is an interactive, online, freely available text. The idea behind the name Loss Data Analytics is to integrate classical loss data models from applied probability with modern analytic tools. In particular, we seek to recognize that big data (including social media and usage based insurance) are here and high speed computation is readily available. The online version contains many interactive objects (quizzes, computer demonstrations, interactive graphs, video, and the like) to promote deeper learning. A subset of the book is available for offline reading in pdf and EPUB formats. The online text will be available in multiple languages to promote access to a worldwide audience.

Learning to Exploit Invariances in Clinical Time-Series Data using Sequence Transformer Networks

Recently, researchers have started applying convolutional neural networks (CNNs) with one-dimensional convolutions to clinical tasks involving time-series data. This is due, in part, to their computational efficiency relative to recurrent neural networks and their ability to efficiently exploit certain temporal invariances (e.g., phase invariance). However, it is well-established that clinical data may exhibit many other types of invariances (e.g., scaling). While preprocessing techniques (e.g., dynamic time warping) may successfully transform and align inputs, their use often requires one to identify the types of invariances in advance. In contrast, we propose the use of Sequence Transformer Networks, an end-to-end trainable architecture that learns to identify and account for invariances in clinical time-series data. Applied to the task of predicting in-hospital mortality, our proposed approach achieves an improvement in the area under the receiver operating characteristic curve (AUROC) relative to a baseline CNN (AUROC=0.851 vs. AUROC=0.838). Our results suggest that a variety of valuable invariances can be learned directly from the data.

Neural Relation Extraction via Inner-Sentence Noise Reduction and Transfer Learning

Extracting relations is critical for knowledge base completion and construction, in which distant supervised methods are widely used to extract relational facts automatically with the existing knowledge bases. However, the automatically constructed datasets contain many low-quality sentences with noisy words, a problem that current distant supervised methods neglect, resulting in unacceptable precision. To mitigate this problem, we propose a novel word-level distant supervised approach for relation extraction. We first build a Sub-Tree Parse (STP) to remove noisy words that are irrelevant to relations. Then we construct a neural network that takes the sub-tree as input while applying entity-wise attention to identify the important semantic features of relational words in each instance. To make our model more robust against noisy words, we initialize our network with a priori knowledge learned from the relevant task of entity classification by transfer learning. We conduct extensive experiments using the corpora of New York Times (NYT) and Freebase. Experiments show that our approach is effective and improves the area under the Precision/Recall (PR) curve from 0.35 to 0.39 over the state-of-the-art work.

Semi-Supervised Learning for Neural Keyphrase Generation

We study the problem of generating keyphrases that summarize the key points for a given document. While sequence-to-sequence (seq2seq) models have achieved remarkable performance on this task (Meng et al., 2017), model training often relies on large amounts of labeled data, which is only applicable to resource-rich domains. In this paper, we propose semi-supervised keyphrase generation methods by leveraging both labeled data and large-scale unlabeled samples for learning. Two strategies are proposed. First, unlabeled documents are tagged with synthetic keyphrases obtained from unsupervised keyphrase extraction methods or a self-learning algorithm, and then combined with labeled samples for training. Second, we investigate a multi-task learning framework to jointly learn to generate keyphrases as well as the titles of the articles. Experimental results show that our semi-supervised learning-based methods outperform a state-of-the-art model trained with labeled data only.

LRMM: Learning to Recommend with Missing Modalities

Multimodal learning has shown promising performance in content-based recommendation due to the auxiliary user and item information of multiple modalities such as text and images. However, the problem of incomplete and missing modalities is rarely explored, and most existing methods fail to learn a recommendation model with missing or corrupted modalities. In this paper, we propose LRMM, a novel framework that mitigates not only the problem of missing modalities but also, more generally, the cold-start problem of recommender systems. We propose modality dropout (m-drop) and a multimodal sequential autoencoder (m-auto) to learn multimodal representations for complementing and imputing missing modalities. Extensive experiments on real-world Amazon data show that LRMM achieves state-of-the-art performance on rating prediction tasks. More importantly, LRMM is more robust than previous methods in alleviating data sparsity and the cold-start problem.

zoNNscan: a boundary-entropy index for zone inspection of neural models

The training of deep neural network classifiers results in decision boundaries whose geometry is still not well understood. This is directly related to classification problems such as so-called adversarial examples. We introduce zoNNscan, an index that is intended to inform on the boundary uncertainty (in terms of the presence of other classes) around one given input datapoint. It is based on confidence entropy, and is implemented through sampling in the multidimensional ball surrounding that input. We detail the zoNNscan index, give an algorithm for approximating it, and finally illustrate its benefits on four applications, including two important problems for the adoption of deep networks in critical systems: adversarial examples and corner case inputs. We highlight that, in those two problem classes, zoNNscan exhibits significantly higher values than for standard inputs.
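The idea of an entropy-based index estimated by sampling around an input can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the sampling region is taken to be an axis-aligned cube (an L-infinity ball), the entropy is normalized by the log of the number of classes, and `toy_classifier` is a hypothetical two-class model that stands in for a trained network.

```python
import math
import random


def normalized_entropy(probs):
    """Shannon entropy of a confidence vector, scaled to [0, 1]."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def boundary_entropy_index(classifier, x, radius, n_samples=200, rng=None):
    """Monte Carlo average of the confidence entropy around x.

    Assumption: neighbours are drawn uniformly from the cube
    [x_i - radius, x_i + radius] in every coordinate.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(n_samples):
        neighbour = [xi + rng.uniform(-radius, radius) for xi in x]
        total += normalized_entropy(classifier(neighbour))
    return total / n_samples


def toy_classifier(x):
    """Hypothetical 2-class model whose decision boundary is x[0] = 0."""
    p = 1.0 / (1.0 + math.exp(-10.0 * x[0]))
    return [p, 1.0 - p]


near_boundary = boundary_entropy_index(toy_classifier, [0.0, 0.0], radius=0.1)
far_from_boundary = boundary_entropy_index(toy_classifier, [3.0, 0.0], radius=0.1)
```

In this toy setting the index is high for the point sitting on the boundary and close to zero for the point deep inside one class, which is the qualitative behaviour the abstract describes.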

Composite Hashing for Data Stream Sketches

In rapid and massive data streams, it is often not possible to estimate the frequency of items with complete accuracy. To perform the operation in a reasonable amount of space and with sufficiently low latency, approximate methods are used. The most common ones are variations of the Count-Min sketch. By using multiple hash functions, they summarize massive streams in sub-linear space. In reality, data item ids or keys can be modular, e.g., a graph edge is represented by source and target node ids, a 32-bit IP address is composed of four 8-bit words, a web address consists of domain name, domain extension, path, and filename, among many others. In this paper, we investigate the modularity property of item keys, and systematically develop more accurate, composite hashing strategies, such as employing multiple independent hash functions that hash different modules in a key and their combinations separately, instead of hashing the entire key directly into the sketch. However, our problem of finding the best hashing strategy is non-trivial, since there is an exponential number of ways to combine the modules of a key before they can be hashed into the sketch. Moreover, given a fixed size allocated for the entire sketch, it is hard to find the optimal range of all hash functions that correspond to different modules and their combinations. We solve both these problems with extensive theoretical analysis, and perform thorough experiments with real-world datasets to demonstrate the accuracy and efficiency of our proposed method, MOD-Sketch.
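A minimal sketch of a Count-Min sketch whose cell index is built from separate hashes of each key module may help make the idea concrete. It is one plausible reading of "composite hashing", not MOD-Sketch itself: the class name, the per-module `ranges` parameter, and the use of BLAKE2b as the hash family are all assumptions for illustration, and the ranges are fixed by hand rather than optimized.

```python
import hashlib


def _hash(seed, value, modulus):
    """Deterministic hash of (seed, value) into range(modulus)."""
    digest = hashlib.blake2b(repr((seed, value)).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus


class CompositeCountMin:
    """Count-Min sketch that hashes each module of a key separately.

    Assumption: a key is a tuple of modules (e.g. (source, target) for an
    edge) and ranges[i] is the hash range given to module i, so each row
    has prod(ranges) cells.
    """

    def __init__(self, depth, ranges):
        self.depth = depth
        self.ranges = tuple(ranges)
        self.width = 1
        for r in self.ranges:
            self.width *= r
        self.rows = [[0] * self.width for _ in range(depth)]

    def _index(self, row, key):
        # Combine the per-module hashes into a single cell index (mixed radix).
        index = 0
        for position, (module, r) in enumerate(zip(key, self.ranges)):
            index = index * r + _hash((row, position), module, r)
        return index

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # As in a standard Count-Min sketch, the minimum never underestimates.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


sketch = CompositeCountMin(depth=4, ranges=(16, 16))
sketch.add(("node_a", "node_b"), 3)
sketch.add(("node_a", "node_c"), 2)
```

How to split the total width across modules is exactly the range-allocation question the abstract calls hard; the fixed `(16, 16)` split above sidesteps it.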

Text-to-image Synthesis via Symmetrical Distillation Networks

Text-to-image synthesis aims to automatically generate images according to text descriptions given by users, which is a highly challenging task. The main issues of text-to-image synthesis lie in two gaps: the heterogeneous and homogeneous gaps. The heterogeneous gap is between the high-level concepts of text descriptions and the pixel-level contents of images, while the homogeneous gap exists between synthetic image distributions and real image distributions. To address these problems, we exploit the excellent capability of generic discriminative models (e.g. VGG19), which can guide the training process of a new generative model on multiple levels to bridge the two gaps. The high-level representations can teach the generative model to extract necessary visual information from text descriptions, which can bridge the heterogeneous gap. The mid-level and low-level representations can lead it to learn structures and details of images respectively, which relieves the homogeneous gap. Therefore, we propose Symmetrical Distillation Networks (SDN) composed of a source discriminative model as ‘teacher’ and a target generative model as ‘student’. The target generative model has a structure symmetrical to that of the source discriminative model, so that hierarchical knowledge can be transferred easily. Moreover, we decompose the training process into two stages with different distillation paradigms to improve the performance of the target generative model. Experiments on two widely-used datasets are conducted to verify the effectiveness of our proposed SDN.

Are You Tampering With My Data?

We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering with the data used for training instead of generating attacks on trained models. Our network-agnostic method creates a backdoor during training which can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate on two widely used datasets (CIFAR-10 and SVHN) that a universal modification of just one pixel per image for all the images of a class in the training set is enough to corrupt the training procedure of several state-of-the-art deep neural networks, causing the networks to misclassify any images to which the modification is applied. Our aim is to bring to the attention of the machine learning community the possibility that even learning-based methods that are personally trained on public datasets can be subject to attacks by a skillful adversary.
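The data-tampering step described above, a single fixed pixel changed in every training image of one class, can be sketched as below. The abstract does not say which pixel or value the authors use, so `pixel` and `value` are placeholders, and images are represented as nested lists of grayscale intensities purely to keep the example self-contained.

```python
def poison_class(images, labels, target_class, pixel=(0, 0), value=255):
    """Return copies of the images with one pixel overwritten for one class.

    Assumptions: each image is a list of rows of grayscale values, and the
    pixel location and value are illustrative choices, not the paper's.
    """
    row, col = pixel
    poisoned = []
    for image, label in zip(images, labels):
        copy = [list(r) for r in image]  # leave the original data untouched
        if label == target_class:
            copy[row][col] = value
        poisoned.append(copy)
    return poisoned


# Two tiny 2x2 "images": only the one labelled 1 receives the trigger pixel.
images = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]
labels = [0, 1]
poisoned = poison_class(images, labels, target_class=1, pixel=(0, 1))
```

A model trained on `poisoned` could learn to associate the trigger pixel with the target class; applying the same modification at test time is what would then activate the backdoor.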

Machine Learning for Spatiotemporal Sequence Forecasting: A Survey

Spatiotemporal systems are common in the real world. Forecasting the multi-step future of these spatiotemporal systems based on past observations, or Spatiotemporal Sequence Forecasting (STSF), is a significant and challenging problem. Although many real-world problems can be viewed as STSF and many research works have proposed machine learning based methods for them, no existing work has summarized and compared these methods from a unified perspective. This survey aims to provide a systematic review of machine learning for STSF. In this survey, we define the STSF problem and classify it into three subcategories: Trajectory Forecasting of Moving Point Cloud (TF-MPC), STSF on Regular Grid (STSF-RG) and STSF on Irregular Grid (STSF-IG). We then introduce the two major challenges of STSF: 1) how to learn a model for multi-step forecasting and 2) how to adequately model the spatial and temporal structures. After that, we review the existing works for solving these challenges, including the general learning strategies for multi-step forecasting, the classical machine learning based methods for STSF, and the deep learning based methods for STSF. We also compare these methods and point out some potential research directions.

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks

This paper proposes a Soft Filter Pruning (SFP) method to accelerate the inference procedure of deep Convolutional Neural Networks (CNNs). Specifically, the proposed SFP enables the pruned filters to be updated when training the model after pruning. SFP has two advantages over previous works: (1) Larger model capacity. Updating previously pruned filters provides our approach with a larger optimization space than fixing the filters to zero. Therefore, the network trained by our method has a larger model capacity to learn from the training data. (2) Less dependence on the pre-trained model. Large capacity enables SFP to train from scratch and prune the model simultaneously. In contrast, previous filter pruning methods must be conducted on the basis of a pre-trained model to guarantee their performance. Empirically, SFP from scratch outperforms the previous filter pruning methods. Moreover, our approach has been demonstrated to be effective for many advanced CNN architectures. Notably, on ILSVRC-2012, SFP reduces FLOPs by more than 42% on ResNet-101 with even a 0.2% top-5 accuracy improvement, which advances the state of the art. Code is publicly available on GitHub: https://…/soft-filter-pruning
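The difference between soft and hard pruning can be shown with a small sketch. It assumes filters are ranked by their L2 norm and that the lowest-ranked fraction is zeroed after each epoch; the abstract does not state the selection criterion, so that choice, the function names, and the plain gradient step are illustrative only.

```python
import math


def soft_prune(filters, prune_ratio):
    """Zero the filters with the smallest L2 norm, in place.

    Assumption: each filter is a flat list of weights and the L2 norm is
    the ranking criterion. Returns the indices that were zeroed.
    """
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    n_prune = int(len(filters) * prune_ratio)
    lowest = sorted(range(len(filters)), key=lambda i: norms[i])[:n_prune]
    for i in lowest:
        filters[i] = [0.0] * len(filters[i])
    return sorted(lowest)


def sgd_step(filters, grads, lr=0.1):
    """Plain gradient step applied to every filter, including zeroed ones."""
    return [[w - lr * g for w, g in zip(f, gf)] for f, gf in zip(filters, grads)]


filters = [[3.0, 4.0], [0.1, 0.1], [1.0, 1.0], [0.2, 0.0]]
zeroed = soft_prune(filters, prune_ratio=0.5)

# The next training step may move a zeroed filter away from zero again,
# which is what distinguishes soft pruning from fixing the filter to zero.
grads = [[0.0, 0.0], [-1.0, -1.0], [0.0, 0.0], [0.0, 0.0]]
filters = sgd_step(filters, grads)
```

In a training loop, `soft_prune` would be called after every epoch, so the set of zeroed filters can change from one epoch to the next.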

Multi-Source Pointer Network for Product Title Summarization

In this paper, we study the product title summarization problem in E-commerce applications for display on mobile devices. Compared with conventional sentence summarization, product title summarization has some extra and essential constraints. For example, factual errors or loss of the key information are intolerable for E-commerce applications. Therefore, we abstract two more constraints for product title summarization: (i) do not introduce irrelevant information; (ii) retain the key information (e.g., brand name and commodity name). To address these issues, we propose a novel multi-source pointer network by adding a new knowledge encoder to the pointer network. The first constraint is handled by the pointer mechanism. For the second constraint, we restore the key information by copying words from the knowledge encoder with the help of the soft gating mechanism. For evaluation, we build a large collection of real-world product titles along with human-written short titles. Experimental results demonstrate that our model significantly outperforms the other baselines. Finally, online deployment of our proposed model has yielded a significant business impact, as measured by the click-through rate.

Backpropagation and Biological Plausibility

By and large, Backpropagation (BP) is regarded as one of the most important neural computation algorithms at the basis of the progress in machine learning, including the recent advances in deep learning. However, its computational structure has been the source of many debates on its arguable biological plausibility. In this paper, it is shown that when framing supervised learning in the Lagrangian framework, while one can see a natural emergence of Backpropagation, biologically plausible local algorithms can also be devised that are based on the search for saddle points in the learning adjoint space composed of weights, neural outputs, and Lagrangian multipliers. This might open the doors to a truly novel class of learning algorithms where, because of the introduction of the notion of support neurons, the optimization scheme also plays a fundamental role in the construction of the architecture.

Interval-valued Data Prediction via Regularized Artificial Neural Network

A regularized artificial neural network (RANN) is proposed for interval-valued data prediction. The ANN model is selected due to its powerful capability in fitting linear and nonlinear functions. To meet the mathematical coherence requirement for an interval (i.e., the predicted lower bounds should not cross over their upper bounds), a soft non-crossing regularizer is introduced to the interval-valued ANN model. We conduct extensive experiments on both simulated and real-life datasets, and compare the proposed RANN method with multiple traditional models, including the linear constrained center and range method (CCRM), the least absolute shrinkage and selection operator-based interval-valued regression method (Lasso-IR), the nonlinear interval kernel regression (IKR), the interval multi-layer perceptron (iMLP) and the multi-output support vector regression (MSVR). Experimental results show that the proposed RANN model is an effective tool for interval-valued prediction tasks with high prediction accuracy.
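A soft non-crossing regularizer can be read as a penalty that is zero when the predicted lower bound stays below the upper bound and grows when they cross. The sketch below uses a squared hinge for that penalty and a mean squared error for the fit term; both choices, along with the weight `lam`, are assumptions, since the abstract does not give the exact form.

```python
def interval_loss(pred_lower, pred_upper, true_lower, true_upper, lam=1.0):
    """Fit term plus a soft penalty for crossed interval predictions.

    Assumptions: squared error on both bounds, and a squared-hinge penalty
    max(0, lower - upper)^2 weighted by lam.
    """
    n = len(pred_lower)
    fit = sum(
        (pl - tl) ** 2 + (pu - tu) ** 2
        for pl, pu, tl, tu in zip(pred_lower, pred_upper, true_lower, true_upper)
    ) / n
    crossing = sum(
        max(0.0, pl - pu) ** 2 for pl, pu in zip(pred_lower, pred_upper)
    ) / n
    return fit + lam * crossing


# A well-ordered prediction pays no penalty; a crossed one does.
ordered = interval_loss([1.0], [2.0], [1.0], [2.0])
crossed = interval_loss([3.0], [2.0], [3.0], [2.0], lam=2.0)
```

Because the penalty is soft, crossings are discouraged during training rather than ruled out, which keeps the loss differentiable almost everywhere and easy to minimize with standard optimizers.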

An ensemble learning method for variable selection: application to high dimensional data and missing values

Standard approaches for variable selection in linear models are not tailored to deal properly with high dimensional and incomplete data. Currently, methods dedicated to high dimensional data handle missing values by ad-hoc strategies, like complete case analysis or single imputation, while methods dedicated to missing values, mainly based on multiple imputation, do not discuss the imputation method to use with high dimensional data. Consequently, both approaches appear to be limited for many modern applications. With inspiration from ensemble methods, a new variable selection method is proposed. It extends classical variable selection methods such as stepwise, lasso or knockoff in the case of high dimensional data with or without missing data. Theoretical properties are studied and the practical interest is demonstrated through a simulation study. In the low dimensional case without missing values, the performances of the method can be better than those obtained by standard techniques. Moreover, the procedure improves the control of the error risks. With missing values, the method performs better than reference selection methods based on multiple imputation. Similar performances are obtained in the high-dimensional case with or without missing values.

Hypernetwork Knowledge Graph Embeddings

Knowledge graphs are large graph-structured databases of facts, which typically suffer from incompleteness. Link prediction is the task of inferring missing relations (links) between entities (nodes) in a knowledge graph. We propose to solve this task by using a hypernetwork architecture to generate convolutional layer filters specific to each relation and apply those filters to the subject entity embeddings. This architecture enables a trade-off between non-linear expressiveness and the number of parameters to learn. Our model simplifies the entity and relation embedding interactions introduced by the predecessor convolutional model, while outperforming all previous approaches to link prediction across all standard link prediction datasets.

Who is Really Affected by Fraudulent Reviews? An analysis of shilling attacks on recommender systems in real-world scenarios

We present the results of an initial analysis conducted in a real-life setting to quantify the effect of shilling attacks on recommender systems. We focus on both algorithm performance and the types of users who are most affected by these attacks.

Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation

Recent research suggests that neural machine translation achieves parity with professional human translation on the WMT Chinese–English news translation task. We empirically test this claim with alternative evaluation protocols, contrasting the evaluation of single sentences and entire documents. In a pairwise ranking experiment, human raters assessing adequacy and fluency show a stronger preference for human over machine translation when evaluating documents as compared to isolated sentences. Our findings emphasise the need to shift towards document-level evaluation as machine translation improves to the degree that errors which are hard or impossible to spot at the sentence-level become decisive in discriminating quality of different translation outputs.

A Heterogeneity Based Case-Control Analysis of Motorcyclist Injury Crashes: Evidence from Motorcycle Crash Causation Study
The Variable Quality of Metadata About Biological Samples Used in Biomedical Experiments
Optimized Hierarchical Power Oscillations Control for Distributed Generation Under Unbalanced Conditions
Compiler Enhanced Scheduling for OpenMP for Heterogeneous Multiprocessors
CrowdTruth 2.0: Quality Metrics for Crowdsourcing with Disagreement
The Distribution of Reversible Functions is Normal
Network-based Biased Tree Ensembles (NetBiTE) for Drug Sensitivity Prediction and Drug Sensitivity Biomarker Identification in Cancer
Localization parameters for two interacting particles in disordered two-dimensional lattices
On the mathematics of the free-choice paradigm
Segmentation of Microscopy Data for finding Nuclei in Divergent Images
Non-monotone Submodular Maximization with Nearly Optimal Adaptivity Complexity
Artificial Neural Networks in Fluid Dynamics: A Novel Approach to the Navier-Stokes Equations
Compiling Adiabatic Quantum Programs
End to End Vehicle Lateral Control Using a Single Fisheye Camera
Predicting Stochastic Travel Times based on High-Volume Floating Car Data
PACO: Signal Restoration via PAtch COnsensus
Input-to-State Stability of Nonlinear Parabolic PDEs with Dirichlet Boundary Disturbances
Safe Intersection Management for Mixed Transportation Systems with Human-Driven and Autonomous Vehicles
Supervised Kernel PCA For Longitudinal Data
Adversarial Removal of Demographic Attributes from Text Data
On the Expected Value of the Maximal Bet in the Labouchere System
Percolation on Isotropically Directed Lattice
Stochastic Combinatorial Ensembles for Defending Against Adversarial Examples
Inverse Problems in Asteroseismology
Privacy Amplification by Iteration
On the Optimality of Ergodic Trajectories for Information Gathering Tasks
Deterministic Factorization of Sparse Polynomials with Bounded Individual Degree
Designs over finite fields by difference methods
Dynamically evolved community size and stability of random Lotka-Volterra ecosystems
A Hybrid DE Approach to Designing CNN for Image Classification
Out-of-Distribution Detection using Multiple Semantic Label Representations
Cayley Digraphs Associated to Arithmetic Groups
Stability for maximal independent sets
Noncommutative polynomials describing convex sets
Learning deep representations by mutual information estimation and maximization
Adversarial Sampling for Active Learning
Class2Str: End to End Latent Hierarchy Learning
Deep Multimodal Image-Repurposing Detection
Bayesian Function-on-Scalars Regression for High Dimensional Data
Local-Global Graph Clustering with Applications in Sense and Frame Induction
VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification
An Isolated Power Factor Corrected Power Supply Utilizing the Transformer Leakage Inductance
Graph connectivity in log-diameter steps using label propagation
A non-iterative algorithm for generalized Pig games
Near log-convexity of measured heat in (discrete) time and consequences
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks
State Polytopes Related to Two Classes of Combinatorial Neural Codes
Modes of Information Flow
You Shall Know the Most Frequent Sense by the Company it Keeps
D.H. Lehmer’s Tridiagonal determinant: An Etude in (Andrews-Inspired) Experimental Mathematics
Wrapped Loss Function for Regularizing Nonconforming Residual Distributions
Fully Active Cops and Robbers
A parallel non-uniform fast Fourier transform library based on an ‘exponential of semicircle’ kernel
Constrained-size Tensorflow Models for YouTube-8M Video Understanding Challenge
Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning
Abnormal Event Detection and Location for Dense Crowds using Repulsive Forces and Sparse Reconstruction
The story of conflict and cooperation
Lessons from Natural Language Inference in the Clinical Domain
Estimating Metric Poses of Dynamic Objects Using Monocular Visual-Inertial Fusion
Dominant Channel Estimation via MIPS for Large-Scale Antenna Systems with One-Bit ADCs
Channel Estimation for One-Bit Massive MIMO Systems Exploiting Spatio-Temporal Correlations
Automatic skin lesion segmentation on dermoscopic images by the means of superpixel merging
Designing Near-Optimal Policies for Energy Management in a Stochastic Environment
Stochastic Modeling and Analysis of User-Centric Network MIMO Systems
Energy Efficient Event Localization and Classification for Nano IoT
Direction of Arrival and Center Frequency Estimation for Impulse Radio Millimeter Wave Communications
Making a Dynamic Interaction Between Two Power System Analysis Software
Position Sensor-less and Adaptive Speed Design for Controlling Brush-less DC Motor Drives
Multi-task multiple kernel machines for personalized pain recognition from functional near-infrared spectroscopy brain signals
Stabilization for Networked Control Systems with Simultaneous Input Delay and Markovian Packet Losses
Central limit theorem for statistics of subcritical configuration models
Parametrix method for one-dimensional locally $α$-stable Lévy-type processes
Real-time Analog Pixel-to-pixel Dynamic Frame Differencing with Memristive Sensing Circuits
Polynomial Chaos reformulation in Nonlinear Stochastic Optimal Control with application on a drivetrain subject to bifurcation phenomena
Critical two-point function for long-range models with power-law couplings: The marginal case for $d\ge d_c$
A note on the approximate symmetry of Bregman distances
Parameter Synthesis Problems for Parametric Timed Automata
A computationally efficient correlated mixed Probit for credit risk modelling
Downsampling Strategies are Crucial for Word Embedding Reliability
Wavelet imaging of transient energy localization in nonlinear systems at thermal equilibrium: the case study of NaI crystals at high temperature
The Role of the Task Topic in Web Search of Different Task Types
A Usefulness-based Approach for Measuring the Local and Global Effect of IIR Services
Translational Grounding: Using Paraphrase Recognition and Generation to Demonstrate Semantic Abstraction Abilities of MultiLingual NMT
Existence of a Unique Quasi-stationary Distribution for Stochastic Reaction Networks
Analysis of Speeches in Indian Parliamentary Debates
Decompositions of log-correlated fields with applications
Optimum Transmission Rate in Fading Channels with Markovian Sources and QoS Constraints
Fully-Convolutional Point Networks for Large-Scale Point Clouds
Deep Learned Full-3D Object Completion from Single View
Search for Common Minima in Joint Optimization of Multiple Cost Functions
Deep Video-Based Performance Cloning
Demonstrating PAR4SEM – A Semantic Writing Aid with Adaptive Paraphrasing
Spanning surfaces in 3-graphs
Adversarial training for multi-context joint entity and relation extraction
Dissipation in parabolic SPDEs
ADMM for Exploiting Structure in MPC Problems
Automatic Generation of Text Descriptive Comments for Code Blocks
Self-supervised learning of a facial attribute embedding from video
The Turtleback Diagram for Conditional Probability
Multimodal Interaction-aware Motion Prediction for Autonomous Street Crossing
Discrete-attractor-like Tracking in Continuous Attractor Neural Networks
On Stronger Types of Locating-dominating Codes
Defending against Intrusion of Malicious UAVs with Networked UAV Defense Swarms
Optimal designs for two-level main effects models on a restricted design region
Scalable Population Synthesis with Deep Generative Modeling
On a New Improvement-Based Acquisition Function for Bayesian Optimization
Stable divisorial gonality is in NP
General hypergeometric distribution: A basic statistical distribution for the number of overlapped elements in multiple subsets drawn from a finite population
Smart energy models for atomistic simulations using a DFT-driven multifidelity approach
A Game-Theoretic Approach to Multi-Objective Resource Sharing and Allocation in Mobile Edge Clouds
A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation
Group Activity Selection with Few Agent Types
Minimalist designs
Greedy Harmony Search Algorithm for the Hop Constrained Connected Facility Location
Normal matrix ensembles at the hard edge, orthogonal polynomials, and universality
A novel approach to assess the impact of the Fano factor on the sensitivity of low-mass dark matter experiments
Robust Chemical Circuits
Microwave Hilbert Transformer and its Applications in Real-time Analog Processing (RAP)
Thresholding the virtual value: a simple method to increase welfare and lower reserve prices in online auction systems
Entropy of a quantum channel
Iterated Greedy Algorithms for the Hop-Constrained Steiner Tree Problem
Regularity and h-polynomials of binomial edge ideals
Optimizing the Union of Intersections LASSO ($UoI_{LASSO}$) and Vector Autoregressive ($UoI_{VAR}$) Algorithms for Improved Statistical Estimation at Scale
Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval
Resource Allocation for Cooperative D2D-Enabled Wireless Caching Networks
Gaussian Word Embedding with a Wasserstein Distance Loss
Real Time Elbow Angle Estimation Using Single RGB Camera
Functional convergence for moving averages with heavy tails and random coefficients
Student Cluster Competition 2017, Team University of Texas at Austin/Texas State University: Reproducing Vectorization of the Tersoff Multi-Body Potential on the Intel Skylake and NVIDIA V100 Architectures
Quantitative contraction rates for Markov chains on general state spaces
URLLC Services in 5G – Low Latency Enhancements for LTE
QuAC: Question Answering in Context
CoQA: A Conversational Question Answering Challenge
A Hybridized Discontinuous Galerkin Method for A Linear Degenerate Elliptic Equation Arising from Two-Phase Mixtures
ISNA-Set: A novel English Corpus of Iran NEWS