R Packages worth a look

Density Estimation and Random Number Generation with Distribution Element Trees (detpack)
Density estimation for possibly large data sets and conditional/unconditional random number generation with distribution element trees. For more details on distribution element trees see: Meyer, D.W. (2016) <arXiv:1610.00345> or Meyer, D.W., Statistics and Computing (2017) <doi:10.1007/s11222-017-9751-9> and Meyer, D.W. (2017) <arXiv:1711.04632>.

Tidy Temporal Data Frames and Tools (tsibble)
Provides a ‘tbl_ts’ class (the ‘tsibble’) to store and manage temporal-context data in a data-centric format, which is built on top of the ‘tibble’. The ‘tsibble’ aims at manipulating and analysing temporal data in a tidy and modern manner, including easily interpolating missing values, aggregating over calendar periods, and performing rolling window calculations.

An Easy SVG Basic Elements Generator (easySVG)
This SVG elements generator can easily generate SVG elements such as rect, line, circle, ellipse, polygon, polyline, text and group. It can also combine SVG elements and output them to an SVG file.
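To make the idea concrete, here is a minimal sketch of how such an element generator can work. It is written in Python and is not the easySVG API: every function name below (`element`, `rect`, `circle`, `text`, `group`, `svg_document`) is hypothetical and chosen only for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def element(tag, content=None, **attrs):
    """Build one SVG element as a string (hypothetical helper, not easySVG)."""
    # Underscores in keyword names become hyphens, e.g. stroke_width -> stroke-width.
    attr_text = "".join(
        f" {name.replace('_', '-')}={quoteattr(str(value))}"
        for name, value in attrs.items()
    )
    if content is None:
        return f"<{tag}{attr_text}/>"
    return f"<{tag}{attr_text}>{content}</{tag}>"


def rect(x, y, width, height, **style):
    return element("rect", x=x, y=y, width=width, height=height, **style)


def circle(cx, cy, r, **style):
    return element("circle", cx=cx, cy=cy, r=r, **style)


def text(x, y, label, **style):
    # Escape the label so characters such as '<' cannot break the markup.
    return element("text", escape(label), x=x, y=y, **style)


def group(children, **attrs):
    return element("g", "".join(children), **attrs)


def svg_document(children, width, height):
    """Combine elements into one complete SVG document string."""
    return element(
        "svg",
        "".join(children),
        xmlns="http://www.w3.org/2000/svg",
        width=width,
        height=height,
    )


if __name__ == "__main__":
    shapes = group([rect(0, 0, 10, 5, fill="red"), circle(5, 5, 2)], id="shapes")
    print(svg_document([shapes, text(1, 9, "a < b")], width=20, height=10))
```

Writing the returned string to a file with an `.svg` extension would give a file that browsers can render.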

Automatic Generation of Interactive Visualizations for Popular Statistical Results (autoplotly)
Functionalities to automatically generate interactive visualizations for popular statistical results supported by ‘ggfortify’, such as time series, PCA, clustering and survival analysis, with ‘plotly.js’ and ‘ggplot2’ style. The generated visualizations can also be easily extended using ‘ggplot2’ and ‘plotly’ syntax while staying interactive.

Post Processing of (Half-)Hourly Eddy-Covariance Measurements (REddyProc)
Standard and extensible Eddy-Covariance data post-processing includes uStar-filtering, gap-filling, and flux-partitioning. The Eddy-Covariance (EC) micrometeorological technique quantifies continuous exchange fluxes of gases, energy, and momentum between an ecosystem and the atmosphere. It is important for understanding ecosystem dynamics and upscaling exchange fluxes. (Aubinet et al. (2012) <doi:10.1007/978-94-007-2351-1>). This package inputs pre-processed (half-)hourly data and supports further processing. First, a quality-check and filtering is performed based on the relationship between measured flux and friction velocity (uStar) to discard biased data (Papale et al. (2006) <doi:10.5194/bg-3-571-2006>). Second, gaps in the data are filled based on information from environmental conditions (Reichstein et al. (2005) <doi:10.1111/j.1365-2486.2005.001002.x>). Third, the net flux of carbon dioxide is partitioned into its gross fluxes in and out of the ecosystem by night-time based and day-time based approaches (Lasslop et al. (2010) <doi:10.1111/j.1365-2486.2009.02041.x>).


What’s new on arXiv

Functional ANOVA with Multiple Distributions: Implications for the Sensitivity Analysis of Computer Experiments

The functional ANOVA expansion of a multivariate mapping plays a fundamental role in statistics. The expansion is unique once a unique distribution is assigned to the covariates. Recent investigations in the environmental and climate sciences show that analysts may not be in a position to assign a unique distribution in realistic applications. We offer a systematic investigation of existence, uniqueness, orthogonality, monotonicity and ultramodularity of the functional ANOVA expansion of a multivariate mapping when a multiplicity of distributions is assigned to the covariates. In particular, we show that a multivariate mapping can be associated with a core of probability measures that guarantee uniqueness. We obtain new results for variance decomposition and dimension distribution under mixtures. Implications for the global sensitivity analysis of computer experiments are also discussed.
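For readers unfamiliar with the term, the classical single-distribution expansion can be written out as follows. This is the textbook form, assuming independent covariates $X_1, \dots, X_n$ and a square-integrable $f$; the notation ($u$, $f_u$, $x_u$) is chosen here for illustration and is not taken from the paper.

```latex
% Functional ANOVA expansion over all subsets u of the covariate indices
f(x) = \sum_{u \subseteq \{1,\dots,n\}} f_u(x_u),
\qquad
f_{\emptyset} = \mathbb{E}[f(X)],
\qquad
f_u(x_u) = \mathbb{E}[f(X) \mid X_u = x_u] - \sum_{v \subsetneq u} f_v(x_v).

% Under independence the terms are mutually orthogonal,
% which gives the variance decomposition
\mathbb{E}[f_u(X_u)\, f_v(X_v)] = 0 \quad (u \neq v),
\qquad
\operatorname{Var}[f(X)] = \sum_{\emptyset \neq u \subseteq \{1,\dots,n\}} \operatorname{Var}[f_u(X_u)].
```

Each $f_u$ depends on the distribution assigned to $X$ through the conditional expectations, which is why the expansion is only unique once that distribution is fixed.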

Compositional Correlation for Detecting Real Associations Among Time Series

Correlation remains one of the most widely used statistical tools for assessing the strength of relationships between data series. This paper presents a novel compositional correlation method for detecting linear and nonlinear relationships by considering the averages of all parts of all possible compositions of the data series instead of considering the averages of the whole series. The approach enables cumulative contribution of all local associations to the resulting correlation value. The method is applied to two different datasets: a set of four simple nonlinear polynomial functions and the expression time series data of 4381 budding yeast (Saccharomyces cerevisiae) genes. The obtained results show that the introduced compositional correlation method is capable of determining real direct and inverse linear, nonlinear and monotonic relationships. Comparisons with Pearson’s correlation, Spearman’s correlation, distance correlation and the simulated annealing genetic algorithm maximal information coefficient (SGMIC) have shown that the presented method is capable of detecting important associations which were not detected by the compared methods.

Decision-Feedback Detection Strategy for Nonlinear Frequency-Division Multiplexing

By exploiting a causality property of the nonlinear Fourier transform, a novel decision-feedback detection strategy for nonlinear frequency-division multiplexing (NFDM) systems is introduced. The performance of the proposed strategy is investigated both by simulations and by theoretical bounds and approximations, showing that it achieves a considerable performance improvement compared to previously adopted techniques in terms of Q-factor. The obtained improvement demonstrates that, by tailoring the detection strategy to the peculiar properties of the nonlinear Fourier transform, it is possible to boost the performance of NFDM systems and overcome current limitations imposed by the use of more conventional detection techniques suitable for the linear regime.

Time Series Segmentation through Automatic Feature Learning

Internet of things (IoT) applications have become increasingly popular in recent years, with applications ranging from building energy monitoring to personal health tracking and activity recognition. In order to leverage these data, automatic knowledge extraction – whereby we map from observations to interpretable states and transitions – must be done at scale. As such, we have seen many recent IoT data sets include annotations with a human expert specifying states, recorded as a set of boundaries and associated labels in a data sequence. These data can be used to build automatic labeling algorithms that produce labels as an expert would. Here, we refer to human-specified boundaries as breakpoints. Traditional changepoint detection methods only look for statistically-detectable boundaries that are defined as abrupt variations in the generative parameters of a data sequence. However, we observe that breakpoints occur on more subtle boundaries that are non-trivial to detect with these statistical methods. In this work, we propose a new unsupervised approach, based on deep learning, that outperforms existing techniques and learns the more subtle, breakpoint boundaries with a high accuracy. Through extensive experiments on various real-world data sets – including human-activity sensing data, speech signals, and electroencephalogram (EEG) activity traces – we demonstrate the effectiveness of our algorithm for practical applications. Furthermore, we show that our approach achieves significantly better performance than previous methods.

Deep Canonically Correlated LSTMs

We examine Deep Canonically Correlated LSTMs as a way to learn nonlinear transformations of variable length sequences and embed them into a correlated, fixed dimensional space. We use LSTMs to transform multi-view time-series data non-linearly while learning temporal relationships within the data. We then perform correlation analysis on the outputs of these neural networks to find a correlated subspace through which we get our final representation via projection. This work follows from previous work done on Deep Canonical Correlation (DCCA), in which deep feed-forward neural networks were used to learn nonlinear transformations of data while maximizing correlation.

An Integration-Oriented Ontology to Govern Evolution in Big Data Ecosystems

Big Data architectures allow heterogeneous data from multiple sources to be stored and processed flexibly in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving. Thus data analysts need to adapt their analytical processes after each API release. This gets more challenging when performing an integrated or historical analysis. To cope with such complexity, in this paper, we present the Big Data Integration ontology, the core construct to govern the data integration process under schema evolution by systematically annotating it with information regarding the schema of the sources. We present a query rewriting algorithm that, using the annotated ontology, converts queries posed over the ontology to queries over the sources. To cope with syntactic evolution in the sources, we present an algorithm that semi-automatically adapts the ontology upon new releases. This guarantees ontology-mediated queries to correctly retrieve data from the most recent schema version as well as correctness in historical queries. A functional and performance evaluation on real-world APIs is performed to validate our approach.

MORF: A Framework for MOOC Predictive Modeling and Replication At Scale

The MOOC Replication Framework (MORF) is a novel software system for feature extraction, model training/testing, and evaluation of predictive dropout models in Massive Open Online Courses (MOOCs). MORF makes large-scale replication of complex machine-learned models tractable and accessible for researchers, and enables public research on privacy-protected data. It does so by focusing on the high-level operations of an extract-train-test-evaluate workflow, and enables researchers to encapsulate their implementations in portable, fully reproducible software containers which are executed on data with a known schema. MORF’s workflow allows researchers to use data in analysis without providing them access to the underlying data directly, preserving privacy and data security. During execution, containers are sandboxed for security and to prevent data leakage, and parallelized for efficiency, allowing researchers to create and test new models rapidly, on large-scale multi-institutional datasets that were previously inaccessible to most researchers. MORF is provided both as a Python API (the MORF Software), for institutions to use on their own MOOC data, and in a platform-as-a-service (PaaS) model with a web API and a high-performance computing environment (the MORF Platform).

Learning Features For Relational Data

Feature engineering is one of the most important but tedious tasks in data science projects. This work studies automation of feature learning for relational data. We first theoretically proved that learning relevant features from relational data for a given predictive analytics problem is NP-hard. However, it is possible to empirically show that an efficient rule based approach predefining transformations as a priori based on heuristics can extract very useful features from relational data. Indeed, the proposed approach outperformed the state of the art solutions with a significant margin. We further introduce a deep neural network which automatically learns appropriate transformations of relational data into a representation that predicts the target variable well instead of being predefined as a priori by users. In an extensive experiment with Kaggle competitions, the proposed methods could win late medals. To the best of our knowledge, this is the first time an automation system could win medals in Kaggle competitions with complex relational data.

Topic Modeling on Health Journals with Regularized Variational Inference

Topic modeling enables exploration and compact representation of a corpus. The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis. Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys. To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors. The novelty of the DAP model lies in its representation of authors by a persona — where personas capture the propensity to write about certain topics over time. Further, we present a regularized variational inference algorithm, which we use to encourage the DAP model’s personas to be distinct. Our results show significant improvements over competing topic models — particularly after regularization, and highlight the DAP model’s unique ability to capture common journeys shared by different authors.

Panel Data Quantile Regression with Grouped Fixed Effects

This paper introduces grouped latent heterogeneity in panel data quantile regression. More precisely, we assume that the observed individuals come from a heterogeneous population with an unknown, finite number of types. The number of types and group membership is not assumed to be known in advance and is estimated by means of a convex optimization problem. We provide conditions under which group membership is estimated consistently and establish asymptotic normality of the resulting estimators.
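As background, quantile regression replaces the squared-error objective with the asymmetric "check" (pinball) loss. The sketch below shows that loss and how minimising it over a constant recovers an empirical quantile. It illustrates only this standard building block, not the paper's grouped fixed-effects estimator; the function names are chosen here for illustration.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: weight tau on positive residuals, 1 - tau on negative ones."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def total_loss(values, candidate, tau):
    """Sum of check losses when every observation is predicted by `candidate`."""
    return sum(check_loss(v - candidate, tau) for v in values)


def empirical_quantile(values, tau):
    """Return the observed value that minimises the total check loss.

    The objective is piecewise linear and convex, so a minimiser can always
    be found among the observations themselves.
    """
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]
    # The median (tau = 0.5) is not pulled towards the outlier 100.
    print(empirical_quantile(data, 0.5))
    print(empirical_quantile(data, 0.9))
```

In a regression setting the constant `candidate` is replaced by a linear predictor, and the same loss is minimised over its coefficients.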

DKVF: A Framework for Rapid Prototyping and Evaluating Distributed Key-value Stores

We present our framework DKVF that enables one to quickly prototype and evaluate new protocols for key-value stores and compare them with existing protocols based on selected benchmarks. Due to the limitations of the CAP theorem, new protocols must be developed that achieve the desired trade-off between consistency and availability for the given application at hand. Hence, both academic and industrial communities focus on developing new protocols that identify a different (and hopefully better in one or more aspects) point on this trade-off curve. While these protocols are often based on a simple intuition, evaluating them to ensure that they indeed provide increased availability, consistency, or performance is a tedious task. Our framework, DKVF, enables one to quickly prototype a new protocol as well as identify how it performs compared to existing protocols for pre-specified benchmarks. Our framework relies on YCSB (Yahoo! Cloud Serving Benchmark) for benchmarking. We demonstrate DKVF by implementing four existing protocols (eventual consistency, COPS, GentleRain and CausalSpartan) with it. We compare the performance of these protocols against different loading conditions. We find that the performance is similar to our implementation of these protocols from scratch, and the comparison of these protocols is consistent with what has been reported in the literature. Moreover, implementation of these protocols was much more natural as we only needed to translate the pseudocode into Java (and add the necessary error handling). Hence, it was possible to achieve this in just 1-2 days per protocol. Finally, our framework is extensible. It is possible to replace individual components in the framework (e.g., the storage component).

Reblur2Deblur: Deblurring Videos via Self-Supervised Learning

Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference. Traditional deblurring algorithms leverage the physics of the image formation model and use hand-crafted priors: they usually produce results that better reflect the underlying scene, but present artifacts. Recent learning-based methods implicitly extract the distribution of natural images directly from the data and use it to synthesize plausible images. Their results are impressive, but they are not always faithful to the content of the latent image. We present an approach that bridges the two. Our method fine-tunes existing deblurring neural networks in a self-supervised fashion by enforcing that the output, when blurred based on the optical flow between subsequent frames, matches the input blurry image. We show that our method significantly improves the performance of existing methods on several datasets both visually and in terms of image quality metrics. The supplementary material is

Variational Recurrent Neural Machine Translation

Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper. Different from the variational NMT, VRNMT introduces a series of latent random variables to model the translation procedure of a sentence in a generative way, instead of a single latent variable. Specifically, the latent random variables are included into the hidden states of the NMT decoder with elements from the variational autoencoder. In this way, these variables are recurrently generated, which enables them to further capture strong and complex dependencies among the output translations at different timesteps. In order to deal with the challenges in performing efficient posterior inference and large-scale training during the incorporation of latent variables, we build a neural posterior approximator, and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed model achieves significant improvements over both the conventional and variational NMT models.

OneNet: Joint Domain, Intent, Slot Prediction for Spoken Language Understanding

In practice, most spoken language understanding systems process user input in a pipelined manner; first domain is predicted, then intent and semantic slots are inferred according to the semantic frames of the predicted domain. The pipeline approach, however, has some disadvantages: error propagation and lack of information sharing. To address these issues, we present a unified neural network that jointly performs domain, intent, and slot predictions. Our approach adopts a principled architecture for multitask learning to fold in the state-of-the-art models for each task. With a few more ingredients, e.g. orthography-sensitive input encoding and curriculum training, our model delivered significant improvements in all three tasks across all domains over strong baselines, including one using oracle prediction for domain detection, on real user data of a commercial personal assistant.

Sequences, yet Functions: The Dual Nature of Data-Stream Processing

Data-stream processing has continuously risen in importance as the amount of available data has been steadily increasing over the last decade. Besides traditional domains such as data-center monitoring and click analytics, there is an increasing number of network-enabled production machines that generate continuous streams of data. Due to their continuous nature, queries on data-streams can be more complex, and distinctly harder to understand than database queries. As users have to consider operational details, maintenance and debugging become challenging. Current approaches model data-streams as sequences, because this is the way they are physically received. These models result in an implementation-focused perspective. We explore an alternate way of modeling data-streams by focusing on time-slicing semantics. This focus results in a model based on functions, which is better suited for reasoning about query semantics. By adapting the definitions of relevant concepts in stream processing to our model, we illustrate the practical usefulness of our approach. Thereby, we link data-streams and query primitives to concepts in functional programming and mathematics. Most noteworthy, we prove that data-streams are monads, and show how to derive monad definitions for current data-stream models. We provide an abstract, yet practical perspective on data-stream related subjects based on a sound, consistent query model. Our work can serve as solid foundation for future data-stream query-languages.
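One way to picture the function-based view is to treat a stream as a function from a point in time to the values visible at that time, and to define monad-style `unit` and `bind` operations on top of it. The sketch below is an illustrative assumption, not the paper's formal model: the `Stream` alias, `from_events`, `unit` and `bind` are all names and definitions made up for this example.

```python
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Assumed model: a stream maps a point in time to the values visible at that time.
Stream = Callable[[float], List[A]]


def from_events(events: List[Tuple[float, A]]) -> Stream:
    """Turn a received sequence of (timestamp, value) pairs into a time-sliced function."""
    return lambda t: [value for timestamp, value in events if timestamp <= t]


def unit(value: A) -> Stream:
    """A stream that holds exactly one value at every point in time."""
    return lambda t: [value]


def bind(stream: Stream, f: Callable[[A], Stream]) -> Stream:
    """Apply a stream-producing function to each visible value and flatten the result."""
    return lambda t: [y for x in stream(t) for y in f(x)(t)]


if __name__ == "__main__":
    readings = from_events([(1.0, 20), (2.0, 25), (3.0, 30)])
    doubled = bind(readings, lambda v: unit(v * 2))
    print(doubled(2.0))  # values visible at time 2.0, each doubled
```

With these definitions, queries are composed as ordinary functions, and the usual monad identity laws can be checked pointwise for any chosen time `t`.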

A Bayesian Conjugate Gradient Method

A fundamental task in numerical computation is the solution of large linear systems. The conjugate gradient method is an iterative method which offers rapid convergence to the solution, particularly when an effective preconditioner is employed. However, for more challenging systems a substantial error can be present even after many iterations have been performed. The estimates obtained in this case are of little value unless further information can be provided about the numerical error. In this paper we propose a novel statistical model for this numerical error set in a Bayesian framework. Our approach is a strict generalisation of the conjugate gradient method, which is recovered as the posterior mean for a particular choice of prior. The estimates obtained are analysed with Krylov subspace methods and a contraction result for the posterior is presented. The method is then analysed in a simulation study as well as being applied to a challenging problem in medical imaging.
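For reference, the classical (non-Bayesian) conjugate gradient iteration that the paper generalises can be sketched in a few lines. This is the standard textbook algorithm for a symmetric positive-definite matrix, written in plain Python without a preconditioner; it does not implement the paper's Bayesian extension, and the helper names are chosen here for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive-definite A (list of rows)."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)          # residual b - A x for the starting guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        Ap = mat_vec(A, p)
        alpha = rs_old / dot(p, Ap)                    # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]   # next A-conjugate direction
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))  # exact solution is (1/11, 7/11)
```

The iteration returns only a point estimate; the abstract's motivation is that nothing here quantifies how large the remaining error is when the loop stops early.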

StressedNets: Efficient Feature Representations via Stress-induced Evolutionary Synthesis of Deep Neural Networks

The computational complexity of leveraging deep neural networks for extracting deep feature representations is a significant barrier to their widespread adoption, particularly for use in embedded devices. One particularly promising strategy for addressing the complexity issue is the notion of evolutionary synthesis of deep neural networks, which was demonstrated to successfully produce highly efficient deep neural networks while retaining modeling performance. Here, we extend the evolutionary synthesis strategy for achieving efficient feature extraction via the introduction of a stress-induced evolutionary synthesis framework, where stress signals are imposed upon the synapses of a deep neural network during training to induce stress and steer the synthesis process towards the production of more efficient deep neural networks over successive generations and improved model fidelity at a greater efficiency. The proposed stress-induced evolutionary synthesis approach is evaluated on a variety of different deep neural network architectures (LeNet5, AlexNet, and YOLOv2) on different tasks (object classification and object detection) to synthesize efficient StressedNets over multiple generations. Experimental results demonstrate the efficacy of the proposed framework to synthesize StressedNets with significant improvement in network architecture efficiency (e.g., 40x for AlexNet and 33x for YOLOv2) and speed improvements (e.g., 5.5x inference speed-up for YOLOv2 on an Nvidia Tegra X1 mobile processor).

Low-Shot Learning from Imaginary Data

Humans can quickly learn new visual concepts, perhaps because they can easily visualize or imagine what novel objects look like from different views. Incorporating this ability to hallucinate novel instances of new concepts might help machine vision systems perform better low-shot learning, i.e., learning concepts from few examples. We present a novel approach to low-shot learning that uses this idea. Our approach builds on recent progress in meta-learning (‘learning to learn’) by combining a meta-learner with a ‘hallucinator’ that produces additional training examples, and optimizing both models jointly. Our hallucinator can be incorporated into a variety of meta-learners and provides significant gains: up to a 6 point boost in classification accuracy when only a single training example is available, yielding state-of-the-art performance on the challenging ImageNet low-shot classification benchmark.

A Matrix Positivstellensatz with lifting polynomials
Fast Uplink Grant for Machine Type Communications: Challenges and Opportunities
Dynamic compensation and homeostasis: a feedback control perspective
Emergent Planarity in two-dimensional Ising Models with finite-range Interactions
What Level of Quality can Neural Machine Translation Attain on Literary Text?
Changing and unchanging of the domination number of a graph: Path addition numbers
Two-Stage LASSO ADMM Signal Detection Algorithm For Large Scale MIMO
Smoothing splines on Riemannian manifolds, with applications to 3D shape space
On the Complexity of the Weighted Fused Lasso
Vehicle Routing with Subtours
Two-stack-sorting with pop stacks
Divide and Recombine for Large and Complex Data: Model Likelihood Functions using MCMC
Robust port-Hamiltonian representations of passive systems
An octree cells occupancy geometric dimensionality descriptor for massive on-server point cloud visualisation and classification
Global Convergence of Policy Gradient Methods for Linearized Control Problems
Student Beats the Teacher: Deep Neural Networks for Lateral Ventricles Segmentation in Brain MR
Centralized ‘big science’ communities more likely generate non-replicable results
Resistance growth of branching random networks
Latent nested nonparametric priors
Conceptualizing and Evaluating Replication Across Domains of Behavioral Research
Multi-Label Learning from Medical Plain Text with Convolutional Residual Models
Circular Antenna Array Design for Breast Cancer Detection
A Finite Block Length Achievability Bound for Low Probability of Detection Communication
A Human-Grounded Evaluation Benchmark for Local Explanations of Machine Learning
One Way Function Candidate based on the Collatz Problem
Real-time Road Traffic Information Detection Through Social Media
Reed-Muller Sequences for 5G Grant-free Massive Access
Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
On the Analysis of Puncturing for Finite-Length Polar Codes: Boolean Function Approach
Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification
Boolean Function Analogs of Covering Systems
On the I/O Costs of Some Repair Schemes for Full-Length Reed-Solomon Codes
Throughput Maximization in Cloud Radio Access Networks using Network Coding
Factor graph fragmentization of expectation propagation
Exact Error and Erasure Exponents for the Asymmetric Broadcast Channel
Generalized Reed-Muller codes over Galois rings
Steady-state analysis of the Join the Shortest Queue model in the Halfin-Whitt regime
Asynchronous Bidirectional Decoding for Neural Machine Translation
Localization-Aware Active Learning for Object Detection
Round- and Message-Optimal Distributed Part-Wise Aggregation
Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift
Total dominator coloring of central graphs
Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention
An Accurate and Real-time Self-blast Glass Insulator Location Method Based On Faster R-CNN and U-net with Aerial Images
Algorithms for Computing Wiener Indices of Acyclic and Unicyclic Graphs
Adversarial Learning for Chinese NER from Crowd Annotations
Constraint-free Natural Image Reconstruction from fMRI Signals Based on Convolutional Neural Network
On derived equivalences for categories of generalized intervals of a finite poset
Empirical Explorations in Training Networks with Discrete Activations
GitGraph – Architecture Search Space Creation through Frequent Computational Subgraph Mining
Embedding a $θ$-invariant code into a complete one
On Hamiltonian and Hamilton-connected digraphs
Universal disorder-induced broadening of phonon bands: from disordered lattices to glasses
Deep Multi-Spectral Registration Using Invariant Descriptor Learning
Fully Convolutional Multi-scale Residual DenseNets for Cardiac Segmentation and Automated Cardiac Diagnosis using Ensemble of Classifiers
A theorem on even pancyclic bipartite digraphs
On the Kernel of $\mathbb{Z}_{2^s}$-Linear Hadamard Codes
Sparsity Preserving Optimal Control of Discretized PDE Systems
Multicolour containers, extremal entropy and counting
The cross-index of a complete graph based on a hamiltonian cycle
Scaling Laws and Warning Signs for Bifurcations of SPDEs
Lower bounds for Combinatorial Algorithms for Boolean Matrix Multiplication
Forward-Invariance and Wong-Zakai Approximation for Stochastic Moving Boundary Problems
A Multi-Agent Neural Network for Dynamic Frequency Reuse in LTE Networks
Enabling Quality-Driven Scalable Video Transmission over Multi-User NOMA System
Dual vibration configuration interaction (DVCI). A novel factorisation of molecular Hamiltonian for high performance infrared spectrum computation
Calculating $p$-values and their significances with the Energy Test for large datasets
Device-to-Device Aided Multicasting
Three-dimensional chimera patterns in networks of spiking neuron oscillators
A Survey of Physical Layer Security Techniques for 5G Wireless Networks and Challenges Ahead
de Finetti reductions for partially exchangeable probability distributions
Rank Selection of CP-decomposed Convolutional Layers with Variational Bayesian Matrix Factorization
Assessing Bayesian Nonparametric Log-Linear Models: an application to Disclosure Risk estimation
Simplified Versions of the Conditional Gradient Method
A probabilistic proof of Perron’s theorem
A new characterization of endogeny
Robust sustainable development assessment with composite indices aggregating interacting dimensions: the hierarchical-SMAA-Choquet integral approach
Long-term Visual Localization using Semantically Segmented Images
Unsupervised Representation Learning with Laplacian Pyramid Auto-encoders
Joint registration and synthesis using a probabilistic model for alignment of MRI and histological sections
Bounds on the Effective-length of Optimal Codes for Interference Channel with Feedback
Social Network based Short-Term Stock Trading System
The Frechet distribution: Estimation and Application an Overview
Critical exponents of infinite balanced words
Re-ID done right: towards good practices for person re-identification
Joint CSI Estimation, Beamforming and Scheduling Design for Wideband Massive MIMO System
Learning Deep Features for One-Class Classification
A note on Harris’ ergodic theorem, controllability and perturbations of harmonic networks
Subword complexity and power avoidance
Rooted tree maps and the Kawashima relations for multiple zeta values
Coexistence of 5G mmWave Users with Incumbent Fixed Stations over 70 and 80 GHz
On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning
Ambulance Emergency Response Optimization in Developing Countries
Interference Mitigation Techniques for Coexistence of 5G mmWave Users with Incumbents at 70 and 80 GHz
Expectation Propagation for Approximate Inference: Free Probability Framework
An Automated System for Epilepsy Detection using EEG Brain Signals based on Deep Learning Approach
Combinatorial Preconditioners for Proximal Algorithms on Graphs

Book Memo: “Randomness and Hyper-randomness”

The monograph compares two approaches that describe the statistical stability phenomenon – one proposed by the probability theory that ignores violations of statistical stability and another proposed by the theory of hyper-random phenomena that takes these violations into account. There are five parts. The first describes the phenomenon of statistical stability. The second outlines the mathematical foundations of probability theory. The third develops methods for detecting violations of statistical stability and presents the results of experimental research on actual processes of different physical nature that demonstrate the violations of statistical stability over broad observation intervals. The fourth part outlines the mathematical foundations of the theory of hyper-random phenomena. The fifth part discusses the problem of how to provide an adequate description of the world. The monograph should be of interest to a wide readership: from university students on a first course majoring in physics, engineering, and mathematics to engineers, post-graduate students, and scientists carrying out research on the statistical laws of natural physical phenomena, developing and using statistical methods for high-precision measurement, prediction, and signal processing over broad observation intervals. To read the book, it is sufficient to be familiar with a standard first university course on mathematics.

Document worth reading: “Reductions for Frequency-Based Data Mining Problems”

Studying the computational complexity of problems is one of the – if not the – fundamental questions in computer science. Yet, surprisingly little is known about the computational complexity of many central problems in data mining. In this paper we study frequency-based problems and propose a new type of reduction that allows us to compare the complexities of the maximal frequent pattern mining problems in different domains (e.g. graphs or sequences). Our results extend those of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems. Our results show that, by allowing constraints in the pattern space, the complexities of many maximal frequent pattern mining problems collapse. These problems include maximal frequent subgraphs in labelled graphs, maximal frequent itemsets, and maximal frequent subsequences with no repetitions. In addition to theoretical interest, our results might yield more efficient algorithms for the studied problems.

Distilled News

Alibaba’s Neural Network Model Beat the Highest Human Score in Stanford’s Reading Test

Machines getting the better of humans is no longer a surprise. It started with IBM’s Deep Blue program beating Garry Kasparov in a game of chess more than 20 years ago and with the increasing breakthroughs in the world of machine and deep learning, machines continue to become powerful tools. Yesterday, Alibaba developed a model that beat out any human competition in Stanford’s reading comprehension competition. The dataset consists of more than 100,000 questions sourced from more than 500 Wikipedia articles. The purpose of the quiz is to see how long it takes the machine learning models to process all the information, train themselves and then provide precise or accurate answers. Alibaba used a deep learning framework to build a neural network model. It’s based on the “Hierarchical Attention Network”, which according to the company, works by identifying first paragraphs, then sentences and finally words. The underlying technology has been used previously by Alibaba, in its AI-powered chatbot – Dian Xiaomi.

Generative Adversarial Networks

In this article, I’ll talk about Generative Adversarial Networks, or GANs for short. GANs are one of the very few machine learning techniques that have given good performance for generative tasks, or more broadly unsupervised learning. In particular, they have given splendid performance for a variety of image generation related tasks. Yann LeCun, one of the forefathers of deep learning, has called them “the best idea in machine learning in the last 10 years”. Most importantly, the core conceptual ideas associated with a GAN are quite simple to understand (and in fact, you should have a good idea about them by the time you finish reading this article).

Base R can be Fast

“Base R” (call it “Pure R”, “Good Old R”, just don’t call it “Old R” or late for dinner) can be fast for in-memory tasks. This is despite the commonly repeated claim that: “packages written in C/C++ are faster than R code.” The benchmark results of “rquery: Fast Data Manipulation in R” really called out for follow-up timing experiments. This note is one such set of experiments, this time concentrating on in-memory (non-database) solutions.

«smooth» package for R. Common ground. Part III. Exogenous variables. Basic stuff

One of the features of the functions in the smooth package is the ability to use exogenous (aka “external”) variables. This can potentially increase forecasting accuracy (given that you have a good estimate of the future exogenous variable). For example, in retail this can be a binary variable for promotions and we may know when the next promotion will happen. Or we may have an idea about the temperature for the next day and include it as an exogenous variable in the model. While the arima() function from the stats package allows inserting exogenous variables, the ets() function from the forecast package does not. That was one of the original motivations for developing an alternative function for ETS. It is worth noting that all the forecasting functions in the smooth package (except for sma()) allow using exogenous variables, so this feature is not restricted to es().

Square off: Machine learning libraries

Choosing a machine learning (ML) library to solve predictive use cases is easier said than done. There are many to choose from, and each has its own niche and benefits that are good for specific use cases. Even for someone with decent experience in ML and data science, it can be an ordeal to vet all the varied solutions. Where do you start? At Salesforce Einstein, we have to constantly research the market to stay on top of it. Here are some observations on the top five characteristics of ML libraries that developers should consider when deciding what library to use:

Field Guide to the R Ecosystem

I started working with R around about 5 years ago. Parts of the R world have changed substantially over that time, while other parts remain largely the same. One thing that hasn’t changed however, is that there has never been a simple, high-level text to introduce newcomers to the ecosystem. I believe this is especially important now that the ecosystem has grown so much. It’s no longer enough to just know about R itself. Those working with, or even around R, must now understand the ecosystem as a whole in order to best manage and support its use. Hopefully the Field Guide to the R Ecosystem goes some way towards filling this gap.

Whats new on arXiv

Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution

Current machine learning systems operate, almost exclusively, in a statistical, or model-free mode, which entails severe theoretical limits on their power and performance. Such systems cannot reason about interventions and retrospection and, therefore, cannot serve as the basis for strong AI. To achieve human level intelligence, learning machines need the guidance of a model of reality, similar to the ones used in causal inference tasks. To demonstrate the essential role of such models, I will present a summary of seven tasks which are beyond reach of current machine learning systems and which have been accomplished using the tools of causal modeling.

Cost-Sensitive Convolution based Neural Networks for Imbalanced Time-Series Classification

Some deep convolutional neural networks were proposed for time-series classification and class imbalanced problems. However, those models performed poorly and even failed to recognize the minority class of an imbalanced temporal sequence dataset. Minority samples cause trouble for temporal deep learning classifiers because the majority and minority classes are treated equally. Until recently, there were few works applying deep learning on imbalanced time-series classification (ITSC) tasks. This paper aims at tackling ITSC problems with deep learning. An adaptive cost-sensitive learning strategy was proposed to modify temporal deep learning models. Through the proposed strategy, classifiers could automatically assign misclassification penalties to each class. In the experimental section, the proposed method was utilized to modify five neural networks. They were evaluated on a large volume, real-life and imbalanced time-series dataset with six metrics. Each single network was also tested alone and combined with several mainstream data samplers. Experimental results illustrated that the proposed cost-sensitive modified networks worked well on ITSC tasks. Compared to other methods, the cost-sensitive convolution neural network and residual network won out in terms of all metrics. Consequently, the proposed cost-sensitive learning strategy can be used to modify deep learning classifiers from cost-insensitive to cost-sensitive. Those cost-sensitive convolutional networks can be effectively applied to address ITSC issues.

Multivariate LSTM-FCNs for Time Series Classification

Over the past decade, multivariate time series classification has been receiving a lot of attention. We propose augmenting the existing univariate time series classification models, LSTM-FCN and ALSTM-FCN with a squeeze and excitation block to further improve performance. Our proposed models outperform most of the state of the art models while requiring minimum preprocessing. The proposed models work efficiently on various complex multivariate time series classification tasks such as activity recognition or action recognition. Furthermore, the proposed models are highly efficient at test time and small enough to deploy on memory constrained systems.

Brain EEG Time Series Selection: A Novel Graph-Based Approach for Classification

Brain Electroencephalography (EEG) classification is widely applied to analyze cerebral diseases in recent years. Unfortunately, invalid/noisy EEGs degrade the diagnosis performance and most previously developed methods ignore the necessity of EEG selection for classification. To this end, this paper proposes a novel maximum weight clique-based EEG selection approach, named mwcEEGs, to map EEG selection to searching maximum similarity-weighted cliques from an improved Fréchet distance-weighted undirected EEG graph simultaneously considering edge weights and vertex weights. Our mwcEEGs improves the classification performance by selecting intra-clique pairwise similar and inter-clique discriminative EEGs with similarity threshold δ. Experimental results demonstrate the algorithm effectiveness compared with the state-of-the-art time series selection algorithms on real-world EEG datasets.

A Semi-Parametric Binning Approach to Quickest Change Detection

The problem of quickest detection of a change in distribution is considered under the assumption that the pre-change distribution is known, and the post-change distribution is only known to belong to a family of distributions distinguishable from a discretized version of the pre-change distribution. A sequential change detection procedure is proposed that partitions the sample space into a finite number of bins, and monitors the number of samples falling into each of these bins to detect the change. A test statistic that approximates the generalized likelihood ratio test is developed. It is shown that the proposed test statistic can be efficiently computed using a recursive update scheme, and a procedure for choosing the number of bins in the scheme is provided. Various asymptotic properties of the test statistic are derived to offer insights into its performance trade-off between average detection delay and average run length to a false alarm. Testing on synthetic and real data demonstrates that our approach is comparable or better in performance to existing non-parametric change detection methods.

Evaluation of Machine Learning Frameworks on Finis Terrae II

Machine Learning (ML) and Deep Learning (DL) are two technologies used to extract representations of the data for a specific purpose. ML algorithms take a set of data as input to generate one or several predictions. To define the final version of one model, usually there is an initial step devoted to training the algorithm (getting the right final values of the parameters of the model). There are several techniques, from supervised learning to reinforcement learning, which have different requirements. On the market, there are some frameworks or APIs that reduce the effort for designing a new ML model. In this report, using the benchmark DLBENCH, we will analyse the performance and the execution modes of some well-known ML frameworks on the Finis Terrae II supercomputer when supervised learning is used. The report will show that placement of data and allocated hardware can have a large influence on the final time-to-solution.

Some techniques in density estimation

Density estimation is an interdisciplinary topic at the intersection of statistics, theoretical computer science and machine learning. We review some old and new techniques for bounding sample complexity of estimating densities of continuous distributions, focusing on the class of mixtures of Gaussians and its subclasses.

Comparative Study on Generative Adversarial Networks

In recent years, there have been tremendous advancements in the field of machine learning. These advancements have been made through both academic as well as industrial research. Lately, a fair amount of research has been dedicated to the usage of generative models in the field of computer vision and image classification. These generative models have been popularized through a new framework called Generative Adversarial Networks. Moreover, many modified versions of this framework have been proposed in the last two years. We study the original model proposed by Goodfellow et al. as well as modifications over the original model and provide a comparative analysis of these models.

Noisy Expectation-Maximization: Applications and Generalizations

We present a noise-injected version of the Expectation-Maximization (EM) algorithm: the Noisy Expectation Maximization (NEM) algorithm. The NEM algorithm uses noise to speed up the convergence of the EM algorithm. The NEM theorem shows that injected noise speeds up the average convergence of the EM algorithm to a local maximum of the likelihood surface if a positivity condition holds. The generalized form of the noisy expectation-maximization (NEM) algorithm allows for arbitrary modes of noise injection including adding and multiplying noise to the data. We demonstrate these noise benefits on EM algorithms for the Gaussian mixture model (GMM) with both additive and multiplicative NEM noise injection. A separate theorem (not presented here) shows that the noise benefit for independent identically distributed additive noise decreases with sample size in mixture models. This theorem implies that the noise benefit is most pronounced if the data is sparse. Injecting blind noise only slowed convergence.

Multiple Imputation: A Review of Practical and Theoretical Findings

Multiple imputation is a straightforward method for handling missing data in a principled fashion. This paper presents an overview of multiple imputation, including important theoretical results and their practical implications for generating and using multiple imputations. A review of strategies for generating imputations follows, including recent developments in flexible joint modeling and sequential regression/chained equations/fully conditional specification approaches. Finally, we compare and contrast different methods for generating imputations on a range of criteria before identifying promising avenues for future research.

MINE: Mutual Information Neural Estimation

We argue that the estimation of the mutual information between high dimensional continuous random variables is achievable by gradient descent over neural networks. This paper presents a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size. MINE is back-propable and we prove that it is strongly consistent. We illustrate a handful of applications in which MINE is successfully applied to enhance the property of generative models in both unsupervised and supervised settings. We apply our framework to estimate the information bottleneck, and apply it in tasks related to supervised classification problems. Our results demonstrate substantial added flexibility and improvement in these settings.

How Many Samples Required in Big Data Collection: A Differential Message Importance Measure

Information collection is a fundamental problem in big data, where the size of sampling sets plays a very important role. This work considers the information collection process by taking message importance into account. Similar to differential entropy, we define the differential message importance measure (DMIM) as a measure of message importance for a continuous random variable. It is proved that the change of DMIM can describe the gap between the distribution of a set of sample values and a theoretical distribution. In fact, the deviation of DMIM is equivalent to the Kolmogorov-Smirnov statistic, but it offers a new way to characterize the distribution goodness-of-fit. Numerical results show some basic properties of DMIM and the accuracy of the proposed approximate values. Furthermore, it is also shown that the empirical distribution approaches the real distribution as the DMIM deviation decreases, which contributes to the selection of suitable sampling points in actual systems.

State Variation Mining: On Information Divergence with Message Importance in Big Data

Information transfer which reveals the state variation of variables can play a vital role in big data analytics and processing. In fact, the measure for information transfer can reflect the system change from the statistics by using the variable distributions, similar to KL divergence and Renyi divergence. Furthermore, in terms of the information transfer in big data, small probability events dominate the importance of the total message to some degree. Therefore, it is significant to design an information transfer measure based on the message importance which emphasizes the small probability events. In this paper, we propose the message importance divergence (MID) and investigate its characteristics and applications on three aspects. First, the message importance transfer capacity based on MID is presented to offer an upper bound for the information transfer with disturbance. Then, we utilize the MID to guide the queue length selection, which is the fundamental problem considered to have higher social or academic value in the caching operation of mobile edge computing. Finally, we extend the MID to the continuous case and discuss the robustness by using it to measure information distance.

MSDNN: Multi-Scale Deep Neural Network for Salient Object Detection

Salient object detection is a fundamental problem and has received a great deal of attention in computer vision. Recently, deep learning models have become a powerful tool for image feature extraction. In this paper, we propose a multi-scale deep neural network (MSDNN) for salient object detection. The proposed model first extracts global high-level features and context information over the whole source image with a recurrent convolutional neural network (RCNN). Then several stacked deconvolutional layers are adopted to get the multi-scale feature representation and obtain a series of saliency maps. Finally, we investigate a fusion convolution module (FCM) to build a final pixel level saliency map. The proposed model is extensively evaluated on four salient object detection benchmark datasets. Results show that our deep model significantly outperforms 12 other state-of-the-art approaches.

Deep Learning for Sampling from Arbitrary Probability Distributions

This paper proposes a fully connected neural network model to map samples from a uniform distribution to samples of any explicitly known probability density function. During the training, the Jensen-Shannon divergence between the distribution of the model’s output and the target distribution is minimized. We experimentally demonstrate that our model converges towards the desired state. It provides an alternative to existing sampling methods such as inversion sampling, rejection sampling, Gaussian mixture models and Markov-Chain-Monte-Carlo. Our model has high sampling efficiency and is easily applied to any probability distribution, without the need of further analytical or numerical calculations. It can produce correlated samples, such that the output distribution converges faster towards the target than for independent samples. But it is also able to produce independent samples, if single values are fed into the network and the input values are independent as well. We focus on one-dimensional sampling, but additionally illustrate a two-dimensional example with a target distribution of dependent variables.
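
For comparison, inversion sampling, one of the existing methods the abstract names, can be sketched in a few lines. This is a minimal illustration and not the paper's neural network model: it assumes an exponential target, whose CDF can be inverted in closed form, and the function name `sample_exponential` is hypothetical.

```python
import math
import random


def sample_exponential(rate, n, rng):
    """Draw n samples from Exp(rate) by inverting its CDF F(x) = 1 - exp(-rate * x)."""
    samples = []
    for _ in range(n):
        u = rng.random()                           # uniform sample on [0, 1)
        samples.append(-math.log(1.0 - u) / rate)  # apply the inverse CDF to u
    return samples


rng = random.Random(0)                             # fixed seed for repeatability
draws = sample_exponential(rate=2.0, n=20000, rng=rng)
sample_mean = sum(draws) / len(draws)              # should be close to 1 / rate
```

The closed-form inverse is exactly the analytical step that is unavailable for most densities, which is the gap a learned mapping from uniform samples is meant to fill.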

Fairness in Supervised Learning: An Information Theoretic Approach

Automated decision making systems are increasingly being used in real-world applications. In these systems for the most part, the decision rules are derived by minimizing the training error on the available historical data. Therefore, if there is a bias related to a sensitive attribute such as gender, race, religion, etc. in the data, say, due to cultural/historical discriminatory practices against a certain demographic, the system could continue discrimination in decisions by including the said bias in its decision rule. We present an information theoretic framework for designing fair predictors from data, which aim to prevent discrimination against a specified sensitive attribute in a supervised learning setting. We use equalized odds as the criterion for discrimination, which demands that the prediction should be independent of the protected attribute conditioned on the actual label. To ensure fairness and generalization simultaneously, we compress the data to an auxiliary variable, which is used for the prediction task. This auxiliary variable is chosen such that it is decontaminated from the discriminatory attribute in the sense of equalized odds. The final predictor is obtained by applying a Bayesian decision rule to the auxiliary variable.
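
The equalized odds criterion can be checked on a labelled sample with a short sketch. It assumes binary labels and predictions and at least one sample for every (label, group) pair; the function names and the toy data are hypothetical, and the paper's compression to an auxiliary variable is not reproduced here.

```python
def positive_rate(y_true, y_pred, group, label, g):
    """Estimate P(y_pred = 1 | y_true = label, group = g) from the sample."""
    hits = [p for t, p, a in zip(y_true, y_pred, group) if t == label and a == g]
    return sum(hits) / len(hits)  # assumes this (label, group) cell is non-empty


def equalized_odds_gap(y_true, y_pred, group):
    """Largest between-group gap in false-positive or true-positive rate.

    A gap of 0 means the prediction is independent of the protected
    attribute given the actual label, on this sample.
    """
    groups = sorted(set(group))
    gaps = []
    for label in (0, 1):  # label 0 compares FPRs, label 1 compares TPRs
        rates = [positive_rate(y_true, y_pred, group, label, g) for g in groups]
        gaps.append(max(rates) - min(rates))
    return max(gaps)


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

gap = equalized_odds_gap(y_true, y_pred, group)       # biased predictor
perfect_gap = equalized_odds_gap(y_true, y_true, group)  # predictor equal to the label
```

In the toy data, group "b" receives positive predictions more often than group "a" for both values of the actual label, so the gap is non-zero.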

SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

Going deeper and wider in neural architectures improves the accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either need to change to less desired network architectures, or nontrivially dissect a network across multiple GPUs. These distract DL practitioners from concentrating on their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime to enable the network training far beyond the GPU DRAM capacity. SuperNeurons features 3 memory optimizations, Liveness Analysis, Unified Tensor Pool, and Cost-Aware Recomputation; together they effectively reduce the network-wide peak memory usage down to the maximal memory usage among layers. We also address the performance issues in those memory saving techniques. Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for the training, but also dynamically allocates the memory for convolution workspaces to achieve the high performance. Evaluations against Caffe, Torch, MXNet and TensorFlow have demonstrated that SuperNeurons trains at least 3.2432 deeper network than current ones with the leading performance. Particularly, SuperNeurons can train ResNet2500 that has 10^4 basic network layers on a 12GB K40c.

Non-Parametric Transformation Networks

ConvNets have been very effective in many applications where it is required to learn invariances to within-class nuisance transformations. However, through their architecture, ConvNets only enforce invariance to translation. In this paper, we introduce a new class of convolutional architectures called Non-Parametric Transformation Networks (NPTNs) which can learn general invariances and symmetries directly from data. NPTNs are a direct and natural generalization of ConvNets and can be optimized directly using gradient descent. They make no assumption regarding the structure of the invariances present in the data and in that respect are very flexible and powerful. We also model ConvNets and NPTNs under a unified framework called Transformation Networks which establishes the natural connection between the two. We demonstrate the efficacy of NPTNs on natural data such as MNIST and CIFAR 10 where they outperform ConvNet baselines with the same number of parameters. We show they are effective in learning invariances unknown a priori directly from data from scratch. Finally, we apply NPTNs to Capsule Networks and show that they enable them to perform even better.

DCDistance: A Supervised Text Document Feature extraction based on class labels

Text Mining is a field that aims at extracting information from textual data. One of the challenges of this field of study comes from the pre-processing stage in which a vector (and structured) representation should be extracted from unstructured data. The common extraction creates large and sparse vectors representing the importance of each term to a document. As such, this usually leads to the curse-of-dimensionality that plagues most machine learning algorithms. To cope with this issue, in this paper we propose a new supervised feature extraction and reduction algorithm, named DCDistance, that creates features based on the distance between a document and a representative of each class label. As such, the proposed technique can reduce the feature set by more than 99% of the original set. Additionally, this algorithm was also capable of improving the classification accuracy over a set of benchmark datasets when compared to traditional and state-of-the-art feature selection algorithms.
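
The idea of one feature per class label can be sketched as follows. The abstract does not say how the class representative or the distance is defined, so this sketch assumes the representative is the mean term vector of the class and the distance is Euclidean; the function names and the toy vectors are hypothetical.

```python
import math


def class_centroids(vectors, labels):
    """Average the term vectors of each class (assumed class representative)."""
    sums, counts = {}, {}
    for vec, lab in zip(vectors, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def distance_features(vec, centroids):
    """One feature per class: Euclidean distance from the document to that class centroid."""
    return [math.dist(vec, centroids[lab]) for lab in sorted(centroids)]


# Toy term-count vectors over a 4-term vocabulary.
train = [[2, 0, 0, 1], [4, 0, 0, 1], [0, 3, 1, 0], [0, 1, 1, 0]]
labels = ["sports", "sports", "politics", "politics"]

centroids = class_centroids(train, labels)
features = distance_features([3, 0, 0, 1], centroids)  # order: politics, sports
```

Whatever the vocabulary size, each document ends up with as many features as there are class labels, which is where the large reduction comes from.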

tau-FPL: Tolerance-Constrained Learning in Linear Time

Learning a classifier with control on the false-positive rate plays a critical role in many machine learning applications. Existing approaches either introduce prior knowledge dependent label cost or tune parameters based on traditional classifiers, which lack consistency in methodology because they do not strictly adhere to the false-positive rate constraint. In this paper, we propose a novel scoring-thresholding approach, tau-False Positive Learning (tau-FPL) to address this problem. We show that the scoring problem which takes the false-positive rate tolerance into account can be efficiently solved in linear time, and that an out-of-bootstrap thresholding method can transform the learned ranking function into a low false-positive classifier. Both theoretical analysis and experimental results show superior performance of the proposed tau-FPL over existing approaches.
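
The thresholding half of such a scoring-thresholding approach can be illustrated with a plain empirical rule. This is not the paper's out-of-bootstrap method: it assumes a set of scores for known negative examples and simply picks the cut-off that keeps the empirical false-positive rate within the tolerance tau. The function name and the scores are hypothetical.

```python
import math


def threshold_for_fpr(negative_scores, tau):
    """Return a cut-off such that at most a fraction tau of negatives score strictly above it."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = math.floor(tau * len(ranked))  # number of false positives we may tolerate
    if allowed >= len(ranked):
        return -math.inf                     # tolerance permits flagging every negative
    return ranked[allowed]                   # only the `allowed` highest negatives exceed this


negatives = [0.1, 0.4, 0.35, 0.8, 0.2, 0.05, 0.6, 0.3, 0.15, 0.5]
tau = 0.2

threshold = threshold_for_fpr(negatives, tau)
false_positives = sum(1 for s in negatives if s > threshold)
fpr = false_positives / len(negatives)
```

An example is classified as positive when its score is strictly greater than the threshold, so ties at the cut-off never push the rate over the tolerance.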

SPIN: A Fast and Scalable Matrix Inversion Method in Apache Spark

The growth of big data in domains such as Earth Sciences, Social Networks, Physical Sciences, etc. has led to an immense need for efficient and scalable linear algebra operations, e.g. matrix inversion. Existing methods for efficient and distributed matrix inversion using big data platforms rely on LU decomposition based block-recursive algorithms. However, these algorithms are complex and require a lot of side calculations, e.g. matrix multiplication, at various levels of recursion. In this paper, we propose a different scheme based on Strassen’s matrix inversion algorithm (mentioned in Strassen’s original paper in 1969), which uses far fewer operations at each level of recursion. We implement the proposed algorithm, and through extensive experimentation, show that it is more efficient than the state of the art methods. Furthermore, we provide a detailed theoretical analysis of the proposed algorithm, and derive theoretical running times which match closely with the empirically observed wall clock running times, thus explaining the U-shaped behaviour w.r.t. block-sizes.
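
The block recursion behind Strassen's inversion scheme can be sketched on a single machine. This is an illustration only, not the SPIN implementation on Spark: it assumes a square matrix whose size is a power of two and whose leading blocks and Schur complements are invertible at every level, it uses naive multiplication for the block products, and all function names are hypothetical.

```python
def matmul(a, b):
    """Naive product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    return [[-x for x in row] for row in a]


def split(a):
    """Split a matrix into four equally sized quadrants."""
    h = len(a) // 2
    return ([r[:h] for r in a[:h]], [r[h:] for r in a[:h]],
            [r[:h] for r in a[h:]], [r[h:] for r in a[h:]])


def join(c11, c12, c21, c22):
    """Reassemble four quadrants into one matrix."""
    return ([l + r for l, r in zip(c11, c12)] +
            [l + r for l, r in zip(c21, c22)])


def strassen_inverse(a):
    """Invert a with two recursive inversions and six block products per level."""
    if len(a) == 1:
        return [[1.0 / a[0][0]]]
    a11, a12, a21, a22 = split(a)
    r1 = strassen_inverse(a11)   # A11^-1
    r2 = matmul(a21, r1)         # A21 A11^-1
    r3 = matmul(r1, a12)         # A11^-1 A12
    r4 = matmul(a21, r3)         # A21 A11^-1 A12
    r5 = sub(r4, a22)            # negated Schur complement of A11
    r6 = strassen_inverse(r5)    # negated inverse of the Schur complement
    c12 = matmul(r3, r6)
    c21 = matmul(r6, r2)
    c11 = sub(r1, matmul(r3, c21))
    c22 = neg(r6)
    return join(c11, c12, c21, c22)


a = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
inverse = strassen_inverse(a)
product = matmul(a, inverse)     # should be close to the identity matrix
```

Each level performs only block multiplications, subtractions and two half-size inversions, which is why the scheme maps naturally onto a platform that already provides distributed matrix multiplication.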

An Interpretable Reasoning Network for Multi-Relation Question Answering

Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.

Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions which increase the diversity in our ensemble. These loss functions can be applied either for weight initialization or during training. Together, our contributions leverage large embedding sizes more effectively by significantly reducing correlation of the embedding and consequently increase retrieval accuracy of the embedding. Our method works with any differentiable loss function and does not introduce any additional parameters during test time. We evaluate our metric learning method on image retrieval tasks and show that it improves over state-of-the-art methods on the CUB 200-2011, Cars-196, Stanford Online Products, In-Shop Clothes Retrieval and VehicleID datasets.

Building a Conversational Agent Overnight with Dialogue Self-Play

We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz approach for data collection, M2M achieves greater diversity and coverage of salient dialogue flows while maintaining the naturalness of individual utterances. In the first phase, a simulated user bot and a domain-agnostic system bot converse to exhaustively generate dialogue ‘outlines’, i.e. sequences of template utterances and their semantic parses. In the second phase, crowd workers provide contextual rewrites of the dialogues to make the utterances more natural while preserving their meaning. The entire process can finish within a few hours. We propose a new corpus of 3,000 dialogues spanning 2 domains collected with M2M, and present comparisons with popular dialogue datasets on the quality and diversity of the surface forms and dialogue flows.

Cobra: A Framework for Cost Based Rewriting of Database Applications

Database applications are typically written using a mixture of imperative languages and declarative frameworks for data processing. Application logic gets distributed across the declarative and imperative parts of a program. Often, there is more than one way to implement the same program, whose efficiency may depend on a number of parameters. In this paper, we propose a framework that automatically generates all equivalent alternatives of a given program using a given set of program transformations, and chooses the least cost alternative. We use the concept of program regions as an algebraic abstraction of a program and extend the Volcano/Cascades framework for optimization of algebraic expressions, to optimize programs. We illustrate the use of our framework for optimizing database applications. We show through experimental results, that our framework has wide applicability in real world applications and provides significant performance benefits.

Formal Dependability Modeling and Optimization of Scrubbed-Partitioned TMR for SRAM-based FPGAs
LDPC Codes with Local and Global Decoding
Model-Based Action Exploration
Resolvability on Continuous Alphabets
Interactive Learning of Acyclic Conditional Preference Networks
Extremal $G$-free induced subgraphs of Kneser graphs
Influence of topology in the mobility enhancement of pulse-coupled oscillator synchronization
Timely Status Update in Massive IoT Systems: Decentralized Scheduling for Wireless Uplinks
Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition
A Brain-Inspired Trust Management Model to Assure Security in a Cloud based IoT Framework for Neuroscience Applications
On the roots of Wiener polynomials of graphs
Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction
Average Power and $λ$-power in Multiple Testing Scenarios when the Benjamini-Hochberg False Discovery Rate Procedure is Used
Noisy Feedback and Loss Unlimited Private Communication
Efficient C-RAN Random Access for IoT Devices: Learning Links via Recommendation Systems
Regularly varying non-stationary Galton–Watson processes with immigration
Minimax Optimality of Sign Test for Paired Heterogeneous Data
Adaptive Bit Allocation for OFDM Cognitive Radio Systems with Imperfect Channel Estimation
Enhancing Underwater Imagery using Generative Adversarial Networks
Non-Rigid Image Registration Using Self-Supervised Fully Convolutional Networks without Training Data
Brain Age Prediction Based on Resting-State Functional Connectivity Patterns Using Convolutional Neural Networks
A Hardware-Friendly Algorithm for Scalable Training and Deployment of Dimensionality Reduction Models on FPGA
On Partially Overlapping Coexistence for Dynamic Spectrum Access in Cognitive Radio
Spatio-Temporal Pricing for Ridesharing Platforms
Did William Shakespeare and Thomas Kyd Write Edward III?
Application of a semantic segmentation convolutional neural network for accurate automatic detection and mapping of solar photovoltaic arrays in aerial imagery
Cognitive Non-Orthogonal Multiple Access with Cooperative Relaying: A New Wireless Frontier for 5G Spectrum Sharing
A Simplified Coding Scheme for the Broadcast Channel with Complementary Receiver Side Information under Individual Secrecy Constraints
Asymptotic Static Hedge via Symmetrization
Communication Optimality Trade-offs For Distributed Estimation
A3T: Adversarially Augmented Adversarial Training
Emergent memory in cell signaling: Persistent adaptive dynamics in cascades can arise from the diversity of relaxation time-scales
Deep Stereo Matching with Explicit Cost Aggregation Sub-Architecture
Content Based Status Updates
Status Updates in a multi-stream M/G/1/1 preemptive queue
Controller Synthesis for Safety of Physically-Viable Data-Driven Models
How to augment a small learning set for improving the performances of a CNN-based steganalyzer?
Optimal control of an evolution equation with non-smooth dissipation
How should a fixed budget of dwell time be spent in scanning electron microscopy to optimize image quality?
On notions of Q-independence and Q-identical distributiveness
Multivariate stochastic delay differential equations and CAR representations of CARMA processes
Sensitivity indices for independent groups of variables
Hierarchical Motion Consistency Constraint for Efficient Geometrical Verification in UAV Image Matching
Exceptional and modern intervals of the Tamari lattice
Combinatorics of compactified universal Jacobians
Planning with Trust for Human-Robot Collaboration
Spatio-Temporal Linkage over Location Enhanced Services
Generative Single Image Reflection Separation
Self-Predicting Boolean Functions
Multiple Antennas Secure Transmission under Pilot Spoofing and Jamming Attack
Active repositioning of storage units in Robotic Mobile Fulfillment Systems
Perfect codes in generalized Fibonacci cubes
Couplings in L^p distance of two Brownian motions and their Lévy area
Clinical and Non-clinical Effects on Surgery Duration: Statistical Modeling and Analysis
On the goodness-of-fit of generalized linear geostatistical models
A Game Theoretic Approach to Hyperbolic Consensus Problems
Improved bounds on the multicolor Ramsey numbers of paths and even cycles
Interpretation of the vibrational spectra of glassy polymers using coarse-grained simulations
On Partly Overloaded Spreading Sequences with Variable Spreading Factor
Deep Episodic Memory: Encoding, Recalling, and Predicting Episodic Experiences for Robot Action Execution
First-passage times over moving boundaries for asymptotically stable walks
Cosmic String Detection with Tree-Based Machine Learning
Local asymptotic self-similarity for heavy tailed harmonizable fractional Lévy motions
Second order models for optimal transport and cubic splines on the Wasserstein space
Variational Second-Order Interpolation on the Group of Diffeomorphisms with a Right-Invariant Metric
Bayesian Quadrature for Multiple Related Integrals
Can Who-Edits-What Predict Edit Survival?
QuickNAT: Segmenting MRI Neuroanatomy in 20 seconds
Arhuaco: Deep Learning and Isolation Based Security for Distributed High-Throughput Computing
Youla Coding and Computation of Gaussian Feedback Capacity
Computing permanents of complex diagonally dominant matrices and tensors
A Simple and Efficient Estimation Method for Models with Nonignorable Missing Data
Determining Projection Constants of Univariate Polynomial Spaces
Multinomial logistic model for coinfection diagnosis between arbovirus and malaria in Kedougou
A unifying Perron-Frobenius theorem for nonnegative tensors via multi-homogeneous maps
Machine Intelligence Techniques for Next-Generation Context-Aware Wireless Networks
List Decoding of Locally Repairable Codes
Optimal Streaming Codes for Channels with Burst and Arbitrary Erasures
Development of Energy Models for Design Space Exploration of Embedded Many-Core Systems
Inexact cuts in Deterministic and Stochastic Dual Dynamic Programming applied to linear optimization problems
Safe Privatization in Transactional Memory
Graph domination-saturation
On projective and affine equivalence of sub-Riemannian metrics
Conditional Probability Models for Deep Image Compression
Deep saliency: What is learnt by a deep network about saliency?
A note on Herglotz’s theorem for time series on function spaces
Real-world Anomaly Detection in Surveillance Videos
Asynchronous Stochastic Variational Inference
The Control Toolbox – An Open-Source C++ Library for Robotics, Optimal and Model Predictive Control
Generalization Error Bounds for Noisy, Iterative Algorithms
Belief Propagation Decoding of Polar Codes on Permuted Factor Graphs
A Family of Tractable Graph Distances
A Workload Analysis of NSF’s Innovative HPC Resources Using XDMoD
TFisher Tests: Optimal and Adaptive Thresholding for Combining $p$-Values
A Multi-Hop Framework for Multi-Source, Multi-Relay, All-Cast Channels
Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks
Corner cases, singularities, and dynamic factoring
Not All Ops Are Created Equal!
Prototypicality effects in global semantic description of objects
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
The DCS Theorem
Estimating the Number of Connected Components in a Graph via Subgraph Sampling
Predicting Future Lane Changes of Other Highway Vehicles using RNN-based Deep Models
Combining Symbolic and Function Evaluation Expressions In Neural Programs
Susceptibility of power grids to input fluctuations
Engineering Cooperative Smart Things based on Embodied Cognition
A Computational Model of Commonsense Moral Decision Making
Comprehensive Optimization of Parametric Kernels for Graphics Processing Units
On the Capacity Region of the Deterministic Y-Channel with Common and Private Messages
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Feature Space Transfer for Data Augmentation
Coded Cooperative Computation for Internet of Things
Queue-aware Energy Efficient Control for Dense Wireless Networks
Estimation in the group action channel
Is profile likelihood a true likelihood? An argument in favor
Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation
Cohomology rings of compactifications of toric arrangements
Distributed Multi-User Secret Sharing
Phase diagrams of Weyl semimetals with competing diagonal and off-diagonal disorders
A Context-free Grammar for Peaks and Double Descents of Permutations
Social Advantage with Mixed Entangled States
A Survey on Compiler Autotuning using Machine Learning
On the convergence properties of GAN training
Towards a more efficient representation of imputation operators in TPOT
Tight Bounds for $\ell_p$ Oblivious Subspace Embeddings
LDPC Codes over Gaussian Multiple Access Wiretap Channel
Asymptotic Distribution of Multilevel Channel Polarization for a Certain Class of Erasure Channels
Longest Common Prefixes with $k$-Errors and Applications
Sparse NOMA: A Closed-Form Characterization
Detecting Offensive Language in Tweets Using Deep Learning
Aperiodic Sampled-Data Control via Explicit Transmission Mapping: A Set Invariance Approach
Semi-supervised Fisher vector network
Variable-Length Resolvability for Mixed Sources and its Application to Variable-Length Source Coding
Secure Communications in NOMA System: Subcarrier Assignment and Power Allocation
Scalable De Novo Genome Assembly Using Pregel
Size-to-depth: A New Perspective for Single Image Depth Estimation
Boolean functions: noise stability, non-interactive correlation, and mutual information
A Scalable Belief Propagation Algorithm for Radio Signal Based SLAM
Lattice Erasure Codes of Low Rank with Noise Margins
On the Measurement Uncertainty in a Reverberation Chamber Including Frequency Stirring
EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings
Not-All-Equal and 1-in-Degree Decompositions: Algorithmic Complexity and Applications
Channel Whispering: a Protocol for Physical Layer Group Key Generation. Application to IR-UWB through Deconvolution
Waring’s Theorem for Binary Powers
Persistence of one-dimensional AR(1)-sequences
Can Computers Create Art?
Better Runtime Guarantees Via Stochastic Domination
A Stochastic Singular Vector Based MIMO Channel Model for MAC Layer Tracking
On a statistical approach to mate choices in reproduction
Irreversible investment with fixed adjustment costs: a stochastic impulse control approach
An Explicit Convergence Rate for Nesterov’s Method from SDP
Model Predictive Control in Spacecraft Rendezvous and Soft Docking
Near-optimal approximation algorithm for simultaneous Max-Cut
Fast Methods for Solving the Cluster Containment Problem for Phylogenetic Networks
Extinction time of a CB-processes with competition in a Lévy random environment
Saturated equiangular lines in Euclidean spaces
Non-Orthogonal Multiple Access for mmWave Drone Networks with Limited Feedback
Polynomial stability of exact solution and a numerical method for stochastic differential equations with time-dependent delay
Shrink or Substitute: Handling Process Failures in HPC Systems using In-situ Recovery
A Bio-inspired Collision Detector for Small Quadcopter
Regularity of stochastic nonlocal diffusion equations
Compressed Neighbour Discovery using Sparse Kerdock Matrices
Fix your classifier: the marginal value of training the last weight layer
Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication
Hire the Experts: Combinatorial Auction Based Scheme for Experts Selection in E-Healthcare
Throughput Maximization for UAV-Enabled Wireless Powered Communication Networks
Frame Moments and Welch Bound with Erasures
Properties of non-symmetric Macdonald polynomials at $q=1$ and $q=0$
Energy-Efficient Resource Allocation in NOMA Heterogeneous Networks
Remarks on Graphons
Poisson Cox Point Processes for Vehicular Networks
On the effect of blockage objects in dense MIMO SWIPT networks
Asymptotic Enumeration of Graph Classes with Many Components
Towards Realistic Threat Modeling: Attack Commodification, Irrelevant Vulnerabilities, and Unrealistic Assumptions
Fully Quantum Arbitrarily Varying Channels: Random Coding Capacity and Capacity Dichotomy
Distributed dynamic load balancing for task parallel programming
The method of hypergraph containers
A Bayesian Evidence Synthesis Approach to Estimate Disease Prevalence in Hard-To-Reach Populations: Hepatitis C in New York City
Deep Reinforcement Fuzzing
Frame-Recurrent Video Super-Resolution
On the shape factor of interaction laws for a non-local approximation of the Sobolev norm and the total variation
On Identifying a Massive Number of Distributions
Stochastic quantization of an Abelian gauge theory
New Perspectives on Multi-Prover Interactive Proofs
Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis
PACER: Peripheral Activity Completion Estimation and Recognition
A functional limit theorem for the profile of random recursive trees
Algorithmic Polynomials
Some Generalizations of Good Integers and Their Applications in the Study of Self-Dual Negacyclic Codes
Some remarks on biased recursive trees
Hierarchical Memory Management for Mutable State
Top k Memory Candidates in Memory Networks for Common Sense Reasoning
Theorems About Integration Order Replacement in Multiple Ito Stochastic Integrals
An Elementary Dyadic Riemann Hypothesis
Strategies for Stable Merge Sorting
Generalized Lambert Series Identities and Applications in Rank Differences
Renewal in Hawkes processes with self-excitation and inhibition
Non-Orthogonal Multiple Access For Cooperative Communications: Challenges, Opportunities, And Trends
Deep Net Triage: Assessing the Criticality of Network Layers by Structural Compression
Hyperspectral recovery from RGB images using Gaussian Processes
Information Geometric Approach to Bayesian Lower Error Bounds
The Circular Law for Random Matrices with Intra-row Dependence
Efficient Trimmed Convolutional Arithmetic Encoding for Lossless Image Compression
The decoding failure probability of MDPC codes
Fault-Tolerant Hotelling Games
Partial geodesics on symmetric groups endowed with breakpoint distance
Approximation of Excessive Backlog Probabilities of Two Tandem Queues
Efficient arithmetic regularity and removal lemmas for induced bipartite patterns
Hierarchical Coding for Distributed Computing
Asymptotic Correlation Structure of Discounted Incurred But Not Reported Claims under Fractional Poisson Arrival Process
Towards Imperceptible and Robust Adversarial Example Attacks against Neural Networks
Sparsity-based Defense against Adversarial Attacks on Linear Classifiers
Robust capacitated trees and networks with uniform demands
Searching for Maximum Out-Degree Vertices in Tournaments
Inclusion-exclusion by ordering-free cancellation
Sensitivity analysis for multiscale stochastic reaction networks using hybrid approximations
Robust Inference for Seemingly Unrelated Regression Models
Combining Stereo Disparity and Optical Flow for Basic Scene Flow
Spectral engineering and tunable thermoelectric behavior in a quasiperiodic ladder network
The Communication-Hiding Conjugate Gradient Method with Deep Pipelines
Full Wafer Redistribution and Wafer Embedding as Key Technologies for a Multi-Scale Neuromorphic Hardware Cluster
Secure Adaptive Group Testing
Mixing Time on the Kagome Lattice
Distributionally Robust Optimization for Sequential Decision Making
Directed Strongly Regular Cayley Graphs on Dihedral groups
SAR Image Despeckling Using Quadratic-Linear Approximated L1-Norm
On the Distribution of Random Geometric Graphs
A Tight Converse to the Spectral Resolution Limit via Convex Programming
Two High-performance Schemes of Transmit Antenna Selection for Secure Spatial Modulation
Detecting dynamic spatial correlation patterns with generalized wavelet coherence and non-stationary surrogate data
Block-coordinate primal-dual method for the nonsmooth minimization over linear constraints
Subpolynomial trace reconstruction for random strings and arbitrary deletion probability
Approximating the Incremental Knapsack Problem
New LMRD bounds for constant dimension codes and improved constructions
A partial order on Motzkin paths
Predicting Movie Genres Based on Plot Summaries
Robots as Powerful Allies for the Study of Embodied Cognition from the Bottom Up
Improving Communication Patterns in Polyhedral Process Networks
Mixing Time for Square Tilings
Empirical $L^2$-distance test statistics for ergodic diffusions
System-Aware Compression
Improving Orbit Prediction Accuracy through Supervised Machine Learning
Randomized projection methods for convex feasibility problems: conditioning and convergence rates
Classification of histopathological breast cancer images using iterative VMD aided Zernike moments & textural signatures
Coding over Sets for DNA Storage
Unsupervised Cipher Cracking Using Discrete GANs
Non-Orthogonal Multiple Access for Mobile VLC Networks with Random Receiver Orientation
Sending Information Through Status Updates

Book Memo: “Mining the Social Web”

Want to tap the tremendous amount of valuable social data in Facebook, Twitter, LinkedIn, GitHub, Instagram, and Google+? This new edition helps you discover who’s making connections with social media, what they’re talking about, and where they’re located. You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.
• Get a straightforward synopsis of the social web landscape
• Use adaptable scripts on GitHub to harvest data from social network APIs
• Learn how to employ easy-to-use Python tools to slice and dice the data you collect
• Explore social connections in microformats with the XHTML Friends Network
• Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, clique detection, and image recognition
• Build interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits

If you did not already know

Zeros Ones Inflated Proportional
The ZOIP (Zeros Ones Inflated Proportional) distribution is a distribution for proportional data inflated with zeros and/or ones. It is defined on the best-known distributions for proportional data, the beta and simplex distributions (Jørgensen and Barndorff-Nielsen (1991) <doi:10.1016/0047-259X(91)90008-P>), and allows different parameterizations of the beta distribution (Ferrari and Cribari-Neto (2004) <doi:10.1080/0266476042000214501>, Rigby and Stasinopoulos (2005) <doi:10.18637/jss.v023.i07>). The ZOIP distribution has four parameters: two correspond to the proportions of zeros and ones, and the other two to the distribution chosen for the proportional data. The ‘ZOIP’ package fits fixed-effects and mixed-effects regression models for proportional data inflated with zeros and/or ones. …
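The four-parameter structure described above can be illustrated with a small sampling sketch. This is not the ‘ZOIP’ package's API: the function name `sample_zoib`, its arguments `p0`, `p1`, `alpha`, `beta`, and the choice of the standard shape parameterization of the beta distribution are all assumptions made for illustration. The simplex distribution and the alternative beta parameterizations mentioned in the entry are not covered.

```python
import random

def sample_zoib(n, p0, p1, alpha, beta, seed=None):
    """Draw n values from a zero-one-inflated beta mixture (illustrative only).

    p0, p1      -- hypothetical names for the point masses at 0 and at 1
    alpha, beta -- shape parameters of the beta part (assumed parameterization)
    """
    if p0 < 0 or p1 < 0 or p0 + p1 > 1:
        raise ValueError("p0 and p1 must be non-negative and sum to at most 1")
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        u = rng.random()
        if u < p0:
            draws.append(0.0)            # inflated zero
        elif u < p0 + p1:
            draws.append(1.0)            # inflated one
        else:
            draws.append(rng.betavariate(alpha, beta))  # proportional part
    return draws

draws = sample_zoib(1000, p0=0.2, p1=0.1, alpha=2.0, beta=5.0, seed=0)
share_zeros = draws.count(0.0) / len(draws)
share_ones = draws.count(1.0) / len(draws)
```

With a large sample, `share_zeros` and `share_ones` should land near `p0` and `p1`, while the remaining draws fall between zero and one.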

Fuzzy Supervised Learning with Binary Meta-Feature (FSL-BM)
This paper introduces a novel real-time Fuzzy Supervised Learning with Binary Meta-Feature (FSL-BM) method for big data classification tasks. The study of real-time algorithms addresses several major concerns, namely accuracy, memory consumption, the ability to stretch assumptions, and time complexity. Attaining a fast computational model that provides fuzzy logic and supervised learning is one of the main challenges in machine learning. In this research paper, we present the FSL-BM algorithm as an efficient solution for supervised learning with fuzzy logic processing, based on a binary meta-feature representation and using Hamming distance and a hash function to relax assumptions. While many studies have focused on reducing time complexity and increasing accuracy during the last decade, the novel contribution of this proposed solution comes through the integration of Hamming distance, a hash function, binary meta-features, and binary classification to provide a real-time supervised method. The Hash Table (HT) component gives fast access to existing indices and therefore generates new indices in constant time, which supersedes existing fuzzy supervised algorithms with better or comparable results. To summarize, the main contribution of this technique for real-time Fuzzy Supervised Learning is to represent the hypothesis through binary input as a meta-feature space and to create the Fuzzy Supervised Hash table to train and validate the model. …

Norm
In linear algebra, functional analysis and related areas of mathematics, a norm is a function that assigns a strictly positive length or size to each vector in a vector space – save possibly for the zero vector, which is assigned a length of zero. A seminorm, on the other hand, is allowed to assign zero length to some non-zero vectors (in addition to the zero vector). A norm must also satisfy certain properties pertaining to scalability and additivity which are given in the formal definition below. A simple example is the 2-dimensional Euclidean space R² equipped with the Euclidean norm. Elements in this vector space (e.g., (3, 7)) are usually drawn as arrows in a 2-dimensional Cartesian coordinate system starting at the origin (0, 0). The Euclidean norm assigns to each vector the length of its arrow. Because of this, the Euclidean norm is often known as the magnitude. A vector space on which a norm is defined is called a normed vector space. Similarly, a vector space with a seminorm is called a seminormed vector space. It is often possible to supply a norm for a given vector space in more than one way. …
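As a small worked example of the definition above, the following sketch computes the Euclidean norm of the vector (3, 7) from the text and checks the scalability and additivity properties numerically. The helper name `euclidean_norm`, the second vector `w`, and the scalar `a` are illustrative choices, not part of the entry. A numerical check on a few values illustrates the properties but does not prove them.

```python
import math

def euclidean_norm(v):
    """Return the Euclidean norm (arrow length) of a sequence of numbers."""
    return math.sqrt(sum(x * x for x in v))

v = (3.0, 7.0)    # the example vector from the text
w = (-1.0, 2.0)   # an arbitrary second vector (illustrative)
a = -2.5          # an arbitrary scalar (illustrative)

norm_v = euclidean_norm(v)  # sqrt(3^2 + 7^2) = sqrt(58)

# Scalability (absolute homogeneity): ||a * v|| equals |a| * ||v||
scaled = tuple(a * x for x in v)
homogeneous = math.isclose(euclidean_norm(scaled), abs(a) * norm_v)

# Additivity in the sense of the triangle inequality: ||v + w|| <= ||v|| + ||w||
summed = tuple(x + y for x, y in zip(v, w))
triangle = euclidean_norm(summed) <= norm_v + euclidean_norm(w)

# The zero vector is assigned length zero
zero_has_zero_length = euclidean_norm((0.0, 0.0)) == 0.0
```

Here `norm_v` is the square root of 58, roughly 7.62, which is the length of the arrow from the origin to (3, 7).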

R Packages worth a look

Query Search Interfaces (searcher)
Provides a search interface to look up terms on ‘Google’, ‘Bing’, ‘DuckDuckGo’, ‘StackOverflow’, ‘GitHub’, and ‘BitBucket’. Upon searching, a browser window will open with the aforementioned search results.

Composite Likelihood Estimation for Spatial Data (clespr)
A composite likelihood approach is implemented for estimating statistical models for spatial ordinal and proportional data based on Feng et al. (2014) <doi:10.1002/env.2306>. Parameter estimates are identified by maximizing composite log-likelihood functions using the limited memory BFGS optimization algorithm with bounding constraints, while standard errors are obtained by estimating the Godambe information matrix.

Adjusted Prediction Model Performance Estimation (APPEstimation)
Calculates predictive model performance measures adjusted for predictor distributions using the density ratio method (Sugiyama et al. (2012, ISBN:9781139035613)). L1 and L2 errors for continuous outcomes and C-statistics for binomial outcomes are computed.

‘Rcpp’ Bindings for the ‘Corpus Workbench’ (‘CWB’) (RcppCWB)
‘Rcpp’ Bindings for the C code of the ‘Corpus Workbench’ (‘CWB’), an indexing and query engine to efficiently analyze large corpora (<>). ‘RcppCWB’ is licensed under the GNU GPL-3, in line with the GPL-3 license of the ‘CWB’ (<https://…/GPL-3>). The ‘CWB’ relies on ‘pcre’ (BSD license, see <https://…/licence.txt>) and ‘GLib’ (LGPL license, see <https://…/lgpl-3.0.en.html>). See the file LICENSE.note for further information.

An Integrated Regression Model for Normalizing ‘NanoString nCounter’ Data (RCRnorm)
‘NanoString nCounter’ is a medium-throughput platform that measures gene or microRNA expression levels. The platform is introduced in Malkov (2009) <doi:10.1186/1756-0500-2-80>, and detailed information is available on the ‘NanoString nCounter’ webpage <https://…/ncounter-technology>. It has important clinical applications, such as the diagnosis and prognosis of cancer. The package implements an integrated system of random-coefficient hierarchical regression models to normalize data from the ‘NanoString nCounter’ platform so that noise from various sources can be removed.