R Packages worth a look

Bivariate Pareto Models (Bivariate.Pareto)
Perform competing risks analysis under bivariate Pareto models. See Shih et al. (2018, to appear).

Display Resizable Plots (tkRplotR)
Display a plot in a Tk canvas.

A Compact, High Speed Data Format (msgpack)
A fast C-based encoder and streaming decoder for the ‘messagepack’ data format. ‘Messagepack’ is similar in structure to ‘JSON’ but uses a more compact binary encoding. Based on the CWPack C library.

Non Linear Time Series Analysis (NlinTS)
The main functionalities of this package are time series forecasting and causality detection. In particular, it provides a neural network Vector Auto-Regressive model, the classical Granger causality test of C. W. J. Granger (1980) <doi:10.1016/0165-1889(80)90069-X>, and a non-linear version of it.

Random Effects for the Identification of Differential Splicing (REIDS)
Contains the REIDS model presented in Van Moerbeke et al. (2017) <doi:10.1186/s12859-017-1687-8> for the detection of alternative splicing. The method is extended by incorporating junction information for the assessment of alternative splicing. The vignette introduces the model and shows an example workflow.

Distilled News

Loops in R and Python: Who is faster?

This post compares R and Python in terms of the time they require to loop and generate pseudo-random numbers. To accomplish the task, the following steps were performed in Python and R: (1) loop 100k times (i is the loop index); (2) generate a random integer from the array of integers from 1 to the current loop index i (i+1 for Python); (3) output the elapsed time at the probe loop steps: i (i+1 for Python) in [10, 100, 1000, 5000, 10000, 25000, 50000, 75000, 100000].
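The three steps can be sketched on the Python side roughly as follows. This is a minimal illustration and not the code from the linked post: the function name run_benchmark, the fixed seed, and the use of randint (rather than sampling from an explicitly built array) are assumptions made here.

```python
import random
import time

# Probe steps taken from the description above.
PROBES = {10, 100, 1000, 5000, 10000, 25000, 50000, 75000, 100000}


def run_benchmark(n_iterations=100_000, seed=0):
    """Loop n_iterations times, draw one random integer per step, and
    record the elapsed time at every probe step (hypothetical helper)."""
    rng = random.Random(seed)
    timings = {}
    start = time.perf_counter()
    for i in range(n_iterations):
        step = i + 1              # Python counts from 0, hence i + 1
        rng.randint(1, step)      # random integer in 1..step, inclusive
        if step in PROBES:
            timings[step] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    for step, elapsed in sorted(run_benchmark().items()):
        print(f"{step:>7d} iterations: {elapsed:.4f} s")
```

Absolute timings depend on the machine and interpreter version, so only the relative growth across probe steps is meaningful.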


Does It Make Sense to Do Big Data with Small Nodes?

It’s been about ten years since NoSQL showed up on the scene. With its scale-out architecture, NoSQL enables even small organizations to handle huge amounts of data on commodity hardware. Need to do more? No problem, just add more hardware. As companies continue to use more and more data, the ability to scale-out becomes more critical. It’s also important to note that commodity hardware has changed a lot since the rise of NoSQL. In 2008, Intel was about to release the Intel Core and Core Duo architecture, in which we first had two cores in the same die. Jump back to the present, where so many of us carry around a phone with an 8-core processor. In this age of big data and powerful commodity hardware there’s an ongoing debate about node size. Does it make sense to use a lot of small nodes to handle big data workloads? Or should we instead use only a handful of very big nodes? If we need to process 200TB of data, for example, is it better to do so with 200 nodes with 4 cores and 1 terabyte each, or to use 20 nodes with 40 cores and 10 terabytes each?
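The two cluster shapes in the example are equivalent in aggregate, which a few lines of arithmetic make explicit. The helper cluster_totals and the "capacity lost per failed node" figure are illustrative assumptions added here, not part of the linked article.

```python
def cluster_totals(nodes, cores_per_node, tb_per_node):
    """Aggregate resources of a homogeneous cluster (hypothetical helper)."""
    return {
        "cores": nodes * cores_per_node,
        "storage_tb": nodes * tb_per_node,
        # Share of total capacity that disappears when a single node fails.
        "loss_per_node_failure": 1 / nodes,
    }


many_small = cluster_totals(nodes=200, cores_per_node=4, tb_per_node=1)
few_big = cluster_totals(nodes=20, cores_per_node=40, tb_per_node=10)

for name, totals in (("200 small nodes", many_small), ("20 big nodes", few_big)):
    print(f"{name}: {totals['cores']} cores, {totals['storage_tb']} TB, "
          f"{totals['loss_per_node_failure']:.1%} of capacity per failed node")
```

Both layouts come to 800 cores and 200 TB, so the debate is about everything the totals hide, such as failure blast radius, network traffic, and operational overhead.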


Control Structures in R: Using If-Else Statements and Loops

Control structures allow you to control the flow of execution of your code. They are extremely useful if you want to run a piece of code multiple times, or if you want to run a piece of code only if a certain condition is met.
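The linked post covers R; the same three ideas (branching, iterating over a collection, and repeating while a condition holds) look like this in Python. The function names are made up for this sketch.

```python
def classify(x):
    """if / elif / else: run a branch only when its condition is met."""
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"


def label_all(values):
    """for loop: run the same piece of code once per element."""
    labels = []
    for value in values:
        labels.append(classify(value))
    return labels


def count_halvings(n):
    """while loop: keep repeating as long as the condition holds."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


print(label_all([-2, 0, 3]))
print(count_halvings(8))
```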


An introduction to joint modeling in R

It basically combines (joins) the probability distributions from a linear mixed-effects model with random effects (which takes care of the longitudinal data) and a survival Cox model (which calculates the hazard ratio for an event from the censored data).


Whys and Hows of Apply Family of Functions in R: Introduction to Looping system

Imagine you were to perform a simple task, let’s say calculating the column sums of a 3x3 matrix. What do you think is the best way? Calculating them directly with traditional methods such as a calculator or even pen and paper doesn’t sound like a bad approach. A lot of us may prefer to just calculate them manually instead of writing an entire piece of code for such a small dataset. Now, if the dataset is a 10x10 matrix, would you do the same? Not sure. Now, if the dataset is bigger still, let’s say a 100x100, 1000x1000, or 5000x5000 matrix, would you even think of doing it manually? I won’t.
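For a sense of what the apply family automates, here is a rough Python analogue of applying a sum over the columns (and rows) of a matrix. It is a sketch for comparison only, not the R code from the linked post, and the helper names are assumptions.

```python
def column_sums(matrix):
    """Sum each column; zip(*matrix) regroups the rows into columns."""
    return [sum(column) for column in zip(*matrix)]


def row_sums(matrix):
    """Sum each row."""
    return [sum(row) for row in matrix]


m = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(column_sums(m))  # [12, 15, 18]
print(row_sums(m))     # [6, 15, 24]
```

The same two functions work unchanged on a 5000x5000 matrix, which is the point of the passage: the code does not grow with the data.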


Benchmarking Google’s new TPUv2

Nine months after the initial announcement, Google last week finally released TPUv2 to early beta users on the Google Cloud Platform. At RiseML, we got our hands on them and ran a couple of quick benchmarks. Below, we’d like to share our experience and preliminary results.

If you did not already know

Data Partitioning
Data partitioning in data mining is the division of the whole data available into two or three non-overlapping sets: the training set, the validation set, and the test set. If the data set is very large, often only a portion of it is selected for the partitions. Partitioning is normally used when the model for the data at hand is being chosen from a broad set of models. The basic idea of data partitioning is to keep a subset of available data out of analysis, and to use it later for verification of the model. …
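A minimal sketch of such a three-way split, assuming a simple random shuffle. The 60/20/20 proportions, the fixed seed, and the function name partition are illustrative choices, not something the definition above prescribes.

```python
import random


def partition(data, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle the data and cut it into three non-overlapping sets.

    Whatever is left after the training and validation sets becomes
    the test set.
    """
    items = list(data)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_frac)
    n_valid = round(len(items) * valid_frac)
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]
    return train, valid, test


train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))
```

Because the three slices come from one shuffled list, no record can land in more than one set, which is what keeps the held-out data usable for verification.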

Deep Learning Accelerator Unit (DLAU)
As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems. However, the size of the networks is becoming increasingly large due to the demands of practical applications, which poses a significant challenge to constructing high-performance implementations of deep learning neural networks. In order to improve the performance as well as to maintain the low power cost, in this paper we design DLAU, which is a scalable accelerator architecture for large-scale deep learning networks using FPGA as the hardware prototype. The DLAU accelerator employs three pipelined processing units to improve the throughput and utilizes tile techniques to explore locality for deep learning applications. Experimental results on the state-of-the-art Xilinx FPGA board demonstrate that the DLAU accelerator is able to achieve up to 36.1x speedup compared to the Intel Core2 processors, with the power consumption at 234mW. …

Neural Decision Trees
In this paper we propose a synergistic melting of neural networks and decision trees into a deep hashing neural network (HNN) having a modeling capability exponential with respect to its number of neurons. We first derive a soft decision tree named neural decision tree allowing the optimization of arbitrary decision function at each split node. We then rewrite this soft space partitioning as a new kind of neural network layer, namely the hashing layer (HL), which can be seen as a generalization of the known soft-max layer. This HL can easily replace the standard last layer of ANN in any known network topology and thus can be used after a convolutional or recurrent neural network for example. We present the modeling capacity of this deep hashing function on small datasets where one can reach at least equally good results as standard neural networks by diminishing the number of output neurons. Finally, we show that for the case where the number of output neurons is large, the neural network can mitigate the absence of linear decision boundaries by learning for each difficult class a collection of not necessarily connected sub-regions of the space leading to more flexible decision surfaces. Finally, the HNN can be seen as a deep locality sensitive hashing function which can be trained in a supervised or unsupervised setting as we will demonstrate for classification and regression problems. …

Document worth reading: “Blockchain and Artificial Intelligence”

It is undeniable that artificial intelligence (AI) and blockchain concepts are spreading at a phenomenal rate. Both technologies have distinct degrees of technological complexity and multi-dimensional business implications. However, a common misunderstanding about the blockchain concept, in particular, is that blockchain is decentralized and is not controlled by anyone. But the underlying development of a blockchain system is still attributed to a cluster of core developers. Take the smart contract as an example: it is essentially a collection of codes (or functions) and data (or states) that are programmed and deployed on a blockchain (say, Ethereum) by different human programmers. It is thus, unfortunately, less likely to be free of loopholes and flaws. In this article, through a brief overview of how artificial intelligence could be used to deliver bug-free smart contracts so as to achieve the goal of blockchain 2.0, we emphasize that the blockchain implementation can be assisted or enhanced via various AI techniques. The alliance of AI and blockchain is expected to create numerous possibilities.

Book Memo: “Anomaly Detection Principles and Algorithms”

This book provides a readable and elegant presentation of the principles of anomaly detection, providing an easy introduction for newcomers to the field. A large number of algorithms are succinctly described, along with a presentation of their strengths and weaknesses. The authors also cover algorithms that address different kinds of problems of interest with single and multiple time series data and multi-dimensional data. New ensemble anomaly detection algorithms are described, utilizing the benefits provided by diverse algorithms, each of which works well on some kinds of data. With advancements in technology and the extensive use of the internet as a medium for communications and commerce, there has been a tremendous increase in the threats faced by individuals and organizations from attackers and criminal entities. Variations in the observable behaviors of individuals (from others and from their own past behaviors) have been found to be useful in predicting potential problems of various kinds. Hence computer scientists and statisticians have been conducting research on automatically identifying anomalies in large datasets. This book will primarily target practitioners and researchers who are newcomers to the area of modern anomaly detection techniques. Advanced-level students in computer science will also find this book helpful with their studies.

What’s new on arXiv

Learning Causally-Generated Stationary Time Series

We present the Causal Gaussian Process Convolution Model (CGPCM), a doubly nonparametric model for causal, spectrally complex dynamical phenomena. The CGPCM is a generative model in which white noise is passed through a causal, nonparametric-window moving-average filter, a construction that we show to be equivalent to a Gaussian process with a nonparametric kernel that is biased towards causally-generated signals. We develop enhanced variational inference and learning schemes for the CGPCM and its previous acausal variant, the GPCM (Tobar et al., 2015b), that significantly improve statistical accuracy. These modelling and inferential contributions are demonstrated on a range of synthetic and real-world signals.


Structured low-rank matrix completion for forecasting in time series analysis

In this paper we consider the low-rank matrix completion problem with specific application to forecasting in time series analysis. Briefly, the low-rank matrix completion problem is the problem of imputing missing values of a matrix under a rank constraint. We consider a matrix completion problem for Hankel matrices and a convex relaxation based on the nuclear norm. Based on new theoretical results and a number of numerical and real examples, we investigate the cases when the proposed approach can work. Our results highlight the importance of choosing a proper weighting scheme for the known observations.
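To make the setup concrete: a time series is embedded into a Hankel matrix (constant along anti-diagonals), and the values to be forecast appear as missing entries in its corner. The sketch below only builds that matrix; it does not implement the nuclear-norm relaxation or the weighting scheme from the paper, and the function name and the use of None for unknown values are assumptions.

```python
def hankel_matrix(series, window):
    """Build the Hankel matrix H with H[i][j] = series[i + j].

    The result has `window` rows and len(series) - window + 1 columns.
    """
    n = len(series)
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and len(series)")
    n_cols = n - window + 1
    return [[series[i + j] for j in range(n_cols)] for i in range(window)]


observed = [1, 2, 3, 4, 5]
# Forecasting one step ahead: append the unknown value as a missing entry.
extended = observed + [None]

for row in hankel_matrix(extended, window=3):
    print(row)
# [1, 2, 3, 4]
# [2, 3, 4, 5]
# [3, 4, 5, None]
```

A completion method would then choose the missing entry so that the matrix has low rank; for this linear series, the value 6 keeps the matrix at rank 2.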


Artificial Intelligence and Legal Liability

A recent issue of a popular computing journal asked which laws would apply if a self-driving car killed a pedestrian. This paper considers the question of legal liability for artificially intelligent computer systems. It discusses whether criminal liability could ever apply; to whom it might apply; and, under civil law, whether an AI program is a product that is subject to product design legislation or a service to which the tort of negligence applies. The issue of sales warranties is also considered. A discussion of some of the practical limitations that AI systems are subject to is also included.


Manipulating and Measuring Model Interpretability

Despite a growing body of research focused on creating interpretable machine learning methods, there have been few empirical studies verifying whether interpretable methods achieve their intended effects on end users. We present a framework for assessing the effects of model interpretability on users via pre-registered experiments in which participants are shown functionally identical models that vary in factors thought to influence interpretability. Using this framework, we ran a sequence of large-scale randomized experiments, varying two putative drivers of interpretability: the number of features and the model transparency (clear or black-box). We measured how these factors impact trust in model predictions, the ability to simulate a model, and the ability to detect a model’s mistakes. We found that participants who were shown a clear model with a small number of features were better able to simulate the model’s predictions. However, we found no difference in multiple measures of trust and found that clear models did not improve the ability to correct mistakes. These findings suggest that interpretability research could benefit from more emphasis on empirically verifying that interpretable models achieve all their intended effects.


Learning to Explain: An Information-Theoretic Perspective on Model Interpretation

We introduce instancewise feature selection as a methodology for model interpretation. Our method is based on learning a function to extract a subset of features that are most informative for each given example. This feature selector is trained to maximize the mutual information between selected features and the response variable, where the conditional distribution of the response variable given the input is the model to be explained. We develop an efficient variational approximation to the mutual information, and show that the resulting method compares favorably to other model explanation methods on a variety of synthetic and real data sets using both quantitative metrics and human evaluation.


Data Privacy for a $ρ$-Recoverable Function

A user’s data is represented by a finite-valued random variable. Given a function of the data, a querier is required to recover, with at least a prescribed probability, the value of the function based on a query response provided by the user. The user devises the query response, subject to the recoverability requirement, so as to maximize privacy of the data from the querier. Privacy is measured by the probability of error incurred by the querier in estimating the data from the query response. We analyze single and multiple independent query responses, with each response satisfying the recoverability requirement, that provide maximum privacy to the user. Achievability schemes with explicit randomization mechanisms for query responses are given and their privacy compared with converse upper bounds.


Federated Meta-Learning for Recommendation

Recommender systems have been widely studied from the machine learning perspective, where it is crucial to share information among users while preserving user privacy. In this work, we present a federated meta-learning framework for recommendation in which user information is shared at the level of algorithm, instead of model or data adopted in previous approaches. In this framework, user-specific recommendation models are locally trained by a shared parameterized algorithm, which preserves user privacy and at the same time utilizes information from other users to help model training. Interestingly, the model thus trained exhibits a high capacity at a small scale, which is energy- and communication-efficient. Experimental results show that recommendation models trained by meta-learning algorithms in the proposed framework outperform the state-of-the-art in accuracy and scale. For example, on a production dataset, a shared model under Google Federated Learning (McMahan et al., 2017) with 900,000 parameters has prediction accuracy 76.72%, while a shared algorithm under federated meta-learning with less than 30,000 parameters achieves accuracy of 86.23%.


Asynchronous Byzantine Machine Learning

Asynchronous distributed machine learning solutions have proven very effective so far, but always assuming perfectly functioning workers. In practice, some of the workers can however exhibit Byzantine behavior, caused by hardware failures, software bugs, corrupt data, or even malicious attacks. We introduce Kardam, the first distributed asynchronous stochastic gradient descent (SGD) algorithm that copes with Byzantine workers. Kardam consists of two complementary components: a filtering and a dampening component. The first is scalar-based and ensures resilience against 1/3 Byzantine workers. Essentially, this filter leverages the Lipschitzness of cost functions and acts as a self-stabilizer against Byzantine workers that would attempt to corrupt the progress of SGD. The dampening component bounds the convergence rate by adjusting to stale information through a generic gradient weighting scheme. We prove that Kardam guarantees almost sure convergence in the presence of asynchrony and Byzantine behavior, and we derive its convergence rate. We evaluate Kardam on the CIFAR-100 and EMNIST datasets and measure its overhead with respect to non Byzantine-resilient solutions. We empirically show that Kardam does not introduce additional noise to the learning procedure but does induce a slowdown (the cost of Byzantine resilience) that we both theoretically and empirically show to be less than f/n, where f is the number of Byzantine failures tolerated and n the total number of workers. Interestingly, we also empirically observe that the dampening component is interesting in its own right, for it enables building an SGD algorithm that outperforms alternative staleness-aware asynchronous competitors in environments with honest workers.


SparCML: High-Performance Sparse Communication for Machine Learning

One of the main drivers behind the rapid recent advances in machine learning has been the availability of efficient system support. This comes both through hardware acceleration, but also in the form of efficient software frameworks and programming models. Despite significant progress, scaling compute-intensive machine learning workloads to a large number of compute nodes is still a challenging task, with significant latency and bandwidth demands. In this paper, we address this challenge, by proposing SPARCML, a general, scalable communication layer for machine learning applications. SPARCML is built on the observation that many distributed machine learning algorithms either have naturally sparse communication patterns, or have updates which can be sparsified in a structured way for improved performance, without any convergence or accuracy loss. To exploit this insight, we design and implement a set of communication efficient protocols for sparse input data, in conjunction with efficient machine learning algorithms which can leverage these primitives. Our communication protocols generalize standard collective operations, by allowing processes to contribute sparse input data vectors, of heterogeneous sizes. We call these operations sparse-input collectives, and present efficient practical algorithms with strong theoretical bounds on their running time and communication cost. Our generic communication layer is enriched with additional features, such as support for non-blocking (asynchronous) operations, and support for low-precision data representations. We validate our algorithmic results experimentally on a range of large-scale machine learning applications and target architectures, showing that we can leverage sparsity for order-of-magnitude runtime savings, compared to state-of-the-art methods and frameworks.


The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets

Machine learning models based on neural networks and deep learning are being rapidly adopted for many purposes. What those models learn, and what they may share, is a significant concern when the training data may contain secrets and the models are public — e.g., when a model helps users compose text messages using models trained on all users’ messages. This paper presents exposure: a simple-to-compute metric that can be applied to any deep learning model for measuring the memorization of secrets. Using this metric, we show how to extract those secrets efficiently using black-box API access. Further, we show that unintended memorization occurs early, is not due to over-fitting, and is a persistent issue across different types of models, hyperparameters, and training strategies. We experiment with both real-world models (e.g., a state-of-the-art translation model) and datasets (e.g., the Enron email dataset, which contains users’ credit card numbers) to demonstrate both the utility of measuring exposure and the ability to extract secrets. Finally, we consider many defenses, finding some ineffective (like regularization), and others to lack guarantees. However, by instantiating our own differentially-private recurrent model, we validate that by appropriately investing in the use of state-of-the-art techniques, the problem can be resolved, with high utility.


Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics

Learning speaker-specific features is vital in many applications like speaker recognition, diarization and speech recognition. This paper provides a novel approach, we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data that even contain multi-speaker audio streams. The NPC framework exploits the proposed short-term active-speaker stationarity hypothesis which assumes two temporally-close short speech segments belong to the same speaker, and thus a common representation that can encode the commonalities of both the segments, should capture the vocal characteristics of that speaker. We train a convolutional deep siamese network to produce ‘speaker embeddings’ by optimizing a loss function that increases between-speaker variability and decreases within-speaker variability. The trained NPC model can produce these embeddings by projecting any test audio stream into a high dimensional manifold where speech frames of the same speaker come closer than they do in the raw feature space. Results in the frame-level speaker classification experiment along with the visualization of the embeddings manifest the distinctive ability of the NPC model to learn short-term speaker-specific features as compared to raw MFCC features and i-vectors. The utterance-level speaker classification experiments show that concatenating simple statistics of the short-term NPC embeddings over the whole utterance with the utterance-level i-vectors can give useful complementary information to the i-vectors and boost the classification accuracy. The results also show the efficacy of this technique to learn those characteristics from large amounts of unlabeled training set which has no prior information about the environment of the test set.


Multimodal Named Entity Recognition for Short Social Media Posts

We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images. These social media posts often come in inconsistent or incomplete syntax and lexical notations with very limited surrounding textual contexts, bringing significant challenges for NER. To this end, we create a new dataset for MNER called SnapCaptions (Snapchat image-caption pairs submitted to public and crowd-sourced stories with fully annotated named entities). We then build upon the state-of-the-art Bi-LSTM word/character based NER models with 1) a deep image network which incorporates relevant visual context to augment textual information, and 2) a generic modality-attention module which learns to attenuate irrelevant modalities while amplifying the most informative ones to extract contexts from, adaptive to each sample and token. The proposed MNER model with modality attention significantly outperforms the state-of-the-art text-only NER models by successfully leveraging provided visual contexts, opening up potential applications of MNER on myriads of social media platforms.


L2-Nonexpansive Neural Networks

This paper proposes a class of well-conditioned neural networks in which a unit amount of change in the inputs causes at most a unit amount of change in the outputs or any of the internal layers. We develop the known methodology of controlling Lipschitz constants to realize its full potential in maximizing robustness: our linear and convolution layers subsume those in the previous Parseval networks as a special case and allow greater degrees of freedom; aggregation, pooling, splitting and other operators are adapted in new ways, and a new loss function is proposed, all for the purpose of improving robustness. With MNIST and CIFAR-10 classifiers, we demonstrate a number of advantages. Without needing any adversarial training, the proposed classifiers exceed the state of the art in robustness against white-box L2-bounded adversarial attacks. Their outputs are quantitatively more meaningful than ordinary networks and indicate levels of confidence. They are also free of exploding gradients, among other desirable properties.


The Clever Shopper Problem

We investigate a variant of the so-called ‘Internet Shopping Problem’ introduced by Blazewicz et al. (2010), where a customer wants to buy a list of products at the lowest possible total cost from shops which offer discounts when purchases exceed a certain threshold. Although the problem is NP-hard, we provide exact algorithms for several cases, e.g. when each shop sells only two items, and an FPT algorithm for the number of items, or for the number of shops when all prices are equal. We complement each result with hardness proofs in order to draw a tight boundary between tractable and intractable cases. Finally, we give an approximation algorithm and hardness results for the problem of maximising the sum of discounts.


The State of the Art in Integrating Machine Learning into Visual Analytics

Visual analytics systems combine machine learning or other analytic techniques with interactive data visualization to promote sensemaking and analytical reasoning. It is through such techniques that people can make sense of large, complex data. While progress has been made, the tactful combination of machine learning and data visualization is still under-explored. This state-of-the-art report presents a summary of the progress that has been made by highlighting and synthesizing select research advances. Further, it presents opportunities and challenges to enhance the synergy between machine learning and visual analytics for impactful future research directions.


Learning Topic Models by Neighborhood Aggregation

Topic models are one of the most frequently used models in machine learning due to their high interpretability and modular structure. However, extending the model to include supervisory signals, incorporate pre-trained word embedding vectors and add a nonlinear output function is not an easy task because one has to resort to highly intricate approximate inference procedures. In this paper, we show that topic models could be viewed as performing a neighborhood aggregation algorithm where the messages are passed through a network defined over words. Under the network view of topic models, nodes correspond to words in a document and edges correspond to either a relationship describing co-occurring words in a document or a relationship describing the same word in the corpus. The network view allows us to extend the model to include supervisory signals, incorporate pre-trained word embedding vectors and add a nonlinear output function to the model in a simple manner. Moreover, we describe a simple way to train the model that is well suited to a semi-supervised setting where we only have supervisory signals for some portion of the corpus and the goal is to improve prediction performance on the held-out data. Through careful experiments we show that our approach outperforms a state-of-the-art supervised Latent Dirichlet Allocation implementation in both held-out document classification tasks and topic coherence.


Finding Top-k Optimal Sequenced Routes — Full Version

Motivated by many practical applications in logistics and mobility-as-a-service, we study the top-k optimal sequenced routes (KOSR) querying on large, general graphs where the edge weights may not satisfy the triangle inequality, e.g., road network graphs with travel times as edge weights. The KOSR querying strives to find the top-k optimal routes (i.e., with the top-k minimal total costs) from a given source to a given destination, which must visit a number of vertices with specific vertex categories (e.g., gas stations, restaurants, and shopping malls) in a particular order (e.g., visiting gas stations before restaurants and then shopping malls). To efficiently find the top-k optimal sequenced routes, we propose two algorithms PruningKOSR and StarKOSR. In PruningKOSR, we define a dominance relationship between two partially-explored routes. The partially-explored routes that can be dominated by other partially-explored routes are postponed being extended, which leads to a smaller searching space and thus improves efficiency. In StarKOSR, we further improve the efficiency by extending routes in an A* manner. With the help of a judiciously designed heuristic estimation that works for general graphs, the cost of partially explored routes to the destination can be estimated such that the qualified complete routes can be found early. In addition, we demonstrate the high extensibility of the proposed algorithms by incorporating Hop Labeling, an effective label indexing technique for shortest path queries, to further improve efficiency. Extensive experiments on multiple real-world graphs demonstrate that the proposed methods significantly outperform the baseline method. Furthermore, when k=1, StarKOSR also outperforms the state-of-the-art method for the optimal sequenced route queries.


Diversity regularization in deep ensembles

Calibrating the confidence of supervised learning models is important for a variety of contexts where the certainty over predictions should be reliable. However, it has been reported that deep neural network models are often too poorly calibrated for achieving complex tasks requiring reliable uncertainty estimates in their prediction. In this work, we are proposing a strategy for training deep ensembles with a diversity function regularization, which improves the calibration property while maintaining a similar prediction accuracy.


An Analysis of Categorical Distributional Reinforcement Learning

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.


Vector Field Based Neural Networks

A novel Neural Network architecture is proposed using the mathematically and physically rich idea of vector fields as hidden layers to perform nonlinear transformations in the data. The data points are interpreted as particles moving along a flow defined by the vector field, which intuitively represents the desired movement to enable classification. The architecture moves the data points from their original configuration to a new one following the streamlines of the vector field, with the objective of achieving a final configuration where classes are separable. An optimization problem is solved through gradient descent to learn this vector field.
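
The idea of data points moving along a flow can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the vector field is a hand-set sum of Gaussian bumps (the parameterization, the function names, and all numbers are hypothetical), the flow is followed with simple forward Euler steps, and the gradient-descent training of the field is left out entirely.

```python
import math


def gaussian_vector_field(point, centers, directions, width=1.0):
    """Sum of Gaussian bumps; each pushes nearby points along its direction."""
    vx = vy = 0.0
    for (cx, cy), (dx, dy) in zip(centers, directions):
        dist_sq = (point[0] - cx) ** 2 + (point[1] - cy) ** 2
        weight = math.exp(-dist_sq / (2.0 * width ** 2))
        vx += weight * dx
        vy += weight * dy
    return vx, vy


def flow(point, centers, directions, step=0.1, n_steps=10):
    """Move a 2D point along the field with forward Euler steps."""
    x, y = point
    for _ in range(n_steps):
        vx, vy = gaussian_vector_field((x, y), centers, directions)
        x += step * vx
        y += step * vy
    return x, y


# One bump at the origin pushing to the right: a point at the origin is
# carried along +x, while a point far away is left essentially in place.
centers = [(0.0, 0.0)]
directions = [(1.0, 0.0)]
print(flow((0.0, 0.0), centers, directions))
print(flow((100.0, 100.0), centers, directions))
```

In a trained model the centers and directions would be the learned parameters, chosen so that points of different classes drift apart.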


Facilitated quantum cellular automata as simple models with nonthermal eigenstates and dynamics
Machine Theory of Mind
Determining the best classifier for predicting the value of a boolean field on a blood donor database
Aggregating the response in time series regression models, applied to weather-related cardiovascular mortality
Lossless Compression of Angiogram Foreground with Visual Quality Preservation of Background
Generalizable Adversarial Examples Detection Based on Bi-model Decision Mismatch
The Lattice of subracks is atomic
Counting Motifs with Graph Sampling
Left Ventricle Segmentation in Cardiac MR Images Using Fully Convolutional Network
Proving ergodicity via divergence of ergodic sums
Lossless Image Compression Algorithm for Wireless Capsule Endoscopy by Content-Based Classification of Image Blocks
Reversible Image Watermarking for Health Informatics Systems Using Distortion Compensation in Wavelet Domain
Segmentation of Bleeding Regions in Wireless Capsule Endoscopy Images an Approach for inside Capsule Video Summarization
Semantic Segmentation Refinement by Monte Carlo Region Growing of High Confidence Detections
Optimal Multi-User Scheduling of Buffer-Aided Relay Systems
Liver Segmentation in Abdominal CT Images by Adaptive 3D Region Growing
Communication Complexity of One-Shot Remote State Preparation
Continuous Relaxation of MAP Inference: A Nonconvex Perspective
Liver segmentation in CT images using three dimensional to two dimensional fully connected network
A New Hybrid Half-Duplex/Full-Duplex Relaying System with Antenna Diversity
Protecting Sensory Data against Sensitive Inferences
Low complexity convolutional neural network for vessel segmentation in portable retinal diagnostic devices
A Guide to Comparing the Performance of VA Algorithms
Communication Melting in Graphs and Complex Networks
Permanental processes with kernels that are not equivalent to a symmetric matrix
Formalizing and Implementing Distributed Ledger Objects
Phase transition for infinite systems of spiking neurons
Variational Inference for Policy Gradient
Learning to Gather without Communication
Equivelar toroids with few flag-orbits
Mutual Assent or Unilateral Nomination? A Performance Comparison of Intersection and Union Rules for Integrating Self-reports of Social Relationships
CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions
Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation
Concise Complexity Analyses for Trust-Region Methods
Detecting Small, Densely Distributed Objects with Filter-Amplifier Networks and Loss Boosting
Cross-Modality Synthesis from CT to PET using FCN and GAN Networks for Improved Automated Lesion Detection
Driver Hand Localization and Grasp Analysis: A Vision-based Real-time Approach
xView: Objects in Context in Overhead Imagery
MPST: A Corpus of Movie Plot Synopses with Tags
Modelling spatiotemporal variation of positive and negative sentiment on Twitter to improve the identification of localised deviations
Efficient Enumeration of Dominating Sets for Sparse Graphs
Multi-Sensor Integration for Indoor 3D Reconstruction
End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching
Regularity of biased 1D random walks in random environment
Improved Techniques For Weakly-Supervised Object Localization
Entropy Rate Estimation for Markov Chains with Large State Space
A New Design of Binary MDS Array Codes with Asymptotically Weak-Optimal Repair
Learning Mixtures of Linear Regressions with Nearly Optimal Complexity
Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
On the implementation of a primal-dual algorithm for second order time-dependent mean field games with local couplings
Safety-Aware Optimal Control of Stochastic Systems Using Conditional Value-at-Risk
Regional Multi-Armed Bandits
Video Person Re-identification by Temporal Residual Learning
Magnetoresistance in organic semiconductors: including pair correlations in the kinetic equations for hopping transport
Dynamic Output Feedback Guaranteed-Cost Synchronization for Multiagent Networks with Given Cost Budgets
Exploiting Inter-User Interference for Secure Massive Non-Orthogonal Multiple Access
The Hidden Vulnerability of Distributed Learning in Byzantium
Graph-Based Blind Image Deblurring From a Single Photograph
Where’s YOUR focus: Personalized Attention
Faster integer multiplication using short lattice vectors
Adversarial Learning for Semi-Supervised Semantic Segmentation
Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning
Two theorems on distribution of Gaussian quadratic forms
Aspect-Aware Latent Factor Model: Rating Prediction with Ratings and Reviews
On detection of Gaussian stochastic sequences
Numerical integration in arbitrary-precision ball arithmetic
Actigraphy-based Sleep/Wake Pattern Detection using Convolutional Neural Networks
Stereo obstacle detection for unmanned surface vehicles by IMU-assisted semantic segmentation
Non-rigid Object Tracking via Deep Multi-scale Spatial-Temporal Discriminative Saliency Maps
Topological phases of non-Hermitian systems
Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples
Near Isometric Terminal Embeddings for Doubling Metrics
Robustness of classifiers to uniform $\ell_p$ and Gaussian noise
Cambrian acyclic domains: counting $c$-singletons
Learning to Route with Sparse Trajectory Sets—Extended Version
Joint Antenna Selection and Phase-Only Beamforming Using Mixed-Integer Nonlinear Programming
Decomposition of a graph into two disjoint odd subgraphs
Multidimensional multiscale scanning in Exponential Families: Limit theory and statistical consequences
Generating High-Quality Query Suggestion Candidates for Task-Based Search
Robust estimators in a generalized partly linear regression model under monotony constraints
On the permanent of Sylvester-Hadamard matrices
The use of sampling weights in the M-quantile random-effects regression: an application to PISA mathematics scores
Sounderfeit: Cloning a Physical Model with Conditional Adversarial Autoencoders
Iterate averaging as regularization for stochastic gradient descent
Towards an Understanding of Entity-Oriented Search Intents
Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Spanned lines and Langer’s inequality
Structure and Supersaturation for Intersecting Families
On Rational Delegations in Liquid Democracy
The Best of Both Worlds: Asymptotically Efficient Mechanisms with a Guarantee on the Expected Gains-From-Trade
Complex-valued Neural Networks with Non-parametric Activation Functions
Stabilizing discrete-time linear systems
Synchronizing the Smallest Possible System
Are Two (Samples) Really Better Than One? On the Non-Asymptotic Performance of Empirical Revenue Maximization
2VRP: a benchmark problem for small but rich VRPs
Data Consistency Simulation Tool for NoSQL Database Systems
MagnifyMe: Aiding Cross Resolution Face Recognition via Identity Aware Synthesis
Classification of Breast Cancer Histology using Deep Learning
Applications of Optimal Control of a Nonconvex Sweeping Process to Optimization of the Planar Crowd Motion Model
Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Understanding the Performance of Ceph Block Storage for Hyper-Converged Cloud with All Flash Storage
Adaptive synchronisation of unknown nonlinear networked systems with prescribed performance
Sparse Bayesian dynamic network models, with genomics applications
Harmonious Attention Network for Person Re-Identification
A novel incentive-based demand response model for Cournot competition in electricity markets
Scaling limits of discrete snakes with stable branching
Reliable Intersection Control in Non-cooperative Environments
Path-Specific Counterfactual Fairness
Stability and Optimal Control of Switching PDE-Dynamical Systems
A note on friezes of type $Λ_4$ and $Λ_6$
LIDIOMS: A Multilingual Linked Idioms Data Set
RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Collaboratively Learning the Best Option, Using Bounded Memory
Correlation-Adjusted Survival Scores for High-Dimensional Variable Selection
Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity
The spatial Lambda-Fleming-Viot process with fluctuating selection
Large and realistic models of Amorphous Silicon
Large-scale limit of interface fluctuation models
Seeing the forest for the trees? An investigation of network knowledge
Adversarial Examples that Fool both Human and Computer Vision
A Polynomial Time Subsumption Algorithm for Nominal Safe $\mathcal{ELO}_\bot$ under Rational Closure
Half-space Macdonald processes
ChatPainter: Improving Text to Image Generation using Dialogue
A new model for Cerebellar computation
VizWiz Grand Challenge: Answering Visual Questions from Blind People
Tensor Field Networks: Rotation- and Translation-Equivariant Neural Networks for 3D Point Clouds
Achievable Rate of Private Function Retrieval from MDS Coded Databases
Thresholds for vanishing of ‘Isolated’ faces in random Čech and Vietoris-Rips complexes
Quantum linear systems algorithms: a primer
Energy Transfer and Spectra in Simulations of Two-dimensional Compressible Turbulence
A Better (Bayesian) Interval Estimate for Within-Subject Designs
Pattern-based Modeling of Multiresilience Solutions for High-Performance Computing
NetChain: Scale-Free Sub-RTT Coordination (Extended Version)
Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover
What are the most important factors that influence the changes in London Real Estate Prices? How to quantify them?
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Arbitrarily Substantial Number Representation for Complex Number
Characterizing Implicit Bias in Terms of Optimization Geometry

Book Memo: “New Advances in Statistics and Data Science”

This book comprises the presentations delivered at the 25th ICSA Applied Statistics Symposium held at the Hyatt Regency Atlanta, on June 12-15, 2016. This symposium attracted more than 700 statisticians and data scientists working in academia, government, and industry from all over the world. The theme of this conference was the “Challenge of Big Data and Applications of Statistics,” in recognition of the advent of the big data era, and the symposium offered opportunities for learning, drawing inspiration from old research ideas and developing new ones, and promoting further research collaborations in the data sciences. The invited contributions addressed rich topics closely related to big data analysis in the data sciences, reflecting recent advances and major challenges in statistics, business statistics, and biostatistics. Subsequently, the six editors selected 19 high-quality presentations and invited the speakers to prepare full chapters for this book, which showcases new methods in statistics and data sciences, emerging theories, and case applications from statistics, data science and interdisciplinary fields. The topics covered in the book are timely and have great impact on data sciences, identifying important directions for future research, promoting advanced statistical methods in big data science, and facilitating future collaborations across disciplines and between theory and practice.

Document worth reading: “Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning”

Learning-based pattern classifiers, including deep networks, have demonstrated impressive performance in several application domains, ranging from computer vision to computer security. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to adversarial inputs (also known as adversarial examples), along with the design of suitable countermeasures, has been investigated in the research field of adversarial machine learning. In this work, we provide a thorough overview of the evolution of this interdisciplinary research area over the last ten years, starting from pioneering, earlier work up to more recent work aimed at understanding the security properties of deep learning algorithms, in the context of different applications. We report interesting connections between these apparently different lines of work, highlighting common misconceptions related to the evaluation of the security of machine-learning algorithms. We finally discuss the main limitations of current work, along with the corresponding future research challenges towards the design of more secure learning algorithms.

R Packages worth a look

No-U-Turn MCMC Sampling for ‘ADMB’ and ‘TMB’ Models (adnuts)
Bayesian inference using the no-U-turn (NUTS) algorithm by Hoffman and Gelman (2014) <http://…/hoffman14a.html>. Designed for ‘AD Model Builder’ (‘ADMB’) models, or when R functions for log-density and log-density gradient are available, such as ‘Template Model Builder’ (‘TMB’) models and other special cases. Functionality is similar to ‘Stan’, and the ‘rstan’ and ‘shinystan’ packages are used for diagnostics and inference.

Recovering a Basic Space from Issue Scales (basicspace)
Conducts Aldrich-McKelvey and Blackbox Scaling (Poole et al 2016) <doi:10.18637/jss.v069.i07> to recover latent dimensions of judgment.

Object-Oriented Implementation of CRM Designs (crmPack)
Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to set up a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules.

Logit Leaf Model Classifier for Binary Classification (LLM)
Fits the Logit Leaf Model, makes predictions and visualizes the output. (De Caigny et al., (2018) <DOI:10.1016/j.ejor.2018.02.009>).

Mixed-Frequency GARCH Models (mfGARCH)
Estimating GARCH-MIDAS (MIxed-DAta-Sampling) models (Engle, Ghysels, Sohn, 2013, <doi:10.1162/REST_a_00300>) and related statistical inference, accompanying the paper ‘Two are better than one: volatility forecasting using multiplicative component GARCH models’ by Conrad, Kleen (2018, Working Paper). The GARCH-MIDAS model decomposes the conditional variance of (daily) stock returns into a short- and long-term component, where the latter may depend on an exogenous covariate sampled at a lower frequency.