If you did not already know

Tagger google
We present a framework for efficient perceptual inference that explicitly reasons about the segmentation of its inputs and features. Rather than being trained for any specific segmentation, our framework learns the grouping process in an unsupervised manner or alongside any supervised task. By enriching the representations of a neural network, we enable it to group the representations of different objects in an iterative manner. By allowing the system to amortize the iterative inference of the groupings, we achieve very fast convergence. In contrast to many other recently proposed methods for addressing multi-object scenes, our system does not assume the inputs to be images and can therefore directly handle other modalities. For multi-digit classification of very cluttered images that require texture segmentation, our method offers improved classification performance over convolutional networks despite being fully connected. Furthermore, we observe that our system greatly improves on the semi-supervised result of a baseline Ladder network on our dataset, indicating that segmentation can also improve sample efficiency. …

Attention Economy google
Attention economics is an approach to the management of information that treats human attention as a scarce commodity, and applies economic theory to solve various information management problems. …

Variational Continual Learning google
This paper develops variational continual learning (VCL), a simple but general framework for continual learning that fuses online variational inference (VI) and recent advances in Monte Carlo VI for neural networks. The framework can successfully train both deep discriminative models and deep generative models in complex continual learning settings where existing tasks evolve over time and entirely new tasks emerge. Experimental results show that variational continual learning outperforms state-of-the-art continual learning methods on a variety of tasks, avoiding catastrophic forgetting in a fully automatic way. …


Document worth reading: “Network Community Detection: A Review and Visual Survey”

Community structure is an important area of research that has received considerable attention from the scientific community. Despite its importance, one of the key problems in locating information about community detection is the diverse spread of related articles across various disciplines. To the best of our knowledge, there is no current comprehensive review of recent literature which uses a scientometric analysis, based on complex network analysis, covering all relevant articles from the Web of Science (WoS). Here we present a visual survey of key literature using CiteSpace. The idea is to identify emerging trends and to use network techniques to examine the evolution of the domain. Towards that end, we identify the most influential, central, and active nodes using scientometric analyses. We examine authors, key articles, cited references, core subject categories, key journals, institutions, and countries. The exploration of the scientometric literature of the domain reveals that Yong Wang is a pivot node with the highest centrality. Additionally, we observe that Mark Newman is the most highly cited author in the network, and that the journal ‘Reviews of Modern Physics’ has the strongest citation burst. In terms of cited documents, an article by Andrea Lancichinetti has the highest centrality score. We also find that the key publications in this domain originate from the United States, whereas Scotland has the strongest and longest citation burst. Finally, the categories of ‘Computer Science’ and ‘Engineering’ lead other categories based on frequency and centrality, respectively. Network Community Detection: A Review and Visual Survey

R Packages worth a look

Chain Event Graph (ceg)
Create and learn Chain Event Graph (CEG) models using a Bayesian framework. The package provides a hierarchical agglomerative algorithm to search the CEG model space, along with several facilities for visualising the objects associated with a CEG. The CEG class can represent a range of relational data types, and supports arbitrary vertex, edge and graph attributes. A Chain Event Graph is a tree-based graphical model that provides a powerful graphical interface through which domain experts can easily translate a process into sequences of observed events using plain language. CEGs are a useful class of graphical models, especially for capturing context-specific conditional independences. References: Collazo R, Gorgen C, Smith J. Chain Event Graph. CRC Press, ISBN 9781498729604, 2018 (forthcoming); and Barday LM, Collazo RA, Smith JQ, Thwaites PA, Nicholson AE. The Dynamic Chain Event Graph. Electronic Journal of Statistics, 9 (2) 2130-2169 <doi:10.1214/15-EJS1068>.

Project Future Case Incidence (projections)
Provides functions and graphics for projecting daily incidence based on past incidence, and estimates of the serial interval and reproduction number. Projections are based on a branching process using a Poisson-distributed number of new cases per day, similar to the model used for estimating R0 in ‘EpiEstim’ or in ‘earlyR’, and described by Nouvellet et al. (2017) <doi:10.1016/j.epidem.2017.02.012>.
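The branching process the blurb describes can be sketched in a few lines of Python. This is a toy illustration of the idea, not the package's actual API; the function name, arguments, and the handling of the serial-interval weights are all assumptions for the sketch.

```python
import numpy as np

def project_incidence(past_incidence, serial_interval, R, n_days, rng=None):
    """Project daily incidence with a Poisson branching process.

    Each new day's expected case count is R times the force of
    infection: past incidence weighted by the serial-interval
    distribution. The realized count is Poisson around that mean.
    """
    rng = rng or np.random.default_rng(0)
    w = np.asarray(serial_interval, dtype=float)
    w = w / w.sum()                    # normalize serial-interval weights
    incid = list(past_incidence)
    for _ in range(n_days):
        recent = incid[::-1][:len(w)]  # most recent days first
        lam = R * sum(i * p for i, p in zip(recent, w))
        incid.append(int(rng.poisson(lam)))
    return incid[len(past_incidence):]
```

With R > 1 the projected trajectory tends to grow; with R < 1 it tends to die out, mirroring the behaviour of branching-process epidemic models.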

Frequent Pattern Mining Outliers (fpmoutliers)
Algorithms for detecting outliers based on frequent pattern mining. Such algorithms follow the paradigm: if an instance contains many frequent patterns, it is unlikely to be an anomaly (He Zengyou, Xu Xiaofei, Huang Zhexue Joshua, Deng Shengchun (2005) <doi:10.2298/CSIS0501103H>). The package implements a list of existing state-of-the-art algorithms as well as other published approaches: FPI, WFPI, FPOF, FPCOF, LFPOF, MFPOF, WCFPOF and WFPOF.
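The FPOF idea behind this family of methods can be sketched with a brute-force toy (illustrative only, not the package's implementation): an instance's score averages the supports of the frequent patterns it contains, so instances covered by few frequent patterns score low and are flagged as likely outliers.

```python
from itertools import combinations

def frequent_patterns(transactions, min_support):
    """Enumerate all itemsets meeting min_support (brute force, toy data)."""
    items = sorted({i for t in transactions for i in t})
    n = len(transactions)
    patterns = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            support = sum(set(cand) <= set(t) for t in transactions) / n
            if support >= min_support:
                patterns[frozenset(cand)] = support
    return patterns

def fpof(transactions, min_support=0.5):
    """Frequent Pattern Outlier Factor: lower score -> more outlying."""
    pats = frequent_patterns(transactions, min_support)
    scores = []
    for t in transactions:
        covered = [s for p, s in pats.items() if p <= set(t)]
        scores.append(sum(covered) / len(pats) if pats else 0.0)
    return scores
```

On a toy data set such as `[{'a','b'}, {'a','b'}, {'a','b'}, {'c'}]`, the lone `{'c'}` transaction contains no frequent pattern and gets the lowest score.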

A Menu-Driven GUI for Analyzing and Modelling Data of Just Finance and Econometrics (JFE)
The Just Finance and Econometrics package (‘JFE’) provides a ‘tcltk’-based interface for global asset selection and portfolio optimization. ‘JFE’ aims to provide a simple GUI that allows a user to quickly load data from a .RData (.rda) file, explore the data and evaluate financial models. Invoked as JFE(), ‘JFE’ exports a number of utility functions for visualizing asset prices (e.g. technical charting) and returns, selecting assets by performance index (based on the package ‘PerformanceAnalytics’) and backtesting specific portfolio profiles (based on the package ‘fPortfolio’).

Continuous Counterfactual Analysis (ccfa)
Contains methods for computing counterfactuals with a continuous treatment variable as in Callaway and Huang (2017) <>. In particular, the package can be used to calculate the expected value, the variance, the interquantile range, the fraction of observations below or above a particular cutoff, or other user-supplied functions of an outcome of interest conditional on a continuous treatment. The package can also be used for computing these same functionals after adjusting for differences in covariates at different values of the treatment. Further, one can use the package to conduct uniform inference for each parameter of interest across all values of the treatment, uniformly test whether adjusting for covariates makes a difference at any value of the treatment, and test whether a parameter of interest is different from its average value at any value of the treatment.
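One standard way to estimate such conditional functionals at a given treatment value is local kernel smoothing. The sketch below is a generic Nadaraya-Watson estimate of the conditional mean outcome, offered only to illustrate the idea of "an outcome functional conditional on a continuous treatment"; it is not the estimator the package actually uses.

```python
import numpy as np

def conditional_mean(y, d, d0, bandwidth):
    """Nadaraya-Watson estimate of E[Y | D = d0]: a kernel-weighted
    average of outcomes for observations with treatment near d0."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.exp(-0.5 * ((d - d0) / bandwidth) ** 2)  # Gaussian kernel weights
    return float(np.sum(w * y) / np.sum(w))
```

Other functionals (variance, tail fractions) follow the same pattern by replacing the outcome with the relevant transformation before averaging.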

Interpretive Structural Modelling (ISM) (ISM)
ISM was developed by Warfield in 1974. It is a process for organizing distinct or related elements into a simplified, structured format. Hence, ISM is a methodology that identifies the interrelationships among the various elements considered and provides a hierarchical, multilevel structure. To run this package, the user needs to provide a matrix (VAXO) converted into 0’s and 1’s. Warfield, J.N. (1974) <doi:10.1109/TSMC.1974.5408524>; Warfield, J.N. (1974, E-ISSN:2168-2909).
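The first computational step of ISM, turning the 0/1 direct-relation matrix into a reachability matrix, is a transitive closure. A minimal sketch (Warshall's algorithm; illustrative, not the package's code):

```python
def reachability(adj):
    """Transitive closure of a binary relation matrix: the
    reachability matrix from which ISM levels are derived."""
    n = len(adj)
    # Start from the direct relations plus the reflexive diagonal.
    r = [[adj[i][j] or (i == j) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                r[i][j] = r[i][j] or (r[i][k] and r[k][j])
    return [[int(v) for v in row] for row in r]
```

Level partitioning then compares each element's reachability set with its antecedent set to place elements in the hierarchy.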

Book Memo: “Envisioning Information”

This book celebrates escapes from the flatlands of both paper and computer screen, showing superb displays of high-dimensional complex data. The most design-oriented of Edward Tufte’s books, Envisioning Information shows maps, charts, scientific presentations, diagrams, computer interfaces, statistical graphics and tables, stereo photographs, guidebooks, courtroom exhibits, timetables, use of color, a pop-up, and many other wonderful displays of information. The book provides practical advice about how to explain complex material by visual means, with extraordinary examples to illustrate the fundamental principles of information displays. Topics include escaping flatland, color and information, micro/macro designs, layering and separation, small multiples, and narratives. Winner of 17 awards for design and content. 400 illustrations with exquisite 6- to 12-color printing throughout. Highest quality design and production.

R Packages worth a look

Interchange Tools for Multi-Parameter Spatiotemporal Data (mudata2)
Formatting and structuring multi-parameter spatiotemporal data is often a time-consuming task. This package offers functions and data structures designed to easily organize and visualize these data for applications in geology, paleolimnology, dendrochronology, and paleoclimate.

Spatial Downscaling using the Dissever Algorithm (dissever)
Spatial downscaling from a coarse-grid map to a fine-grid map using predictive covariates and a model fitted with the ‘caret’ package. The original dissever algorithm was published by Malone et al. (2012) <doi:10.1016/j.cageo.2011.08.021> and extended by Roudier et al. (2017) <doi:10.1016/j.compag.2017.08.021>.
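The mass-preserving adjustment at the heart of dissever-style downscaling can be sketched as follows. This omits the iterative model refitting of the full algorithm, and the names are illustrative, not the package's API: fine-grid model predictions are rescaled so that their mean within each coarse cell matches that cell's observed value.

```python
import numpy as np

def mass_preserving_adjust(coarse_value, cell_id, fine_pred):
    """Rescale fine-grid predictions so the mean within each coarse
    cell equals that cell's observed coarse value."""
    fine_pred = np.asarray(fine_pred, dtype=float)
    cell_id = np.asarray(cell_id)
    out = fine_pred.copy()
    for c, v in coarse_value.items():
        mask = cell_id == c
        out[mask] *= v / fine_pred[mask].mean()  # per-cell scaling factor
    return out
```

In the full algorithm this adjustment alternates with refitting the covariate model on the adjusted values until convergence.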

CARTOColors’ Palettes (rcartocolor)
Provides color schemes for maps and other graphics designed by ‘CARTO’ as described at <https://…/>. It includes four types of palettes: aggregation, diverging, qualitative, and quantitative.

Parsimonious Model-Based Clustering with Covariates (MoEClust)
Clustering via parsimonious Mixtures of Experts using the MoEClust models introduced by Murphy and Murphy (2017) <arXiv:1711.05632>. This package fits finite Gaussian mixture models with gating and expert network covariates using parsimonious covariance parameterisations from ‘mclust’ via the EM algorithm. Visualisation of the results of such models using generalised pairs plots is also facilitated.

Distance Measures for Networks (NetworkDistance)
Networks are a prevalent form of data structure in many fields. As objects of analysis, many distance or metric measures have been proposed to define the concept of similarity between two networks. We provide a number of distance measures for networks; see Jurman et al. (2011) <doi:10.3233/978-1-60750-692-8-227> for an overview of the spectral class of inter-graph distance measures.
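A simple member of the spectral class of inter-graph distances mentioned above compares the sorted eigenvalues of the two adjacency matrices. A sketch for undirected graphs of equal size (illustrative, not the package's API):

```python
import numpy as np

def spectral_distance(A, B):
    """Euclidean distance between sorted adjacency spectra of two
    undirected graphs with the same number of nodes."""
    ea = np.sort(np.linalg.eigvalsh(np.asarray(A, dtype=float)))
    eb = np.sort(np.linalg.eigvalsh(np.asarray(B, dtype=float)))
    return float(np.linalg.norm(ea - eb))
```

Isomorphic graphs always have distance zero under such a measure, although the converse does not hold (cospectral non-isomorphic graphs exist), which is one reason many competing network distances have been proposed.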

Simulation and Analysis Tools for Clinical Dose Response Modeling (clinDR)
Bayesian and ML Emax model fitting, graphics and simulation for clinical dose response. The summary data from the dose response meta-analyses in Thomas, Sweeney, and Somayaji (2014) <doi:10.1080/19466315.2014.924876> and Thomas and Roy (2016) <doi:10.1080/19466315.2016.1256229> are included in the package. The prior distributions for the Bayesian analyses default to the posterior predictive distributions derived from these references.
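The Emax model referred to above has a simple closed form, E(d) = E0 + Emax * d^h / (ED50^h + d^h), where h is the Hill coefficient (h = 1 gives the hyperbolic Emax model). A minimal sketch of the curve itself, separate from any fitting machinery:

```python
def emax_response(dose, e0, emax, ed50, hill=1.0):
    """(Sigmoidal) Emax dose-response curve:
    E(d) = E0 + Emax * d^h / (ED50^h + d^h).
    At d = 0 the response is E0; at d = ED50 it is E0 + Emax/2;
    as d grows the response approaches E0 + Emax."""
    return e0 + emax * dose**hill / (ed50**hill + dose**hill)
```

Fitting then amounts to estimating E0, Emax, ED50 (and possibly h) from dose-response data, by maximum likelihood or, as in Bayesian approaches, with priors such as the meta-analytic ones described in the references above.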

Document worth reading: “Supervised Speech Separation Based on Deep Learning: An Overview”

Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised separation algorithms have been put forward. In particular, the recent introduction of deep learning to supervised speech separation has dramatically accelerated progress and boosted separation performance. This article provides a comprehensive overview of the research on deep learning based supervised speech separation in the last several years. We first introduce the background of speech separation and the formulation of supervised separation. Then we discuss three main components of supervised separation: learning machines, training targets, and acoustic features. Much of the overview is on separation algorithms where we review monaural methods, including speech enhancement (speech-nonspeech separation), speaker separation (multi-talker separation), and speech dereverberation, as well as multi-microphone techniques. The important issue of generalization, unique to supervised learning, is discussed. This overview provides a historical perspective on how advances are made. In addition, we discuss a number of conceptual issues, including what constitutes the target source. Supervised Speech Separation Based on Deep Learning: An Overview

If you did not already know

Cubist google
Cubist is a powerful tool for generating rule-based models that balance the need for accurate prediction against the requirements of intelligibility. Cubist models generally give better results than those produced by simple techniques such as multivariate linear regression, while also being easier to understand than neural networks. …

wakefield google
wakefield is a GitHub-based R package designed to quickly generate random data sets. The user passes n (the number of rows) and predefined vectors to the r_data_frame function to produce a dplyr::tbl_df object. …

Fuzzy Clustering google
Fuzzy clustering is a class of algorithms for cluster analysis in which the allocation of data points to clusters is not “hard” (all-or-nothing) but “fuzzy” in the same sense as fuzzy logic. …
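The "soft" allocation can be made concrete with the membership update of fuzzy c-means, the canonical fuzzy clustering algorithm: each point receives a degree of membership in every cluster, inversely related to its relative distance from each centre, with memberships summing to one. A minimal sketch of that single step (not a full clustering loop):

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0):
    """Fuzzy c-means membership update: degree of belonging of each
    point to each cluster, with fuzzifier m > 1 (m -> 1 approaches
    hard assignment). Rows of the result sum to 1."""
    X = np.asarray(X, dtype=float)
    C = np.asarray(centers, dtype=float)
    # Pairwise point-to-centre distances (small epsilon avoids 0/0).
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)
```

A full fuzzy c-means run alternates this update with recomputing each centre as the membership-weighted mean of the points.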