Book Memo: “The Basics of Item Response Theory Using R”

This graduate-level textbook is a tutorial on item response theory that covers both the basics of the theory and the use of R to prepare graphical presentations in writing about it. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text gives the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics.
Rather than presenting the full scope of item response theory, this textbook is concise and practical, presenting basic concepts without becoming enmeshed in the underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item response theory in a practical way. In addition to students of educational measurement, this text will be valuable to measurement specialists working in testing programs at any level who need an understanding of item response theory in order to evaluate its potential in their settings.
Document worth reading: “Transferrable Plausibility Model – A Probabilistic Interpretation of Mathematical Theory of Evidence”

This paper suggests a new interpretation of Dempster-Shafer theory in terms of a probabilistic interpretation of plausibility. A new rule for combining independent evidence is presented, and it is shown to preserve this interpretation. Transferrable Plausibility Model – A Probabilistic Interpretation of Mathematical Theory of Evidence

R Packages worth a look

Network Meta-Analysis using Integrated Nested Laplace Approximations (nmaINLA)
Performs network meta-analysis using integrated nested Laplace approximations (‘INLA’). Includes methods to assess heterogeneity and inconsistency in the network. Contains more than ten different network meta-analysis datasets. Installation of the R package ‘INLA’ is required for use; it can be obtained from <http://www.r-inla.org>. We recommend the testing version.

Detection of Univariate Outliers (univOutl)
Implements well-known outlier detection techniques for the univariate case, including methods that deal with skewed distributions. The Hidiroglou-Berthelot (1986) method for finding outliers in ratios of historical data is implemented as well.
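As a rough illustration of the general idea (a base-R sketch of a robust MAD rule, not the package's own interface):

```r
# Flag values more than 3 robust (MAD-based) z-scores from the median
x <- c(10, 12, 11, 13, 12, 11, 95)   # one obvious outlier
z <- (x - median(x)) / mad(x)        # mad() is scaled for consistency with sd
outliers <- which(abs(z) > 3)
outliers                             # 7: the position of 95
```

Unlike mean/sd rules, the median and MAD are not dragged toward the outlier itself, which is why robust variants like this are the usual starting point.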

‘ggplot2’ Faceting Utilities for Geographical Data (geofacet)
Provides geofaceting functionality for ‘ggplot2’. Geofaceting arranges a sequence of plots of data for different geographical entities into a grid that preserves some of the geographical orientation.

Formula Interface to the Grammar of Graphics (ggformula)
Provides a formula interface to ‘ggplot2’ graphics.

Builds Trees by Sampling Variables from Groups (StratifiedRF)
Random forest that works with groups of predictor variables. When building each tree, a number of variables is taken randomly from each group separately, ensuring that the tree contains variables from every group. Useful when rows contain information about different things (e.g. user information and product information) and it’s not sensible to make a prediction with information from only one group of variables, or when there are far more variables in one group than another and it’s desired to have the groups appear evenly in trees. Trees are grown using the C5.0 algorithm. Currently works for classification only.
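The per-group sampling step can be sketched in a few lines of base R (illustrative names, not the package's API):

```r
# Sketch: draw one candidate variable per group so every split candidate
# set contains variables from all groups
set.seed(1)
groups <- list(user    = c("age", "country", "tenure"),
               product = c("price", "category"))
sample_split <- function(groups, per_group = 1) {
  unlist(lapply(groups, sample, size = per_group), use.names = FALSE)
}
sample_split(groups)   # one variable from the user group, one from product
```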

If you did not already know

Hierarchical Spectral Merger (HSM) google
We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The extent of similarity between a pair of time series is measured using the total variation distance between their estimated spectral densities. At each step of the algorithm, when two clusters merge, a new spectral density is estimated using all the information present in both clusters, which is representative of all the series in the new cluster. The method is implemented in an R package HSMClust. We present two applications of the HSM method, one to data coming from wave-height measurements in oceanography and the other to electroencephalogram (EEG) data. …
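The dissimilarity at the heart of the method is easy to sketch in base R (an illustration of the total variation distance on normalized periodograms, not the HSMClust code):

```r
# Total variation distance between two normalized spectral densities
tv_dist <- function(f, g) {
  p <- f / sum(f); q <- g / sum(g)   # normalize each spectrum to unit mass
  0.5 * sum(abs(p - q))
}
set.seed(1)
f <- spec.pgram(sin(2 * pi * (1:200) / 10) + rnorm(200, sd = 0.1), plot = FALSE)$spec
g <- spec.pgram(sin(2 * pi * (1:200) / 25) + rnorm(200, sd = 0.1), plot = FALSE)$spec
tv_dist(f, g)   # large: the two series oscillate at different frequencies
```

The distance is 0 for identical spectra and at most 1, which makes it a convenient merging criterion.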

FALKON google
Kernel methods provide a principled way to perform nonlinear, nonparametric learning. They rely on solid functional-analytic foundations and enjoy optimal statistical properties. However, at least in their basic form, they have limited applicability in large-scale scenarios because of stringent computational requirements in terms of time and especially memory. In this paper, we take a substantial step in scaling up kernel methods, proposing FALKON, a novel algorithm that can efficiently process millions of points. FALKON is derived by combining several algorithmic principles, namely stochastic projections, iterative solvers and preconditioning. Our theoretical analysis shows that optimal statistical accuracy is achieved with essentially $O(n)$ memory and $O(n\sqrt{n})$ time. Extensive experiments show that state-of-the-art results on available large-scale datasets can be achieved even on a single machine. …
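To place FALKON, it helps to recall the exact estimator it accelerates; a tiny base-R kernel ridge regression looks like this (FALKON replaces the cubic-cost solve below with Nyström sampling plus a preconditioned iterative solver):

```r
# Kernel ridge regression on a small 1-D problem
set.seed(1)
n <- 100
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.1)
K <- exp(-as.matrix(dist(x))^2 / (2 * 0.1^2))   # Gaussian kernel, sigma = 0.1
alpha <- solve(K + 0.01 * diag(n), y)           # (K + lambda*I)^{-1} y: the costly step
yhat <- K %*% alpha
mean((yhat - sin(2 * pi * x))^2)                # small error against the true signal
```

The dense n-by-n solve is exactly what becomes infeasible at millions of points, in memory even before time.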

Stochastic Computing based Deep Convolutional Neural Networks (SC-DCNN) google
With the recent advance of the Internet of Things (IoT), it has become very attractive to implement deep convolutional neural networks (DCNNs) on embedded/portable systems. Presently, executing software-based DCNNs requires high-performance server clusters in practice, restricting their widespread deployment on mobile devices. To overcome this issue, considerable research effort has been devoted to developing highly-parallel, DCNN-specific hardware, utilizing GPGPUs, FPGAs, and ASICs. Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has high potential for implementing DCNNs with high scalability and an ultra-low hardware footprint. Since multiplications and additions can be calculated using AND gates and multiplexers in SC, significant reductions in power/energy and hardware footprint can be achieved compared to conventional binary arithmetic implementations. The tremendous savings in power (energy) and hardware resources open an immense design space for enhancing the scalability and robustness of hardware DCNNs. This paper presents the first comprehensive design and optimization framework for SC-based DCNNs (SC-DCNNs). We first present optimal designs of the function blocks that perform the basic operations, i.e., inner product, pooling, and activation function. Then we propose optimal designs of four types of combinations of basic function blocks, named feature extraction blocks, which are in charge of extracting features from input feature maps. In addition, weight storage methods are investigated to reduce the area and power/energy consumption of storing weights. Finally, the whole SC-DCNN implementation is optimized, with feature extraction blocks carefully selected, to minimize area and power/energy consumption while maintaining a high level of network accuracy. …
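The core trick is easy to demonstrate. In the simpler unipolar format, a value in [0, 1] is the fraction of ones in a random bit-stream, and an AND gate multiplies two independent streams (the paper's [-1, 1] bipolar format uses XNOR instead); a base-R sketch:

```r
# Unipolar stochastic computing: encode values as Bernoulli bit-streams,
# multiply with a bitwise AND
set.seed(42)
n <- 1e5
to_stream <- function(v, n) runif(n) < v   # P(bit = 1) = v
a <- to_stream(0.6, n)
b <- to_stream(0.5, n)
mean(a & b)   # approximately 0.6 * 0.5 = 0.3
```

A single AND gate in place of a binary multiplier is the source of the power and area savings; the price is that accuracy grows only with stream length.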

Book Memo: “Data Analytics in Digital Humanities”

This book covers computationally innovative methods and technologies including data collection and elicitation, data processing, data analysis, data visualizations, and data presentation. It explores how digital humanists have harnessed the hypersociality and social technologies, benefited from the open-source sharing not only of data but of code, and made technological capabilities a critical part of humanities work.
Chapters are written by researchers from around the world, bringing perspectives from diverse fields and subject areas. The respective authors describe their work, their research, and their learning. Topics include semantic web for cultural heritage valorization, machine learning for parody detection by classification, psychological text analysis, crowdsourcing imagery coding in natural disasters, and creating inheritable digital codebooks.
Designed for researchers and academics, this book is suitable for those interested in methodologies and analytics that can be applied in literature, history, philosophy, linguistics, and related disciplines. Professionals such as librarians, archivists, and historians will also find the content informative and instructive.

What’s new on arXiv

Distributed Least-Squares Iterative Methods in Networks: A Survey

Many science and engineering applications involve solving a linear least-squares system formed from field measurements. In distributed cyber-physical systems (CPS), each sensor node used for measurement often knows only a subset of the independent rows of the least-squares system. To compute the least-squares solution, the nodes would need to gather all the measurements at a centralized location and then compute the solution there. Such data collection and computation are inefficient because of bandwidth and time constraints, and are sometimes infeasible because of data privacy concerns. Thus distributed computation is strongly preferred or even demanded in many real-world applications, e.g., smart grids and target tracking. Iterative methods are natural candidates for computing least squares for large sparse systems of linear equations, and there are many studies in this area; however, most of them concern the efficiency of centralized/parallel computation, and only a few are explicitly about distributed computation or have the potential to be applied in distributed networks. This paper surveys representative iterative methods from several research communities. Some of them were not originally designed for this setting, so we slightly modify them to suit our requirements and maintain consistency. For each method, we first sketch the skeleton of the algorithm and then analyze its time-to-completion and communication cost. To the best of our knowledge, this is the first survey of distributed least-squares methods in distributed networks.
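The structure these methods exploit can be shown in a few lines: because the least-squares gradient decomposes over rows, nodes holding disjoint row blocks can solve the system by exchanging only gradient vectors. A base-R sketch of this idea (not any one surveyed algorithm):

```r
# Each of 4 nodes holds 5 rows of (A, b); summing the local gradients
# t(A_i) %*% (A_i x - b_i) reproduces the full gradient of ||Ax - b||^2 / 2
set.seed(1)
A <- matrix(rnorm(40), 20, 2)
x_true <- c(2, -1)
b <- A %*% x_true
nodes <- split(1:20, rep(1:4, each = 5))
x <- c(0, 0)
for (it in 1:500) {
  g <- Reduce(`+`, lapply(nodes, function(i)
    t(A[i, , drop = FALSE]) %*% (A[i, , drop = FALSE] %*% x - b[i])))
  x <- x - 0.05 * as.vector(g) / 20   # step on the averaged gradient
}
round(x, 3)   # recovers x_true = (2, -1)
```

Only the two-element gradient vectors cross the network; the raw rows of A and b never leave their nodes, which is the privacy and bandwidth argument made above.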


The energy landscape of a simple neural network

We explore the energy landscape of a simple neural network. In particular, we expand upon previous work demonstrating that the empirical complexity of fitted neural networks is vastly less than a naive parameter count would suggest and that this implicit regularization is actually beneficial for generalization from fitted models.


MAGIX: Model Agnostic Globally Interpretable Explanations

Explaining the behavior of a black box machine learning model at the instance level is useful for building trust. However, what is also important is understanding how the model behaves globally. Such an understanding provides insight into both the data on which the model was trained and the generalization power of the rules it learned. We present here an approach that learns rules to explain globally the behavior of black box machine learning models. Collectively these rules represent the logic learned by the model and are hence useful for gaining insight into its behavior. We demonstrate the power of the approach on three publicly available data sets.


Explaining Recurrent Neural Network Predictions in Sentiment Analysis

Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a gradient-based related method which was used in previous work.
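For a single dense layer, the standard LRP-epsilon rule (which the paper extends to the multiplicative gates of LSTMs and GRUs) redistributes a layer's relevance to its inputs in proportion to their contributions; a hedged base-R sketch:

```r
# One LRP step for a dense layer: input activations a, weights W (inputs
# in rows, outputs in columns), relevance R of the outputs
lrp_dense <- function(a, W, R, eps = 1e-6) {
  z <- as.vector(t(W) %*% a)     # pre-activations of the outputs
  s <- R / (z + eps * sign(z))   # stabilized ratio R_k / z_k
  a * as.vector(W %*% s)         # relevance assigned to each input
}
a <- c(1, 2)
W <- matrix(c(0.5, 0.5, 1, -1), nrow = 2)
R <- c(1, 0)
lrp_dense(a, W, R)   # sums (approximately) to sum(R): relevance is conserved
```

Applying such a step layer by layer, from the output back to the words, yields the per-word relevances the paper evaluates.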


Explanation in Artificial Intelligence: Insights from the Social Sciences

There has been a recent resurgence in the area of explainable artificial intelligence as researchers and practitioners seek to provide more transparency to their algorithms. Much of this research is focused on explicitly explaining decisions or actions to a human observer, and it should not be controversial to say that, if these techniques are to succeed, the explanations they generate should have a structure that humans accept. However, it is fair to say that most work in explainable artificial intelligence uses only the researchers’ intuition of what constitutes a ‘good’ explanation. There exist vast and valuable bodies of research in philosophy, psychology, and cognitive science on how people define, generate, select, evaluate, and present explanations. This paper argues that the field of explainable artificial intelligence should build on this existing research, and reviews relevant papers from philosophy, cognitive psychology/science, and social psychology, which study these topics. It draws out some important findings, and discusses ways that these can be infused with work on explainable artificial intelligence.


A New Sequence Counted by OEIS Sequence A006012
Response theory of the ergodic many-body delocalized phase: Keldysh Finkel’stein sigma models and the 10-fold way
Interior-proximal primal-dual methods
CAN: Creative Adversarial Networks, Generating ‘Art’ by Learning About Styles and Deviating from Style Norms
Laplacian Simplices
The influence of periodic external fields in multi-agent models with language dynamics
Constrained Bayesian Optimization with Noisy Experiments
K-Adaptability in Two-Stage Mixed-Integer Robust Optimization
A hybrid supervised/unsupervised machine learning approach to solar flare prediction
Cluster Analysis is Convex
‘Parallel Training Considered Harmful?’: Comparing Series-Parallel and Parallel Feedforward Network Training
A 2-spine Decomposition of the Critical Galton-Watson Tree and a Probabilistic Proof of Yaglom’s Theorem
Convergence and Stationary Distributions for Walsh Diffusions
Multiscale Information Decomposition: Exact Computation for Multivariate Gaussian Processes
Generating Long-term Trajectories Using Deep Hierarchical Networks
Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks
The Charming Leading Eigenpair
A Useful Motif for Flexible Task Learning in an Embodied Two-Dimensional Visual Environment
On the Enumeration and Congruences for m-ary Partitions
Multiplicative Pacing Equilibria in Auction Markets
Personalized Automatic Estimation of Self-reported Pain Intensity from Facial Expressions
Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks
A Novel VHR Image Change Detection Algorithm Based on Image Fusion and Fuzzy C-Means Clustering
A bijection of plane increasing trees with relaxed binary trees of right height at most one
Curvature-aware Manifold Learning
Some remarks on boundary operators of Bessel extensions
Shape recognition of volcanic ash by simple convolutional neural network
RelNet: End-to-end Modeling of Entities & Relations
Compressive Statistical Learning with Random Feature Moments
Equilibria, information and frustration in heterogeneous network games with conflicting preferences
The Best-or-Worst and the Postdoc problems
High-Performance Out-of-core Block Randomized Singular Value Decomposition on GPU
Continuum Limit of Posteriors in Graph Bayesian Inverse Problems
From here to infinity – sparse finite versus Dirichlet process mixtures in model-based clustering
Synthesis of Near-regular Natural Textures
Living Labs – An Ethical Challenge for Researchers and Platform Providers
Bounds on energy absorption in quantum systems with long-range interactions
Refined restricted inversion sequences
Distributed Matching between Individuals and Activities with Additively Separable Preferences
Restricted inversion sequences and enhanced $3$-noncrossing partitions
GraphHP: A Hybrid Platform for Iterative Graph Processing
Localization and mobility edges in the off-diagonal quasiperiodic model with slowly varying potentials
Monotonicity Methods for Input-to-State Stability of Nonlinear Parabolic PDEs with Boundary Disturbances
Solidification of porous interfaces and disconnection
Gated-Attention Architectures for Task-Oriented Language Grounding
Convolved subsampling estimation with applications to block bootstrap
Automatic Quality Estimation for ASR System Combination
A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning
Characterization of the ranking induced by the Logarithmic Least Squares Method
Scalable Multi-Class Gaussian Process Classification using Expectation Propagation
Fast Estimation of Haemoglobin Concentration in Tissue Via Wavelet Decomposition
Fractional Partial Differential Equations with Boundary Conditions
Nonlinear Acceleration of Stochastic Algorithms
A note on edge degree and spanning trail containing given edges
Decomposing $C_4$-free graphs under degree constraints
A Minimal Developmental Model Can Increase Evolvability in Soft Robots
Path-dependent Hamilton-Jacobi equations in infinite dimensions
Notes on the replica symmetric solution of the classical and quantum SK model, including the matrix of second derivatives and the spin glass susceptibility
On the non-existence of $srg(76,21,2,7)$
Targeted Undersmoothing
Antimagic orientation of biregular bipartite graphs
Polluted Bootstrap Percolation in Three Dimensions
An End-to-End Computer Vision Pipeline for Automated Cardiac Function Assessment by Echocardiography
Deep Supervision for Pancreatic Cyst Segmentation in Abdominal CT Scans
An approach to reachability analysis for feed-forward ReLU neural networks
Strong Disorder Renormalization for the dynamics of Many-Body-Localized systems : iterative elimination of the fastest degree of freedom via the Floquet expansion
Three-dimensional Cardiovascular Imaging-Genetics: A Mass Univariate Framework
Efficient Convex Optimization with Membership Oracles
Reconstructing the Forest of Lineage Trees of Diverse Bacterial Communities Using Bio-inspired Image Analysis
Tracking Single-Cells in Overcrowded Bacterial Colonies
Pixels to Graphs by Associative Embedding
Constrained Ordered Equilibrium Problems
Girsanov Theorem for Multifractional Brownian Processes
Crystallization of random matrix orbits
Fine-Grained Categorization via CNN-Based Automatic Extraction and Integration of Object-Level and Part-Level Features
Geometric Understanding of the Stability of Power Flow Solutions
On the Complexity and Approximation of the Maximum Expected Value All-or-Nothing Subset
Data-adaptive smoothing for optimal-rate estimation of possibly non-regular parameters
Universal Sampling Rate Distortion
Rational coordination with no communication or conventions
Optimal General Matchings
Single Classifier-based Passive System for Source Printer Classification using Local Texture Features

Distilled News

For Companies, Data Analytics is a Pain; But Why?

1. Analytics is not a vaccine, but a routine workout
2. Insights are just the initiation, and don’t add immediate value to your business
3. Scalability
4. Descriptive analytics is a post-mortem; does it really help?
5. Human intervention in analytics is a friend and a foe too
6. Opportunity cost is huge; stale answers make dents
7. Manually intensive
8. Numerical data is analyzed, but what about categorical values?
9. Users without expertise
10. Increased lead time to value


An introduction to Support Vector Machines (SVM)

So you’re working on a text classification problem. You’re refining your training set, and maybe you’ve even tried stuff out using Naive Bayes. But now you’re feeling confident in your dataset, and want to take it one step further. Enter Support Vector Machines (SVM): a fast and dependable classification algorithm that performs very well with a limited amount of data. Perhaps you have dug a bit deeper, and ran into terms like linearly separable, kernel trick and kernel functions. But fear not! The idea behind the SVM algorithm is simple, and applying it to natural language classification doesn’t require most of the complicated stuff. Before continuing, we recommend reading our guide to Naive Bayes classifiers first, since a lot of the things regarding text processing that are said there are relevant here as well. Done? Great! Let’s move on.
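If you want to see the machinery with no libraries at all, here is a minimal linear SVM fitted by Pegasos-style stochastic subgradient descent on the hinge loss, a base-R sketch rather than production code:

```r
# Two well-separated 2-D classes; learn w, b minimizing hinge loss + L2 penalty
set.seed(1)
x <- rbind(matrix(rnorm(40, mean =  2), ncol = 2),
           matrix(rnorm(40, mean = -2), ncol = 2))
y <- rep(c(1, -1), each = 20)
w <- c(0, 0); b <- 0; lambda <- 0.01
for (t in 1:2000) {
  i <- sample(40, 1)
  eta <- 1 / (lambda * t)
  if (y[i] * (sum(w * x[i, ]) + b) < 1) {   # margin violated: push the plane
    w <- (1 - eta * lambda) * w + eta * y[i] * x[i, ]
    b <- b + eta * y[i]
  } else {
    w <- (1 - eta * lambda) * w
  }
}
mean(sign(x %*% w + b) == y)   # training accuracy
```

For text classification you would feed in bag-of-words or tf-idf vectors as x; in practice a tuned library implementation (e.g. e1071 in R) is the sensible choice.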


Taxonomy of Methods for Deep Meta Learning

Let’s talk about Meta-Learning, because this is one confusing topic. I wrote a previous post about Deconstructing Meta-Learning which explored “Learning to Learn”. I realized, though, that there is another kind of Meta-Learning that practitioners are more familiar with. This kind of Meta-Learning can be understood as algorithms that search for and select different DL architectures. Hyper-parameter optimization is an instance of this; however, there are other, more elaborate algorithms that follow the same prescription of searching for architectures.


Set Theory Arbitrary Union and Intersection Operations with R

Part 3 of 3 in the series Set Theory
• Introduction to Set Theory and Sets with R
• Set Operations Unions and Intersections in R
• Set Theory Arbitrary Union and Intersection Operations with R
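As a taste of the third installment, arbitrary unions and intersections over a whole family of sets reduce to folding the binary operations with Reduce():

```r
# Fold union/intersect over a list of sets
sets <- list(A = c(1, 2, 3), B = c(2, 3, 4), C = c(3, 4, 5))
Reduce(union, sets)       # 1 2 3 4 5
Reduce(intersect, sets)   # 3
```

This mirrors the mathematical definitions: the union (intersection) over an indexed family is the repeated pairwise union (intersection) of its members.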


Interactive R visuals in Power BI

Power BI has long had the capability to include custom R charts in dashboards and reports. But in sharp contrast to standard Power BI visuals, these R charts were static. While R charts would update when the report data was refreshed or filtered, it wasn’t possible to interact with an R chart on the screen (to display tool-tips, for example).


Face Recognition in R

OpenCV is an incredibly powerful tool to have in your toolbox. I have had a lot of success using it in Python but very little success in R. I haven’t done too much other than searching Google, but it seems as if “imager” and “videoplayR” provide a lot of the functionality, though not all of it. I have never actually called Python functions from R before. Initially, I tried the “rPython” library – it has a lot of advantages, but it was completely unnecessary for me, so system() worked absolutely fine. While this example is extremely simple, it should help to illustrate how easy it is to utilize the power of Python from within R. I need to give credit to Harrison Kinsley for all of his efforts and work at PythonProgramming.net – I used a lot of his code and ideas for this post (especially the Python portion). Using videoplayR, I created a function which takes a picture with my webcam and saves it as “originalWebcamShot.png”.


Data Wrangling: Reshaping

Data wrangling, the process of importing, cleaning and transforming raw data into actionable information for analysis, is a task of great importance. It is a time-consuming process, estimated to take about 60-80% of an analyst’s time. In this series we will go through this process. It will be a brief series whose goal is to sharpen the reader’s skills at the data wrangling task. This is the second part of the series, and it covers the reshaping of data used to turn it into a tidy form. By tidy form, we mean that each feature forms a column and each observation forms a row.
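A minimal example of the kind of reshaping meant here, using base R’s reshape() (tidyr’s gather()/spread() would do the same job):

```r
# Wide table with one column per year -> long (tidy) form:
# one row per (id, year) observation
wide <- data.frame(id = 1:2, y2016 = c(10, 20), y2017 = c(11, 22))
long <- reshape(wide, varying = c("y2016", "y2017"), v.names = "value",
                timevar = "year", times = c(2016, 2017), direction = "long")
long[order(long$id), c("id", "year", "value")]
```

After reshaping, "year" is an ordinary column you can filter, group and plot, which is the whole point of the tidy form.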

R Packages worth a look

R Session Information (sessioninfo)
Query and print information about the current R session. It is similar to ‘utils::sessionInfo()’, but includes more information about packages, and where they were installed from.

Convert Tibbles or Data Frames to Xts Easily (tbl2xts)
Facilitates movement from data frames to ‘xts’. Particularly useful when moving from the ‘tidyverse’ to the widely used ‘xts’ package, which is the input format of choice for various other packages. It also allows the user to supply a ‘spread_by’ argument to spread a character column during ‘xts’ conversion.

2-Stage Clinical Trial Design and Analysis (preference)
Design and analyze two-stage randomized trials with a continuous outcome measure. The package contains functions to compute the required sample size to detect given preference, treatment, and selection effects; alternatively, it contains functions that report the study power for a fixed sample size. Finally, analysis functions are provided to test each effect using either summary data (i.e., means and variances) or raw study data.

Construct Process Maps Using Event Data (processmapR)
Visualization of process maps based on event logs, in the form of directed graphs. Part of the ‘bupaR’ framework.

Estimates, Plots and Evaluates Leaf Angle Distribution Functions, Calculates Extinction Coefficients (RLeafAngle)
Leaf angle distribution is described by a number of functions (e.g. ellipsoidal, Beta and rotated ellipsoidal). The parameters of leaf angle distribution functions are estimated through different empirical relationships. This package estimates the parameters of different leaf angle distribution functions, plots and evaluates leaf angle distribution functions, and calculates extinction coefficients given a leaf angle distribution. Reference: Wang (2007) <doi:10.1016/j.agrformet.2006.12.003>.

Magister Dixit

“Deep EHR: A Survey of Recent Advances on Deep Learning Techniques for Electronic Health Record (EHR) Analysis” The past decade has seen an explosion in the amount of digital information stored in electronic health records (EHR). While primarily designed for archiving patient clinical information and administrative healthcare tasks, many researchers have found secondary use of these records for various clinical informatics tasks. Over the same period, the machine learning community has seen widespread advances in deep learning techniques, which also have been successfully applied to the vast amount of EHR data. In this paper, we review these deep EHR systems, examining architectures, technical aspects, and clinical applications. We also identify shortcomings of current techniques and discuss avenues of future research for EHR-based deep learning.