Decision Trees (dtree)
Combines several decision tree algorithms with linear regression and ensemble methods in one package. Supports both continuous and categorical outcomes. Optionally quantifies the (in)stability of the decision tree methods, indicating when results can be trusted and when ensemble methods may be preferable.
Create and Manipulate Vocalisation Diagrams (vocaldia)
Create adjacency matrices of vocalisation graphs from dataframes containing sequences of speech and silence intervals, transform these matrices into Markov diagrams, and generate datasets for classification of these diagrams by ‘flattening’ them and adding global properties (functionals), etc. Vocalisation diagrams date back to early work in psychiatry (Jaffe and Feldstein, 1970) and social psychology (Dabbs and Ruback, 1987) but have only recently been employed as a data representation method for machine learning tasks including meeting segmentation (Luz, 2012) <doi:10.1145/2328967.2328970> and classification (Luz, 2013) <doi:10.1145/2522848.2533788>.
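As a rough illustration of the first two steps described above (counting transitions between speech and silence states, then normalising the counts into Markov transition probabilities), here is a minimal Python sketch. It does not use the vocaldia API: the function names `adjacency_counts` and `to_markov` and the state labels are hypothetical, and the input is simplified to a plain sequence of state labels rather than a dataframe of timed intervals.

```python
from collections import defaultdict


def adjacency_counts(states):
    """Count transitions between consecutive states (hypothetical helper)."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1
    return counts


def to_markov(counts):
    """Row-normalise transition counts into transition probabilities."""
    markov = {}
    for src, row in counts.items():
        total = sum(row.values())
        markov[src] = {dst: n / total for dst, n in row.items()}
    return markov


# Toy sequence: two speakers ("A", "B") and silence ("Floor").
# The labels are illustrative assumptions, not taken from the package.
sequence = ["A", "Floor", "B", "Floor", "A", "B", "A", "Floor"]
markov = to_markov(adjacency_counts(sequence))
```

Each row of `markov` sums to 1, so it can be read as the outgoing edge weights of one node in the diagram.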
Tools for Principal Component Analysis-Based Data Structure Comparisons (PCADSC)
A suite of non-parametric, visual tools for assessing differences in data structure between two datasets that contain different observations of the same variables. These tools are all based on Principal Component Analysis (PCA) and thus effectively address differences in the structures of the covariance matrices of the two datasets. The PCADSC tools consist of easy-to-use, intuitive plots that each focus on different aspects of the PCA decompositions. The cumulative eigenvalue (CE) plot describes differences in the variance components (eigenvalues) of the deconstructed covariance matrices. The angle plot presents the information loss when moving from the PCA decomposition of one dataset to the PCA decomposition of the other. The chroma plot describes the loading patterns of the two datasets, thereby presenting the relative weighting and importance of the variables from the original dataset.
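To make the idea behind the cumulative eigenvalue comparison concrete, the following Python sketch computes cumulative eigenvalue proportions for two toy datasets. It is a simplification under stated assumptions, not the PCADSC implementation: it handles only two variables (so the eigenvalues of the 2x2 covariance matrix have a closed form), and the function names and data are hypothetical.

```python
import math


def covariance_2d(xs, ys):
    """Sample variances and covariance of two variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy


def cumulative_eigenvalues(sxx, sxy, syy):
    """Cumulative share of variance of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    mean = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [mean + radius, mean - radius]  # largest first
    total = sum(eigenvalues)
    running, shares = 0.0, []
    for value in eigenvalues:
        running += value
        shares.append(running / total)
    return shares


# Dataset A: the two variables are strongly correlated.
ce_a = cumulative_eigenvalues(*covariance_2d([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]))
# Dataset B: same first variable, weaker correlation with the second.
ce_b = cumulative_eigenvalues(*covariance_2d([1, 2, 3, 4, 5], [3, 1, 4, 2, 5]))
```

Comparing `ce_a` and `ce_b` shows that the first component accounts for a larger share of the variance in dataset A than in dataset B, which is the kind of structural difference the CE plot is described as exposing.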
Ridge Regression with Automatic Selection of the Penalty Parameter (ridge)
Linear and logistic ridge regression functions. Additionally includes special functions for genome-wide single-nucleotide polymorphism (SNP) data.
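For readers unfamiliar with the role of the penalty parameter in linear ridge regression, here is a minimal Python sketch for the simplest case: one centred predictor and no intercept. It is not the package's code and does not reproduce its automatic penalty selection; the penalty is passed in by hand, and the function name `ridge_slope` and the data are hypothetical.

```python
def ridge_slope(x, y, penalty):
    """Ridge estimate for a single centred predictor without intercept.

    Minimises sum((y_i - b * x_i)^2) + penalty * b^2, whose closed-form
    solution is b = sum(x_i * y_i) / (sum(x_i^2) + penalty).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + penalty)


# Toy data with a true slope of roughly 2 (illustrative values only).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

unpenalised = ridge_slope(x, y, penalty=0.0)  # ordinary least squares
shrunk = ridge_slope(x, y, penalty=10.0)      # coefficient pulled toward zero
```

A larger penalty shrinks the coefficient toward zero, trading bias for lower variance; choosing that penalty is the step the package title says it automates.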
Manage Cached Files (hoardr)
Suite of tools for managing cached files, targeting use in other R packages. Uses ‘rappdirs’ for cross-platform paths. Provides utilities to manage cache directories, including targeting files by path or by key; cached directories can be compressed and uncompressed easily to save disk space.