‘fastText’ Wrapper for Text Classification and Word Representation (fastrtext)
‘fastText’ is a free, open-source, lightweight library for learning text representations and text classifiers. It transforms text into continuous vectors that can later be used in many language-related tasks. It runs on standard, generic hardware (no ‘GPU’ required) and includes a model size reduction feature. The original ‘fastText’ source code is available at <https://…/fastText>.

Automated R Instructor (ari)
Create videos from ‘R Markdown’ documents, or from images and audio files. The images can come from image files or HTML slides, and the audio can either be provided by the user or generated as computer voice narration using ‘Amazon Polly’. The purpose of this package is to allow users to create accessible, translatable, and reproducible lecture videos. See <https://…/> for more information.

Polygonal Symbolic Data Analysis (psda)
An implementation of symbolic polygonal data analysis. The package provides estimates of the main descriptive statistical measures, e.g., mean, variance, covariance, correlation and coefficient of variation. It also covers the transformation of data into polygons, an empirical probability distribution function based on the polygonal histogram, and regression models.

Determination of K Using Peak Counts of Features for Clustering (kpeaks)
The input argument k, the number of clusters, is required to start any partitioning clustering algorithm. In unsupervised learning applications, an optimal value of this argument is commonly determined using internal validity indexes. Because these indexes suggest a value of k computed from the clustering results after several runs of a clustering algorithm, they are computationally expensive. In contrast, ‘kpeaks’ makes it possible to estimate k before running any clustering algorithm. It is based on a simple, novel technique that uses descriptive statistics of the peak counts of the features in a data set.
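
The idea of estimating k from peak counts can be illustrated with a rough Python sketch. This is not the ‘kpeaks’ API: the function names (`histogram`, `count_peaks`, `summarize_peak_counts`), the fixed equal-width binning, and the definition of a peak as a strict local maximum of the bin counts are all assumptions made for illustration, and the package may bin and smooth the data differently.

```python
from statistics import mean, median


def histogram(values, bins):
    """Count values in `bins` equal-width bins (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant features
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum value
        counts[idx] += 1
    return counts


def count_peaks(counts):
    """Count local maxima in a list of bin counts.

    Runs of equal counts are collapsed first so a flat top counts as one
    peak and a flat step on a rising slope is not counted at all.
    """
    padded = [0] + list(counts) + [0]
    collapsed = [padded[0]]
    for c in padded[1:]:
        if c != collapsed[-1]:
            collapsed.append(c)
    return sum(
        1
        for i in range(1, len(collapsed) - 1)
        if collapsed[i - 1] < collapsed[i] > collapsed[i + 1]
    )


def summarize_peak_counts(features, bins=5):
    """Return descriptive statistics of the per-feature peak counts."""
    peak_counts = [count_peaks(histogram(f, bins)) for f in features]
    return {
        "peak_counts": peak_counts,
        "mean": mean(peak_counts),
        "median": median(peak_counts),
    }


# Hypothetical data: two bimodal features and one unimodal feature.
features = [
    [1, 1, 2, 1, 1, 9, 9, 10, 9, 9],
    [2, 2, 3, 2, 2, 18, 18, 20, 18, 18],
    [0, 1, 2, 2, 2, 3, 4],
]
summary = summarize_peak_counts(features, bins=5)
```

In this sketch a summary statistic such as the median of the peak counts would serve as the candidate k; which statistic to prefer is left open here.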

A Slow Version of the Rapid Automatic Keyword Extraction (RAKE) Algorithm (slowraker)
A mostly pure-R implementation of the RAKE algorithm (Rose, S., Engel, D., Cramer, N. and Cowley, W. (2010) <doi:10.1002/9780470689646.ch1>), which can be used to extract keywords from documents without any training data.
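
For readers unfamiliar with RAKE, the core scoring idea can be sketched in a few lines of Python. This is a minimal illustration, not the ‘slowraker’ implementation or its API: the tiny stopword list and the function names `candidate_phrases` and `rake` are assumptions, and steps that a full implementation might add (such as stemming or part-of-speech filtering) are omitted. Candidate phrases are runs of words between stopwords and punctuation; each word is scored as degree divided by frequency, and a phrase scores the sum of its word scores.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real use needs a fuller list).
STOPWORDS = {
    "a", "an", "and", "the", "of", "for", "in", "is", "are",
    "to", "on", "which", "can", "be", "from", "without", "any",
}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text):
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs in candidates
    degree = defaultdict(int)  # co-occurrence count, including the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {
        " ".join(phrase): sum(word_score[w] for w in phrase)
        for phrase in phrases
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


text = (
    "Keyword extraction is the task of finding important phrases. "
    "Rapid automatic keyword extraction is fast."
)
keywords = rake(text)
```

Because word scores reward words that appear in long phrases, multi-word candidates tend to rank above single words, which is why no training data is needed.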
