Yakmo Yakmo implements robust, efficient k-means clustering with the triangle inequality and smart initialization, while supporting alternative clustering outputs. Using the triangle inequality lets k-means skip unnecessary distance calculations, while the smart initialization by randomized seeding (k-means++) not only improves solution accuracy but also accelerates convergence. In addition, you can obtain alternative clusterings via orthogonalization.
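The two ideas in the Yakmo entry can be illustrated with a minimal Python sketch. This is not Yakmo's own code: the function names `kmeanspp_seeds` and `assign` are hypothetical, and the skipping rule shown is the simplest triangle-inequality test (if the distance between the current best center and another center is at least twice the distance from the point to the current best center, the other center cannot be closer).

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeanspp_seeds(points, k, rng):
    """k-means++ seeding: first seed uniform, later seeds with probability
    proportional to the squared distance to the nearest seed chosen so far."""
    seeds = [rng.choice(points)]
    d2 = [dist(p, seeds[0]) ** 2 for p in points]
    while len(seeds) < k:
        if sum(d2) == 0:
            # Every point coincides with a seed; fall back to a uniform pick.
            seeds.append(rng.choice(points))
        else:
            seeds.append(rng.choices(points, weights=d2, k=1)[0])
        d2 = [min(old, dist(p, seeds[-1]) ** 2) for old, p in zip(d2, points)]
    return seeds

def assign(points, centers):
    """Assign each point to its nearest center, skipping distance
    computations that the triangle inequality proves unnecessary."""
    k = len(centers)
    cc = [[dist(centers[i], centers[j]) for j in range(k)] for i in range(k)]
    labels, skipped = [], 0
    for p in points:
        best, best_d = 0, dist(p, centers[0])
        for j in range(1, k):
            # d(p, c_j) >= d(c_best, c_j) - d(p, c_best) >= d(p, c_best)
            if cc[best][j] >= 2 * best_d:
                skipped += 1
                continue
            d = dist(p, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped
```

Calling `assign` on well-separated data should return the same labels as a brute-force nearest-center search while reporting a nonzero `skipped` count; the saving grows as the centers settle and clusters separate.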
YARN MapReduce underwent a complete overhaul in hadoop-0.23, and the result is what is called MapReduce 2.0 (MRv2), or YARN.
The fundamental idea of MRv2 is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs.
The ResourceManager and per-node slave, the NodeManager (NM), form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system.
The per-application ApplicationMaster is, in effect, a framework-specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
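The division of labor described above can be pictured with a toy Python model. This is only an illustrative sketch and not the Hadoop API: the classes `ResourceManager` and `ApplicationMaster`, their methods, and the notion of a "slot" are all hypothetical simplifications, and the NodeManager is reduced to a node name on which a task is recorded as having run.

```python
class ResourceManager:
    """Toy global arbiter that tracks free slots per node (hypothetical model)."""

    def __init__(self, node_capacity):
        self.free = dict(node_capacity)  # node name -> free slots

    def allocate(self, requested):
        """Grant up to `requested` slots; return one node name per granted slot."""
        granted = []
        for node in sorted(self.free):
            while self.free[node] > 0 and len(granted) < requested:
                self.free[node] -= 1
                granted.append(node)
        return granted

    def release(self, nodes):
        """Return previously granted slots to the pool."""
        for node in nodes:
            self.free[node] += 1


class ApplicationMaster:
    """Toy per-application master that negotiates slots and tracks its tasks."""

    def __init__(self, name, tasks):
        self.name = name
        self.tasks = list(tasks)

    def run(self, rm):
        """Request slots from the RM and run tasks in waves; return (task, node) pairs."""
        done = []
        pending = list(self.tasks)
        while pending:
            nodes = rm.allocate(len(pending))
            if not nodes:
                raise RuntimeError("no capacity available")
            for node in nodes:
                # Stands in for asking that node's NodeManager to launch the task.
                done.append((pending.pop(0), node))
            rm.release(nodes)
        return done
```

The point of the sketch is the separation of concerns: the resource manager knows nothing about tasks, and the application master knows nothing about cluster-wide capacity beyond what it is granted.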
YCML A Machine Learning framework for Objective-C and Swift (OS X / iOS)
YEDDA In this paper, we introduce YEDDA, a lightweight but efficient open-source tool for text span annotation. YEDDA provides a systematic solution for text span annotation, ranging from collaborative user annotation to administrator evaluation and analysis. It overcomes the low efficiency of traditional text annotation tools by annotating entities through both the command line and shortcut keys, which are configurable with custom labels. YEDDA also gives intelligent recommendations by training a predictive model on the up-to-date annotated text. An administrator client is provided to evaluate the annotation quality of multiple annotators and generate a detailed comparison report for each annotator pair. YEDDA is built on Tkinter and is compatible with all major operating systems.
Yinyang K-means This paper presents Yinyang K-means, a new algorithm for K-means clustering. By clustering the centers in the initial stage, and leveraging efficiently maintained lower and upper bounds between a point and centers, it avoids unnecessary distance calculations more effectively than prior algorithms. It significantly and consistently outperforms classic K-means and prior alternative K-means algorithms across all tested data sets, cluster numbers, and machine configurations. The consistent, superior performance, together with its simplicity, user control of overheads, and guarantee of producing the same clustering results as standard K-means, makes Yinyang K-means a drop-in replacement for classic K-means with an order of magnitude higher performance.
Youden Plot The data for a Youden plot are generated by providing a number of laboratories with aliquots from two separate unknown samples, which we will call A and B. Every lab analyzes both samples, and a scatter plot of the A and B results is generated, with the A results on the x-axis and the B results on the y-axis. Once this is completed, limits of acceptability are plotted and outliers can be identified.
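The procedure above can be sketched in Python as follows. The entry does not say how the limits of acceptability are set, so this sketch assumes a simple rule: a lab is flagged when either result lies more than a fixed `tolerance` from the median of all labs. The function name `youden_points`, the `tolerance` parameter, and the lab values in the usage note are hypothetical.

```python
from statistics import median

def youden_points(results, tolerance):
    """Return the median point and the set of flagged labs.

    results:   mapping of lab name -> (result for sample A, result for sample B)
    tolerance: assumed acceptability limit, applied to each axis separately
    """
    med_a = median(a for a, _ in results.values())
    med_b = median(b for _, b in results.values())
    flagged = {
        lab
        for lab, (a, b) in results.items()
        if abs(a - med_a) > tolerance or abs(b - med_b) > tolerance
    }
    return (med_a, med_b), flagged
```

Each `(a, b)` pair is one point on the scatter plot, with A on the x-axis and B on the y-axis. For five made-up labs where one reports `(12.5, 22.4)` and the others cluster near `(10, 20)`, a tolerance of `1.0` would flag only that lab.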
Youden’s J Statistic Youden’s J statistic (also called Youden’s index) is a single statistic that captures the performance of a diagnostic test.
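The statistic is defined as J = sensitivity + specificity - 1. It equals 1 for a perfect test and 0 for a test that performs no better than chance. A minimal Python sketch, where the function name `youden_j` and the confusion-matrix counts used in the usage note are illustrative only:

```python
def youden_j(tp, fn, tn, fp):
    """Youden's J from confusion-matrix counts.

    tp, fn: true positives and false negatives (diseased subjects)
    tn, fp: true negatives and false positives (healthy subjects)
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1
```

For example, a test with 80 true positives, 20 false negatives, 90 true negatives, and 10 false positives has sensitivity 0.8 and specificity 0.9, giving J of about 0.7.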