Kalman Filter Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing noise (random variations) and other inaccuracies, and produces estimates of unknown variables that tend to be more precise than those based on a single measurement alone. More formally, the Kalman filter operates recursively on streams of noisy input data to produce a statistically optimal estimate of the underlying system state. The filter is named after Rudolf (Rudy) E. Kálmán, one of the primary developers of its theory. The Kalman filter has numerous applications in technology. A common application is for guidance, navigation and control of vehicles, particularly aircraft and spacecraft. Furthermore, the Kalman filter is a widely applied concept in time series analysis used in fields such as signal processing and econometrics. Kalman filters also are one of the main topics in the field of Robotic motion planning and control, and sometimes included in Trajectory optimization.
“Extended Kalman Filter”
Kalman Filter For Dummies
Understanding the Basis of the Kalman Filter via a Simple and Intuitive Derivation
Kalman Smoothing The optimal fixed-interval smoother provides the optimal estimate using the measurements from a fixed interval z_1 to z_n. This is also called ‘Kalman Smoothing’. There are several smoothing algorithms in common use.
“Kalman Filter”
KAMILA Clustering
KAMILA clustering, a novel method for clustering mixed-type data in the spirit of k-means clustering. It does not require dummy coding of variables, and is efficient enough to scale to rather large data sets.
k-Anonymity k-anonymity is a property possessed by certain anonymized data. The concept of k-anonymity was first formulated by Latanya Sweeney in a paper published in 2002 as an attempt to solve the problem: “Given person-specific field-structured data, produce a release of the data with scientific guarantees that the individuals who are the subjects of the data cannot be re-identified while the data remain practically useful.” A release of data is said to have the k-anonymity property if the information for each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appear in the release.
Kanri Distance
Kanri’s proprietary combination of patented statistical and process methods provides a uniquely powerful and insightful ability to evaluate large data sets with multiple variables. While many tools evaluate patterns and dynamics for large data, only the Kanri Distance Calculator allows users to understand where they stand with respect to a desired target state and the specific contribution of each variable toward the overall distance from the target state. The Kanri model not only calculates the relationship of variables within the overall data set, but more importantly mathematically teases out the interaction between each of them. This combination of relational insights fuels Kanri’s breakthrough distance calculator. It answers the question “In a world of exponentially expanding data how do I find the variables that will solve my problem and it helps quickly to reach that conclusion.” But the Kanri model does not stop there. Kanri tells you exactly, formulaically how much each variable contributes. The Kanri Distance Calculator opens a new world of solution development possibilities that can apply the power of massive data sets to an individual…or to an individualized objective.
Kantorovich Distance “Wasserstein Metric”
Kaplan-Meier Estimator The Kaplan-Meier estimator, also known as the product limit estimator, is a non-parametric statistic used to estimate the survival function from lifetime data. In medical research, it is often used to measure the fraction of patients living for a certain amount of time after treatment. In other fields, Kaplan-Meier estimators may be used to measure the length of time people remain unemployed after a job loss, the time-to-failure of machine parts, or how long fleshy fruits remain on plants before they are removed by frugivores. The estimator is named after Edward L. Kaplan and Paul Meier, who each submitted similar manuscripts to the Journal of the American Statistical Association. The journal editor, John Tukey, convinced them to combine their work into one paper, which has been cited about 34,000 times since its publication.
Kaplan-Meier Plot numKM
Kaplan-Meier Survival Curves In 1958, Edward L. Kaplan and Paul Meier collaborated to publish a seminal paper on how to deal with incomplete observations. Subsequently, the Kaplan-Meier curves and estimates of survival data have become a familiar way of dealing with differing survival times (times-to-event), especially when not all the subjects continue in the study. “Survival” times need not relate to actual survival with death being the event; the “event” may be any event of interest. Kaplan-Meier analyses are also used in nonmedical disciplines.
Karger’s Algorithm In computer science and graph theory, Karger’s algorithm is a randomized algorithm to compute a minimum cut of a connected graph. It was invented by David Karger and first published in 1993. The idea of the algorithm is based on the concept of contraction of an edge (u, v) in an undirected graph G = (V, E). Informally speaking, the contraction of an edge merges the nodes u and v into one, reducing the total number of nodes of the graph by one. All other edges connecting either u or v are ‘reattached’ to the merged node, effectively producing a multigraph. Karger’s basic algorithm iteratively contracts randomly chosen edges until only two nodes remain; those nodes represent a cut in the original graph. By iterating this basic algorithm a sufficient number of times, a minimum cut can be found with high probability.
Karlin-Rubin Theorem The Karlin-Rubin theorem can be regarded as an extension of the Neyman-Pearson lemma for composite hypotheses.
Parametric Inference: Karlin-Rubin Theorem
Katz Centrality In graph theory, the Katz centrality of a node is a measure of centrality in a network. It was introduced by Leo Katz in 1953 and is used to measure the relative degree of influence of an actor (or node) within a social network. Unlike typical centrality measures which consider only the shortest path (the geodesic) between a pair of actors, Katz centrality measures influence by taking into account the total number of walks between a pair of actors. It is similar to Google’s PageRank and to the eigenvector centrality.
• Katz centrality can be used to compute centrality in directed networks such as citation networks and the World Wide Web.
• Katz centrality is more suitable in the analysis of directed acyclic graphs where traditionally used measures like eigenvector centrality are rendered useless.
• Katz centrality can also be used in estimating the relative status or influence of actors in a social network.
• In neuroscience, it is found that Katz centrality correlates with the relative firing rate of neurons in a neural network.
Kaufman’s Adaptive Moving Average
Kaufman’s Adaptive Moving Average (KAMA) was created by Perry J. Kaufman and presented in 1998 in his book “Trading Systems and Methods, 3rd Edition”. The main advantage of KAMA over other moving averages is that it takes into consideration not only the direction, but also the market volatility. KAMA adjusts its length according to the prevailing market conditions.
Kayak Kayak: Library for Deep Neural Networks. This is a library that implements some useful modules and provides automatic differentiation utilities for learning deep neural networks. It is similar in spirit to tools like Theano and Torch. The objective of Kayak is to be simple to use and extend, for rapid prototyping in Python. It is unlikely to be faster than these other tools, although it is competitive and sometimes faster in performance when the architectures are highly complex. It will certainly not be faster on convolutional architectures for visual object detection and recognition tasks than, e.g., Alex Krizhevsky’s CUDA Convnet or Caffe. The point of Kayak is to be able to experiment in Python with patterns that look a lot like what you’re already used to with Numpy. It makes it easy to manage batches of data and compute gradients with backpropagation.
K-Competitive Autoencoder for Text
Autoencoders have been successful in learning meaningful representations from image datasets. However, their performance on text datasets has not been widely studied. Traditional autoencoders tend to learn possibly trivial representations of text documents due to their confounding properties such as high-dimensionality, sparsity and power-law word distributions. In this paper, we propose a novel k-competitive autoencoder, called KATE, for text documents. Due to the competition between the neurons in the hidden layer, each neuron becomes specialized in recognizing specific data patterns, and overall the model can learn meaningful representations of textual data. A comprehensive set of experiments show that KATE can learn better representations than traditional autoencoders including denoising, contractive, variational, and k-sparse autoencoders. Our model also outperforms deep generative models, probabilistic topic models, and even word representation models (e.g., Word2Vec) in terms of several downstream tasks such as document classification, regression, and retrieval.
Kendall Distance “Kendall Tau Distance”
Kendall Rank Correlation Coefficient In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall’s tau coefficient (after the Greek letter τ), is a statistic used to measure the association between two measured quantities. A tau test is a non-parametric hypothesis test for statistical dependence based on the tau coefficient. It is a measure of rank correlation: the similarity of the orderings of the data when ranked by each of the quantities. It is named after Maurice Kendall, who developed it in 1938, though Gustav Fechner had proposed a similar measure in the context of time series in 1897.
Kendall Tau Distance The Kendall tau rank distance is a metric that counts the number of pairwise disagreements between two ranking lists. The larger the distance, the more dissimilar the two lists are. Kendall tau distance is also called bubble-sort distance since it is equivalent to the number of swaps that the bubble sort algorithm would make to place one list in the same order as the other list. The Kendall tau distance was created by Maurice Kendall.
Keras Keras is a high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation.
Use Keras if you need a deep learning library that:
• Allows for easy and fast prototyping (through total modularity, minimalism, and extensibility).
• Supports both convolutional networks and recurrent networks, as well as combinations of the two.
• Supports arbitrary connectivity schemes (including multi-input and multi-output training).
• Runs seamlessly on CPU and GPU.
Deep Learning with Keras
Kernel Canonical Correlation Analysis
Measures of association between two sets of random variables have long been of interest to statisticians. The classical canonical correlation analysis can characterize, but also be limited to, linear association. In this article we study nonlinear association measures using the kernel method. The introduction of kernel method from machine learning community has a great impact on statistical analysis. The kernel canonical correlation analysis (KCCA) is a method that generalizes the classical linear canonical correlation analysis to nonlinear setting. Such a generalization is nonparametric. It allows us to depict the nonlinear relation of two sets of variables and enables applications of classical multivariate data analysis originally constrained to linearity relation. Moreover, the kernel-based canonical correlation analysis no longer requires the Gaussian distributional assumption on observations, and therefore enhances greatly the applicability.
Kernel Canonical Correlation Analysis and its Applications to Nonlinear Measures of Association and Test of Independence
Kernel Density Estimation
In statistics, kernel density estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample. In some fields such as signal processing and econometrics it is also termed the Parzen-Rosenblatt window method, after Emanuel Parzen and Murray Rosenblatt, who are usually credited with independently creating it in its current form.
Kernel Fisher Discriminant Analysis
In statistics, kernel Fisher discriminant analysis (KFD), also known as generalized discriminant analysis and kernel discriminant analysis, is a kernelized version of linear discriminant analysis. It is named after Ronald Fisher. Using the kernel trick, LDA is implicitly performed in a new feature space, which allows non-linear mappings to be learned.
“Linear Discriminant Analysis”
Kernel Mean-p Power Error
Correntropy is a second order statistical measure in kernel space, which has been successfully applied in robust learning and signal processing. In this paper, we define a nonsecond order statistical measure in kernel space, called the kernel mean-p power error (KMPE), including the correntropic loss (CLoss) as a special case. Some basic properties of KMPE are presented. In particular, we apply the KMPE to extreme learning machine (ELM) and principal component analysis (PCA), and develop two robust learning algorithms, namely ELM-KMPE and PCA-KMPE. Experimental results on synthetic and benchmark data show that the developed algorithms can achieve consistently better performance when compared with some existing methods.
Kernel Methods In computer science, kernel methods are a class of algorithms for pattern analysis, whose best known member is the support vector machine (SVM). The general task of pattern analysis is to find and study general types of relations (for example clusters, rankings, principal components, correlations, classifications) in datasets. For many of these tasks, data have to be represented as feature vectors, but kernel methods replace this representation by similarities to other data points.
Kernel Principal Component Analysis
In the field of multivariate statistics, kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are done in a reproducing kernel Hilbert space with a non-linear mapping.
Key Performance Variable
Keyhole Markup Language
Keyhole Markup Language (KML) is an XML notation for expressing geographic annotation and visualization within Internet-based, two-dimensional maps and three-dimensional Earth browsers. KML was developed for use with Google Earth, which was originally named Keyhole Earth Viewer. It was created by Keyhole, Inc, which was acquired by Google in 2004. KML became an international standard of the Open Geospatial Consortium in 2008. Google Earth was the first program able to view and graphically edit KML files. Other projects such as Marble have also started to develop KML support.
Keyphrase Extraction “Keyphrase Extraction Algorithm”
Keyphrase Extraction Algorithm
Keywords and keyphrases (multi-word units) are widely used in large document collections. They describe the content of single documents and provide a kind of semantic metadata that is useful for a wide variety of purposes. The task of assigning keyphrases to a document is called keyphrase indexing. For example, academic papers are often accompanied by a set of keyphrases freely chosen by the author. In libraries professional indexers select keyphrases from a controlled vocabulary (also called Subject Headings) according to defined cataloguing rules. On the Internet, digital libraries, or any depositories of data (flickr,, blog articles etc.) also use keyphrases (or here called content tags or content labels) to organize and provide a thematic access to their data. KEA is an algorithm for extracting keyphrases from text documents. It can be either used for free indexing or for indexing with a controlled vocabulary. KEA is implemented in Java and is platform independent. It is an open-source software distributed under the GNU General Public License.
Keyphrase Indexing Keyphrases represent a brief but precise summary of documents. They are widely used for organizing library holdings and providing thematic access to them. Manual assignment of highquality keyphrases is expensive and time-consuming, therefore automatic techniques are in great demand. There are two existing approaches. In keyphrase extraction, the phrases occurring in the document are analyzed to identify apparently significant ones, on the basis of properties such as frequency and length. In term assignment keyphrases are chosen from a controlled vocabulary of terms, and documents are classified according to their content into classes that correspond to elements of the vocabulary. One serious disadvantage of the former approach is that the extracted phrases are often ill formed or inappropriate. The assignment approach circumvents this problem, but for satisfactory results a vast and accurate manually created corpus of training material is needed. This paper describes keyphrase indexing, an intermediate approach between keyphrase extraction and term assignment that combines the advantages of both and avoids their shortcomings.
KeystoneML KeystoneML is a software framework, written in Scala, from the UC Berkeley AMPLab designed to simplify the construction of large scale, end-to-end, machine learning pipelines with Apache Spark.
6 reasons why I like KeystoneML
K-fold Cross Validation In k-fold cross-validation, the original sample is randomly partitioned into k equal size subsamples. Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k – 1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data. The k results from the folds can then be averaged (or otherwise combined) to produce a single estimation. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once. 10-fold cross-validation is commonly used, but in general k remains an unfixed parameter.
KI, KR Robustness Indicators The KI statistic falls between 0 and 1, gives a value of 1 for a perfect model, and gives 0 for a completely random model. This gives it an intuitive feel for a good model metric as Marcade (KXEN) suggests it should. KI is calculated as a “percent of perfect”.
kinn A graph based regression model from flat unstructured dataset. Each line in the input data set is treated as a node from which an edge to another line (node) can be formed. In the training process, a model is created which contains sparse graph adjacency matrix. This model is then used for prediction by taking a predictor and the model as inputs and outputs a prediction which is an average of the most similar node and its neighbours in the model graph.
Kitematic Kitematic is an open source project built to simplify and streamline using Docker on a Mac or Windows (coming soon) PC. Kitematic automates the Docker installation and setup process and provides an intuitive graphical user interface (GUI) for running Docker containers. Kitematic integrates with Docker Machine to provision a VirtualBox VM and install the Docker Engine locally on your machine. Once installed, the Kitematic GUI launches and from the home screen you will be presented with curated images that you can run instantly. You can search for any public images on Docker Hub from Kitematic just by typing in the search bar. You can use the GUI to create, run and manage your containers just by clicking on buttons. Kitematic allows you to switch back and forth between the Docker CLI and the GUI. Kitematic also automates advanced features such as managing ports and configuring volumes. You can use Kitematic to change environment variables, stream logs, and single click terminal into your Docker container all from the GUI.
Kleinberg’s Impossibility Theorem Although the study of clustering is centered around an intuitively compelling goal, it has been very difficult to develop a unified framework for reasoning about it at a technical level, and pro- foundly diverse approaches to clustering abound in the research community. Here we suggest a formal perspective on the difficulty in finding such a unification, in the form of an impossibility theorem: for a set of three simple properties, we show that there is no clustering function satisfying all three. Relaxations of these properties expose some of the interesting (and unavoidable) trade-offs at work in well-studied clustering techniques such as single-linkage, sum-of-pairs, k-means, and k-median.
Klout Score Klout is a website and mobile app that uses social media analytics to rank its users according to online social influence via the ‘Klout Score’, which is a numerical value between 1 and 100. In determining the user score, Klout measures the size of a user’s social media network and correlates the content created to measure how other users interact with that content.
Klout Score: Measuring Influence Across Multiple Social Networks
KlusTree Graph structured data on the web is now massive as well as diverse, ranging from social networks, web graphs to knowledge-bases. Effectively querying this graph structured data is non-trivial and has led to research in a variety of directions — structured queries, keyword and natural language queries, automatic translation of these queries to structured queries, etc. We are concerned with a class of queries called relationship queries, which are usually expressed as a set of keywords (each keyword denoting a named entity). The results returned are a set of ranked trees, each of which denotes relationships among the various keywords. The result list could consist of hundreds of answers. The problem of keyword search on graphs has been explored for over a decade now, but an important aspect that is not as extensively studied is that of user experience. We propose KlusTree, which presents clustered results to the users instead of a list of all the results. In our approach, the result trees are represented using language models and are clustered using JS divergence as a distance measure. We compare KlusTree with the well-known approaches based on isomorphism and tree-edit distance based clustering. The user evaluations show that KlusTree outperforms the other two in providing better clustering, thereby enriching user experience, revealing interesting patterns and improving result interpretation by the user.
K-Means k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.
Interactive visualisation clustering using k-means
k-medoids The k-medoids algorithm is a clustering algorithm related to the k-means algorithm and the medoidshift algorithm. Both the k-means and k-medoids algorithms are partitional (breaking the dataset up into groups) and both attempt to minimize the distance between points labeled to be in a cluster and a point designated as the center of that cluster. In contrast to the k-means algorithm, k-medoids chooses datapoints as centers (medoids or exemplars) and works with an arbitrary matrix of distances between datapoints instead of l2. This method was proposed in 1987 for the work with l1 norm and other distances.
k-mer The term k-mer typically refers to all the possible substrings, of length k, that are contained in a string. In Computational genomics, k-mers refer to all the possible subsequences (of length k) from a read obtained through DNA Sequencing. The amount of k-mers possible given a string of length, L, is L-k+1 whilst the number of possible k-mers given n possibilities (4 in the case of DNA e.g. ACTG) is n^{k}. K-mers are typically used during Sequence assembly, but can also be used in Sequence alignment.
K-Modes The k-means algorithm is well known for its efficiency in clustering large data sets. However, working only on numeric values prohibits it from being used to cluster real world data containing categorical values. In this paper we present two algorithms which extend the k-means algorithm to categorical domains and domains with mixed numeric and categorical values. The k-modes algorithm uses a simple matching dissimilarity measure to deal with categorical objects, replaces the means of clusters with modes, and uses a frequency-based method to update modes in the clustering process to minimise the clustering cost function. With these extensions the k-modes algorithm enables the clustering of categorical data in a fashion similar to k-means. The k-prototypes algorithm, through the definition of a combined dissimilarity measure, further integrates the k-means and k-modes algorithms to allow for clustering objects described by mixed numeric and categorical attributes.
Knapsack Problem The knapsack problem or rucksack problem is a problem in combinatorial optimization: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items. The problem often arises in resource allocation where there are financial constraints and is studied in fields such as combinatorics, computer science, complexity theory, cryptography, applied mathematics, and daily fantasy sports. The knapsack problem has been studied for more than a century, with early works dating as far back as 1897. The name ‘knapsack problem’ dates back to the early works of mathematician Tobias Dantzig (1884-1956), and refers to the commonplace problem of packing your most valuable or useful items without overloading your luggage.
k-nearest neighbors
In pattern recognition, the k-Nearest Neighbors algorithm (or k-NN for short) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:
1. In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
2. In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.
k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms.
KNet Knet (pronounced ‘kay-net’) is the Koç University machine learning framework implemented in Julia, a high-level, high-performance, dynamic programming language. Unlike gradient generating compilers like Theano and TensorFlow which restrict users into a modeling mini-language, Knet allows models to be defined by just describing their forward computation in plain Julia, allowing the use of loops, conditionals, recursion, closures, tuples, dictionaries, array indexing, concatenation and other high level language features. High performance is achieved by combining automatic differentiation of most of Julia with efficient GPU kernels and memory management. Several examples and benchmarks are provided to demonstrate that GPU support and automatic differentiation of a high level language are sufficient for concise definition and efficient training of sophisticated models.
Knockoff Filter In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated with the response. At the same time, we need to know that the false discovery rate (FDR) – the expected fraction of false discoveries among all discoveries – is not too high, in order to assure the scientist that most of the discoveries are indeed true and replicable. This paper introduces the knockoff filter, a new variable selection procedure controlling the FDR in the statistical linear model whenever there are at least as many observations as variables. This method achieves exact FDR control in finite sample settings no matter the design or covariates, the number of variables in the model, and the amplitudes of the unknown regression coefficients, and does not require any knowledge of the noise level. As the name suggests, the method operates by manufacturing knockoff variables that are cheap – their construction does not require any new data – and are designed to mimic the correlation structure found within the existing variables, in a way that allows for accurate FDR control, beyond what is possible with permutation-based methods. The method of knockoffs is very general and flexible, and can work with a broad class of test statistics.
Knowledge as a Service
In this paper, we introduce and explore a new computing paradigm we call knowledge as a service, in which a knowledge service provider, via its knowledge server, answers queries presented by some knowledge consumers. The knowledge server’s answers are based on knowledge models that may be expensive or impossible to obtain for the knowledge consumers.
Knowledge as a Service
Actionable Knowledge As A Service (AKAAS)
Knowledge Base A knowledge base (KB) is a technology used to store complex structured and unstructured information used by a computer system. The initial use of the term was in connection with expert systems which were the first knowledge-based systems. The original use of the term knowledge-base was to describe one of the two sub-systems of a knowledge-based system. A knowledge-based system consists of a knowledge-base that represents facts about the world and an inference engine that can reason about those facts and use rules and other forms of logic to deduce new facts or highlight inconsistencies.
Knowledge Discovery
(KD / KDD)
Knowledge discovery describes the process of automatically searching large volumes of data for patterns that can be considered knowledge about the data. It is often described as deriving knowledge from the input data. Knowledge discovery developed out of the data mining domain, and is closely related to it both in terms of methodology and terminology.
The most well-known branch of data mining is knowledge discovery, also known as knowledge discovery in databases (KDD). Just as many other forms of knowledge discovery it creates abstractions of the input data. The knowledge obtained through the process may become additional data that can be used for further usage and discovery.
KnOwledge Discovery by Accuracy Maximization
Here we describe KODAMA (knowledge discovery by accuracy maximization), an unsupervised and semisupervised learning algorithm that performs feature extraction from noisy and high-dimensional data. Unlike other data mining methods, the peculiarity of KODAMA is that it is driven by an integrated procedure of cross-validation of the results. The discovery of a local manifold’s topology is led by a classifier through a Monte Carlo procedure of maximization of cross-validated predictive accuracy. Briefly, our approach differs from previous methods in that it has an integrated procedure of validation of the results. In this way, the method ensures the highest robustness of the obtained solution.
Knowledge Extraction Knowledge extraction is the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL (data warehouse), the main criteria is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema. It requires either the reuse of existing formal knowledge (reusing identifiers or ontologies) or the generation of a schema based on the source data.
Knowledge Graph The Knowledge Graph is a knowledge base used by Google to enhance its search engine’s search results with semantic-search information gathered from a wide variety of sources. Knowledge Graph display was added to Google’s search engine in 2012, starting in the United States, having been announced on May 16, 2012. It provides structured and detailed information about the topic in addition to a list of links to other sites. The goal is that users would be able to use this information to resolve their query without having to navigate to other sites and assemble the information themselves.
Knowledge Management Knowledge management (KM) is the process of capturing, developing, sharing, and effectively using organisational knowledge. It refers to a multi-disciplined approach to achieving organisational objectives by making the best use of knowledge. An established discipline since 1991 (see Nonaka 1991), KM includes courses taught in the fields of business administration, information systems, management, and library and information sciences (Alavi & Leidner 1999). More recently, other fields have started contributing to KM research; these include information and media, computer science, public health, and public policy. Columbia University, Kent State University and the University of Haifa offer dedicated Master of Science degrees in Knowledge Management. Many large companies, public institutions and non-profit organisations have resources dedicated to internal KM efforts, often as a part of their business strategy, information technology, or human resource management departments. Several consulting companies provide strategy and advice regarding KM to these organisations. Knowledge management efforts typically focus on organisational objectives such as improved performance, competitive advantage, innovation, the sharing of lessons learned, integration and continuous improvement of the organisation. KM efforts overlap with organisational learning and may be distinguished from that by a greater focus on the management of knowledge as a strategic asset and a focus on encouraging the sharing of knowledge. It is an enabler of organisational learning.
Knowledge of Preconditions Principle
The Knowledge of Preconditions principle (KoP) is proposed as a widely applicable connection between knowledge and action in multi-agent systems. Roughly speaking, it asserts that if some condition is a necessary condition for performing a given action A, then knowing that this condition holds is also a necessary condition for performing A. Since the specifications of tasks often involve necessary conditions for actions, the KoP principle shows that such specifications induce knowledge preconditions for the actions. Distributed protocols or multi-agent plans that satisfy the specifications must ensure that this knowledge be attained, and that it is detected by the agents as a condition for action. The knowledge of preconditions principle is formalised in the runs and systems framework, and is proven to hold in a wide class of settings. Well-known connections between knowledge and coordinated action are extended and shown to derive directly from the KoP principle: a ‘common knowledge of preconditions’ principle is established showing that common knowledge is a necessary condition for performing simultaneous actions, and a ‘nested knowledge of preconditions’ principle is proven, showing that coordinating actions to be performed in linear temporal order requires a corresponding form of nested knowledge.
Knowledge Worker Knowledge workers are workers whose main capital is knowledge. Typical examples may include software engineers, doctors, architects, engineers, scientists, public accountants, lawyers, and academics, whose job is to “think for a living”.
K-NRM This paper proposes K-NRM, a kernel based neural model for document ranking. Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique that uses kernels to extract multi-level soft match features, and a learning-to-rank layer that combines those features into the final ranking score. The whole model is trained end-to-end. The ranking layer learns desired feature patterns from the pairwise ranking loss. The kernels transfer the feature patterns into soft-match targets at each similarity level and enforce them on the translation matrix. The word embeddings are tuned accordingly so that they can produce the desired soft matches. Experiments on a commercial search engine’s query log demonstrate the improvements of K-NRM over prior feature-based and neural-based states-of-the-art, and explain the source of K-NRM’s advantage: Its kernel-guided embedding encodes a similarity metric tailored for matching query words to document words, and provides effective multi-level soft matches.
Kolmogorov-Smirnov Test
In statistics, the Kolmogorov-Smirnov test (K-S test or KS test) is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K-S test), or to compare two samples (two-sample K-S test). The Kolmogorov-Smirnov statistic quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null distribution of this statistic is calculated under the null hypothesis that the samples are drawn from the same distribution (in the two-sample case) or that the sample is drawn from the reference distribution (in the one-sample case). In each case, the distributions considered under the null hypothesis are continuous distributions but are otherwise unrestricted. The two-sample K-S test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples. The Kolmogorov-Smirnov test can be modified to serve as a goodness of fit test. In the special case of testing for normality of the distribution, samples are standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using these to define the specific reference distribution changes the null distribution of the test statistic: see below. Various studies have found that, even in this corrected form, the test is less powerful for testing normality than the Shapiro-Wilk test or Anderson-Darling test. However, other tests have their own disadvantages. For instance the Shapiro-Wilk test is known not to work well with many ties (many identical values).
Konstanz Information Miner
KNIME, the Konstanz Information Miner, is an open source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface allows assembly of nodes for data preprocessing (ETL: Extraction, Transformation, Loading), for modeling and data analysis and visualization. Since 2006, KNIME has been used in pharmaceutical research, but is also used in other areas like CRM customer data analysis, business intelligence and financial data analysis.
K-optimal Pattern Discovery
K-optimal pattern discovery is a data mining technique that provides an alternative to the frequent pattern discovery approach that underlies most association rule learning techniques. Frequent pattern discovery techniques find all patterns for which there are sufficiently frequent examples in the sample data. In contrast, k-optimal pattern discovery techniques find the k patterns that optimize a user-specified measure of interest. The parameter k is also specified by the user.
K-optimal Rule Discovery
K-optimal rule discovery finds the k rules that optimize a user-specified measure of rule value with respect to a set of sample data and user-specified constraints. This approach avoids many limitations of the frequent itemset approach of association rule discovery. This paper presents a scalable algorithm applicable to a wide range of k-optimal rule discovery tasks and demonstrates its efficiency.
Korkine Zolotarev
In mathematics, the goal of lattice basis reduction is given an integer lattice basis as input, to find a basis with short, nearly orthogonal vectors. This is realized using different algorithms, whose running time is usually at least exponential in the dimension of the lattice.
k-POD k-POD, a novel method of k-means clustering on partially observed data that employs a majorization-minimization algorithm to identify a clustering that is consistent with the observed data. By bypassing the completely observed data formulation, k-POD retains all information in the data and avoids committing to distributional assumptions on the missingness patterns.
K-Prototypes The k-means algorithm is well known for its efficiency in clustering large data sets. However, working only on numeric values prohibits it from being used to cluster real world data containing categorical values. In this paper we present two algorithms which extend the k-means algorithm to categorical domains and domains with mixed numeric and categorical values. The k-modes algorithm uses a simple matching dissimilarity measure to deal with categorical objects, replaces the means of clusters with modes, and uses a frequency-based method to update modes in the clustering process to minimise the clustering cost function. With these extensions the k-modes algorithm enables the clustering of categorical data in a fashion similar to k-means. The k-prototypes algorithm, through the definition of a combined dissimilarity measure, further integrates the k-means and k-modes algorithms to allow for clustering objects described by mixed numeric and categorical attributes.
Kraljic Matrix The Kraljic Matrix works by by mapping the profit impact of a product on one axis, and our vulnerability to the supplier’s disappearance on the other. It essentially provides a portfolio management approach to managing an organization’s many suppliers. This enables us to see which relationships are important so we can focus on strengthing these, as well as identifying less important relationships where we might employ traditional supplier management techniques such as offshoring. The Kraljic matrix help us in the first step of supplier management – identifying important suppliers. How you then actually manage those suppliers is up to you.
Kriging In statistics, originally in geostatistics, Kriging or Gaussian process regression is a method of interpolation for which the interpolated values are modeled by a Gaussian process governed by prior covariances, as opposed to a piecewise-polynomial spline chosen to optimize smoothness of the fitted values. Under suitable assumptions on the priors, Kriging gives the best linear unbiased prediction of the intermediate values. Interpolating methods based on other criteria such as smoothness need not yield the most likely intermediate values. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Wiener-Kolmogorov prediction (after Norbert Wiener and Andrey Kolmogorov). The theoretical basis for the method was developed by the French mathematician Georges Matheron based on the Master’s thesis of Danie G. Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. Krige sought to estimate the most likely distribution of gold based on samples from a few boreholes. The English verb is to krige and the most common noun is Kriging; both are often pronounced with a hard ‘g’, following the pronunciation of the name ‘Krige’.
Spatio-Temporal Kriging in R
Kriging Models In statistics, originally in geostatistics, Kriging or Gaussian process regression is a method of interpolation for which the interpolated values are modeled by a Gaussian process governed by prior covariances, as opposed to a piecewise-polynomial spline chosen to optimize smoothness of the fitted values. Under suitable assumptions on the priors, Kriging gives the best linear unbiased prediction of the intermediate values. Interpolating methods based on other criteria such as smoothness need not yield the most likely intermediate values. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Kolmogorov Wiener prediction.
Kronecker Recurrent Units
Our work addresses two important issues with recurrent neural networks: (1) they are over-parameterized, and (2) the recurrence matrix is ill-conditioned. The former increases the sample complexity of learning and the training time. The latter causes the vanishing and exploding gradient problem. We present a flexible recurrent neural network model called Kronecker Recurrent Units (KRU). KRU achieves parameter efficiency in RNNs through a Kronecker factored recurrent matrix. It overcomes the ill-conditioning of the recurrent matrix by enforcing soft unitary constraints on the factors. Thanks to the small dimensionality of the factors, maintaining these constraints is computationally efficient. Our experimental results on five standard data-sets reveal that KRU can reduce the number of parameters by three orders of magnitude in the recurrent weight matrix compared to the existing recurrent models, without trading the statistical performance. These results in particular show that while there are advantages in having a high dimensional recurrent space, the capacity of the recurrent part of the model can be dramatically reduced.
Kruskal’s Algorithm Kruskal’s algorithm is a greedy algorithm in graph theory that finds a minimum spanning tree for a connected weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).
K-SVD In applied mathematics, K-SVD is a dictionary learning algorithm for creating a dictionary for sparse representations, via a singular value decomposition approach. K-SVD is a generalization of the k-means clustering method, and it works by iteratively alternating between sparse coding the input data based on the current dictionary, and updating the atoms in the dictionary to better fit the data. K-SVD can be found widely in use in applications such as image processing, audio processing, biology, and document analysis.
Analysis K-SVD: A Dictionary-Learning Algorithm for the Analysis Sparse Model
Kullback-Leibler Divergence
In probability theory and information theory, the Kullback-Leibler divergence (also information divergence, information gain, relative entropy, or KLIC; here abbreviated as KL divergence) is a non-symmetric measure of the difference between two probability distributions P and Q. Specifically, the Kullback-Leibler divergence of Q from P, denoted DKL(P||Q), is a measure of the information lost when Q is used to approximate P: The KL divergence measures the expected number of extra bits required to code samples from P when using a code based on Q, rather than using a code based on P. Typically P represents the “true” distribution of data, observations, or a precisely calculated theoretical distribution. The measure Q typically represents a theory, model, description, or approximation of P. Although it is often intuited as a metric or distance, the KL divergence is not a true metric – for example, it is not symmetric: the KL divergence from P to Q is generally not the same as that from Q to P. However, its infinitesimal form, specifically its Hessian, is a metric tensor: it is the Fisher information metric.
Kurtosis In probability theory and statistics, kurtosis (from the Greek word kurtos, meaning curved, arching) is any measure of the ‘peakedness’ of the probability distribution of a real-valued random variable. In a similar way to the concept of skewness, kurtosis is a descriptor of the shape of a probability distribution and, just as for skewness, there are different ways of quantifying it for a theoretical distribution and corresponding ways of estimating it from a sample from a population. There are various interpretations of kurtosis, and of how particular measures should be interpreted; these are primarily peakedness (width of peak), tail weight, and lack of shoulders (distribution primarily peak and tails, not in between).
‘Student’, on Kurtosis
Kurtosis as Peakedness, 1905-2014. R.I.P.
The incorrect notion that kurtosis somehow measures ‘peakedness’ (flatness, pointiness, or modality) of a distribution is remarkably persistent, despite attempts by statisticians to set the record straight. This article puts the notion to rest once and for all. Kurtosis tells you virtually nothing about the shape of the peak – its only unambiguous interpretation is in terms of tail extremity, that is, either existing outliers (for the sample kurtosis) or propensity to produce outliers (for the kurtosis of a probability distribution). To clarify this point, relevant literature is reviewed, counterexample distributions are given, and it is shown that the proportion of the kurtosis that is determined by the central μ ± σ range is usually quite small.