Compressed, Complementary, Computationally-Efficient Adaptive Gradient Online Learning (CompAdaGrad) google
The adaptive gradient online learning method known as AdaGrad has seen widespread use in the machine learning community in stochastic and adversarial online learning problems and more recently in deep learning methods. The method’s full-matrix incarnation offers much better theoretical guarantees and potentially better empirical performance than its diagonal version; however, this version is computationally prohibitive and so the simpler diagonal version often is used in practice. We introduce a new method, CompAdaGrad, that navigates the space between these two schemes and show that this method can yield results much better than diagonal AdaGrad while avoiding the (effectively intractable) $O(n^3)$ computational complexity of full-matrix AdaGrad for dimension $n$. CompAdaGrad essentially performs full-matrix regularization in a low-dimensional subspace while performing diagonal regularization in the complementary subspace. We derive CompAdaGrad’s updates for composite mirror descent in case of the squared $\ell_2$ norm and the $\ell_1$ norm, demonstrate that its complexity per iteration is linear in the dimension, and establish guarantees for the method independent of the choice of composite regularizer. Finally, we show preliminary results on several datasets. …

Conditional Random Fields (CRF) google
Conditional random fields (CRFs) are a class of statistical modelling method often applied in pattern recognition and machine learning, where they are used for structured prediction. Whereas an ordinary classifier predicts a label for a single sample without regard to ‘neighboring’ samples, a CRF can take context into account; e.g., the linear chain CRF popular in natural language processing predicts sequences of labels for sequences of input samples. CRFs are a type of discriminative undirected probabilistic graphical model. It is used to encode known relationships between observations and construct consistent interpretations. It is often used for labeling or parsing of sequential data, such as natural language text or biological sequences and in computer vision. Specifically, CRFs find applications in shallow parsing, named entity recognition and gene finding, among other tasks, being an alternative to the related hidden Markov models (HMMs). In computer vision, CRFs are often used for object recognition and image segmentation. …

SQLScript google
The motivation for SQLScript is to embed data-intensive application logic into the database. As of today, applications only offload very limited functionality into the database using SQL, most of the application logic is normally executed in an application server. This has the effect that data to be operated upon needs to be copied from the database into the application server and vice versa. When executing data intensive logic, this copying of data is very expensive in terms of processor and data transfer time. Moreover, when using an imperative language like ABAP or JAVA for processing data, developers tend to write algorithms which follow a one tuple at a time semantics (for example looping over rows in a table). However, these algorithms are hard to optimize and parallelize compared to declarative set-oriented languages such as SQL. The SAP HANA database is optimized for modern technology trends and takes advantage of modern hardware, for example, by having data residing in main-memory and allowing massive-parallelization on multi-core CPUs. The goal of the SAP HANA database is to optimally support application requirements by leveraging such hardware. To this end, the SAP HANA database exposes a very sophisticated interface to the application consisting of many languages. The expressiveness of these languages far exceeds that attainable with OpenSQL. The set of SQL extensions for the SAP HANA database that allow developers to push data intensive logic into the database is called SQLScript. Conceptually SQLScript is related to stored procedures as defined in the SQL standard, but SQLScript is designed to provide superior optimization possibilities. SQLScript should be used in cases where other modeling constructs of SAP HANA, for example analytic views or attribute views are not sufficient. For more information on how to best exploit the different view types, see ‘Exploit Underlying Engine’. The set of SQL extensions are the key to avoiding massive data copies to the application server and for leveraging sophisticated parallel execution strategies of the database. SQLScript addresses the following problems:
● Decomposing an SQL query can only be done using views. However when decomposing complex queries using views, all intermediate results are visible and must be explicitly typed. Moreover SQL views cannot be parameterized which limits their reuse. In particular they can only be used like tables and embedded into other SQL statements.
● SQL queries do not have features to express business logic (for example a complex currency conversion). As a consequence such a business logic cannot be pushed down into the database (even if it is mainly based on standard aggregations like SUM(Sales), etc.).
● An SQL query can only return one result at a time. As a consequence the computation of related result sets must be split into separate, usually unrelated, queries.
● As SQLScript encourages developers to implement algorithms using a set-oriented paradigm and not using a one tuple at a time paradigm, imperative logic is required, for example by iterative approximation algorithms. Thus it is possible to mix imperative constructs known from stored procedures with declarative ones. …