Network Lasso (nLasso) google
A recently proposed learning algorithm for massive network-structured data sets (big data over networks) is the network Lasso (nLasso), which extends the well- known Lasso estimator from sparse models to network-structured datasets. Efficient implementations of the nLasso have been presented using modern convex optimization methods. …

Charikar’s Algorithm google
To detect near-duplicates this software uses the Charikar’s fingerprinting technique, this means characterizing each document with a unique 64-bit vector, like a fingerprint. To determine whether two documents are Near-duplicates, we have to compare their fingerprints. To do this we use two algorithms, the algorithm developed by Moses Charikar and the Hamming distance algorithm, which allows us to measure the similarity between two vectors of n bits. What is Charikar’s algorithm?
• Characterization of the document
• Apply hash functions to the characteristics
• Obtain fingerprint
• Apply vector comparison function: Are (Doc1, doc2) near-duplicate? Hamming-distance (fingerprint (doc1), fingerprint (doc2)) = k
GitXiv


Transformation Autoregressive Network google
The fundamental task of general density estimation has been of keen interest to machine learning. Recent advances in density estimation have either: a) proposed a flexible model to estimate the conditional factors of the chain rule, $p(x_{i}\, |\, x_{i-1}, \ldots)$; or b) used flexible, non-linear transformations of variables of a simple base distribution. Instead, this work jointly leverages transformations of variables and autoregressive conditional models, and proposes novel methods for both. We provide a deeper understanding of our methods, showing a considerable improvement through a comprehensive study over both real world and synthetic data. Moreover, we illustrate the use of our models in outlier detection and image modeling tasks. …

Advertisements