Triangle Generative Adversarial Network (Delta-GAN) google
A Triangle Generative Adversarial Network ($\Delta$-GAN) is developed for semi-supervised cross-domain joint distribution matching, where the training data consists of samples from each domain, and supervision of domain correspondence is provided by only a few paired samples. $\Delta$-GAN consists of four neural networks, two generators and two discriminators. The generators are designed to learn the two-way conditional distributions between the two domains, while the discriminators implicitly define a ternary discriminative function, which is trained to distinguish real data pairs and two kinds of fake data pairs. The generators and discriminators are trained together using adversarial learning. Under mild assumptions, in theory the joint distributions characterized by the two generators concentrate to the data distribution. In experiments, three different kinds of domain pairs are considered, image-label, image-image and image-attribute pairs. Experiments on semi-supervised image classification, image-to-image translation and attribute-based image generation demonstrate the superiority of the proposed approach. …

BoostJet google
Recommenders have become widely popular in recent years because of their broader applicability in many e-commerce applications. These applications rely on recommenders for generating advertisements for various offers or providing content recommendations. However, the quality of the generated recommendations depends on user features (like demography, temporality), offer features (like popularity, price), and user-offer features (like implicit or explicit feedback). Current state-of-the-art recommenders do not explore such diverse features concurrently while generating the recommendations. In this paper, we first introduce the notion of Trackers which enables us to capture the above-mentioned features and thus incorporate users’ online behaviour through statistical aggregates of different features (demography, temporality, popularity, price). We also show how to capture offer-to-offer relations, based on their consumption sequence, leveraging neural embeddings for offers in our Offer2Vec algorithm. We then introduce BoostJet, a novel recommender which integrates the Trackers along with the neural embeddings using MatrixNet, an efficient distributed implementation of gradient boosted decision tree, to improve the recommendation quality significantly. We provide an in-depth evaluation of BoostJet on Yandex’s dataset, collecting online behaviour from tens of millions of online users, to demonstrate the practicality of BoostJet in terms of recommendation quality as well as scalability. …

Boruta google
Machine learning methods are often used to classify objects described by hundreds of attributes; in many applications of this kind a great fraction of attributes may be totally irrelevant to the classification problem. Even more, usually one cannot decide a priori which attributes are relevant. In this paper we present an improved version of the algorithm for identification of the full set of truly important variables in an information system. It is an extension of the random forest method which utilises the importance measure generated by the original algorithm. It compares, in the iterative fashion, the importances of original attributes with importances of their randomised copies. We analyse performance of the algorithm on several examples of synthetic data, as well as on a biologically important problem, namely on identification of the sequence motifs that are important for aptameric activity of short RNA sequences. …

Advertisements