Mining frequent items bought together using Apriori Algorithm (with code in R)

We live in a fast-changing digital world. Today, customers expect sellers to tell them what they might want to buy. I personally end up using Amazon’s recommendations on almost all my visits to their site. This creates an interesting threat / opportunity situation for retailers. If you can tell customers what they might want to buy, you improve not only your sales but also the customer experience and, ultimately, lifetime value. On the other hand, if you are unable to predict the next purchase, the customer might not come back to your store. In this article, we will learn one such algorithm, Apriori, which lets us find the items that are frequently bought together. Once we know this, we can use it to our advantage in multiple ways.
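To make "items frequently bought together" concrete, here is a minimal sketch of Apriori-style frequent itemset mining. It is written in plain Python for illustration and is not the R code from the article; the function name `frequent_itemsets`, the `min_count` threshold, and the toy baskets are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least min_count baskets.

    Illustrative Apriori-style search: frequent k-itemsets are built only
    from frequent (k-1)-itemsets, because any superset of an infrequent
    itemset must itself be infrequent.
    """
    baskets = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = Counter(frozenset([item]) for b in baskets for item in b)
    current = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Pruning: every (k-1)-subset of a candidate must already be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count how many baskets contain each surviving candidate.
        counted = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        current = {c: n for c, n in counted.items() if n >= min_count}
        result.update(current)
        k += 1
    return result


# Hypothetical toy baskets, made up for this sketch.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = frequent_itemsets(baskets, min_count=3)
```

With these toy baskets and a threshold of three, each of the four items is frequent on its own, but bread and milk is the only pair that appears together in at least three baskets.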

The State of Data Innovation in the EU

As a host of new digital technologies have emerged over the last decade, data has become a key driver of economic growth, social progress and innovation. But what is the true state of data innovation in the EU, and how do European national economies compare in their use of and support for data? A forthcoming report by the Center for Data Innovation ranks the EU’s 28 member states by how well they perform in data innovation, using roughly 30 indicators. These include metrics on open data policies, digital skills, and the use of data-driven technologies in industry. Join the Center for Data Innovation for a presentation of the report’s findings, and a discussion with leading experts and policymakers about the factors behind the leading countries’ success, and how all European countries can build a thriving data economy.

Take our new Deep Learning courses, now open on Coursera

Causal convolutions for sequence-based recommendations

Using sequences of user-item interactions as an input for recommender models has a number of attractive properties. Firstly, it recognizes that recommending the next item that a user may want to buy or see is precisely the goal we are trying to achieve. Secondly, it’s plausible that the ordering of users’ interactions carries additional information over and above just the identities of items they have interacted with. For example, a user is more likely to watch the next episode of a given TV series if they’ve just finished the previous episode. Finally, when the sequence of past interactions rather than the identity of the user is the input to a model, online systems can incorporate new users (and old users’ new actions) in real time. The new interactions are fed to the existing model, and no new model needs to be fit to incorporate them (unlike with factorization models). Recurrent neural networks are the most natural way of modelling such sequence problems. In recommendations, gated recurrent units (GRUs) have been used with success in the paper “Session-based recommendations with recurrent neural networks”. Spotlight implements a similar model using LSTM units as one of its sequence representations.
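The excerpt above does not show what a causal convolution actually computes, so here is a minimal sketch in plain Python. It is not Spotlight's implementation; the function name `causal_conv1d` and the example numbers are assumptions made for illustration. The idea is that the output at step t depends only on inputs at steps up to and including t, which is what makes the operation usable for next-item prediction.

```python
def causal_conv1d(sequence, kernel):
    """1-D causal convolution: out[t] = sum_j kernel[j] * sequence[t - j].

    Positions before the start of the sequence are treated as zeros
    (left padding), so the output has the same length as the input and
    never looks at future elements.
    """
    k = len(kernel)
    # Left-pad with k - 1 zeros so that sequence[t] sits at padded[t + k - 1].
    padded = [0.0] * (k - 1) + list(sequence)
    out = []
    for t in range(len(sequence)):
        out.append(sum(kernel[j] * padded[t + k - 1 - j] for j in range(k)))
    return out


# Hypothetical inputs: changing only the last element leaves earlier outputs intact.
original = causal_conv1d([1, 2, 3, 4], [0.5, 0.25])
modified = causal_conv1d([1, 2, 3, 99], [0.5, 0.25])
```

In this sketch `original[:3]` and `modified[:3]` are identical, because only the final output position can see the changed final input.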

Learn Data Science from Kaggle Competition Meetups

“Anyone interested in starting a Kaggle meetup?” It was a casual question asked by the organizer of a paper-reading group. A core group of four people said, “Sure!”, although we didn’t have a clear idea about what such a meetup should be. That was 18 months ago. Since then we have developed a meetup series that is regularly attended by 40-60 people. It has given scores of people exposure to hands-on data science. It has also connected numerous startups and established companies with people looking for career opportunities in data science. Participating in this meetup has been a very rewarding experience for all involved. In this blog post we’ll share what we’ve learned and hope it will encourage others to give it a try.

What Artificial Intelligence and Machine Learning Can Do and What It Can’t

I have seen situations where AI (or at least machine learning) had an incredible impact on a business—I also have seen situations where this was not the case. So, what was the difference?

How to choose a cloud provider

If you look up the phrase “boiling the ocean,” it’s defined as writing a post on choosing a cloud provider—there are so many different facets and use cases, and each variable complicates your choice. The key is to narrow the field to your specific situation and needs. In this article, I share some of the early questions and decisions I use when working with a team to choose a cloud provider.

How to create interactive data visualizations with ggvis

The ggvis package is used to make interactive data visualizations. The fact that it combines shiny’s reactive programming model and dplyr’s grammar of data transformation makes it a useful tool for data scientists. The package lets us implement features like interactivity, but on the other hand every interactive ggvis plot must be connected to a running R session.

New Course – Supervised Learning in R: Regression

Hello R users, new course hot off the press today by Nina Zumel – Supervised Learning in R: Regression! From a machine learning perspective, regression is the task of predicting numerical outcomes from various inputs. In this course, you’ll learn about different regression models, how to train these models in R, how to evaluate the models you train and use them to make predictions.

Data Science Primer: Basic Concepts for Beginners

This collection of concise introductory data science tutorials covers topics including the difference between data mining and statistics, supervised vs. unsupervised learning, and the types of patterns we can mine from data.

Transforming from Autonomous to Smart: Reinforcement Learning Basics

… So we are going to use this blog to deep dive into the category of artificial intelligence called reinforcement learning. We are going to see how reinforcement learning might help us to address these challenges; to work smarter at the edge when brute force technology advances will not suffice.