Teaching Machines to Draw

Abstract visual communication is a key part of how people convey ideas to one another. From a young age, children develop the ability to depict objects, and arguably even emotions, with only a few pen strokes. These simple drawings may not resemble reality as captured by a photograph, but they do tell us something about how people represent and reconstruct images of the world around them.

Predicting Mobile App User Churn: Training & Scaling Our Machine Learning Model

With the cost of acquiring new app installs skyrocketing, keeping users who have already installed your app engaged is critical for getting the most out of acquisition spend and maximizing customer lifetime value. Urban Airship’s Data Science team has spent the last year developing a way to identify and target users who are likely to stop using your app. We are calling this Predictive Churn. Here, I provide insight into the process of building a scalable predictive machine learning model over billions of events, and address how these predictive capabilities lead to new insights into user behavior, fuel new engagement strategies, and impact user retention.

R Best Practices: R you writing the R way!

Any programmer inevitably writes tons of code in their daily work. However, not all programmers develop the habit of writing clean code that others can easily understand. One reason may be a lack of awareness of the best practices for writing a program, especially among novice programmers. In this post, we list some R programming best practices that lead to improved code readability, consistency, and repeatability. Read on!

Fuzzy string Matching using fuzzywuzzyR and the reticulate package in R

I recently released another R package on CRAN, fuzzywuzzyR, which ports the fuzzywuzzy Python library to R. “fuzzywuzzy does fuzzy string matching by using the Levenshtein Distance to calculate the differences between sequences (of character strings).” This is not big news, since similar packages, such as stringdist, already exist in R. Why create the package, then? Well, I intend to participate in a recently launched Kaggle competition, and one popular method of building features (predictors) is fuzzy string matching, as explained in this blog post. My second aim was to use the reticulate package, newly released by RStudio, which “provides an R interface to Python modules, classes, and functions” and makes porting Python code to R much less cumbersome. First, I’ll explain the functionality of the fuzzywuzzyR package, and then I’ll give some examples of how to take advantage of the reticulate package in R.
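
For readers unfamiliar with the Levenshtein distance mentioned in the quote above, here is a minimal pure-Python sketch of the idea. It is not the fuzzywuzzy or fuzzywuzzyR implementation, and its score will not necessarily match theirs; the function names `levenshtein` and `similarity` and the 0-100 scaling are assumptions chosen for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Illustrative 0-100 score (hypothetical scaling, not fuzzywuzzy's ratio):
    100 means identical strings, 0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(similarity("new york mets", "new york meats"))
```

A score like this can then serve as a numeric feature (predictor) for a pair of strings, which is the use case described above.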