Natural Language Processing Made Easy – using SpaCy (in Python)

Natural Language Processing is one of the principal areas of Artificial Intelligence. NLP plays a critical role in many intelligent applications such as automated chatbots, article summarizers, multilingual translation and opinion mining from data. Every industry that exploits NLP to make sense of unstructured text data demands not just accuracy, but also speed in obtaining results. Natural Language Processing is a vast field; some of the tasks in NLP are text classification, entity detection, machine translation, question answering, and concept identification. In one of my previous articles, I discussed various tools and components that are used in the implementation of NLP. Most of the components discussed in that article were described using the venerable NLTK (Natural Language Toolkit) library. In this article, I will share my notes on one of the most powerful and advanced libraries used to implement NLP – spaCy.

How Not To Program the TensorFlow Graph

Using TensorFlow from Python is like using Python to program another computer. Some Python statements build your TensorFlow program, some Python statements execute that program, and of course some Python statements aren’t involved with TensorFlow at all. Being thoughtful about the graphs you construct can help you avoid confusion and performance pitfalls. Here are a few considerations.
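The "program another computer" idea above can be made concrete with a toy deferred-evaluation graph in plain Python. This is an analogy only, not the real TensorFlow API: every name below (`Node`, `constant`, `add`) is hypothetical, but the two-phase split it shows, building a program first and executing it later, is the same one the paragraph describes.

```python
class Node:
    """A graph node: constructing one records an operation but computes nothing."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def run(self):
        """Execution phase: recursively evaluate the recorded operations."""
        return self.fn(*(i.run() if isinstance(i, Node) else i
                         for i in self.inputs))

def constant(value):
    # Building a constant just wraps the value; nothing is evaluated yet.
    return Node(lambda: value)

def add(a, b):
    # Building an add records the operation and its input nodes.
    return Node(lambda x, y: x + y, a, b)

# Construction phase: these lines build the "program" but run no arithmetic.
graph = add(constant(2), add(constant(3), constant(4)))

# Execution phase: only now is the computation actually performed.
print(graph.run())  # 9
```

Keeping the two phases straight, as in this toy, is exactly the habit the article recommends: know which of your Python statements build the graph and which execute it.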

Comparing Machine Learning as a Service: Amazon, Microsoft Azure, Google Prediction API

For most businesses, machine learning seems close to rocket science, appearing expensive and talent-demanding. And, if you’re aiming at building another Netflix recommendation system, it really is. But the trend of making everything-as-a-service has affected this sophisticated sphere, too. You can jump-start an ML initiative without much investment, which would be the right move if you are new to data science and just want to grab the low-hanging fruit.

Dogs vs. Cats Redux Playground Competition, Winner’s Interview: Bojan Tunguz

The Dogs vs. Cats Redux: Kernels Edition playground competition revived one of our favorite ‘for fun’ image classification challenges from 2013, Dogs vs. Cats. This time Kaggle brought Kernels, the best way to share and learn from code, to the table, while competitors tackled the problem with a refreshed arsenal including TensorFlow and a few years of deep learning advancements. In this winner’s interview, Kaggler Bojan Tunguz shares his 4th-place approach based on deep convolutional neural networks and model blending.
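Model blending, the general technique the interview mentions, can be sketched in a few lines: combine several models' predicted probabilities as a weighted average. The weights and numbers below are invented for illustration and are not Tunguz's actual setup.

```python
def blend(predictions, weights):
    """Weighted average of several models' predicted probabilities."""
    total = sum(weights)
    return [
        sum(w * p[i] for p, w in zip(predictions, weights)) / total
        for i in range(len(predictions[0]))
    ]

# Hypothetical per-image "probability of dog" from two CNNs:
cnn_a = [0.90, 0.20, 0.65]
cnn_b = [0.80, 0.40, 0.55]

# Give the stronger model twice the weight of the weaker one.
blended = blend([cnn_a, cnn_b], weights=[2, 1])
print(blended)  # first image: (2*0.90 + 1*0.80) / 3 ≈ 0.867
```

In practice the weights are usually chosen on a held-out validation set, since blending only helps when the models make somewhat different errors.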

Introduction to TensorFlow-Slim: Complex TensorFlow Model Building and Training Made Easy

TensorFlow-Slim (TF-Slim) is a TensorFlow wrapper library that allows you to build and train complex TensorFlow models in an easy, intuitive way by eliminating the boilerplate code that plagues many deep learning algorithms. This course teaches you how to use TF-Slim and is intended for learners with some previous experience working with TensorFlow.

Forecasting Markets using eXtreme Gradient Boosting (XGBoost)

In recent years, machine learning has been generating a lot of curiosity for its profitable application to trading. Numerous machine learning models, such as linear/logistic regression, support vector machines, neural networks and tree-based models, are being tried and applied in an attempt to analyze and forecast the markets. Researchers have found that some models have a higher success rate than others. eXtreme Gradient Boosting, also called XGBoost, is one such machine learning model that has received rave reviews from machine learning practitioners. In this post, we will cover the basics of XGBoost, a winning model for many Kaggle competitions. We then attempt to develop an XGBoost stock forecasting model using the “xgboost” package in R.
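The core idea behind gradient boosting can be shown with a bare-bones sketch: each new "tree" (here just a one-split stump on 1-D data) is fit to the residuals of the ensemble so far. This is a teaching illustration of the principle only, not the xgboost algorithm, which adds regularization, shrinkage schedules and sophisticated split finding; all data below is made up.

```python
def fit_stump(xs, residuals):
    """Best single split on 1-D data minimizing squared error."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        lmean = sum(left) / len(left) if left else 0.0
        rmean = sum(right) / len(right) if right else 0.0
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x: lmean if x <= split else rmean

def boost(xs, ys, rounds=10, lr=0.5):
    """Gradient boosting for squared error: stumps fit to residuals."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Toy target with a jump between x=3 and x=4; error shrinks each round.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
model = boost(xs, ys)
print([round(model(x), 2) for x in xs])
```

Real gradient boosting generalizes this from residuals to the gradient of an arbitrary loss, which is what lets XGBoost handle classification and ranking as well as regression.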

Discrete Event Simulation in R (and, Why R Is Different)

I was pleased to see the announcement yesterday of simmer 3.6.1, a discrete-event simulation (DES) package for R. I’ve long had an interest in DES, and as I will explain below, implementing DES in R brings up interesting issues about R that transcend the field of DES. I had been planning to discuss them in the context of my own DES package for R, and the above announcement will make a good springboard for that.
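For readers new to DES, the mechanism at its heart is a time-ordered event queue: pop the earliest event, advance the clock, run its callback, which may schedule further events. Here is a minimal sketch in Python (simmer itself is an R package, and this toy shares nothing with its API; all names are hypothetical).

```python
import heapq

class Simulation:
    """Tiny discrete-event engine: a clock plus a priority queue of events."""
    def __init__(self):
        self.now = 0.0
        self._events = []   # heap of (time, seq, callback)
        self._seq = 0       # tie-breaker so callbacks are never compared

    def schedule(self, delay, callback):
        heapq.heappush(self._events, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._events:
            self.now, _, callback = heapq.heappop(self._events)
            callback()

# Toy model: a customer arrives every 2.0 time units and stays for 0.5.
log = []
sim = Simulation()

def arrival(i):
    log.append((sim.now, f"customer {i} arrives"))
    sim.schedule(0.5, lambda: log.append((sim.now, f"customer {i} leaves")))
    if i < 3:
        sim.schedule(2.0, lambda: arrival(i + 1))

sim.schedule(0.0, lambda: arrival(1))
sim.run()
for t, event in log:
    print(f"t={t:.1f}: {event}")
```

The interesting question the post goes on to raise is how naturally this callback-and-queue pattern maps onto R's functional, copy-on-modify semantics.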

Understanding the Tucker decomposition, and compressing tensor-valued data (with R code)

In this post I introduce the Tucker decomposition (Tucker (1966) “Some mathematical notes on three-mode factor analysis”).
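As a quick taste of what the post covers (its own code is in R), here is a compact NumPy sketch of the Tucker decomposition computed via higher-order SVD: take the left singular vectors of each mode-n unfolding as factor matrices, then contract them against the tensor to form the core. With full ranks the reconstruction is exact; truncating the ranks is what compresses the data.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_multiply(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = M @ unfold(T, mode)
    new_shape = (M.shape[0],) + T.shape[:mode] + T.shape[mode + 1:]
    return np.moveaxis(out.reshape(new_shape), 0, mode)

def tucker_hosvd(T, ranks):
    """Higher-order SVD: factors from each unfolding, then form the core."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = mode_multiply(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    T = core
    for mode, U in enumerate(factors):
        T = mode_multiply(T, U, mode)
    return T

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 5, 6))
core, factors = tucker_hosvd(T, ranks=(4, 5, 6))   # full ranks: lossless
print(np.allclose(reconstruct(core, factors), T))  # True
```

Choosing smaller ranks, say `(2, 2, 2)`, trades reconstruction error for storage: you keep only the small core and three thin factor matrices instead of the full tensor.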

Python and R Vie for Top Spot in Kaggle Competitions

I’ve just updated the Competition Use section of The Popularity of Data Science Software. Here’s just that section for your convenience.