Tutorial to deploy Machine Learning models in Production as APIs (using Flask)

I remember the initial days of my Machine Learning (ML) projects. I had put a lot of effort into building a really good model. I took expert advice on how to improve my model, thought about feature engineering, and talked to domain experts to make sure their insights were captured. But then I came across a problem: how do I implement this model in real life? I had no idea. All the literature I had studied until then focused on improving models, and I didn’t know what the next step was. This is why I have created this guide, so that you don’t have to struggle with the question as I did. By the end of this article, I will show you how to implement a machine learning model using the Flask framework in Python.

Multicollinearity in R

One of the assumptions of the Classical Linear Regression Model is that there is no exact collinearity between the explanatory variables. If the explanatory variables are perfectly correlated, you will face these problems:
• Parameters of the model become indeterminate
• Standard errors of the estimates become infinitely large
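
The first problem can be seen in a few lines of code. The following is only a minimal sketch, written in Python rather than R and using made-up data in which one regressor is exactly twice the other. The names x1, x2 and dot are illustrative assumptions, not taken from the article. With perfect collinearity the determinant of X'X is zero, so the normal equations have no unique solution.

```python
# Hypothetical data: x2 is an exact multiple of x1 (perfect collinearity).
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0 * v for v in x1]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


# Entries of X'X for a model without intercept: y = b1*x1 + b2*x2.
xtx = [
    [dot(x1, x1), dot(x1, x2)],
    [dot(x2, x1), dot(x2, x2)],
]

# A zero determinant means X'X cannot be inverted, so b1 and b2
# cannot be determined separately.
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
print(det)
```

Any pair (b1, b2) with the same value of b1 + 2*b2 fits this data equally well, which is what "indeterminate" means here.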

R 3.4.2 is released (with several bug fixes and a few performance improvements)

R 3.4.2 (codename “Short Summer”) was released yesterday.

Oneway ANOVA Explanation and Example in R; Part 2

One relatively common question in statistics or data science is: how “big” is the difference or the effect? At this point we can state with some statistical confidence that tire brand matters in predicting tire mileage life; given our data, it isn’t likely that we would see results like these by chance. But is this a really big difference between the brands? Often this is the most important question in our research. After all, if it’s a big difference, we might change our shopping habits and/or pay more. Is there a way of knowing how big this difference is?
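
One common way to put a number on "how big" in a one-way ANOVA is eta squared, the share of total variation explained by group membership. The sketch below is in Python rather than R and uses invented mileage figures for three made-up brands. The data, the brand labels and the function name eta_squared are all assumptions for illustration, and the excerpt does not say which effect-size measure the article itself uses.

```python
from statistics import mean

# Hypothetical mileage figures (in thousands of miles) for three made-up brands.
groups = {
    "A": [30.0, 32.0, 34.0],
    "B": [36.0, 38.0, 40.0],
    "C": [33.0, 35.0, 37.0],
}


def eta_squared(groups):
    """Return SS_between / SS_total for a dict of group -> observations."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    # Variation of every observation around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    return ss_between / ss_total


print(round(eta_squared(groups), 3))
```

A value near 0 means brand explains almost none of the variation in mileage; a value near 1 means it explains almost all of it.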

Python Data Structures (Python for Data Science Basics #2)

Where did we leave off? Oh, right, we learned how to use variables in Python. Here is the second essential topic that you have to learn if you are going to use Python as a Data Scientist: Python Data Structures!
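
As a quick taste, here is a minimal sketch of three built-in Python data structures. The excerpt does not say which structures the article covers, so the choice of list, tuple and dictionary, and all of the variable names and values, are assumptions for illustration.

```python
# A list is ordered and mutable: items can be added or changed.
scores = [88, 92, 79]
scores.append(95)

# A tuple is ordered but immutable: it cannot be changed after creation.
point = (3, 4)

# A dictionary maps keys to values.
ages = {"Ann": 31, "Bob": 27}
ages["Cat"] = 45

print(scores[0], point[1], ages["Bob"])
```

Lists and tuples are indexed by position starting at 0, while dictionaries are indexed by key.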

Interactive Visualizations In Jupyter Notebook

This entry is a non-exhaustive introduction to creating interactive content directly from your Jupyter notebook. Content mostly refers to data visualization artifacts, but we’ll see that we can easily expand beyond the usual plots and graphs, providing worthy interactive bits for all kinds of scenarios, from data exploration to animations.

Probability and Statistics

Next week, I will start a short course on probability and statistics. The slides of the course are now online. There will be more information soon about the exam and the projects.

New Theory Unveils the Black Box of Deep Learning

In the video presentation below (courtesy of Yandex) – “Deep Learning: Theory, Algorithms, and Applications” – Naftali Tishby, a computer scientist and neuroscientist from the Hebrew University of Jerusalem, provides evidence in support of a new theory explaining how deep learning works. Tishby argues that deep neural networks learn according to a procedure called the “information bottleneck,” which he and two collaborators first described in purely theoretical terms in 1999. The idea is that a network rids noisy input data of extraneous details as if by squeezing the information through a bottleneck, retaining only the features most relevant to general concepts. Striking new computer experiments by Tishby and his student Ravid Shwartz-Ziv reveal how this squeezing procedure happens during deep learning, at least in the cases they studied.
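
For readers who want the idea in symbols, the information bottleneck is commonly written as a trade-off between compression and prediction. The notation below is an assumption, not taken from the presentation: X is the input, Y the target, T the compressed representation, I(·;·) mutual information, and β a trade-off parameter.

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)
```

The first term rewards discarding detail about the input, while the second rewards keeping whatever is informative about the target; a larger β favours prediction over compression.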

Tensorflow Tutorial : Part 2 – Getting Started

In this multi-part series, we will explore how to get started with TensorFlow. This TensorFlow tutorial will lay a solid foundation for this popular tool that everyone seems to be talking about. The second part is a TensorFlow tutorial on getting started, installing, and building a small use case. This series consists of excerpts from a webinar tutorial series I have conducted as part of the United Network of Professionals. From time to time I will refer to some of the slides that I used in the talk to make things clearer.

The age of machine learning

Ben Lorica discusses the state of machine learning.

Unleashing intelligence and data analytics at scale

Ziya Ma explains how Intel is driving a holistic approach to powering advanced analytics and artificial intelligence workloads.

bupaR: Business Process Analysis with R

Organizations nowadays store huge amounts of data related to various business processes. Process mining provides different methods and techniques to analyze and improve these processes. This allows companies to gain a competitive advantage. Process mining began with the discovery of workflow models from event data. However, over the past 20 years, the process mining field has evolved into a broad and diverse research discipline. bupaR is an open-source suite for the handling and analysis of business process data in R. It was developed by the Business Informatics research group at Hasselt University, Belgium. The central package includes basic functionality for creating event log objects in R. It contains several functions to get information about an event log and also provides specific event log versions of generic R functions. Together with the related packages, each of which has its own specific purpose, bupaR aims at supporting each step in the analysis of event data with R, from data import to online process monitoring.

How to rapidly master data science

To rapidly master data science, you need to do several things:
1. Break it down
2. Figure out what to do, and what not to do
3. Design a plan
4. Learn
5. Practice
Let’s dive into each of these.