Building your first machine learning model using KNIME (no coding required!)

One of the biggest challenges for beginners in machine learning and data science is that there is too much to learn simultaneously, especially if you do not know how to code. You need to quickly get used to linear algebra, statistics and other mathematical concepts, and learn how to code them! It can end up being overwhelming for new users. If you have no background in coding and find it difficult to cope, you can start learning data science with a GUI-driven tool. This lets you focus your efforts on learning the subject in the early days. Once you are comfortable with the basic concepts, you can always learn how to code later on. In today’s article, I will get you started with one such GUI-based tool – KNIME. By the end of this article, you will be able to predict sales for a retail store without writing a single line of code! Let’s get started!

Crash Course On Multi-Layer Perceptron Neural Networks

Artificial neural networks are a fascinating area of study, although they can be intimidating when you are just getting started. There is a lot of specialized terminology used to describe the data structures and algorithms in the field. In this post you will get a crash course in the terminology and processes used in the field of multi-layer perceptron artificial neural networks.
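To ground some of that terminology (weights, biases, hidden layer, activation function), here is a minimal NumPy sketch of a single forward pass through a small multi-layer perceptron; the layer sizes and random weights are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # Activation function: introduces non-linearity between layers.
    return np.maximum(0.0, z)

def forward(x, W1, b1, W2, b2):
    # Hidden layer: weighted sum of inputs plus bias, then activation.
    h = relu(x @ W1 + b1)
    # Output layer: a single linear unit (e.g. for regression).
    return h @ W2 + b2

# A 3-input, 4-hidden-unit, 1-output network with random weights.
W1 = rng.normal(size=(3, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

x = np.array([[0.5, -1.0, 2.0]])   # one input example
y_hat = forward(x, W1, b1, W2, b2)
print(y_hat.shape)                  # (1, 1): one prediction for one example
```

Training would then adjust the weights via backpropagation; the forward pass above is the building block that all of the field's terminology hangs off.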

How We Combined Different Methods to Create Advanced Time Series Prediction

Today, businesses need to be able to predict demand and trends to stay in line with any sudden market changes and economic swings. This is exactly where forecasting tools, powered by Data Science, come into play, enabling organizations to successfully deal with strategic and capacity planning. Smart forecasting techniques can be used to reduce any possible risks and assist in making well-informed decisions. One of our customers, an enterprise from the Middle East, needed to predict their market demand for the upcoming twelve weeks. They required a market forecast to help them set their short-term objectives, such as production strategy, as well as assist in capacity planning and price control. So, we came up with the idea of creating a custom time series model capable of tackling the challenge. In this article, we will cover the modelling process as well as the pitfalls we had to overcome along the way.


• We are generating data at an unprecedented rate. But data is not the same as knowledge.
• To extract useful insights from the data and to tame the three Vs of data (Volume, Velocity and Variety), we need to rethink our tools and design principles.
• The general transition is as follows:
◦ Message Queues
◦ Sharding
• The new set of tools we have:
◦ NoSQL Databases – Mongo, Cassandra, HBase
◦ Highly Scalable Message Queues – Kafka
◦ Distributed filesystems – HDFS
◦ MapReduce Paradigm – Hadoop, Spark
• In this series of innovations and improvements, we have an alternate paradigm for Big Data computation – the Lambda Architecture
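The core of the Lambda Architecture is the split between a batch layer (recomputing views over the full, immutable event log) and a speed layer (incrementally updating a realtime view), merged at query time. A minimal pure-Python sketch of that idea, with illustrative names like `batch_view` and `on_new_event`:

```python
from collections import Counter

# Master dataset: an immutable, append-only log of raw events.
master_log = ["page_a", "page_b", "page_a", "page_c"]

# Batch layer: periodically recomputes a view over the whole log.
def batch_view(log):
    return Counter(log)

# Speed layer: incrementally updates a realtime view with events
# that arrived after the last batch run.
realtime_view = Counter()
def on_new_event(event):
    realtime_view[event] += 1

# Serving layer: a query merges the batch and realtime views.
def query(page, batch, realtime):
    return batch[page] + realtime[page]

batch = batch_view(master_log)      # a batch run completes
on_new_event("page_a")              # a new event arrives afterwards
print(query("page_a", batch, realtime_view))  # 3
```

In a real deployment the log would live in Kafka/HDFS, the batch layer in Hadoop or Spark, and the realtime view in a NoSQL store; the merge-at-query-time idea stays the same.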

PyTorch vs TensorFlow – spotting the difference

In this post I want to explore some of the key similarities and differences between two popular deep learning frameworks: PyTorch and TensorFlow. Why those two and not the others? There are many deep learning frameworks and many of them are viable tools; I chose those two just because I was interested in comparing them specifically.

Practical Text Classification for Production Systems

I had to create a text classification system a few months ago. Unfortunately, I had never done any text processing and didn’t know anything about NLP. Fortunately, it’s relatively easy to create a simple text classifier by modifying state-of-the-art models. This post is about using a relatively simple yet powerful text classification model for a production system. Other topics, like deployment and testing for out-of-sample texts, are also discussed – they are often not the sexiest aspects, but it makes sense to cover them here.
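As a baseline for what "a simple text classifier" can mean, here is a self-contained bag-of-words Naive Bayes sketch in pure Python (the tiny training set is invented for illustration; a production system would use a proper library and far more data):

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train_nb(docs):
    # docs: list of (text, label) pairs.
    word_counts = defaultdict(Counter)   # per-label word counts
    label_counts = Counter()
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def predict(text, word_counts, label_counts, vocab):
    total_docs = sum(label_counts.values())
    best, best_lp = None, -math.inf
    for label, n_docs in label_counts.items():
        lp = math.log(n_docs / total_docs)       # log prior
        n_words = sum(word_counts[label].values())
        for w in tokenize(text):
            # Laplace smoothing so unseen words don't zero out the score.
            lp += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("great product loved it", "pos"),
        ("terrible waste of money", "neg"),
        ("loved the quality", "pos"),
        ("terrible and broken", "neg")]
model = train_nb(docs)
print(predict("loved it great", *model))  # pos
```

A baseline like this is also useful operationally: it gives you something cheap to deploy and test against before bringing in heavier models.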

Text Analytics – Unstructured Data Analysis

Presented by noted data scientist Derek Kane, this video provides an introduction to text analytics for advanced business users and IT professionals with limited programming expertise. The presentation goes through different areas of text analytics and provides some real-world examples that help make the subject matter a little more relatable. The topics covered include search engine building, categorization (supervised and unsupervised), clustering, NLP, and social media analysis.

What is the most important step in a machine learning project?

CRISP-DM is a common standard for machine-learning projects, consisting of six steps: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation and Deployment. All six steps are crucial, and quality issues in each step directly affect the quality of the entire outcome. However, advising many organizations on machine learning, and running even more such projects ourselves, we (at YellowRoad) came to the conclusion that the most under-invested step in the process is Business Understanding. We see many companies discussing algorithms and technology before understanding the business aspects of the task that they are solving. This is clearly not a good starting point. We composed a series of questions that we use in every machine learning project we get involved in, and we do not invest serious effort in the subsequent steps until we have good answers to those questions. We find this practice to be extremely helpful.

6 practical guidelines for implementing conversational AI

It has been seven years since Apple unveiled Siri, and three since Jeff Bezos, inspired by Star Trek, introduced Alexa. But the idea of conversational interfaces powered by artificial intelligence has been around for decades. In 1966, MIT professor Joseph Weizenbaum introduced ELIZA—generally seen as the prototype for today’s conversational AI. Decades later, in a WIRED story, Andrew Leonard proclaimed that “Bots are hot,” further speculating they would soon be able to “find me the best price on that CD, get flowers for my mom [and] keep me posted on the latest developments in Mozambique.” Only the reference to CDs reveals that the story was written in 1996. Today, companies such as Slack, Starbucks, Mastercard, and Macy’s are piloting and using conversational interfaces for everything from customer service to controlling a connected home to, well, ordering flowers for mom. And if you doubt the value or longevity of this technology, consider that Gartner predicts that by 2019, virtual personal assistants “will have changed the way users interact with devices and become universally accepted as part of everyday life.” Not all conversational AI is created equal, nor should it be. Conversational AI can come in the form of virtual personal assistants (Alexa, Siri, Cortana, Google Home) or professional assistants (such as Skipflag). They can be built on top of a rules engine or based on machine learning technology. Use cases range from the minute and specific (Taco Bell’s TacoBot) to the general and hypothetically infinite (Alexa, Siri, Cortana, Google Home).

Transfer Learning with augmented Data for Logo Detection

Over the last few months, I have worked on brand logo detection in R with Keras: starting with a model from scratch, then adding more data, then using a pretrained model. The goal is to build a (deep) neural net that is able to identify brand logos in images. Just to recall, the dataset is a combination of the Flickr27 dataset, with 270 images of 27 classes, and self-scraped images from Google image search. In case you want to reproduce the analysis, you can download the set here. In the last post, I used the VGG-16 pretrained model and showed that it can be trained to achieve an accuracy of 55% on the training set and 35% on the validation set. In this post, I will show how to further improve the model accuracy.

DART: Dropout Regularization in Boosting Ensembles

In the paper http://…/korlakaivinayak15.pdf, dropout can also be used to address overfitting in boosting tree ensembles, e.g. MART, caused by so-called “over-specialization”. In particular, while the first few trees added at the beginning of the ensemble dominate the model performance, trees added later tend to improve the prediction only for a small subset of samples, which increases the risk of overfitting. The idea of DART is to build an ensemble by randomly dropping boosting tree members. The dropout percentage determines the degree of regularization for the boosting tree ensemble.
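A toy NumPy sketch of the dropout step is below. This shows only the core idea of dropping ensemble members and rescaling the survivors; the paper's actual normalization is subtler (when k trees are dropped, the newly fitted tree and the dropped trees are scaled so the combined output is not overshot). The matrix of per-tree predictions is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Suppose an ensemble of 10 boosted trees, each contributing an
# additive prediction for 5 samples (rows: trees, cols: samples).
tree_preds = rng.normal(size=(10, 5))

def dart_dropout_predict(tree_preds, drop_rate, rng):
    """One dropout round: randomly drop trees, then rescale the
    survivors so the expected ensemble output is preserved."""
    n_trees = tree_preds.shape[0]
    keep = rng.random(n_trees) >= drop_rate
    if not keep.any():                 # never drop every tree
        keep[rng.integers(n_trees)] = True
    scale = n_trees / keep.sum()       # compensate for dropped mass
    return scale * tree_preds[keep].sum(axis=0)

full = tree_preds.sum(axis=0)                       # no dropout
dropped = dart_dropout_predict(tree_preds, 0.3, rng)
print(full.shape, dropped.shape)
```

In practice you would not implement this by hand: both XGBoost and LightGBM expose DART as a booster option (`booster='dart'` and `boosting='dart'` respectively).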

Model Operational Losses with Copula Regression

In the prevailing modeling practice for operational losses, it is often convenient to assume functional independence between the frequency and severity models, which might not be the case empirically. For instance, in an economic downturn, both the frequency and the severity of consumer frauds might tend to increase simultaneously. Under the independence assumption, while we can argue that the same variables could be included in both the frequency and severity models and therefore induce a certain correlation, the frequency-severity dependence and its contribution to the loss distribution might be overlooked.
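To illustrate what frequency-severity dependence looks like, here is a simplified pure-Python sketch that samples aggregate losses through a Gaussian copula, with a Poisson frequency margin and a lognormal severity margin. All parameters (`rho`, `lam`, `mu`, `sigma`) are invented for illustration, and for simplicity one severity draw is reused for all events in a period; a real copula regression would be estimated from data.

```python
import math
import random

random.seed(1)

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def poisson_inv(u, lam):
    # Invert the Poisson CDF by accumulating probability mass.
    k = 0
    p = math.exp(-lam)
    cdf = p
    while cdf < u:
        k += 1
        p *= lam / k
        cdf += p
    return k

def sample_loss(rho, lam=3.0, mu=0.0, sigma=1.0):
    # Gaussian copula: correlated standard normals via Cholesky.
    z1 = random.gauss(0, 1)
    z2 = rho * z1 + math.sqrt(1 - rho**2) * random.gauss(0, 1)
    n = poisson_inv(phi(z1), lam)        # frequency: Poisson margin
    sev = math.exp(mu + sigma * z2)      # severity: lognormal margin
    return n, n * sev                    # (event count, aggregate loss)

samples = [sample_loss(rho=0.6) for _ in range(1000)]
```

With `rho > 0`, periods with many events also tend to have larger severities, which fattens the tail of the aggregate loss distribution relative to the independence assumption.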

Obstacles to performance in parallel programming

Making your code run faster is often the primary goal when using parallel programming techniques in R, but sometimes the effort of converting your code to use a parallel framework leads only to disappointment, at least initially. Norman Matloff, author of Parallel Computing for Data Science: With Examples in R, C++ and CUDA, has shared chapter 2 of that book online, and it describes some of the issues that can lead to poor performance.

How to Use the Fitted Mixed Model to Calculate Predicted Values

In this video I will answer a question from a recent webinar, Random Intercept and Random Slope Models.
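The arithmetic behind such predictions is simple once the model is fitted: add each subject's random-effect deviations (the BLUPs) to the fixed effects, then plug in the predictor. A minimal sketch with invented coefficients for a hypothetical random intercept and random slope model:

```python
# Fixed effects (population-level) and random effects (subject-level
# deviations) from a hypothetical fitted random intercept + slope model.
fixed = {"intercept": 2.0, "slope": 0.5}
random_effects = {                 # BLUPs for each subject
    "subj_1": {"intercept": 0.3, "slope": -0.1},
    "subj_2": {"intercept": -0.4, "slope": 0.2},
}

def predict(subject, x):
    re = random_effects[subject]
    b0 = fixed["intercept"] + re["intercept"]   # subject-specific intercept
    b1 = fixed["slope"] + re["slope"]           # subject-specific slope
    return b0 + b1 * x

print(predict("subj_1", 10))  # (2.0 + 0.3) + (0.5 - 0.1) * 10 = 6.3
```

Dropping the random-effect terms instead gives the population-level (marginal) prediction, which is the other common quantity asked about in this context.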