What is Enterprise Information Management (EIM)?

Enterprise Information Management (EIM) can best be defined as a set of technologies, processes, disciplines and practices used to manage an organization’s data and content as an enterprise asset. It is not a new concept, but its importance to large organizations is growing rapidly. Frost & Sullivan has estimated that enterprise-level data grows at a rate of 30 to 40 per cent per year, and the ability to harness the value within this information is critical to business success. That requires an effective EIM platform. For this reason, Gartner says that by 2020, ‘75% of organizations that use Enterprise Information Management to align, link and leverage their data and analytics investments will report substantially improved business outcomes’. You could describe the remit of EIM as ‘everything information’, so it is not surprising that EIM has a number of components, including Enterprise Content Management (ECM), Business Process Management (BPM), Customer Experience Management (CEM), B2B Integration, Discovery and Analytics. While each component can be deployed individually, the power of EIM software comes when all components are combined into a complete end-to-end solution set.


Beyond Word2Vec Usage For Only Words

Making a machine learning model usually takes a lot of crying, pain, feature engineering, suffering, training, debugging, validation, desperation, testing and a little bit of agony due to the infinite pain. After all that, we deploy the model and use it to make predictions on future data. We can run our little devil on a batch of data once an hour, day, week or month, or on the fly, depending on the situation and use case. Let’s take a look at an example: an online sports betting recommender engine. The goal of that engine is to predict whether the user will play a particular selection on a game or not (e.g. final score – home win, goals – 3 or more goals, etc.). These predictions are based on user history and are used to construct a ticket that is recommended to the user. To achieve fast recommendations in real time, we can calculate everything before the user even appears. This use case gives us plenty of freedom for feature extraction: we can literally play with features to make a more accurate model without hurting the performance of the application. So, basically, our application serves predictions fast, while our satanic rituals run safely in the background.
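As a rough illustration of the precompute-then-serve split described above, the sketch below scores all user/selection pairs in a background batch job and turns the online recommendation into a simple lookup. The object names (glm_fit, user_features, the "play" outcome) and the logistic-regression stand-in are hypothetical placeholders, not the engine's actual implementation.

# Rough illustration of precompute-then-serve (all object names are hypothetical).
# Assume glm_fit is a fitted logistic regression standing in for the real model,
# and user_features has one row of engineered features per user/selection pair.

# Batch job (the background ritual): score every user/selection pair ahead of time.
scores <- data.frame(
  user_id      = user_features$user_id,
  selection_id = user_features$selection_id,
  play_prob    = predict(glm_fit, newdata = user_features, type = "response")
)

# Online path: recommending a ticket is now a cheap lookup, not a model call.
recommend_ticket <- function(uid, n = 3) {
  candidates <- scores[scores$user_id == uid, ]
  head(candidates[order(-candidates$play_prob), ], n)
}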


Democratizing Artificial Intelligence, Deep Learning, Machine Learning with Dell EMC Ready Solutions

Artificial Intelligence, Machine Learning and Deep Learning (AI | ML | DL) are at the heart of digital transformation by enabling organizations to exploit their growing wealth of big data to optimize key business and operational use cases.
• AI (Artificial Intelligence) is the theory and development of computer systems able to perform tasks normally requiring human intelligence (e.g. visual perception, speech recognition, translation between languages, etc.).
• ML (Machine Learning) is a sub-field of AI that gives systems the ability to learn and improve from experience without being explicitly programmed.
• DL (Deep Learning) is a type of ML built on a deep hierarchy of layers, with each layer solving a different piece of a complex problem. These layers are interconnected into a “neural network” (see the sketch after this list). A DL framework is software that accelerates the development and deployment of these models.
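To make the “deep hierarchy of layers” concrete, here is a minimal sketch using the keras R package; the layer sizes and the 20-feature input shape are arbitrary choices for illustration, not a recommendation.

# Minimal sketch of a deep hierarchy of layers with the keras R package.
# Layer sizes and the 20-feature input shape are arbitrary illustrations.
library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 64, activation = "relu", input_shape = c(20)) %>%  # first layer learns simple patterns
  layer_dense(units = 32, activation = "relu") %>%                       # deeper layer combines them
  layer_dense(units = 1,  activation = "sigmoid")                        # output layer solves the final task

model %>% compile(optimizer = "adam", loss = "binary_crossentropy", metrics = "accuracy")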


One model to learn them all

Suppose you ask me if I’d like anything to eat. I can say the word ‘banana’ (such that you hear it spoken), send you a text message whereby you see (and read) the word ‘banana,’ show you a picture of a banana, and so on. All of these different modalities (the sound waves, the written word, the visual image) tie back to the same concept – they are different ways of ‘inputting’ the banana concept. Your conception of bananas is independent of the way the thought popped into your head. Likewise, as an ‘output’ I could ask you to say the word banana, write the word banana, draw a picture of a banana, and so on. We are able to reason about such concepts independently of the input and output modalities. And we seem able to reuse our conceptual knowledge of bananas in many different contexts (i.e., across many different tasks). Deep neural networks are typically designed and tuned for the problem at hand. Generalisation helps such a network to do well on new instances of the same problem not seen before, and transfer learning sometimes gives us a leg up by reusing, for example, learned feature representations from within the same domain. There do exist multi-task models, “but all these models are trained on other tasks from the same domain: translation tasks are trained with other translation tasks, vision tasks with other vision tasks, speech tasks with other speech tasks.” It’s as if we had one concept for the written word ‘banana’, another concept for pictures of bananas, and another concept for the spoken word ‘banana’ – but these weren’t linked in any way.


We need to build machine learning tools to augment machine learning engineers

In this post, I share slides and notes from a talk I gave in December 2017 at the Strata Data Conference in Singapore, offering suggestions to companies that are actively deploying products infused with machine learning capabilities. Over the past few years, the data community has focused on infrastructure and platforms for data collection, including robust pipelines and highly scalable storage systems for analytics. According to a recent LinkedIn report, the top two emerging jobs are “machine learning engineer” and “data scientist.” Companies are starting to staff up to put their data infrastructure to work, and machine learning is going to become more prevalent in the years to come.


Building a neural network from scratch in R

Neural networks can seem like a bit of a black box. But in some ways, a neural network is little more than several logistic regression models chained together. In this post I will show you how to derive a neural network from scratch with just a few lines in R. If you don’t like mathematics, feel free to skip to the code chunks towards the end. This blog post is partly inspired by Denny Britz’s article, Implementing a Neural Network from Scratch in Python, as well as this article by Sunil Ray.
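To see why “several logistic regression models chained together” is a fair description, here is a minimal forward pass of a one-hidden-layer network in base R; the weights are random placeholders rather than the trained values derived in the post.

# Minimal sketch in base R: a one-hidden-layer network is logistic regressions chained together.
# Weights are random for illustration; the post derives how to actually train them.
sigmoid <- function(z) 1 / (1 + exp(-z))

set.seed(42)
X  <- matrix(rnorm(20), nrow = 10, ncol = 2)   # 10 observations, 2 features
W1 <- matrix(rnorm(6),  nrow = 2, ncol = 3)    # input -> 3 hidden units
W2 <- matrix(rnorm(3),  nrow = 3, ncol = 1)    # hidden -> 1 output unit

hidden <- sigmoid(X %*% W1)        # each hidden unit is a little logistic regression
output <- sigmoid(hidden %*% W2)   # the output layer is another one, stacked on top
output                             # predicted probabilities for the 10 observations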


Deep Learning from first principles in Python, R and Octave – Part 2

This post is a follow-up to my earlier post, Deep Learning from first principles in Python, R and Octave – Part 1. In the first part, I implemented Logistic Regression, in vectorized Python, R and Octave, as a wannabe Neural Network (a Neural Network with no hidden layers). In this second part, I implement a regular, but somewhat primitive, Neural Network (a Neural Network with just one hidden layer). This part implements classification on manually created datasets, where the clusters of the two classes are not linearly separable.
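For a feel of what “not linearly separable” means, here is a small sketch that manually creates such a dataset in R using two concentric rings; this construction is my own illustration and not necessarily the exact datasets used in the post.

# Sketch of manually created data whose two classes no straight line can separate
# (a two-concentric-rings construction; the post's exact datasets may differ).
set.seed(1)
n      <- 200
theta  <- runif(n, 0, 2 * pi)
radius <- ifelse(runif(n) < 0.5, 1, 3)            # inner ring = class 0, outer ring = class 1
x1     <- radius * cos(theta) + rnorm(n, sd = 0.2)
x2     <- radius * sin(theta) + rnorm(n, sd = 0.2)
label  <- as.integer(radius > 2)

plot(x1, x2, col = label + 1, pch = 19,
     main = "Two classes that are not linearly separable")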


Deep Learning With Keras To Predict Customer Churn

Customer churn is a problem that all companies need to monitor, especially those that depend on subscription-based revenue streams. The simple fact is that most organizations have data that can be used to target these individuals and to understand the key drivers of churn. We now have Keras for Deep Learning available in R (yes, in R!), and in this analysis it predicted customer churn with 82% accuracy. We’re super excited about this article because we are using the new keras package to produce an Artificial Neural Network (ANN) model on the IBM Watson Telco Customer Churn Data Set! As with most business problems, it’s equally important to explain which features drive the model, which is why we use the lime package for explainability and cross-check the LIME results with a correlation analysis using the corrr package. In addition, we use three new packages to assist with Machine Learning (ML): recipes for preprocessing, rsample for sampling data and yardstick for model metrics. These are relatively new additions to CRAN developed by Max Kuhn at RStudio (creator of the caret package). It seems that R is quickly developing ML tools that rival Python. Good news if you’re interested in applying Deep Learning in R! We are, so let’s get going!
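A rough sketch of the kind of workflow described above, using rsample for the train/test split, recipes for preprocessing and keras for the ANN. The column handling, layer sizes and training settings are illustrative guesses rather than the article's exact code, and churn_data stands in for the IBM Watson Telco dataset with identifier columns already dropped.

# Illustrative sketch only; churn_data stands in for the Telco dataset (Churn = "Yes"/"No",
# identifier columns already removed). Hyperparameters are placeholders.
library(rsample)   # train/test split
library(recipes)   # preprocessing
library(dplyr)     # data manipulation
library(keras)     # the ANN itself

split <- initial_split(churn_data, prop = 0.8)
train <- training(split)
test  <- testing(split)

rec <- recipe(Churn ~ ., data = train) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%   # one-hot encode categorical predictors
  step_center(all_predictors()) %>%
  step_scale(all_predictors()) %>%
  prep()

baked   <- bake(rec, new_data = train)
x_train <- as.matrix(select(baked, -Churn))
y_train <- as.integer(baked$Churn == "Yes")

model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = ncol(x_train)) %>%
  layer_dropout(rate = 0.1) %>%
  layer_dense(units = 1, activation = "sigmoid")

model %>% compile(optimizer = "adam", loss = "binary_crossentropy", metrics = "accuracy")
model %>% fit(x_train, y_train, epochs = 30, batch_size = 50, validation_split = 0.2)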


Bayesian Binomial Test in R

In this post, I implement an R function for computing P(theta1 > theta2), where theta1 and theta2 are beta-distributed random variables. This is useful for estimating the probability that one binomial proportion is greater than another.
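As a rough sketch of one way such a function can work, the probability can be estimated by Monte Carlo sampling from the two Beta posteriors. The uniform Beta(1, 1) prior and the example counts below are illustrative; the post's exact implementation may differ.

# Sketch: estimate P(theta1 > theta2) by sampling from the two Beta posteriors.
# Uses a uniform Beta(1, 1) prior by default; the post's implementation may differ.
prob_theta1_gt_theta2 <- function(success1, n1, success2, n2,
                                  prior_a = 1, prior_b = 1, n_draws = 1e5) {
  theta1 <- rbeta(n_draws, prior_a + success1, prior_b + n1 - success1)
  theta2 <- rbeta(n_draws, prior_a + success2, prior_b + n2 - success2)
  mean(theta1 > theta2)   # fraction of draws in which theta1 exceeds theta2
}

# Example: 45/100 successes vs 38/100 successes
prob_theta1_gt_theta2(45, 100, 38, 100)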


A Primer on Web Scraping in R

If you are a data scientist who wants to capture data from web pages, you wouldn’t want to be the one to open all those pages manually and scrape them one by one. To remove the barriers that keep data scientists from accessing such data, there are packages available in R.
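As a taste of what those packages look like in practice, here is a minimal sketch using rvest, one of the R packages commonly used for scraping; the URL and the CSS selectors are placeholders, not a real site.

# Minimal scraping sketch with rvest; the URL and CSS selectors are placeholders
# to be replaced with the page and elements you actually want.
library(rvest)

page <- read_html("https://example.com/listings")

titles <- page %>%
  html_nodes(".listing-title") %>%   # pick elements by CSS selector
  html_text(trim = TRUE)

prices <- page %>%
  html_nodes(".listing-price") %>%
  html_text(trim = TRUE)

listings <- data.frame(title = titles, price = prices)
head(listings)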