How to create animated GIF images for data visualization using gganimate (in R)?

How you create data stories and visualizations has a huge impact on how your customers look at your work. Ultimately, data science is not only about how complicated and sophisticated your models are. It is about solving problems using data-based insights. And in order to implement these solutions, your stakeholders need to understand what you are proposing. One of the challenges in creating effective visualizations is to create images that speak for themselves. This article describes one way to do so, using animated GIF (Graphics Interchange Format) images. This is particularly helpful when you want to tell time- or flow-based stories. Using animation, you can plot comparable data over time for a specific set of parameters. In other words, it becomes easy to see the growth of a given parameter over time. Let me show this with an example.


Data Modelling Topologies of a Graph Database

There is a lot of confusion about the definition of graph databases. In my opinion, any definition that avoids reference to the semantics of nodes and edges or their internal structure is preferable. If you do not follow this guideline, you unavoidably favor specific implementations, e.g. Property Graph Databases or Triple Stores, and you may easily become myopic to other types that are based on different models, e.g. hypergraph databases, or different data storage paradigms, e.g. key-value stores. Therefore, I propose we adopt a vendor-neutral definition, such as the following one, which cannot exclude any future type of graph database.


100 Free Tutorials for Learning R

The R programming language tutorials listed below are suitable for beginners through advanced users. R is the world’s most widely used programming language for statistical analysis, predictive modeling and data science. Its popularity is reported in many recent surveys and studies. R is getting more powerful day by day as the number of supported packages grows. Some big IT companies, such as Microsoft and IBM, have also started developing R packages and offering enterprise versions of R.


Image Segmentation using deconvolution layer in Tensorflow

In this series of posts, we shall learn the algorithm for image segmentation and how to implement it using TensorFlow. This is the first part of the series, where we shall focus on understanding and implementing a deconvolutional (fractionally strided convolutional) layer in TensorFlow.
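To make the idea of a fractionally strided (transposed) convolution concrete, here is a minimal one-dimensional sketch in plain Python. It is not the code from the series and does not use TensorFlow; the function name `conv_transpose_1d`, the kernel values, and the no-padding convention are assumptions chosen for illustration.

```python
def conv_transpose_1d(x, kernel, stride=2):
    """Transposed (fractionally strided) 1-D convolution without padding.

    Each input value scatters a scaled copy of the kernel into the output,
    with consecutive copies placed `stride` positions apart. Overlapping
    contributions are summed.
    """
    out_len = (len(x) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out


# Hypothetical input and kernel: 3 values are upsampled to 6.
upsampled = conv_transpose_1d([1.0, 2.0, 3.0], [1.0, 0.5], stride=2)
print(upsampled)
```

The output length follows (len(x) - 1) * stride + len(kernel), which is why such a layer can be used to upsample a feature map back toward the input resolution in a segmentation network.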


Weather Forecast With Regression Models – Part 4


How to use Windows Linux Subsystem & Win10 side by side for Machine Learning and Coding!

A large number of open source libraries/modules in machine learning are first made available for Linux, and the Windows versions are always released later. Maintaining two separate operating systems on dual boot or switching between virtual machines is not the best way. If you already work only on Linux, then this guide is not for you. If you are in a corporate environment with no dual boot and want to run Linux and at the same time stay on the AD server, this is the ideal solution.


K-means Clustering with Tableau – Call Detail Records Example

We show how to use the Tableau 10 clustering feature to create statistically based segments that provide insights about the similarities within different groups and how the groups perform when compared to each other.


Normalization in Deep Learning

A few days ago (Jun 2017), a 100-page paper on Self-Normalizing Networks appeared. An amazing piece of theoretical work, it claims to have solved the problem of building very large Feed Forward Networks (FNNs). It builds upon Batch Normalization (BN), which was introduced in 2015 and is now the de facto standard for all CNNs and RNNs, but is not so useful for FNNs. What makes normalization so special? It makes very deep networks easier to train by damping out oscillations in the distribution of activations.
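As a rough illustration of what normalizing activations means, the sketch below applies the training-time Batch Normalization transform to a single feature across one mini-batch, in plain Python. It is only a sketch of the BN formula, not of the Self-Normalizing Networks method; the function name `batch_norm`, the sample values, and the defaults for `gamma`, `beta` and `eps` are assumptions for illustration.

```python
import math


def batch_norm(activations, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    Uses the batch mean and the biased batch variance, as in the
    training-time form of Batch Normalization.
    """
    n = len(activations)
    mean = sum(activations) / n
    var = sum((a - mean) ** 2 for a in activations) / n
    return [gamma * (a - mean) / math.sqrt(var + eps) + beta
            for a in activations]


# Hypothetical activations of one unit for a mini-batch of four examples.
normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalized)
```

After the transform the values have a mean of approximately zero and a variance of approximately one, whatever the scale of the incoming activations, which is what keeps their distribution from drifting from layer to layer.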


Partial least squares in R

My last entry introduced principal component analysis (PCA), one of many unsupervised learning tools. I concluded the post with a demonstration of principal component regression (PCR), which essentially is an ordinary least squares (OLS) fit using the first k principal components (PCs) from the predictors. This brings about many advantages …


Automatic tools for improving R packages

During my talk at RUG BCN, I gave a short introduction to each tool and then applied it to a small package I had created for the occasion. In this post I’ll just briefly present each tool. Most of them are only automatic in that they automatically provide you with a list of things to fix, but they won’t do the work for you, sorry. If you have an R package you are developing at hand, I’d really advise you to apply them to it and see what you get! I concentrated on tools for improving coding style, package structure, testing and documentation, but not features and performance.


Non-Standard Evaluation and Function Composition in R

In this article we will discuss composing standard-evaluation interfaces (SE) and composing non-standard-evaluation interfaces (NSE) in R.