Python for Data Science – Tutorial for Beginners #1 – Python Basics

If you are learning Data Science, you will meet Python pretty soon. Why is that? Because it is one of the most commonly used languages for working with data. Its popularity comes from 3 things:
• Python is fairly easy to interpret and learn.
• Python handles different data structures very well.
• Python has very powerful statistical and data visualization libraries.

Major update of D3partitionR: Interactive viz’ of nested data with R and D3.js

D3partitionR is an R package for interactively visualizing nested and hierarchical data using D3.js and HTML widgets. For the last few weeks I’ve been working on a major D3partitionR update, which is now available on GitHub. As soon as enough feedback is collected, the package will be uploaded to CRAN.

August Kaggle Dataset Publishing Awards Winners’ Interview

In August, over 350 new datasets were published on Kaggle, in part sparked by our $10,000 Datasets Publishing Award. This interview delves into the stories and background of August’s three winners: Ugo Cupcic, Sudalai Rajkumar, and Colin Morris. They answer questions about what stirred them to create their winning datasets and share kernel ideas they’d love to see other Kagglers explore.

5 Ways to Get Started with Reinforcement Learning

Machine learning algorithms, and neural networks in particular, are considered to be the cause of a new AI ‘revolution’. In this article I will introduce the concept of reinforcement learning but with limited technical details so that readers with a variety of backgrounds can understand the essence of the technique, its capabilities and limitations.

You need an Analytics Center of Excellence

More than 10 years after big data emerged as a new technology paradigm, it is finally in a mature state and its business value throughout most industry sectors is established by a significant number of use cases. A couple of years ago, the discussion was still about how big data changed our way of capturing, processing, analyzing, and exploiting data in new and meaningful ways for business decision makers. Now many companies undertake analytical projects at a departmental level, redefining the relationship between business and IT by adopting Agile and DevOps methodologies. Real-time processing, machine learning algorithms, and even artificial intelligence are the new normal in business talk. However, companies are still struggling to adopt big data at a corporate level. In many corporations, there is a gap between launching departmental projects and industrializing and scaling up those use cases across the whole corporation. Embedding big data in scalable business processes is crucial to becoming a data-driven organization. Building an Analytics Center of Excellence (ACoE) can be the basis for this transformation.


Hi! We have added an R API for mljar, so you can run sklearn, xgboost, lightGBM, Keras, and RGF from one line of R 🙂 Please check it out at: https://…/mljar-api-R

12 Visualizations to Show a Single Number

Infographics, dashboards, and reports often need to highlight or visualize a single number. But how do you highlight a single number so that it has an impact and looks good? It can be a big challenge to make a lonely, single number look great. In this post, I show 12 different ways of representing a single number. Most of these visualizations have been created automatically using R.

New: Streaming Google Analytics Data for BigQuery

Streaming data for BigQuery export is here. Today we’re happy to announce that data for the Google Analytics BigQuery export can be streamed as often as every 10 minutes into Google Cloud. If you’re a Google Analytics 360 client who wants to do current-day analysis, this means you can choose to send data to BigQuery up to six times per hour for almost real-time analysis and action. That’s a 48x improvement over the existing three-times-per-day exports.
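
The 48x figure can be checked from the two export frequencies quoted above. Here is a minimal Python sketch of that arithmetic; the variable names are purely illustrative and are not part of any Google Analytics or BigQuery API.

```python
# Illustrative arithmetic only: these names are hypothetical,
# not part of any Google Analytics or BigQuery API.
MINUTES_PER_DAY = 24 * 60

streaming_interval_minutes = 10  # "as often as every 10 minutes"
batch_exports_per_day = 3        # the existing three-times-per-day exports

# Exports per hour and per day when streaming every 10 minutes
streaming_exports_per_hour = 60 // streaming_interval_minutes
streaming_exports_per_day = MINUTES_PER_DAY // streaming_interval_minutes

# Ratio of daily export counts: streaming versus batch
improvement = streaming_exports_per_day // batch_exports_per_day

print(streaming_exports_per_hour, streaming_exports_per_day, improvement)
# prints: 6 144 48
```

In other words, six exports per hour is 144 per day, and 144 divided by the existing 3 per day gives the quoted 48x.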

Why I use R for Data Science – An Ode to R

Working in Data Science, I often feel like I have to justify using R over Python. And while I do use Python for running scripts in production, I am much more comfortable with the R environment. Basically, whenever I can, I use R for prototyping, testing, visualizing and teaching. But because a personal gut-feeling preference isn’t a very good reason to give to (scientifically minded) people, I’ve thought a lot about the pros and cons of using R. This is what I came up with to explain why I still prefer R.