Machine Learning Translation and the Google Translate Algorithm

Every day we use technologies without knowing exactly how they work, and engines powered by machine learning are among the hardest to understand. The Statsbot team wants to make machine learning clear by telling data stories on this blog. Today we explore machine translators and explain how the Google Translate algorithm works.

Machine Learning Explained: Dimensionality Reduction

Dealing with many dimensions can be painful for machine learning algorithms. High dimensionality increases computational complexity, increases the risk of overfitting (since the algorithm has more degrees of freedom), and makes the data sparser. Dimensionality reduction limits these problems by projecting the data into a lower-dimensional space. In this post, we will first build an intuition for what dimensionality reduction is, then focus on the most widely used techniques.
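Among the widely used techniques the post covers, principal component analysis (PCA) is the classic starting point. As a rough illustration of the "project into a lower-dimensional space" idea, here is a minimal PCA sketch in NumPy (function and variable names are our own, not from the article):

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    # Center the data so the covariance computation is meaningful
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features
    cov = np.cov(X_centered, rowvar=False)
    # Eigendecomposition; eigh is appropriate because cov is symmetric
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort eigenvectors by decreasing eigenvalue (explained variance)
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:n_components]]
    # Project onto the top components -> lower-dimensional representation
    return X_centered @ components

# 200 points in 5 dimensions, reduced to 2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X_reduced = pca(X, n_components=2)
print(X_reduced.shape)  # (200, 2)
```

The directions kept are those along which the data varies most, so the projection discards as little information as possible for a linear method.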

Picasso: A free open-source visualizer for Convolutional Neural Networks

While it’s easier than ever to define and train deep neural networks (DNNs), the learning process itself remains somewhat opaque. Monitoring the loss or classification error during training won’t always prevent your model from learning the wrong thing or learning a proxy for your intended classification task.

Beautiful Python Visualizations: An Interview with Bryan Van de Ven, Bokeh Core Developer

Read this insightful interview with Bokeh’s core developer, Bryan Van de Ven, and gain an understanding of what Bokeh is, when and why you should use it, and what makes Bryan a great fit for helming this project.

The wisdom hierarchy: From signals to artificial intelligence and beyond

We are swimming in data. Or possibly drowning in it. Organizations and individuals are generating and storing data at a phenomenal and ever-increasing rate. The volume and speed of data collection have given rise to a host of new technologies and new roles focused on dealing with this data: managing it, organizing it, storing it. But we don’t want data. We want insight and value.

How can I add simple, automated data visualizations and dashboards to Jupyter Notebooks?

The IBM Watson Developer Advocacy team has developed an open source tool called PixieDust, which makes creating data visualizations and dashboards quick and easy in Jupyter Notebooks, a task that previously required in-depth knowledge of libraries such as matplotlib or Seaborn. In this video, David Taieb gives a high-level overview of PixieDust and explains how you can use it to increase your productivity with easy-to-create visualizations right in your notebooks, letting you explore your data without needing a computer science degree to build charts and graphs!

Learn How to Build Intelligent Data Applications With Amazon Web Services (AWS)

This course shows you how to use a range of AWS services to create intelligent end-to-end applications that incorporate ingestion, storage, preprocessing, machine learning (ML), and connectivity to an application client or server. The course is designed for data scientists looking for clear instruction on how to deploy locally developed ML applications to the AWS platform, and for developers who want to add machine learning capabilities to their applications using AWS services. Prerequisites include basic awareness of Amazon Simple Storage Service (S3), Elastic Compute Cloud (EC2), and Amazon Elastic MapReduce; some knowledge of ML concepts such as classification and regression analysis, model types, training, and performance measures; and a general understanding of Python.
• Understand how to use Amazon Web Services’ best-in-class streaming analytics and ML tools
• Learn about Amazon data pipelines: A very lightweight way to deploy an ML algorithm
• Explore Redshift and RDS: Databases that stage input data or store model outputs
• Discover Kinesis: A streaming data ingestion service that performs streaming analytical functions
• Learn to apply streaming and batch analytical processing to prepare datasets for ML algorithms
• Gain experience building ML models using Amazon Machine Learning and calling them using Python

Introduction to the Microsoft Cortana Intelligence Suite for Advanced Analytics

The Cortana Intelligence Suite is an advanced analytics platform designed for BI developers, data scientists, feature engineers, and business analysts. This course provides a description of the technologies that make up the Cortana Intelligence Suite and an explanation of the Azure Team Data Science Process, a systematic approach to building intelligent applications that takes into account business understanding, data acquisition, modeling, and deployment. In addition, the course reviews four critical types of advanced analytics and demonstrates how each can be applied to business today.
• Understand the Cortana Intelligence Suite technologies, their purposes, and relations to each other
• Learn how to use the Azure Team Data Science Process to plan intelligent applications projects
• Discover four key types of advanced analytics to consider when planning data projects

Understanding Convolutional Neural Networks (CNNs)

Convolutional neural networks (CNNs) enable powerful deep-learning-based techniques for processing, generating, and making sense of visual information. These are revolutionary techniques in computer vision that impact technologies ranging from e-commerce to self-driving cars. This course offers an in-depth examination of CNNs, their fundamental processes, their applications, and their role in visualization and image enhancement. The course covers concepts, processes, and technologies such as CNN layers and architectures. It also explains CNN image classification and segmentation, deep dream and style transfer, super-resolution, and generative adversarial networks (GANs). Learners who come to this course with a basic knowledge of deep learning principles, some computer vision experience, and exposure to engineering math should gain the ability to implement CNNs and use them to create their own visualizations.
• Discover the connections between CNNs and the biological principles of vision
• Understand the advantages and trade-offs of various CNN architectures
• Survey the history and ongoing evolution of CNNs
• Learn to apply the latest GAN, style transfer, and semantic segmentation techniques
• Explore CNN applications, visualization, and image enhancement
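The CNN layers the course covers are built on one core operation: sliding a small learned kernel over the image and summing element-wise products at each position. A minimal NumPy sketch of that operation (technically cross-correlation, as deep learning frameworks typically implement it; the names here are our own illustration, not course material):

```python
import numpy as np

def conv2d(image, kernel):
    """Naive single-channel 2D convolution with 'valid' padding, stride 1."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh, ow = ih - kh + 1, iw - kw + 1  # output shrinks by kernel size - 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise multiply the kernel with the image patch and sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny edge-detecting kernel applied to an image that is dark on the
# left half and bright on the right half
image = np.zeros((5, 5))
image[:, 2:] = 1.0
kernel = np.array([[1.0, -1.0]])  # responds to horizontal intensity change
edges = conv2d(image, kernel)
print(edges.shape)  # (5, 4); nonzero only at the dark-to-bright boundary
```

In a real CNN layer, many such kernels are learned from data rather than hand-designed, and their outputs are stacked into feature maps that feed the next layer.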

How do GANs intuitively work?

GANs, or generative adversarial networks, are a class of neural network architectures composed of two separate deep neural networks competing against each other: the generator and the discriminator. Their goal is to generate data points that are strikingly similar to some of the data points in the training set. The GAN is a really powerful idea. Even Yann LeCun (one of the fathers of deep learning) has called it the coolest idea in machine learning in the last 20 years. Currently, people use GANs to generate all sorts of things: realistic images, 3D models, videos, and a lot more.
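The competition between the two networks can be shown in a deliberately tiny toy, not a practical GAN: a one-parameter-pair "generator" tries to imitate a target Gaussian while a logistic "discriminator" tries to tell real samples from fakes, with hand-derived gradients and alternating updates (the whole setup is our own illustrative assumption):

```python
import numpy as np

# Toy GAN: generator g(z) = mu + sigma*z imitates samples from N(3, 1);
# discriminator d(x) = sigmoid(w*x + b) scores "probability x is real".
rng = np.random.default_rng(42)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

mu, sigma = 0.0, 1.0   # generator parameters (start far from the target)
w, b = 0.0, 0.0        # discriminator parameters
lr, batch = 0.03, 64

for step in range(3000):
    real = rng.normal(3.0, 1.0, batch)   # samples from the training data
    z = rng.normal(0.0, 1.0, batch)      # noise fed to the generator
    fake = mu + sigma * z                # generated samples

    # Discriminator: gradient ascent on log d(real) + log(1 - d(fake))
    d_real, d_fake = sigmoid(w * real + b), sigmoid(w * fake + b)
    w += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    b += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator: gradient ascent on log d(fake) (non-saturating loss),
    # i.e. adjust mu and sigma so fakes fool the updated discriminator
    d_fake = sigmoid(w * fake + b)
    mu += lr * np.mean((1 - d_fake) * w)
    sigma += lr * np.mean((1 - d_fake) * w * z)

print(f"learned mu={mu:.2f}, sigma={sigma:.2f}")  # drifts toward 3 and 1
```

Real GANs replace both sides with deep networks and backpropagation, but the alternating "discriminate, then fool the discriminator" loop is exactly this one.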