R 3.4.4 released

R 3.4.4 has been released, and binaries for Windows, Mac and Linux are now available for download on CRAN. This update (codenamed ‘Someone to Lean On’, likely a Peanuts reference, though I couldn’t find which one with a quick search) is a minor bugfix release, and shouldn’t cause any compatibility issues with scripts or packages written for prior versions of R in the 3.4.x series. It improves automatic timezone detection on some systems, and adds fixes for some unusual corner cases in the statistics library. For a complete list of the changes, check the NEWS file for R 3.4.4 or follow the link below.


Introduction to Numpy – Part II

This is the final part of the introduction to NumPy. In this second part, we look at a few functions for creating specific arrays, and then at computations between two arrays. The first part of the introduction can be found here.
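
As a rough illustration of the two topics named above, here is a minimal sketch. The excerpt does not say which functions the article covers, so the ones below are an assumption: they are simply common NumPy array-creation routines, followed by element-wise and matrix arithmetic on two small arrays.

```python
import numpy as np

# Functions that build a specific array without listing every element.
zeros = np.zeros((2, 3))          # 2x3 array filled with 0.0
ones = np.ones((2, 3))            # 2x3 array filled with 1.0
identity = np.eye(3)              # 3x3 identity matrix
steps = np.arange(0, 10, 2)       # values 0, 2, 4, 6, 8
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values from 0 to 1 inclusive

# Arithmetic between two arrays of the same shape is element-wise.
a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])
total = a + b      # [[11, 22], [33, 44]]
product = a * b    # element-wise product: [[10, 40], [90, 160]]
matmul = a @ b     # matrix product: [[70, 100], [150, 220]]
```

Note that `*` multiplies matching elements, while `@` performs a true matrix product; mixing the two up is a common source of confusion.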


How we grew from 0 to 4 million women on our fashion app, with a vertical machine learning approach

Three years ago we launched Chicisimo with the goal of offering automated outfit advice. Today, with over 4 million women on the app, we want to share how our data and machine learning approach helped us grow. It’s been chaotic, but it is now under control.


Using Evolutionary AutoML to Discover Neural Network Architectures

The brain has evolved over a long time, from very simple worm brains 500 million years ago to a diversity of modern structures today. The human brain can accomplish a wide variety of activities, many of them effortlessly: telling whether a visual scene contains animals or buildings feels trivial to us, for example. To perform activities like these, artificial neural networks require careful design by experts over years of difficult research, and typically address one specific task, such as finding what’s in a photograph, calling a genetic variant, or helping diagnose a disease. Ideally, one would want an automated method to generate the right architecture for any given task. One approach to generating these architectures is through the use of evolutionary algorithms. Traditional research into neuro-evolution of topologies (e.g. Stanley and Miikkulainen 2002) has laid the foundations that allow us to apply these algorithms at scale today, and many groups are working on the subject, including OpenAI, Uber Labs, Sentient Labs and DeepMind. Of course, the Google Brain team has been thinking about AutoML too. In addition to learning-based approaches (e.g. reinforcement learning), we wondered if we could use our computational resources to programmatically evolve image classifiers at unprecedented scale. Can we achieve solutions with minimal expert participation? How good can today’s artificially-evolved neural networks be? We address these questions through two papers.
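
To give a feel for what "evolving" an architecture means, here is a toy sketch. It is not the method from the two papers: an architecture is reduced to a list of layer widths, and the `fitness` function is a made-up stand-in for validation accuracy (a real system would train each candidate network). The names `fitness`, `mutate` and `evolve` are hypothetical.

```python
import random

def fitness(architecture):
    # Hypothetical stand-in for validation accuracy: rewards layer widths
    # close to 64 and penalizes depth. A real system would train each network.
    return -sum(abs(width - 64) for width in architecture) - 5 * len(architecture)

def mutate(architecture, rng):
    # Copy the parent and apply one random change: resize, add, or remove a layer.
    child = list(architecture)
    choice = rng.choice(["resize", "add", "remove"])
    if choice == "resize":
        i = rng.randrange(len(child))
        child[i] = max(1, child[i] + rng.choice([-16, -8, 8, 16]))
    elif choice == "add":
        child.append(rng.choice([16, 32, 64, 128]))
    elif len(child) > 1:
        child.pop(rng.randrange(len(child)))
    return child

def evolve(generations=200, population_size=20, sample_size=5, seed=0):
    rng = random.Random(seed)
    # Start from very simple single-layer architectures.
    population = [[rng.choice([16, 32, 128])] for _ in range(population_size)]
    for _ in range(generations):
        # Tournament selection: the best of a random sample becomes the parent.
        sample = rng.sample(population, sample_size)
        parent = max(sample, key=fitness)
        child = mutate(parent, rng)
        # The worst individual in the sample is replaced by the child.
        worst = min(sample, key=fitness)
        population.remove(worst)
        population.append(child)
    return max(population, key=fitness)

best = evolve()
```

The loop only ever needs to compare candidates, never to know why one is better, which is what makes this family of methods attractive when expert design time is the bottleneck.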


Creating a simple text classifier using Google CoLaboratory

Google CoLaboratory is Google’s latest contribution to AI, wherein users can code in Python using a Chrome browser in a Jupyter-like environment. In this article I share a method, and code, to create a simple binary text classifier using scikit-learn within the Google CoLaboratory environment.
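
For readers who want a sense of scale, a binary text classifier in scikit-learn can be very short. The sketch below is not the article's code: the training sentences are made up, and the choice of TF-IDF features with logistic regression is an assumption about what a "simple" classifier might look like.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set: 1 = positive review, 0 = negative review.
texts = [
    "great product, works perfectly",
    "absolutely love it, excellent quality",
    "excellent service and great value",
    "terrible quality, broke after a day",
    "awful experience, do not buy",
    "terrible service and awful support",
]
labels = [1, 1, 1, 0, 0, 0]

# The pipeline turns raw text into TF-IDF vectors, then fits a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

predictions = model.predict(["excellent and great", "awful and terrible"])
```

The same cell should run unchanged in any Jupyter-like notebook where scikit-learn is installed.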


Visualizing MonteCarlo Simulation Results: Mean vs Median

Simulation studies are used in a wide range of areas, from risk management to epidemiology and, of course, statistics. The MonteCarlo package provides tools to automate the design of these kinds of simulation studies in R. The user only has to specify the random experiment he or she wants to conduct and the number of replications. The rest is handled by the package. So far, the main tool for analyzing the results was to look at LaTeX tables generated using the MakeTable() function. Now, the new package version 1.0.5 contains the function MakeFrame(), which represents the simulation results in the form of a dataframe. This makes it very easy to visualize the results using standard tools such as dplyr and ggplot2. Here, I will demonstrate some of these concepts with a simple example that could be part of an introductory statistics course: the comparison of the mean and the median as estimators for the expected value. For an introduction to the MonteCarlo package, click here or see the package vignette.
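
The mean-versus-median comparison can be sketched without the package at all. The version below uses only the Python standard library rather than the R MonteCarlo package, so it does not reproduce MakeFrame() or any of the package's interface; the function name `simulate`, the standard normal data, and the sample sizes are all assumptions made for illustration.

```python
import random
import statistics

def simulate(n_obs, n_reps, seed=0):
    """Return (MSE of the sample mean, MSE of the sample median) over n_reps replications."""
    rng = random.Random(seed)
    true_value = 0.0  # expected value of the standard normal data drawn below
    mean_errors, median_errors = [], []
    for _ in range(n_reps):
        # One replication of the random experiment: draw a sample, apply both estimators.
        sample = [rng.gauss(true_value, 1.0) for _ in range(n_obs)]
        mean_errors.append((statistics.fmean(sample) - true_value) ** 2)
        median_errors.append((statistics.median(sample) - true_value) ** 2)
    return statistics.fmean(mean_errors), statistics.fmean(median_errors)

mse_mean, mse_median = simulate(n_obs=50, n_reps=2000)
```

For normally distributed data the mean should come out with the smaller error, since the sample median has a larger variance than the sample mean in that case.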


Quick Feature Engineering with Dates Using fast.ai

The fast.ai library is a collection of supplementary wrappers for a host of popular machine learning libraries, designed to remove the necessity of writing your own functions to take care of some repetitive tasks in a machine learning workflow.
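
The kind of repetitive task meant here can be seen in a small sketch of date feature engineering. It uses only the Python standard library and is not the fast.ai API; the helper name `date_features` and the particular set of features are hypothetical.

```python
from datetime import date

def date_features(d):
    """Expand one date into a dict of simple numeric and boolean features."""
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),            # Monday = 0 ... Sunday = 6
        "day_of_year": d.timetuple().tm_yday,
        "is_month_start": d.day == 1,
        "is_weekend": d.weekday() >= 5,
    }

features = date_features(date(2018, 3, 17))
```

Writing this by hand for every date column in every project is exactly the sort of boilerplate a wrapper library aims to remove.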


PyCon.DE 2018 & PyData Karlsruhe; October 24 – 27

PyCon.DE is where Pythonistas in Germany can meet to learn about new and upcoming Python libraries, tools, software and data science. We welcome Python enthusiasts, programmers and data scientists from around the world to join us in Karlsruhe this year.
We expect 400 participants for PyCon.DE 2018 Karlsruhe. The conference will last 3 days and include about 60 talks, tutorials and hands-on sessions. Python is a programming language which has found application and friends in many areas. Due to its popularity in science, Python has experienced a meteoric rise in the data science community over the past few years. At the conference, we expect a broad and interesting mix of Pythonistas, including roles such as:
• Software Developer
• Data Scientist
• System Administrator
• Academic Scientist
• Technology Enthusiast