Don’t miss out on these awesome GitHub Repositories & Reddit Threads for Data Science & Machine Learning (May 2018)

GitHub and Reddit both serve as interesting discovery platforms for me. I not only learn about some of the best applications of data science, but also see how they have been written, and I hope to contribute to some of these repositories in the near future. GitHub was recently acquired by Microsoft in a multi-billion dollar deal. GitHub has been the ultimate platform for collaboration between developers, and we have seen the data science and machine learning community embrace it with equal enthusiasm. We hope this continues under Microsoft’s umbrella as well. As for Reddit, it continues to be a wonderful source of knowledge and opinion for data scientists. People share links to their own code and other people’s code, post general data science news and research papers, and ask for help and opinions, among other things. It’s a truly powerful community that continues to provide a solid platform for interacting with fellow data science enthusiasts.

1. ML.NET
2. NLP Architect
3. Amazon Scraper
4. PIGO – Face Detection in Go
5. RL-Adventure-2: Policy Gradients

1. Real-time Multihand Pose Estimation Demo
2. Which Research paper would you choose to show that Machine Learning is Beautiful?
3. What do we currently know about Generalization? What should we be asking next about it?
4. State of Machine Learning in the Healthcare Industry
5. Potential Career Paths for Data Scientists after 3 Years


ML.NET is an open source and cross-platform machine learning framework for .NET

ML.NET is a cross-platform, open-source machine learning framework which makes machine learning accessible to .NET developers. ML.NET allows .NET developers to develop their own models and infuse custom ML into their applications without prior expertise in developing or tuning machine learning models, all in .NET. ML.NET was originally developed in Microsoft Research and evolved into a significant framework over the last decade. It is used across many product groups in Microsoft, such as Windows, Bing, PowerPoint, Excel and more. With this first preview release, ML.NET enables ML tasks like classification (e.g. text classification, sentiment analysis) and regression (e.g. price prediction). Along with these ML capabilities, this first release of ML.NET also brings the first draft of .NET APIs for training models and using models for predictions, as well as the core components of this framework, such as learning algorithms, transforms, and ML data structures.


Polynomial regression with Eigen library tutorial

Hello, this is my third article about how to use modern C++ for solving machine learning problems. This time I will show how to build a model for the polynomial regression problem described in the previous article, using a well-known linear algebra library called Eigen. Eigen was chosen because it is widely used and has a long history, it is highly optimized for CPUs, and it is a header-only library. One famous project that uses it is TensorFlow.
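
To give a rough idea of the least-squares fit behind polynomial regression, here is a minimal sketch in plain Python. It is not the article's Eigen code: the function names `fit_polynomial` and `predict` are hypothetical, and the sketch solves the normal equations with hand-written Gaussian elimination, whereas a linear algebra library such as Eigen would normally do that work.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (A^T A) c = A^T y.

    Returns coefficients ordered from the constant term upwards.
    Illustrative only: normal equations can be ill-conditioned for high degrees.
    """
    n = degree + 1
    # Entries of A^T A are power sums of x; entries of A^T y are sums of y * x^i.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Augmented matrix [A^T A | A^T y], reduced by Gaussian elimination
    # with partial pivoting.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: need more distinct x values")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]

    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - tail) / m[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

For example, calling `fit_polynomial([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], 2)` on points sampled from 1 + 2x + 3x² should recover coefficients close to 1, 2 and 3, which `predict` can then evaluate at new x values.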


5 major sensor data analytics challenges: deadly or curable?

Do you know the 5 major sensor analytics challenges that your company may face while trying to benefit from sensor data? So that your analytical efforts don’t go up in smoke, see what you need to beware of. A smoothly running sensor data analytics tool may be just as difficult to manage as a symphony orchestra, because every musician in an orchestra – and every part of an IoT system – needs to work properly and ‘harmonize’ with the others. But how do conductors make their orchestras work so nicely and sound so heavenly instead of creating a mismanaged cacophony? Obviously, there’s a lot of practice involved. But besides that, they definitely know what pitfalls they need to avoid. Which is why, if we’re talking about orchestrating sensor data, it’s important to know the 5 major sensor analytics challenges that you can face.


A Guide to Data Science at Scale

The secret to staying ahead in today’s globally connected world is driving a constant stream of innovation with big data. Take control of your data to develop innovative products and services that solve today’s most challenging use cases, resulting in happier customers and more market share.
Read our eBook to learn more:
• See how easy it is to build and scale machine learning models with a unified analytics platform.
• Find out how to collaborate across data teams to uncover insights faster.
• Learn how companies like Shell and Hotels.com use big data and AI to drive innovation.


The What, Where and How of Data for Data Science

Data science is a term that escapes any single complete definition, which makes it difficult to use, especially if the goal is to use it correctly. Most articles and publications use the term freely, with the assumption that it is universally understood. However, data science – its methods, goals, and applications – evolves with time and technology. Twenty-five years ago, data science referred to gathering and cleaning datasets and then applying statistical methods to that data. In 2018, data science has grown into a field that encompasses data analysis, predictive analytics, data mining, business intelligence, machine learning, and much more. In fact, because no one definition fits the bill seamlessly, it is up to those who do data science to define it.


The Popularity of Point-and-Click GUIs for R

Point-and-click graphical user interfaces (GUIs) for R allow people to analyze data using the R software without having to learn how to program in the R language. This is a brief look at how popular each one is. Knowing that a GUI is popular doesn’t mean it will meet your needs, but it does mean that it’s meeting the needs of many others. This may be helpful information when selecting the appropriate GUI for you, if programming is not your primary interest. For detailed information regarding what each GUI can do for you, and how it works, see my series of comparative reviews, which is currently in progress. There are many ways to estimate the popularity of data science software, but one of the most accurate is counting the number of downloads (see appendix for details). Figure 1 shows the monthly downloads of four of the six R GUIs that I’m reviewing (i.e. all that exist as far as I know). We can see that the R Commander (Rcmdr) is the most popular GUI, and it has had steady growth since its introduction. Next comes Rattle, which is more oriented towards machine learning tasks. It, too, has shown high popularity and steady growth.