|Quotes| = 197
Jake Porway Data is new eyes.
Daniel Tunkelang Failure is a great teacher.
Isabelle Nuage
(February 24, 2015)
Big Data by itself is of little use.
Eric Jonas Life is too short to not be having fun.
Kira Radinsky Working with data is like an adventure.
William Edwards Deming
In God we trust, all others bring data.
David Cearley
Every app now needs to be an analytic app.
Caitlin Smallwood It’s very obvious how different people are.
Patrick Gerald
(May 22, 2015)
No good business succeeds without analytics.
Timothy E. Carone
(January 30, 2015)
Big Data is the oxygen for autonomous systems.
Amy Heineike Data science is already kind of a broad church.
Ryan Irwin
(August 20, 2014)
Have a sense of humor, and never stop learning!
John Keats Nothing ever becomes real till it is experienced.
Daniel Tunkelang Anything that looks interesting is probably wrong.
John Foreman I find it tough to find and hire the right people.
Kira Radinsky The person you hire has to understand the business.
Miguel de Cervantes from Don Quixote By a small sample, we may judge of the whole piece.
Victor Hugo Nothing is stronger than an idea whose time has come.
Larry Hardesty
(August 15, 2014)
In the age of big data, visualization tools are vital.
Joel Cadwell
(August 21, 2014)
R makes it so easy to fit many models to the same data.
Attributed to Einstein Models should be as simple as possible, but not more so.
Ed Burns
(September 2015)
Big data analytics architecture requires integration push.
Niels Bohr Prediction is very difficult, especially about the future.
Christopher Bishop
Half of what we do at Microsoft Research is Machine Learning.
Daniel Tunkelang Search is the problem at the heart of the information economy.
Victor Hu It is hard to know what you really need until you dig into it.
Hans Rosling The idea is to go from numbers to information to understanding.
Isaiah, XXX 8 Now go, write it before them in a table, and note it in a book.
Rishi Shah
(September 24, 2014)
Big data profitability depends on your employee’s data literacy.
Gandhi If one takes care of the means, the end will take care of itself.
Chris Wiggins The main driver of my ideas has been seeing people doing it ‘wrong’.
All analysis starts with an understandable set of data and algorithms.
Shayne Miel You have to turn your inputs into things the algorithm can understand.
Matthew Zeiler
Google is not really a search company. It’s a machine-learning company.
Andre Karpistsenko There is a big part of intuition in choosing the most important problem.
Josh Bloom
The first rule of data science is: don’t ask how to define data science.
Jake Porway Data scientists in the business world are all generally well-compensated.
John Foreman Talking to users is crucial because they point you in the right direction.
Caitlin Smallwood You imagine a data set & you salivate at just thinking about that data set.
Kamil Bartocha
(26. Apr 2015)
There is no fully automated Data Science. You need to get your hands dirty.
Jeff Dean
(November 2014)
Anything humans can do in 0.1 sec, the right big 10-layer network can do too.
Jeffrey Fry Having more data does not always give you the power to make better decisions.
Deepak Mohapatra
Anytime you can correlate a person, location and time, you can identify schemes.
Kaiser Fung One of the biggest myths of Big Data is that data alone produce complete answers.
Lana Klein
Analytics today is at the point of high awareness and very little understanding.
Andre Karpistsenko The idea or the initial enthusiasm is just a small part of doing something great.
Bob McDonald Data modeling, simulation, and other digital tools are reshaping how we innovate.
n.n. Multi-Criteria Decision Making is the aim to order multidimensional alternatives.
ATKearney Is Big Data the 21st century equivalent of the Industrial Revolution? We think so.
R is in the process of becoming the multi-platform lingua franca of data analysis.
Foster Provost & Tom Fawcett
Increasingly, business decisions are being made automatically by computer systems.
Manoj Sharma
(December 30, 2014)
One of the most important steps in the Data Analytics process is Feature Selection.
Richard Pugh
(25 June 2015)
In my opinion, the single most important skill for a data scientist is … Empathy.
Andre Karpistsenko The core lesson from tool-and-method explorations is that there is NO silver bullet.
Jake Porway The world will be more effective if everyone can at least converse about data science.
Jonathan Lenaghan Losing somebody else’s money is one of the most horrible sinking feelings in the world.
John Tukey Numerical quantities focus on expected values, graphical summaries on unexpected values.
Jeffrey Heer It’s an absolute myth that you can send an algorithm over raw data and have insights pop up.
John W. Tukey
The greatest value of a picture is when it forces us to notice what we never expected to see.
Tamara Dull
(March 20, 2015)
The data lake is essential for any organization who wants to take full advantage of its data.
Henri Poincaré
With logic one can construct proofs, but one cannot gain new insights; that takes intuition.
John Cook
(26 March 2015)
Statistics aims to build accurate models … Machine learning aims to solve problems more directly.
Pierre Simon, Marquis de Laplace The most important questions of life are, for the most part, really only problems of probability.
Yann LeCun It’s useful for a company to have its scientists actually publish what they do. It keeps them honest.
BI Community What is the most used feature in any business intelligence solution? It is the Export to Excel button.
David Hilbert Mathematics knows no races or geographic boundaries; for mathematics, the cultural world is one country.
Gabriel Lowy
(February 24, 2015)
Big data does not change the relationship between data quality and decision outcomes. It underscores it.
John Foreman What we focus on, and this is going to sound goofy for a data scientist – is the happiness of our users.
Milton Friedman The only relevant test of the validity of a hypothesis is comparison of its predictions with experience.
Krzysztof Zawadzki
(August 30, 2014)
Finding a data scientist is hard. Finding people who understand who a data scientist is, is equally hard.
Eric Jonas The biggest thing people should be working on is problems they find interesting, exciting, and meaningful.
Linus Torvalds Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
Eran Levy
Mashing up multiple data sources to generate a single source of truth is an integral part of data analysis.
TJ Laher
(November 14, 2014)
Leading organizations have already begun to see serious returns on deploying a pervasive analytics strategy.
Michael Greene
To find new trends and strong patterns from large complex data sets, a strong analytics foundation is needed.
Andrew Gelman
(28 April 2015)
Measurement, measurement, measurement. It’s central to statistics. It’s central to how we learn about the world.
Ivan Vasilev The hidden layer is where the (neural) network stores its internal abstract representation of the training data.
P. Dawid
Causal inference is one of the most important, most subtle, and most neglected of all the problems of Statistics.
Xavier Conort The algorithms we used are very standard for Kagglers. […] We spent most of our efforts in feature engineering.
Yann LeCun Most of the knowledge in the world in the future is going to be extracted by machines and will reside in machines.
Thomas Carlyle A judicious man looks on statistics not to get knowledge, but to save himself from having ignorance foisted on him.
Arthur Samuel
[Machine learning is the] field of study that gives computers the ability to learn without being explicitly programmed.
n.n. Data does replace heuristics, hard-coded rules, assumptions and beliefs. Machine learning only enables data to do that.
Ed Burns
(August 2014)
One of the keys to success in big data analytics projects is building strong ties between data analysts and business units.
Yann LeCun The data sets are truly gigantic. There are some areas where there’s more data than we can currently process intelligently.
(June 2014)
Traditional BI looks at data through a soda straw. Big data analytics looks at data through powerful, wide-angle binoculars.
Amy Heineike The key is figuring out how you get those three things: the right problem, the right data, and the right methodology to meld.
Antoine de Saint-Exupéry A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.
Michael Walker
(October 14, 2014)
Beware of tech firms selling you data tech with fantastic claims of finding meaning in data and creating competitive advantage.
Paul Roehrig, Ben Pring
It’s a new era in business, one in which growth will be driven as much by insight and foresight as by physical products and assets.
Suetonia Palmer
‘Innovative statistical techniques’ are important, but the key to getting good results here is a mind-boggling amount of actual work.
Jake Porway
(October 1, 2015)
Data is not truth, and tech is not an answer in-and-of-itself. Without designing for the humans on the other end, our work is in vain.
Claudia Perlich The conversation is based around how to properly deal with even more sensitive information about where exactly people spend their lives.
Pradyumna S. Upadrashta
(February 13, 2015)
Before jumping on the Big Data bandwagon, I think it is important to ask the question of whether the problem you have requires much data.
David Puglia, FrontRange
(30. December 2014)
In comparison to IPv4’s 4.3 billion IP addresses, IPv6 can assign about 340 trillion trillion trillion addresses and corresponding devices.
Yann LeCun You don’t want to just hire clones of the same person, because then they will all want to explore the same things. You want some diversity.
Josh Wills A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.
Andrew Ng Coming up with features is difficult, time-consuming, requires expert knowledge. “Applied machine learning” is basically feature engineering.
John Foreman
If your goal is to positively impact the business, not to build a clustering algorithm that leverages storm and the Twitter API, you’ll be OK.
Erin Shellman As a data scientist, even if you don’t have the domain expertise you can learn it, and can work on any problem that can be quantitatively described.
Michele Nemschoff
(August 30, 2014)
Big data isn’t just for developers and analysts in the technical arena. In today’s digital age, big data has become a powerful tool across industries.
European Union’s General Data Protection Regulation (GDPR)
(Dec. 2016)
Organizations that use ML to make user-impacting decisions must be able to fully explain the data and algorithms that resulted in a particular decision.
H. James Harrington
If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.
Michael Young
For many organisations, the accessibility of the tools and products to deliver analytics and data mining has led to an increased awareness of the benefits.
Eric Jonas Graduate students, perhaps because of an adherence to sunk cost fallacy, often write really great surveys of the field at the beginning of their PhD thesis.
Yann LeCun Knowledge is some compilation of data that allows you to make decisions, and what we find today is that computers are making a lot of decisions automatically.
Foster Provost, Tom Fawcett
However, there is confusion about what exactly data science is, and this confusion could lead to disillusionment as the concept diffuses into meaningless buzz.
John Foreman It’s essential for a data science team to hire people who can really speak about the technical things they’ve done in a way that nontechnical people can understand.
David Lewis-Williams Scientists do not collect data randomly and utterly comprehensively. The data they collect are only those that they consider ‘relevant’ to some hypothesis or theory.
Ronald Fisher To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem – he may be able to say what the experiment died of.
Kaiser Fung Before getting into the methodological issues, one needs to ask the most basic question. Did the researchers check the quality of the data or just take the data as is?
W. H. Auden Thou shalt not answer questionnaires / Or quizzes upon world affairs, / Nor with compliance / Take any test. Thou shalt not sit / With statisticians nor commit / A social science.
Justin Washtell
(November 3, 2014)
The central premise of predictive modeling is precisely that one size does not fit all – otherwise we would just assign the same outcome to all cases and be done with it.
William S. Cleveland
Data analysis needs to be part of the blood stream of each department and all should be aware of the workings of subject matter investigations and derive stimulus from them.
ATKearney Although Big Data processes large, diverse data sets to reveal complex relationships, humans are the crucial ingredient for interpreting the data and relationships into insights.
Valdis Krebs Innovation happens at the intersection of two or more different, yet similar, groups. Where one technology meets another, one discipline meets another, one department meets another.
Martyn Jones
(March 12, 2015)
Is Big Data really about high volumes, high velocity and high variety, or is it in fact about much noise, too much pomposity and abundant similarity leading to unnecessary high anxiety?
Tavish Srivastava
(May 19, 2015)
Machine Learning algorithms are like solving a Rubik Cube. You grapple at the beginning to figure out the hidden algorithm, but once learnt, some can even solve it in less than 7 seconds.
Yann LeCun The idea that somehow you can put a bunch of research scientists together and then put some random manager who’s not a scientist directing them doesn’t work. I’ve never ever seen it work.
Sundar Pichai Machine learning is a core, transformative way by which we’re rethinking everything we’re doing. We’re thoughtfully applying it across all our products, be it search, ads, YouTube or Play.
Julie Hunt
(April 7, 2015)
The need to analyze data is at the foundation of every effective data management strategy, whether the analysis is handled from the business perspective or the technology side of the equation.
Nikhil Buduma
(29 December 2014)
In general, choosing smart training cases is a very good idea. There’s lots of research that shows that by engineering a clever training set, you can make your neural net a lot more effective.
David McCandless By visualizing information, we turn it into a landscape that you can explore with your eyes, a sort of information map. And when you’re lost in information, an information map is kind of useful.
Kaiser Fung
(May 2015)
Story time is the moment in a report on data analysis when the author deftly moves from reporting a finding of data to the telling of stories based on assumptions that do not come from the data.
Robert Neuhaus Feature engineering and feature selection are not mutually exclusive. They are both useful. I’d say feature engineering is more important though, especially because you can’t really automate it.
William S. Cleveland
Model building is complex because it requires combining information from exploring the data and information from sources external to the data such as subject matter theory and other sets of data.
SBS documentary “The Age of Big Data” Data is becoming a powerful and most valuable commodity in 21st century. It is leading to scientific insights and new ways of understanding human behaviour. Data can also make you rich. Very rich.
Lord Kelvin When you can measure what you are speaking about and express it in numbers, you know something about it. When you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind.
European Union’s General Data Protection Regulation (GDPR)
(Dec. 2016)
How could a result be explained, especially a result of a machine learning model, without a versioned record of what data was input to generate the result and what data was output representing the result?
William S. Cleveland
Theory, both mathematical and non-mathematical theory, is vital to data science. … Tools of data science – models and methods together with computational methods and computing systems – link data and theory.
Jeroen Janssens Data scientists love to create interesting models and exciting data visualizations. However, before they get to that point, usually much effort goes into obtaining, scrubbing, and exploring the required data.
Michal Klos
(January 28, 2015)
We are in the Golden Age of Data. For those of us on the front-lines, it doesn’t feel that way. Every step forward this technology takes, the need for deeper analytics takes two. We’re constantly catching up.
Kaiser Fung We are not saying that statisticians should not tell stories. Story-telling is one of our responsibilities. What we want to see is a clear delineation of what is data-driven and what is theory (i.e., assumptions).
Kune, Konugurthi, Agarwal, Chillarige, Buyya
Big Data and traditional data warehousing systems, however, have the similar goals to deliver business value through the analysis of data, but, they differ in the analytics methods and the organization of the data.
Suman Malekani
(January 29, 2015)
While working on Big Data & planning to implement it for the benefit of business, it is very important to explain the insights & valuable knowledge in a way that non-technical business user can actually understand.
Dr. Olly Downs
(May 18, 2015)
Most of the big data investment focus to date has been on the underlying infrastructure, while development of the applications that make use of that infrastructure – and that deliver actual business value – has lagged.
Data integration features have gained prominence during the last year as companies struggled to incorporate new data sources in their analysis, a process that can consume a sizable percentage of the total project time.
Jeff Leek
To evaluate a person’s work or their productivity requires three things:
1. To be an expert in what they do
2. To have absolutely no reason to care whether they succeed or not
3. To have time available to evaluate them.
R. A. Fisher … the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only to give the facts a chance of disproving the null hypothesis.
Julia Evans Cleaning up data to the point where you can work with it is a huge amount of work. If you’re trying to reconcile a lot of sources of data that you don’t control like in this flight search example, it can take 80% of your time.
Eric Jonas The right thing to do is to not build a tool company but to build a consultancy based on the tools. Identify the company, identify the market, and build a consultancy. Later, if that works, you can then pivot to being a tool company.
Stephan Duquesnoy
Comment of a DeepLearning user: As a side-note, even though I’m good with pattern-based thinking, I do not have an academic background. I lack patience and feel the need to create, rather than to completely understand what I’m doing.
Eric Colson, Brad Klingenberg, Jeff Magnusson
(March 31, 2015)
Data science can directly enable a strategic differentiator if the company’s core competency depends on its data and analytic capabilities. When this happens, the company becomes supportive to data science instead of the other way around.
Eric Jonas I actually think a lot of the future is in small data …. As the big data hype cycle crests, we’re going to see more and more people recognizing that what they really want to be doing is asking interesting questions of smaller data sets.
Analise Polsky
Improving Visual Data Discovery:
1. Always have new data sources.
2. Always have new techniques.
3. Always have new tools and platforms.
Visual data discovery is not once and done. It is an iterative process that requires communication and exploration.
(March 4th, 2015)
Data Science has its own language. So, if you want to have at least a slight chance of surviving in the enterprise world of tomorrow – with its obsessive focus on collecting and analyzing data – you better have started yesterday with learning this terminology.
Bill Franks
(December 10, 2015)
One of the legendary events in the history of analytics was the original Netflix prize. The event led to a terrific example of the need to focus on not only theoretical results, but also pragmatically achievable results, when developing analytic processes.
Brandon Rohrer
(Dec 19, 2015)
Before data science can build the solution to simplify your life or make you lots of money, you have to give it some high quality raw materials to work with. Just like making a pizza, the better the ingredients you start with, the better the final product.
Richard Fichera Part of Hadoop’s appeal is that it is not specifically optimized for any specific solution or data type but rather a general framework for parallel processing, so your developers and data scientists can add any relevant data, whatever its format or source.
Richard A. Becker, William S. Cleveland
Making graphs is very basic to data analysis. Whether you use the leading edge of statistical methods, or whether you want to quickly see the main features of your data, graphs are a must. They are the single most powerful class of tools for analyzing data.
Vladimir N. Vapnik
After the success of the SVM in solving real-life problems, the interest in statistical learning theory significantly increased. For the first time, abstract mathematical results in statistical learning theory have a direct impact on algorithmic tools of data analysis.
Zachary Chase Lipton
(January 2015)
Generally, the systems implementation of machine learning methodology and ongoing software maintenance challenges are an understudied area that will continue to grow in importance as machine learning systems become more commonplace in commercial and open source software.
Ferris Jumah
(Sep 3, 2014)
We see that machine learning, data mining, data analysis and statistics are all highly ranking skills in the (Data Science Skill) network. This indicates that being able to understand and represent data mathematically, with statistical intuition, is a key skill for data scientists.
Kune, Konugurthi, Agarwal, Chillarige, Buyya
Big Data technologies are being adopted widely for information exploitation with the help of new analytics tools and large scale computing infrastructure to process huge variety of multi-dimensional data in several areas ranging from business intelligence to scientific explorations.
H. Simon The aim … is to provide a clear and rigorous basis for determining when a causal ordering can be said to hold between two variables or groups of variables in a model … The concepts refer to a model – a system of equations – and not to the ‘real’ world the model purports to describe.
Dean Abbott
(December 06, 2015)
This kind of mindset is not learned in a university program; it is part of the personality of the individual. Good predictive modelers need to have a forensic mindset and intellectual curiosity, whether or not they understand the mathematics enough to derive the equations for linear regression.
Rao Naveen
There’s been a lot of talk about trying to make AI work on existing infrastructure. But the sad reality is that you’re always going to end up with something that’s far less than state-of-the-art. And I don’t mean it will be 30 or 40 percent slower. It’s more likely to be a thousand times slower.
Mark Barrenechea
(September 11, 2015)
Digital leaders know their data. They convert their information into actionable business insight. Considering that more data is shared online every second today than was stored in the entire Internet 20 years ago, it’s no wonder that differentiating products and services requires advanced tools.
Jonas Salk Reason alone will not serve. Intuition alone can be improved by reason, but reason alone without intuition can easily lead the wrong way … both are necessary. For myself, that’s how my mind works, and that’s how I work … It’s this combination that must be recognized and acknowledged and valued.
Jeffrey Heer, Michael Bostock, Vadim Ogievetsky
Graphical Perception Experiments find that spatial position (as in a scatter plot or bar chart) leads to the most accurate decoding of numerical data and is generally preferable to visual variables such as angle, one-dimensional length, two-dimensional area, three-dimensional volume, and color saturation.
Lana Klein
Remember that the most critical thing is not building analytic solution but making sure that your organization starts using it: that means creating buy-in, working to build adoption, educating and training, redesigning processes to include analytics. Give it time, be persistent, improve and results will follow!
Enric Junqué de Fortuny, David Martens, Foster Provost
This study provides a clear illustration that larger data indeed can be more valuable assets for predictive analytics. This implies that institutions with larger data assets – plus the skill to take advantage of them – potentially can obtain substantial competitive advantage over institutions without such access or skill.
Nikhil Buduma
(29 December 2014)
[In Neural Networks] It is not required that a neuron has its outlet connected to the inputs of every neuron in the next layer. In fact, selecting which neurons to connect to which other neurons in the next layer is an art that comes from experience. Allowing maximal connectivity will more often than not result in overfitting.
Christophe Bourguignat
(Sep 16, 2014)
In real organizations, people need dead simple story-telling – Which features are you using? How your algorithms work? What is your strategy? etc. … If your models are not parsimonious enough, you risk to lose the audience confidence. Convincing stakeholders is a key driver for success, and people trust what they understand.
Mark van Rijmenam
(October 16, 2014)
Although such Business Intelligence is still quite common and does give you at least some insights, the fast-changing world of today requires a different approach. Organisations today should strive for a holistic overview of their internal and external data that is analysed on the spot and returned graphically via live storylines.
John Von Neumann The sciences do not try to explain, they hardly even try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of certain verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work.
The Economist
The end of data scientists. Data science moves from the specialist to the everyman. Familiarity with data analysis becomes part of the skill set of ordinary business users, not experts with “analyst” in their titles. Organizations that use data to make decisions are more successful, and those that don’t use data begin to fall behind.
Foster Provost & Tom Fawcett
On a scale less grand, but probably more common, data-analytics projects reach into all business units. Employees throughout these units must interact with the data-science team. If these employees do not have a fundamental grounding in the principles of data-analytic thinking, they will not really understand what is happening in the business.
Gil Allouche
(January 9, 2015)
Improvements in technology and big data trends have given rise to improvements in machine learning. The sheer volume of data is growing exponentially, and companies are looking for faster speeds and real-time analytics. Cognitive computing combines machine learning and artificial intelligence to go beyond data mining and provide actionable insights.
Mark van Rijmenam
(September 2, 2014)
All these new Big Data applications require a new way of working. As a result General Motors is currently undergoing a massive, cultural, change to become data-driven; hiring thousands of new employees will have a profound effect on the company culture, but in the end all existing and new employees must learn and adapt to this new, data-driven and information-centric, culture.
Hal Varian If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots of courses about how to manipulate and analyze data.
Joyce Jackson In many applications, particularly in the business domain, the data is not stationary, but rather changing and evolving. This changing data may make previously discovered patterns invalid and as a result, there is clearly a need for incremental methods that are able to update changing models, and for strategies to identify and manage patterns of temporal change in knowledge bases.
Wojciech Bolanowski
Numerous changes and innovations have come to life recently. The pace of digital revolution is unimaginable concerning it keeps on increasing. There is no doubt most of approaching digital changes are potentially disruptive to older habits, businesses, beliefs. Unconditionally they are changing former way of life on the globe. They push whole humanity into something very new and completely unknown.
Jeff Leek
(Feb. 14, 2014)
Since most people performing data analysis are not statisticians there is a lot of room for error in the application of statistical methods. This error is magnified enormously when naive analysts are given too many “researcher degrees of freedom”. If a naive analyst can pick any of a range of methods and does not understand how they work, they will generally pick the one that gives them maximum benefit.
Marissa Mayer
(January 24, 2013)
The Web is so vast … you need to extend categorization and make sense of the content and have a Web ordered for you … One of the key pieces is you have to understand and decide what the Ontology of entities is. Meaning how things are named and how are they organized into hierarchies … By mapping people’s search habits you pull all their content together and have a feed of information that is the web ordered for you.
Some decisions you need to make are big enough to change the course for your business. And your past experiences may not be good predictors of the future. More data are within your reach to understand what was previously unknown. Sophisticated analytical tools are available to you to ‘see’ a wider range of possibilities and evaluate them quickly. Now is a good time for an upgrade in your decision making capabilities.
Judy Selby
(April 20, 2015)
Big Data’s undeniable impact on companies’ goodwill and reputation has permeated the landscape of corporate valuation. Recent research confirms that companies need to face the new normal whereby corporate reputations suffer after mishaps with data under their control. Today’s companies must appreciate that their use, misuse and governance of Big Data can have an impactful effect on their goodwill and resulting valuation.
Gordon S. Linoff
(September 15, 2014)
In any case, I come to the conclusion that Data Science is just another term in a long-line of terms. Whether called statistics or customer analytics or data mining or analytics or data science, the goal is the same. Computers have been and are gathering incredible amounts of data about people, businesses, markets, economies, needs, desires, and solutions – there will always be people who take up the challenge of transforming the data into solutions.
Avi Kalderon
(JAN 27, 2015)
Without effective data governance and data management, big data can mean big problems for many organizations already struggling with more data than they can handle. That “lake” they are building can very easily become a “cesspool” without appropriate data management practices that are adapted to this new platform. The solution? Firms need to actively adapt their data governance and data management capabilities – from implementing to ongoing maintenance.
Jeffrey P. Bigham
A machine isn’t a human. It’s not going to necessarily incorporate bias even from biased training data in the same way that a human would. Machine learning isn’t necessarily going to adopt, for lack of a better word, a clearly racist bias. It’s likely to have some kind of much more nuanced bias that is far more difficult to predict. It may, say, come up with very specific instances of people it doesn’t want to hire that may not even be related to human bias.
Mkhuseli Mthukwane
(August 27, 2015)
Data Science forms the very substratum of an Analytics Practitioners’ work, it’s what sets us apart from Statisticians or Mathematicians. However in some instances we cannot rely on it alone, we need to employ other measures to increase its definitiveness. In any event I am sure many Data Scientists use math and other means to augment the potency of their Analytics, some not even scientific at all. It is undeniably prudent to do so where necessary, especially in fields that demand a higher standard of accuracy and care.
Mark van Rijmenam
(October 16, 2014)
In the fast moving world of today, data is being created at lightning speed. Data comes from an infinite variety of sources and all this data can be used to discover valuable business insights. Combining internal and external data can enable organisations to beat the competition, as the analysis will provide valuable insights. The more business users that work with such insights, the better your organisation will become. Organisations should therefore strive for a data-driven, information-centric culture, where every business user makes decisions based on data.
Durgesh Kaushik
(October 9, 2015)
Analytics, no matter how advanced, does not remove the need for human insights. On the contrary, there is a compelling need for skilled people with the ability to understand data, think from the business point of view and come up with insights. For this very reason technology professionals with Analytics skills are finding themselves in high demand as businesses look to harness the power of Big Data. A professional with Analytical skills can master the ocean of Big Data and become a vital asset to an organization, boosting the business and their career.
Foster Provost & Tom Fawcett
It is important to understand data science even if you never intend to do it yourself, because data analysis is now so critical to business strategy. Businesses increasingly are driven by data analytics, so there is great professional advantage in being able to interact competently with and within such businesses. Understanding the fundamental concepts, and having frameworks for organizing data-analytic thinking not only will allow one to interact competently, but will help to envision opportunities for improving data-driven decision-making, or to see data-oriented competitive threats.
Shahbaz Ali
(DEC 24, 2014)
When data is locked in silos, organizations are unable to find and include all enterprise data for use with big data analytics tools. Planning to implement a data centric data management strategy enables the distributed metadata repository to be a source for analytics tools, as it can be used to provide real-time insight, without having to migrate data from silos to a separate analytics platform. It also enhances the quality of results, because having more relevant data often produces more accurate analysis. If organizations can harness all of their data, they will attain a greater competitive advantage.
Philipp Max Hartmann, Mohamed Zaki, Niels Feldmann, Andy Neely In the field of ‘big data’, Gartner identified five different types of data source used to ‘exploit big data’ in a company (Buytendijk et al., 2013): ‘Operational data comes from transaction systems, the monitoring of streaming data and sensor data; Dark data is data that you already own but don’t use: emails, contracts, written reports and so forth; Commercial data may be structured or unstructured, and is purchased from industry organisations, social media providers and so on; Social data comes from Twitter, Facebook and other interfaces; Public data can have numerous formats and topics, such as economic data, socio-demographic data and even weather data.’
Tracey Wallace
(September 8, 2014)
Our Collective Data Science Duty: Here’s the thing, technology is empowering the public in never before seen ways, and data is the backbone of that shift. Between wearable tech and digital identity platforms, people are creating more data every day than has ever been created in decades, no, centuries past. Each of us is essentially our own personal data scientist, and those working in the digital space have very much been their own statisticians for quite some time. It’s why platforms like Google Analytics, Omniture and more are so popular across the industry. They put the power of analytics in the hands of users, requiring little training but returning lots of measurability.
Tom Phelan
(February 10, 2015)
An agile environment is one that’s adaptive and promotes evolutionary development and continuous improvement. It fosters flexibility and champions fast failures. Perhaps most importantly, it helps software development teams build and deliver optimal solutions as rapidly as possible. That’s because in today’s competitive market chock-full of tech-savvy customers used to new apps and app updates every day and copious amounts of data with which to work, IT teams can no longer respond to IT requests with months-long development cycles. It doesn’t matter if the request is from a product manager looking to map the next rev’s upgrade or a data scientist asking for a new analytics model.
Jeff Leek
Data science done well looks easy – and that is a big problem for data scientists. The really tricky twist is that bad data science looks easy too. You can scrape a data set off the web and slap a machine learning algorithm on it no problem. So how do you judge whether a data science project is really ‘hard’ and whether the data scientist is an expert? Just like with anything, there is no easy shortcut to evaluating data science projects. You have to ask questions about the details of how the data were collected, what kind of biases might exist, why they picked one data set over another, etc. In the meantime, don’t be fooled by what looks like simple data science – it can often be pretty effective.
Mike Barlow
Top takeaways from my interviews with experts from organizations offering AI products and services:
• AI is too big for any single device or system
• AI is a distributed phenomenon
• AI will deliver value to users through devices, but the heavy lifting will be performed in the cloud
• AI is a two-way street, with information passed back and forth between local devices and remote systems
• AI apps and interfaces will be designed and engineered increasingly for nontechnical users
• Companies will incorporate AI capabilities into new products and services routinely
• A new generation of AI-enriched products and services will be connected and supported through the cloud
• AI in the cloud will become a standard combination, like peanut butter and jelly
Alice Zheng
If we think of training the model as a part of it, then even after you’ve trained a model and evaluated it and found it to be good by some evaluation metric standards, when you deploy it, where it actually goes and faces users, then there’s a different set of metrics that would impact the users. You might measure: how long do users actually interact with this model? Does it actually make a difference in the length of time? Did they used to interact less and now they’re more engaged, or vice versa? That’s different from whatever evaluation metric that you used, like AUC or per class accuracy or precision and recall. … It’s probably not enough to just say this model has a .85 F1 score and expect someone who has not done any data science to understand what that means. How good are the results? What does it actually mean to the end users of the product?
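The offline metrics named in the quote above (precision, recall, F1) can be made concrete with a small sketch. The function name and the confusion-matrix counts below are hypothetical, chosen only so that the result matches the ".85 F1 score" mentioned in the quote; they do not come from any real model.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw confusion-matrix counts.

    tp = true positives, fp = false positives, fn = false negatives.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=85, fp=15, fn=15)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A number like this summarizes offline classification quality only; the engagement questions raised in the quote (how long users interact, whether that changed) need separate measurements taken after deployment.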
Vincent Granville
(November 15, 2014)
A different perspective on what data scientists are capable of:
• Imagine dozens of scenarios and rank them by chance of occurring
• Get siloed data from various departments (finance, sales, marketing, product, IT)
• Analyze the data in connection with the scenarios (including checking data validity)
• Get external data (competitive intelligence) as needed
• Find the causes (not just correlations)
• Find the remedies
• Detect issues well before anyone else can see them, by looking in summary data
• Complete the analysis with a 48-hour turnaround
Such a data scientist, who can save a company billions, is usually not hired, for the following reasons:
• Companies are looking for coders, not business solvers, when they hire a data guru, despite claiming the contrary
• A data scientist without Python on his resume is unlikely to ever get hired
• Hard work gets rewarded, smart work does not.
Strategy& There is no general rule dictating how organizations should navigate the stages of big data maturity. They must each decide for themselves, based on their own situation – the competitive environment they are operating in, their business model, and their existing internal capabilities. In less-advanced sectors, with executives still grappling with existing data, making intelligent use of what they already possess may have a substantial impact on decision making.
The main priorities for executives are to:
• develop a clear (big) data strategy;
• prove the value of data in pilot schemes;
• identify the owner for “big data” in the organization and formally establish a “Chief Data Scientist” position (where applicable);
• recruit/train talent to ask the right questions and technical personnel to provide the systems and tools to allow data scientists to answer those questions;
• position big data as an integral element of the operating model; and
• establish a data-driven decision culture and launch a communication campaign around it.
Mark van Rijmenam
(31 Dec. 2014)
Pattern Analytics can be defined as a discipline of Big Data that enables business leaders to understand how different variables of the business interact and are linked with each other. Variables can be of any kind and within any data source, structured as well as unstructured. Such patterns can indicate opportunities for innovation or threats of disruption for your business and therefore require action. Finding patterns within the data and sifting it out is difficult. Machine learning can contribute in helping us humans find patterns that are relevant, but too difficult for us to see. This enables organizations to find patterns they act on. Business leaders can learn from these patterns and use them in their decision-making process. Business leaders therefore should rely less on their gut feeling and years of experience, and more on the data. Pattern Analytics does not require predefined models; the algorithms will do the work for you and find whatever is relevant in a combination of large sets of data. The key with pattern analytics is automatically revealing intelligence that is hidden in the data and these insights will help you grow your business.
Alice Zheng
There’s structure in it, but it’s kind of a different form. … It’s spit out by machines and programs. There’s structure, but that structure is difficult to understand for humans. … So, you can’t just throw all of it into an algorithm and expect the algorithm to be able to make sense of it. You really have to process the features, do a lot of pre-processing, and first do things like extract out the frequent sequences, maybe, or figure out what’s the right way to represent IP addresses, for instance. Maybe you don’t want to represent latency by the actual latency number, which could have a very skewed distribution, with lots and lots of large numbers. You might want to assign them into bins or something. There are a lot of things that you need to do to get the data into a format that’s friendly to the model, and then you want to choose the right model. Maybe after you choose the model, you realize this model really is suitable for numeric data and not categorical data. Then you need to go back to the feature engineering part and figure out the best way to represent the data. … I hesitate to say anything critical because half of my friends are in machine learning, which is all about algorithms. I think we already have enough algorithms. It’s not that we don’t need more and better algorithms. I think a much, much bigger challenge is data itself, features, and feature engineering.
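One feature-engineering step mentioned in the quote above, assigning skewed latency values to bins instead of using the raw number, might look like the following minimal sketch. The bin edges and the function name are assumptions made for illustration; the quote does not say which edges or how many bins to use.

```python
from bisect import bisect_right

# Hypothetical bin edges in milliseconds, spaced by powers of ten so that a
# long right tail of very large latencies collapses into a few categories.
EDGES_MS = [1, 10, 100, 1000, 10000]


def latency_bin(latency_ms):
    """Map a raw latency to a bin index between 0 and len(EDGES_MS).

    Bin 0 holds values below the first edge; the last bin holds everything
    at or above the final edge.
    """
    return bisect_right(EDGES_MS, latency_ms)


raw_latencies = [0.5, 5, 250, 50000]  # made-up sample values
print([latency_bin(x) for x in raw_latencies])
```

The resulting bin index is a categorical feature, so whether it suits a given model depends on the point made later in the quote: some models handle numeric inputs better than categorical ones, and the representation may need to be revisited after the model is chosen.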
Michael Jordan
Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering (uncertainty and complexity), and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms. Fundamental to the idea of a graphical model is the notion of modularity: a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data. The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly-interacting sets of variables as well as a data structure that lends itself naturally to the design of efficient general-purpose algorithms. Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition and statistical mechanics are special cases of the general graphical model formalism; examples include mixture models, factor analysis, hidden Markov models, Kalman filters and Ising models. The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism. This view has many advantages: in particular, specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely. Moreover, the graphical model formalism provides a natural framework for the design of new systems.
Istvan Hajnal
(February 23, 2015)
There are a few trends in the Big Data and Data Science world that can be of interest to market researchers:
• Visualization. There is a lot of interest in the Big Data and Data Science world for everything that has to do with Visualization. I’ll admit that sometimes it is Visualize to Impress rather than to Inform, but when it comes to informing clearly, communicating in a simple and understandable way, storytelling, and so on, we market researchers have a head start.
• Natural Language Processing. One of the 4 V’s of Big Data stands for Variety. Very often this refers to unstructured data, which sometimes refers to free text. Big Data and Data Science folks, for instance, start to analyze text that is entered in the free fields of production systems. This problem is not dissimilar to what we do when we analyse open questions. Again market research has an opportunity to play a role here. By the way, it goes beyond sentiment analysis. Techniques that I’ve seen successfully used in the Big Data / Data Science world are topic generation and document classification. Think about analysing customer complaints, for instance.
• Deep Learning. Deep learning risks becoming the next fad, largely because of the name Deep. But deep here does not refer to profound, but rather to the fact that you have multiple hidden layers in a neural network. And a neural network is basically a logistic regression (OK, I simplify a bit here). So absolutely no magic here, but absolutely great results. Deep learning is a machine learning technique that tries to model high-level abstractions by using so-called learning representations of data, where data is transformed to a representation of that data that is easier to use with other Machine Learning techniques. A typical example is a picture that consists of pixels. These pixels can be represented by more abstract elements such as edges, shapes, and so on. These edges and shapes can in turn be further represented by simple objects, and so on. In the end, this example leads to systems that are able to reasonably describe pictures in broad terms, but nonetheless useful for practical purposes, especially when processing by humans is not an option. How can this be applied in Market Research? Already today (shallow) Neural networks are used in Market Research. One research company I know uses neural networks to classify products sold in stores in broad buckets such as petfood, clothing, and so on, based on the free field descriptions that come with the barcode data that the stores deliver.
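
The remark above that "a neural network is basically a logistic regression" can be illustrated for the simplest case: a single unit with a sigmoid activation computes the same function as a logistic regression model. This is a sketch under that simplification only; the weights and inputs are made-up values, and a deep network stacks many such units in hidden layers rather than using one.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, weights, bias):
    """One sigmoid unit: a weighted sum of the inputs passed through sigmoid.

    This is the same functional form as a logistic regression prediction.
    Not hardened against overflow for extremely negative weighted sums.
    """
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


# Hypothetical weights and input, for illustration only.
print(neuron([1.0, 2.0], weights=[0.5, -0.25], bias=0.0))
```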

5 thoughts on “Quotes”

  1. Dear Michael !
    I liked your Quotes really. You can see my work at Also you can 2 video here. It’s original for kdnuggets post 😉
    my best regards


    • Hello Andy, thank you very much for your hint. I had a look at your list and found 40 which were not in my list right now. My list now contains >700 from which I publish one a day. So at least another 2 Years …. There are some typos in your list, e.g “better plac”. You might have a look. Thank you very much, Michael


      • Hello Michael !
        Thanks a lot for your attention to my humble work. I have fixed typo “better place” and hope for best. How did you find videos for #1, #2 interviews quotes ?
        I hope you enjoy it too :)) I saw your web site and found it very useful for me.
        So thanks again for your attention.


  2. Very nice post. I simply stumbled upon your weblog and wanted to say that
    I’ve really loved browsing your blog posts. In any case
    I will be subscribing for your rss feed and I’m hoping you write again soon!


  3. hatemgkotb said:

    This is simply AMAZING!

