|Quotes| = 264
Jake Porway Data is new eyes.
Satya Nadella
(February 21, 2017)
Bots are the new apps.
Daniel Tunkelang Failure is a great teacher.
Jonathan Lenaghan Having no competitors is bad.
Anna Smith Work on positivity and patience.
Isabelle Nuage
(February 24, 2015)
Big Data by itself is of little use.
Ray Major
(November 6, 2014)
Better Predictions = Higher Profits.
Ed Burns
(June 2015)
Machine learning automates analytics.
John Langford
Prefer simplicity in algorithm design.
Eric Jonas Life is too short to not be having fun.
Kira Radinsky Working with data is like an adventure.
William Edwards Deming
In God we trust, all others bring data.
David Cearley
Every app now needs to be an analytic app.
Caitlin Smallwood It’s very obvious how different people are.
Patrick Gerald
(May 22, 2015)
No good business succeeds without analytics.
Joey Zwicker
(12. February 2015)
Hadoop has an irreparably fractured ecosystem.
Timothy E. Carone
(January 30, 2015)
Big Data is the oxygen for autonomous systems.
Amy Heineike Data science is already kind of a broad church.
Kamil Bartocha
(26. Apr 2015)
Academia and business are two different worlds.
Ryan Irwin
(August 20, 2014)
Have a sense of humor, and never stop learning!
John Keats Nothing ever becomes real till it is experienced.
Daniel Tunkelang Anything that looks interesting is probably wrong.
John Foreman I find it tough to find and hire the right people.
Kira Radinsky The person you hire has to understand the business.
Miguel de Cervantes from Don Quixote By a small sample, we may judge of the whole piece.
Victor Hugo Nothing is stronger than an idea whose time has come.
Larry Hardesty
(August 15, 2014)
In the age of big data, visualization tools are vital.
Joel Cadwell
(August 21, 2014)
R makes it so easy to fit many models to the same data.
Attributed to Einstein Models should be as simple as possible, but not more so.
Rob Kitchin Big data should complement small data, not replace them.
Ed Burns
(September 2015)
Big data analytics architecture requires integration push.
Niels Bohr Prediction is very difficult, especially about the future.
Christopher Bishop
Half of what we do at Microsoft Research is Machine Learning.
Kamil Bartocha
(26. Apr 2015)
You will spend most of your time cleaning and preparing data.
Daniel Tunkelang Search is the problem at the heart of the information economy.
Victor Hu It is hard to know what you really need until you dig into it.
Hans Rosling The idea is to go from numbers to information to understanding.
Isaiah, XXX 8 Now go, write it before them in a table, and note it in a book.
Rishi Shah
(September 24, 2014)
Big data profitability depends on your employee’s data literacy.
Gandhi If one takes care of the means, the end will take care of itself.
Chris Wiggins The main driver of my ideas has been seeing people doing it ‘wrong’.
Jake Porway Every company has data that can help make the world a better place.
All analysis starts with an understandable set of data and algorithms.
It’s no longer good enough to make decisions based on intuition alone.
Shayne Miel You have to turn your inputs into things the algorithm can understand.
Matthew Zeiler
Google is not really a search company. It’s a machine-learning company.
Andre Karpistsenko There is a big part of intuition in choosing the most important problem.
Josh Bloom
The first rule of data science is: don’t ask how to define data science.
Jake Porway Data scientists in the business world are all generally well-compensated.
John Foreman Talking to users is crucial because they point you in the right direction.
Caitlin Smallwood You imagine a data set & you salivate at just thinking about that data set.
Kamil Bartocha
(26. Apr 2015)
There is no fully automated Data Science. You need to get your hands dirty.
Jeff Dean
(November 2014)
Anything humans can do in 0.1 sec, the right big 10-layer network can do too.
Jeffrey Fry Having more data does not always give you the power to make better decisions.
Deepak Mohapatra
Anytime you can correlate a person, location and time, you can identify schemes.
Kaiser Fung One of the biggest myths of Big Data is that data alone produce complete answers.
Lana Klein
Analytics today is at the point of high awareness and very little understanding.
Andre Karpistsenko The idea or the initial enthusiasm is just a small part of doing something great.
Bob McDonald Data modeling, simulation, and other digital tools are reshaping how we innovate.
n.n. Multi-Criteria Decision Making is the aim to order multidimensional alternatives.
ATKearney Is Big Data the 21st century equivalent of the Industrial Revolution? We think so.
R is in the process of becoming the multi-platform lingua franca of data analysis.
Foster Provost & Tom Fawcett
Increasingly, business decisions are being made automatically by computer systems.
Manoj Sharma
(December 30, 2014)
One of the most important steps in the Data Analytics process is Feature Selection.
Richard Pugh
(25 June 2015)
In my opinion, the single most important skill for a data scientist is … Empathy.
Andre Karpistsenko The core lesson from tool-and-method explorations is that there is NO silver bullet.
Brian Caffo Like nearly all aspects of statistics, good modeling decisions are context dependent.
Joel Cadwell
(August 21, 2014)
Naming is an art, yet be careful not to add surplus meaning by being overly creative.
Jake Porway The world will be more effective if everyone can at least converse about data science.
Jonathan Lenaghan Losing somebody else’s money is one of the most horrible sinking feelings in the world.
John Tukey Numerical quantities focus on expected values, graphical summaries on unexpected values.
Tony Fisher
(May 15, 2015)
Today, big data is considered a differentiator. Soon, it will be considered a commodity.
Jeffrey Heer It’s an absolute myth that you can send an algorithm over raw data and have insights pop up.
John W. Tukey
The greatest value of a picture is when it forces us to notice what we never expected to see.
Tamara Dull
(March 20, 2015)
The data lake is essential for any organization who wants to take full advantage of its data.
Thomas J. Watson The great accomplishments of man have resulted from the transmission of ideas and enthusiasm.
John Mount
(April 19, 2013)
Machine learning and statistics may be the stars, but data science orchestrates the whole show.
When you staff a project with people who are skilled and fascinated by the problem, you get gold.
Henri Poincaré
With logic one can construct proofs, but not gain new insights; that requires intuition.
John Cook
(26 March 2015)
Statistics aims to build accurate models … Machine learning aims to solve problems more directly.
Pierre Simon, Marquis de Laplace The most important questions of life are, for the most part, really only problems of probability.
Pradyumna S. Upadrashta
(February 13, 2015)
You shouldn’t be collecting Big Data under the premise that more data is better, cooler, sexier, etc.
Yann LeCun It’s useful for a company to have its scientists actually publish what they do. It keeps them honest.
BI Community What is the most used feature in any business intelligence solution? It is the Export to Excel button.
David Hilbert Mathematics knows no races or geographic boundaries; for mathematics, the cultural world is one country.
Gabriel Lowy
(February 24, 2015)
Big data does not change the relationship between data quality and decision outcomes. It underscores it.
John Foreman What we focus on, and this is going to sound goofy for a data scientist – is the happiness of our users.
Milton Friedman The only relevant test of the validity of a hypothesis is comparison of its predictions with experience.
W. Edwards Deming The only useful function for a statistician is to make predictions, and thus provide a basis for action.
Krzysztof Zawadzki
(August 30, 2014)
Finding a data scientist is hard. Finding people who understand who a data scientist is, is equally hard.
Eric Jonas The biggest thing people should be working on is problems they find interesting, exciting, and meaningful.
Linus Torvalds Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
A. N. Whitehead The aim of science is to seek the simplest explanation of complex facts… Seek simplicity and distrust it.
Eran Levy
Mashing up multiple data sources to generate a single source of truth is an integral part of data analysis.
TJ Laher
(November 14, 2014)
Leading organizations have already begun to see serious returns on deploying a pervasive analytics strategy.
H.G. Wells/Samuel S. Wilks
Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.
Michael Greene
To find new trends and strong patterns from large complex data sets, a strong analytics foundation is needed.
Andrew Gelman
(28 April 2015)
Measurement, measurement, measurement. It’s central to statistics. It’s central to how we learn about the world.
Ivan Vasilev The hidden layer is where the (neural) network stores its internal abstract representation of the training data.
P. Dawid
Causal inference is one of the most important, most subtle, and most neglected of all the problems of Statistics.
R. A. Fisher … the actual and physical conduct of an experiment must govern the statistical procedure of its interpretation.
Xavier Conort The algorithms we used are very standard for Kagglers. […] We spent most of our efforts in feature engineering.
Yann LeCun Most of the knowledge in the world in the future is going to be extracted by machines and will reside in machines.
Thomas Carlyle A judicious man looks on statistics not to get knowledge, but to save himself from having ignorance foisted on him.
Big data is about infrastructure, while analytics is about enabling informed decisions and measuring business impact.
Pelin Thorogood
(August 21, 2014)
We really need people who have the left brain and right working in balance, while also knowledgeable of the business.
Arthur Samuel
[Machine learning is the] field of study that gives computers the ability to learn without being explicitly programmed.
Hadley Wickham Any real data analysis involves data manipulation (sometimes called wrangling or munging), visualization and modelling.
n.n. Data does replace heuristics, hard-coded rules, assumptions and beliefs. Machine learning only enables data to do that.
Hilaire Belloc Statistics are the triumph of the quantitative method, and the quantitative method is the victory of sterility and death.
Ed Burns
(August 2014)
One of the keys to success in big data analytics projects is building strong ties between data analysts and business units.
Yann LeCun The data sets are truly gigantic. There are some areas where there’s more data than we can currently process intelligently.
(June 2014)
Traditional BI looks at data through a soda straw. Big data analytics looks at data through powerful, wide-angle binoculars.
Amy Heineike The key is figuring out how you get those three things: the right problem, the right data, and the right methodology to meld.
Mike Barlow
Thanks to a perfect storm of recent advances in the tech industry, AI has risen from the ashes and regained its aura of cool.
Antoine de Saint-Exupéry A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.
Michael Walker
(October 14, 2014)
Beware of tech firms selling you data tech with fantastic claims of finding meaning in data and creating competitive advantage.
Paul Roehrig, Ben Pring
It’s a new era in business, one in which growth will be driven as much by insight and foresight as by physical products and assets.
Suetonia Palmer
‘Innovative statistical techniques’ are important, but the key to getting good results here is a mind-boggling amount of actual work.
Foster Provost & Tom Fawcett
Take big data to mean datasets that are too large for traditional data-processing systems and that therefore require new technologies.
Jake Porway
(October 1, 2015)
Data is not truth, and tech is not an answer in-and-of-itself. Without designing for the humans on the other end, our work is in vain.
Claudia Perlich The conversation is based around how to properly deal with even more sensitive information about where exactly people spend their lives.
Pradyumna S. Upadrashta
(February 13, 2015)
Before jumping on the Big Data bandwagon, I think it is important to ask the question of whether the problem you have requires much data.
David Puglia, FrontRange
(30. December 2014)
In comparison to IPv4’s 4.3 billion IP addresses, IPv6 can assign about 340 trillion trillion trillion addresses and corresponding devices.
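The figures in the quote above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the standard address widths (32 bits for IPv4, 128 bits for IPv6) and reads "trillion" in the short-scale sense of 10^12; the variable names are illustrative.

```python
# Address-space sizes implied by the address widths (assumed: 32-bit IPv4, 128-bit IPv6).
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

TRILLION = 10 ** 12  # short-scale trillion, an assumption about the quote's usage

print(f"IPv4 addresses: {ipv4_addresses:,}")
print(f"IPv6 addresses: {ipv6_addresses:,}")
# Express the IPv6 space in units of "trillion trillion trillion" (10^36).
print(f"IPv6 in trillion^3 units: {ipv6_addresses // TRILLION ** 3}")
```

Under these assumptions the IPv4 space is about 4.3 billion addresses and the IPv6 space is about 340 units of 10^36, which matches the wording of the quote.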
Yann LeCun You don’t want to just hire clones of the same person, because then they will all want to explore the same things. You want some diversity.
Josh Wills A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.
Andrew Ng Coming up with features is difficult, time-consuming, requires expert knowledge. “Applied machine learning” is basically feature engineering.
John Foreman
If your goal is to positively impact the business, not to build a clustering algorithm that leverages storm and the Twitter API, you’ll be OK.
Erin Shellman As a data scientist, even if you don’t have the domain expertise you can learn it, and can work on any problem that can be quantitatively described.
John W. Tukey The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
Michele Nemschoff
(August 30, 2014)
Big data isn’t just for developers and analysts in the technical arena. In today’s digital age, big data has become a powerful tool across industries.
European Union’s General Data Protection Regulation (GDPR)
(Dec. 2016)
Organizations that use ML to make user-impacting decisions must be able to fully explain the data and algorithms that resulted in a particular decision.
Foster Provost & Tom Fawcett
At a high level, data science is a set of fundamental principles that support and guide the principled extraction of information and knowledge from data.
H. James Harrington
If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.
John W. Tukey Far better an approximate answer to the right question which is often vague, than an exact answer to the wrong question which can always be made precise.
(August 20, 2015)
Every number has a story. As a data scientist, you have the incredible job of digging in and analyzing massive sets of numbers to find what that story is.
Michael Young
For many organisations, the accessibility of the tools and products to deliver analytics and data mining has led to an increased awareness of the benefits.
Eric Jonas Graduate students, perhaps because of an adherence to sunk cost fallacy, often write really great surveys of the field at the beginning of their PhD thesis.
Andre Karpistsenko Getting through life, through those uncertainties in a way, when you look back and see things still connect and exist, that’s the biggest measure of success.
Yann LeCun Knowledge is some compilation of data that allows you to make decisions, and what we find today is that computers are making a lot of decisions automatically.
Foster Provost, Tom Fawcett
However, there is confusion about what exactly data science is, and this confusion could lead to disillusionment as the concept diffuses into meaningless buzz.
Third Nature
Data warehouses have not been able to keep up with business demands for new sources of information, new types of data, more complex analysis and greater speed.
Eric Jonas When I evaluate machine learning papers, what I am looking to find out is whether the technique worked or not. This is something that the world needs to know …
Tess Nesbitt Sampling – analyzing representative portions of the available information – can help speed development time on models, enabling them to be deployed more quickly.
John Foreman It’s essential for a data science team to hire people who can really speak about the technical things they’ve done in a way that nontechnical people can understand.
David Lewis-Williams Scientists do not collect data randomly and utterly comprehensively. The data they collect are only those that they consider ‘relevant’ to some hypothesis or theory.
Fred Brooks Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowchart; it’ll be obvious.
Ronald Fisher To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem – he may be able to say what the experiment died of.
Kaiser Fung Before getting into the methodological issues, one needs to ask the most basic question. Did the researchers check the quality of the data or just take the data as is?
W. H. Auden Thou shalt not answer questionnaires Or quizzes upon world affairs, Nor with compliance Take any test. Thou shalt not sit with statisticians nor commit A social science.
Justin Washtell
(November 3, 2014)
The central premise of predictive modeling is precisely that one size does not fit all – otherwise we would just assign the same outcome to all cases and be done with it.
John Foreman Vendors are there to sell you a tool for a problem you may or may not have yet, and they’re very good at convincing you that you need it whether you actually need it or not.
Victor Hu Hiring data scientists is very exciting at this time because in some ways there are no established guidelines on how to do it. People have skills in so many different areas.
William S. Cleveland
Data analysis needs to be part of the blood stream of each department and all should be aware of the workings of subject matter investigations and derive stimulus from them.
Istvan Hajnal
(February 23, 2015)
My advice to the market research world is to stop conceptualizing so much when it comes to Big Data and Data Science and simply apply the new techniques where appropriate.
ATKearney Although Big Data processes large, diverse data sets to reveal complex relationships, humans are the crucial ingredient for interpreting the data and relationships into insights.
Valdis Krebs Innovation happens at the intersection of two or more different, yet similar, groups. Where one technology meets another, one discipline meets another, one department meets another.
(June 2015)
IBM will educate one million data scientists and data engineers on Apache Spark through extensive partnerships with AMPLab, DataCamp, MetiStream, Galvanize and Big Data University MOOC.
Martyn Jones
(March 12, 2015)
Is Big Data really about high volumes, high velocity and high variety, or is it in fact about much noise, too much pomposity and abundant similarity leading to unnecessary high anxiety?
n.n. We Learn . . .
10% of what we read
20% of what we hear
30% of what we see
50% of what we see and hear
70% of what we discuss
80% of what we experience
95% of what we teach others.
Tavish Srivastava
(May 19, 2015)
Machine Learning algorithms are like solving a Rubik’s Cube. You grapple at the beginning to figure out the hidden algorithm, but once learnt, some can even solve it in less than 7 seconds.
Yann LeCun The idea that somehow you can put a bunch of research scientists together and then put some random manager who’s not a scientist directing them doesn’t work. I’ve never ever seen it work.
Sundar Pichai Machine learning is a core, transformative way by which we’re rethinking everything we’re doing. We’re thoughtfully applying it across all our products, be it search, ads, YouTube or Play.
To quickly detect and respond to issues, organizations need an analytics platform that offers rich statistical process control (SPC) functionality as well as real-time monitoring and alerting.
Julie Hunt
(April 7, 2015)
The need to analyze data is at the foundation of every effective data management strategy, whether the analysis is handled from the business perspective or the technology side of the equation.
Nikhil Buduma
(29 December 2014)
In general, choosing smart training cases is a very good idea. There’s lots of research that shows that by engineering a clever training set, you can make your neural net a lot more effective.
David McCandless By visualizing information, we turn it into a landscape that you can explore with your eyes, a sort of information map. And when you’re lost in information, an information map is kind of useful.
Kaiser Fung
(May 2015)
Story time is the moment in a report on data analysis when the author deftly moves from reporting a finding of data to the telling of stories based on assumptions that do not come from the data.
Robert Neuhaus Feature engineering and feature selection are not mutually exclusive. They are both useful. I’d say feature engineering is more important though, especially because you can’t really automate it.
William S. Cleveland
Model building is complex because it requires combining information from exploring the data and information from sources external to the data such as subject matter theory and other sets of data.
Yann LeCun The amount of human brainpower on the planet is actually increasing exponentially as well, but with a very, very, very small exponent. It’s very slow growth rate compared to the data growth rate.
SBS documentary “The Age of Big Data” Data is becoming a powerful and most valuable commodity in the 21st century. It is leading to scientific insights and new ways of understanding human behaviour. Data can also make you rich. Very rich.
Lord Kelvin When you can measure what you are speaking about and express it in numbers, you know something about it. When you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind.
European Union’s General Data Protection Regulation (GDPR)
(Dec. 2016)
How could a result be explained, especially a result of a machine learning model, without a versioned record of what data was input to generate the result and what data was output representing the result?
William S. Cleveland
Theory, both mathematical and non-mathematical theory, is vital to data science. … Tools of data science – models and methods together with computational methods and computing systems – link data and theory.
Jeroen Janssens Data scientists love to create interesting models and exciting data visualizations. However, before they get to that point, usually much effort goes into obtaining, scrubbing, and exploring the required data.
Mark Hammond
There are 18 million developers in the world, but only one in a thousand have expertise in artificial intelligence. To a lot of developers, AI is inscrutable and inaccessible. We’re trying to ease the burden.
Michal Klos
(January 28, 2015)
We are in the Golden Age of Data. For those of us on the front-lines, it doesn’t feel that way. Every step forward this technology takes, the need for deeper analytics takes two. We’re constantly catching up.
Fatih Hamurcu
(May 7, 2015)
On a sequential computer, the fast algorithm is the best algorithm, but for new science area, I believe we need more creative approaches for algorithm design in order to extract more valuable insight in real-time.
Kaiser Fung We are not saying that statisticians should not tell stories. Story-telling is one of our responsibilities. What we want to see is a clear delineation of what is data-driven and what is theory (i.e., assumptions).
Kune, Konugurthi, Agarwal, Chillarige, Buyya
Big Data and traditional data warehousing systems, however, have the similar goals to deliver business value through the analysis of data, but, they differ in the analytics methods and the organization of the data.
Suman Malekani
(January 29, 2015)
While working on Big Data & planning to implement it for the benefit of business, it is very important to explain the insights & valuable knowledge in a way that non-technical business user can actually understand.
Dr. Olly Downs
(May 18, 2015)
Most of the big data investment focus to date has been on the underlying infrastructure, while development of the applications that make use of that infrastructure – and that deliver actual business value – has lagged.
Data integration features have gained prominence during the last year as companies struggled to incorporate new data sources in their analysis, a process that can consume a sizable percentage of the total project time.
Jeff Leek
To evaluate a person’s work or their productivity requires three things:
1. To be an expert in what they do
2. To have absolutely no reason to care whether they succeed or not
3. To have time available to evaluate them.
R. A. Fisher … the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only to give the facts a chance of disproving the null hypothesis.
Amir Hajian
A good data scientist in my mind is the person that takes the science part in data science very seriously; a person who is able to find problems and solve them using statistics, machine learning, and distributed computing.
Julia Evans Cleaning up data to the point where you can work with it is a huge amount of work. If you’re trying to reconcile a lot of sources of data that you don’t control like in this flight search example, it can take 80% of your time.
One robust way to determine if two time series, x_t and y_t, are related is to analyze whether there exists an equation like y_t = β x_t + u_t such that the residuals (u_t) are stationary (their mean and variance do not change when shifted in time).
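The residual check described above can be sketched in plain Python. This is a minimal illustration, not a definitive procedure: it assumes a two-step approach in the spirit of the Engle-Granger test (fit β by least squares, then run a basic Dickey-Fuller regression on the residuals with no constant and no lagged differences). The helper names and the simulated series are hypothetical. In practice the statistic has to be compared with cointegration-specific critical values rather than standard normal ones.

```python
import math
import random


def ols_slope(x, y):
    """Least-squares slope for y = beta * x + u (no intercept, as in the text)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def dickey_fuller_t(u):
    """t-statistic of rho in  u_t - u_{t-1} = rho * u_{t-1} + e_t  (no constant, no lags)."""
    lagged = u[:-1]
    diffs = [b - a for a, b in zip(u[:-1], u[1:])]
    rho = ols_slope(lagged, diffs)
    errors = [d - rho * l for d, l in zip(diffs, lagged)]
    sigma2 = sum(e * e for e in errors) / (len(errors) - 1)
    std_err = math.sqrt(sigma2 / sum(l * l for l in lagged))
    return rho / std_err


def residual_t_stat(x, y):
    """Fit y = beta * x + u, then test the residuals u for stationarity."""
    beta = ols_slope(x, y)
    residuals = [b - beta * a for a, b in zip(x, y)]
    return beta, dickey_fuller_t(residuals)


def random_walk(n, rng):
    """Simulated non-stationary series: cumulative sum of Gaussian shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path


rng = random.Random(0)
x = random_walk(500, rng)
y_related = [2.0 * a + rng.gauss(0, 1) for a in x]  # shares x's stochastic trend
y_unrelated = random_walk(500, rng)                 # independent random walk

beta_rel, t_rel = residual_t_stat(x, y_related)
beta_unrel, t_unrel = residual_t_stat(x, y_unrelated)
print(f"related:   beta = {beta_rel:.3f}, residual t-stat = {t_rel:.2f}")
print(f"unrelated: beta = {beta_unrel:.3f}, residual t-stat = {t_unrel:.2f}")
```

A strongly negative statistic suggests the residuals revert to a stable level, so the two series move together; a statistic near zero suggests the fitted relation is spurious.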
Eric Jonas Academic culture teaches you that you’re dumb and that you’re probably wrong because most things never work, nature is very hard, and the best you can hope for is working on interesting problems and making a tiny bit of progress.
Donald Rumsfeld There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.
Marcia Kaufman, Daniel Kirsch
It is no longer sufficient for businesses to understand what has happened in the past, rather it has become essential to ask what will happen in the future, to anticipate trends and to take action that optimize results for business.
Eric Jonas The right thing to do is to not build a tool company but to build a consultancy based on the tools. Identify the company, identify the market, and build a consultancy. Later, if that works, you can then pivot to being a tool company.
Stephan Duquesnoy
Comment of a DeepLearning user: As a side-note, even though I’m good with pattern-based thinking, I do not have an academic background. I lack patience and feel the need to create, rather than to completely understand what I’m doing.
Eric Colson, Brad Klingenberg, Jeff Magnusson
(March 31, 2015)
Data science can directly enable a strategic differentiator if the company’s core competency depends on its data and analytic capabilities. When this happens, the company becomes supportive to data science instead of the other way around.
Eric Jonas I actually think a lot of the future is in small data …. As the big data hype cycle crests, we’re going to see more and more people recognizing that what they really want to be doing is asking interesting questions of smaller data sets.
Analise Polsky
Improving Visual Data Discovery:
1. Always have new data sources.
2. Always have new techniques.
3. Always have new tools and platforms.
Visual data discovery is not once and done. It is an iterative process that requires communication and exploration.
Nikhil Buduma
(29 December 2014)
So what’s the idea behind backpropagation? We don’t know what the hidden units ought to be doing, but what we can do is compute how fast the error changes as we change a hidden activity. Essentially we’ll be trying to find the path of steepest descent!
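The idea in the quote above, computing how fast the error changes as a hidden activity changes, can be sketched on a toy network. Everything here is a hypothetical illustration: a 2-input, 2-hidden-unit (sigmoid), 1-output (linear) network with made-up weights and a squared-error loss. The chain-rule derivatives are compared with finite differences, followed by one small gradient-descent step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy network: 2 inputs -> 2 sigmoid hidden units -> 1 linear output.
x = [0.5, -1.0]
w_hidden = [[0.1, 0.4], [-0.3, 0.2]]  # w_hidden[j][i]: input i -> hidden unit j
w_out = [0.7, -0.5]                   # hidden unit j -> output
target = 1.0


def error_from_hidden(h):
    """Squared error as a function of the hidden activities only."""
    output = sum(w * a for w, a in zip(w_out, h))
    return 0.5 * (output - target) ** 2


def forward(weights_hidden):
    """Return hidden activities, output, and error for the given hidden weights."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in weights_hidden]
    output = sum(w * a for w, a in zip(w_out, h))
    return h, output, error_from_hidden(h)


h, output, error_before = forward(w_hidden)

# Backward pass: how fast does the error change with each hidden activity?
d_error_d_output = output - target
d_error_d_hidden = [d_error_d_output * w for w in w_out]

# One more chain-rule step: from hidden activity to the weights feeding it.
d_error_d_w = [
    [d_error_d_hidden[j] * h[j] * (1.0 - h[j]) * x[i] for i in range(len(x))]
    for j in range(len(h))
]

# Finite-difference check of d_error_d_hidden.
eps = 1e-6
numeric = []
for j in range(len(h)):
    bumped = list(h)
    bumped[j] += eps
    numeric.append((error_from_hidden(bumped) - error_from_hidden(h)) / eps)

# One small step against the gradient (learning rate chosen arbitrarily).
learning_rate = 0.1
w_stepped = [
    [w - learning_rate * g for w, g in zip(w_row, g_row)]
    for w_row, g_row in zip(w_hidden, d_error_d_w)
]
_, _, error_after = forward(w_stepped)

print("analytic dE/dh:", d_error_d_hidden)
print("numeric  dE/dh:", numeric)
print("error before step:", error_before, "after step:", error_after)
```

The analytic and numeric derivatives should agree closely, and the error after the step should be lower than before, which is the "steepest descent" direction the quote refers to.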
(March 4th, 2015)
Data Science has its own language. So, if you want to have at least a slight chance of surviving in the enterprise world of tomorrow – with its obsessive focus on collecting and analyzing data – you better have started yesterday with learning this terminol.
Bill Franks
(December 10, 2015)
One of the legendary events in the history of analytics was the original Netflix prize. The event led to a terrific example of the need to focus on not only theoretical results, but also pragmatically achievable results, when developing analytic processes.
Brandon Rohrer
(Dec 19, 2015)
Before data science can build the solution to simplify your life or make you lots of money, you have to give it some high quality raw materials to work with. Just like making a pizza, the better the ingredients you start with, the better the final product.
Richard Fichera Part of Hadoop’s appeal is that it is not specifically optimized for any specific solution or data type but rather a general framework for parallel processing, so your developers and data scientists can add any relevant data, whatever its format or source.
Richard A. Becker, William S. Cleveland
Making graphs is very basic to data analysis. Whether you use the leading edge of statistical methods, or whether you want to quickly see the main features of your data, graphs are a must. They are the single most powerful class of tools for analyzing data.
T. Alan Keahey Analytics plays a key role by helping to reduce the size and complexity of big data to a point where it can be effectively visualized and understood. In the best scenario, the visualization and analytics are integrated so that they work seamlessly with each other.
Nathan Yau What is good visualization? It is a representation of data that helps you see what you otherwise would have been blind to if you looked only at the naked source. It enables you to see trends, patterns and outliers that tell you about yourself and what surrounds you.
Vladimir N. Vapnik
After the success of the SVM in solving real-life problems, the interest in statistical learning theory significantly increased. For the first time, abstract mathematical results in statistical learning theory have a direct impact on algorithmic tools of data analysis.
Zachary Chase Lipton
(January 2015)
Generally, the systems implementation of machine learning methodology and ongoing software maintenance challenges are an understudied area that will continue to grow in importance as machine learning systems become more commonplace in commercial and open source software.
Ferris Jumah
(Sep 3, 2014)
We see that machine learning, data mining, data analysis and statistics are all highly ranking skills in the (Data Science Skill) network. This indicates that being able to understand and represent data mathematically, with statistical intuition, is a key skill for data scientists.
Kune, Konugurthi, Agarwal, Chillarige, Buyya
Big Data technologies are being adopted widely for information exploitation with the help of new analytics tools and large scale computing infrastructure to process huge variety of multi-dimensional data in several areas ranging from business intelligence to scientific explorations.
H. Simon The aim … is to provide a clear and rigorous basis for determining when a causal ordering can be said to hold between two variables or groups of variables in a model . . . . The concepts refer to a model – a system of equations – and not to the ‘real’ world the model purports to describe.
Dean Abbott
(December 06, 2015)
This kind of mindset is not learned in a university program; it is part of the personality of the individual. Good predictive modelers need to have a forensic mindset and intellectual curiosity, whether or not they understand the mathematics enough to derive the equations for linear regression.
Rao Naveen
There’s been a lot of talk about trying to make AI work on existing infrastructure. But the sad reality is that you’re always going to end up with something that’s far less than state-of-the-art. And I don’t mean it will be 30 or 40 percent slower. It’s more likely to be a thousand times slower.
Mark Barrenechea
(September 11, 2015)
Digital leaders know their data. They convert their information into actionable business insight. Considering that more data is shared online every second today than was stored in the entire Internet 20 years ago, it’s no wonder that differentiating products and services requires advanced tools.
Jonas Salk Reason alone will not serve. Intuition alone can be improved by reason, but reason alone without intuition can easily lead the wrong way … both are necessary. For myself, that’s how my mind works, and that’s how I work … It’s this combination that must be recognized and acknowledged and valued.
Jeffrey Heer, Michael Bostock, Vadim Ogievetsky
Graphical Perception Experiments find that spatial position (as in a scatter plot or bar chart) leads to the most accurate decoding of numerical data and is generally preferable to visual variables such as angle, one-dimensional length, two-dimensional area, three-dimensional volume, and color saturation.
Kevin Daly
Big data is not for the faint of heart; you and your team must be willing to master many disciplines in order to be successful. You’ll need understanding of code, hardware, virtualization, networking, databases (SQL & NoSQL), ETL, cloud, and more. Don’t fool yourself; you’ll need some serious skills on board.
Lana Klein
Remember that the most critical thing is not building an analytic solution but making sure that your organization starts using it: that means creating buy-in, working to build adoption, educating and training, and redesigning processes to include analytics. Give it time, be persistent, improve, and results will follow!
Enric Junqué de Fortuny, David Martens, Foster Provost
This study provides a clear illustration that larger data indeed can be more valuable assets for predictive analytics. This implies that institutions with larger data assets – plus the skill to take advantage of them – potentially can obtain substantial competitive advantage over institutions without such access or skill.
Nikhil Buduma
(29 December 2014)
[In Neural Networks] It is not required that a neuron has its outlet connected to the inputs of every neuron in the next layer. In fact, selecting which neurons to connect to which other neurons in the next layer is an art that comes from experience. Allowing maximal connectivity will more often than not result in overfitting.
Christophe Bourguignat
(Sep 16, 2014)
In real organizations, people need dead simple story-telling – Which features are you using? How do your algorithms work? What is your strategy? etc. … If your models are not parsimonious enough, you risk losing the audience’s confidence. Convincing stakeholders is a key driver for success, and people trust what they understand.
Mark van Rijmenam
(October 16, 2014)
Although such Business Intelligence is still quite common and does give you at least some insights, the fast-changing world of today requires a different approach. Organisations today should strive for a holistic overview of their internal and external data that is analysed on the spot and returned graphically via live storylines.
John Von Neumann The sciences do not try to explain, they hardly even try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of certain verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work.
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
(10 Nov 2016)
Indeed, in neural networks, we almost always choose our model as the output of running stochastic gradient descent. Appealing to linear models, we analyze how SGD acts as an implicit regularizer. For linear models, SGD always converges to a solution with small norm. Hence, the algorithm itself is implicitly regularizing the solution.
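The claim about linear models can be checked on a toy problem. The sketch below is illustrative only and is not taken from the quoted paper: it assumes an underdetermined least-squares problem (two samples, three weights), plain SGD started from the zero vector, and hypothetical helper names (`sgd_least_squares`, `min_norm_solution`). Under those assumptions, SGD should land on the same weights as the closed-form minimum-norm interpolating solution, even though other exact fits with larger norm exist.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sgd_least_squares(X, y, lr=0.05, epochs=5000, seed=0):
    """Plain SGD on squared error, one sample per step, started from zero."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = dot(w, X[i]) - y[i]
            # Gradient of 0.5 * err**2 with respect to w is err * X[i].
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
    return w


def min_norm_solution(X, y):
    """Closed form w = X^T (X X^T)^{-1} y, written out for exactly two samples."""
    a, b, d = dot(X[0], X[0]), dot(X[0], X[1]), dot(X[1], X[1])
    det = a * d - b * b
    alpha0 = (d * y[0] - b * y[1]) / det
    alpha1 = (a * y[1] - b * y[0]) / det
    return [alpha0 * x0 + alpha1 * x1 for x0, x1 in zip(X[0], X[1])]


# Two equations, three unknowns: infinitely many exact fits exist.
X = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]
y = [1.0, 2.0]

w_sgd = sgd_least_squares(X, y)
w_min = min_norm_solution(X, y)
w_other = [1.0, 0.0, 2.0]  # another exact fit, but with a larger norm
```

Because every update is a multiple of a training row, the iterate never leaves the row space of `X` when it starts at zero, which is why the limit is the smallest-norm fit rather than an arbitrary one such as `w_other`.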
The Economist
The end of data scientists. Data science moves from the specialist to the everyman. Familiarity with data analysis becomes part of the skill set of ordinary business users, not experts with “analyst” in their titles. Organizations that use data to make decisions are more successful, and those that don’t use data begin to fall behind.
Foster Provost & Tom Fawcett
On a scale less grand, but probably more common, data-analytics projects reach into all business units. Employees throughout these units must interact with the data-science team. If these employees do not have a fundamental grounding in the principles of data-analytic thinking, they will not really understand what is happening in the business.
Dan Hirpara
What data fusion brings to the table is the idea that end-users, whether they are humans or machines, are brought into the data processing loop as collaborators. By iteratively combining multiple data streams in new and interesting ways, driven by the changing needs of users, data fusion produces a wide variety of ways to aggregate data streams.
Gil Allouche
(January 9, 2015)
Improvements in technology and big data trends have given rise to improvements in machine learning. The sheer volume of data is growing exponentially, and companies are looking for faster speeds and real-time analytics. Cognitive computing combines machine learning and artificial intelligence to go beyond data mining and provide actionable insights.
Mark van Rijmenam
(September 2, 2014)
All these new Big Data applications require a new way of working. As a result, General Motors is currently undergoing a massive cultural change to become data-driven; hiring thousands of new employees will have a profound effect on the company culture, but in the end all existing and new employees must learn and adapt to this new data-driven, information-centric culture.
Hal Varian If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots of courses about how to manipulate and analyze data.
Joyce Jackson In many applications, particularly in the business domain, the data is not stationary, but rather changing and evolving. This changing data may make previously discovered patterns invalid and as a result, there is clearly a need for incremental methods that are able to update changing models, and for strategies to identify and manage patterns of temporal change in knowledge bases.
Wojciech Bolanowski
Numerous changes and innovations have come to life recently. The pace of the digital revolution is unimaginable, considering that it keeps on increasing. There is no doubt that most of the approaching digital changes are potentially disruptive to older habits, businesses, and beliefs. Unquestionably, they are changing the former way of life across the globe. They push the whole of humanity into something very new and completely unknown.
John Geer
(May 6, 2015)
There is predictable data as far as the eye can see. Millions of variables quietly tracing the path we thought, and perhaps hoped, they would. Because there are so many, noticing when one of these variables does something unexpected is a task that is unsolvable by diligence alone. In order to spot these rare unexpected observations, we need an often-overlooked statistical analysis: anomaly detection.
Jeff Leek
(Feb. 14, 2014)
Since most people performing data analysis are not statisticians there is a lot of room for error in the application of statistical methods. This error is magnified enormously when naive analysts are given too many “researcher degrees of freedom”. If a naive analyst can pick any of a range of methods and does not understand how they work, they will generally pick the one that gives them maximum benefit.
Daniel Gutierrez
(December 31, 2014)
What hiring companies consider requirements for being a data scientist. Here is a short list for an honest assessment:
– Are you really good at math – undeterred with calculus, differential equations, and linear algebra? Are you also strong in statistics and probability theory?
– Do you also know R and/or Python for developing machine learning algorithms?
– Do you have deep domain knowledge of a particular industry?
Marissa Mayer
(January 24, 2013)
The Web is so vast … you need to extend categorization and make sense of the content and have a Web ordered for you … One of the key pieces is you have to understand and decide what the Ontology of entities is. Meaning how things are named and how are they organized into hierarchies … By mapping people’s search habits you pull all their content together and have a feed of information that is the web ordered for you.
Some decisions you need to make are big enough to change the course for your business. And your past experiences may not be good predictors of the future. More data are within your reach to understand what was previously unknown. Sophisticated analytical tools are available to you to ‘see’ a wider range of possibilities and evaluate them quickly. Now is a good time for an upgrade in your decision making capabilities.
Judy Selby
(April 20, 2015)
Big Data’s undeniable impact on companies’ goodwill and reputation has permeated the landscape of corporate valuation. Recent research confirms that companies need to face the new normal whereby corporate reputations suffer after mishaps with data under their control. Today’s companies must appreciate that their use, misuse and governance of Big Data can have an impactful effect on their goodwill and resulting valuation.
Gordon S. Linoff
(September 15, 2014)
In any case, I come to the conclusion that Data Science is just another term in a long line of terms. Whether called statistics or customer analytics or data mining or analytics or data science, the goal is the same. Computers have been and are gathering incredible amounts of data about people, businesses, markets, economies, needs, desires, and solutions – there will always be people who take up the challenge of transforming the data into solutions.
Anmol Rajpurohit
(May 15, 2014)
For a long time, Predictive Analytics has been primarily the responsibility of the Data Science and Analytics team, but this outlook is changing fast. While the Data Science team remains the primary contributor, the responsibility is increasingly being shared with database management, BI, LOB (Line of Business) analysts and others. This clearly demonstrates the need for better training and support for the non-technical users of Predictive Analytics.
Avi Kalderon
(January 27, 2015)
Without effective data governance and data management, big data can mean big problems for many organizations already struggling with more data than they can handle. That “lake” they are building can very easily become a “cesspool” without appropriate data management practices that are adapted to this new platform. The solution? Firms need to actively adapt their data governance and data management capabilities – from implementing to ongoing maintenance.
Jeffrey P. Bigham
A machine isn’t a human. It’s not going to necessarily incorporate bias even from biased training data in the same way that a human would. Machine learning isn’t necessarily going to adopt – for lack of a better word – a clearly racist bias. It’s likely to have some kind of much more nuanced bias that is far more difficult to predict. It may, say, come up with very specific instances of people it doesn’t want to hire that may not even be related to human bias.
Albert Einstein You believe in a God who plays dice, and I in complete law and order in a world which objectively exists, and which I, in a wildly speculative way, am trying to capture. I firmly believe, but hope that someone will discover a more realistic way, or rather a more tangible basis than it has been my lot to do. Even the great initial success of the quantum theory does not make me believe in the fundamental dice game, although I am well aware that your younger colleagues interpret this as a consequence of senility.
Mkhuseli Mthukwane
(August 27, 2015)
Data Science forms the very substratum of an Analytics Practitioner’s work; it’s what sets us apart from Statisticians or Mathematicians. However, in some instances we cannot rely on it alone; we need to employ other measures to increase its definitiveness. In any event, I am sure many Data Scientists use math and other means to augment the potency of their Analytics, some not even scientific at all. It is undeniably prudent to do so where necessary, especially in fields that demand a higher standard of accuracy and care.
Mark van Rijmenam
(October 16, 2014)
In the fast moving world of today, data is being created at lightning speed. Data comes from an infinite variety of sources and all this data can be used to discover valuable business insights. Combining internal and external data can enable organisations to beat the competition, as the analysis will provide valuable insights. The more business users that work with such insights, the better your organisation will become. Organisations should therefore strive for a data-driven, information-centric culture, where every business user makes decisions based on data.
Durgesh Kaushik
(October 9, 2015)
Analytics, no matter how advanced, do not remove the need for human insights. On the contrary, there is a compelling need for skilled people with the ability to understand data, think from the business point of view and come up with insights. For this very reason, technology professionals with Analytics skills are finding themselves in high demand as businesses look to harness the power of Big Data. A professional with Analytical skills can master the ocean of Big Data and become a vital asset to an organization, boosting the business and their career.
Foster Provost & Tom Fawcett
It is important to understand data science even if you never intend to do it yourself, because data analysis is now so critical to business strategy. Businesses increasingly are driven by data analytics, so there is great professional advantage in being able to interact competently with and within such businesses. Understanding the fundamental concepts, and having frameworks for organizing data-analytic thinking, not only will allow one to interact competently, but will help to envision opportunities for improving data-driven decision-making, or to see data-oriented competitive threats.
Shahbaz Ali
(December 24, 2014)
When data is locked in silos, organizations are unable to find and include all enterprise data for use with big data analytics tools. Planning to implement a data-centric data management strategy enables the distributed metadata repository to be a source for analytics tools, as it can be used to provide real-time insight, without having to migrate data from silos to a separate analytics platform. It also enhances the quality of results, because having more relevant data often produces more accurate analysis. If organizations can harness all of their data, they will attain a greater competitive advantage.
Strategy& Big data have the potential to improve or transform existing business operations and reshape entire economic sectors. Big data can pave the way for disruptive, entrepreneurial companies and allow new industries to emerge. The technological aspect is important, but insufficient to allow big data to show their full potential and to stop companies from feeling swamped by this information. What matters is to reshape internal decision-making culture so that executives base their judgments on data rather than hunches. Research already indicates that companies that have managed this are more likely to be productive and profitable than the competition.
Philipp Max Hartmann, Mohamed Zaki, Niels Feldmann, Andy Neely In the field of ‘big data’, Gartner identified five different types of data source used to ‘exploit big data’ in a company (Buytendijk et al., 2013): ‘Operational data comes from transaction systems, the monitoring of streaming data and sensor data; Dark data is data that you already own but don’t use: emails, contracts, written reports and so forth; Commercial data may be structured or unstructured, and is purchased from industry organisations, social media providers and so on; Social data comes from Twitter, Facebook and other interfaces; Public data can have numerous formats and topics, such as economic data, socio-demographic data and even weather data.’
Tracey Wallace
(September 8, 2014)
Our Collective Data Science Duty: Here’s the thing, technology is empowering the public in never before seen ways, and data is the backbone of that shift. Between wearable tech and digital identity platforms, people are creating more data every day than has ever been created in decades, no, centuries past. Each of us is essentially our own personal data scientist, and those working in the digital space have very much been their own statisticians for quite some time. It’s why platforms like Google Analytics, Omniture and more are so popular across the industry. They put the power of analytics in the hands of users, requiring little training but returning lots of measurability.
Tom Phelan
(February 10, 2015)
An agile environment is one that’s adaptive and promotes evolutionary development and continuous improvement. It fosters flexibility and champions fast failures. Perhaps most importantly, it helps software development teams build and deliver optimal solutions as rapidly as possible. That’s because in today’s competitive market chock-full of tech-savvy customers used to new apps and app updates every day and copious amounts of data with which to work, IT teams can no longer respond to IT requests with months-long development cycles. It doesn’t matter if the request is from a product manager looking to map the next rev’s upgrade or a data scientist asking for a new analytics model.
Jeff Leek
Data science done well looks easy – and that is a big problem for data scientists. The really tricky twist is that bad data science looks easy too. You can scrape a data set off the web and slap a machine learning algorithm on it no problem. So how do you judge whether a data science project is really ‘hard’ and whether the data scientist is an expert? Just like with anything, there is no easy shortcut to evaluating data science projects. You have to ask questions about the details of how the data were collected, what kind of biases might exist, why they picked one data set over another, etc. In the meantime, don’t be fooled by what looks like simple data science – it can often be pretty effective.
Guerrilla Analytics
(July 21, 2015)
Data Scientists and automation (data products, algorithms, production code, whatever) are complementary functions. Good Data Science supports automation. It quickly adds value by investigating, testing, and quantifying hypotheses about existing data and potential new data. Simply switching on software ignores the reality of working with data, regardless of the claims of that software. Data is full of nuances, errors and unknown relationships that are best discovered and tested by an expert Data Scientist. This takes time and does not scale but it does not have to scale. It is the necessary prudent investment that you make before spending months in product development and automation of the wrong algorithm on the wrong or broken data.
Mike Barlow
Top takeaways from my interviews with experts from organizations offering AI products and services:
• AI is too big for any single device or system
• AI is a distributed phenomenon
• AI will deliver value to users through devices, but the heavy lifting will be performed in the cloud
• AI is a two-way street, with information passed back and forth between local devices and remote systems
• AI apps and interfaces will be designed and engineered increasingly for nontechnical users
• Companies will incorporate AI capabilities into new products and services routinely
• A new generation of AI-enriched products and services will be connected and supported through the cloud
• AI in the cloud will become a standard combination, like peanut butter and jelly
Alice Zheng
If we think of training the model as a part of it, then even after you’ve trained a model and evaluated it and found it to be good by some evaluation metric standards, when you deploy it, where it actually goes and faces users, then there’s a different set of metrics that would impact the users. You might measure: how long do users actually interact with this model? Does it actually make a difference in the length of time? Did they used to interact less and now they’re more engaged, or vice versa? That’s different from whatever evaluation metric that you used, like AUC or per class accuracy or precision and recall. … It’s probably not enough to just say this model has a .85 F1 score and expect someone who has not done any data science to understand what that means. How good are the results? What does it actually mean to the end users of the product?
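For readers unfamiliar with the offline metrics named above, here is a minimal sketch of how precision, recall and the F1 score are computed for a binary classifier. The labels and predictions are made up for illustration, and the function name `precision_recall_f1` is hypothetical; nothing here reflects a particular library's API.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against empty denominators instead of raising ZeroDivisionError.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data: 4 true positives in the ground truth, 4 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

A single number like this summarizes offline classification quality only; it says nothing about engagement or time spent, which is the gap the quote points at.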
Philip Russom
Managing big data for analytics is not the same as managing DW data for reporting. In fact, the two are almost opposites … For example, reporting is about seeing the latest values of the numbers that you track over time via a report. Obviously, you know the report, the business entities it represents, and the data warehouse that feeds the report. An analysis is more about discovering variables you don’t know, based on data that you probably don’t know very well. Also, a report requires a solid audit trail, so its data must be managed with well-documented metadata and possibly master data, too. Since most analyses have no expectation of an audit trail, there’s no need to manage one. That’s just a sampling of the differences. The point is to embrace Big Data Management for analytics as a unique practice that doesn’t follow all the strict rules we’re taught for reporting and data warehousing.
Vincent Granville
(November 15, 2014)
A different perspective on what data scientists are capable of:
• Imagine dozens of scenarios and rank them by chance of occurring
• Get siloed data from various departments (finance, sales, marketing, product, IT)
• Analyze the data in connection with the scenarios (including checking data validity)
• Get external data (competitive intelligence) as needed
• Find the causes (not just correlations)
• Find the remedies
• Detect issues well before anyone else can see them, by looking in summary data
• Complete the analysis with a 48-hour turnaround
Such a data scientist, who can save a company billions, is usually not hired, for the following reasons:
• Companies are looking for coders, not business solvers, when they hire a data guru, despite claiming the contrary
• A data scientist without Python on his resume is unlikely to ever get hired
• Hard work gets rewarded, smart work does not.
Yanir Seroussi
People like simple explanations for complex phenomena. If you work as a data scientist, or if you are planning to become/hire one, you’ve probably seen storytelling listed as one of the key skills that data scientists should have. Unlike “real” scientists that work in academia and have to explain their results mostly to peers who can handle technical complexities, data scientists in industry have to deal with non-technical stakeholders who want to understand how the models work. However, these stakeholders rarely have the time or patience to understand how things truly work. What they want is a simple hand-wavy explanation to make them feel as if they understand the matter – they want a story, not a technical report (an aside: don’t feel too smug, there is a lot of knowledge out there and in matters that fall outside of our main interests we are all non-technical stakeholders who get fed simple stories).
Strategy& There is no general rule dictating how organizations should navigate the stages of big data maturity. They must each decide for themselves, based on their own situation – the competitive environment they are operating in, their business model, and their existing internal capabilities. In less-advanced sectors, with executives still grappling with existing data, making intelligent use of what they already possess may have a substantial impact on decision making.
The main priorities for executives are to:
• develop a clear (big) data strategy;
• prove the value of data in pilot schemes;
• identify the owner for “big data” in the organization and formally establish a “Chief Data Scientist” position (where applicable);
• recruit/train talent to ask the right questions and technical personnel to provide the systems and tools to allow data scientists to answer those questions;
• position big data as an integral element of the operating model; and
• establish a data-driven decision culture and launch a communication campaign around it.
Mark van Rijmenam
(31 Dec. 2014)
Pattern Analytics can be defined as a discipline of Big Data that enables business leaders to understand how different variables of the business interact and are linked with each other. Variables can be of any kind and within any data source, structured as well as unstructured. Such patterns can indicate opportunities for innovation or threats of disruption for your business and therefore require action. Finding patterns within the data and sifting it out is difficult. Machine learning can contribute in helping us humans find patterns that are relevant, but too difficult for us to see. This enables organizations to find patterns they act on. Business leaders can learn from these patterns and use them in their decision-making process. Business leaders therefore should rely less on their gut feeling and years of experience, and more on the data. Pattern Analytics does not require predefined models; the algorithms will do the work for you and find whatever is relevant in a combination of large sets of data. The key with pattern analytics is automatically revealing intelligence that is hidden in the data and these insights will help you grow your business.
Alice Zheng
There’s structure in it, but it’s kind of a different form. … It’s spit out by machines and programs. There’s structure, but that structure is difficult to understand for humans. … So, you can’t just throw all of it into an algorithm and expect the algorithm to be able to make sense of it. You really have to process the features, do a lot of pre-processing, and first do things like extract out the frequent sequences, maybe, or figure out what’s the right way to represent IP addresses, for instance. Maybe you don’t want to represent latency by the actual latency number, which could have a very skewed distribution, with lots and lots of large numbers. You might want to assign them into bins or something. There are a lot of things that you need to do to get the data into a format that’s friendly to the model, and then you want to choose the right model. Maybe after you choose the model, you realize this model really is suitable for numeric data and not categorical data. Then you need to go back to the feature engineering part and figure out the best way to represent the data. … I hesitate to say anything critical because half of my friends are in machine learning, which is all about algorithms. I think we already have enough algorithms. It’s not that we don’t need more and better algorithms. I think a much, much bigger challenge is data itself, features, and feature engineering.
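The binning idea mentioned for latency can be sketched as follows. This is one possible approach under stated assumptions, not the speaker's method: the bin edges (10 ms, 100 ms, 1 s), the labels and the function name `latency_bin` are all hypothetical choices made for illustration. The edges grow by powers of ten so that a heavy right tail collapses into a handful of categories.

```python
from bisect import bisect_right

# Hypothetical, roughly logarithmic bin edges in milliseconds.
BIN_EDGES_MS = [10, 100, 1000]
BIN_LABELS = ["<10ms", "10-100ms", "100ms-1s", ">=1s"]


def latency_bin(latency_ms):
    """Map a raw latency to a coarse bin label.

    bisect_right returns how many edges are <= latency_ms, which is
    exactly the index of the bin the value falls into.
    """
    return BIN_LABELS[bisect_right(BIN_EDGES_MS, latency_ms)]


# Made-up sample with one extreme value, as in a skewed distribution.
latencies = [3, 12, 250, 45000, 10]
bins = [latency_bin(x) for x in latencies]
```

With this encoding the 45-second outlier and a 1-second request share a category, so a model no longer sees raw magnitudes spanning several orders. Whether fixed edges or data-driven quantiles work better depends on the data and the model.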
Michael Jordan
Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering – uncertainty and complexity – and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms. Fundamental to the idea of a graphical model is the notion of modularity – a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data. The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly-interacting sets of variables as well as a data structure that lends itself naturally to the design of efficient general-purpose algorithms. Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition and statistical mechanics are special cases of the general graphical model formalism – examples include mixture models, factor analysis, hidden Markov models, Kalman filters and Ising models. The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism. This view has many advantages – in particular, specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely. Moreover, the graphical model formalism provides a natural framework for the design of new systems.
Istvan Hajnal
(February 23, 2015)
There are a few trends in the Big Data and Data Science world that can be of interest to market researchers:
• Visualization. There is a lot of interest in the Big Data and Data Science world for everything that has to do with Visualization. I’ll admit that sometimes it is Visualize to Impress rather than to Inform, but when it comes to informing clearly, communicating in a simple and understandable way, storytelling, and so on, we market researchers have a head start.
• Natural Language Processing. One of the 4 V’s of Big Data stands for Variety. Very often this refers to unstructured data, which sometimes refers to free text. Big Data and Data Science folks, for instance, start to analyze text that is entered in the free fields of production systems. This problem is not dissimilar to what we do when we analyse open questions. Again market research has an opportunity to play a role here. By the way, it goes beyond sentiment analysis. Techniques that I’ve seen successfully used in the Big Data / Data Science world are topic generation and document classification. Think about analysing customer complaints, for instance.
• Deep Learning. Deep learning risks becoming the next fad, largely because of the name Deep. But deep here does not refer to profound, but rather to the fact that you have multiple hidden layers in a neural network. And a neural network is basically a logistic regression (OK, I simplify a bit here). So absolutely no magic here, but absolutely great results. Deep learning is a machine learning technique that tries to model high-level abstractions by using so-called learning representations of data, where data is transformed to a representation of that data that is easier to use with other Machine Learning techniques. A typical example is a picture that consists of pixels. These pixels can be represented by more abstract elements such as edges, shapes, and so on. These edges and shapes can in turn be further represented by simple objects, and so on. In the end, this example leads to systems that are able to reasonably describe pictures in broad terms, but nonetheless useful for practical purposes, especially when processing by humans is not an option. How can this be applied in Market Research? Already today (shallow) Neural networks are used in Market Research. One research company I know uses neural networks to classify products sold in stores in broad buckets such as petfood, clothing, and so on, based on the free field descriptions that come with the barcode data that the stores deliver.
Alistair Croll, Benjamin Yoskovitz
What makes a good metric?

Here are some rules of thumb for what makes a good metric: a number that will drive the changes you’re looking for.

A good metric is comparative.

Being able to compare a metric to other time periods, groups of users, or competitors helps you understand which way things are moving. “Increased conversion from last week” is more meaningful than “2% conversion”.
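As a small illustration of the comparative point, the sketch below turns raw counts into a conversion rate and then into a week-over-week change. The visitor and conversion numbers and the function names are made up for the example.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who converted; 0.0 when there were no visitors."""
    if visitors == 0:
        return 0.0
    return conversions / visitors


def relative_change(current: float, previous: float) -> float:
    """Change of `current` relative to `previous` (0.25 means +25%)."""
    if previous == 0:
        raise ValueError("previous value must be non-zero to compare against")
    return (current - previous) / previous


# Hypothetical numbers, for illustration only.
last_week = conversion_rate(conversions=40, visitors=2000)
this_week = conversion_rate(conversions=55, visitors=2200)
change = relative_change(this_week, last_week)

print(f"{this_week:.1%} conversion, {change:+.0%} versus last week")
```

The bare rate says little on its own; the comparison with last week is what shows which way things are moving.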

A good metric is understandable.

If people can’t remember it and discuss it, it’s much harder to turn a change in the data into a change in the culture.

A good metric is a ratio or a rate.

Accountants and financial analysts have several ratios they look at to understand, at a glance, the fundamental health of a company. You need some, too.

There are several reasons ratios tend to be the best metrics:

1 Ratios are easier to act on. Think about driving a car. Distance travelled is informational. But speed (distance per hour) is something you can act on, because it tells you about your current state, and whether you need to go faster or slower to get to your destination on time.

2 Ratios are inherently comparative. If you compare a daily metric to the same metric over a month, you’ll see whether you’re looking at a sudden spike or a long-term trend. In a car, speed is one metric, but speed right now over average speed this hour shows you a lot about whether you’re accelerating or slowing down.

3 Ratios are also good for comparing factors that are somehow opposed, or for which there’s an inherent tension. In a car, this might be distance covered divided by traffic tickets. The faster you drive, the more distance you cover, but the more tickets you get. This ratio might suggest whether or not you should be breaking the speed limit.

A good metric changes the way you behave.

This is by far the most important criterion for a metric: what will you do differently based on changes in the metric?

1 “Accounting” metrics like daily sales revenue, when entered into your spreadsheet, need to make your predictions more accurate. These metrics form the basis of Lean Startup’s innovation accounting, showing you how close you are to an ideal model and whether your actual results are converging on your business plan.

2 “Experimental” metrics, like the results of a test, help you to optimize the product, pricing, or market. Changes in these metrics will significantly change your behavior. Agree on what that change will be before you collect the data: if the pink website generates more revenue than the alternative, you’re going pink; if more than half your respondents say they won’t pay for a feature, don’t build it; if your curated MVP doesn’t increase order size by 30%, try something else.

Drawing a line in the sand is a great way to enforce a disciplined approach. A good metric changes the way you behave precisely because it’s aligned to your goals of keeping users, encouraging word of mouth, acquiring customers efficiently, or generating revenue.

If you want to choose the right metrics, you need to keep five things in mind:

1 Qualitative versus quantitative metrics

Qualitative metrics are unstructured, anecdotal, revealing, and hard to aggregate; quantitative metrics involve numbers and statistics, and provide hard numbers but less insight.

2 Vanity versus actionable metrics

Vanity metrics might make you feel good, but they don’t change how you act. Actionable metrics change your behavior by helping you pick a course of action.

3 Exploratory versus reporting metrics

Exploratory metrics are speculative and try to find unknown insights to give you the upper hand, while reporting metrics keep you abreast of normal, managerial, day-to-day operations.

4 Leading versus lagging metrics

Leading metrics give you a predictive understanding of the future; lagging metrics explain the past. Leading metrics are better because you still have time to act on them: the horse hasn’t left the barn yet.

5 Correlated versus causal metrics

If two metrics change together, they’re correlated, but if one metric causes another metric to change, they’re causal. If you find a causal relationship between something you want (like revenue) and something you can control (like which ad you show), then you can change the future.
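Whether two metrics "change together" can be checked with a correlation coefficient. The sketch below computes a plain Pearson correlation on made-up daily numbers; the data and names are hypothetical. Note that a high value only shows correlation. Establishing that one metric causes the other would still require changing the thing you control (for example, which ad you show) and watching what happens.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily figures, for illustration only.
ad_impressions = [100, 200, 300, 400, 500]
revenue = [12.0, 19.0, 33.0, 41.0, 48.0]

r = pearson(ad_impressions, revenue)
print(f"correlation between ad impressions and revenue: {r:.2f}")
```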

Analysts look at specific metrics that drive the business, called key performance indicators (KPIs). Every industry has KPIs: if you’re a restaurant owner, it’s the number of covers (tables) in a night; if you’re an investor, it’s the return on an investment; if you’re a media website, it’s ad clicks; and so on.
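The "line in the sand" advice for experimental metrics, agreeing on the decision before collecting the data, can be written down as a tiny rule object. This is a sketch under assumptions: the class name, fields, and the observed value are hypothetical, and only the 30% order-size threshold comes from the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineInTheSand:
    """A decision agreed on before the experiment runs (hypothetical helper)."""
    metric: str
    threshold: float
    if_met: str
    if_missed: str

    def decide(self, observed: float) -> str:
        """Return the pre-agreed action for the observed metric value."""
        return self.if_met if observed >= self.threshold else self.if_missed


# Threshold taken from the quote: the curated MVP must lift order size by 30%.
rule = LineInTheSand(
    metric="increase in order size",
    threshold=0.30,
    if_met="keep the curated MVP",
    if_missed="try something else",
)

observed_lift = 0.22  # made-up experiment result
print(rule.decide(observed_lift))
```

Freezing the dataclass mirrors the discipline the quote asks for: once the threshold is agreed, it cannot be quietly adjusted after the results come in.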

5 thoughts on “Quotes”

  1. Dear Michael !
    I liked your Quotes really. You can see my work at Also you can 2 video here. It’s original for kdnuggets post 😉
    my best regards


    • Hello Andy, thank you very much for your hint. I had a look at your list and found 40 which were not in my list right now. My list now contains >700 from which I publish one a day. So at least another 2 Years …. There are some typos in your list, e.g “better plac”. You might have a look. Thank you very much, Michael


      • Hello Michael !
        Thanks a lot for your attention to my humble work. I have fixed typo “better place” and hope for best. How did you find videos for #1, #2 interviews quotes ?
        I hope you enjoy it too :)) I saw your web site and found it very useful for me.
        So thanks again for your attention.


  2. Very nice post. I simply stumbled upon your weblog and wanted to say that
    I’ve really loved browsing your blog posts. In any case
    I will be subscribing for your rss feed and I’m hoping you write again soon!


  3. hatemgkotb said:

    This is simply AMAZING!

