Automation is the key to unlocking large, sustainable advantage in firms across sectors.
Big data can be a big nothing without a strategic automation approach.
On one hand, we’re in a heady time of information richness, with unprecedented volumes of data on everything from equipment performance to consumer social-media behavior (more than half of the world’s population is on social media). But without thoughtful automation, meaning the use of machines and algorithms to handle, process, and analyze available data, your business will miss out on large potential opportunities.
Done well, automation transforms “dead” big data into a living, breathing resource you can use to drive value. So it’s no surprise many businesses aim to automate anything that can be automated, as one top Google exec said recently.
To help you think about automation in your business context, I present the three main ways this technology-driven activity helps you create value.
The first thing automation helps you do is feature extraction, or pulling critical needles of information from massive haystacks of data. Imagine that your organization has to review patent applications for information on a specific technology and related ones. You may be looking at thousands or tens of thousands of applications, each running 30 or more pages, for millions and millions of words. But only a tiny proportion of those words and interrelationships among patents matter, such as what the patented technology depends on or the inventors’ qualifications and past patents.
This task, then, like many in the business domain, involves a very low signal-to-noise ratio and would require thousands of person-hours to complete manually, which is prohibitive in both cost and time. But a machine-learning-based algorithm could be trained to ferret out the key information relatively quickly, saving significant time and effort. Moreover, say that in the future you wanted to search the same set of patents or related ones but for different information, such as the size of the patent-applicant team. You could easily reprogram or retrain the algorithm to take on that task, gaining economies of scale and greater returns on your initial investment.
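To make the idea concrete, here is a minimal sketch of feature extraction in Python. It is a simple rule-based stand-in, not a trained machine-learning model, and everything in it is an assumption made for illustration: the sample texts, the "Inventors:" line format, and the function names `team_size` and `extract_features` are all hypothetical.

```python
import re

# Hypothetical, made-up snippets; real applications run to dozens of pages.
applications = {
    "A-001": "Inventors: Ada Park; Luis Romero; Mei Tan. A probe using ultrasound imaging.",
    "A-002": "Inventor: Sam Okoye. A golf ball with a novel dimple pattern.",
    "A-003": "A clamp for medical pads used alongside ultrasound equipment.",
}

# Assumed format: names separated by semicolons, ending at the first period.
INVENTOR_LINE = re.compile(r"Inventors?:\s*([^.]*)\.")

def team_size(text):
    """Return the number of listed inventors, or None if no inventor line is found."""
    match = INVENTOR_LINE.search(text)
    if match is None:
        return None
    names = [name.strip() for name in match.group(1).split(";") if name.strip()]
    return len(names)

def extract_features(apps, keyword):
    """Pull two small features out of each application's full text."""
    return {
        app_id: {
            "team_size": team_size(text),
            "mentions_keyword": keyword.lower() in text.lower(),
        }
        for app_id, text in apps.items()
    }

features = extract_features(applications, "ultrasound")
```

Asking a different question later, such as which applications mention a different technology, means changing the keyword or adding one more small function rather than rereading every document. That is the reuse benefit described above, in miniature.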
Second, automation helps with data-checking and cleanup. Data sets often need work: they contain errors, missing values, anomalies, and sometimes evidence of bias. For example, if an algorithm were trained to spot the characteristics of lawbreakers but used data only on offenders who were caught, it would be biased because it lacks data on offenders who were not caught. This is a particular problem for white-collar crime, which tends to be underreported. Checking and addressing this volume of potential issues is too much to take on manually. But automation allows rapid deployment of tools for testing and cleanup, again saving time while creating value.
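As a rough illustration of automated data-checking, the sketch below audits one numeric field for missing values and for anomalies. The records, the field name `units`, and the function name `audit` are hypothetical. The anomaly rule is one common choice (a modified z-score based on the median absolute deviation, with the conventional 3.5 cutoff), not the only one. Note that a check like this catches errors and outliers only; it does not detect the kind of sampling bias described above, which needs a closer look at how the data was collected.

```python
from statistics import median

def audit(records, field, threshold=3.5):
    """Return (indices of records missing `field`, indices of outlier records)."""
    missing = [i for i, rec in enumerate(records) if rec.get(field) is None]
    present = [(i, rec[field]) for i, rec in enumerate(records)
               if rec.get(field) is not None]
    values = [value for _, value in present]
    if not values:
        return missing, []
    center = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(value - center) for value in values])
    if mad == 0:
        return missing, []  # no spread to measure against, so flag nothing
    outliers = [i for i, value in present
                if 0.6745 * abs(value - center) / mad > threshold]
    return missing, outliers

# Hypothetical daily sales records with one gap and one suspicious entry.
sales = [
    {"units": 10}, {"units": 12}, {"units": None},
    {"units": 11}, {"units": 500}, {"units": 9},
]
missing_rows, outlier_rows = audit(sales, "units")
```

Here the third record is reported as missing and the fifth as an outlier. Whether to drop, correct, or keep a flagged record remains a human judgment.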
Third, and this is a big one, automation is the driving engine of analytics. Yesterday’s simple regression analyses have become today’s clustering and random forests, powered by machine learning, whether for understanding product users, forecasting next month’s sales to optimize inventory, or predicting the impact of a new advertising campaign. Machine-based automation not only enables you to repeat standardized analytics processes regularly at low cost, but can also spot nonlinear patterns we humans can’t.
For example, my lab studied over 5 million patents using algorithm-driven analyses to see if we could predict the debut of groundbreaking future technologies based on their patent application information. We hypothesized that the machine would identify future hit patents from application data if the invention had standalone, “miracle-like” capabilities or ideas. Ultimately, the algorithm did find the hit patents of the future with high accuracy, but not in the way we humans had imagined. That is, the algorithm did not identify a future hit patent based on its standalone capabilities; rather, it identified hit patents based on whether they were part of a cluster of affiliated patents that together could solve specific problems no individual patent could have solved on its own.
For instance, ultrasound technology made a large impact on healthcare several years after it was first unveiled, enabling non-invasive imaging and treatment of physical conditions like kidney stones and even some cancers. But that progress would have been impossible without smaller-scale inventions beyond the core technology: applicators, static-diminishing processes, and specialized medical pads and clamps that were developed independently of ultrasound technology yet proved critical for its successful application in medicine. Our automated analysis reliably recognized such clusters of related patents among the more than 5 million patents studied, in fields from health products to the latest golf ball technology. It also showed that these clusters were correlated with the probability that the patents in them would become tomorrow’s dominant technologies, an inference not previously appreciated.
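The notion of a "cluster of affiliated patents" can be illustrated with a small sketch. This is not the method used in the patent study described above, whose details are not given here. It shows one simple way to group items by their links: finding the connected components of a graph. The patent names, the links between them, and the function name `patent_clusters` are all hypothetical.

```python
from collections import defaultdict

def patent_clusters(links):
    """Group patents into clusters: the connected components of the link graph."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    clusters = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk that collects everything reachable from `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters

# Hypothetical links, e.g. one patent citing or depending on another.
links = [
    ("ultrasound_core", "applicator"),
    ("applicator", "medical_pad"),
    ("golf_ball", "dimple_pattern"),
]
clusters = patent_clusters(links)
```

With these made-up links, the ultrasound-related patents fall into one cluster and the golf ball patents into another. Cluster membership or size could then serve as an input to a predictive model, in place of features describing each patent in isolation.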
My Northwestern colleague Andrew Papachristos employed similar analytics to show that police corruption in Chicago stems not from a few “bad apple” officers but from a network of connected police acting in bad faith; his work enables earlier detection of such issues.
I hope I’ve made clear the mutually reinforcing advantages of automation and how it can help you transform data into large, sustainable value. Indeed, the more data you have, the more you need automation; once you have strong automation capabilities, you can collect and harness even more data, and the cycle continues.
The bottom line: automation is an increasingly critical capability, and may be pivotal to your business’s near- and longer-term performance. But it’s important to understand how it drives value, and to take steps to mitigate its very real downsides, for the good of your company and the broad community in which it operates.
In the second part of this article I’ll discuss the three major downsides of automation — explainability, transparency, and cost — and how to address these.