Authors: Ulrik Stig Hansen and Eric Landau | c.2022 Harvard Business Review
Today, companies everywhere are generating unprecedented amounts of data. While data has always grown naturally as a byproduct of economic and business activity, these days, as more and more of our personal and work lives take place online, humans are creating an abundance of data daily. In fact, 90% of all the world’s internet data has been created since 2016.
For more than a decade, only the so-called FAANG companies (Facebook, Apple, Amazon, Netflix, and Google) were in the position to take advantage of collecting vast amounts of data at scale. For these companies, data is the prime product and inherent to their value proposition, so they invested early in AI teams, servers, network infrastructure, and more. Such intensive resource allocation was nearly impossible for non-tech companies that had other pressing expenditure and outlay demands.
More recently, cloud computing platforms, improvements in general tooling, and the democratization of machine learning models have brought advanced data capabilities within reach for more companies. At the end of 2021, over half of all companies had adopted artificial intelligence in at least one business function, and more than a quarter of all companies reported that at least 5% of their EBITDA was attributable to AI adoption. Mass-produced machine learning models are ubiquitous.
This is a significant change. With AI tools, non-tech companies can use the data they already have to improve sales, logistics, and operations in general.
Getting the right tools isn’t enough, however. To gain long-term profits and a competitive edge, companies need to optimize their data — and learn to use it strategically. Their leaders should prioritize developing data processes as a core component of the business. To do so, they should take the following actions.
1. Educate Yourself About Your Data Usage
The first step to understanding how to use your data is to understand what data you already have. Do an inventory of your business processes to determine which ones are naturally creating data. Ask: What does the company log and record? What don’t we log and why not? What information are we throwing away that we could be keeping?
Once you have an inventory of your data, educate yourself about data usage by looking at the ways other companies are storing and using similar data to improve their business functions.
For instance, how do other companies use their quality assurance recordings? Are they building machine learning algorithms to discover which sales pitches work best and training their representatives based on what they find? And what about supply chain and logistics data? Are other companies using that data to create optimization programs that route inventory more effectively?
For example, we know that other companies have begun using historical data about utilities and building maintenance to save on future expenses. Think of what Google achieved when it connected its energy consumption data to DeepMind AI. By taking historical data about temperature, power, pump speeds, and more that had been collected by thousands of sensors and using it to train a group of deep neural networks, DeepMind AI developed a series of recommendations that reduced the amount of energy used to cool Google’s data centers by 40%.
Also, think critically about the data that other companies are collecting publicly and use that information to gain insights into the problems they’re trying to solve. For instance, what image is Google asking you to label in its CAPTCHA and why? If most of the recent CAPTCHAs have involved low lighting conditions for cars, then it’s likely Google wants that information to solve for edge cases in its training data for autonomous vehicle models. Observing and reasoning about the data other companies are collecting will help you better understand which data processes you should retain and invest in.
2. Copy and Paste
After you understand how companies are using data, check out how the latest tech startups are gaining value from data. These companies can offer a cheat sheet for data usage by helping leaders understand how people who work with data as their core business are monetizing it.
Consider engaging early-stage startups with proof of concept contracts or creating data-sharing agreements with seed-stage startups to understand what innovations are happening in these companies. Sponsor corporate hackathons that attract tech talent and help you find data-centric AI solutions for your longstanding operational challenges.
Read the news sources that startup founders and developer influencers read — such as Hacker News and ML Substacks — to learn about the latest products and cutting-edge ideas. After all, 10 years ago, Stripe didn’t launch their product at a Fortune 500 conference. They launched it on Hacker News.
Look at these applications and see if you can translate them to your business. Don’t ignore disruptive technologies — think about how to use them for your business needs.
3. Buy, Don’t Build
For many of the problems that arise in capturing and managing data, SaaS solutions already exist. Unfortunately, companies often attempt to solve these problems internally rather than buying a solution off the shelf. Many large corporations build data management tools in house, which leads to clunky, slow infrastructure that doesn’t evolve alongside other technologies. And when new companies attempt to build these tools in house, they increase their time to market and risk losing their first-mover advantage.
Don’t trick yourself into thinking your use case is so specific that it requires a special internal infrastructure. Building in-house data infrastructure tools takes months, their upkeep is costly, and oftentimes, the results aren’t as good as a product that’s already on the market.
Whenever possible, you should buy — not build — the tools you need to structure and manage data. If the tools aren’t core to your business, don’t rebuild them in house. Doing so will slow down the development of your machine learning model, and that’s the product that’s going to save you money and help you stay ahead of the competition.
4. Start Building a Data Moat
Collecting large amounts of data within normal business functions can help companies begin to build a structural data moat that can be used for higher-value-generating activities. Eventually, this moat might become too wide for other companies to cross, so the data provides you with a competitive edge.
Consider the example of Waymo and Tesla, two major players in the autonomous vehicle market.
The former spends significant resources driving around and processing thousands of hours of street-driving video footage to capture suitable data to train its models.
The latter, having sold nearly 2 million electric vehicles, can leverage readily available data from the thousands of Tesla owners who use their vehicles’ self-driving software. The company has access to information about accidents, human behavior, and more. Having this real-world data at scale sets Tesla apart from the competition. Moreover, if Tesla decides to abandon its AV aspirations, the company could continue to make money by selling its valuable data inventory to other AV companies.
So don’t throw away your data. Collect and store it until later down the line when you can use it to meet future business objectives.
Think about the story of Rockefeller and crude oil byproducts. Most refinery owners regarded the byproducts from converting crude oil to kerosene as waste and threw them away. Rockefeller, however, saw their value: He collected the paraffin wax to sell to candle makers and the petroleum jelly to sell to medical supply companies.
Be like Rockefeller. Keep your data so that you can monetize it later. Don’t treat it as a useless byproduct just because it isn’t your primary product right now.
The days of AI and machine learning being a luxury only accessible to major tech companies are over. But while powerful new tools are more accessible than ever, companies need to learn how to use them strategically — and how to think about the data that powers them. Learning to do that is where you’ll really find the competitive advantage of AI.
Ulrik Stig Hansen is a co-founder of Encord. He has been coding since he was 14. He started his career in the emerging market rates and FX team at JP Morgan. He holds an M.S. in Computer Science from Imperial College London.
Eric Landau is a co-founder of Encord. Before Encord, Eric spent nearly a decade in high-frequency trading at DRW, where he was lead quantitative researcher on a global equity delta one desk and put thousands of models into production. He holds an S.M. in Applied Physics from Harvard University, an M.S. in Electrical Engineering, and a B.S. in Physics from Stanford University.
c.2022 Harvard Business Review. Distributed by The New York Times Licensing Group.