Does your organization monetize the data that it collects? … Does data increase corporate value? These were two poll questions put to participants at the LES webinar presented by Efrat Kasznik, President of Foresight Valuation. Over 200 people from 25 countries called in on July 17 to learn about mining data. (*Poll results in text below)
In this interactive presentation, Kasznik outlined several sections:
- The Data Market Opportunity
- The State Of Data Monetization
- Data Monetization Business Models
- Data and Corporate Value: Case Studies
- Key Takeaways: Data Monetization Action Plan
The Data Market Opportunity
The Economist said, “The world’s most valuable resource is no longer oil, but data.” Kasznik emphasized this by pointing to the size and growth of the “datasphere” (defined as data stored in the cloud, at the data center edge, and on endpoint devices), citing the IDC Data Age 2025 study, which projects that all data created, replicated, or stored will grow to 175 ZB (zettabytes) by 2025 (1 ZB = 1 trillion gigabytes).
Kasznik also sees massive growth in the generative AI market, showing a Bloomberg study that projects exponential growth from $14 billion in 2020 to $1.3 trillion by 2032. AI infrastructure as a service and generative AI-driven advertising will be key segments (expected to grow to $247 billion and $192 billion, respectively, by 2032). Opportunities differ across industry sectors. For example, in the automotive industry, car-generated data could become a $450-750 billion market by 2030.
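The pace implied by the two Bloomberg endpoints can be expressed as a compound annual growth rate (CAGR). The sketch below is illustrative arithmetic only: the function name `cagr` is an assumption, and the resulting rate is derived from the two figures quoted above, not a number reported in the study itself.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints quoted above: $14 billion in 2020, $1.3 trillion by 2032.
genai_cagr = cagr(14e9, 1.3e12, 2032 - 2020)
print(f"Implied generative AI market CAGR: {genai_cagr:.1%}")
```

On these two endpoints alone, the implied growth is roughly 46% per year, sustained for twelve years.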
The State Of Data Monetization
Back to the introductory question, “Does your organization monetize the data that it collects?” Most said, “No, we choose not to monetize data that we collect.” Kasznik showed that those responses are consistent with a 2021 survey conducted by cybersecurity firm Splunk, in which 55% of surveyed IT and business leaders considered their data “dark” (defined as “untapped and, often completely unknown”) despite recognizing data as “extremely valuable for success”.
Kasznik continued to explore the reasons for the relatively low rate of data monetization, which she attributed to three factors: data protection, data privacy, and data ownership. All three are crucial to clearing the way for data monetization.
- Data protection: Kasznik emphasized the importance of protecting data and information, both legally and physically. Because data assets are intangible, the legal protections afforded to them, including copyright and trade secrets (not patents), are governed by legal contracts. On the physical side, Kasznik mentioned that data protection is a big part of cybersecurity, an industry expected to grow globally from $100 billion in 2017 to $200 billion by 2024.
- Data privacy: Kasznik then discussed data privacy laws, which are jurisdictional in nature, and presented a table by Bloomberg comparing five different laws in Europe (GDPR) and the U.S. (in California, Virginia, and Colorado). Data privacy compliance is an important piece of getting data ready for monetization.
- Data ownership: Kasznik said that data ownership is a tricky area, particularly in industries like the Internet of Things (IoT), where multiple parties interact with data and questions about who owns and can monetize it may arise. She brought up the example of the Nest thermostat and the multiple parties collecting and providing data, including the consumer, the utility company, and the device manufacturer (Google). Kasznik also presented a visual showing the thicket of lawsuits raging in courts these days between content providers (like the NYT) and AI companies using copyrighted content from said providers to train AI models (OpenAI, Microsoft, and others). Copyright ownership is at the core of these lawsuits, but according to Kasznik, where there’s litigation, there are also licensing opportunities.
Data Monetization Business Models
In data monetization, four different active business models exist, depending on the type of business (B2B or B2C) and who is paying for the data. Kasznik started with SaaS (software-as-a-service, common in the B2B sector) and advertising (common in the B2C sector) as the “status quo” business models that have been around for a long time. SaaS is an interesting model for companies with data to monetize: the data is offered on a subscription basis, and the customer pays directly for it. The advertising model (think Google, Facebook) is common in the B2C space, and Kasznik refers to it as a “3-way” model: the consumer does not pay for apps (84% of apps are free, according to Tim Cook), but their data is sold to third-party advertisers who fund the free service. This is a lucrative business model (both Facebook and Google generate the vast majority of their revenues from advertising) but raises consumer privacy concerns.
Kasznik then discussed two additional business models that she referred to as the “next frontier” of data monetization: IoT data subscriptions attached to connected devices (Tesla is an example, collecting data from connected devices via a subscription to its connectivity package) and data mining platforms, which pull in data feeds from various sources and serve as a basis for developers (the latter is the most sophisticated of the models, which Kasznik also called the “holy grail” of data monetization). Most recently, a new model has emerged in the data mining space: data licensing for AI training, a rapidly growing market with limited availability of pricing information. This accounts for much of the licensing activity now seen across industries, from licensing healthcare data for drug discovery (the 23andMe example) to licensing content for LLM training (Reddit).
Data and Corporate Value: Case Studies
Before the case studies, a poll asked, “How does data collected by a corporation impact corporate value?” The majority said that data increases corporate value. To help understand this, the case studies examined the relationship between data and corporate value.
- Data Licensing for Funding: Kasznik presented a fictitious startup, MyDNA, and explained, “They have a collection of data on health and genetic data, and they are considering using this data for potential revenue.” The question is, “How do you value the revenue potential of this data set?” She then presented a multi-step approach to valuing the data: identifying the business model, selecting the valuation approach, researching comparable transactions, adjusting valuation inputs, and running the valuation model. Kasznik elaborated on the most challenging step in the process, the research to uncover licensing deals, which turned up a number of deals licensing genetic screening data from 23andMe to GSK and Genentech and highlighted the challenge of finding pricing information granular enough to base a valuation on. Kasznik also provided insights on normalizing data licensing price points to make them applicable to other situations, for example, taking a global data access price and calculating a price per user or per data record.
- Data Licensing for Training: Kasznik discussed licensing copyright-protected data for training in several industries, including imaging (Shutterstock), news (Wall Street Journal), and user-generated social media (Reddit), highlighting the link between the extensive litigation in this field and the emergence of active content licensing to avoid or settle litigation. As with the genetic data, normalizing the pricing is critical for applying it to other contracts. The problem in many cases is that the data access license provides global one-time or multi-year access but does not detail how many records are included, so it is difficult to apply the price to a different database. One interesting example Kasznik spent some time on is the IPO filing of Reddit, a social media discussion forum, which discloses interesting details on Reddit’s data licensing revenue potential. Reddit is exploring data licensing as a new revenue stream, having signed an agreement with Google worth $200 million over 2-3 years (averaging around $60 million annually). Reddit previously depended 100% on advertising for revenue: it has 430 million monthly active users and generated $800 million in revenue in 2023, with potential for growth through data licensing (close to a 10% revenue increase in just the first year of adding data monetization).
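The normalization step described in these case studies comes down to simple arithmetic: spread a global license fee over its term, then divide by a unit count such as users or records. The sketch below uses only the Reddit figures quoted above. The function names are illustrative, and using monthly active users as the unit is an assumption made for demonstration, since (as noted) such licenses often do not disclose how many records they cover.

```python
def annualized_value(total_value: float, term_years: float) -> float:
    """Spread a multi-year license fee evenly over its term."""
    if term_years <= 0:
        raise ValueError("term_years must be positive")
    return total_value / term_years


def price_per_unit(annual_value: float, unit_count: float) -> float:
    """Normalize an annual license fee to a per-user or per-record price."""
    if unit_count <= 0:
        raise ValueError("unit_count must be positive")
    return annual_value / unit_count


def revenue_uplift(annual_value: float, base_revenue: float) -> float:
    """Licensing revenue as a fraction of existing annual revenue."""
    if base_revenue <= 0:
        raise ValueError("base_revenue must be positive")
    return annual_value / base_revenue


# Figures as quoted in this recap (not independently verified).
DEAL_VALUE = 200e6         # total agreement value, USD
QUOTED_ANNUAL = 60e6       # "around $60 million annually"
MONTHLY_ACTIVE_USERS = 430e6
REVENUE_2023 = 800e6

# Assumption: monthly active users stand in for the unit count.
per_user = price_per_unit(QUOTED_ANNUAL, MONTHLY_ACTIVE_USERS)
uplift = revenue_uplift(QUOTED_ANNUAL, REVENUE_2023)
print(f"Per user per year: ${per_user:.2f}; revenue uplift: {uplift:.1%}")

# Sensitivity to the stated 2-3 year term.
for term in (2, 3):
    annual = annualized_value(DEAL_VALUE, term)
    print(f"{term}-year term: ${annual / 1e6:.1f}M/yr, "
          f"uplift {revenue_uplift(annual, REVENUE_2023):.1%}")
```

At the quoted $60 million a year, this works out to about $0.14 per monthly active user and a 7.5% uplift on 2023 revenue. Spreading the full $200 million over two or three years instead gives an uplift between roughly 8% and 12.5%, which brackets the "close to 10%" figure. A per-user price derived this way transfers to another database only if its users or records are comparable in value.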
Key Takeaways: Data Monetization Action Plan
Kasznik suggests three action areas:
- Prepare Your Data:
  - Keep your data well protected: legally and physically
  - Ensure compliance with privacy laws such as GDPR
  - Identify the data monetization model
- In the B2B Space:
  - If you collect data and are not monetizing it, you are behind the trend
  - Explore data packaging for monetization through access (SaaS or other models) to other companies/industries
- In the B2C Space:
  - Think beyond advertising!
  - Expand into B2B licensing for LLM training or other creative models (Reddit)
  - With rampant litigation, now is a good time for data licensing opportunities
The webinar concluded with a Q&A discussion that helped attendees clarify major points in data monetization. Thank you to Michael Zachary for his help in moderating questions and Stacey Ramsey for organizing this webinar.