This is an excerpt of the paper “The data-driven enterprise of 2025.”
Neil Assur and Kayvaun Rowshankish
Rapidly accelerating technology advances, the recognized value of data, and increasing data literacy are changing what it means to be “data driven.”
By 2025, smart workflows and seamless interactions among humans and machines will likely be as standard as the corporate balance sheet, and most employees will use data to optimize nearly every aspect of their work.
We know 2025 isn’t too far off, but that’s the point.
Seven characteristics will define this new data-driven enterprise, and we’ve already seen many companies exhibit at least some of them, with many more beginning the journey to do so.
Those able to make the most progress fastest stand to capture the highest value from data-supported capabilities.
Companies already seeing 20 percent of their earnings before interest and taxes (EBIT) contributed by artificial intelligence (AI), for example, are far more likely to engage in data practices that underpin these characteristics.
BOX 1: The following are the seven characteristics of the data-driven enterprise:
1. Data is embedded in every decision, interaction, and process.
2. Data is processed and delivered in real time.
3. Flexible data stores enable integrated, ready-to-use data.
4. Data operating model treats data like a product.
5. The chief data officer’s role is expanded to generate value.
6. Data-ecosystem memberships are the norm.
7. Data management is prioritized and automated for privacy, security, and resiliency.
1. Data is embedded in every decision, interaction, and process
Today:
- Organizations often apply data-driven approaches (from predictive systems to AI-driven automation) only sporadically, leaving value on the table and creating inefficiencies.
- Many business problems still get solved through traditional approaches and take months or years to resolve.
By 2025:
- Nearly all employees naturally and regularly leverage data to support their work.
- Rather than defaulting to solving problems by developing lengthy — sometimes multiyear — road maps, they are empowered to ask how innovative data techniques could resolve challenges in hours, days, or weeks.
- Organizations are capable of better decision making as well as automating basic day-to-day activities and regularly occurring decisions.
- Employees are free to focus on more “human” domains, such as innovation, collaboration, and communication.
- The data-driven culture fosters continuous performance improvement to create truly differentiated customer and employee experiences and to enable the growth of sophisticated new applications that aren’t widely available today.
2. Data is processed and delivered in real time
Today:
- Only a fraction of data from connected devices is ingested, processed, queried, and analyzed in real time because of the limits of legacy technology structures, the challenges of adopting more modern architectural elements, and the high computational demands of intensive, real-time processing jobs.
- Companies often must choose between speed and computational intensity, which can delay more sophisticated analyses and inhibit the implementation of real-time use cases.
By 2025:
- Vast networks of connected devices gather and transmit data and insights, often in real time.
- How data is generated, processed, analyzed, and visualized for end users is dramatically transformed by new and more ubiquitous technologies, such as kappa or lambda architectures for real-time analysis, leading to faster and more powerful insights.
- Even the most sophisticated advanced analytics are reasonably available to all organizations as the cost of cloud computing continues to decline and more powerful “in-memory” data tools come online (for example, Redis, Memcached).
- Altogether, this enables many more advanced use cases for delivering insights to customers, employees, and partners.
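To make the lambda pattern mentioned above more concrete, here is a minimal sketch in Python. It assumes a hypothetical stream of (device_id, reading) events: a batch layer periodically recomputes totals from the full history, a speed layer keeps incremental totals for events that arrived since the last batch run, and queries merge the two. The class and method names (`LambdaCounter`, `ingest`, `run_batch`, `query`) are illustrative assumptions, not taken from the source or from any library.

```python
from collections import Counter


class LambdaCounter:
    """Toy lambda architecture: batch view + speed view, merged at query time."""

    def __init__(self):
        self.master_log = []         # append-only history of all events
        self.batch_view = Counter()  # totals recomputed from the full log
        self.speed_view = Counter()  # incremental totals since the last batch run

    def ingest(self, device_id, reading):
        """Append the event to the log and update the real-time view at once."""
        self.master_log.append((device_id, reading))
        self.speed_view[device_id] += reading

    def run_batch(self):
        """Recompute the batch view from scratch, then reset the speed view."""
        self.batch_view = Counter()
        for device_id, reading in self.master_log:
            self.batch_view[device_id] += reading
        self.speed_view = Counter()

    def query(self, device_id):
        """Combine the complete-but-stale view with the recent-but-partial one."""
        return self.batch_view[device_id] + self.speed_view[device_id]


counter = LambdaCounter()
counter.ingest("sensor-a", 3)
counter.ingest("sensor-a", 4)
counter.run_batch()               # batch view now holds 7 for sensor-a
counter.ingest("sensor-a", 5)     # only the speed view has seen this event
print(counter.query("sensor-a"))  # 12
```

A kappa-style design would drop the separate batch layer and instead replay the event log through the same streaming path whenever results need to be recomputed.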
3. Flexible data stores enable integrated, ready-to-use data
Today:
- Though the proliferation of data is driven by unstructured or semistructured data, most usable data is still organized in a structured fashion using relational database tools.
- Data engineers often spend significant time manually exploring data sets, establishing relationships among them, and joining them together.
- They also frequently must refine data from its natural, unstructured state into a structured form using manual and bespoke processes that are time-consuming, not scalable, and error prone.
By 2025:
- Data practitioners increasingly leverage an array of database types, including time-series databases, graph databases, and NoSQL databases, enabling more flexible ways of organizing data.
- This allows teams to query and understand relationships between unstructured and semistructured data more easily and quickly, which accelerates development of new AI-driven capabilities and the discovery of new relationships in the data to drive innovation.
- Combining these flexible data stores with advances in real-time technology and architecture also enables organizations to develop data products, such as “customer 360” data platforms and digital twins, which are real-time-enabled data models of physical entities (such as a manufacturing facility, supply chain, or even the human body).
- This enables sophisticated simulations and what-if scenarios using traditional machine-learning capabilities or more advanced techniques such as reinforcement learning.
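As an illustration of why a graph-shaped store can make relationship queries simpler than repeated relational joins, the following sketch models a tiny graph as an adjacency map and finds everything connected to a customer within a given number of hops. The data, the node names, and the `neighbors_within` helper are hypothetical; a real deployment would likely use a graph database and its query language rather than in-memory dictionaries.

```python
from collections import deque

# Hypothetical edges: customer -> devices -> support tickets.
GRAPH = {
    "customer:42": ["device:a", "device:b"],
    "device:a": ["ticket:1001"],
    "device:b": [],
    "ticket:1001": [],
}


def neighbors_within(graph, start, max_hops):
    """Breadth-first search returning nodes reachable within max_hops edges."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found


print(neighbors_within(GRAPH, "customer:42", 1))  # ['device:a', 'device:b']
print(neighbors_within(GRAPH, "customer:42", 2))  # ['device:a', 'device:b', 'ticket:1001']
```

In a relational layout, each additional hop would typically require another join; here the hop limit is just a parameter.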
4. Data operating model treats data like a product
Today:
- An organization’s data function, if one exists outside of IT, manages data using top-down standards, rules, and controls.
- Data often has no true “owner” ensuring that it is updated and ready for use in various ways.
- Data sets are also stored (sometimes in duplicate) across sprawling, siloed, and often costly environments, making it difficult for users within an organization (such as data scientists looking for data to build analytics models) to find, access, and integrate the data they need quickly.
By 2025:
- Data assets are organized and supported as products, regardless of whether they are used by internal teams or external customers.
- These data products have dedicated teams, or “squads,” aligned against them to embed data security, evolve data engineering (for example, to transform data or continuously integrate new sources of data), and implement self-service access and analytics tools.
- Data products continuously evolve in an agile manner to meet the needs of consumers, leveraging DataOps (DevOps for data) and continuous integration and delivery processes and tools.
- Altogether, these products provide data solutions that can more easily and repeatedly be used to meet various business challenges and reduce the time and cost of delivering new AI-driven capabilities.
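One way to picture a data set managed as a product is a small descriptor that names an accountable owner and carries a schema that can be checked automatically, for instance as a step in a continuous integration pipeline. The sketch below is only an assumption about what such a descriptor might look like; `DataProduct`, its fields, and the `orders` example are hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data set that is managed as a product."""
    name: str
    owner: str    # accountable team or "squad"
    schema: dict  # column name -> expected Python type

    def validate(self, rows):
        """Return a list of readable schema violations (empty if none)."""
        problems = []
        for i, row in enumerate(rows):
            for column, expected in self.schema.items():
                if column not in row:
                    problems.append(f"row {i}: missing column '{column}'")
                elif not isinstance(row[column], expected):
                    problems.append(
                        f"row {i}: '{column}' should be {expected.__name__}"
                    )
        return problems


orders = DataProduct(
    name="orders",
    owner="orders-squad",
    schema={"order_id": int, "amount": float},
)

good = [{"order_id": 1, "amount": 9.5}]
bad = [{"order_id": "2"}]  # wrong type for order_id, amount missing

print(orders.validate(good))  # []
for problem in orders.validate(bad):
    print(problem)
```

Running a check like this on every change gives consumers of the product an early signal when a new source or transformation breaks the agreed shape of the data.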
5. The chief data officer’s role is expanded to generate value
Today:
- Chief data officers (CDOs) and their teams function as a cost center responsible for developing and tracking compliance with policies, standards, and procedures to manage data and ensure its quality.
By 2025:
- CDOs and their teams function as a business unit with profit-and-loss responsibilities.
- The unit, in partnership with business teams, is responsible for ideating new ways to use data, developing a holistic enterprise data strategy (and embedding it as part of a business strategy), and incubating new sources of revenue by monetizing data services and data sharing.
6. Data-ecosystem memberships are the norm
Today:
- Data is often siloed, even within organizations.
- While data-sharing arrangements with external partners and competitors are increasing, they are still uncommon and often limited.
By 2025:
- Large, complex organizations use data-sharing platforms to facilitate collaboration on data-driven projects, both within and between organizations.
- Data-driven companies actively participate in a data economy that facilitates the pooling of data to create more valuable insights for all members.
- Data marketplaces enable the exchange, sharing, and supplementation of data, ultimately empowering companies to build truly unique and proprietary data products and gain insights from them.
- Altogether, barriers to the exchange and combining of data are greatly reduced, bringing together various data sources in such a way that the value generated is much greater than the sum of its parts.
7. Data management is prioritized and automated for privacy, security, and resiliency
Today:
- Data security and privacy are often viewed as compliance issues, driven by nascent regulatory data-protection mandates and consumers beginning to realize how much of their information is collected and used.
- Data-security and -privacy protections are often either insufficient or monolithic, rather than tailored to individual data sets.
- Providing employees with secure data access is a highly manual process, making it error prone and lengthy.
- Manual data-resiliency processes make it difficult to recover data quickly and fully, creating risks for lengthy data outages that affect employee productivity.
By 2025:
- Organizational mindsets have fully shifted toward treating data privacy, ethics, and security as areas of required competency, driven by evolving regulatory expectations such as the Virginia Consumer Data Protection Act (VCDPA), General Data Protection Regulation (GDPR), and California Consumer Privacy Act (CCPA); increasing consumer awareness of their data rights; and the increasingly high stakes of security incidents.
- Self-service provisioning portals manage and automate data provisioning using predefined “scripts” to safely and securely provide users with access to data in near real time, greatly improving user productivity.
- Automated, near-constant backup procedures ensure data resiliency; faster recovery procedures rapidly establish and recover the “last good copy” of data in minutes rather than days or weeks, thus minimizing risks when technological glitches occur.
- AI tools become available to more effectively manage data — for example, by automating the identification, correction, and remediation of data-quality issues.
- Altogether, these efforts enable organizations to build greater trust in both the data and how it’s managed, ultimately accelerating adoption of new data-driven services.
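The idea of recovering the “last good copy” can be sketched in a few lines: given a history of backups, each tagged with the result of an automated integrity check, pick the most recent one that passed. The `Backup` record, its `passed_checks` flag, and the `last_good_copy` helper are assumptions made for illustration; the source does not describe a specific mechanism.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Backup:
    taken_at: datetime
    passed_checks: bool  # outcome of an automated integrity check (assumed)


def last_good_copy(backups):
    """Return the most recent backup that passed its checks, or None."""
    good = [b for b in backups if b.passed_checks]
    return max(good, key=lambda b: b.taken_at, default=None)


backups = [
    Backup(datetime(2025, 1, 1, 10, 0), True),
    Backup(datetime(2025, 1, 1, 10, 5), True),
    Backup(datetime(2025, 1, 1, 10, 10), False),  # corrupted snapshot
]

print(last_good_copy(backups).taken_at)  # 2025-01-01 10:05:00
```

The more frequently backups are taken and checked, the smaller the gap between the failure and the copy that is restored.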