Why Internal Data Marketplace Adoption Is Growing Faster Than Ever

Steve Touw
on October 29, 2024
Last edited: November 4, 2024

Introduction

At Immuta, we work with the world’s most complex and secure organizations, and almost all have one thing in common: a desire to enable data and analytics through an internal data marketplace. This is quite different from the data space a mere two years ago, which gravitated toward centralized operations. Now, we are seeing a rapid evolution – and it is not just technical, but also philosophical.

What Is an Internal Data Marketplace?

Before diving into what’s changed to cause this shift, let’s clarify what we mean by the term “internal data marketplace.” An internal data marketplace is a mechanism that allows the business to securely share trustworthy data internally, as reusable components for building new data sets and analytical insights. To do this effectively, the internal data marketplace must make the data discoverable, interpretable, and, most importantly, accessible.

The backbone of the marketplace is your data platform(s). When trustworthy data is discovered, requested, approved, and provisioned, that is reflected immediately in those platforms. This allows those reusable data components to be leveraged in downstream data engineering and analytical dashboards.
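
The discover, request, approve, provision flow described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the names `RequestState`, `AccessRequest`, and the `platform_grants` set are hypothetical and stand in for whatever your marketplace and data platform actually expose.

```python
from enum import Enum


class RequestState(Enum):
    DISCOVERED = "discovered"
    REQUESTED = "requested"
    APPROVED = "approved"
    PROVISIONED = "provisioned"


# Allowed forward transitions; a provisioned request has no next state.
NEXT_STATE = {
    RequestState.DISCOVERED: RequestState.REQUESTED,
    RequestState.REQUESTED: RequestState.APPROVED,
    RequestState.APPROVED: RequestState.PROVISIONED,
}


class AccessRequest:
    """Hypothetical access request tracked by the marketplace."""

    def __init__(self, user, data_product, platform_grants):
        self.user = user
        self.data_product = data_product
        # Stand-in for the data platform's grant table (an assumption).
        self.platform_grants = platform_grants
        self.state = RequestState.DISCOVERED

    def advance(self):
        """Move to the next lifecycle state, provisioning on the last step."""
        next_state = NEXT_STATE.get(self.state)
        if next_state is None:
            raise ValueError("request is already provisioned")
        self.state = next_state
        if self.state is RequestState.PROVISIONED:
            # Reflect the approval in the platform as soon as it is provisioned.
            self.platform_grants.add((self.user, self.data_product))
        return self.state


grants = set()
request = AccessRequest("analyst_1", "sales_orders", grants)
for _ in range(3):
    request.advance()
print(request.state.value, grants)
```

The point of the sketch is that the grant only appears in the platform once the request reaches its final state, so the marketplace and the platform never disagree about who has access.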

In short, the internal data marketplace is the portal to all data across all lines of business.

What's Changed?

The primary driver of the shift toward internal data marketplaces is decentralization beating out centralization.

We have been operating in a centralized data stack world for quite a long time, since the early days of Hadoop. This is a world where there is a single data platform and all business data is managed, curated, and made useful by a centralized technical team. In this world, business units have struggled with data quality, interpretability, discovery, and access. This is not because the centralized technical team is weak; far from it. It is because that centralized technical team lacks the business knowledge of each unit’s data, and where the most value from that data lies.

Over time, as data has become core to business strategy and success, centralized teams have ceded that control to the business units so they can get the most from their data. This concept was introduced in 2020 and termed Data Mesh. It represents a philosophical and cultural shift across the industry. As Zhamak Dehghani, credited with articulating the data mesh philosophy, says:

“Our aspiration to augment and improve every aspect of business and life with data, demands a paradigm shift in how we manage data at scale. While the technology advances of the past decade have addressed the scale of volume of data and data processing compute, they have failed to address scale in other dimensions: changes in the data landscape, proliferation of sources of data, diversity of data use cases and users, and speed of response to change. Data mesh addresses these dimensions, founded in four principles: domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance. Each principle drives a new logical view of the technical architecture and organizational structure.”

Years after their introduction, these data mesh concepts are finally bearing fruit. Many organizations do not even realize they are following data mesh principles; the demands of the business units have simply forced decentralization, and Dehghani was prescient in predicting the shift.

How Marketplaces Solve Decentralization Challenges

What does decentralization have to do with an internal data marketplace?

Everything.

There are many benefits to centralization; in a centralized world, you get distribution “for free.” Everyone knows the single place to go and search for data, and who to ask to get access to it. Additionally, legal, security, and compliance teams have a single avenue to enforce consistent rules and access controls. However, these benefits do not outweigh the overwhelming negative consequence of centralization: the lack of useful and interpretable data which, as discussed, has forced the decentralization shift.

As soon as you decentralize your data operations, you lose what you got for free with centralization. This raises questions about:

  1. How data is documented and published
  2. How it’s discovered
  3. How access is managed
  4. How access is provisioned
  5. Whether it’s delivering value to the business as a whole

These decentralization challenges have led organizations to adopt internal data marketplaces. The marketplace formalizes what centralization previously solved informally. It provides a single place to set standards on what can be published, how to publish it and make it discoverable, and how to manage access controls through provisioning in the data platform. If your marketplace is doing its job, everyone in the organization can quickly realize greater data agility and satisfaction, and it will deliver quantifiable benefits. This is because the marketplace solves the challenges of decentralization while reaping the benefits of decentralization, namely by making the data useful, trustworthy, and understandable.

It’s Not Just About the Marketplace

One of the keys to a data marketplace is having a strong definition of, and requirements for, what can be published to it. This sets an expectation between producers and consumers about what is exchanged in the marketplace. These exchanged items are often called data products.

Data products have many characteristics, but in general share the following:

  • A fundamental “object” which can be shared across business units
  • Protection through access control gates
  • Product management, as you would have for an application
  • Targeting of a specific business use case and/or audience
  • Metadata and documentation
  • Automatic updates and versioning
  • Quality and reliability SLAs, sometimes termed a data contract
  • Terms of Use or a Data Use Agreement
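
One way to make these characteristics enforceable is to capture them in a descriptor that the marketplace checks before publication. The following is a minimal sketch, not a description of any particular product: the `DataProduct` class, its field names, and the `missing_requirements` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor mirroring the characteristics listed above."""

    name: str                 # the shareable "object"
    owner: str                # product manager accountable for it
    use_case: str             # target business use case and/or audience
    documentation: str        # metadata and documentation
    version: str              # current version of the product
    sla: str                  # quality and reliability SLA (data contract)
    terms_of_use: str         # Terms of Use or Data Use Agreement
    access_policies: list = field(default_factory=list)  # access control gates


def missing_requirements(product):
    """Return the names of descriptor fields that are still empty."""
    return [name for name, value in vars(product).items() if not value]


draft = DataProduct(
    name="sales_orders",
    owner="jane.doe",
    use_case="Regional revenue reporting",
    documentation="Daily order facts, one row per order line.",
    version="1.0.0",
    sla="",
    terms_of_use="Internal use only.",
)
print(missing_requirements(draft))
```

A marketplace could refuse to publish anything for which this list is non-empty, which gives producers a concrete checklist on the way from raw data asset to data product.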

While data marketplaces show signs of great adoption, we do see many organizations struggle to mature from using and sharing raw data assets (e.g. tables, views, objects in storage) to more formalized data products, as described above. And since there’s a broad spectrum of maturity with data products, we’ve seen organizations introduce less formal data assets in the marketplace.

This is acceptable early on because the business needs those foundational data assets to build the data products in the first place, and the marketplace can fill that need. But those business units must also keep an eye on the goal of operationalizing data products, because that will drive the true decentralization outcomes. Ultimately, raw data assets suffer from some of the same challenges discussed above related to reliability, interpretability, and utility.

Bottom line: While we see strong internal data marketplace adoption, we see a lag in data product adoption, and this will become a problem if it is not remedied over time.

Should Your Data Marketplace Be Purpose-Built?

Our last observation on data marketplaces is that while some are purpose-built, others extend existing business capabilities to cover the marketplace’s functional needs.

We have seen much more success with the former than the latter. This is because an internal data marketplace is not a simple bolt-on. It has a large swath of complicated requirements that are necessary in order to meet the needs of both the producers and consumers. It’s important that the business evaluates its marketplace carefully, because it is the “centralized” tool that enables all decentralized initiatives. If you do it poorly, your philosophical shift will fail.

More specifically, we’ve seen customers that have implemented their own purpose-built marketplace achieve much more success than customers that have tried to turn their existing catalog solution into a marketplace. This discussion could be an article on its own, so we wrote one.

Conclusion

Internal data marketplace initiatives have exploded onto the scene due to central data teams ceding more control to lines of business to manage and drive value from their own data. The marketplace acts as the centralized tool to allow consistency, while efficiently enabling this decentralization. And we’ve seen that, to truly reap the benefits of this philosophy, the businesses must agree on a data product mindset and empower the delivery of those data products through a purpose-built internal data marketplace.

Ultimately, executing the move to decentralization and a marketplace well results in the following business outcomes:

  • Faster and better data-driven decisions
  • The ability to take advantage of market trends
  • Streamlined operations
  • Reduced workload on technical teams

More specifically, here are some real-world examples we’ve seen:

  • A pharmaceutical company that was able to optimize supply chain and reduce outages
  • An online retailer that was able to enhance the customer experience by enabling users to interact with different products, without having to sacrifice control
  • An automotive company that was able to leverage customer usage data while adhering to geographic compliance requirements
