A Guide to Enabling Inter-Domain Data Sharing

Love Malhotra, Senior Cloud Solution Architect, phData on April 19, 2024
Last edited: October 23, 2024

For many, the appeal of a decentralized data architecture relates to its potential for enhanced collaboration. But to achieve this kind of streamlined collaboration, your team must first establish a system of secure, self-service domains.

In a previous blog, we explored how to make decentralized data mesh architectures a reality based on phData’s experience with our customers. In this blog, we delve into the intricacies of implementing decentralized domain-based architectures and enabling seamless data sharing for data-driven collaboration. We will explore the common challenges organizations face in this endeavor, and how phData’s proven process for domain implementation addresses these hurdles. Additionally, we will examine the implications and best practices for inter-domain data sharing, striking the right balance between domain autonomy and global data access governance.

By combining theoretical principles with practical insights, this blog aims to equip organizations with the knowledge and tools to engage in decentralized data mesh sharing and collaboration successfully.

What is phData’s Approach to Domain Implementation?

phData’s approach to implementing decentralized domain-based architectures follows a structured and phased process, designed to ensure both a seamless initial integration and long-term success. This process includes:

  • Conducting a comprehensive assessment of an organization’s data landscape, identifying potential domains based on business functions, data usage patterns, and cross-functional dependencies.
  • Engaging with stakeholders across different business units to gather domain-specific requirements and establish a shared understanding of domain boundaries.
  • Defining clear domain ownership and governance models, aligning with the organization’s overall data strategy.

Following these initial steps, our team can work with an organization to develop and implement the project-specific, use case-specific, or team-specific domains their business requires. The timeline for this process should follow the model below:

[Read More] How to Make Data Mesh a Reality: An Implementation Roadmap

Tools for Enabling Inter-Domain Data Sharing

Once domains have been developed, organizations can explore streamlined opportunities for more collaborative data use. For the decentralized implementation example that we’ll examine in this article, each domain team/business unit will independently manage their data’s transformation, curation, and security. However, this decentralized independence must be balanced with seamless and efficient access to data across these business units, along with effective data governance.

In this example, we use the Snowflake Data Cloud to support the organization’s domains. Unlike some other popular approaches, this implementation uses a separate Snowflake account for each domain. Snowflake’s data sharing capabilities made this possible while ensuring business continuity.
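As a rough illustration of the account-per-domain setup, the sketch below assembles the Snowflake SQL statements a producer domain might run to publish a share to consumer accounts. The share, database, schema, table, and account names are hypothetical placeholders, and the helper function is ours for illustration rather than part of any Snowflake or phData tooling. It only builds statement strings and does not connect to Snowflake.

```python
from typing import List


def build_share_statements(
    share_name: str,
    database: str,
    schema: str,
    tables: List[str],
    consumer_accounts: List[str],
) -> List[str]:
    """Return the SQL a producer domain could run to publish a share."""
    statements = [
        f"CREATE SHARE {share_name}",
        # The share needs usage on the containers before it can expose tables.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share_name}",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share_name}",
    ]
    for table in tables:
        statements.append(
            f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share_name}"
        )
    # Consumer domains live in their own accounts, so they are added by account identifier.
    accounts = ", ".join(consumer_accounts)
    statements.append(f"ALTER SHARE {share_name} ADD ACCOUNTS = {accounts}")
    return statements


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for sql in build_share_statements(
        "SALES_SHARE", "SALES_DB", "CURATED", ["ORDERS", "CUSTOMERS"], ["MYORG.FINANCE"]
    ):
        print(sql + ";")
```

On the consumer side, the receiving account would typically create a database from the share (CREATE DATABASE ... FROM SHARE ...) before querying it.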

Leveraging Snowflake’s self-service searchability across enterprise data catalogs and subscription data in the Snowflake data warehouse, we can implement an access management process that grants just-in-time access once the required workflows complete successfully. These workflows run through the organization’s existing ITSM tools, with resources managed by phData’s provision tool.

NOTE: phData’s provision tool allows you to manage Snowflake resource lifecycles — creating, updating, and destroying — with a templated approach to provide a well-structured information architecture.

Additionally, we use a centralized account to store the catalog information, contract details, approval workflow policies, and access management details for the organization’s decentralized data mesh architecture.

The Main Challenges to Inter-Domain Data Sharing

When attempting to achieve secure domain-based collaboration, teams should be aware of the following challenges:

Scalable Data Sharing:

Heterogeneous data access policies and agreements across business domains can complicate the implementation of data sharing. Maintaining these data access policies and masking rules as new data shares are created can easily become complex and error-prone. Leveraging tools like the Immuta Data Security Platform can help simplify and scale the management of data sharing policies.

Data Discoverability and Documentation:

As your decentralized data expands, there’s a risk of creating data silos and limiting visibility if metadata and documentation are not properly managed. Robust metadata management and advanced search capabilities are crucial for helping users discover relevant data products. Comprehensive documentation on data products’ purpose, schema, lineage, ownership, and other details is essential.

Governance and Operational Efficiency:

A growing data mesh requires ongoing housekeeping to manage technical debt, such as deprecated assets and inefficient processes. This requires robust governance frameworks and tooling that can identify and address these issues proactively. Cost management and budget control are also important considerations as the data mesh scales, as you need to ensure that operational costs are supported by business-driving results.
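One simple form of this housekeeping is flagging data products that nobody has used recently. The sketch below assumes you can obtain a last-accessed date per asset (for example, from query history). The asset names, the 90-day threshold, and the function itself are illustrative assumptions, not part of any specific tool.

```python
from datetime import date, timedelta
from typing import Dict, List


def stale_assets(
    last_accessed: Dict[str, date],
    today: date,
    max_idle_days: int = 90,
) -> List[str]:
    """Return asset names whose last access is older than the idle threshold."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, seen in last_accessed.items() if seen < cutoff)


if __name__ == "__main__":
    # Hypothetical access dates, for illustration only.
    usage = {
        "sales.orders": date(2024, 10, 1),
        "sales.orders_v1": date(2024, 1, 15),
    }
    print(stale_assets(usage, today=date(2024, 10, 23)))
```

A list like this is only a starting point for review. Domain owners would still decide whether a flagged asset should be deprecated.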

Data Privacy and Compliance:

Adhering to data privacy compliance requirements is a key challenge. It requires a robust data privacy framework and governance controls that reflect applicable laws and regulations, contracts, industry standards, and more. Techniques like data masking, data access controls, and auditing must be consistently implemented across the data mesh to enforce and maintain compliance.
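To make the combination of masking, access control, and auditing concrete, here is a minimal sketch of role-based column masking with an audit trail. The policy table, role names, and mask string are all hypothetical. In practice these rules would usually be enforced in the platform itself (for example, through masking policies) rather than in application code.

```python
from typing import Dict, List, Set

# Hypothetical policy: column name -> roles allowed to see the raw value.
MASKING_POLICY: Dict[str, Set[str]] = {
    "email": {"PII_READER"},
    "ssn": {"COMPLIANCE_OFFICER"},
}


def mask_row(row: Dict[str, str], role: str, audit_log: List[dict]) -> Dict[str, str]:
    """Return a copy of the row with protected columns masked for this role."""
    masked = {}
    for column, value in row.items():
        allowed_roles = MASKING_POLICY.get(column)
        if allowed_roles is not None and role not in allowed_roles:
            masked[column] = "***MASKED***"
        else:
            # Unprotected columns, or roles with clearance, see the raw value.
            masked[column] = value
    # Record every read so access can be audited later.
    audit_log.append({"role": role, "columns": sorted(row)})
    return masked


if __name__ == "__main__":
    log: List[dict] = []
    print(mask_row({"email": "ana@example.com", "region": "EU"}, "ANALYST", log))
    print(log)
```

Keeping the policy in one shared table, rather than per data share, is what lets the same rule apply consistently across domains.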

Three Key Requirements for Inter-Domain Data Sharing

An organization can enable secure and efficient inter-domain data sharing by incorporating the following key centralized modules:

  1. A Data Contract Framework
  2. A Centralized Data Catalog
  3. Self-Service Platform Enablement

1. A Data Contract Framework

The data contract framework acts as a central governance mechanism, ensuring that data products across domains conform to agreed-upon schemas, semantics, interoperability with other data products, and quality levels. It enables efficient collaboration and data-driven decision-making by providing a consistent and reliable foundation for data sharing within the decentralized data architecture.

By implementing a robust data contract framework, organizations ensure that inter-domain data sharing adheres to well-defined standards and expectations. This framework not only promotes trust and transparency among data consumers but also facilitates domain autonomy: domain teams can evolve their data products independently while maintaining compatibility, security, and quality guarantees.

For this implementation example, we have used a centralized Snowflake account, which is well suited to serve as a registry for the data contract framework. The registry hosts and stores the data product schemas, definitions, versioning, service level agreements, privacy policies, and workflows for the access management process. In addition, the contract registry serves both as the store for state management and as a way to monitor access patterns.
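As an illustration of what a registry entry and a conformance check might look like, the sketch below models a contract as a small record and compares it with the schema a data product actually exposes. The field names, the example product, and the rule that extra columns are tolerated are assumptions made for this sketch, not a description of the registry used in this implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataContract:
    product: str
    version: str                # contract version, e.g. "1.2.0"
    schema: Dict[str, str]      # column name -> declared type
    freshness_sla_hours: int    # maximum acceptable data age
    owner: str


def schema_violations(contract: DataContract, actual: Dict[str, str]) -> List[str]:
    """List the ways an actual schema breaks the contract (empty if compliant)."""
    problems = []
    for column, declared in contract.schema.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != declared:
            problems.append(
                f"type mismatch on {column}: expected {declared}, got {actual[column]}"
            )
    # Extra columns in `actual` are tolerated here, so producers can add fields.
    return problems


if __name__ == "__main__":
    contract = DataContract(
        product="sales.orders",
        version="1.0.0",
        schema={"order_id": "NUMBER", "amount": "NUMBER"},
        freshness_sla_hours=24,
        owner="sales-domain",
    )
    print(schema_violations(contract, {"order_id": "NUMBER", "amount": "VARCHAR"}))
```

Treating added columns as compatible is one possible design choice that lets a domain evolve its product without breaking consumers. A stricter contract could reject them.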

[Read More] What is dbt Mesh and How to Adopt It?

2. A Centralized Data Catalog

In this implementation example, we’ve created a data catalog repository, in its simplest form, within the centralized Snowflake account. With this in place, we can automate the metadata sharing process securely. This automation also includes a mechanism to update the catalog as new data sets become available, keeping metadata fresh throughout the decentralized ecosystem. Implementations of this kind may also involve third-party tools that support enterprise-level data catalog features, such as Alation and Collibra.
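The register-and-refresh behavior described above can be sketched with a small in-memory catalog. This is a stand-in for a catalog table in the central account, not the actual implementation. The class, its fields, and the example products are hypothetical.

```python
from typing import Dict, List


class DataCatalog:
    """Minimal in-memory stand-in for a central catalog table."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def register(self, name: str, domain: str, description: str, tags: List[str]) -> None:
        # Upsert: re-registering a product refreshes its metadata.
        self._entries[name] = {
            "domain": domain,
            "description": description,
            "tags": [tag.lower() for tag in tags],
        }

    def search(self, term: str) -> List[str]:
        """Return product names matching the term in name, description, or tags."""
        needle = term.lower()
        return sorted(
            name
            for name, entry in self._entries.items()
            if needle in name.lower()
            or needle in entry["description"].lower()
            or needle in entry["tags"]
        )


if __name__ == "__main__":
    catalog = DataCatalog()
    catalog.register("sales.orders", "sales", "Curated order lines", ["revenue"])
    catalog.register("hr.headcount", "hr", "Monthly headcount", ["people"])
    print(catalog.search("revenue"))
```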

[Read More] How to choose the right data catalog for your business

3. Self-Service Platform Enablement

To ensure domains are adopted by data owners and leveraged for enhanced sharing purposes, you need to make sure that your team is enabled with contextual information and best practices. Without effective change management, your domain-based ecosystem is likely doomed from the start.

Domain owners must be brought up to speed on their new responsibilities, preparing them to manage new data products, prioritize various initiatives, and control data access. They need to know how to generate and manage data use standards and access controls in order to keep their data secure while allowing access for a larger number of decentralized data users.

The Inter-Domain Data Product Access Flow

With the aforementioned platforms, frameworks, and tools in place, how might shared access look in this new decentralized ecosystem?

The process flow starts with registering domain-specific metadata in the central Snowflake account. Consumers can then search the available data product catalog through the data marketplace and request access to the data products they find. Each request triggers the approval process through the central account.

The steps of the process flow are as follows:

  1. All the data products and metadata are registered in a centralized account.
  2. Users can browse the data catalog to find the data products available across the organization.
  3. Users can request access, which triggers a workflow that sends the request to the data product owner. Upon the data product owner’s approval, access is granted.
  4. Consumers can then access the shared data product via a data share.
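The request-and-approval steps above can be sketched as a small state machine. The status names, the rule that only the product owner may decide, and the example users are assumptions for illustration. In the implementation described here, the workflow runs through the organization’s ITSM tooling rather than application code.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class AccessRequest:
    product: str
    requester: str
    owner: str  # the data product owner who must decide
    status: Status = Status.PENDING


def decide(request: AccessRequest, decided_by: str, approve: bool) -> AccessRequest:
    """Record the owner's decision on a pending request."""
    if decided_by != request.owner:
        raise PermissionError("only the data product owner can decide")
    if request.status is not Status.PENDING:
        raise ValueError("request has already been decided")
    request.status = Status.APPROVED if approve else Status.REJECTED
    return request


def can_query(request: AccessRequest) -> bool:
    """Access through the data share is allowed only after approval."""
    return request.status is Status.APPROVED


if __name__ == "__main__":
    req = AccessRequest(product="sales.orders", requester="ana", owner="sales-owner")
    decide(req, decided_by="sales-owner", approve=True)
    print(can_query(req))
```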

Conclusion

In this article, we provided a high-level overview of how a modern organization can implement a decentralized architecture that supports inter-domain data sharing and collaboration. By working with tools and providers that support distributed domains while maintaining cohesive data governance, compliance, and accessibility (such as Snowflake, phData, and Immuta), you can create a scalable data mesh that meets your organization’s collaborative, self-service data needs.

To jumpstart your Snowflake data sharing knowledge, check out phData’s quick start guide and this Immuta Snowflake data sharing guide. To explore how you can get started with secure, collaborative domains, request a demo from Immuta.

