TDAN: The Data Administration Newsletter, Since 1997

ROBERT S. SEINER – PUBLISHER

A New Way of Thinking - October 2006
Data Governance and the Federated Community

by David Loshin
Published: October 1, 2006

Published in TDAN.com October 2006

The concept of a “federated community” has started to pop up in a number of places, along with questions about the nature of information sharing within (or across?) a federated community. And the more organizations begin to explore ways to exploit free-flowing information streams (such as those enabled through service-oriented frameworks), the more important the notion of federation becomes. So as an advocate for data quality management, I think that it is worthwhile for us to consider the special challenges that are introduced when attempting to define a data quality strategy for interoperability within a federated community model.

First of all, what is a federated community? At the simplest level, it is a collection of participants (individuals or organizations), each of which operates under its own administrative domain and governance, who agree to collaborate in some way that benefits the participants, both individually and as a community. These communities may cross organizational, political, geographic, and jurisdictional boundaries. An easy way to identify the formation of a federation, at least one based on information sharing, is to watch the development of data standards. The need for a standard exists when two parties need to agree on a way to understand each other; the more parties that join in the activity, the stronger the evidence of general agreement that there is benefit in collaboration.

In addition, there may be external obligations and expectations with which each participant may be required to comply. Each participant may have different roles and obligations, and the level of expected conformance to these obligations can vary from voluntary to various degrees of compliance.

Assessing the degree to which participants conform to best practices, and assessing their various implementations of those practices, introduces interesting challenges, especially in the area of data quality management. First of all, within an administered environment, policies regarding the quality of information can be defined and enforced, but as data leaves the organization boundary, so too does the ability to control its quality. Second, the quality expectations for data used within a functional or operational activity within one organization may be insufficient for the needs of the “extended enterprise.” Third, the existence of data outside the administrative domains suggests the notion of ownerless data, for which no one is necessarily accountable.

Consider this: while data quality cannot necessarily be mandated, the expected benefits of collaboration through data sharing can only be achieved when all participants willingly contribute to successful data quality management. An important objective of the community is the development of a data quality framework that encourages participants to willingly conform to and broaden the integration of data quality. Here are some of the challenges:

Quantification and Measurement

By considering a definition of “quality data” as “fit for purpose,” we can infer that there are few objective metrics for measuring data quality, and that the quality of data is dependent on business user expectations. Even within individual organizations, there often is no formalized approach for quantifying data quality. In a federated community, the absence of formal methods for expressing ways to measure the quality of data adds further complexity. Consistent criteria and measurements are required.

This introduces two kinds of challenges. One challenge lies in the need to express business expectations for data quality that support the various business needs of shared services within the constraints imposed by the collaborative environment. The other challenge is specifying a universal set of dimensions of data quality for the entire community against which data quality performance can be measured.
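As a rough illustration of the second challenge, a community might agree on a small set of named dimensions and score every participant's data against the same checks. The sketch below is only one way this could look; the dimension names, the field names, and the five-digit postal code format are all assumptions made for the example, not part of any standard.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical community-wide dimensions. Each check maps a record to pass/fail.
def is_complete(record: Record) -> bool:
    """Completeness: every required field is present and non-empty."""
    return all(record.get(name) for name in ("id", "name", "postal_code"))

def has_valid_postal_code(record: Record) -> bool:
    """Validity: postal code is exactly five digits (an assumed format)."""
    code = record.get("postal_code") or ""
    return len(code) == 5 and code.isdigit()

DIMENSIONS: Dict[str, Callable[[Record], bool]] = {
    "completeness": is_complete,
    "validity": has_valid_postal_code,
}

def score(records: List[Record]) -> Dict[str, float]:
    """Return the fraction of records that pass each shared dimension."""
    if not records:
        return {name: 0.0 for name in DIMENSIONS}
    return {
        name: sum(check(r) for r in records) / len(records)
        for name, check in DIMENSIONS.items()
    }

# Made-up sample data from one participant.
sample = [
    {"id": "1", "name": "Ann", "postal_code": "20901"},
    {"id": "2", "name": "", "postal_code": "2090"},
]
scores = score(sample)
```

Because every participant is scored by the same functions, the resulting numbers are comparable across the community, while each participant remains free to decide which score is high enough for its own purposes.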

Politics and Organizational Behavior

While shared services provide the ability to access data materialized from numerous supplier sources, the actual components of the resulting virtual record are likely to have been distributed across multiple organizations, and may be located in different geographical areas and governmental jurisdictions. The challenge involves providing a framework for information sharing across multi-jurisdictional domains while considering the differences in policies across those different jurisdictions. Since the policies for security, privacy, and management may be defined in different ways by different jurisdictions, non-governmental boards/standards organizations, and private organizations, the framework must accommodate data quality management while maintaining conformance to political and organizational policies.

Regulatory

Accompanying the individual political and organizational challenges are the regulatory policies and legislation that may govern the various geopolitical jurisdictions. While it is reasonable to assume that there will be overarching policies regulating the use, sharing, and privacy of the data exchanged through the use of shared services, it is also likely that each jurisdiction’s policies will differ slightly. The challenge is to provide a means of data quality policy management that maps the various policies into business and data rules for validating consistent conformance to those policies, while accommodating variant rules as information crosses jurisdictional borders.
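One possible shape for this kind of policy mapping is a baseline rule set shared by the whole community, with per-jurisdiction variants that override individual rules. The following is a minimal sketch under that assumption; the rule names, the jurisdiction labels, and the retention limits are invented for illustration and do not reflect any actual regulation.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Rule = Callable[[Record], bool]

# Baseline data rules assumed to apply across the whole community.
BASE_RULES: Dict[str, Rule] = {
    "has_consent_flag": lambda r: isinstance(r.get("consent"), bool),
    "retention_days_max": lambda r: int(r.get("retention_days", 0)) <= 365,
}

# Hypothetical variants: a jurisdiction replaces a baseline rule by name.
VARIANTS: Dict[str, Dict[str, Rule]] = {
    "jurisdiction_b": {
        "retention_days_max": lambda r: int(r.get("retention_days", 0)) <= 90,
    },
}

def rules_for(jurisdiction: str) -> Dict[str, Rule]:
    """Merge the baseline rules with any variants for the given jurisdiction."""
    rules = dict(BASE_RULES)
    rules.update(VARIANTS.get(jurisdiction, {}))
    return rules

def violations(record: Record, jurisdiction: str) -> List[str]:
    """Return the sorted names of the rules that the record fails."""
    return sorted(
        name for name, rule in rules_for(jurisdiction).items() if not rule(record)
    )

# The same record can conform in one jurisdiction and not in another.
record = {"consent": True, "retention_days": 180}
in_a = violations(record, "jurisdiction_a")
in_b = violations(record, "jurisdiction_b")
```

Keeping the variants keyed by rule name means the community can see exactly where a jurisdiction departs from the baseline, which is the part that tends to matter when data crosses a border.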

Technical

Typically, the collaboration is an attempt to exploit the abilities of production or legacy systems, and layer service-oriented functionality on top of the existing applications. The combination of existing application systems implemented across a distributed environment consisting of heterogeneous systems introduces a need to provide a data quality infrastructure that can accommodate existing systems while ensuring alignment with future systems. Differences between hardware platforms, operating systems, data storage, and database management systems can introduce challenges in consistency of the data and integration challenges for the conceptual consolidation of many data sources into a virtual “master data source.”

Operational

The ability to best use shared information services relies on the ability to provide accurate and current data from trusted data sources. To ensure that the quality of this data is maintained, there is a need to establish a process for “qualifying” these data sources, continuously monitoring them, and establishing their trustworthiness, so that quality is ensured across the relevant dimensions of data quality.
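To make the idea of qualifying and monitoring a source a little more concrete, here is a small sketch. It assumes that each source periodically reports a score between 0 and 1 for each agreed dimension, and that the community has set a minimum score per dimension; the class name, dimension names, and threshold values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SourceMonitor:
    """Tracks reported scores for one data source against agreed minimums."""
    thresholds: Dict[str, float]
    history: List[Dict[str, float]] = field(default_factory=list)

    def record(self, scores: Dict[str, float]) -> None:
        """Store the latest measurement for this source."""
        self.history.append(dict(scores))

    def failing_dimensions(self) -> List[str]:
        """Dimensions whose latest score is below the agreed minimum.

        A source with no measurements yet fails every dimension, so it
        cannot be treated as trusted before it has been qualified.
        """
        if not self.history:
            return sorted(self.thresholds)
        latest = self.history[-1]
        return sorted(
            name
            for name, minimum in self.thresholds.items()
            if latest.get(name, 0.0) < minimum
        )

    def is_trusted(self) -> bool:
        """A source is trusted only while no dimension is failing."""
        return not self.failing_dimensions()

monitor = SourceMonitor(thresholds={"accuracy": 0.95, "currency": 0.90})
trusted_before = monitor.is_trusted()   # no measurements yet
monitor.record({"accuracy": 0.97, "currency": 0.92})
trusted_after = monitor.is_trusted()    # both minimums met
monitor.record({"accuracy": 0.97, "currency": 0.80})
failing_now = monitor.failing_dimensions()
```

Here qualification is simply the first measurement that clears every threshold, and monitoring is the same test repeated on each new measurement. A real process would likely also consider trends over the stored history rather than only the latest value.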

A lot of these issues boil down to some more fundamental questions regarding the definition of a governance framework for data quality management that is implementable (i.e., doesn’t impose too much of an impact on the participants), acceptable (i.e., maintains the level of data quality high enough for all to benefit), and operational (i.e., can provide continuous measures of community-wide conformance to data quality standards). The trick lies in gaining acceptance for two ideas:

  1. Integrating data quality management and monitoring within the service-oriented architecture. The technical components supporting community-wide data quality management must also be provided as component services that can be embedded within the architecture.
  2. Providing a forum for balancing the differences in administrative and jurisdictional policies, certification, and participant requirements, to provide methods for managing and implementing the numerous policies that reflect how data quality expectations are validated. This implies establishing protocols for configuring and deploying the shared and collaborative management and measurement of data quality.

It is inevitable that data quality issues will erupt, and developing a governance program that is built on cross-organizational consensus will help in developing policies and protocols to address those issues. And what I find funny is that the more we look at the issues associated with federated communities, the more we will start to see (in an almost fractal way) similarities within our own organizations’ boundaries: cross-division, cross-region, cross-department, cross-application, cross-database… Perhaps the federation concept might not be such a new one after all.

Copyright © 2006 Knowledge Integrity, Inc.

David Loshin - David is the President of Knowledge Integrity, Inc., a consulting and development company focusing on customized information management solutions including information quality solutions consulting, information quality training and business rules solutions. Loshin is the author of Master Data Management, Enterprise Knowledge Management: The Data Quality Approach and Business Intelligence: The Savvy Manager's Guide and is a frequent speaker on maximizing the value of information. David can be reached at loshin@knowledge-integrity.com or at (301) 754-6350.

Editor's note: More David Loshin articles, resources, news and events are available in the David Loshin Expert Channel on the BeyeNETWORK. Be sure to visit today!