The Enterprise Data Model: A Key Ingredient for Successful Data Warehousing
Published: June 1, 1998
The EDM is a key tool that will help us make sure that the warehouse delivers its promised return on investment.
A friend once commented to me, "I can't justify the investment of developing an enterprise data model to support a warehousing effort. Warehousing is focused on dealing with data at the physical level. I need to move data from a physical source to a physical target. A conceptual enterprise data model is not going to help me in that effort."
I can't dispute that source-to-target data migration analysis and design deals with detailed column domain definitions that are not addressed at the conceptual level. However, I think the enterprise data model (EDM) plays a critical role in the planning, design and future success of the enterprise data warehouse. We need the EDM precisely because the information sourced to the warehouse is constrained by the kind of implementation design issues we wrestle with in column mapping. The EDM is a key tool that will help us make sure that the warehouse delivers its promised return on investment. It is the component of data architecture that supports the mission of the data warehouse: to provide business intelligence and enable analysts to form tomorrow's profit-making strategies. This article will examine the role of the enterprise data model in a warehousing initiative and provide the rationale for its development.
Enterprise Data Model Definition
In a general sense, a data model defines the objects that are of interest to the business and the rules that describe and govern those objects. John Zachman's Framework for Information Systems Architecture provides a taxonomy for relating the concepts that describe business objects and the concepts that describe an information system and its implementation. Each cell in the Framework represents a model. The rows of the Framework represent the functional purpose and use of the model. The columns represent the who (people), what (data), when (timing/events), where (network), how (process) and why (mission).
Zachman's Framework is introduced here for two reasons. The first is to illustrate that the data model satisfies only one perspective of information systems architecture, the "what" column. The second, and more important, is to distinguish between different types of data models:
Row 1 – the planner's model – enterprise scope
The owner's model (Row 2) is the enterprise data model discussed here. This model defines the common terms and strategic business rules for corporate entities without technology constraints. In this model, Customer and Product are defined as conceptual enterprise-wide entities. The critical business rules that govern the management of each entity are defined with a common corporate viewpoint.
The EDM is an entity relationship model with primary entities, common supertypes, important subtypes, and important attributes defined. Many-to-many relationships are not resolved. The scope of the EDM should cross functional and organizational boundaries. There is no such thing as a Sales and Marketing "enterprise data model". A model with that scope is a conceptual data model drawn from a functional perspective, not an EDM.
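The contents of an EDM described above can be pictured as a small data structure. The following is only an illustrative sketch: the class names, the sample entities, and the helper functions are assumptions made for this example and do not come from any modeling tool or from the Zachman Framework itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """A conceptual business entity; only important attributes are listed."""
    name: str
    definition: str
    attributes: List[str] = field(default_factory=list)
    supertype: Optional[str] = None  # name of the common supertype, if any


@dataclass
class Relationship:
    """A business rule linking two entities, e.g. 'Customer orders Product'."""
    verb: str
    source: str
    target: str
    cardinality: str  # "1:1", "1:M" or "M:M"


# Hypothetical fragment of an EDM (sample content, not from the article).
entities = {
    e.name: e
    for e in [
        Entity("Business Party", "A person or organization of interest to the enterprise."),
        Entity("Customer", "A party that orders, receives, or pays for a product.",
               ["customer name"], supertype="Business Party"),
        Entity("Product", "Anything the enterprise offers for sale.", ["product name"]),
    ]
}

relationships = [
    # Left as many-to-many: no associative entity is introduced at this level.
    Relationship("orders", "Customer", "Product", "M:M"),
]


def subtypes_of(supertype: str) -> List[str]:
    """Return the names of entities declared as subtypes of `supertype`."""
    return sorted(e.name for e in entities.values() if e.supertype == supertype)


def unresolved_many_to_many() -> List[Relationship]:
    """Relationships that a later logical model would resolve."""
    return [r for r in relationships if r.cardinality == "M:M"]
```

Note that nothing in the sketch mentions tables, keys, or column domains; those belong to the logical and physical models further down the Framework.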
Benefits of the Enterprise Data Model to a Warehousing Initiative
Some familiar data architecture challenges:
The above list of challenges looks familiar to anyone who has worked on defining the data architecture for a data warehouse. Ironically, this list is not from current warehousing literature, but from an old slide on data modeling challenges. Reviewing it shows that the enterprise data model seeks to conquer the same challenges that we face in data warehousing. The challenges in defining the data architecture for warehousing lie in achieving data integration, promoting information reuse, and providing business intelligence. The EDM will help define a current and future data architecture that meets those challenges.
1. The EDM provides an enterprise perspective.
One purpose of the EDM is to discover common threads and develop a cross-functional, common definition of each entity. For example, a telecommunications corporation manufactures equipment, purchases equipment for use and resale, and develops network management software for sale that manages equipment. The functional information requirements for equipment data differ depending on whether equipment is perceived from the viewpoint of the manufacturing function or the network systems management function. Is the network switch equipment or product? Does the enterprise have the same business terms and rules for equipment manufactured, purchased, and managed via network software?
The definition of information in common enterprise terms improves the context of information. The warehouse often provides global enterprise access to information that had previously been limited to a functional area. This broader audience requires that the information be defined in a context that makes sense to the reader regardless of their industry background or global location. Another important purpose is to document derivations for strategic key performance indicators. These are the business measures that executive management applies to measure performance.
2. The EDM defines strategic information needs.
The enterprise data model is a "to be" model. It is strategic. It recasts data into a model with a vision that transcends the biases in today's data. Strategic definitions are developed that remove data from its literal system representation and transform it into information, expressed in business language, that will support growth and new initiatives. For the natural gas utility that seeks new and unique service and product offerings as it continues to evolve under FERC deregulation, the EDM plays a critical role.
Many of us are familiar with the complexities of modeling customer information. Is the customer the individual who orders the item, receives the item, or pays for the item? The roles of orderer, receiver and payer could all be filled by different individuals. In some cases a customer can be a competitor. A supplier can be a customer. Are there universal customer rules and rules that apply only to specific customers? It is through the definition of the EDM that the natural gas utility can initiate the definition of new requirements for business party information and business party roles that will support a more dynamic business model that meets marketplace challenges.
Because the EDM is at a conceptual business level, its development explores the entities and business rules most important to corporate survival and success. It is developed with key decision makers, and its purpose is to represent information in a manner that will help the enterprise compete and grow. The development of the EDM provides a good understanding of business party roles and important supertype and subtype entities, which will be critical to the design of the data warehouse and dimensional models.
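The customer questions raised in this section are often handled by separating a business party from the roles it plays. The sketch below illustrates that idea under stated assumptions: the role names are taken from the discussion above, while the party names and the rule that any of the orderer, receiver, or payer roles makes a party a customer are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Role(Enum):
    ORDERER = "orderer"
    RECEIVER = "receiver"
    PAYER = "payer"
    SUPPLIER = "supplier"
    COMPETITOR = "competitor"


# Assumed rule: any one of these roles makes a party a customer.
CUSTOMER_ROLES = {Role.ORDERER, Role.RECEIVER, Role.PAYER}


@dataclass
class BusinessParty:
    """A person or organization; what it 'is' depends on the roles it holds."""
    name: str
    roles: Set[Role] = field(default_factory=set)

    def is_customer(self) -> bool:
        return bool(self.roles & CUSTOMER_ROLES)


# Hypothetical parties illustrating the overlapping cases in the text.
acme = BusinessParty("Acme Pipeline", {Role.SUPPLIER, Role.PAYER})     # supplier and customer
rival = BusinessParty("Rival Gas", {Role.COMPETITOR, Role.ORDERER})    # competitor and customer
vendor = BusinessParty("Meter Vendor", {Role.SUPPLIER})                # supplier only

customers = sorted(p.name for p in (acme, rival, vendor) if p.is_customer())
```

Because roles are held in a set rather than encoded as separate Customer and Supplier records, a new role can be added without restructuring the party itself, which is the kind of flexibility a more dynamic business model calls for.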
3. The EDM forms the foundation of the warehouse atomic data store.
The data warehouse architecture most subscribed to today is a multi-tiered data architecture composed of a logically centralized, normalized, atomic data store that is the source for multiple functional data marts. In the design of the atomic data store, the goal is to build a foundation of data that is a shareable, consistent, reusable atomic source of information for the dimensional data marts.
Since the atomic data store must be designed to meet cross-functional needs, it requires the EDM as a blueprint. The reference to the EDM ensures that the atomic data store is built on a forward-thinking data model and a solid understanding of cross-functional entity supertypes and subtypes. It ensures a more stable design of the entities that form the business dimensions in the atomic data store as the corporate business model evolves in response to market challenges. The reference to the EDM also facilitates cross-functional data integration.
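The relationship between the atomic data store and the data marts can be shown with a toy example. This is a minimal sketch, not a warehouse design: the rows, the column names, and the `build_mart` function are all hypothetical, and a real atomic store would be a normalized database rather than a list in memory.

```python
from collections import defaultdict

# Hypothetical atomic rows: one per sale, stored once at the lowest grain.
atomic_sales = [
    {"product": "switch", "region": "east", "amount": 100.0},
    {"product": "switch", "region": "west", "amount": 250.0},
    {"product": "router", "region": "east", "amount": 75.0},
]


def build_mart(rows, dimension):
    """Summarize the shared atomic rows along one business dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["amount"]
    return dict(totals)


# Two functional marts derived from the same atomic source.
sales_by_product = build_mart(atomic_sales, "product")
sales_by_region = build_mart(atomic_sales, "region")
```

Both marts are derived from the same rows, so their grand totals agree. That consistency depends on the dimensions (here, product and region) being defined once for the whole enterprise, which is the role the EDM plays as blueprint.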
4. The EDM supports planning the data warehouse release strategy.
The EDM describes the integration points between primary entities, provides insight into the complexity of information, and supports the gap analysis between strategic information needs and current information availability. The EDM provides an improved understanding of information requirements across functional and organizational boundaries. This knowledge is input to the long-term warehousing strategy and assists in determining the phased implementation of information blocks that will provide a solid return on investment.
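At its simplest, the gap analysis mentioned above is a comparison of two lists: the entities the EDM says the business needs, and the entities that current systems can actually supply. The sketch below assumes both lists are available as sets of entity names; the names and the `gap_analysis` function are hypothetical and chosen only for illustration.

```python
# Hypothetical inputs: entity names from the EDM and from a source-system inventory.
strategic_needs = {"Customer", "Product", "Equipment", "Business Party Role"}
currently_available = {"Customer", "Product", "Invoice"}


def gap_analysis(needed, available):
    """Split the strategic needs into those already covered and those missing."""
    return {
        "covered": sorted(needed & available),  # candidates for an early release
        "gaps": sorted(needed - available),     # require new sourcing or new systems
    }


result = gap_analysis(strategic_needs, currently_available)
```

In practice each entry would carry more detail, such as the source system and data quality, but even this coarse split helps decide which information blocks can be delivered in early releases.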
5. The EDM initiates business data stewardship.
The purpose of data stewardship is to place the accountability, control, shareability and quality management of enterprise data in the hands of the business people who define, create and access the data. The subject matter experts who contribute to the EDM serve as strategic data stewards. They are responsible for the development of data definitions and standards that support enterprise-wide use, and they are important contributors to the quality management process for the data warehouse.
The following outlines some critical success factors for the development of an Enterprise Data Model:
The EDM is more than an ideal, an academic exercise that is subscribed to in theory but gets tossed out of the project plan when the reality of project deadlines looms. Familiar reasons for eliminating the EDM are lack of time and lack of subject matter expertise to contribute. There is no argument that the development of an enterprise data model is a challenging process. The internal political realities of the enterprise, the extent of decentralization, and the dynamics of acquisitions and divestitures are all factors that increase its complexity. However, I think too often the EDM is bypassed because its content, purpose and application are not fully understood.
As the conceptual blueprint for the enterprise data warehouse, the EDM is indispensable. It defines the warehouse vision. It provides a strategic, horizontal view of information across the enterprise that is critical to the logical and physical design of the warehouse data architecture. The more complex the business, the more critical the EDM is to the success of the information systems architecture and specifically the data warehouse.
Kathy Long is a Consultant with Knowledge Partners, Inc. She has enjoyed a diverse career in the information systems field over the past 16 years working as a business analyst, data and process modeler, meta-data analyst, and application designer. She has managed a variety of projects including strategic business modeling, data integration and systems development in the manufacturing, natural gas, pharmaceutical, consumer products and telecommunications fields. As a consultant, she has played a lead role in several full-lifecycle data warehouse iterations that defined sales, cost, product supply chain and inventory analytical information. Ms. Long is experienced in data migration design, logical and dimensional modeling, meta-data architecture and analytical information requirements definition.