Model Driven Information Architecture
Published: April 1, 2002
Published in TDAN.com April 2002
The Challenge of Information Integration
Over the past twenty years, enterprises have created many diverse systems to manage their information and data. Individual systems combine a myriad of hardware configurations, operating systems, databases, and applications. Often, individual enterprises have found themselves with several disparate information systems among their divisions and departments, especially after mergers or acquisitions have broadened the scope and depth of the enterprise.
As the world, and the enterprise along with it, becomes more thoroughly networked, the enterprise needs to integrate its diverse systems to operate and analyze its resources more effectively. Numerous external sources, from partner information resources to real-time data feeds, have become available. The enterprise needs to marshal and integrate these disparate systems. At the heart of the systems integration challenge lies an information integration challenge.
Model-driven integration differs from programmed integration. Programmed integration relies upon hard-coding a finite, and inextensible, solution to a particular challenge. Model-driven integration focuses on abstracting the information content into a model that describes the enterprise’s information resources. This model captures the nature of the information the enterprise has within its systems and the way the enterprise uses data in its daily operations.
The data model does not rely on a particular hardware or operating system platform; instead, it contains standard constructs that show the data entities and operations. Once an enterprise captures its information resources in a model, it can easily integrate this information using a middleware server.
Model-driven integration offers a complete solution because the data model can demonstrate:
Model-driven integration represents an evolution of the common and accepted practice of developing applications and representing systems using standard constructions in a model. The most common example of this practice, and the precedent for model-driven integration, is the use of Unified Modeling Language (UML).
The Historical Precedent: UML
As development languages and environments proliferated in enterprises, developers accustomed to varied, and often proprietary, systems and languages found themselves lacking a common way to express concepts such as workflow, relationships between entities, interaction, and other abstractions inherent in development.
In 1997, the Object Management Group (OMG) established a standard language for expressing these concepts. UML contains many types of models, each of which provides a standard presentation for one type of information. Hence, UML serves as a metamodel, a language for constructing and relating diverse models.
The Unified Modeling Language (UML) gained rapid acceptance and set off a new trend in application development. Several companies created software tools that implement the UML standard to create graphical tools that enterprises of all types can use to “design” applications. Figure 1 contains a UML class diagram depicting a portion of an application.
Figure 1: UML Class Diagram
This model contains standard constructs; anyone versed in UML can review the model and understand its contents. By modeling applications in UML before writing actual code, the enterprise garners many benefits:
The Platform Independent Model
Using UML, an enterprise can build a Platform Independent Model (PIM) that captures the design, business logic, and data requirements of each particular application. Independent of any platform, the model describes the goals of the application without referring to a specific operating system, hardware configuration, or even programming language.
The Platform Independent Model could describe an application written in Java, running on the Sun Solaris operating system on an Alpha, or it could describe a Visual Basic application running on the Windows XP operating system on a quad Pentium IV server.
Automatic Code Generation through Tools
Certain new UML modeling tools can generate platform-specific code from a UML model. Using these tools and UML saves many enterprises a great deal of programming effort—and expense.
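To make the idea concrete, the following is a minimal sketch of generating platform-specific code from a platform-independent description. It does not reproduce any real UML tool: the ModelClass and Attribute types, the type-mapping tables, and the Customer example are all hypothetical, and the Visual Basic output assumes .NET-style class syntax.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for a UML class and its attributes (not a real tool's API).
@dataclass
class Attribute:
    name: str
    type: str  # platform-independent type name, e.g. "String" or "Integer"


@dataclass
class ModelClass:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


# Illustrative mappings from model types to platform types.
JAVA_TYPES = {"String": "String", "Integer": "int"}
VB_TYPES = {"String": "String", "Integer": "Integer"}


def to_java(cls: ModelClass) -> str:
    """Render the model class as a Java class skeleton."""
    lines = [f"public class {cls.name} {{"]
    for a in cls.attributes:
        lines.append(f"    private {JAVA_TYPES[a.type]} {a.name};")
    lines.append("}")
    return "\n".join(lines)


def to_visual_basic(cls: ModelClass) -> str:
    """Render the same model class as a Visual Basic class skeleton."""
    lines = [f"Public Class {cls.name}"]
    for a in cls.attributes:
        lines.append(f"    Private {a.name} As {VB_TYPES[a.type]}")
    lines.append("End Class")
    return "\n".join(lines)


# One platform-independent description, two platform-specific renderings.
customer = ModelClass(
    "Customer",
    [Attribute("name", "String"), Attribute("number", "Integer")],
)
print(to_java(customer))
print(to_visual_basic(customer))
```

The model itself never mentions a language or platform; only the two rendering functions do, which is the separation the Platform Independent Model depends on.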
Simplified Application Migration
When an enterprise needs to migrate an application from one platform to another, a UML model makes the process straightforward. The UML model documents workflow and objects outside of the comments within the code and the memories of developers. Without the model, enterprises can find both occasionally unreliable for a complex task.
PIMs and System Integration
While enterprises have applied UML models successfully to application development, UML has not yet significantly affected the problem of systems integration. Although Platform Independent Models offer excellent return on investment (ROI) for large enterprises, enterprises have not yet applied the lessons learned with PIMs and UML to the challenge of systems integration.
The Model in Information Integration
The success of the Platform Independent Model in application development leads to the use of PIMs for enterprise information sources. Enterprises use PIMs to ease their systems design and application design; in the same way, enterprises can use model-driven integration to solve their information integration challenges, which lie at the heart of their systems integration.
To integrate their information sources, enterprises can produce PIMs of diverse information stores. However, different enterprise information systems have different methods of storing information. A single metamodel, the abstraction that describes the structure of models, cannot accommodate all possible variations. With that, the meta-metamodel was born.
Differences in Metamodels
When constructing PIMs for software applications, enterprises needed only one metamodel, provided with the UML specification, to describe the models the enterprise created. However, because information sources, including relational databases, hierarchical databases, object databases, files, streaming information, and many others, can have radically different structures, enterprises will need more than one way to describe the models they need to create. These enterprises need more than one metamodel.
With that end in mind, the Object Management Group created the Meta Object Facility (MOF) standard, extending UML to apply it to modeling diverse information systems. The Meta Object Facility standard describes diverse metamodels, essentially abstracting a form and structure to describe metamodels. Figure 2 describes the Meta Object Facility’s structure.
Figure 2: MOF Standard
The Object, Relational, File, and XML information sources have individual structures, described in the model (M1 in the figure). Each information source has a metamodel, which determines not only the structure, but also the relationships between the entity types in a model (M2 in the figure). The meta-metamodel (M3 in the figure), then, is a metamodel that describes the contents of metamodels, in this case the types of entities shared by information systems of all types. The MOF standard describes the methods of describing all information system types, and can extend to include systems beyond the four described above.
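The three levels can be sketched in a few lines of code. This is only an illustration of the layering, not the MOF specification: the class names MetaClass, Metamodel, and Entity, the conforms function, and the sample relational and XML metamodels are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetaClass:
    """M3: the single construct from which every metamodel is built."""
    name: str
    contains: List[str] = field(default_factory=list)  # child entity types allowed


@dataclass
class Metamodel:
    """M2: the entity types, and their relationships, for one kind of source."""
    name: str
    entity_types: Dict[str, MetaClass]


@dataclass
class Entity:
    """M1: one element in the model of a concrete information source."""
    type_name: str
    name: str
    children: List["Entity"] = field(default_factory=list)


def conforms(entity: Entity, metamodel: Metamodel) -> bool:
    """Return True if the entity and all its children obey the metamodel."""
    meta = metamodel.entity_types.get(entity.type_name)
    if meta is None:
        return False
    return all(
        child.type_name in meta.contains and conforms(child, metamodel)
        for child in entity.children
    )


# Two M2 metamodels expressed with the same M3 construct.
relational = Metamodel("Relational", {
    "Table": MetaClass("Table", contains=["Column"]),
    "Column": MetaClass("Column"),
})
xml = Metamodel("XML", {
    "Element": MetaClass("Element", contains=["Element", "Attribute"]),
    "Attribute": MetaClass("Attribute"),
})

# An M1 model of a hypothetical relational source.
customers = Entity("Table", "CUSTOMERS", [Entity("Column", "cust_id")])
print(conforms(customers, relational))  # True
print(conforms(customers, xml))         # False
```

Because both metamodels are built from the same M3 construct, a single function can validate models of either kind, and a new kind of source only requires a new Metamodel instance.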
Modeling Data with Metadata
When modeling an information system, the enterprise captures the essence of the information within its systems, including technical aspects of the data, which describe the structure of the data, and business aspects of the data, which describe the way the enterprise uses the data. This captured essence is called metadata, data about the data. The Platform Independent Models include both the technical and business metadata, but remain platform independent because their contents remain descriptive in nature.
The Platform Independent Models contain metadata that describe the data in the physical sources. For example, in the relational metamodel, this includes table and column names, data types, keys, and foreign keys. This metadata is called physical metadata.
Figure 3: Physical Metadata Model
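As a rough illustration of physical metadata in the relational metamodel, the sketch below records table and column names, data types, keys, and foreign keys. The class names and the CUSTOMERS and ORDERS tables are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class ForeignKey:
    column: str      # column in this table
    ref_table: str   # table it points to
    ref_column: str  # column it points to


@dataclass
class Table:
    name: str
    columns: List[Column]
    primary_key: List[str] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)

    def column(self, name: str) -> Optional[Column]:
        """Look up a column by name; return None if it does not exist."""
        return next((c for c in self.columns if c.name == name), None)


# Physical metadata for two illustrative tables.
customers = Table(
    "CUSTOMERS",
    [Column("cust_id", "string"), Column("name", "string")],
    primary_key=["cust_id"],
)
orders = Table(
    "ORDERS",
    [Column("order_id", "integer"), Column("cust_id", "string")],
    primary_key=["order_id"],
    foreign_keys=[ForeignKey("cust_id", "CUSTOMERS", "cust_id")],
)

print(customers.column("cust_id").data_type)
print(orders.foreign_keys[0].ref_table)
```

Nothing here holds a row of actual data; the structures only describe how the data is stored, which is what keeps the model descriptive.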
To simplify application development and to achieve information integration, enterprises need to describe, in a common way, the metadata in the disparate physical sources. For instance, system 1 has a column named cust_id with the data type of string. System 2 has a column of equivalent information named cust_num with the data type of integer. A metadata model that can represent a single column named CustomerNumber, and that maps and transforms both cust_id and cust_num into it, would go a long way toward solving the information integration problem. This metadata, which describes the data as the enterprise, or its data-consuming applications, use it, is called virtual metadata. Figure 4 displays a portion of a Platform Independent Model containing virtual metadata.
Figure 4: Virtual Metadata Model
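The CustomerNumber example might be sketched as follows. The column names cust_id, cust_num, and CustomerNumber come from the text; everything else is an assumption for illustration, including the VirtualColumn and SourceMapping classes, the system identifiers, and the choice of integer as the common data type.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class SourceMapping:
    system: str                      # which physical system the column lives in
    column: str                      # physical column name
    transform: Callable[[Any], int]  # converts the physical value to the virtual type


@dataclass
class VirtualColumn:
    name: str
    data_type: str
    mappings: List[SourceMapping]

    def resolve(self, system: str, row: Dict[str, Any]) -> int:
        """Read this virtual column's value from a row of the given system."""
        for m in self.mappings:
            if m.system == system:
                return m.transform(row[m.column])
        raise KeyError(f"no mapping for system {system!r}")


# Assumption: the virtual column uses integer as its common type.
customer_number = VirtualColumn(
    "CustomerNumber",
    "integer",
    [
        SourceMapping("system1", "cust_id", int),            # string -> integer
        SourceMapping("system2", "cust_num", lambda v: v),   # already an integer
    ],
)

print(customer_number.resolve("system1", {"cust_id": "1042"}))  # 1042
print(customer_number.resolve("system2", {"cust_num": 1042}))   # 1042
```

An application that asks for CustomerNumber never needs to know which physical column, or which data type, sits behind it.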
The Tool for Modeling Metadata
A graphical tool for modeling diverse information sources through different metamodels should:
Modeling Data Presents a Partial Solution
However, modeling data as metadata only presents a partial solution to information integration by offering the conceptual integration. Ultimately, the enterprise needs a common way to access all of its systems using the information contained in the Platform Independent Models.
True Model-Driven Integration
To achieve integration at the information level, the enterprise must take the abstract and make it real, much as it takes the UML diagram and actualizes it into an application, created and compiled in a particular programming language. The enterprise needs to use the information within the metadata Platform Independent Models and create a specific, platform-based means of data access.
From PIM to PSM
Remember, the Platform Independent Models describe the information sources. Platform Specific Models (PSMs) must couple the design-time metadata constructions with platform-specific information that contains actual parameters and connection information for the enterprise information systems described in the PIMs.
Platform Specific Models contain technical information from the PIMs coupled with actual connection details for a particular data source and its system. Deployed within an integration platform, these PSMs provide the key for information integration through a Model-Driven Architecture.
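One way to picture the step from PIM to PSM is as a binding of a platform-independent description to connection details. The sketch below is an assumption-laden illustration rather than any product's API: the class names, the bind function, the Oracle platform label, and the example.com hosts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PlatformIndependentModel:
    name: str
    source_kind: str           # e.g. "relational"
    entities: Tuple[str, ...]  # names of the modeled entities


@dataclass(frozen=True)
class ConnectionInfo:
    host: str
    port: int
    database: str


@dataclass(frozen=True)
class PlatformSpecificModel:
    pim: PlatformIndependentModel
    platform: str
    connection: ConnectionInfo

    def describe(self) -> str:
        c = self.connection
        return f"{self.pim.name} on {self.platform} at {c.host}:{c.port}/{c.database}"


def bind(
    pim: PlatformIndependentModel, platform: str, connection: ConnectionInfo
) -> PlatformSpecificModel:
    """Couple design-time metadata with platform and connection details."""
    return PlatformSpecificModel(pim, platform, connection)


customer_pim = PlatformIndependentModel(
    "CustomerModel", "relational", ("CUSTOMERS", "ORDERS")
)

# The same PIM bound to two hypothetical environments.
test_psm = bind(customer_pim, "Oracle", ConnectionInfo("test-db.example.com", 1521, "crm"))
prod_psm = bind(customer_pim, "Oracle", ConnectionInfo("prod-db.example.com", 1521, "crm"))

print(test_psm.describe())
print(prod_psm.describe())
```

The PIM is reused unchanged across both bindings; only the connection details differ, so moving a source to a new server means creating a new PSM rather than remodeling the data.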
The Tool for Integrating Information
The ideal information integration server requires several features:
Implementing an MDA Solution
Figure 5 describes the process of modeling metadata and creating Platform Specific Models using a Model-Driven Architecture software solution.
Figure 5: Creating PIMs and PSMs
Information Integration, using Model-Driven Architecture, follows these steps:
Model-Driven Architecture for information integration offers a number of benefits for the enterprise that impact the enterprise’s bottom line. These benefits include:
A Model-Driven Architecture Solution, by using a modeling and implementation process that many enterprises already use for application design, extends a familiar methodology in a new direction. Enterprises can integrate their existing information systems easily without costly or confusing conversions and can access the information in those systems in real-time.
© Copyright 2002 - MetaMatrix, Inc.
Michael A. Lang, Sr. - Michael A. Lang, Sr. is Co-Founder and Chairman of Revelytix, Inc., a semantic technology company that has developed an ontology-based collaboration framework for vocabulary development and community knowledge management. Prior to founding Revelytix, Michael co-founded MetaMatrix, an enterprise information integration company that was sold to Red Hat in 2007. Earlier on, he was President of NSSI, a CAD software company purchased by Network Imaging. Prior to NSSI he worked in the financial information industry for Bridge Information Systems and Reuters Information Systems, where he ran strategy for financial information and analytic products. He is a noted consultant in the areas of data integration and semantic technologies. He has assisted in various areas of product strategy for such companies as BEA Systems. He is currently Deputy CTO for Vitria, a BPM and exception management software company. He is a graduate of Washington College, with a BS in Chemistry.
Brian J. Noggle -