Meta Data Themes - Part One - The Basics
Published: June 1, 1998
Imagine what it would be like if there was an inexpensive one-size-fits-all solution to the management of meta data. Imagine what it would be like if it was easy to move meta data from one tool or platform to another and that tool vendors collaborated on standard cross-product meta models. Imagine meta data as an integral part of the business intelligence offering in your company. Imagine if pigs could fly.
I can picture a lot of data architects and repository administrators nodding their heads in unison. These individuals are most likely to know what a "dream" it would be to have such an integrated IT environment. The integrated environment would make it much easier to improve data quality services, improve compliance with data standards, reduce data redundancy, and offer a higher level of understanding about corporate data. A dream: Webster's dictionary calls it a "wild fancy or hope". The scenario described in the first paragraph is a dream. It is not reality. Not yet. The time may be coming but it is not around the corner.
What can we do to make the best of the current situation? Should we wait around for the few vendors with resources to provide solutions to battle out the issues and pick a single standard? This may take more time than we have (or it may never happen). We have to act now.
The purpose of this two-part article is to demonstrate, in easy-to-understand terms, a manageable way of selecting the meta data to manage, one that will prove its value and build a foundation for future data and meta data management efforts. The goal of the first article is to discuss basic themes of meta data. The second article will discuss advanced themes of meta data.
Getting the Ball Rolling
For companies that I have worked with, getting started is one of the hardest parts of the meta data project. Companies that require return on investment in hard dollars for meta data management often have a difficult time cost-justifying an enterprise meta data framework or strategy. It is possible to apply dollars to the ROI (through reduced unproductive work, rework, research time, incorrect decisions, and more). However, these dollars are often viewed as inflated because they rest on the assumption that the company will change the way it manages its data. Companies that require hard ROI are fully justified in wanting to know what they are getting for their money. My point is that this requirement sometimes becomes an insurmountable hurdle.
Many companies that move forward with meta data management projects understand the need to improve data resource knowledge and do not require a defined hard dollar return on investment. These companies believe that leveraging other investments (data warehousing, package implementations, system development, e-commerce, …) is only possible through improving the understanding and management of data through meta data.
Neither type of organization is wrong. In the first type of organization, getting the ball rolling takes more time and, in a day of reduced budgets, becomes more and more difficult. In these companies, the data architects and data resource management staff hope that management will allow them to improve the organization's understanding of its data. These same individuals hope that their company will not wake up years behind its competition.
For the companies that require ROI definition, all is not lost. There are things that can be done in small, tightly focused projects that do not require a substantial amount of resources.
Enterprise meta data does not need to be delivered all at once. In fact, it is difficult (if not impossible) to deliver enterprise meta data all at once. Therefore, the first step of the project should be to understand the big picture of enterprise meta data management as a backdrop for smaller, tightly focused meta data projects. Get the ball rolling, show value and move on from there.
Themes in Major Projects
Which of the following projects are significant projects at your company?
You say ... all of these are important. Certainly these projects are important to a large number of companies that apply a large portion of their IT budget to these types of efforts.
It is unreasonable to believe that companies will fund a project or apply resources to a project to address the meta data concerns of all of the projects listed above right out of the gate. For all of the projects above, meta data can be a key component to success and sustainability.
So … How do we select which meta data is the right meta data to manage as it pertains to the most important projects in our company? By looking for themes of meta data. When analyzing the basic meta data that is important to each of these projects, we uncover three underlying meta data themes.
Meta Data Themes - Basic
Much of the meta data for the types of projects defined above is based on similar themes. These themes are data model meta data, physical database meta data, and data movement meta data.
Purists may tell you that there is a lot more to meta data than just these three themes. This is true. However, for the purposes of this first article, I will discuss the extremely basic themes of meta data and I will focus on the items listed above and defined below. The basic meta data themes offer you an easy-to-understand starting point for meta data management.
The table below demonstrates the types of meta data that are related to the basic themes:
Figure 1 – Basic Meta Data Themes
Data Model Meta Data
Many projects start with the business definition of data through the use of a data modeling tool or tools. There are a variety of tools available for project data modeling and enterprise data modeling. By analyzing the meta models of several modeling tools in established repository environments, common objects appear that define the basis of data modeling meta data.
Common to most modeling products and projects are the definitions of business entities, business attributes, domains, and allowable values, along with the key attributes and business rules that are used to relate entities to each other.
There are other types of meta data that are important to the modelers and the modeling tool. However, when looking at how the data modeling meta data would likely be used away from the tool, the items listed in the table above will most likely answer the needs of the business and technical meta data users.
When business data models are forward engineered to physical database designs, additional meta data is captured in the modeling tool (mapping of the logical model to the physical database). This information will be used to relate the business definition of the data to the physical environment.
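The data model meta data described above can be pictured as a handful of related record types. The following is a minimal sketch, not the meta model of any particular modeling tool: the class names, field names, and the Customer/Order sample values are all hypothetical, chosen only to show entities, attributes, domains, allowable values, a relationship with its business rule, and the logical-to-physical mapping.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Domain:
    """A named data type plus any allowable values (hypothetical shape)."""
    name: str
    data_type: str
    allowable_values: list[str] = field(default_factory=list)


@dataclass
class Attribute:
    name: str
    definition: str
    domain: Domain
    is_key: bool = False
    # Filled in when the model is forward engineered to a physical design.
    physical_column: Optional[str] = None


@dataclass
class Entity:
    name: str
    definition: str
    attributes: list[Attribute] = field(default_factory=list)
    physical_table: Optional[str] = None

    def key_attributes(self) -> list[Attribute]:
        return [a for a in self.attributes if a.is_key]


@dataclass
class Relationship:
    parent: Entity
    child: Entity
    business_rule: str


# Illustrative sample content only.
identifier = Domain("Identifier", "INTEGER")
status = Domain("Order Status", "CHAR(1)", ["O", "S", "C"])

customer = Entity(
    "Customer",
    "A party that buys goods.",
    [Attribute("Customer Id", "Unique identifier of a customer.",
               identifier, is_key=True, physical_column="CUST_ID")],
    physical_table="CUST",
)
order = Entity(
    "Order",
    "A request by a customer for goods.",
    [
        Attribute("Order Id", "Unique identifier of an order.",
                  identifier, is_key=True, physical_column="ORD_ID"),
        Attribute("Order Status", "Current state of the order.",
                  status, physical_column="ORD_STAT"),
    ],
    physical_table="ORD",
)
places = Relationship(customer, order, "A customer places zero or more orders.")
```

Keeping the physical table and column names on the logical objects is one simple way to relate the business definition to the physical environment; a real repository would more likely hold that mapping as its own object.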
A basic diagram of the modeling meta data discussed above is shown in Figure 2.
Figure 2. - Basic Data Model Meta Data
Physical Database Meta Data
If you are using a relational database management system such as DB2, Oracle, Sybase, SQL Server or Informix, the DBMS catalog serves as a meta data repository for your database environment. All of the information found within the catalog is meta data. Not all of this information is useful to individuals who are not database administrators. DBAs often have direct access to the physical database meta data through database management tools. In that case, DBAs may tell you that they do not require integrated meta data to help them with their jobs. They already have access to the meta data that they need to perform their jobs.
Most likely, the meta data in the DBMS catalog that will be important to individuals other than the DBAs will include information about the database, tables, views, columns, and indexes. There is a tremendous amount of meta data in the DBMS catalog, but when individuals are looking to use the data, the physical characteristics listed in this paragraph cover their most basic needs.
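To make the idea of the catalog as a meta data store concrete, here is a small sketch that reads table, view, index, and column information back out of a database. It uses SQLite (through Python's standard `sqlite3` module) purely as a stand-in because it runs without a server; SQLite is not one of the products named above, each of which has its own catalog tables and views. The `cust` table, its index, and its view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust (cust_id INTEGER PRIMARY KEY, cust_nm TEXT NOT NULL);
    CREATE INDEX cust_nm_ix ON cust (cust_nm);
    CREATE VIEW cust_v AS SELECT cust_id, cust_nm FROM cust;
""")


def catalog_objects(connection):
    """List (object type, object name) pairs recorded in SQLite's catalog."""
    return connection.execute(
        "SELECT type, name FROM sqlite_master ORDER BY type, name"
    ).fetchall()


def table_columns(connection, table):
    """Return (column name, declared type, not null?, part of key?) per column.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so pass trusted names only.
    """
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [(r[1], r[2], bool(r[3]), bool(r[5])) for r in rows]


for object_type, name in catalog_objects(conn):
    print(object_type, name)
for column in table_columns(conn, "cust"):
    print(column)
```

The same few facts (tables, views, columns, indexes) are what a non-DBA audience would typically want extracted from a production catalog into a shared repository.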
Legacy applications often store data in files defined by copybook members (used to define the data in flat files, VSAM files, and IMS segments). Copybook members do not contain the same level of detail about the physical data as the DBMS catalog does. But since data is still stored that way (and will continue to be stored that way), it is important to consider these types of physical data definitions in the basic themes of meta data.
The types of meta data found in copybook members include copybook names, record names, field names, group names, and the structure of the data (i.e. picture clauses).
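As a rough illustration of harvesting that meta data, the sketch below pulls record names, group names, field names, and picture clauses out of a small COBOL copybook. The copybook content is made up, and the parser is deliberately simplified: it assumes one entry per line, ignores sequence-number columns, continuation lines, REDEFINES, and OCCURS, and treats any entry without a picture clause as a group.

```python
COPYBOOK = """\
       01  CUSTOMER-RECORD.
           05  CUST-ID            PIC 9(8).
           05  CUST-NAME.
               10  FIRST-NAME     PIC X(20).
               10  LAST-NAME      PIC X(30).
           05  BALANCE            PIC S9(7)V99 COMP-3.
"""


def parse_copybook(text):
    """Return one dict per data description entry found in the text."""
    entries = []
    for line in text.splitlines():
        # Drop the terminating period, then split on whitespace.
        tokens = line.strip().rstrip(".").split()
        if len(tokens) < 2 or not tokens[0].isdigit():
            continue  # not a "level-number name ..." entry
        level, name = int(tokens[0]), tokens[1]

        picture = None
        for keyword in ("PIC", "PICTURE"):
            if keyword in tokens[2:-1]:
                picture = tokens[tokens.index(keyword, 2) + 1]

        if level == 1:
            kind = "record"
        elif picture is None:
            kind = "group"  # simplification: no picture clause means group
        else:
            kind = "field"
        entries.append(
            {"level": level, "name": name, "kind": kind, "picture": picture}
        )
    return entries


for entry in parse_copybook(COPYBOOK):
    print(entry["level"], entry["kind"], entry["name"], entry["picture"])
```

The copybook member name itself is not inside the source text; it would normally come from the library or file in which the member is stored.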
A basic diagram of the physical database meta data is shown in Figure 3.
Figure 3. - Physical Database Meta Data
Data Movement Meta Data
Data does not stand still. Data moves from business unit to unit, function to function, and database to database. Data is, perhaps, the most dynamic asset of the company. As data is recorded and processed, mappings and transformations take place that determine how data should be interpreted.
In many companies, during data movement processes developed for package implementation and/or decision support environments, mapping and transformation meta data is often captured in data transformation tools that automate the data movement process. The information about how the data in the package or warehouse was created is important to the warehouse builders. But that meta data can also be important to business users whose job it is to interpret the data.
Companies that don't use data movement and transformation tools should pay special attention to documenting how data is selected, extracted, mapped, moved, and transformed. This information can typically (but not easily) be found in the meta data that exists 1) in the source code that is written to perform these functions or 2) in external forms of documentation such as word processing documents, spreadsheets, and personal databases. Either way, the data movement meta data becomes important for the proper interpretation and use of the data.
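Whether it comes from a transformation tool or from hand-written documentation, data movement meta data boils down to source-to-target mappings with a stated transformation rule. The sketch below shows one possible shape for such records and a lookup that traces a target column back to where it came from. The `Mapping` class, the `lineage` function, and every system, table, and column name are hypothetical, and the lookup assumes the mappings contain no cycles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str          # written here as "system.table.column"
    target: str
    transformation: str  # the rule, stated in plain language


MAPPINGS = [
    Mapping("ORDERS.ORD.ORD_AMT", "DW.SALES_FACT.SALE_AMT",
            "Convert cents to dollars (divide by 100)."),
    Mapping("ORDERS.ORD.ORD_STAT", "DW.SALES_FACT.STATUS_DESC",
            "Decode O/S/C to Open/Shipped/Closed."),
    Mapping("DW.SALES_FACT.SALE_AMT", "MART.MONTHLY_SALES.TOTAL_AMT",
            "Sum by calendar month."),
]


def lineage(target, mappings):
    """Return the chain of mappings that feed `target`, earliest hop first.

    Assumes the mappings are acyclic; a cycle would recurse without end.
    """
    hops = []
    for mapping in mappings:
        if mapping.target == target:
            hops.extend(lineage(mapping.source, mappings))
            hops.append(mapping)
    return hops


for hop in lineage("MART.MONTHLY_SALES.TOTAL_AMT", MAPPINGS):
    print(f"{hop.source} -> {hop.target}: {hop.transformation}")
```

A business user reading a monthly total could use this kind of trace to learn that the figure started as an order amount in cents and was converted and summed along the way.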
A basic diagram of data movement meta data is shown in Figure 4.
Figure 4. - Data Movement Meta Data
Ways to Use Basic Theme Meta Data
To this point we have defined three basic themes for meta data management: Data Model Meta Data, Physical Database Meta Data, and Data Movement Meta Data. The meta data gained from these three pieces of the data architecture can be used to answer a large number of questions about your organization's IT environment.
Doing a good job of managing the basic meta data is not an easy task. There are many considerations to keep in mind when implementing meta data management for the three basic themes: identifying the meta data, selecting the meta data, mapping the meta data to a common target, moving the meta data to a centralized repository (or separate data store), and keeping the meta data up-to-date.
However, the hard work can pay off if meta data is available to answer specific questions about the corporation's data assets. A list of typical questions appears below:
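The mapping, moving, and refreshing steps just listed can be sketched in a few lines. This is only an illustration under stated assumptions: `RepositoryEntry`, `Repository`, the two `from_...` converters, and the input record layouts are all hypothetical, and a plain in-memory dictionary stands in for a centralized repository.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RepositoryEntry:
    """The common target that every source of meta data is mapped to."""
    theme: str          # "data model", "physical database", "data movement"
    name: str
    description: str
    source_tool: str
    refreshed_on: date


def from_modeling_tool(record, today):
    # Assumed input shape: {"entity": ..., "definition": ...}
    return RepositoryEntry("data model", record["entity"],
                           record["definition"], "modeling tool", today)


def from_dbms_catalog(record, today):
    # Assumed input shape: {"table_name": ..., "remarks": ...}
    return RepositoryEntry("physical database", record["table_name"],
                           record["remarks"], "DBMS catalog", today)


class Repository:
    def __init__(self):
        self._entries = {}

    def load(self, entries):
        """Insert new entries; a reload of the same item replaces the old copy."""
        for entry in entries:
            self._entries[(entry.theme, entry.name)] = entry

    def stale(self, cutoff):
        """Names of entries last refreshed before the cutoff date."""
        return sorted(e.name for e in self._entries.values()
                      if e.refreshed_on < cutoff)


repo = Repository()
repo.load([from_modeling_tool(
    {"entity": "Customer", "definition": "A party that buys goods."},
    date(1998, 1, 15))])
repo.load([from_dbms_catalog(
    {"table_name": "CUST", "remarks": "Customer master table."},
    date(1998, 5, 20))])

print(repo.stale(date(1998, 3, 1)))
```

Recording when each entry was last refreshed is one inexpensive way to see which meta data has drifted out of date and needs another extract from its source.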
This article addressed the first step of meta data management: the identification of basic meta data that can be helpful in many organizations. By managing the meta data in the data modeling tools, the physical database environment, and the mapping and transformation of data, organizations will be better prepared to deliver information about the projects and data that are the most important to the organization.
The second article will discuss advanced meta data themes that include data access, data quality meta data, business rules meta data, and information accountability meta data. The advanced meta data, in conjunction with the basic meta data, provides an easy-to-understand mechanism for moving forward with enterprise meta data management.
Robert S. Seiner - Robert (Bob) S. Seiner is recognized as the publisher of The Data Administration Newsletter, LLC – www.TDAN.com – an award winning electronic publication that focuses on sharing information about data, information, content and knowledge management disciplines. With 2013, TDAN.com enters its 17th year. Mr. Seiner speaks often at major data management and meta-data management, business intelligence and knowledge management related conferences and user group meetings across the U.S. He can be reached at the newsletter at firstname.lastname@example.org or 412-220-9643.
Mr. Seiner is the President and Principal Consultant of KIK Consulting & Educational Services, LLC – www.KIKconsulting.com. KIK, celebrating its 12th year, is a company that focuses on knowledge transfer and consultative mentoring in the fields of data governance and data stewardship implementations, metadata management, master data management and data architecture. Beyond knowledge-transfer-focused consulting, Mr. Seiner offers two-day in-house and public courses on how to build and implement data governance / stewardship programs and metadata programs. Contact Mr. Seiner at KIK at email@example.com.
Editor's Note: View his blog, more articles and resources in Bob's BeyeNETWORK Expert Channel.