TDAN: The Data Administration Newsletter, Since 1997

THE DATA ADMINISTRATION NEWSLETTER – TDAN.com
ROBERT S. SEINER – PUBLISHER


Data Semantics Management - New Book Just Announced
Interview with Michael M. Gorman of Whitemarsh

by Michael M. Gorman, Robert S. Seiner
Published: October 1, 2008
Bob Seiner, the Publisher of TDAN.com, sits down with Mike Gorman to discuss his new book titled Data Semantics Management.

Robert S. Seiner (RSS): Good morning Mr. Gorman.  Thank you for taking the time from your busy schedule to discuss your newest book, Data Semantics Management.  Can you please summarize the two volumes and tell us why this book is well worth the investment to purchase and read?

Michael M. Gorman (MMG): The key point of the two-volume book, which can be ordered from the Whitemarsh website, is that Data Semantics Management failures are very, very expensive. One U.S. DoD agency spends, in 1995 dollars, $167 million every year for ETL (extract-transform-load). Another DoD Agency spent over $500 million across 12 years searching for the Silver Bullet and found only blanks. A third DoD agency spends close to $200 million every year fixing badly constructed logistics transactions. But, a fourth DoD agency, through engineered data semantics, reaped a tremendous ROI. It was in the Billions.

RSS: You just said that a government organization saved $Billions (with a “B”) because of engineered data semantics. How so, and can you share this example with the TDAN.com readers?

MMG: This DoD agency saved well in excess of several BILLION dollars in IT costs because they got their “data right.” The savings here were NOT in data per se, but the ability to delete redundant IT systems, computers, and the like, and the ability to prevent the creation of a large quantity of different IT systems. The ROI is not data alone. Mainly it’s from IT system optimizations.

Here’s what they did. First, senior management made funding contingent on getting “data right.” Second, they established a very well-engineered data-centric metadata management infrastructure. Third, they established and executed highly organized Data Interoperability Communities of Interest. Fourth, they made success very public. If there’s any agency that’s walked the talk on “right data” better than this DoD agency, I just don’t know what it is.

RSS: Absolutely incredible.  Tell us about the book, what it’s about.

MMG: This book is the fourth in the Whitemarsh Data Management Series. All the books are focused on four core principles: Do more, for less, at a lower cost, with reduced risk. This book consists of two volumes. The first is: Rationale, Requirements, and Architecture. Volume 2 is Deployment. Sort of a talk and walk pair.

Volume 1 has eight chapters across about 400 pages. This volume begins with the requirements and rationale for shared and commonly understood data. This volume also details and examines the common data semantics failures, and the lessons learned are applied throughout both volumes.

Volume 1 describes the five Enterprise Data Architecture layers. This volume also sets down three additional critical foundational components: Semantic Hierarchies, Value Domains, and Business Fact Cases. Semantic Hierarchy words, allocated to business facts, enable automatic name and definition construction.

Value Domains enable the engineering of the hidden semantic layer: restricted value sets. Gender is an obvious example. Value domain management also includes the ability to map among different collections.
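As an illustration only (the interview does not describe how the Metabase System stores value domains), a restricted value set and a mapping between two code collections might be sketched as follows. The class, the domain names, and the code values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    """A named, restricted set of allowed values (hypothetical structure)."""
    name: str
    values: frozenset

    def validate(self, value):
        """Return True when the value belongs to this domain."""
        return value in self.values


# Two hypothetical code collections for the same concept.
GENDER_LETTER = ValueDomain("GenderLetter", frozenset({"M", "F", "U"}))
GENDER_DIGIT = ValueDomain("GenderDigit", frozenset({"1", "2", "9"}))

# Hypothetical mapping between the two collections.
LETTER_TO_DIGIT = {"M": "1", "F": "2", "U": "9"}


def map_value(value, source, target, mapping):
    """Translate a value between domains, checking membership at both ends."""
    if not source.validate(value):
        raise ValueError(f"{value!r} is not in value domain {source.name}")
    mapped = mapping[value]
    if not target.validate(mapped):
        raise ValueError(f"{mapped!r} is not in value domain {target.name}")
    return mapped
```

Calling `map_value("F", GENDER_LETTER, GENDER_DIGIT, LETTER_TO_DIGIT)` translates a value from one collection to the other, while a value outside the source domain is rejected before any mapping is attempted.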

Finally, Volume 1 identifies, describes, and prescribes how all the different classes of business-fact cases should be handled. For example, simple facts, derived facts, compound facts, paired facts, related facts, and others.

Volume 2 is all about Deployment. It also has eight chapters over about 700 pages. This volume contains all the “action” chapters. These “can–do” chapters provide the step-by-step processes to walk the data semantics management talk set from Volume 1.

By the way, readers can request a free copy of the Metabase System. Just go to the Free Metabase page on the Whitemarsh website and request one. All the metadata in the Metabase is reportable through ODBC based report writers such as Crystal.

The main chapters in Volume 2 address the five Enterprise Data Architecture layers. These are: Data Elements, Data Structure Templates, Database Models that are DBMS independent, Database Models that are DBMS dependent, and Data Interface Models. There is also a chapter on Database Object Classes that enable the creation and management of data models in an object-oriented data, process, and state manner within traditional project-teams, while, at the same time, supporting the creation and management of enterprise-wide semantics and data models.

The first data-layer is Data Elements. It was well established by Mike Brackett that data elements are used over and over throughout a project, an organization, a functional area, and the enterprise as a whole. In fact, the larger the domain is, the greater the re-use. That simply means that Data Elements are semantic templates. They are NOT columns, fields, XML elements, etc. By having Data Elements as the first data-layer, you are put in a great position to develop common understandings, automatically generate names and definitions, and to manufacture database data models.
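To make the idea of automatic name and definition generation concrete, here is a minimal sketch. It assumes that a name is formed by prefixing allocated semantic words to a Data Element's core name; that rule, the function names, and the sample words are assumptions for illustration, not the book's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """A reusable semantic template, not a column or field."""
    core_name: str
    definition: str


# Hypothetical semantic words and their meanings.
SEMANTIC_WORDS = {
    "Customer": "Applies to a party that buys goods or services",
    "Billing": "Used for invoicing",
}


def generate_name(element, words):
    """Prefix the allocated semantic words to the element's core name."""
    return " ".join([*words, element.core_name])


def generate_definition(element, words):
    """Combine the element definition with each word's meaning."""
    parts = [element.definition.rstrip(".")]
    parts.extend(SEMANTIC_WORDS[w].rstrip(".") for w in words)
    return ". ".join(parts) + "."


postal_code = DataElement(
    "Postal Code", "A code that identifies a mail delivery area."
)
name = generate_name(postal_code, ["Customer", "Billing"])
definition = generate_definition(postal_code, ["Customer", "Billing"])
```

Here `name` becomes "Customer Billing Postal Code", and the same Data Element can be reused with different words to produce other names without redefining its core meaning.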

The second data-layer is Data Structure Templates. This layer can be built top-down or through Reverse Engineering. Each template becomes a data structure that can be used to manufacture databases in a repeatable, reliable manner. Here are some simple template examples: Address, Person’s full name, Stock Numbers, UPC Codes, and Telephone numbers. A more sophisticated example is header and detail for requisitions, orders, and invoices. Other examples include data auditing and stewardship. Whole functional templates can be built for finance, HR, inventory, manufacturing, distribution, and the like.

The triple for this Data Structure Template layer is Subject, Entity, and Attribute. It is supported by relationships within and across subject-based entity collections. Every attribute is relatable to a Data Element. Names and definitions can be automatically generated.

The third data-layer is the Database Model layer that is not DBMS bound. This layer is constructed through importing Data Structure Templates, or through promotion from the fourth layer in support of Reverse Engineering. Just establish the Schema and then, through import buttons, grab collections of data structures of entities and attributes to construct your tables and columns. There can be multiple 2nd layer entity-based data structures in a given table, an entity-based data structure can be deployed across tables, and finally, an entity-based data structure can be deployed across schemas. Just think about the Data Structure Template examples above and you’ll see that your database models all reflect this strategy.
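The import step can be pictured with a small sketch. It assumes templates are simple lists of attribute names and that importing copies them into a table as columns while remembering their origin; the template contents, the `role` prefix, and the function name are hypothetical.

```python
# Hypothetical second-layer templates: entity name -> attribute names.
TEMPLATES = {
    "Person Full Name": ["First Name", "Middle Name", "Last Name"],
    "Address": ["Street", "City", "Postal Code"],
}


def import_template(table, template, role=""):
    """Append one column per template attribute, recording where it came from."""
    for attribute in TEMPLATES[template]:
        column = f"{role} {attribute}".strip()
        table.append(
            {"column": column, "entity": template, "attribute": attribute}
        )


# One table built from several entity-based data structures.
customer = []
import_template(customer, "Person Full Name")
import_template(customer, "Address", role="Billing")
import_template(customer, "Address", role="Shipping")

column_names = [c["column"] for c in customer]
```

The Address template is deployed twice in the same table, which mirrors the point that a table can hold multiple second-layer structures and that one structure can be reused in several places.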

What the step-by-step chapter processes do is greatly facilitate this work. Of course, because of the allocation of predefined semantics, names and definitions can be automatically generated. The triple in this data-layer is Schema, Table, and Column. It is supported by relationships across tables within a single schema. The relationship between Attributes and Columns is many-to-many. Because of the Data Element layer, you can generate reports based on core names, allocated Semantic Hierarchy words, data types, and the like. That’s real data semantics management.

The fourth layer is the Database Model layer that is DBMS bound. This layer can be constructed in two ways: Importing from the 3rd layer or Reverse-Engineering from operational databases. Once a collection of operational database models is constructed, the process of relating these operational data models into enterprise canonical models can begin. With these processes and the metadata management system these models are possible. Before, they were a far-off never-funded dream. Now, they can be a close-in, funded, and practical reality that can be created incrementally.

The triple in this layer is DBMS Schema, DBMS Table, and DBMS Column. It is supported by relationships across tables within a single schema. The relationship between Columns and DBMS columns is many-to-many.
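The links between layers can be sketched as sets of pairs, which allows a DBMS column to be traced back to its Data Element and a Data Element to be traced forward to every DBMS column that uses it. This is an illustration under assumed names; the actual Metabase schema is not described in this interview.

```python
# Hypothetical metadata links; all names are illustrative only.
ATTRIBUTE_TO_ELEMENT = {
    "Address.Postal Code": "Postal Code",
    "Person Full Name.Last Name": "Person Name",
}

# Stored as pairs so that either side may repeat (many-to-many).
ATTRIBUTE_COLUMN = {
    ("Address.Postal Code", "Customer.Billing Postal Code"),
    ("Address.Postal Code", "Customer.Shipping Postal Code"),
    ("Person Full Name.Last Name", "Customer.Last Name"),
}

COLUMN_DBMS_COLUMN = {
    ("Customer.Billing Postal Code", "CUST.BILL_ZIP"),
    ("Customer.Shipping Postal Code", "CUST.SHIP_ZIP"),
    ("Customer.Last Name", "CUST.LNAME"),
}


def elements_for_dbms_column(dbms_column):
    """Trace a DBMS column back through columns and attributes to elements."""
    columns = {c for c, d in COLUMN_DBMS_COLUMN if d == dbms_column}
    attributes = {a for a, c in ATTRIBUTE_COLUMN if c in columns}
    return {ATTRIBUTE_TO_ELEMENT[a] for a in attributes}


def dbms_columns_for_element(element):
    """Trace a Data Element forward to every DBMS column that carries it."""
    attributes = {a for a, e in ATTRIBUTE_TO_ELEMENT.items() if e == element}
    columns = {c for a, c in ATTRIBUTE_COLUMN if a in attributes}
    return sorted(d for c, d in COLUMN_DBMS_COLUMN if c in columns)
```

A query such as `dbms_columns_for_element("Postal Code")` is the kind of cross-layer report that becomes possible once every attribute and column is related back to a shared Data Element.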

The fifth layer is the IT system interface layer. That is, XML, SQL Views, or SOA-based transactions. This layer enables the creation of interface specifications between database models and business information systems. It does no good to just have data models. Bringing data to life requires business information systems. This chapter supports the specification of the interfaces between data and process.

You asked how this book came to be. It took nine full months of work. Actually, it took 40 years. I started in database (not DBMS) in 1969. It’s almost been my entire professional life. There have been only two other life-long obsessions. The first is ANSI database standards. I’ve been its only secretary for the past 30 years. All this self-funded ANSI effort has enabled me to work with the heads of DBMS research from all the major DBMS companies. We meet six or more times a year.

My other life-long obsession is my family. First there’s my wife, Maxine, of 44 years, and then there are my 7 children, and 23 grandchildren.  Maxine deserves a life-time endurance award. Can you imagine listening to me talk about database for 40 years at dinner?

RSS: In other words, this is the result of a “labor of love” (or a “love of labor”)?

MMG: Well, I would say that this has been both. My objective for this book is that it is employed on everyday projects. It’s focused on the here and now, and the practical. I saw no real value in just a Volume 1 theory book. Volume 2 is the proof of the pudding.

RSS: As most TDAN.com readers know, you (we) have been publishing your articles on the TDAN.com newsletter almost since the beginning in 1997.  Why do you think the TDAN.com readers have such great interest in reading your works and how has your published material evolved over the past 12 years?

MMG: I think I have about 25 articles. As to the quality and usefulness of the content, when I go to our DAMA conference “watering hole” every year, invariably several people come up and provide me assessments of what I’ve written. In that regard, I’m still very much alive and am not limping so I guess the content isn’t all bad.

Over the past 12 years, there have been four key developments: the Knowledge Worker Framework, Database Object Classes, the fleshing out of the Enterprise Data Architecture five layers, and the Metabase System.

RSS: You are a regular speaker at DAMA events and other events nationally.  How much of the content of these books come directly from your experiences at clients and at these events?

MMG: That’s an easy question to answer. 150%. It’s over 100% because I often get it wrong. Both DAMA attendees and clients quickly beat me over the head, which is hard and thick since I’m German-Irish.

RSS: You and I have had a long friendship and I have always respected you for many things including your “passion” for the subjects that you address. Mike, some of that passion has rubbed off on me, and people describe me as showing a lot of passion for the subjects I speak about. Where does the “passion” come from and how does that differentiate you and your business from the others in the industry?

MMG: Well, my father was always said to have been vaccinated with a phonograph needle. I suspect that that’s an inherited trait. I’m a natural optimist and always try to look for and expect the best from people. I know that almost everybody’s trying to do their best. I try to find the essential “common sense” in most everything. While it’s a very uncommon sense, once it’s found, it’s easy to remember and capitalize upon.

RSS: What types of work do you typically get involved in and how important is that experience in putting together the Data Semantics Management book?

MMG: Mainly, I get involved in enterprise data management efforts. These efforts almost always cause the creation of new materials and updates to the Metabase System.

RSS: Mike, thank you very much for taking the time to answer my questions. I hope that TDAN.com will be permitted to publish excerpts from the book and new articles from you in the near future. Any last words for the TDAN.com readers?

MMG: Sure, of course, and yes, I do have words for the TDAN.com readers. “Everybody: Send an email to Bob Seiner thanking him for all the hard work he does in bringing us TDAN.com.”

RSS: Thank you again and I look forward to connecting with you again soon.  Best wishes for success with your new book.

- - - - - - - - - -

The two-volume book can be ordered at: http://www.wiscorp.com/printed_books.html
There’s a special $20 extra off the two-volume book when you enter the coupon code TDAN2008.


Michael M. Gorman -

Michael, the President of Whitemarsh Information Systems Corporation, has been involved in database and DBMS for more than 40 years. Michael has been the Secretary of the ANSI Database Languages Committee for more than 30 years. This committee standardizes SQL. A full list of Whitemarsh's clients and products can be found on the website. Whitemarsh has developed a very comprehensive Metadata CASE/Repository tool, Metabase, that supports enterprise architectures, information systems planning, comprehensive data model creation and management, and interfaces with the finest code generator on the market, Clarion (www.SoftVelocity.com). The Whitemarsh website makes available data management books, courses, workshops, methodologies, software, and metrics. Whitemarsh prices are very reasonable and are designed for the individual, the information technology organization and professional training organizations. Whitemarsh provides free use of its materials for universities/colleges. Please contact Whitemarsh for assistance in data modeling, data architecture, enterprise architecture, metadata management, and for on-site delivery of data management workshops, courses, and seminars. Our phone number is (301) 249-1142. Our email address is: mmgorman@wiscorp.com.

Robert S. Seiner - Robert (Bob) S. Seiner is recognized as the publisher of The Data Administration Newsletter, LLC – www.TDAN.com - an award winning electronic publication that focuses on sharing information about data, information, content and knowledge management disciplines. TDAN.com celebrated its 12th anniversary in 2009.  Mr. Seiner speaks often at major data management and meta-data management, business intelligence and knowledge management related conferences and user group meetings across the U.S. He can be reached at the newsletter at rseiner@tdan.com or 412-220-9643.

Mr. Seiner is the President and Principal Consultant of KIK Consulting & Educational Services, LLC – www.KIKconsulting.com.  KIK, celebrating its 7th anniversary, is a company that focuses on knowledge transfer and consultative mentoring in the fields of data governance and data stewardship implementations, metadata management, master data management and data architecture. Beyond knowledge-transfer-focused consulting, Mr. Seiner offers two-day in-house and public courses on how to build and implement data governance / stewardship programs and metadata programs. Contact Mr. Seiner at KIK at rseiner@kikconsulting.com.