
|
Practical Data Administration
Published: April 1, 2001
Published in TDAN.com April 2001 In the magical land of Logical Nirvana there lives Good Queen Normalization III. In this land the Black Night Redundancy lives in shackles in the basement prison. Redundancy shares his cell with his squire Urgent Deadline. The old village of Legacy Data has been toppled and the new city of Enterprise Data has sprung up in its place. The blind alleys and dead ends have been replaced by an efficient highway system. “Because we’ve always done it that way!” has been erased from the city archway. Queen Normalization III appoints you Supreme Judge of Data. The application developers come to your court and bow down begging you to model their data. “Take as long as you need and please, no denormalization!” they cry. “Please make sure it depends on the key, the whole key, and nothing but the key!” they beg. You have the power to instantly behead anyone avoiding business keys or creating bundled elements. In Logical Nirvana all the RDBMS’s and application development languages are developed too only work with normalized data. Each night you sleep soundly knowing that you have successfully defended Logical Nirvana from all evil. Unfortunately Logical Nirvana is not where we live. We must deal with redundancy, deadlines, legacy data, and constraints of RDBMS’ and development languages. Most of the executives we know do not have a lot in common with Good Queen Normalization III. So how do we live in today’s world of deadlines and performance constraints? As data analysts I believe it is our job to help applications develop solid systems, on time, while ensuring that the new data entities are reusable and fit in the enterprise. This is not an easy task. Data analysts often find themselves caught between the urgent need for new application data stores and the commitment to an enterprise view of the data. There are two traps we can fall into. The most common trap is becoming too theoretical in the pursuit of “Logical Nirvana”. It is more important that what we produce is useful then that it is technically perfect. Second, we can not give in to every application level demand at the expense of the “big picture”. We must find a way too balance the two. We must give the application developers what they need to do their job while not undermining the enterprise view of the data. I believe the goal of every logical model should be to normalize to 3rd normal form. Much to the horror of many life time data analysts, many of whom have no practical experience, I also believe there are exceptions. Logical models should graphically represent the business in a way that is useful and understandable. I recently worked on a model that would be used to store data about business recovery. The table in question contained the business activities that are essential and need to be quickly replaced if the building blows up or burns down. Two of the attributes needed were the employee id of the disaster recovery plan owner (the person responsible for the activities plan upkeep.) and the employee id of the plan approver (typically but not always the plan owner’s manager). When looking at this data from a logical standpoint the two employee id’s are a repeating group and therefore should be normalized to a separate table (see example 1). The other choice, not as logically pure, would be too denormalize the id’s to the business activity table (see example 2).
Example 1
Example 2
When reviewing this I asked myself the following questions:
Based upon the about answers I fail to see a practical benefit of creating a separate entity. Another example concerned the representation of business units. We needed to develop a model to support Division, Department, and Unit (a sub-group of department). One school of thought involves creating a “generic” business unit table with a recursive relationship (see example 3) as opposed to creating 3 separate tables (see example 4). I agree with the intelligent use of recursive relationships. But we must be careful that the increased flexibility is not at the expense of the models usability. If a model is technically perfect but is so generic that it increases the complexity of accessing it, we should question that approach.
Example 3
Example 4
If we combine Division, Department and Unit into 1 entity with a type code we would increase flexibility but greatly decrease usability.
Now that I have all modeling purist cursing me, I would like to state that I am a firm believer in the need for good logical models. They are the corner stone to a good data enterprise. There are some points in which we should be uncompromising. Some things we should allways enforce are :
I understand that all of this is confusing. Logical Modeling is more of an art then a science. Until we can tear it down, we need to walk the fence between the enterprise view and the application view. Maybe someday we will find Logical Nirvana. Maybe someday Data Redundancy and Urgent Deadline will no longer haunt us. Until they do we must continue to press on. Keep the big picture in focus. What real, defendable benefit is this decision providing? That is the best we can do.
This article was previously published in the December, 2000 issue of the Journal of Conceptual Modeling (www.inconcept.com/JCM). |