|
The Denormalization Survival Guide - Part I
Published: April 1, 2002
Published in TDAN.com April 2002
Articles in this series - Part 2,
This the first of two articles on the Denormalization Survival Guide, adapted from Chapter 8 of Steve Hoberman's book, the Data Modeler's Workbench (Order this book through Amazon.com Today!). This first article focuses on the dangers of denormalization and introduces the Denormalization Survival Guide, which is a question-and-answer approach to applying denormalization to our data models. The second article will discuss the questions and answers that comprise this guide when modeling a data mart. The Dangers of DenormalizationIn her book entitled, Handbook of Relational Database Design, Barbara von Halle has a great definition for denormalization: “Denormalization is the process whereby, after defining a stable, fully normalized data structure, you selectively introduce duplicate data to facilitate specific performance requirements.” Denormalization is the process of combining data elements from different entities. By doing so, we lose the benefits of normalization and, therefore, reintroduce redundancy into the design. This extra redundancy can help improve retrieval time performance. Reducing retrieval time is the primary reason for denormalizing. My favorite word in von Halle's definition is “selectively”. We have to be very careful and selective where we introduce denormalization because it can come with a huge price even though it can decrease retrieval time. The price for denormalizing can take the form of these bleak situations:
“What if an employee has two home phone numbers?”
“How can we store more than one email address for the same employee?”
“Can two employees share the same work phone number?”
After receiving a blank stare from the designer, I realized that denormalization was applied before fully normalizing, and therefore, there was a significant lack of understanding of the relationships between the data elements.
Being aware of these potential dangers of denormalization encourages us to make denormalization decisions very selectively. We need to have a full understanding of the pros and cons of each opportunity we have to denormalize. This is where the Denormalization Survival Guide becomes a very important tool. The Denormalization Survival Guide will help us make the right denormalization decisions, so that our designs can survive the test of time and minimize the chances of these bleak situations from occurring. What Is the Denormalization Survival Guide?The Denormalization Survival Guide is a question-and-answer approach to determining where to denormalize your logical data model. The Survival Guide contains a series of questions in several different categories that need to be asked for each relationship on our model. There are point values associated with answers to each question. By adding up these points, we can determine whether to denormalize each specific relationship. If our score is 10 or more after summing the individual scores, we will denormalize the relationship. If our score is less than 10, we will keep the relationship normalized and intact. When you are done asking these questions for each relationship, you will have a physical data model at the appropriate level of denormalization. There are two main purposes to using the Denormalization Survival Guide:
SummaryThis first article focused on the dangers of denormalization and introduced the Denormalization Survival Guide. The next article in this series will discuss the questions and answers that comprise the guide, including these questions when determining whether to denormalize a relationship in a data mart:
Go to Current Issue | Go to Issue Archive Recent articles by Steve Hoberman
Steve Hoberman -
Steve Hoberman is a world-recognized innovator and thought-leader in the field of data modeling. He has worked as a business intelligence and data management practitioner and trainer since 1990. Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!) and is a popular, frequent presenter at industry conferences, both nationally and internationally. Steve is a columnist and frequent contributor to industry publications, as well as the author of Data Modeler’s Workbench and Data Modeling Made Simple. He is the founder of the Design Challenges group and inventor of the Data Model Scorecard™. Please visit his website www.stevehoberman.com to learn more about his training and consulting services, and to sign up for his Design Challenges! He can be reached at me@stevehoberman.com. |