|
SELECT * FROM Celko – December 2008
A Rose by Any Other Name
Published: December 1, 2008 Joe Celko discusses the confusion that arises when various databases and other data sources don't quite agree on definitions.
A rose by any other name would be a pain to find in the system’s data dictionary. But even worse is when the various databases and other data sources don’t quite agree on exactly what a rose is, even when they use the same name. As an example of what I mean, consider a data warehousing project for a major airlines that was attempted a decade or two ago (yes, I am being deliberately vague). It was a classic data warehouse situation. Disjointed, uncoordinated silos of data had been created all over the company, and it was time to try to bring them together into a data warehouse that would give a consistent, single view of the enterprise. Good people were trying to do the right thing. Since this was an airline, the two most important things they deal with are passengers and freight. Freight was a secondary business and largely handled by outside companies. About all the company did with freight was to announce that it had a certain number of cubic meters and a total weight available for use on a given flight. The nice thing about freight is that it comes in a container. It stays there and does not talk; it does not worry about seating arrangements, meals, drinks or upgrades. A stewardess friend of mine in the 1970s remarked that she wanted to have Gidney and Cloyd’s Scrooch Gun from the ROCKY & BULLWINKLE cartoon show. This was a ray gun that would turn people into statues for a fixed amount of time. The duration she had in mind was one plane trip. Turning passengers into cargo would have made life so much easier. You would assume that an airline would have a definition of a passenger. Wrong! They had about fifteen definitions. Each definition leads to different results in various reports and differences in how customer service is handled. Just run through a few cases that are fairly common and relatively simple:
Suddenly, there was a revelation. A person and a “passenger” are almost – but not quite – the same thing! The example I used was ISO sex codes. When you start a project, you think that gender comes in {male = 1, female = 2} as the only options. Since you have not done business with a multi-sexual Martian, you think you are done. Then you hire Alex Mack (note for geeks: Google “The Secret World of Alex Mack” and the ambiguous name) and suddenly you need {0 = unknown, 9 = lawful person} to handle ignorance and corporations. Some data elements are simple because they are well defined. They are industry standards with narrow scope and a committee behind them. But the really basic data elements are harder to define because they are so basic. Is a rose not always a rose? Go to Current Issue | Go to Issue Archive Recent articles by Joe Celko
Joe Celko -
Joe is an Independent SQL and RDBMS Expert. He joined the ANSI X3H2 Database Standards Committee in 1987 and helped write the ANSI/ISO SQL-89 and SQL-92 standards. He is one of the top SQL experts in the world, writing over 700 articles primarily on SQL and database topics in the computer trade and academic press. The author of six books on databases and SQL, Joe also contributes his time as a speaker and instructor at universities, trade conferences and local user groups. Joe is now an independent contractor based in the Austin, TX area. |