Data Wise Perspective: We've Been Out Of Focus
Published: June 1, 1998
Engaging in a data warehousing implementation is fraught with a variety of missteps and unexpected surprises - many of which are the direct result of "advice" from the field. There are a lot of "experts" who leave trails of conflicting directives. There are opinions, there are testimonials, and there is evidence that would never be admissible in a courtroom (mostly from vendors). All the while, there are dedicated technologists busy in the background (often overnight) desperately trying to keep things running.
There is a rumor that there is a 50% shortage in the industry of viable candidates to execute a data warehousing project. There seems to be an even greater shortage of consistent advice. Pick a seemingly respectable industry resource; any resource. Let's take a look at what's available from an industry analyst perspective.
One predominant provider of industry research "adjusts" its findings to fit the structure of its organization. It has different analysts, in different groups, to cover what I would consider the breadth of data warehousing. It has analysts who are supposedly aligned to data warehousing, but there is a separate group of analysts who cover something they refer to as business intelligence (has this reached an oxymoronic state yet?). Analysts from the two camps don't see eye-to-eye on a number of issues. Rather than bring their divergent views out in the "open," where we could learn from them, they conveniently ignore the things they disagree on.
This "condition" is repeated throughout the data warehousing industry, reflecting a very deep-seated problem. Everyone seems to be caught up in "turf" protection. There is a lot of posturing to protect one's designation of "expert" - a very conditional title that goes into decay the moment it is conferred on someone.
With all of this position taking, you'd think we'd see more positions being taken to make clear distinctions between industry terms. Though used synonymously in many instances, a data warehouse (an architecture designed for the collection and distribution of informational data) is quite distinct from data warehousing (the related industry and practice of delivering informational data infrastructures, of which a data warehouse is a sub-component). Only recently have I seen a trend toward making clear distinctions (and they are significant) between a data warehouse and a data mart.*
In emerging industries the early mantra seems to be: "Got a problem? Call an expert." Physicist Werner Heisenberg made an observation that I find significant: "Many people will tell you that an expert is someone who knows a great deal about his subject. To this I would object that one can never know much about any subject. I would much prefer the following definition: an expert is someone who knows some of the worst mistakes that can be made in his subject, and how to avoid them."
I propose that we have a great deal to learn from physicists like Heisenberg and many other great "thinkers" in other fields of research. It turns out that although we are part of a relatively "infant" industry, even "senior" disciplines are finding new reasons to rethink some of their basic premises. Thank goodness we haven't even really reached agreement on ours; perhaps what we finally come up with will be even better.
The very basics of our industry focus on linear thinking (if; then). In the recent past we decided that separating the data from the process would provide additional flexibility. Object orientation decided to recombine the data with the process (I've never really understood the rationale there). Then we became obsessed with meta data. There's a limitation with all of these approaches: they focus on the parts rather than the whole. There's a term for the tendency to disassemble everything into parts: reductionism.
Our "parts-envy" past has not prevented us from making significant progress. The entire evolution of information science has depended on it; the future of it cannot.
We disassembled the data from the process to increase flexibility. What we failed to realize is that we were faced with managing a balancing act between order and chaos. We moved so far away from the "restrictive" order provided by "hard coded" applications that we landed straight into chaos. We tried to return some order to the chaos by focusing on corporate data models and naming standards, with limited success. We took ill-designed parts and broke them up into smaller pieces somehow thinking that reassembling them would provide usable results. And we continue to struggle to find the right glue.
A number of newer scientific theories have been challenging the basic principles that most "modern science" had been based on. It's not so much that Newton was wrong, it's just that his observations are now found only to apply to very limited circumstances. Most of the natural world does not behave according to the gospel of Newton.
Newton's theories were based on linear thinking: absolutes. The limitations of this approach are most obvious when considered within the field of economics. The problem with (and the reason for the failure of) most of our economic theories, postulates, and oft-wrong estimations is that they are based on linear models. Applying an algorithm to make an estimation assumes that the whole world stands still for that very moment in time and never changes. A world without change? Reality doesn't agree with the assumption: "...the pancreas replaces most of its cells every twenty-four hours, the stomach lining every three days; our white blood cells are renewed in ten days and 98 percent of the protein in the brain is turned over in less than one month." [from The Turning Point - Science, Society, and the Rising Culture, by Fritjof Capra, 1981, www.amazon.com]
Our disciplines of analysis have taught us to identify and isolate problems in order to solve them. By doing this, what we often focus on is the symptom of the problem and not the problem itself. The disciplines of western medicine face a similar challenge. The doctors we typically seek medical attention from have been trained to isolate problems and fix them. This methodology works especially well for emergency medical treatment, when lives are threatened. It is not so successful in identifying systemic conditions which collectively may be causing a particular failure. The disciplines of eastern medicine are more appropriate for evaluating issues with the systemic whole. So we seek medical treatment from individuals who have been trained to respond well to certain conditions. Any problem with that? Only that less than 5% of the need for medical attention is specific to life-threatening situations. Shades of Newton...
In our own problem-solving situations, we often complain that we seem to be wasting so much time fighting fires and being "stuck" in reactive mode - we've been duped. We really haven't been trained to do anything else. We build solutions which exacerbate the situation because they don't solve anything; they merely keep the patient alive.
There are a number of research fields which are attempting to expand our solutions view. They all recognize one another's work and sometimes overlap in purpose. Such efforts are classified under a number of distinct topics (these are but a few): complexity theory (including complex adaptive systems), systems theory, and self-organization (see, respectively, http://pespmc1.vub.ac.be/CAS.html, http://pespmc1.vub.ac.be/SYSTHEOR.html, http://pespmc1.vub.ac.be/SELFORG.html).
There's a lot of research to be done to understand just how all of this can be applied to our industry, but there's a lot of evidence to support the need for it. "The great shock of twentieth-century science has been that systems cannot be understood by analysis. The properties of the parts are not intrinsic properties but can be understood only within the context of the larger whole. Thus the relationship between the parts and the whole has been reversed....the essential properties of an organism, or living system, are properties of the whole, which none of the parts have. They arise from the interactions and relationships among the parts." [The Web of Life, by Fritjof Capra, 1996, www.amazon.com]
It's a Matter of Balance
What was previously believed to be random is now understood to have order. Just because we can't always see the patterns of order which bind systems together doesn't mean that they don't exist. It also means that we don't have to impose visible means of structure in order to maintain order.
Perhaps it is just that we've misunderstood what order is all about. "...we have created trouble for ourselves in organizations by confusing control with order. If organizations are machines, control makes sense. If organizations are process structure, then seeking to impose control through permanent structure is suicide. 'In life, the issue is not control, but dynamic connectedness.'" [Leadership and the New Science - Learning about Organization from an Orderly Universe, by Margaret J. Wheatley, 1994, www.amazon.com]
The binding factor which provides any semblance of order in most of our organizations can be directly attributed to the people who make them up: not the products, not the services, and certainly not the written policies and procedures which attempt to direct them. Yet we seek continuously to control the abilities of our most valuable resources and don't allow them to optimize both their potential and the potential of the organization.
Where and how does optimization occur? In the powerful and tumultuous space of the "in between". When surfing, holding back too far behind the wave provides less than optimal results (controlled order); trying to over-capitalize on the crest of the wave slams the rider onto the ocean floor (chaos). Optimization comes in the balance between the two (complexity). The strength is in the middle. Fabric is made strong by weaving the warp (vertical) and the weft (horizontal) together. Medicine will truly be useful when the disciplines of western medicine can be combined with the insights and practices of eastern medicine. An over-focus on successes, without allowing for an equal number of failures, misses the opportunities to be gained from the middle: creativity.
So we're obsessed with parts and pieces. We thought we had the world of science licked when we discovered molecules and atoms; then we discovered protons, electrons and neutrons - quantum theory and subatomic particles. When we got done disassembling the universe to what would have seemed the smallest "piece" possible, we discovered that it wasn't the part that was important at all: "Molecules and atoms - the structures described by quantum physics - consist of components. However, these components, the subatomic particles, cannot be understood as isolated entities but must be defined through their interrelations. In the words of Henry Stapp, 'An elementary particle is not an independently existing unanalyzable entity. It is, in essence, a set of relationships that reach outward to other things.'" [The Web of Life]
So we pulled the data out from the code. We've moved it out of the optimized, keyed structures necessary to support operational transactions. We've put it into warehouses. What have we accomplished? We're still focused on managing raw materials, not finished products.
This is the era of customization. Why can't the customization of finished products be placed in the hands of the customer? Not everyone is a craftsman; not everyone wants to learn to be one. Not everyone knows the rules of assembly; no set of rules can adequately cover all of the possible combinations.
Look up a word in the dictionary. How many times is there just one definition? What do repositories ask for? A single definition. We've disassembled reality and then we want to reassemble it with some made-up sense of reality.
We talk about insisting on open systems, yet everything we create is closed. Where are the feedback loops? Where is the context layer? Why can't the participants decide the definition? Why can't definitions be adaptive, able to change as conditions and needs change? Where would we be if the substance of our brain did not regenerate itself once a month?
And if the most important thing is the relationship between the parts, where the hell is some function called a relationship manager?
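To make the questions above a little more concrete, here is one way such a thing might begin to look. This is only a sketch under my own assumptions, not a description of any existing repository product: the names Term, define, meaning and RelationshipManager are all hypothetical. It shows a term that carries a definition per context, lets any participant revise a definition while keeping the earlier ones, and keeps the relationships between terms as things in their own right.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical business term that can mean different things in different contexts."""
    name: str
    # context name -> list of definitions, newest last (older ones are kept)
    definitions: dict = field(default_factory=lambda: defaultdict(list))

    def define(self, context, text):
        """Feedback loop: a participant adds or revises the definition for a context."""
        self.definitions[context].append(text)

    def meaning(self, context):
        """Return the current definition for a context, or None if there is none."""
        history = self.definitions.get(context)
        return history[-1] if history else None


class RelationshipManager:
    """Hypothetical registry that stores the relationships between terms."""

    def __init__(self):
        self._links = defaultdict(set)

    def relate(self, source, kind, target):
        """Record that `source` is connected to `target` by a named relationship."""
        self._links[source].add((kind, target))

    def related(self, source):
        """List (relationship, target) pairs for a term, in sorted order."""
        return sorted(self._links[source])


# Usage: one word, two contexts, and a definition that changes over time.
customer = Term("customer")
customer.define("billing", "an account holder with an open invoice")
customer.define("marketing", "anyone who has received an offer")
customer.define("billing", "an account holder, whether or not invoiced")

manager = RelationshipManager()
manager.relate("customer", "places", "order")
manager.relate("customer", "holds", "account")
```

The point of the sketch is only that nothing here forces a single definition: the context is part of the lookup, the history of revisions stays available, and the relationships live outside any one term.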
Paula Thornton - Most recently serving as the Information Architect for warehouseMCI - http://www.dbpd.com/9712Grim.htm, Paula Thornton works to act as an "industry facilitator," directed by an irreverent but respectful attitude toward progress in the industry.