TDAN: The Data Administration Newsletter, Since 1997

THE DATA ADMINISTRATION NEWSLETTER – TDAN.com
ROBERT S. SEINER – PUBLISHER


Architecture Made Easy, Part 11
Data Governance: The New Philosophy of Data Governance

by Sean Kimball, James Luisi
Published: September 1, 2010
Jim Luisi and Sean Kimball look at the benefits of having an appropriate balance of architecture and governance.
Architecture is the first step toward organizing and simplifying anything complex, and governance is essential for managing and controlling either development or change. While it is obvious that we need architecture and governance, which architecture and what form of governance is not at all obvious, and the wrong combination can result in failure.

Determining the optimal architecture and the ideal form of governance is not easy. If it were, everyone would be doing it and experiencing the many benefits. But exactly what are those benefits?

The Benefits of Balance

When the appropriate balance of architecture and governance is achieved, the business is rewarded with a competitive advantage through:
  • Effective business solutions
  • Self-service wherever possible
  • Rapid time frames and agility
  • Reduced complexity
  • No threat of regulatory violations
  • Lower expenses and lifetime cost

For example, when Application Architecture establishes a framework for error handling within the major types of applications, the resulting efficiencies recur, since error handling represents approximately one-third of the overall design and implementation effort.
For many of these benefits to be achieved, however, the business must have an advocate in the architecture and governance process who places business needs above all others. As we shall learn, without business representation, the interests of the various IT, regulatory, and compliance stakeholders will never get properly addressed.

Moreover, if transparency of data and its associated metadata is not at the core of the solution, you need to start over completely: all the councils and committees in the world will do nothing but harm your ability to conduct business, and your regulatory violations will never subside, never mind cease. To understand why, let’s take a quick look at where it all started.

Brief History of Data Governance

Before it was ever called “data governance,” much of the industry had relatively good data governance. Early on, data was centralized on the mainframe, often organized into tidy subject areas such as customer, vendor, and contract master files. Data was generally secure, relatively clean and accurate, and business users often kept paper copies of their data to do with as they wanted or needed.

For example, securities traders and brokers kept their own customer and contact lists, transaction journals, and portfolio positions, because their livelihood fully depended upon that information. If they lost their job, got a better offer down the road, or if the computer lost their data, the prepared business user would survive.

With data volumes rapidly expanding, mid-range computers for departmental computing emerged to co-exist with mainframes that housed the books and records of the enterprise.

However, for the first time it became easier for business users to have their own electronic copies of their customer and contact lists, transaction journals, and portfolio positions redundantly on departmental computers so that they could rapidly meet their needs in a more localized environment. Copies of master files proliferated across the computing landscape as data governance entered the dark ages of information architecture. As many IT departments tried to lock down uncontrolled access to data, the situation continued to run out of control, only to get worse with the advent of the personal computer and Windows servers.

With the advent of desktop computers and servers that could fit under desks or in closets, the business continued to do what it needed to do, including copying production data to portable thumb drives for safekeeping. Business users found the most reliable data they could from among the myriad of redundant data sources and continued to conduct business as best they could.

History shows that no matter how much IT may attempt to lock down data access across any of the three tiers of computer platforms, the business will continue to get around every measure to meet their needs for data and information. Data governance councils and architectural committees are merely momentary obstacles as business users find their way around their controls. And for the record, thank goodness they do, because the moment the business users are kept away from their own data, the company invariably will go out of business.

The New Philosophy of Data Governance

When we finally realize that the ingenuity of business users in overcoming any obstacle placed between them and their data is almost without limit, and that if IT could successfully separate business users from their data it would be fatal to the business, a viable solution becomes possible.

The first principle of the new data governance philosophy is that IT must do everything it can to empower the business users to achieve relatively unfettered access to reliable data in a self-serve model.

Yes, IT power brokers, middlemen, and control freaks will go nuts, and there will be a hundred excuses for every suggestion that might be helpful to support this principle; but as difficult as it may be for some to accept, embracing the new philosophy is the only way to give IT the opportunity of meeting the myriad of regulatory rules.

That said, empowering business users does not mean that we should hand over the keys to the kingdom including all of the applications, compilers and databases. In fact, that would not empower the business at all. Instead, according to this new philosophy, empowering business users means putting an infrastructure in place that makes it easy for business users to get to the most reliable data conveniently on their own. Many within IT will claim that this is simply not possible, and they will most assuredly point to the complex tools and data landscape as proof.

Necessity is the Mother of Invention

There are those rare moments when IT can demonstrate a level of ingenuity that rivals that of business users. As one would hope when presenting IT with a complex challenge, someone among those expensive, highly experienced, and educated IT professionals should be able to come up with something, even if it means looking at what their kids are doing in their Internet browsers. One such example is “mashups.”

Mashups1 are combinations of media that can be created simply by dragging and dropping two things together with the mouse. A mashup is a type of web application that uses content from more than one source to create something new that can be displayed or listened to.

The term “mashup” originally comes from pop music, where users seamlessly combine music from one song with the vocal track from another to create something new that was mashed together. This does not mean that users will be able to set their production data to their favorite tune, but they will be able to associate it with their favorite form for visually illustrating it.

Now you may ask what mashups have to do with empowering business users to self-serve their own ad hoc reporting needs, never mind data governance, and the answer is surprisingly simple.

Mashups and Data Governance

Business mashups built from data across the enterprise are easy to define and manage by simply dragging a “data mashup” onto a “widget.”

The component referred to as the “data mashup” is a query that renders a particular set of fields available for use in an ad hoc report. The component referred to as a “widget” is simply a prefabricated form to provide data visualization, such as a list, chart, bar graph, pie chart, or map.

When a business user combines a data mashup with a widget, they link fields of interest from the data mashup to the fields of the widget that are used to create a display and the result is called a mashup. They are easy enough to create that our kids assemble them on their laptops, smart phones, and iPads.
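The pairing described above can be sketched in a few lines of code. This is a toy model, not any vendor's API: the class names, fields, and sample data are all illustrative, assuming a data mashup exposes named fields and a widget declares the fields it needs.

```python
# A sketch of the mashup model described above: a "data mashup" exposes
# fields from a query, a "widget" declares the fields it needs, and the
# bindings link one to the other. All names here are illustrative.

class DataMashup:
    """A query result exposing a fixed set of named fields."""
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows                              # list of dicts, one per record
        self.fields = set(rows[0]) if rows else set()

class Widget:
    """A prefabricated display form (list, chart, bar graph, map, ...)."""
    def __init__(self, kind, required_fields):
        self.kind = kind
        self.required_fields = required_fields

def build_mashup(data, widget, bindings):
    """Link data-mashup fields to widget fields and project the rows."""
    missing = [w for w in widget.required_fields if w not in bindings]
    if missing:
        raise ValueError(f"unbound widget fields: {missing}")
    # Project each record into the shape the widget expects to display.
    return [{w: row[bindings[w]] for w in widget.required_fields}
            for row in data.rows]

rates = DataMashup("mortgage_rates", [
    {"as_of": "2010-08-02", "libor_3m": 0.40, "jumbo_30y": 5.42},
    {"as_of": "2010-08-09", "libor_3m": 0.39, "jumbo_30y": 5.38},
])
chart = Widget("line_chart", ["x", "y"])
series = build_mashup(rates, chart, {"x": "as_of", "y": "jumbo_30y"})
print(series[0])   # {'x': '2010-08-02', 'y': 5.42}
```

The point of the sketch is that the business user only supplies the bindings (the drag-and-drop step); the query and the display form are prefabricated by IT.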

Under the new data governance philosophy, the role of IT is to facilitate and encourage mashup creation by business users while non-intrusively managing the selection of the reliable data sources, minimizing the replication of data, and providing data security, privacy, and reliable response time performance.

Business Approach to Data Governance

With the proliferation of data across a wide array of mid-range UNIX and Windows-based servers, not to mention personal computers, the sheer number of databases and database columns has become staggering. Many large companies have thousands of databases, and tens of thousands of MS Access databases and spreadsheets.

Any approach that starts from the many columns across the data landscape, such as reverse engineering each database, is almost certain to fail simply because of the volume of database columns that must be manually analyzed to determine each field’s business meaning and, depending upon the sparseness and quality of the data content, the business value it represents to business users.

An easier approach, one that gives an organization a fighting chance, is to begin with the fewest number of parts from a top-down perspective. This means starting from a logical data architecture2, which identifies data subject areas, and conceptual data models that visually illustrate the business data glossary in a meaningful and useful way.

As an alternative, one may also begin with a simple business data glossary that collects a basic set of “business” metadata about each business field, which is much more useful than what one often finds inside a CASE tool.

For example, a typical CASE tool data dictionary will feature a field name (e.g., coupon rate), an abbreviation (e.g., CPN_RT), and a field description that is often without any value to anyone (e.g., A coupon rate is the rate of a coupon.). In comparison, a proper business data glossary would likely include the following business metadata:
  • Business data glossary field name
  • Trustee for the data glossary field
  • Business definition
  • Synonyms by business area
  • Labels used in applications, forms and reports
  • Purpose of data field
  • Business processes that originate the data content
  • Business processes that maintain the data content
  • Business area(s) and business processes that read, report or analyze the data content
  • Related data glossary fields
  • If original, the source of the content, if calculated, the derivation
  • Basic business data class (e.g., date, amount, quantity, rate) and required level of precision
  • Business data field content owner
  • Whether the data lineage includes the general ledger (GL) and/or external organizations
  • Data glossary field sensitivity, such as confidential for customer privacy or for the enterprise
  • Logical data architecture data subject area associated with the business data glossary field
  • Mapping to external standards, such as ACORD or MISMO
  • Regulatory or compliance rules that are deemed applicable
  • Complete or sample business values or domain values
In a business data glossary, the business definition is intended to communicate business value (e.g., The coupon rate of a security – a bond, note, or other fixed income security – is the amount of interest paid by the issuer per year, expressed as a percentage of the principal amount, or face value, of the security, and is usually paid twice per year.)
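One way to make such a glossary concrete is to model each entry as a small record structure. The sketch below uses Python dataclasses; the attribute names are a hypothetical subset of the business metadata listed above, not a standard schema, and the coupon-rate values are illustrative.

```python
# A sketch of one business data glossary entry carrying a subset of the
# business metadata described above. Field names are illustrative; a
# real glossary would live in a shared, searchable repository.

from dataclasses import dataclass, field

@dataclass
class GlossaryEntry:
    field_name: str
    trustee: str
    business_definition: str
    synonyms: dict = field(default_factory=dict)        # by business area
    data_class: str = ""                                # date, amount, quantity, rate
    subject_area: str = ""                              # from the logical data architecture
    sensitivity: str = "internal"                       # e.g., confidential
    originating_processes: list = field(default_factory=list)
    sample_values: list = field(default_factory=list)

coupon_rate = GlossaryEntry(
    field_name="Coupon Rate",
    trustee="Fixed Income Operations",
    business_definition=(
        "The amount of interest paid by the issuer per year, expressed as "
        "a percentage of the principal (face value) of the security."),
    synonyms={"Trading": "CPN_RT", "Accounting": "Stated Rate"},
    data_class="rate",
    subject_area="Security Master",
    originating_processes=["New Issue Setup"],
    sample_values=[4.25, 5.00, 6.125],
)
print(coupon_rate.field_name)
```

Even this small structure captures far more business meaning than the typical CASE tool dictionary entry of a name, an abbreviation, and a circular description.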

When managed properly as a business data glossary, the business data glossary field names can guide business users to the data mashups that meet their business need, or at a minimum it can help business users to communicate exactly what the business needs are to someone in IT who is helping them establish the necessary data mashups.

Leveraging Enterprise Standards

At a more detailed level, there is a wide array of specialized data related disciplines, such as: risk management; data stewardship; regulatory compliance; data quality; data security; test data generation; data encryption; data masking; data accessibility; master data management; data virtualization; taxonomy; ontology; data archival; data administration; data abstraction; data normalization and de-normalization; logical and physical data architectures; current state architecture; future state architecture; data frameworks; data blueprints; reference architectures; terminal services architectures; conceptual, logical and physical data models; enterprise service bus architecture; ETL architecture; XML; various types of files and databases; access methods; referential integrity; operational data stores; data warehouses; data marts; star schemas and snowflakes; data mining; data visualization; data analytics; statistical and non-statistical data analysis; and business intelligence.

Governed by the standards from the enterprise, these disciplines help put the appropriate processes and standards in place to ensure the appropriate use, care, protection, and accessibility of data.

Often there is an enterprise information architect who is responsible for developing a well-formed set of frameworks, standards, and data architectures that are useful across the enterprise.

These enterprise artifacts may then be adopted, and often further detailed, by each line of business to facilitate integration across the various lines of business. Once adopted, capabilities such as business intelligence, data analysis, and data visualization involving cross-area business data become attainable with much less effort and expense than would otherwise be possible.

A common approach for promoting the artifacts of the enterprise is to place responsibility on the enterprise information architect to sell their ideas, as opposed to imposing them. The enterprise information architect then often assists the line of business to adapt the artifacts to meet their business priorities.

Top Down from Business and Enterprise

While data governance artifacts, such as frameworks, standards, and logical data architectures flow top-down from the enterprise, the creation of the business data glossary also flows down from the business.

The subject-matter experts in the business, sometimes referred to as “data stewards,” advocate for their business users within their area of expertise. As advocates for the business, data stewards help business users get in touch with data that can provide them valuable insight into the books and records of the business and its operational workflows.

At the lowest level, data stewards take control of business data glossary field names, ensuring that they are clear and unambiguous among the many other data fields that exist in that area of the business. In addition, information discovered by a data steward during his or her research should be easily recorded and readily accessible to the business users they advocate for.

Data Governance Portal

A data governance portal is the window into the data assets of each business area across the enterprise. It contains the data that is present within the business and the results of the ongoing research that is conducted by the data steward.

Imagine a Google-like interface that business users could use to search for business data glossary field entries, mashup reports, and mashup components that they could easily assemble or use directly from within the data governance portal.3

As an example, if someone were looking for a data mashup for ad hoc reporting that compares LIBOR to thirty-year fixed jumbo rates, the list of data mashups involving those business data glossary fields would be displayed along with their business definitions and associated business metadata.
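A minimal sketch of that lookup, assuming a simple inverted index from glossary field names to the data mashups that expose them; the mashup names and fields below are invented for illustration:

```python
# A toy sketch of the Google-like portal search described above: index
# glossary field names against the data mashups that expose them, then
# look mashups up by business term. All names and data are illustrative.

def build_index(mashups):
    """Map each glossary field name (lowercased) to its mashups."""
    index = {}
    for mashup_name, fields in mashups.items():
        for f in fields:
            index.setdefault(f.lower(), []).append(mashup_name)
    return index

def search(index, term):
    """Return the mashups exposing a given glossary field, sorted."""
    return sorted(index.get(term.lower(), []))

mashups = {
    "Rate Comparison": ["LIBOR", "Thirty-Year Fixed Jumbo Rate"],
    "Funding Dashboard": ["LIBOR", "Fed Funds Rate"],
}
index = build_index(mashups)
print(search(index, "LIBOR"))   # ['Funding Dashboard', 'Rate Comparison']
```

A real portal would of course search definitions, synonyms, and other business metadata as well, but the principle is the same: the glossary field name is the key that leads business users to the mashups that serve their need.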

As such, the data governance portal would provide business users with an integrated self-service ad hoc reporting capability that would simultaneously allow IT to support the regulatory compliance needs of the corporation.

In the event that a business user requires data not already supported by existing data mashups, the data governance portal could route the inquiry to the data steward responsible for that area of data. Once researched and created, the incremental information would then be available thereafter on the portal to business users simply for the asking.


Implementation Tips

When architecting a data governance portal and mashup capability, you will probably need to leverage data virtualization to help stay aligned with your logical data architecture.

In the most detailed representation of your logical data architecture, each subject area of data is expanded into a conceptual data model that business users and IT staff can easily relate to.

Even after achieving this ease of understanding, a number of challenges are likely to remain, particularly when the current environment is laden with data quality issues.

In such a situation, there is a need for a good data virtualization4 strategy. A suitable data virtualization strategy will save large amounts of rework within mashups and other forms of reporting, as it can centrally address data scrubbing, value standardization, and consistent formats across the data landscape.
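To illustrate the point that scrubbing and value standardization can be handled once, centrally, here is a minimal sketch of a virtualized “view” in Python. The source rows, code mappings, and function names are illustrative, not drawn from any data virtualization product.

```python
# A sketch of central scrubbing in a virtualization layer: every mashup
# or report reads the virtual view, so standardization logic lives in
# exactly one place. Mappings and sample data are illustrative.

STATE_CODES = {"new york": "NY", "n.y.": "NY", "ny": "NY"}

def standardize(row):
    """Scrub one raw record into the canonical form consumers see."""
    clean = dict(row)
    clean["state"] = STATE_CODES.get(row["state"].strip().lower(),
                                     row["state"].strip().upper())
    clean["rate"] = round(float(row["rate"]), 4)   # consistent precision
    return clean

def virtual_view(raw_rows):
    """The 'view' every report reads; scrubbing happens in one place."""
    return [standardize(r) for r in raw_rows]

raw = [{"state": " n.y. ", "rate": "5.375001"},
       {"state": "NY",     "rate": "5.25"}]
print(virtual_view(raw))
# [{'state': 'NY', 'rate': 5.375}, {'state': 'NY', 'rate': 5.25}]
```

Because the scrubbing sits between the physical sources and the consumers, fixing a standardization rule fixes every mashup at once, rather than requiring rework in each report.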

To make matters even more challenging, let’s say that the goal of your organization is to migrate to a future state environment.

What is particularly useful is that a good data virtualization approach can render data in a form that business users can more readily relate to from the perspective of your business data glossary and/or logical data architecture. The logical data architecture, in fact, should be an illustration of your future state data architecture.

As you build an inventory of mashups, data virtualization also protects your investment by insulating that growing inventory of data mashups and other reports from the migration of your data landscape toward your future data architecture.

Perhaps most important to your ability to meet the needs of your business and your regulatory requirements is an efficient process supporting this new philosophy of data governance, as it can go a long way toward eliminating the desire to develop MS Access applications and Excel spreadsheets for ad hoc reporting needs across the business community.

Summary

Proper data governance should eliminate layers of process, not add them.

When optimized, the business and IT community alike should have two basic capabilities:
  • It must be quick and easy to find what data exists across the data landscape (e.g., a data governance portal)
  • Business users must be able to self-serve their ad hoc reporting needs with the proper controls already built in (e.g., data mashups)

Depending upon the extent to which your industry is regulated and what the plausible financial damages and fines could be, not to mention the potential damage to brand reputation, operating either without data governance or with the wrong form of data governance may be construed as irresponsible. Without the right data governance, the interests of the various stakeholders across the enterprise and regulatory bodies will not be represented.

Data governance is optimized when the cycle of use and data improvement is a self-sustaining repeatable process that openly renders the information owned by the business back to them on demand with self-service.

Empowering business users with increased access to their data will help to remove the motivations that contribute to back-door solutions that place the firm at risk.

In addition to some IT ingenuity, starting and maintaining this cycle will require the application of personnel, tools, and processes to monitor, track, and fix data issues, but as data quality issues are addressed so that they do not recur, the ability of the organization to make informed decisions predicated on sound information will unleash the competitive capacity of the company.

In contrast, while operating without data governance may not be critically evident to executive management, the absence of data governance may be the best recipe to be slowly overtaken by competitors that find ways to empower their business users with useful data from across the enterprise. Turning the tide on reporting complexity and reducing infrastructure costs to compete better in the marketplace is quickly becoming the path to business survival.

Please don’t hesitate to let me know which articles in the “Architecture Made Easy” series are useful to your organization. In addition, corrections, enhancements, and suggestions are always welcome.

[Data Governance: The New Philosophy of Data Governance is the 11th article in the AME Series created by James V. Luisi.]

End Notes:

  1. Mashups are an emerging technology, also referred to as mash-up tools, in which individuals can easily locate the data they need to view (referred to as mashable assets) and drag and drop it onto the display object of their choice (referred to as visualization widgets), such as a list, chart, map, graph, or report.

     Data usage is then monitored with usage reports and made secure with the use of filters and standard database security. However, in any large conglomerate the single biggest challenge facing any member of the business or IT community is finding what information exists across the line of business, where it is, and who can help create the mashable assets for it.

  3. A Data Governance Portal and Mashup suite may be seamlessly integrated by using a software combination like WebSphere and IBM Mashup Center.
  4. An example of a good data virtualization technology is Composite Software.

     






Sean Kimball - Sean has twenty-five years of experience within the largest financial conglomerates. He was formerly the Chief Enterprise Architect for a large U.S.-based financial conglomerate and is one of the most innovative executives speaking at select industry conferences.
James Luisi - Jim has thirty years of experience in information architecture, architecture, and governance within control and information systems in the financial and defense industries; more information is available on LinkedIn.com, and he welcomes connections there. Jim is an author, a conference speaker, and an enterprise architect at a large financial conglomerate in the New York area.