|
Database Architectures (Part 1)
Published: March 1, 1998 This two-part article will help readers have a better understanding of the various database architectures that are available for application development.
Introduction This two-part article will help readers have a better understanding of the various database architectures that are available for application development. Part one will contain the introduction and discuss the personal and workgroup database architectures while part two will contain information on enterprise and mainframe architectures. In addition, part two will include formal guidelines and checklists to help you select the correct architecture for a particular application. If you have any questions on selecting the correct database architecture for an application, please feel free to contact me. Choosing the correct architecture (hardware platform, operating system, database) is critical to the success of any new database application. This decision was simple when the mainframe was the only architecture available. But architecture selection is not as clear-cut now that enterprise, departmental and personal architectures as well as the mainframe are available. Application developers and end users have more hardware and software choices available to them than ever before. The key to success is choosing the right ones. This document does not favor one architecture over another, it's intent is to help readers choose the architecture tier that best fits an individual application's requirements. The final architecture decision should not be made in a vacuum or based on personal preference. It is important for all large IT shops to create a part-time architecture selection team consisting of members drawn from the following areas: LAN/WAN communications, O/S support, operations, architectures and database administration. This team should be responsible for determining the best architecture for a particular application. Document Contents The major focus of this two-part article is database architecture tiers. It contains an in-depth description of each of the four database architecture tiers (mainframe, enterprise, departmental, personal) followed by several pages of database architecture tier selection criteria. The information regarding each tier will be broken up into the following categories:
Attachments The following attachments are included at the end of this document and will be included in part two of the article as well:
TPC Benchmarks I make numerous references to the TPC (Transaction Processing Council) and TPC benchmarks. The TPC is a non-profit corporation founded to define transaction processing and database benchmarks and to disseminate objective, verifiable TPC performance data to the industry. Many consumers rely upon the TPC to provide them with unbiased performance information. For more information on the TPC and TPC-C and TPC-D benchmarks, please refer to Attachment 3 in part two of this article. The Selection Process Architecture selection is the most important decision that is made during the application development process. A correctly chosen architecture will allow the application to perform to expectations, have the desired functionality and be easily monitored and administered. Incorrect architecture decisions may cause one or more of the following problems to occur: poor performance, limited functionality, high total cost of ownership, complex administration and tuning, lack of scalability, poor vendor support, poor reliability/availability and so on. Extensive up-front analysis is critical during the architecture selection process. The architecture tiers comparison table and associated worksheet contained in part two of this article can be used to determine what information must be collected and evaluated before the correct architecture can be chosen. The following questions must also be answered during architecture evaluation:
The Importance of the Database The database engine is taking on a much more strategic role in most organizations. It provides the mechanism to store physical data along with business rules and executable business logic. The database vendor also provides the communication programs for both client and server. The entire application environment (data storage, business rule enforcement, application program storage, communication, system monitoring) can now be controlled by the database. Over time, the database engine will store more information related to the understanding of the business, the meaning of the data stored (meta data) and the mechanisms to control and track versions of the database, access programs and related software. Database Environments When selecting a database product for a particular application, the hardware platform and operating system the database executes upon must also be taken into consideration. The operating system, hardware platform, organizational issues and database product can be collectively called the database environment.In order to correctly choose which database tier is the best architecture for a given application, we must compare database environments and not just individual database products. It doesn't matter how many processors the hardware platform has if the operating system or database product can't support the hardware's parallelization provisions. We must also understand that with so many different vendor products available in each individual tier, correct product selection within the tier is also critical. Choosing the correct hardware platform, operating system and database combination is crucial to the success of any application, regardless of the tier chosen. Database Architecture Tier Overview Database processing environments can be divided into four database architecture tiers (see figure below). The figure provides a few examples of hardware platforms, operating systems and databases that could be used in each of the four tiers.
Although the concepts of operating system/hardware tiers have been loosely defined for some time, the popularity of non-mainframe databases has over time, formalized and solidified these concepts until they have become a concrete set of enterprise computing architectures. For many years, the only two architectures that existed were mainframe and non-mainframe (with the mainframe being the only architecture that was truly formalized and structured). The concept of multi-tier architectures began to evolve as the popularity of client/server, non-mainframe systems began to increase. The majority of these client/server applications used relational database management systems, not flat file architectures as their data storage mechanism. If it were not for the popularity of these database products, non-mainframe architecture tiers would still be a loosely defined, unstructured group of hardware vendors, operating systems and third party tools and applications. Competition's Affect on Tiers
Competition is now beginning to make the architecture tier environment more fluid in nature (see figure above). A more competitive market arena forces all vendors to continuously adjust pricing and accelerate the release of new products as well as enhancements to existing products. Administration is currently a key factor when evaluating architectures. The higher the architecture is on the hierarchy tier chart, the more robust and mature the administrative environment becomes. But competition is forcing vendors to accelerate the development of administrative tool sets as well as the infrastructure software required to support distributed environments. Processing power was once a major factor in the architecture tier selection process. But CPU and raw I/O performance are increasing on all architecture tiers, relegating horsepower to a somewhat less prominent position in the evaluation criteria hierarchy. Consumers can expect the improving hardware performance trends to continue. PC-based personal and departmental systems are catching up with larger enterprise servers from Sequent, IBM, Sun and HP. Intel's multiple chip motherboards are allowing PC architectures to provide SMP capability without the extra cost of proprietary SMP technology. Enterprise vendors are also beginning to push their performance envelopes. Experiments with clusters (multiple hardware, O/S environments lashed together to provide a single system image) have shown mainframe level performance. Mainframes, no longer able to compete on horsepower and reliability alone, are becoming smaller, smarter, more cost effective and easier to use. They are offering more openness, more operating systems and easier connectivity. In the future, the major differences between the departmental and enterprise architectures will be primarily in peripherals, specialized operating system and database features (to justify the costs of the more expensive enterprise architectures) and the level of service and support provided by the vendors. Personal Tier Description Personal database environments are single user applications using a non-server PC workstation, a PC operating system and a personal database management system. Applications built using Microsoft Access do not adhere to the client/server architecture. There is no active database server component that is separate from the application component. The user must log on to the personal database to access the screens that interface with the database management system. Personal Oracle does adhere to the client/server architecture by allowing access to the database via ODBC. Personal databases are often chosen for "client/server" applications because of their low initial cost, ease of use, and user familiarity (developers and end users oftentimes have the same products installed on their home PCs). Trouble begins to occur when developers begin to use personal databases for multi-user OLTP applications, especially when users are not on the same LAN segment. These small, single-user personal database applications often grow to a point where: 1) they can no longer be managed in a relatively easy fashion or, 2) performance degrades to a point where response times become unacceptable. The owners of the database application have no alternative but to upgrade to a higher-level database tier. The departmental architecture tier (next tier) often becomes the logical choice due to its low cost, higher performance and ease of administration. Hardware
Operating System
Database Software
Benefits/Drawbacks
+ Familiarity (same hardware we use at home)
+ Easy to use, maintain and upgrade + Inexpensive - Not architected for high availability (limited redundant capabilities) - Not architected for high performance - Not architected for multiple, concurrent user access - Limited upgrade path - Poor vendor support
+ Familiarity (same software we use at home)
+ Inexpensive + Easy to use, maintain, upgrade and install third party software + Many single-user third party applications available - Not architected for high availability (limited redundant capabilities) - Not architected for high performance - Limited tools available (job scheduler, security packages, etc.) - Limited enterprise-wide administrative tool sets available - Limited multi-user third party applications available - Poor vendor support
+ Familiarity (same database software we use at home)
+ Inexpensive + Easy to use, maintain, upgrade and install third party software + Many single-user third party applications available (Access) + Tight integration with other PC software - Poor locking model and security structures (Access) - Not architected for high availability - Not architected for high performance - Limited enterprise-wide tool sets available - Limited multi-user third party applications available - Limited vendor support for all products - Limited central IT support for personal database products
+ Inexpensive
+ Familiar software and hardware (same environment we have at home) + Easy to use, maintain, upgrade and install third party applications + Limited bureaucracy (limited number of people, units to deal with) - Multiple implementations are complex to administer (multiple operating systems, hardware platforms and databases must be supported) - Limited disaster recovery options - Limited tools available (job scheduler, security packages, etc.) - Limited enterprise-wide administrative tool sets available - Vendor support may be a concern (Access), limited central IT support Typical Applications
Numerical Ratings (0 - least favorable, 10 - most favorable)
The PC environment is the least expensive of all architectures. Most PC product's initial price tag and support costs are very low making personal architectures an attractive choice for
application development. Although it may be the least expensive, it is not the most cost effective. The old adage "you get what you pay for" rings very true in this case. The low costs are
responsible for limited vendor support, fair reliability and availability, and limited database functionality.
PC hardware platform vendors focus their offerings to the single user market. As a result, PC hardware platforms offer a limited upgrade path. The hardware is not architected to provide high
performance for multiple concurrent users. But Intel is beginning to change the rules of the PC environment by marketing hardware to allow PC architectures to contain multiple chip configurations
and address larger amounts of memory. In the past, multiple chip architectures could only be provided by the hardware vendor's use of proprietary SMP technology (see previous paragraph on
Competition's Affect on Tiers).
PC operating systems are not architected as high performance multi-user operating systems. There are two methods of multi-tasking application programs: cooperative (event driven) or pre-emptive.
Pre-emptive multi-tasking divides the computer into virtual machines. Each virtual machine is unaware of the other virtual machines contained on the computer and thinks that it has its own video
board, floppy disk and so on. All programs running in the virtual machines are given a slice of the total CPU and memory, allowing them to execute concurrently. Conversely, cooperative operating
systems must wait for an event to switch from one program to another. The triggering events in Windows are mouse clicks or keystrokes. Windows 3.1 is event driven and not a pre-emptive
multi-tasking architecture. This means that multiple processes are unable to run concurrently on the Windows 3.1 operating system. Operating system resources are swapped between application
processes as required. Windows 95 & 97 are known to be marginally multi-tasking. Although Windows 95 and 97 are based on a pre-emptive multi-tasking architecture, there are many instances where
they are unable to concurrently process multiple application requests. NT Client, Netware and OS/2 are also based on pre-emptive multi-tasking architectures and as a result are able to execute
application processes concurrently. It is important to understand that although these operating systems do process multiple processes concurrently, they are not architected to manage more than a
handful of concurrent application processes.
Personal databases are low cost products geared towards single user applications. Locking and performance mechanisms are unsophisticated at best. Personal databases are easy to manage because their
architecture is simple and unsophisticated. Predicting personal database performance can be a problem because performance trending for these products is not linear in nature.
Personal database users can not assume that if their database processes 100,000 row in 5 minutes it will take 15 minutes to process 300,000. This unpredictability affects all aspects of performance
measurements - concurrent users, rows processed, number of tables joined, database size, etc.. Performance may be satisfactory until another set of users or another 10,000 rows is added to the
tables being accessed. Personal database servers are unable to take full advantage of multi-chip architectures as well as large memory allocations. As a result, adding hardware rarely solves
personal database performance problems.
Reliability and availability is also a concern for personal database environments. Personal hardware platforms, operating systems and databases are not architected for high availability. Although
100% system availability is not a requirement for the vast number of personal database users, low cost is. Vendors must balance total system cost with the costs required to provide a reasonable
amount of reliability.
Ease of use is the key to market share for products geared towards the personal database tier. The majority of these products will be used in the home by users that have no formal background or
education in application development. Vendors understand that the hardware platforms, operating systems and databases must be easy to use to be competitive.
Although a single PC database environment can be easily administered, there is a direct correlation between the number of personal database implementations and the complexity and cost of their
administration. As the number of implementations increases so does the cost and complexity of administering the environment as a whole. Multiple hardware platforms, operating systems and database
products must be monitored, tuned and maintained. Limited enterprise-wide tool sets do not allow database, operating system and hardware administrators to view and administer the PC database
architecture as a single entity.
Departmental Tier Description The departmental tier, also known as the workgroup tier, is the most recent tier to become available for use as an application development environment. Although the departmental tier is the newest of the four tiers, it has quickly become the most popular. Workgroup server databases were responsible for the majority of database product sales in 1997. This popularity is a result of two converging trends (both driven by competition) in the database hardware and software market:
This growth of departmental database servers will continue to be fed by the rapidly improving price/performance ratio of server hardware as well as falling departmental server software prices. Combined, these two factors will continue to produce a continuous downward shift in the cost of departmental information systems architectures. Database server pricing fell dramatically in late 1994 to early 1995, largely in response to Microsoft's success with SQL Server 4.2 for Windows NT. Microsoft SQL Server 4.2 was a full function relational database server with a very attractive price tag. The traditional database competitors (Oracle, Informix, Sybase, IBM) realized that in order to compete in the departmental arena, they must begin to market a scaled down (and cheaper) version of their enterprise databases. Some of the workgroup database server products currently available (Oracle's workgroup offering is one example) are simply the vendor's flagship enterprise database repackaged or repriced to appeal to a different group of customers. In addition, the increasing popularity of personal database servers (previous described tier) is also adding to the growth of departmental tier architectures. These small, single-user databases often grow to a point where: 1) they can no longer be managed in a relatively easy fashion or, 2) performance degrades to a point where response times become unacceptable. The owners of the database have no alternative but to upgrade to a higher level database architecture tier. The workgroup database server often becomes the logical choice due to its low cost, higher performance and ease of administration. Hardware
Operating System
Database Software
Benefits/Drawbacks
+ Easy to use, maintain and upgrade
+ Inexpensive + Architected for multiple user access + Moderate amount of high availability capabilities available - Limited upgrade path - Limited I/O throughput - performance can be a problem when multiple users request large result sets (lots of data) to be returned - Limited number of mature, enterprise-wide administrative tool sets available - Limited disaster recovery options available
+ Familiarity (same environment in use for desktop operating systems (NT) or for file and print servers (Netware))
+ Inexpensive + Easy to use, maintain, upgrade and install third party software + Architected for moderate to high performance + Moderate amount of high availability capabilities available - Performance must be closely monitored as number of concurrent users increases - Limited tools available (job scheduler, security packages, etc.) - Limited number of mature, enterprise-wide administrative tool sets available
+ Inexpensive (very competitive environment)
+ Can take advantage of low cost memory to easily increase performance + Easy to use, maintain, upgrade and install third party applications + Enterprise wide administrative tool sets are available (enterprise database managers) + Abundance of multi-user third party applications available - Advanced performance options (parallel query, load, bit mapped indexes, asynchronous read ahead) not available in most products - Advanced database options (spatial data management, video/audio storage and retrieval) not available in most products
+ Inexpensive
+ Familiar software and hardware (same environment in use for desktop operating systems (NT) or for file and print servers (Netware)) + Easy to use, maintain, upgrade and install third party applications - Multiple implementations are complex to administer (multiple operating systems, hardware platforms and databases must be supported) - Limited disaster recovery options - Limited enterprise-wide administrative tool sets available Typical Applications
Numerical Ratings (0 - least favorable, 10 - most favorable)
Low cost is a major advantage of the departmental architecture tier. Pricing for departmental hardware platforms, operating systems and databases are much lower than their enterprise and mainframe
counterparts. Consumers pay PC prices for CPUs, memory and DASD when base system upgrades are required. Microsoft's attempt to dominate this architecture tier with NT Server and SQLServer is
forcing all vendors to adjust pricing and accelerate the release of new products as well as enhancements to existing products. This high level of competition between departmental vendors allows
consumers to take advantage of a high level of functionality at a reasonable cost.
The departmental tier is geared towards applications that require more horsepower than the personal tier can provide, but don't need the processing power (or cost) of the larger architectures.
Cost is the primary advantage of departmental hardware servers. If the application requires more processing power, CPUs, memory and DASD can be purchased without making a major impact on the
application's operating budget. A major disadvantage of departmental hardware platforms is their somewhat limited I/O throughput capabilities. I/O throughput is not a concern for most OLTP
applications, but is a major concern for DSS that request large amounts of data to be returned. Benchmark tests I have performed have shown that although departmental server platforms can handle
small to medium sized OLTP applications, performance can become a problem when multiple processes request large result sets (high data volumes) to be returned. How much data is too much? It depends
on the database environment (hardware, operating system, database software) chosen for the application being benchmarked. Benchmark testing is the only fool-proof method of determining any database
environment's performance.
The three most popular operating systems for departmental servers are UNIX, NT and Netware. Based on performance test results from the TPC, UNIX narrowly leads NT and Netware in benchmark
performance on the majority of platforms tested.
Workgroup databases are most often just scaled down versions of the vendor's enterprise database product. As a result, workgroup databases are able to take advantage of multiple processor
configurations as well as address hundreds of megabytes of memory. The performance of the database is not restricted by the product itself, but by the hardware platform and operating system it
executes upon. The fastest database title moves from vendor to vendor due to strong vendor competition and fast paced advancements in hardware and operating systems.
Compared to their mainframe and enterprise hardware vendor counterparts, the majority of departmental hardware vendors have only recently begun to expend resources on providing highly available
architectures to consumers desiring fault tolerant systems for mission critical applications.
The reasons for this lack of focus are:
UNIX is known to be moderately to highly fault tolerant, but lacks market share with less than 18% of departmental servers utilizing UNIX as the operating system. Sun, the first true departmental
server vendor, was marketing highly available servers before the departmental server tier became popular with consumers. This maturity provides Sun with a distinct advantage over its competitors.
Sun is believed to have the most mature product set for fault tolerant departmental servers and as a result leads all other vendors in sales of highly available departmental servers.
Departmental server ease of use is totally dependent on the operating system selected. Departmental servers can be as easy to administer as personal architectures or as complicated to administer as
their enterprise counterparts. The NT Server operating system is more easily administered than UNIX and Netware. NT Server uses the same type of interface as the NT Client operating systems. NT
administrators are able to perform a large percentage of their administrative duties through a familiar point and click GUI interface. NT operating system manuals use familiar PC terminology when
discussing concepts of general administration. Some flavors of UNIX have GUI administrative tools, but the majority of administrative duties are still performed via command line interface using
rather cryptic operating system commands (grep, awk, ps -ef, cat). Although UNIX is more complex to administer compared to NT, it is also known to be a reliable and high performance operating
system. Netware also uses somewhat cryptic commands and is administered by attaching to the server by an administrative interface on a separate platform.
Database vendors realize that workgroup servers oftentimes exist outside of the traditional information processing support framework. As a result, general ease of use is not a competitive advantage
but a competitive requirement. Departmental database products can be characterized by their easy installation, ease of use, limited tuning options, and good documentation. All vendors offer strong
visual administration and monitoring tools that do not require the administration skill and intimate knowledge of the database required by their enterprise and mainframe counterparts.
Conclusion In part one, I have described the personal and workgroup architecture tiers. In part two, I will conclude by describing the enterprise and mainframe architecture tiers. In addition, part two will include formal guidelines and checklists to help you select the correct architecture for a particular application. Attachment 1 - Architecture Tiers Comparison Table The table below can be used to quickly compare the different architecture tiers. Each of the architectures are graded from 1 to 10 with 10 being the most favorable grade. A 0 rating means that the architecture is unable to provide that criteria. The importance of a particular criteria depends on the application being evaluated. For example, is having the ability to use a transaction monitor important to a departmental application that has 60 concurrent users? Probably not. Is it important to an application that has hundreds of concurrent users? Probably so. Each criteria must be judged on a one by one basis for the application being evaluated. Rank the criteria that is important to the success of the application higher than others during evaluation.
Attachment 2 - Database Architecture Worksheet This worksheet can be used to help determine the optimal architecture for a given application. The final decision of the architecture should not be done in a vacuum or based on personal preference. As I stated previously, the safest and easiest way for any application to choose the correct environment is enlist the help of an Application Architectures Team. This part time team should contain individuals who have the expertise necessary to choose the correct architecture for any given application. A : is placed under each of the architectures that generally meets the requirement listed on the left. A blank means that although this architecture may not be the architecture that best meets the requirement, it should still be considered as a viable alternative. A h placed under the architecture means that if the requirement being evaluated is truly important to the application, that architecture must no longer be considered as a viable alternative and should be removed from the evaluation process. Add the : that are important to the success of the application to get a general rankings of the different architectures.
Attachment 3 - TPC (Transaction Processing Council) Benchmarks This document contains many references to the TPC (Transaction Processing Council) when discussing database architecture tier performance. The TPC is a non-profit corporation founded to define transaction processing and database benchmarks and to disseminate objective, verifiable TPC performance data to the industry. Throughput, in TPC terms, is a measure of maximum sustained system performance. In TPC-C, throughput is defined as how many new-order transactions per minute a system generates while the system is executing four other transactions types (payment, order-status, delivery, stock-level). In TPC-D, throughput is defined as how many decision support transactions per minute a system generates while processing other decision support requests. In general, TPC benchmarks are system-wide benchmarks, encompassing almost all cost dimensions of an entire system environment the user might purchase, including terminals, communications equipment, software (transaction monitors and database software), computer system or host, backup storage, and three years maintenance cost. Performance or price/performance may be more important, depending on application requirements. If the application environment demands very high, mission-critical performance, then you may must give more weight to the TPC's throughput metric. Generally, the best TPC results combine high throughput with low price/performance. TPC-C and TPC-D results for many popular platforms are available from the Transaction Processing Council's home page of current TPC-C and TPC-D performance figures (http://www.tpc.org). (TPC-COUNCIL) Attachment 4 - List of Acronyms
CICS - Customer Information Control System CMOS - Complimentary Metal Oxide Semiconductor CPU - Central Processing Unit DASD - Direct Access Storage Device DSS - Decision Support System GIG - Gigabyte GUI - Graphical User Interface I/O - Input/Output LAN - Local Area Network MEG - Megabyte MHZ - Megahertz ODBC - Open Database Connectivity OLTP - Online Transaction Processing O/S - Operating System PC - Personal Computer RAM - Random Access Memory RDBMS - Relational Database Management System RISC - Reduced Instruction Set Computing SMP - Symmetric Multi-Processing TPC - Transaction Processing Council TPC-C - Transaction Processing Council transaction processing benchmark TPC-D - Transaction Processing Council decision support benchmark WAN - Wide Area Network Go to Current Issue | Go to Issue Archive Recent articles by Chris Foot
Chris Foot - Chris Foot is currently working for Contemporary Technologies as a certified Oracle trainer and remote database administration consultant. He has worked as a database administrator and distributed
technology strategist for the Mellon Bank corporation and was the Senior Database and Server Architect for Alcoa. Chris has written several articles for Database Programming and Design, The Data
Administration Newsletter and Data Management Review. Chris has also worked part-time for Platinum Technology as a client/server courseware creator and certification instructor. In addition, Chris
has presented several times at the International DB2 Users Group, International Oracle Users Group and the Open Technology Forum.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||