|
Comparing Replication Technologies
Published: September 1, 1999 There are many database replication tools available on the market today, but not all of them are created equal, and how each of them might fit with your application will vary greatly.
There are many database replication tools available on the market today, but not all of them are created equal, and how each of them might fit with your application will vary greatly. How to compare these database replication tools often becomes more of a guessing game, especially to those who are new to the field. This technical paper describes the principal features of database replication tools which a DBA or applications architect should take note of when comparing replication offerings. Cross-Platform Support The most often requested information about a database replication tool is what database platforms it supports: is it limited to a particular vendor's database (eg. Microsoft, Oracle, Informix, Sybase), or does it allow cross-platform replication, eg. between Microsoft SQL Server and Oracle Enterprise Server? The ability to replicate across multiple platforms allows an organization to maximize its investment in its existing database installations, instead of having to spend additional money on standardizing on a single platform. Not only can the cost of this standardization be prohibitive, but it can also mean purchasing new versions of applications which are database-dependent, or worse yet it can mean rewriting the organization's existing applications to allow them to work with the new standard database platform. Most database vendors are not interested in providing your organization with maximum flexibility when it comes to your purchasing decisions; they want to sell you the most of their own software as they possibly can. It is in their interest to restrict your organization's use of competitors' databases, so their replication technology is naturally going to require you to use only their own databases. While some of these vendors now claim to provide cross-platform support, the fine print shows that such cross-platform capability is restricted in how data can flow, usually in just one direction (in or out) of a competitor's database but not both, or with data type, performance, or other limitations. In particular, beware "heterogeneous" replication tools that are based on using gateways, because this method always restricts capabilities and performance. Database platform restriction incurs the additional cost of preventing your organization from deploying the most appropriate database platform in a given situation if it is not from the same vendor. A database vendor independent replication technology is likelier to provide your organization with the flexibility it needs, whether today or in the future, and allows your organization to limit deployment costs. Replication Speed and Efficiency Another commonly asked question is that of throughput: how much data can the replication software move between sites, and how quickly / efficiently will it do so? In order to best answer this question, one has to examine several factors, including the underlying technology used by the replication software. Replication Methods First-generation replication technology involves what are variously called "change capture" or "store and forward", and "log based" methods. These techniques require that the changes to data be captured by the replication tool, stored locally at each site, and forwarded to other sites at replication time. These techniques suffer from a number of drawbacks:
Modern (or "second-generation") replication technologies involve the use of "direct to the database" methods which examine only the net data changes that have taken place since the last replication cycle. These replication tools have the following advantages:
Database Schema Changes Most replication tools available today require modifications to an organization's database schema, usually to include an extra column in each table to be replicated. While this can easily be accommodated for new applications being developed, it can mean a time consuming or expensive reengineering of an existing application to which replication capability is being added. Additionally, if your organization is attempting to add replication to an application provided by a 3rd party vendor (eg. BAAN, SAP), the vendor may not allow schema changes and may no longer support your organization if it went ahead with those database alterations. Modern replication tools do not require any changes to existing tables, allowing your organization to integrate replication capability as a true "plug-in" solution with minimal or no reengineering, and to add replication to a 3rd party vendor's application without loss of support. Ownership of Data First-generation replication tools require that each piece of data to be replicated have a designated owner site (also called "publisher" or "master" site), from which other sites (called "subscriber" or "slave" sites) obtain their updates. If a user at a subscriber or slave site wishes to modify a given data record, the update request must be sent to the publisher or master site, where the change is made to the specified record, but it is only at the next replication cycle that the change is propagated to the subscriber or slave site. Not only can this delay can be significant, making users at remote sites keenly aware of their status as a secondary site, but it also requires the application to manage these central site data changes differently from local changes. In the case of existing applications in which replication is being retrofitted, this differentiation between local and remote changes often requires significant reengineering of the application. Modern replication tools implement the "update-anywhere" (also called "true peer to peer") replication paradigm, which allows all sites to insert, update, and delete records (unless restricted by administrators). This provides local users with the same high performance as if they were using a non-replicated (or single site) system locally at head office, eliminating the "second-class user" syndrome. Data updates take place at local speeds, and with modern tools' continuous trickle replication the updates can be rapidly distributed across all sites. The flexibility of modern update-anywhere replication tools removes design time restrictions on which users can perform what operations where, and any unnecessary distinctions between users at central versus remote sites. This also means that the addition of this type of replication capability to an existing application will require less development effort and will be completed much more quickly and at lower overall cost. Supported Network Topologies Another commonly asked question about replication tools is what network topologies they support: eg. hub-and-spoke, server-to-server, or some combination of the two. Hube and spoke topologies consist of a central server (the hub) which replicates with all other sites (the spokes). Server-to-server topologies allow replication for common tasks such as mirroring for backup or redundancy purposes. As we will see, it is not common for a replication tool to support both these kinds of topologies. Most first-generation replication tools support either one or the other, but not both, because of the limitations imposed by their change capture- or queue file-based underlying technology. Only modern replication tools allow both workstation and server replication in any combination, with automatic replication failover if some sites are temporarily or permanently unavailable, and automatic replication load balancing with dynamic configuration as new sites are added to ensure optimal replication. Data Transmission Efficiency and Security As outlined above, first-generation replication tools make use of inefficient queue files which contain all the changes made to a given data record. Transmitting all these change records requires significantly more network bandwidth (orders of magnitude greater, in some cases, depending on the application and on how regularly replication takes place) than that used by "direct to database" replication tools which transmit only the net changes to a given record. Besides offering efficient net-change operation, including at finer- than-record granularity, modern "direct to database" replication tools also automatically compress data transmissions (typical business data volumes are reduced by a factor of 4 to 1) and encrypt them using industry-standard Triple-DES security to allow safe replication even across public communications networks such as the Internet or wireless connections. The efficiency of these modern tools allows for replication across network connections that could otherwise be unusably slow, including dial-up modem or cellular connections. In some parts of the world, the networking infrastructure of a geographical region precludes the use of first-generation replication tools. Replication Over Public Networks and the Internet A question that comes up is whether or not a given tool will allow for replication across public networks, the Internet being mentioned the most often of these types of networks. Some organizations require the redundancy of being able to use their own internal network as well as the Internet in order to provide for failover capability. The most common concern of these organizations is the vulnerability of their data as it moves across the Internet. Most replication tools vendors do not include encryption as an embedded component of their offering, leaving it to the implementor to provide encryption as an add-on, or sometimes providing non-standard encryption technologies that leave the data transmission vulnerable to sophisticated hackers. There is nothing worse than the unfounded belief that your organization's data is secure when in fact it is falling into the hands of your competitors. Since most organizations implementing replication are distributing their most precious commodity (business data), it is in their interest to obtain the most secure transmissions channels available, which newer replication tools provide using Triple-DES, government-approved standard encryption. Another concern related to Internet is whether or not the replication tool can communicate through firewalls, which are necessary to protect the organization's sites from intrusion. Most replication tools are not Internet standards and leave the organization with the uncomfortable choice of removing their firewalls or abandoning replication completely. Enterprise-class replication tools can operate securely through firewalls because they are registered as Internet standards and are associated with official Internet port numbers which can be opened selectively by network administrators at each replicated site. This allows replication to take place through firewalls yet leaves the firewalls in place to protect the organization's vital systems. Manual vs. Automatic Data Routing Another important consideration is how the replication tool routes data to sites which need to receive updates; that is, how each site determines the path to be taken by its locally stored data changes to other replicated sites. Most first-generation replication tools require statically defined master/slave or publish/subscribe relationships, and support only fixed hub-and-spoke or server-to-server topologies. In such an environment, the manner in which each site connects to others is fixed at installation time or requires manual administration. This includes which pairs of sites will connect with each other, and in which order. Changing this routing, for example when adding a new replicated site or to compensate for a shift in the relative data volumes at the replicate sites, requires manual administrator intervention. Until this manual intervention is performed, overall systemwide replication performance will be degraded, manifesting either as a general slowdown in the distribution of data changes, or as system downtime at the heavily loaded sites. If queue files overflow at the hub site, remote databases are typically not recoverable and will need to be recreated manually from the master site's image, causing data loss (losing any updates made since the previous replication). In comparison, modern peer-to-peer replication tools dynamically determine the optimal route to take in order to distribute data changes to all interested sites using the most efficient paths possible. This flexibility provided by dynamic routing allows administrators to add new sites to the system without any manual intervention: at the next replication cycle, these new sites negotiate with existing ones and automatically add themselves to the network, allowing existing sites to take advantage of the additional connectivity provided by the new site. A byproduct of this dynamic routing capability is that load balancing becomes an inherent property of the replication network. For example, in the case of a hub-and-spoke topology, instead of becoming depending on the single point of failure represented by the hub site, an administrator can drop a new site next to the existing hub site, and at the next replication cycle these two hub sites will automatically divide the connections to the spoke sites between themselves, with no need for manual reconfiguration of any routing data. Since these replication tools also take into account overall network throughput to each site, and CPU performance at each site, the distribution of work across all sites is maximally efficient. As mentioned above, organizations needing failover network capability can exploit this dynamic routing capability since these newer replication tools also detect the presence of available network connections to each site, and choose the most appropriate connection (ie. highest available bandwidth and/or least busy partner) route to each site. Granularity and Collision Handling Many first-generation replication tools provide "record level" change control, ie. they detect and handle data changes at the record level. Some first-generation replication tools go even further in providing "field level" change control, in that if they detect that a single field has changed in a given record, only that value is transmitted to other sites. Field-level change control may appear to increase replication efficiency by reducing transmission data volumes, but it usually also has the side effect of causing data corruption. This can occur when, for example, two users at different sites update a customer's address record simultaneously: one user modifies the city field to New York, and the other user modifies the ZIP code to reflect a Boston ZIP code. The result of permitting both changes, as field-level replication does, is corruption of the customer's address record. Modern replication tools allow an administrator to define which sets of fields in a record should be updated together. In the above example, the administrator would specify that in the customer table all the address fields (including street address, city, ZIP, and country) are defined to be a single "fragment". Changes to any of the fields in a given fragment are replicated together, guaranteeing consistency of data even if other collisions take place. Copyright (c) 1999, PeerDirect Inc. |