Issues in Automated E-Commerce and the Semantic Web
Published: January 1, 2004
Published in TDAN.com January 2004
Copyright 2003 Reengineering LLC
1. INTRODUCTION
Methods for requirements specification and for software engineering are widely studied because of their economic importance. For example, Agile Modeling [1] and Extreme Programming [2] are just two of many approaches. As organizations become increasingly interdependent via e-commerce and the web, such methods become even more important. EDI- and XML-based approaches [3], among other emerging standards, seek to ease the manual interfacing of software systems across organizational boundaries. There are strong economic reasons why systems should be designed to be self-integrating over the web. In distributed manufacturing, and in other e-commerce activities, there are new opportunities for automated matching of requirements to capabilities. The emerging second-generation Semantic Web is expected to play a key role in this kind of automation.
This paper looks at these issues as we move from single systems, to manually integrated systems, and towards systems that will be self-integrating. First, we look at some issues in developing single systems. Then, in Section 3, we look at the situation when several systems must be manually integrated. In Section 4, we mention some difficulties in applying a manual, standards-based approach to virtual organizations, and we look at an approach to systems that will self-integrate across organizational boundaries. Finally, we conclude with a suggestion that a new method for the specification and implementation of software systems is of interest for each of the situations we discussed.
2. ISSUES IN DEVELOPING SINGLE SYSTEMS
The Economist [4] has this to say about developing a software system:
A common cause of disaster in software development is that the end product is precisely what the customer ordered. In a world moving at Internet speed, a customer’s objectives are constantly being revised, so programmers have to be able to hit a moving target.
To look at this in another way, according to Standish [5], 51 percent of all [software] projects are over budget and/or late, another 15 percent of all projects fail altogether, and just 34 percent are completely successful. These numbers may provide some comforting context when an IT department has to justify a cost overrun—“Look, we may not be doing so well, but all these other folks are doing so much worse”. However, the situation plainly calls out for improvement. This picture illustrates some of the things that tend to happen in requirements gathering and software development.
Figure 1 Current Requirements Gathering and Software Development Methods
In the picture, business opportunities are lost because the application design cannot anticipate all future needs. Also, a business policy change stated in imprecise English must be translated into precise code, such as Java. To some extent, programmers must try to understand the business, and business people must try to understand programming. In my company, Reengineering, we work on a system called Internet Business Logic [6], with the aim of changing the situation pictured above to a more direct scenario like this.
Figure 2 A More Direct Scenario for Requirements Gathering and Software Development
We’ll return to this idea in talking about the Semantic Web in Section 4.
3. ISSUES IN INTEGRATING SYSTEMS MANUALLY
One of the problems in interfacing software systems across organizational boundaries is that organizations may have different names for items that are the same, or almost the same. Here is an example based on the paper “Semantic Resolution for E-Commerce” [7]. A retailer would like to order computers from manufacturers. In the retailer’s terminology, the kind of computer needed is called a PC for Gamers. A particular manufacturer makes a computer, and in the manufacturer’s catalog, it is called a Prof Desktop. At first sight, there is no match between the retailer’s requirement and the item that the manufacturer is offering. We can think of this as a “semantic distance”.
If the retailer uses a search engine, such as Google, to look for PCs for Gamers, the manufacturer of Prof Desktops will not be found. If the retailer asks for a more general search on, say, Computers, the manufacturer of Prof Desktops will only be found if it is listed under Computers, and there will be a very large number of results that are not relevant to the search for PCs for Gamers. In this scenario, the only hope for a match comes if the retailer searches for computer manufacturers in general. With luck, the retailer may then talk to the relevant manufacturer’s sales people, and it may emerge during the conversation that the PC for Gamers and the Prof Desktop match on many features. So, the conversation has negotiated away most of the semantic distance. The retailer and the manufacturer can then ask their respective IT people to make entries in the relevant databases so that a purchase order for a PC for Gamers is taken to mean that Prof Desktops should be supplied.
The above kind of scenario may work sometimes in simple cases with two or three organizations, but better ways of matching requirements to capabilities could yield significant economic advantages.
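The manual fix described above amounts to a hand-maintained alias table between the two companies' vocabularies. The following is a minimal sketch of that idea, not a description of any real system: the catalog contents, the mapping, and the function name `resolve_order_item` are all assumptions made for illustration.

```python
# Hypothetical manufacturer catalog, keyed by the manufacturer's own product names.
manufacturer_catalog = {"Prof Desktop": {"category": "Computers"}}

# Hand-maintained alias table, entered by IT staff after the sales conversation.
retailer_to_manufacturer = {"PC for Gamers": "Prof Desktop"}


def resolve_order_item(retailer_term):
    """Return the manufacturer's product name for a retailer term, or None."""
    # A direct name match fails when the two catalogs use different names.
    if retailer_term in manufacturer_catalog:
        return retailer_term
    # Otherwise fall back on the manually negotiated mapping.
    return retailer_to_manufacturer.get(retailer_term)


print(resolve_order_item("PC for Gamers"))
```

Every new pair of trading partners needs its own table of this kind, which is why the approach scales poorly beyond a handful of organizations.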
So far, we have looked at single systems, and at two systems that must be integrated, and we have seen that there are economically important issues in how the systems are specified and implemented. Next, we look at the situation when many systems must be dynamically integrated across organizational boundaries.
4. ISSUES IN SELF-INTEGRATING SYSTEMS
In manufacturing in particular, and in e-commerce in general, there is a growing interest in the notion of a virtual organization (VO), one that could knit together several component companies to meet a particular requirement. A VO might only exist for a few weeks, until it meets the requirement, and then be dissolved. A particular company might be a member of several VOs simultaneously. The manual method of integration described in the last section clearly falls short for this kind of VO. What is needed is a way for potential component companies to post information about their capabilities and capacities in a way that a VO matchmaker can search and then integrate. However, the issue of how things are named now becomes a likely show-stopper.
One approach to this issue is to say that we need a standard that spells out how each company should post information on its capabilities. But, as Steve Ray of the US National Institute of Standards and Technology (NIST) has pointed out [8], “[There is] an increase in the number of interconnections among information systems supporting the manufacturing supply chain as well as other businesses. Each of these interconnections must be carefully prescribed to ensure interoperability. However, the sheer number of interconnections and the resulting complexity threaten to overwhelm the ability of the standards community or industry to provide the necessary specifications - a way out of this impasse must be found.” In the paper, Steve Ray outlines the elements that must be developed so that systems can usefully self-integrate.
One of the elements listed is “A reasoning or inferencing capability within the communicating systems, to enable the systems to make judgments and draw conclusions about the meanings of terms.”
Over the past few years, there has been substantial research and development effort directed at implementing a successor to the web as we know it today. The successor is called the Semantic Web (SW), since it aims to add meaning to the existing web. At a basic level, the SW does this by adding a label to each link in the web. So, if the web has a link from PCs for Gamers to Computers, the SW will add a label with the information “is-a-kind-of” to the link. Then, the SW will also add reasoning and inferencing methods, so that if the SW has links “DroidBlaster500 is-a-kind-of PC-for-Gamers” and “PC-for-Gamers is-a-kind-of Computer”, then it can also reason to conclude that the DroidBlaster500 is-a-kind-of Computer.
The actual notations currently used in the SW are rather technical. As indicated at the bottom of Figure 3, one might use the Resource Description Framework (RDF) [9] to write down labeled links like the two just mentioned, and one might use the Web Ontology Language (OWL) [10] to draw the conclusion. Up to a point, this is fine if the RDF and OWL notations are just used by computers to communicate and reason with each other. However, this leads to a picture of the future in which networked computers create, run, and dissolve VOs without any oversight from business people. The issues of requirements specification and software engineering outlined in Sections 1 and 2 above threaten to overwhelm any effort at business control and auditability. We can picture the situation like this.
Figure 3 Semantic Disconnects between People and Machines
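The is-a-kind-of inference described above can be illustrated with a few lines of ordinary code. The sketch below is an assumption-laden stand-in, not RDF or OWL: it represents each labeled link as a plain (sub-kind, super-kind) pair and repeatedly applies transitivity until nothing new can be concluded. The function name `infer_kind_of` is hypothetical.

```python
def infer_kind_of(links):
    """Return every (sub, super) pair implied by transitivity of is-a-kind-of."""
    inferred = set(links)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow safely inside the loops.
        for a, b in list(inferred):
            for c, d in list(inferred):
                # If a is-a-kind-of b and b is-a-kind-of d, then a is-a-kind-of d.
                if b == c and (a, d) not in inferred:
                    inferred.add((a, d))
                    changed = True
    return inferred


# The two labeled links from the text.
links = {
    ("DroidBlaster500", "PC-for-Gamers"),
    ("PC-for-Gamers", "Computer"),
}

closure = infer_kind_of(links)
print(("DroidBlaster500", "Computer") in closure)
```

A real Semantic Web stack would state the links in RDF and leave the inference to a reasoner; the loop here only shows the shape of the conclusion being drawn.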
So, a Semantic Web based on notations that only computers and a few technical people can read and understand seems to lack a key component: one that would put business people and regulators in control. A candidate for a component to fill this gap is a system that we have been working on, called Internet Business Logic (IBL). One aim of our work on the IBL is to replace the picture in Figure 3 with the one in Figure 4.
Figure 4 Removing the Semantic Disconnects between People and Machines
The IBL system supports the writing and running of business rules in English, and for many purposes this can replace work in technical languages such as Java or OWL. The system can reason and make inferences, and it can automatically generate and run queries and transactions over networked relational (SQL) databases. So, in the scenario in Figure 4, business people have direct control, in English, over what kinds of reasoning and inference their networked computers will do. As shown in the Figure, the machines can supply English explanations of what they are doing, or even more importantly, of what they propose to do.
Of course, within a system like the IBL, there is a complex translation, in both directions, between the English rules that a person specifies and the technical notations that actually run in the machines. However, the translation is an encapsulated, invisible service that the system provides. So, when a business person wishes to change the inferencing, or to get an explanation of something that a VO proposes to do, there is no longer any need to call upon human programmers.
Let’s flesh out the example about a retailer and a manufacturer to get an idea of how this can work in practice. Recall that in the retailer’s terminology, a computer is called a PC for Gamers, while in the manufacturer’s terminology, it is called a Prof Desktop. Let’s say that the retailer and the manufacturer have each included in their requirements and capability statements that they are interested in something called Worksts/Desktops. The retailer also knows that a PC for Gamers belongs to the class of Worksts/Desktops, although the manufacturer does not know this. Similarly, the manufacturer knows that Prof Desktop belongs to the class of Worksts/Desktops, although the retailer does not know this.
So, we are dealing with three kinds of naming conventions, sometimes known as namespaces: one for the retailer, one for the manufacturer, and one that is shared in common between them. In the IBL system, we can write down the above information as two tables, each with an English heading, like this:
for the retailer the term PC for Gamers has super-class this-class in the this-ns namespace
==========================================================================
    Computers to order -- retailer
    Worksts/Desktops -- shared

for the manufacturer the term Prof Desktop has super-class this-class in the this-ns namespace
===========================================================================
    Worksts/Desktops -- shared
    Computer Systems -- manufacturer
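As a rough illustration of how these two tables can be related, the sketch below stores them as plain data and looks for a super-class that both sides place in the same namespace. This is not how the IBL system is implemented; the data layout and the function name `agreeing_terms` are assumptions made for the example.

```python
# Each table maps a term to its (super-class, namespace) rows, as in the tables above.
retailer_table = {
    "PC for Gamers": [("Computers to order", "retailer"),
                      ("Worksts/Desktops", "shared")],
}
manufacturer_table = {
    "Prof Desktop": [("Worksts/Desktops", "shared"),
                     ("Computer Systems", "manufacturer")],
}


def agreeing_terms(retailer, manufacturer):
    """List (retailer term, manufacturer term, super-class) triples that agree."""
    matches = []
    for r_term, r_rows in retailer.items():
        for m_term, m_rows in manufacturer.items():
            # Two terms agree when both have the same super-class
            # in the same namespace.
            common = set(r_rows) & set(m_rows)
            for super_class, _namespace in sorted(common):
                matches.append((r_term, m_term, super_class))
    return matches


for r_term, m_term, super_class in agreeing_terms(retailer_table, manufacturer_table):
    print(f"the retailer term {r_term} and the manufacturer term {m_term} "
          f"agree - they are of type {super_class}")
```

Matching on the namespace as well as the class name is a deliberate choice in this sketch: two companies could each have a private class with the same name but a different meaning, and only the shared namespace gives grounds for treating them as the same.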
Then, we can tell the IBL system how to reason so as to make a bridge between the retailer’s and manufacturer’s internal ways of naming things. We do this by writing a rule like this:
for the retailer the term some-item1 has super-class some-class in the some-ns namespace
The first two lines are premises, and the rule tells the system that if both premises can match up with things in the tables, then the system should reason to conclude that the last line also holds. When we run the system, it can conclude for us that
the retailer term PC for Gamers and the manufacturer term Prof Desktop agree - they are of type Worksts/Desktops
We can then add further rules and tables, so that the system can reason about the extent to which a Prof Desktop has a fast enough processor, sufficient memory, the right kind of graphics card, and so on. In fact, this whole example (and other related ones) can be viewed and run by pointing a browser to the Reengineering web site [11].
5. CONCLUSIONS
We described how some of the same issues occur in requirements gathering and software engineering. We said that the issues are progressively more important as we move towards virtual organizations that integrate a number of companies for a specific but temporary purpose. We argued that a move away from current software engineering techniques is needed to address the issues, and we suggested that direct specification, in the form of business rules in English, is a useful approach. Examples of the approach can be run live, online, including an example in which a retailer’s requirements are matched to a manufacturer’s capabilities.
6. REFERENCES
[1] www.agilemodeling.com
Dr. Adrian Walker -
Dr. Walker is the CTO of Reengineering LLC, a privately held company. His experience includes: