IT in Retail - Transformation & Globalisation
G. Herman, (c) 1997


IT-related diversification
Carrefour was founded in France by Marcel Fournier and Louis Defforey in 1960. The chain pioneered hypermarket retailing in Europe, opening its first hypermarket at Ste. Geneviève des Bois near Paris in 1963. A hypermarket is commonly defined as an out-of-town outlet retailing mixed food and non-food products from a sales area of 2,500 sq. metres or more, supported by extensive car-parking. Carrefour's first such store achieved sales of FFr 40 million in its first year of operation and the hypermarket format soon became typical.

Carrefour is now the largest retailer in Europe with annual sales in excess of FFr 100 billion (about £12 billion), and its success has been based, to an unusual extent, on foreign expansion and investment, and the introduction of own-brand goods. These two approaches were stimulated by the rapid growth of hypermarkets within France. By the mid-1970s, there were nearly 300 hypermarkets in France and over 2,500 supermarkets. Carrefour itself had 37 stores and faced aggressive competition on both price and location. In 1973, the French Government enacted the Loi Royer which restricted the opportunities for companies to open stores of over 1,000 or 1,500 sq. metres (depending on the intended location). Carrefour's response was threefold: it grew vertically by developing and marketing a comprehensive range of own-brand products under the advertising slogan 'Produits libres: aussi bon, moins chers' (which can be roughly translated: 'Freedom goods: just as good, even cheaper'); it grew horizontally by concentrating on the acquisition of, or partnership with, smaller stores; and it expanded abroad. Between 1969 and 1977, the company started joint ventures in Belgium, Switzerland, the UK, Italy, Spain, Brazil, Austria and Germany. By the end of the 1980s, Carrefour had divested itself of many of its foreign interests, while continuing to develop its outlets in three markets: Spain, Brazil and Argentina. In the late 1980s, it embarked on a number of ventures in the US and Taiwan, and while the Asian partnership has proved successful (with five stores trading in 1993), the failure of two wholly-owned American stores led to the company discontinuing these operations.

The company's fundamental approach has been the same in all regional or national markets: to combine low prices, wherever possible, with a specialist focus on all its product lines (so-called 'multispecialisation'), and managerial autonomy at the lowest possible level (typically, the individual store). In foreign territories, the company has displayed a critical awareness of cultural variables and the importance of good relationships with local suppliers.

Carrefour's greatest foreign success has been in Spain - which became the company's most profitable market in 1991. Its Spanish subsidiary, trading as Pryca ('precio y calidad' - price and quality), is now the largest food retailer in Spain - an achievement based on a strategy of regional autonomy, discounted pricing and strong branding. In Latin America, market volatility has undermined price discounting, but Carrefour retains a commitment to strong branding and locally focused management. Operations in Latin America are more reliant on sustained partnership arrangements with local suppliers than in France or Spain, while branding strategies have been tuned to local buying patterns and consumer preferences. Almost all growth in foreign territories has been self-financed.

Carrefour's use of IT has been evident in the emphasis the company has given to the development of financial services. A company trading across borders has an inevitable interest in treasury management, which - in Carrefour's case - has reinforced its commitment to medium-term investment, most frequently in innovative retailers. In 1984, Carrefour's Dutch investment subsidiary acquired an 18% stake in Costco, the growing US-based wholesale club retailer. The company has also invested in a chain of French DIY outlets, a US office equipment retailer, and a number of companies operating in different countries selling, variously, sports goods, frozen foods, discount shoes, furniture, household appliances, electrical goods and carpets. Managing this portfolio, and the various currency deals associated with it, has required a significant investment in IT and also (more importantly, perhaps) in financial expertise. As a result, financial services have become an increasingly significant component of Carrefour's activities - particularly important since in other respects the company remains committed to decentralised management.

In 1981, the company introduced its first in-store credit card - the Carte Pass - through a six-store pilot. By the end of the decade, some 700,000 cards were in circulation. Card services are managed by a Carrefour subsidiary, S2P, which was converted in 1985 into a joint venture with a leading French financial services company, Cetelem. Carrefour controls 60% of S2P, with Cetelem controlling 40%. Financial services expertise has allowed Carrefour to expand its range of offerings, which now includes personal loans and savings plans, and cash withdrawal using automated teller machines (ATMs). A wide area network has been established to handle in-store transactions.

The introduction of ATMs sets Carrefour's Carte Pass apart from other examples of in-store credit cards. The ATMs themselves can be used for a range of purposes - selling a variety of different financial services, as point-of-information displays, and as an advertising and promotions medium. With a large and diverse constituency, hypermarkets find it comparatively difficult to target consumers with any accuracy. The ATM can provide a painless means of extracting customer data to help in profiling consumers, positioning stores and mounting effective promotions.

Since introducing Carte Pass, Carrefour has extended the concept to Spain with the launch of the Pryca card in 1989 and the formation of a Spanish financial subsidiary in 1990. A Carrefour card has also been piloted in Brazil. The emphasis on investment and financial services is a strategy which can offset uncertainty in Carrefour's primary retail markets, but it does not provide a short-term solution. Disappointing results in 1992 preceded the unexpected departure of Carrefour's chairman of two years, Michel Bon, the first person to hold this post outside the families of the company's founders. In 15 years, the company had increased its asset value from a third to a half of annual sales; now it appears to be selling assets. The expansion of IT-related financial activities may help the company grow beyond what seem to be natural limits, but it will almost certainly be a very different company as a result.

Networks provide the infrastructure for data communications. Fundamentally, there are two kinds of network - the local area network (LAN) and the wide area network (WAN) - although you may often hear of campus networks or metropolitan area networks which are neither LAN nor WAN but somewhere in between. Networks are also either broadband or baseband. A broadband network uses carrier signals - radio waves, microwave, or audio tones on conventional telephone lines - while a baseband network directly encodes data onto the transmission medium. Baseband is commonly used in LANs and broadband in WANs, although the association is not absolute. The main advantage of broadband is that it allows a large number of signals to be transmitted simultaneously by a process known as multiplexing. This is why it is commonly used for WANs configured as backbone networks linking a number of different LANs.

LANs are typically restricted to a single building or a closely situated group of buildings, although they can be linked together by a variety of devices called bridges, routers or gateways. LANs are used to share resources or to allow different computers to send data to each other - for example, connecting back office PCs and workstations to each other, and to specialised machines such as laser printers, or database servers storing details of current stock. Another LAN might connect EPoS terminals with a computer holding price information and product codes. Typically, the EPoS terminals, like the back office PCs themselves, will be machines using Intel or Intel-type central processors running the MS-DOS operating system. Increasingly, they are linked on the network to one or more Unix servers - powerful small computers which may hold local product, transaction, staff and customer databases. These in turn will be linked, using EDI or directly over a WAN, to servers in other stores, a central mainframe, or other computer systems run by suppliers, distributors, or banks.

Structured Cabling
LAN infrastructures are based on well-established and, in the main, non-controversial technologies. The main issues are installation cost and expandability. In a large retailer, the realistic choices might be between Thicknet coaxial, shielded twisted pair (STP) or cheaper unshielded twisted pair (UTP) cabling. Increasingly, users introducing new premises wiring are opting for UTP-based structured cabling. This makes it easier to physically reconfigure a desk or store layout: the wiring on each floor is laid in a grid pattern with many connection points, and floors are connected via an optical fibre or high-speed copper backbone which could run, say, up a lift shaft.

Wireless LANs
In the future, structured cabling will undoubtedly be augmented or even replaced by wireless LANs, among whose important benefits will be to allow staff carrying handheld computers to receive or transmit data from anywhere in the store. In fact, many stores do this already - notable among them being Wal-Mart. This sort of facility simplifies the common practice of monitoring shelf-stock levels using portable bar-code scanners attached to a small computer and later uploading the data to the store's main system from a convenient connection point. Computers monitoring shelf-stock levels may be linked to warehouse systems to trigger shelf restocking as necessary.

Electronic Shelf-Edge Labelling
Electronic shelf-edge labelling, in turn, could use small radio devices, linked to microprocessor-controlled shelf-edge LED or LCD displays, to receive signals from a central computer setting prices. For most supermarkets, setting prices is still effectively a manual operation. It is done, for example, at the beginning of the week before the store opens, on receipt of information from central office. It takes between three and five hours to adjust pricing information in an average-sized supermarket store. Electronic shelf-edge labelling allows this process to be effectively instantaneous. By eliminating the manual element, it minimises the risk of error and ensures - where necessary - that prices are consistent throughout a chain. In the same way, networked shelf-edge labelling facilitates flexible promotional pricing, allowing price changes to be introduced for limited periods in order to move particular items or to target particular shopping times or days (along the lines of bar happy hours or differential ticket pricing for rail or air transport).
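The time-limited promotional pricing described above can be illustrated with a minimal Python sketch. It is not drawn from any actual shelf-edge labelling product: the `Promotion` record, the `display_price` function and the sample prices and dates are all hypothetical, and the sketch assumes only that a central computer holds a base price plus a list of promotional windows for each item.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Promotion:
    """A hypothetical time-limited price for one item."""
    start: datetime  # promotion begins (inclusive)
    end: datetime    # promotion ends (exclusive)
    price: float


def display_price(base_price: float, promotions: list[Promotion], now: datetime) -> float:
    """Return the price a shelf-edge label should show at the moment `now`."""
    for promo in promotions:
        if promo.start <= now < promo.end:
            return promo.price  # a promotion is currently active
    return base_price  # no active promotion, so show the normal price


# Illustrative data only: a two-hour 'happy hour' reduction on one item.
happy_hour = Promotion(
    start=datetime(1994, 3, 4, 17, 0),
    end=datetime(1994, 3, 4, 19, 0),
    price=1.49,
)
during = display_price(1.99, [happy_hour], datetime(1994, 3, 4, 18, 0))
after = display_price(1.99, [happy_hour], datetime(1994, 3, 4, 20, 0))
```

Because the label simply displays whatever the central system computes, the price reverts automatically when the window closes, with no manual relabelling.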

At present, electronic shelf-edge labels are too costly, at around $15 each, to be practical. Wiring costs would make even that look cheap. But, given lower-priced technology and wireless LANs, imagination is the only limit. Some commentators have even suggested that shopping trolleys might be fitted with miniature microcomputers to receive and display promotional messages while the shoppers push them round the store. Presumably, the microcomputers will be disposable.

Wireless LANs may use radio waves or an optical transmission medium, while line-of-sight microwave connections allow LANs in nearby premises to be bridged without the need for extra wiring. Linking stores to a regional, national or international centre requires reliable wide area connections. These are commonly based on existing telephone networks - analogue or digital - using circuit-switched lines dedicated to data communications, or private leased lines, both of which provide end-to-end connections like ordinary telephone calls. Organizations with heavy data communications requirements often use packet-switched connections, usually based on the X.25 standard, which offer more flexible and cheaper communications.

Packet switching takes advantage of the fact that in any reasonably complex network there will be many routes which a message can take to get from its source to its destination. Actual messages are separated into packets of a fixed length which are sent, separately, from network node to network node by the first route available, until they all reach their eventual destination and are reassembled in the correct order. Packet switching demonstrates an increasingly important feature of communications - that the physical route taken by a message (or, indeed, the medium which conveys it) should be invisible to the user. That said, packet switching is prone to error and delivery failures, while leased lines or even private cable links are expensive and inefficient. Retailers with a large number of stores are increasingly turning to satellite communications.
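The splitting and reassembly steps of packet switching described above can be sketched in a few lines of Python. This is an illustration of the principle only, not of X.25 itself: the packet size, the function names and the use of a sequence number with zero-byte padding are assumptions made for the example, and a random shuffle stands in for packets arriving by different routes.

```python
import random

PACKET_SIZE = 8  # hypothetical fixed payload length, in bytes


def split_into_packets(message: bytes, size: int = PACKET_SIZE) -> list[tuple[int, bytes]]:
    """Cut a message into fixed-length packets tagged with sequence numbers."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        # Pad the final chunk so that every packet has the same length.
        payload = message[start:start + size].ljust(size, b"\x00")
        packets.append((seq, payload))
    return packets


def reassemble(packets: list[tuple[int, bytes]], original_length: int) -> bytes:
    """Put packets back in sequence order and strip the padding."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    data = b"".join(payload for _, payload in ordered)
    return data[:original_length]


message = b"PRICE UPDATE: item 5012345 now 1.99"
packets = split_into_packets(message)
random.shuffle(packets)  # simulate packets arriving out of order
restored = reassemble(packets, len(message))
```

The receiver needs only the sequence numbers and the original length to rebuild the message, which is why the route taken by any individual packet can remain invisible to the user.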

V-SAT stands for 'Very Small Aperture Terminal', a satellite-based communications technology that offers numerous advantages over any other form of wide area networking when there is a need to communicate between a central location and a number of widely dispersed outlying sites. Terminals may be separated by hundreds of miles and communicate by sending radio signals to a geostationary satellite which shifts the frequency to that of a tuned receiver and re-transmits. Hughes Network Systems is the leading supplier of V-SAT systems and the best-known retail user to date is, again, Wal-Mart. In the UK, BT has recently won a highly publicised contract to supply the Argyll Group with what will be Europe's largest interactive satellite data network.

In the US, Wal-Mart has been using satellite linkage for over three years to provide a cost-effective and flexible means of facilitating data transmission between widely separated outlets and a central data centre. Without V-SAT it is unlikely that Wal-Mart could have so successfully combined growth on a national scale with effective management control. Wal-Mart's V-SAT application is typical, involving the movement of high volumes of data over large distances and the simultaneous transmission of data to multiple remote locations, to effect daily updating of inventory and pricing and the routine polling of EPoS terminals. Information collected by the terminals is communicated rapidly to head office, where it is stored and analysed, and any decisions taken as a result are communicated back down to the stores or on to suppliers and distribution centres. In the early 1990s, the Harvard Business Review awarded Wal-Mart's replenishment system the accolade of being a model for all organizations seeking to compete effectively in the coming decade.

In a one-way V-SAT network, signals are broadcast from a central site and each terminal can only receive. In two-way V-SAT networks, all nodes can receive and transmit. A single satellite may support up to 300 customer networks averaging 500 V-SATs each communicating at a rate that will support basic videoconferencing. The main limiting factor on data rate is antenna size - less than one metre across being preferred for remote terminals - although data compression techniques are enhancing V-SAT capabilities, while recent developments have seen the introduction of smaller terminals called U-SATs (ultra small aperture terminals) with comparable performance but up to 50% cheaper to install.

V-SATs offer several advantages over the more conventional dial-up modem-type links:

cost is unrelated to distance;
response time is much lower than for dial-up links;
V-SATs are more reliable than dial-up links and ensure an estimated 99.5% network availability; and
bandwidth is greater.
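To put the availability figure in the list above into perspective, the short Python calculation below converts a percentage into expected downtime. The function name is hypothetical, and the sketch assumes that availability is measured over a full 365-day year of round-the-clock operation, which the text does not specify.

```python
HOURS_PER_YEAR = 365 * 24  # assumes continuous, year-round operation


def expected_downtime_hours(availability: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours in the period during which the network is expected to be unavailable."""
    return (1.0 - availability) * period_hours


downtime = expected_downtime_hours(0.995)  # roughly 43.8 hours in a year
```

On these assumptions, 99.5% availability still leaves the equivalent of nearly two days a year without a link, which matters if stores depend on it for daily price and inventory updates.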

Recent developments promise efficient LAN-to-LAN internetworking using V-SATs as bridges, although only larger networks can really justify the initial outlay involved in setting up satellite links. In the US, the average size of a V-SAT network is around 100 terminals; in Europe it is about 50. Projections of the growth in the V-SAT market [Table XXX] suggest that the technology is due for a boom before the end of the century, although this view is counterbalanced by the belief that terrestrial networks will improve sufficiently to impinge on the demand for V-SAT.

Bar Code Readers and Scanners
Bar codes are a simple symbolic system for representing numerical values in a machine-readable form. The idea is that patterns of light and dark registering on a sensitive scanning device (often using laser light reflected on to a charge-coupled device - CCD - like those that form the photosensitive cells in video cameras) generate a code in the form of electrical pulses which can be interpreted as digits. There are a number of different coding systems: Code 39 (used in health, defence and the automotive industry) is the commonest overall, but the commonest in retailing is the Universal Product Code/European Article Numbering system, UPC/EAN, which is a subset of a universal product-coding system developed by the Article Numbering Association (ANA). Each code identifies a unique product and supplier and the bar codes themselves are easy to print out onto labels or any convenient surface.
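The text does not describe how the digits of a retail bar code are laid out. As an illustration of how a scanned number can be checked for reading errors, the following Python sketch assumes the standard 13-digit EAN format, in which the final digit is a check digit computed from the first twelve using alternating weights of 1 and 3. The function names are hypothetical.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit from the first twelve digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 decimal digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(digit) * (3 if index % 2 else 1)
                for index, digit in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Return True if a 13-digit code carries the correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


check = ean13_check_digit("400638133393")  # weighted sum is 89, so the check digit is 1
valid = is_valid_ean13("4006381333931")
```

A terminal that rejects codes failing this test can ask for the item to be scanned again rather than charging for the wrong product.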

Bar code readers are used by store staff to scan stock items on display or in a warehouse area. They come in a variety of forms: contact devices, also known as wands, which are actually brushed across the bar code; and non-contact devices using laser or infra-red beams. In a non-contact device the beam may be stationary, so that the device has to be moved over the bar code (readers), or moving, so that the device is merely pointed at the code (scanners). Scanners are best suited to high volume data capture.

Modern EPoS terminals often include readers which can register bar codes on articles passed across the beam of a stationary laser. The beam is split and directed in such a fashion that the scanner can determine and account for direction and speed at which an article is swiped.

In January 1994, an experimental product called the Supertag was demonstrated by the British Technology Group and a South African research and development organization, CSIR, causing some speculation that bar codes will be replaced by microprocessor-based radio tags which emit signals that can be picked up and interpreted by a receiver attached to an EPoS terminal or computer. Radio tags would obviate the need for scanning or swiping each item purchased, and this would inevitably save time at checkouts. They could also be used to detect attempts to remove goods without paying, but there are three problems with radio tags. First, every item in a shopping basket would have to be tagged (which means stores would have to introduce the costly procedure of weighing and bagging fresh fruit and vegetables, or other loose items, before the checkout). Secondly, the tags would have to be practically irremovable in order to prevent theft, or staff would have to check each item in a shopping basket visually. Thirdly, tags are expensive at around $1 each and, compared to bar codes, seem likely to remain so. They could be made reusable, but then they would have to be easy to remove.

The bar code is an elegant solution to a problem whose target was essentially productivity and accuracy. It has, however, become a key ingredient in the information chain within retail outlets, and continuous improvements are sought. The most promising development is in the bar coding itself, which may eventually be replaced by two-dimensional coding systems (known as 'symbologies'), which are rather like squashed bar-codes stacked up in rows. These stacked symbologies can code a great deal more information than is available using a one-dimensional bar code and may, therefore, help retailers collect more accurate information about customer preferences and shopping habits.

Multimedia, CD-Rom and Virtual Reality
The common definitions and descriptions of multimedia are as remarkable for their variety as for their hyperbole. They are conflicting, vague, overstated and often quite meaningless, but the imaginative power of the rhetoric is such that marketing efforts have been dissipated by confusion, and product development and promotional strategies have often been infected by internal dissent and abrupt changes of direction. Multimedia is an umbrella concept - but not all the technologies and applications described as multimedia can fit under the umbrella at any one time. Accordingly, the fashions in multimedia shift rapidly and ceaselessly, and multimedia has been one of the most oversold concepts in IT.

The first question about multimedia asked by most people is: 'What's the difference between multimedia and video or television - after all I can get text, sounds and images on my TV set?' The answer is superficially simple, but actually rather complicated: multimedia is digital and television and video are analogue. This is not the place to go into just what this means, except to say that we use the word multimedia to refer to text, sound and images which are created, controlled or reproduced by means of a computer. The UK Interactive Media in Retail Group, jointly managed by James Roper and the accountancy firm and consultants, Touche Ross, gives a broad definition of the concept in its 1994 report:

'Multimedia [is]... the convergence of many different information media and delivery platforms including telecommunications, computing, consumer electronics, publishing, television and video.'

In fact, the most important aspect of this convergence is not the combination of media but something called 'interactivity', which allows users to move from one combination to another in a programmatic sequence under their apparent control. Multimedia, as the expression is used today, has two defining characteristics: the combination of media and user control of sequence.

In a retail context, multimedia has a number of applications. It can be, and is, used for customer information, stock enquiries, in-store promotions, sales training, cataloguing, or distance selling. Point-of-information displays - rather like standalone automatic teller machines using keypad or touch-screen input - can be found in numerous locations. Airports, railway stations, tourist information offices and hotels in many countries now offer interactive guides to local facilities, route planning and even ticket sales. Increasing numbers of art galleries and museums make use of interactive multimedia guides, estate agents show property in this way, and some stores have begun to introduce electronic shopping units, mostly showing stock items that can't be displayed in-store - as with Pride & Joy, a subsidiary of the UK company Sears, or the Curry's cooker catalogue. The underlying technology varies, sometimes using analogue video images imported into a digital framework, sometimes using exclusively digital technologies. A popular feature of most of them is the ability to print some piece of useful information - financial options for customers, a price list or whatever.

Digital multimedia was first conceived of in the mid-1980s as a consumer product. The idea was essentially to create a sort of interactive television. CD or VCR type players would plug into a domestic TV and hi-fi unit and would allow users to amble round the Uffizi gallery, play golf with Jack Nicklaus, learn Japanese, listen to the latest Dire Straits album and see the words at the same time, or play adventure games with lifelike (or rather TV-like) visuals. Institutional, business or public access applications were considered the province of hybrid analogue/digital interactive video (IV), or relatively unsophisticated digital approaches to mixing text and graphics like computer-based training (CBT) and videotext (like Prestel, Oracle or Ceefax). For applications like customer information and sales training, videotext, basic graphics, and interactive video are convenient, trusted and cost-effective technologies. The promise of the future, however, is the use of multimedia as a sales tool - either for home-shopping or in public locations.

It is widely considered that home-shopping will see current printed mail-order catalogues give way to CD-Rom catalogues allowing text and images to be displayed on a computer screen. Given the low cost of producing and mailing CD-Roms, this seems inevitable as long as home-shoppers have access to some sort of player with which to view the catalogue. As yet, there are not enough CD-Rom players around to justify the publishing of CD-Rom catalogues except in a minority of high-value, probably professional markets. Large mail-order companies could take it upon themselves to supply CD-Rom readers to their customers and potential customers, but the costs are prohibitive and will remain so for some time.

Then there is the question of making the sale. Even using a CD-Rom catalogue, customers would have to make a conventional paper or telephone transaction to actually buy the goods. It would naturally be preferable if the purchasing transaction was somehow tied to the catalogue: this would be faster, more reliable and minimise errors in order-taking. A particular problem arises when customer information, enshrined in the pits of a CD-Rom, differs from the retailer's - for example, if a price change takes place between CD-Rom mail-outs, or if a customer uses an old CD-Rom to order from.

The ideal would be to transmit product information over a telecommunications link and it is certain that interactive multimedia will eventually feature real-time image and sound capture from network and broadband sources. At present, the technology is still largely inadequate, but it is improving. In the US, for example, Yamaha is working with cable companies to run a CD-based catalogue of motorcycles on their networks. The companies receive commission on sales, while customers are offered discounts and can use the interactive features of cable networks to book test drives. This technology is proprietary - using Philips CD-i (CD-interactive) - but in principle any suitable storage or transmission medium could be used. An experimental service in New York, called Cellular Vision, offers cable services like the Yamaha CD-i catalogue over a cellular phone network.

An obvious problem is the lack of international standards which could confine on-line multimedia to national or even smaller markets: CD-Rom, digital telephony, data networks, and cellular phone systems are all riven by competing suppliers, technologies and putative standards. The market is potentially too lucrative and the technology too expensive to imagine that competition will just disappear. The introduction of fast networking technologies using optical fibre will complicate matters further, stimulating the development of local multimedia applications, although stunted wide area services like videophone and video-conferencing are already beginning to appear using the relatively slow digital telephone service known as ISDN (integrated services digital network). The lure of global data communications networks is driving suppliers into technology camps: for example, MCI and BT, McCaw and Microsoft, and AT&T, Novell, France Telecom and Deutsche Telekom. Within three to six years, international high speed communications networks using so-called Broadband or B-ISDN will allow high quality interactive multimedia to be delivered to individual homes, received on television sets featuring digital high definition television (HDTV) technology. It is probable that different systems will co-exist with bridges and gateways allowing communications between them - not one global data superhighway, but many.

Interactive multimedia offers two-way communications, and shoppers will be able to examine a catalogue and order their purchases during the same session, using their television sets at home or at a public access kiosk like those now being developed by IBM and Olivetti, among other companies. The benefits can be considerable. The UK multiple Argos, Europe's largest catalogue store operation with more than 300 UK outlets, launched a trial state-of-the-art customer purchase point in eight stores late in 1993. The units (supplied by ICL) use touch screen input. They are linked to each store's computerised stock control system and to a CD-Rom unit which holds digitised photographs of stock items. They use catalogue numbers to identify products, although in other types of store bar-codes would do just as well. They will take customer orders, display photographs and descriptions of the products, and accept debit or credit card payment. They also accept money-off vouchers and Argos Premier Points savings cards. Argos has used customer stock enquiry terminals for some years, but this system uses ATM-type technology and a multimedia display to allow viewing, purchasing and payment from the same unit.

The benefit for the retailer is improved customer service and a reduced staff requirement. In catalogue shops the benefits are particularly easy to achieve, because the systems already exist. The catalogue itself does the selling, and staff merely support the catalogue. But in particularly busy periods - Christmas, for example - staff numbers can double. Kiosk selling can handle many of the simpler transactions and even enhance customer perceptions of service levels. And, as the banks have already realised with ATMs, there are geographical benefits associated with kiosks: they don't have to be situated in expensive shop-fronts. Rather than attract customers to the store, kiosks can be taken to the customer.

Ultimately, that will mean implementing interactive home-shopping, using wide area networks and multimedia technology. Already, home-shoppers on cable TV can use a button-push unit to purchase items appearing on their TV screens. Teleshopping will eventually allow individuals to browse products, and order and pay for goods wherever they may be. Customers will be able to talk to remote sales staff, whose faces might be visible in a window on the screen. It may even be possible to try out or buy certain goods remotely - audio CDs, computer games or videos are obvious examples (already the US video store, Blockbuster, is collaborating with IBM on the development of such a system). At the other end of the telephone line, the retailer will be using a call-centre system (probably inviting customers to ring a toll-free 800 number) with computer-telephony integration (CTI) allowing calls to be managed and redirected anywhere in the world, orders to be processed and marketing data to be captured. Meanwhile, academic research is already being undertaken into the electronic transmission of smells using an analysis of fragrances into five primary components.

Virtual reality techniques, which use computer generated images combined with movement sensors to convey the illusion of physical existence, could even allow shoppers to feel that they were exploring supermarket aisles and handling goods rather than scanning a catalogue. Within ten years, we shall certainly see the arrival of the first virtual store, with customers wandering a virtual mall located in cyberspace, pointing at virtual commodities in order to purchase them, and paying with a virtual smart card. Orders will be transmitted down a broadband link to a central warehouse where they will be verified and despatched. Retailers may become more like TV production companies or film studios than shop-keepers.

Back Office Database Systems
Multimedia and virtual reality may bring about a catastrophic reduction in store numbers, but back office systems will always be necessary to manage conventional trading relationships and analyse data. The key ingredient in both these activities is the database, defined by the computer scientist C. J. Date as:

'a collection of stored operational data used by the application systems of some particular enterprise' (An Introduction to Database Systems, Addison Wesley, Reading, Mass., 1982).

A number of factors may be significant in storing, retrieving and updating such data: among them, the size of the collection, the physical and logical form of storage, the location of the store or stores, and the nature of the application. To suit different circumstances, different database technologies have evolved, and these encompass both hardware and software. It can be difficult to separate the hardware and software aspects of a database system, and this should be borne in mind when considering the claims of different vendors that their hardware is suitable for practically any software or vice versa. The fact is that hardware and software usually work well together only in certain optimal combinations.

In IT, a broad distinction is drawn between transaction processing (TP) and decision support systems (DSS). Both use databases, but in different ways. TP generally involves the interrogation, creation, or updating of simple records - usually one at a time. EPoS is a classic example of TP, and like all TP may be operated in batch or on-line mode - for example, checkout transactions may be instantaneously transmitted to a central database (on-line) or they may be collected and uploaded after the day's activity is over (batch). DSS can also be batch or on-line, but involves operations on a number of records: sorting, comparing, rearranging, and performing calculations on them. The object is not to record anything but to extract useful information from the records: hence, 'decision support'. DSS operations generally require users to have some idea of what they are looking for (for example, the number of male customers who buy washing powders), but they have been joined recently by data-mining, a set of techniques to help detect interesting and potentially useful patterns in large volumes of data without any prior knowledge.
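The distinction can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns and the sample figures are assumptions made purely for illustration; they are not taken from any real EPoS system.

```python
import sqlite3

# Hypothetical in-memory sales table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (till INTEGER, product TEXT, customer_sex TEXT, amount REAL)"
)

def record_sale(till, product, customer_sex, amount):
    """Transaction processing: create one simple record at a time."""
    with conn:  # commits the single-row transaction
        conn.execute(
            "INSERT INTO sales VALUES (?, ?, ?, ?)",
            (till, product, customer_sex, amount),
        )

# On-line mode: the checkout transaction is written as it happens.
record_sale(1, "washing powder", "M", 3.50)

# Batch mode: transactions are collected and uploaded together later.
days_transactions = [
    (2, "washing powder", "F", 3.50),
    (2, "cheese", "M", 4.20),
    (1, "washing powder", "M", 2.99),
]
with conn:
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", days_transactions)

# Decision support: nothing is recorded; many records are examined at once
# to answer a question the user already has in mind.
male_powder_purchases = conn.execute(
    "SELECT COUNT(*) FROM sales "
    "WHERE product = 'washing powder' AND customer_sex = 'M'"
).fetchone()[0]
print(male_powder_purchases)
```

The same table serves both purposes here only because it is tiny; the argument in the text is that at scale the two workloads pull the database design in different directions.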

The problem with TP, DSS and data-mining is that for optimum efficiency they require different, and often incompatible, types of database. There are conventionally two broad families of database - static and dynamic. Often these are further subdivided into hierarchical and network types (static), and relational, binary and independent logical file types (dynamic). Real databases are rarely so simple that they divide cleanly along these lines, and the picture is further complicated because of a tendency to confuse databases with database management systems (DBMSs). The DBMS is, of course, the software that manipulates the data in a database. While the form in which data is stored very often determines the ways in which it can be manipulated, this is not always the case. Nor is it always the case that a given DBMS can only handle data in one form. In fact, the rationale behind the development of computer databases in the first place was the need to structure data independently of the computer file system which managed it, in order to be able to make additions, deletions and updates more easily.

The two main groups of database in practice are hierarchical and relational, catering for the vast majority of applications. Hierarchical databases organize their data as tree structures - each record element is owned by one and only one other element a level up the tree. In a hierarchical customer database, for example, the address of the customer may be owned by the customer's name and may in turn own a credit limit. Typically, records are accessed by use of an indexed key (for example, a code formed by the first four letters of the customer's name and their date of birth). Data records are highly structured and may be complex with many branches to the tree. By contrast, only the simplest queries in a predetermined format are allowed. Searches are very rapid and updating is fast and easy, so hierarchical databases are most frequently used to support systems where frequent simple transactions are made by a large number of users - for example, on-line EPoS systems.
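A rough sketch of the customer example above, assuming nested Python dictionaries as a stand-in for the tree and an ordinary dictionary as the index. The function names, the key format (name prefix followed by a date written as YYYYMMDD) and the sample customer are hypothetical.

```python
# Index from key to the root of each customer's record tree.
index = {}

def make_key(name, date_of_birth):
    """Indexed key: first four letters of the name plus the date of birth."""
    return name[:4].upper() + date_of_birth

def add_customer(name, date_of_birth, address, credit_limit):
    """Store a record in which each element has exactly one owner:
    the name owns the address, and the address owns the credit limit."""
    record = {
        "name": name,
        "address": {
            "value": address,
            "credit_limit": {"value": credit_limit},
        },
    }
    index[make_key(name, date_of_birth)] = record

def credit_limit_for(key):
    """A predetermined query: walk down the tree from the indexed root."""
    return index[key]["address"]["credit_limit"]["value"]

add_customer("Kendall", "19600412", "1 High Street", 500)
print(credit_limit_for("KEND19600412"))
```

Lookup by key is a single step, which is why such structures suit frequent simple transactions. A question the tree was not designed for, such as finding every customer with a given credit limit, would mean visiting every record.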

Relational databases organize data as lists - record elements have no structuring apart from their sequence. A complete file of records is referred to as a relation and each record is a tuple. The tuples all have the same set of fields (or attributes) and are combined in notional tables, so that any item within a tuple can be accessed by specifying a row (tuple number) and column (attribute). Different tables make different relations.

Relations are created, updated, and queried by writing statements in a query language (commonly, SQL, the standard Structured Query Language). Typically, logical operations are performed on attributes or groups of attributes - union, product, intersection and so on - and complex queries can be built in this way. With a hierarchical database, the user is restricted to accessing givens - the owner, member or members of a record element, the next record in the database, and so on - but with a relational database the user can access dynamically all the relations, tuples or attributes it is possible to specify with logical operators. The simple table structure supports the ability to make complex so-called ad hoc queries, but the trade-off is that relational databases are slow, and require elaborate query validation and query parsing, and support for referential integrity (in the first case, to screen out illegal queries that might, for example, monopolise the database and, in the second case, to make sure that data is consistent throughout the database at all times). Since updating can be slow and complex queries take time to answer, relational databases tend to be used where infrequent complex transactions are made by a small number of users - in other words, for decision support and wider management information systems.
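The logical operations mentioned above can be sketched in plain Python by treating a relation as a set of tuples that all share the same attributes. The relations and their contents below are invented for illustration, and a real DBMS would express the same operations in SQL rather than in set notation.

```python
from itertools import product

# Two relations with the same attributes: (name, town).
store_card_holders = {("Kay", "Leeds"), ("Khan", "York"), ("Smith", "Hull")}
mail_order_customers = {("Khan", "York"), ("Jones", "Bath")}

# A third relation with different attributes: (town, region).
regions = {("Leeds", "North"), ("York", "North")}

# Union: tuples appearing in either relation.
either = store_card_holders | mail_order_customers

# Intersection: tuples appearing in both relations.
both = store_card_holders & mail_order_customers

# Product: every tuple of one relation paired with every tuple of another.
pairs = set(product(both, regions))

# A selection over the product keeps only the pairs whose towns match,
# which is how a complex ad hoc query is built from simple operations.
with_region = {
    (name, town, region)
    for (name, town), (t, region) in pairs
    if town == t
}
print(with_region)
```

Nothing about the stored tuples had to anticipate the final question, which is the flexibility the text describes. The price is visible too: the product grows multiplicatively with the size of the relations before the selection trims it.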

Most organizations today have to cope with growing pressure to make better use of the data at their disposal. For example, retailers need to determine consumer trends in order to justify increasingly large-scale investment in new outlets and new technologies. Mistakes can be expensive. But conventional corporate database management systems are not suited to the task of elaborate querying or predictive analysis, and relational databases by their very nature sacrifice performance for flexibility. This may be acceptable when the amounts of data involved are small, but databases have been growing alarmingly in recent years to the point where it may be impractical to use a relational database at all.

Even a simple query (such as 'Get the records of all customers whose names begin with K') can generate millions of low-level instructions. Complex ad hoc queries like 'Get the names and addresses of all customers who spent less than £5 on non-English cheeses in the first three weeks of September' will be an order of magnitude more demanding. With large databases (tens of gigabytes), this inevitably leads to degraded performance, inefficiency and errors. And large databases are now measured in tens of gigabytes, very large databases in hundreds of gigabytes or even terabytes (1 terabyte is 1,000 gigabytes).
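For readers who want to see what the two queries might look like in SQL, the following sketch runs them against a toy SQLite database. The schema, the sample rows, the year 1994 and the reading of 'spent less than 5' as a total over the period are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE purchases (customer_id INTEGER, product_type TEXT,
                            origin TEXT, amount REAL, bought_on TEXT);
    INSERT INTO customers VALUES
        (1, 'Kay', '1 Mill Lane'), (2, 'Khan', '2 Park Road'),
        (3, 'Smith', '3 Elm Close');
    INSERT INTO purchases VALUES
        (1, 'cheese', 'France',  3.00, '1994-09-05'),
        (1, 'cheese', 'France',  1.50, '1994-09-12'),
        (2, 'cheese', 'Italy',   6.00, '1994-09-10'),
        (3, 'cheese', 'England', 2.00, '1994-09-03'),
        (3, 'cheese', 'France',  2.00, '1994-09-25');
""")

# Simple query: all customers whose names begin with K.
k_customers = conn.execute(
    "SELECT id, name, address FROM customers WHERE name LIKE 'K%' ORDER BY name"
).fetchall()

# Complex ad hoc query: a join, three filters, a grouping and a test on a total.
# Customers with no qualifying purchases at all do not appear in the result.
small_cheese_spenders = conn.execute("""
    SELECT c.name, c.address
    FROM customers AS c
    JOIN purchases AS p ON p.customer_id = c.id
    WHERE p.product_type = 'cheese'
      AND p.origin <> 'England'
      AND p.bought_on BETWEEN '1994-09-01' AND '1994-09-21'
    GROUP BY c.id, c.name, c.address
    HAVING SUM(p.amount) < 5
    ORDER BY c.name
""").fetchall()

print(k_customers)
print(small_cheese_spenders)
```

The second statement touches every purchase row that passes the filters before any customer can be reported, which gives some feel for why such queries become expensive as the tables grow.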

Massively Parallel Processors
Linear speed-up and linear scale-up in database technology are difficult to achieve - simply doubling hardware speed or capacity does not generally halve the time to undertake a database operation, or allow twice as large an operation to be undertaken in the same time. Limitations to speed-up and scale-up stem from the nature of conventional computer design which means that there are always bottlenecks in any system architecture - places where data transfer grinds to a halt like the junction between a motorway and an urban road system.
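The two measures can be written down as simple ratios. The sketch below uses the usual textbook definitions rather than anything stated by a particular vendor, and the timings are made-up figures chosen only to show a less-than-linear result.

```python
def speed_up(time_on_small_system, time_on_large_system):
    """Same job on both systems: how many times faster the large one is."""
    return time_on_small_system / time_on_large_system

def scale_up(time_small_job_small_system, time_large_job_large_system):
    """Job and system grown by the same factor: 1.0 means the larger job
    takes no longer than the original did."""
    return time_small_job_small_system / time_large_job_large_system

hardware_factor = 2  # the hardware has been doubled

# Illustrative timings in seconds (assumed, not measured).
observed_speed_up = speed_up(100.0, 60.0)
observed_scale_up = scale_up(100.0, 125.0)

# Linear speed-up would equal the hardware factor; linear scale-up would be 1.0.
speed_up_is_linear = abs(observed_speed_up - hardware_factor) < 1e-9
scale_up_is_linear = abs(observed_scale_up - 1.0) < 1e-9
print(observed_speed_up, speed_up_is_linear)
print(observed_scale_up, scale_up_is_linear)
```

In this invented case doubling the hardware cuts the time from 100 to 60 seconds rather than to 50, the kind of shortfall that bottlenecks produce.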

Different approaches to the problem have been tried over the years with both hierarchical and relational databases. The current favourite is the use of massively parallel processing (MPP) database engines - specialised computers using a large number of individual processors running in parallel. They are based on different design principles to conventional computers and offer high capacity, high performance, and linear speed-up and scale-up. MPP systems are still relatively uncommon, but they have been popular for scientific applications for some time, and will probably provide the core storage and processing technology for broadband interactive multimedia (particularly, the much-hyped video on demand experiments announced in 1993 and 1994).
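The general idea behind a parallel database engine, dividing the data among processors, letting each scan its own share and then combining the partial answers, can be sketched as follows. Threads from Python's standard library stand in for the separate processors, and the data and function names are invented, so this illustrates the principle and says nothing about the performance of any actual MPP product.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(records, workers):
    """Deal the records out so each worker holds roughly an equal share."""
    return [records[i::workers] for i in range(workers)]

def scan(part):
    """Each worker totals the spending in its own partition only."""
    return sum(amount for _, amount in part)

# Hypothetical (customer_id, amount) records.
records = [(customer_id, customer_id % 7 + 1) for customer_id in range(1000)]

workers = 4
parts = partition(records, workers)

# Scan all partitions side by side, then merge the partial totals.
with ThreadPoolExecutor(max_workers=workers) as pool:
    partial_totals = list(pool.map(scan, parts))
parallel_total = sum(partial_totals)

serial_total = scan(records)
print(parallel_total, serial_total)
```

Because no worker needs another worker's data until the final merge, adding workers adds capacity without adding a shared bottleneck, which is the property the linear speed-up and scale-up claims rest on.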

Typical database applications for MPPs coming on stream now are to be found in:

o the retail and financial services sectors, where MPPs are used to analyse patterns of consumer spending or borrowing;
o market analysis (for example, tracking fads or assessing the impact of advertising campaigns on the sales of named products);
o traffic analysis in telecommunications and transport; and
o as simple data repositories for large manufacturing concerns.

Vendors stress decision support, trend analysis and 'knowledge work'.

Activities in this area involve a necessary collaboration between software and hardware suppliers - who may or may not be part of the same companies. Among the leading players are Oracle, producer of the eponymous relational database management product, working together with hardware supplier nCUBE, and AT&T Global Systems which supplies the pioneering Teradata MPP machine (which uses proprietary software) and a range of machines developed by NCR before AT&T bought the company. There has been some interesting experimental work from ICL, together with Siemens and Bull, which resulted in the introduction of the Goldrush systems, as well as relatively low-cost products from Kendall Square Research, Meiko, Parsys and White Cross. These systems are, like everything else in IT, reducing in price, and it is now possible to buy an MPP machine for as little as the price of a top-of-the-range workstation or as much as a mainframe.

Oracle - whose system has been tested as the heart of a video-on-demand experiment in Florida - claims one customer in car rental (Europcar), and others in banking, auto and aerospace manufacture. AT&T Teradata, with a long-established user base, remains the leading supplier to retailers, with Wal-Mart and K-Mart in the US, and GUS, Grattan, and Littlewoods in the UK, along with a number of customers in banking, car manufacture and telecommunications.

Data-mining, Executive Information Systems, Rule Induction and Neural Nets
Decision support remains a hit-and-miss affair, despite the ability to manipulate large amounts of data, and over the years a variety of technological approaches have been proposed to help it. The latest, as mentioned above, goes by the suggestive name of 'data-mining'. Data-mining is not a single technique but a combination of different techniques. It is most frequently described as a combination of the approach utilised by executive information systems (EIS) and artificial intelligence.

As with many new technologies in IT, data-mining is actually a repackaging of established technologies, some of which have - in their previous guises - been less than enthusiastically received. Executive information systems were popular for a few months in the late 1980s. They were intended to allow naive computer users to draw out useful information from raw data, using graphs and charts linked to corporate and commercial databases. In fact, as a number of critics have observed, they just made data look pretty.

What was missing from EIS was interpretation, and this is an issue that has become increasingly relevant as the demand has grown to make predictive use of data. In many cases, simple statistical analysis can achieve much, but advocates of the data-mining approach point out that statistical analysis fails to account for underlying mechanisms and logics, fails when it comes to dealing with exceptions, and cannot cope with the catastrophic changes that mark taste and fashion. As the apocryphal marketing manager says: 'I know half my advertising budget is wasted, but I don't know which half.'

Data-mining adds rule induction and neural network technology to the armoury of decision support tools. These are well-established areas of artificial intelligence - itself an area of computing science which has, in the past, promised more than it could possibly deliver. Suppliers of data-mining tools include German software company, Software AG; the French computer company, Bull; Logica, ISL and Neural Technologies in the UK; and Digital and Oracle in the US.

Rule induction, pioneered by the British computer scientist, Donald Michie, is a technique for discovering rules of behaviour from sample data. It remains a controversial technique with proven successes in analysing the underlying logic human beings use to control complex environments such as an aeroplane in flight or a working chemical plant. Many critics argue that it can only cope with situations which appear to be complex but are at root very simple. Neural networks, on the other hand, can allegedly handle much more complicated situations.
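To make the idea of discovering rules from sample data concrete, here is a minimal sketch of one very simple induction scheme, usually known as 1R or 'one rule'. It has been chosen for its brevity and is not the method used by Michie or by any of the commercial tools named above. The sample data and attribute names are invented.

```python
from collections import Counter, defaultdict

def induce_one_rule(samples, attributes, target):
    """For each attribute, predict the commonest outcome seen with each of
    its values, then keep the attribute whose rule makes the fewest errors."""
    best = None
    for attr in attributes:
        outcomes = defaultdict(Counter)
        for row in samples:
            outcomes[row[attr]][row[target]] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in outcomes.items()}
        errors = sum(1 for row in samples if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical observations of ice-cream sales.
samples = [
    {"weather": "hot", "day": "weekday", "sales": "high"},
    {"weather": "hot", "day": "weekend", "sales": "high"},
    {"weather": "cold", "day": "weekday", "sales": "low"},
    {"weather": "cold", "day": "weekend", "sales": "low"},
    {"weather": "hot", "day": "weekday", "sales": "high"},
]

attr, rule, errors = induce_one_rule(samples, ["weather", "day"], "sales")
print(attr, rule, errors)
```

The induced rule is readable ('if the weather is hot, sales are high'), which is the attraction of the approach. The example also bears out the critics' point: it works because the underlying situation really is very simple.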

In fact, neural networks address much the same problem as rule induction and in ways which are, formally at least, very similar. If anything the difference is one of scale and style rather than substance. A neural network is a computer simulation of a simple nervous system which can be trained to recognize certain patterns by associating known inputs with known outcomes. The net starts as a tabula rasa, an unstructured set of connected points or neurons, which develops a structure through the training process. By associating known data sets at the input to the net with their associated known outcomes, the net's structure can be developed to represent the mechanism by which the one gives rise to the other. For example, the inputs might be understood as the light values associated with the points of an image, and the output might be a unique code associated with the thing represented in the image (a name). One can adjust the net's structure so that it eventually transforms the inputs into the output by itself, and this can be considered teaching the net to recognise the image.

Neural net technologies are really an organised form of trial and error in which the error is progressively reduced. Because of their empirical nature, many people believe they provide a more flexible model of learning, particularly appropriate when trying to derive regularities from human behaviour. The one thing neither they nor rule induction can do is simply take raw data, sift it and come up with a set of stable rules describing its structure.
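The training process described above can be sketched with the simplest possible net, a single artificial neuron trained by the classic perceptron rule. This is a toy stand-in for the much larger networks used in practice. The two-point 'image' and its output code are invented for illustration.

```python
def predict(weights, bias, inputs):
    """Fire (output 1) when the weighted sum of the inputs is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(samples, max_epochs=50):
    """Start from a blank net and nudge the weights after every mistake."""
    weights = [0] * len(samples[0][0])
    bias = 0
    errors_per_epoch = []
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            delta = target - predict(weights, bias, inputs)
            if delta != 0:
                errors += 1
                weights = [w + delta * x for w, x in zip(weights, inputs)]
                bias += delta
        errors_per_epoch.append(errors)
        if errors == 0:  # every known input now gives its known outcome
            break
    return weights, bias, errors_per_epoch

# Known inputs (light values at two image points) with known outcomes:
# the output code is 1 only when both points are lit.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias, errors_per_epoch = train(samples)
print(weights, bias, errors_per_epoch)
print([predict(weights, bias, inputs) for inputs, _ in samples])
```

The list of errors per pass shows the trial-and-error character of the method: the count need not fall on every pass, but training stops only when it reaches zero. Note that the net had to be told which outcomes were correct, which is the limitation the paragraph above ends on.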

Retailers are enjoined to adopt data-mining techniques in order to uncover hidden patterns in the masses of data they acquire from their in-store systems. The problem is that these techniques can only be useful if their users have at least some idea of where to look for the patterns and what patterns to look for. This is a problem exacerbated by the fact that the retail environment is affected by a large number of variables - such as the tactics of competitors, fashion, or the economic climate - that can't easily be factored into the equation. Human intuition and an ability to engage with customers remain vital to any attempt to extract useful information from raw sales data.

We know of no examples of data-mining techniques being used for anything more elaborate than gauging demand for a single product or narrow type of product - domestic gas, insurance or holidays, for example. One much-vaunted retail application, at the Nottingham headquarters of the UK drugstore, Boots, is actually used for property management, to help predict the likely future performance and value of retail outlets. Statistical techniques can be and are used in these areas, and data-mining is frequently a supplement to statistical procedures. If it can save or make a few percentage points by more accurate targeting of mailshots or by tightening up predictions just a notch, it will be worth it, but it is not - yet - an answer to the retailer's prayer of knowing just what the market wants and when.

Monday, September 10 2001