Bing J., Let there be LITE : a brief history of legal information retrieval, in European Journal of Law and Technology, Vol. 1, Issue 1, 2010.

LET THERE BE LITE : A BRIEF HISTORY OF LEGAL INFORMATION RETRIEVAL

Jon Bing [1]

Cite as: Bing J., “Let there be LITE : a brief history of legal information retrieval”, in European Journal of Law and Technology, Vol. 1, Issue 1, 2010.

Abstract

This paper charts the growth of national and global legal information systems including both commercial systems such as LEXIS, state funded national systems and free access systems such as those represented under the WorldLII umbrella. It suggests that the vision of an integrated national electronic information system was always likely to remain unfulfilled because of changes in technology and complexity of jurisdictions. In the circumstances the paper suggests both pluralism and the development of global co-operation.

1. Texts and Lawyers

During the First World War, large guns were used by both parties. They were firing without much accuracy. The United States Ballistic Research Laboratories of the ordnance department of the US Army became the leading institution in ballistic science, i.e. the calculation of the trajectories of projectiles fired from the guns. The BRL was located at Aberdeen Proving Ground, Maryland. To calculate the firing tables, BRL acquired and used from the mid-1930s one of the mechanical differential analysers developed by Vannevar Bush, the scientist who is remembered for convincing Roosevelt in 1941 that the Manhattan project was necessary. By 1943, the analysers were no longer adequate, and BRL placed an order at the Moore School of Electrical Engineering to build a more powerful device, which would become known as ENIAC, the first electronic computer (1945). [2]

This is only to remind the reader of the domains seen as appropriate for computers in the early years. Use of computers for processing natural language texts would be exotic indeed. Computers moved towards business life through accounting (computerising the older alphabetic tabulators for punched cards), and were seen as possible tools for any other schemes based on a numeric approach. Among these one found the library classification systems such as Dewey's, and information systems were developed based on the systematic tables, perhaps supplemented by keywords or even brief abstracts. These systems became useful tools for many disciplines. Chemical Abstracts, one of the subject-oriented abstracting services which started in 1907, and the medical MEDLARS information system (1964), graduating to the online version Medline, are examples of systems successfully utilising computer technology to offer an information service.

Such schemes seem to rely on there being a distinction between content and text: The chemical process verified may be set out in an abbreviated form without the "information" being lost or necessarily distorted. In law, it is often not possible to make this distinction: "the medium is the message". This is most evident in statutes. The text of a statute is not a vehicle communicating a message from one person ("the legislator") to the reader, who may be a lawyer. It is a text constructed after a process usually regulated in great detail. Several persons and institutions may contribute to this text, and the final form of the text may rely on decision processes like voting in parliament. It becomes in principle meaningless to look at the text as a message from one person to another - rather the text is offered as a regulatory instrument, from which one may argue on the existence or detailed content of a rule that is part of the national legal system, to be backed up by courts and law enforcement agencies. The text has become an object independent of conveying a message between persons; it has become a resource in itself.

Perhaps this simplistic view may serve to explain why lawyers from the very beginning gave preference to a system which would grant access to the authentic text of the legal sources, the text as formed by the agency within the legal system authorised to issue such sources, like the statutes adopted by parliament, regulations issued by the government, or case law by the decisions of the courts.

This paper makes an attempt to highlight the growth of national legal information systems. The constraints of length make it somewhat episodic or anecdotal, though an attempt has been made to focus on important features of the development, and also on what may be of general interest.

2. A Retarded Child and its Impact [3]

In the late 1950s, a bill was passed in the legislature of Pennsylvania. Part of the bill was to change a term in the health law - the phrase "retarded child" should be replaced by the more neutral phrase "exceptional child". This may seem an example of legislative manicure, but the amendment indicated a new political attitude to this group of persons, and the political importance should not be underestimated. There are in all jurisdictions examples of such amendments in the legislation that herald changes in the policies within a certain area.

Pennsylvania adhered to the principle of regulatory management called "textual replacement". It dictates that any amending regulation must exactly identify which sections and sentences in the existing body of regulations should be amended. One may picture an amending regulation containing explicit wording which could be cut out and pasted into the specified parts of the identified existing regulations, giving as a result the new text of each amended regulation. An alternative to this principle is the "omnibus principle", where it is seen as sufficient that an amending regulation contains a section which dictates that all former regulations containing the phrase (or even the rules, which can be specified by different words) - wherever they may occur - are to be deemed amended, without specifying the relevant locations.

However, having the principle of textual replacement, the legislators of Pennsylvania had to identify where the phrase "retarded child" - or a variation of this phrase - actually occurred. This represented a tedious task, and the legislators turned towards the Graduate School of Public Health at the University of Pittsburgh for a solution. Here Professor John F Horty had been working on a manual of hospital law, and had developed indexes to support his work at the Health Law Center. Accepting the contract, Professor Horty set out to solve the problem in the time-tested way of professors: He hired a group of students to read through the legislation and indicate all passages containing the relevant phrase. The result was predictably conventional: The professor found the quality of the work wanting. He hired a new group with an equally depressing result.

It was at this stage that he turned towards the Data Processing and Computer Center, which had been established in 1955, and gained co-operation for a more radical approach: Solving the problem using text retrieval. To appreciate the boldness of this approach, one should consider the level of computer technology at this time. Available for the project were an IBM 650, which was based on vacuum tubes and a drum storage of 2,000 words, and an IBM 7070, which was a transistorised version of the IBM 650, having a magnetic core storage containing 9,990 numbers of ten digits each. One may compare the capacity to current examples of information technology, like a digital watch or a pocket calculator. Random access memory units like magnetic disks were not available; data not placed into the central storage units mentioned above had to be stored on sequential tapes.

In principle, the system Professor Horty developed processed an input text to create two files. One was a "text file", containing the original text with an additional index, which gave an internal address for each element of the text - like "section 2, paragraph 3 starts at location n on the magnetic tape". The other was a "search file", [4] where all the different words occurring in the text were sorted in alphabetical order, giving for each occurrence the internal address of the word.

The search file could be used as a very extensive index to the text itself: Looking up any term in the search file, the internal address was specified, and the computer system could use this in accessing the index of the text file, and retrieve the word in context from the text file. The user had the impression of searching the "full text"; specifying a word like "child", the system would return with the information that this occurred, for instance, in two sections of the statutory text of the database. And if the user asked to have these displayed (or rather, printed out), they would be retrieved, using the internal addresses as the key linking the search and text files.
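The two-file arrangement described above can be sketched in a few lines of modern code. This is only an illustrative reconstruction under stated assumptions, not Horty's actual program: the function names `build_files` and `retrieve`, the use of a running integer as the "internal address", and the three sample provisions are all invented for the example.

```python
import re
from collections import defaultdict


def build_files(units):
    """Build a (text_file, search_file) pair from (label, text) units."""
    text_file = {}                  # internal address -> (label, text)
    postings = defaultdict(list)    # word -> internal address of each occurrence
    for address, (label, text) in enumerate(units):
        text_file[address] = (label, text)
        for word in re.findall(r"[a-z]+", text.lower()):
            postings[word].append(address)
    # The search file holds the distinct words in alphabetical order.
    search_file = {word: postings[word] for word in sorted(postings)}
    return text_file, search_file


def retrieve(word, text_file, search_file):
    """Return the labelled units in which the word occurs, in address order."""
    addresses = sorted(set(search_file.get(word.lower(), [])))
    return [text_file[address] for address in addresses]


# Invented sample provisions, for illustration only.
units = [
    ("section 1", "A child admitted to a hospital shall be registered."),
    ("section 2", "The hospital board shall meet annually."),
    ("section 3", "Records of each child shall be kept for ten years."),
]
text_file, search_file = build_files(units)
hits = retrieve("child", text_file, search_file)
print([label for label, _ in hits])  # ['section 1', 'section 3']
```

The internal address is the only link between the two files: the search file never stores text, and the text file never stores words in sorted order.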

Sorting the words of a text in alphabetic order can be compared, perhaps, to ordering books by authors' names in a bookcase. Anyone who has ventured to do this will know that new books frequently have authors whose last names start with a letter early in the alphabet, requiring you to move the books of authors whose names start with a letter further into the alphabet, working your way back towards the place where a space for the new book is needed. This metaphor may give some indication of the practical problems facing the early developers. And, of course, they did not have online systems, but had to deal with batch processing, using punched cards for input and printouts for output.

The system developed by Horty did make it rather easy to identify in which provisions of the Pennsylvania Health Law the words "child" and "retarded" (or grammatical variations of these) co-occurred, and the original contract could be successfully concluded. But it was rather obvious that any words in the stored provisions could be retrieved just as easily. It is therefore justified to see this as the first successful text retrieval system, and as such it was demonstrated at an American Bar Association conference in 1960. In 1963, the technology was used to build the first computerised legal information service, the LITE [5] system of the Air Force Staff Judge Advocate in Denver, Colorado. The technology also provided the basis for Aspen Systems Corporation, established in 1968, which served a large number of states in maintaining their compilations of regulations in force during the early 1970s.
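A co-occurrence search of this kind amounts to intersecting the sets of provisions in which each term occurs. The sketch below is a hedged illustration only: it assumes that "grammatical variations" can be approximated by simple prefix matching (so that "child" also matches "children"), which is an assumption of this example and not a documented feature of the original system. The names `build_index` and `co_occurrence` and the sample provisions are likewise invented.

```python
import re
from collections import defaultdict


def build_index(units):
    """Map each distinct word to the set of unit labels in which it occurs."""
    index = defaultdict(set)
    for label, text in units:
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(label)
    return index


def co_occurrence(index, stems):
    """Return labels of units containing, for every stem, a word starting with it."""
    if not stems:
        return []
    per_stem = []
    for stem in stems:
        labels = set()
        for word, units_with_word in index.items():
            if word.startswith(stem.lower()):   # assumed stand-in for variants
                labels |= units_with_word
        per_stem.append(labels)
    return sorted(set.intersection(*per_stem))


# Invented sample provisions, for illustration only.
units = [
    ("section 10", "Every retarded child shall receive instruction."),
    ("section 11", "Schools for retarded children shall be inspected."),
    ("section 12", "A child shall be vaccinated before admission."),
]
index = build_index(units)
print(co_occurrence(index, ["child", "retard"]))  # ['section 10', 'section 11']
```

Section 12 is excluded because it contains only one of the two terms, which is exactly the filtering the drafters of the amendment needed.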

There are many roads to follow from Horty's initiative. In practice, it started the development of computerised legal information services, which today are provided in any jurisdiction, with major international examples such as Reed Elsevier's LEXISNEXIS service, or Westlaw and other services of the Thomson Group. But the impact on research was also considerable, and the two major examples are European.

But before leaving the beginning, one may point out that though lawyers are not known for being technological avant gardists, text retrieval was actually developed by lawyers and for lawyers, due to the need to consult the authentic text for legal interpretation. The search engines of the Internet today harvest what was sown by the early efforts of the legal community.

3. European Initiatives

Bryan Niblett was a nuclear research physicist with the UK Atomic Energy Authority. [6] He spent 1966-67 on sabbatical in California, primarily to learn about computer programming. But as he had been called to the English Bar, [7] he also spent time digging into US research in computers and law. He came across the work of Horty, and planned to do something similar in the UK. On his return, he had already worked out the acronym STATUS (for Statute Search), and was determined to develop a machine independent program written in a subset of FORTRAN. Having produced the first version of the program, he ran into trouble - the Lord Chancellor advised the UKAEA that to put all the statutes into the system would be an ultra vires act, infringing the monopoly of Her Majesty's Stationery Office (HMSO) under Crown Copyright. Therefore, the STATUS system became limited to the atomic energy regulations. It was never impressive as a database; its importance was the program itself and the underlying philosophy of the search language. It was significant that the program was machine independent and could be compiled for different computers, FORTRAN being one of the high level languages with acceptable portability. It prompted initiatives in other institutions, and a better understanding of retrieval strategies and limitations. Niblett's collaborator, Norman Nunn-Price - the former submarine officer - also became influential in the development of European legal information services, especially for the European Union.

STATUS-based activities were also started in Australia, Holland and Norway. The Norwegian Research Center for Computers and Law (NRCCL) started its NORIS research program in 1970. LEXIS and West both consulted the NRCCL throughout the 1970s. The research program gave many important theoretical results, but also furnished the basis for the national legal information service, Lovdata, still a successful operation.

Another major European example is Colin Tapper. [8] When working at the London School of Economics 1961-65, he also became aware of the research by John Horty, and initiated the studies that have become known as "The Oxford Experiments", [9] as the bulk of the work was conducted after he joined Magdalen College, Oxford (from which he retired as a professor). The value of Tapper's work lies not only in the very valuable results he provided on the design and performance of retrieval strategies, but also in the academic attitude he brought to the field. His major objective was not to get a system up and running, but to understand how text retrieval worked, and how it best could be utilised to access the type of source material which mainly suffered from the shortcomings of paper-based solutions: Case law. Also, he pioneered the work on using case citations for improving performance.

One will note that both these European examples have a certain academic flair. They represent an interest in how text retrieval works, and in the relation between natural language texts and a search language mainly based on Boolean logic. It is justified to observe that the academic interest in text retrieval and computerised legal information services was mainly European. There may be several reasons for this; one probably is that as US services grew commercial, the companies operating the services offered research environments, complete with databases, and challenges which attracted those interests. But for commercial reasons, these environments were less open.

Also, Europe had a technological environment less mature than that of the United States. The computers (many of which at this time were manufactured in Europe) did not perform at the same relatively high level, and were less accessible. At the same time, the educational systems for lawyers in many European countries included practice periods that made available well-educated persons for trivial tasks such as intellectual indexing. These two elements may explain the tendency in European systems to rely, to a higher degree than in the United States, on indexes, thesauri, etc. Indeed, the first operational computerised system in Europe, the Belgian CREDOC [10] (1967), was wholly based on intellectual indexing. However, CREDOC remained somewhat of an odd example among European systems.

4. The Legal Information Crisis

In Europe, however, another aspect was rather prominent. [11] In 1970, Professor Spiros Simitis published his book Informationskrise des Rechts und Datenverarbeitung (Karlsruhe). The main argument in the book is based on the consequences of a shift in European welfare states away from discretionary award of social benefits on the basis of need to the establishment of a legal right to social security. This meant that decisions became legal in nature, and that an applicant could appeal. The appeal had to be processed according to the legal ideals found in how courts addressed complaints. There was a growth in specialised appeal agencies, such as administrative tribunals. Also, in jurisdictions where there was a system of general administrative courts, their caseload increased. The appeals needed to be tried on the basis of the relevant legal sources. Few such sources applied to these cases apart from the prior decisions of the decision-making institution itself. Such sources were not typically included in the traditional legal publications, but were only available through the manual files of the institution. These were cumbersome to search, and consequently the time for processing appeals increased.

Admittedly, this is a very crude rendering of the arguments of Simitis, but the point should be clear: There was an acute need to improve the performance of legal research in order to meet the requirements of the modern welfare state. And the solution was available in the form of legal information systems. This was strongly advocated by academic lawyers like Spiros Simitis, Wilhelm Steinmüller and Herbert Fiedler; and the 48th Deutschen Juristentag in 1970 recommended:

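Die ständige Deputation hält es für dringend geboten, über das Stadium der theoretischen Vorüberlegungen eines Einsatzes datenverarbeitender Maschinen auch für die Rechtspraxis hinaus sich nunmehr um die praktische Verwirklichung, mindestens durch die Schaffung von Datenbanken, zu bemühen, wie dies im Ausland schon weithin geschieht. (In English: The standing deputation considers it urgently necessary to move beyond the stage of theoretical preliminary deliberations on the use of data processing machines, for legal practice as well, and now to strive for practical realisation, at least through the creation of databases, as is already widely done abroad.)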

Already in 1967, the Bundesministerium der Justiz had started planning such a system. This is an amazing example of a systematic approach, living up to the best ideals of German praxis, where the administration was supported by professors like Fiedler, Simitis and Klug, ending up in a major report of 1972 - Das Juristische Informationssystem - Analyse, Planung, Vorschläge. On this basis, the JURIS [12] system was implemented, a system still very much alive today. The first services of this system addressed social law (the decisions of Bundessozialgericht) and tax law (the decisions of Bundesfinanzhof), illustrating the point of the need to address the problems of the welfare state.

We will not dwell on the development of JURIS, but note that it was followed by a remarkable academic activity. In the 1970s, Germany was by far the most active country in the area of computers and law. Professor Fiedler headed both Institut für Datenverarbeitung im Rechtswesen at the Gesellschaft für Mathematik und Datenverarbeitung, and Institut für Juristische Informatik at the University of Bonn. At Regensburg, Professor Wilhelm Steinmüller developed his basis for a general theory of computers and law, Professor Fridtjof Haft was active at the University of Tübingen, and Professor Wolfgang Kilian established his Institut für Rechtsinformatik in Hannover. There are several more names that could be added to this impressive catalogue of lawyers taking an active interest in computers and law, developing its many aspects, and contributing to a rich literature.

The German example could be used as an index to what happened in many European countries. I am acutely aware of not being able in this context to even very summarily indicate these developments, but perhaps two more examples may be given.

First, in Italy, a similar pressure towards decisions taken by the administrative courts was felt. Here, the lead was taken by the Corte di Cassazione. Renato Borruso, one of the judges at the court, suggested a system in 1968 based on the traditional massime or abstracts of the decisions of the court, and the use of a thesaurus. [13] The design of the system pursued the solutions in more traditional library-type systems, which also made it possible to realise the solution without the massive computer facilities required by the US services. The ITALGIURE-FIND system of the Centro Elettronico di Documentazione of the court grew to become an impressive and extensive system under the inspired directorship of Vittorio Novelli, and it became a general driving force in Italy with strong policy effects. For instance, a dedicated communication network for ITALGIURE was established between Italian courts.

And there was a broad interest. Vittorio Frosini at La Sapienza University in Rome had published his Cibernetica diritto e società [14] in 1967, in which he emphasised administrative law much more strongly than was done in the Anglo-American literature. In 1969, Mario Losano at the University of Milan [15] coined the term Iuscibernetica for the field of Macchine e modelli cibernetici nel diritto. [16] The National Research Council established the Istituto per la Documentazione Giuridica [17] in Florence, which engaged in an active strategy of publications and conferences. In 1976, the Corte di Cassazione started a tradition, which was upheld for twenty years, of huge international conferences spanning the whole width of the expanding area of computers and law, with the proceedings published in several volumes.

Second, in France, Professor Pierre Catala at the University of Montpellier in 1965 organised a working group with the objective of developing a legal information service, which in 1967 was formalised as Centre d'études pour le traitement de l'information juridique (IRETIJ). This is - as far as I know - the oldest academic institution within the area of computers and law. It was associated with the problem of accessing the decisions of the appeal courts, which were not subject to any systematic publishing in France. IRETIJ developed a system called JURIDOC, and started documenting appeal court decisions. The system was inspired by the work of Michel Bibent, whose doctoral thesis is also probably the first within the field. [18] It may be fair to say that the efforts, especially after Professor Catala left for Paris, were somewhat drained by the needs of an operational system to the disadvantage of academic research. [19] And in Paris, there was another working party established in 1967 on the initiative of Lucien Mehl, a conseiller d'Etat and the grand old man of computers and law in Europe. [20] The Conseil d'Etat also has some functions as an administrative court, and the initiative led to the establishment of an information service which from 1970 became an independent organisation, Centre de recherches et développement en informatique juridique (CENIJ), which through a series of changing names and mergers with other services has become the current French information service, Legifrance. Though the picture is somewhat fuzzy, France again offers an example of the needs of administrative law being a driving force behind the developments, rather than the business opportunities which motivated ventures in the United States.

The national development of legal information retrieval will be left at this point. It is unfair to the developments that were to follow - for instance the Swedish Ministry of Justice, which pioneered systems with integrated functions (for instance for printing and retrieval), and the Swedish Law and Informatics Research Institute, which, directed by Professor Peter Seipel, became so very influential, or to the innovative Vienna system and the work by Robert Svoboda and others in Austria. It is also unfair to those institutions most active within this area today, for instance Professors Jos Dumortier and Marie-Francine Moens at ICRI, Leuven or the Norma project at the University of Bologna.

It will be excused that the author spends a couple of paragraphs on the Norwegian Research Center for Computers and Law (NRCCL), Faculty of Law, University of Oslo. In the period 1971-1980, the NRCCL conducted a research program known as NORIS, [21] which investigated performance of text retrieval systems. [22] The experiments were designed to find the more efficient functions for retrieval performance. Several corpora of legal texts were established, legal issues were identified by researchers working with the collections, and these researchers also went through the corpora to establish which documents were 'relevant' [23] to the issues ('target sets'). Then different search strategies were applied to retrieve documents, in which the search requests were kept constant, but where different principles for retrieval and ranking were applied.

For experimental purposes, a modified STATUS system [24] was employed for Boolean requests, but a vector-based system was integrated. Also the use of citations for retrieval was explored; Colin Tapper brought his research from Stanford and Oxford, and concluded this in Oslo. [25]

The results gave an understanding of how characteristics of texts and search requests were related to performance, and led to a strategy known as 'conceptor based retrieval', combining ranking with Boolean operators. [26] This was taken as a basis for the specification of a new text retrieval system, incorporating a relational database, known as SIFT, [27] developed by the Directorate of administration and information technology. This system was also the engine for the national legal information service, Lovdata, which offered its first legislative databases to the public in 1981.

The NRCCL discontinued the NORIS program till about 1990, but went on with research in knowledge-based systems for decision support, text retrieval and legislative support.

5. Challenging the Legal Publishers

The history of legal information retrieval has many aspects, and there may be different views of what may be the more important. But there cannot be any doubt that Ohio is one of the important places to start. At the end of the 1960s, there were numerous attempts at developing information or documentation systems. In 1964, the Ohio Bar Association created a working group for considering the adoption of a computerised system. However, the group concluded that no satisfactory solution was available, and recommended that a new system should be developed. They established a corporation, Ohio Bar Automated Research Corporation (OBAR), which contracted Data Corporation of Dayton to look into the problem.

Data Corporation had in 1964 developed a system for the retrieval of Air Force reconnaissance documents. In late 1968, it is told that two neighbours got talking across their fence, one being a partner with Data Corporation and one being the chief executive officer of Mead Corporation, a forest products, paper processing, pulp making company. But the two neighbours saw some possibilities of future synergy, and Mead acquired Data Corporation, including the OBAR project. They brought in Arthur D. Little to give advice on restructuring; one of the consultants was Jerry Rubin. The advice was to carve out of the corporation the Information Systems Division, and concentrate on the legal business. In February 1970 this was spun off as Mead Data Central with Jerry Rubin as a vice president. [28]

LEXIS was launched with flair. Jerry Rubin became the front figure; LEXIS established its own high-speed network connection to New York and Washington DC, over time developing into MEADNET. It brings to mind the network established around the ITALGIURE system in Europe more or less at the same time, and though the two front figures - Vittorio Novelli and Jerry Rubin - were very different as persons, they both had a vision, and were able to communicate this vision to others and nurse enthusiasm.

From the beginning, LEXIS had an extravagant feel to it, like the use of colour terminals in 1970. One of the challenges for text retrieval is determining which of the retrieved documents are relevant. Even when a search request is adequate, there will be a certain share of the retrieved documents which are not relevant. These have to be discarded, and it will take too much time to read through the documents in full to make this judgement (though this is finally the test). Therefore, one traditionally adds to the document an abstract, which provides an efficient strategy for making relevance assessments. But LEXIS did not in its original version have any editorial material, only the authentic text of the cases, regulations etc. Writing abstracts would represent a huge investment and long delay. Rather, the user was offered a keyword-in-context (KWIC) format, where the search term was highlighted and displayed with leading and following lines (much like the snippets giving the results of a current search engine). In its 1970 implementation, LEXIS used the colour blue for this highlighting. It was seen as rather extravagant to use an expensive colour monitor only to highlight terms. Richard Giering remembers that people laughed at the Association for Computing Machinery demonstration in New York 1970. [29]
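The keyword-in-context idea can be illustrated with a short sketch. It makes no claim to reproduce the LEXIS display: the function name `kwic` is invented, the context is measured in characters rather than in leading and following lines, square brackets stand in for the blue highlighting, and the sample sentence is made up for the example.

```python
import re


def kwic(text, term, width=30):
    """Return one display line per occurrence of term, with surrounding context."""
    # Whole-word, case-insensitive match; re.escape keeps the term literal.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Brackets mark the hit, in place of colour highlighting.
        lines.append(f"{left}[{match.group(0)}]{right}")
    return lines


# Invented sample text, for illustration only.
sample = (
    "The court held that the contract was void. "
    "A later contract between the parties was upheld."
)
for line in kwic(sample, "contract", width=20):
    print(line)
```

Each line shows the hit with enough surrounding text for a quick relevance judgement, without the user having to read the whole document.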

The establishment of the legal information service LEXIS was a huge operation. There was a historic backlog of cases, which had to be entered by keypunching. LEXIS outsourced this to contractors overseas, where the cases were double-punched (to ensure high accuracy) by operators not knowing English. At the same time, new decisions had to be collected at home, which in principle implied a contract with each individual judge. LEXIS brought the approach of a modern computer system to this endeavour; it was also not constricted by a web of traditions. The vision was for the end user to operate the system, not any middle-person or paralegal. Based on this philosophy, LEXIS brought out the UBIQ terminal, a special purpose terminal for lawyers that had the help-text engraved on its keys: Press the key [next case], and the next case would be displayed. The red UBIQ was designed to sit on the desk of a partner in a big law firm.

LEXIS as a commercial system was launched in 1973. And at the end of the 1970s, LEXIS announced that all the big law firms of the United States were their clients. By "big law firm" was meant all with more than 100 partners. This very clearly illustrates the difference between the United States and Europe. In Europe, there was in 1980 hardly any law firm with 100 partners, and in many countries there were regulatory restrictions on the size of law firms.

It is my belief that at this time LEXIS was mainly used as a research tool. The user would have to walk up to the terminal, which typically would be in a library. He or she would type in the search request, and determine which cases might be relevant in a dialog with the system. But he or she would not print out the cases on the cumbersome and noisy line printer connected to the terminal, which would result in folds of pyjama-striped printout. Rather, the user would turn to the extensive libraries that any of the large law firms would have. LEXIS had provided the identification of the cases; the books would be collected for the cases to be read and studied in the conventional way. I believe this integration between computer research and extensive libraries is the clue to the success for LEXIS in the 1970s.

LEXIS challenged the largest legal publisher in the United States, West. In 1980, West employed 2,500 persons, among them 150 legal editors, and had a weekly export out of their warehouses in St Paul, Minnesota of approximately 250,000 books. It maintained the national reporter system, and its key index scheme was integrated in the legal system, part of the training of a legal mind. Though it had started computerising its typesetting in the middle of the 1960s, West had been slow to respond to the possibilities offered by computerised retrieval, and only when LEXIS had demonstrated that there was a market did West turn towards it.

There were interesting differences between the companies. LEXIS was rather glamorous, sparking off the ideas and the enthusiasm of new technology, while West was encrusted with experience, legal know-how and tradition. LEXIS was based on the programs originally developed by Data Corporation, while West found its software across the border.

Since the early 1960s, a treaty project had been going on at Queen's University, Kingston, Ontario. The moving force behind this project was Professor Hugh Lawford, and in 1968 he initiated another project to support his collection and annotation of the treaties of the British Commonwealth, the Queen's University Institute for Computers and Law, which was given the acronym QUIC/LAW. Late in 1968, he had an exchange of letters with IBM for a joint project to explore the possibilities of computerised legal information retrieval. The basis was an in-house IBM program known as INFORM/360 at the corporate headquarters in Armonk, New York. It is believed that the program was developed to meet the need for litigation support in the major anti-trust proceeding to which IBM was a party (and which contributed to the unbundling of software). One of the interesting features of the program was the use of ranking algorithms as alternatives to a plain Boolean query language. Richard von Briesen of QUIC/LAW further developed these into rather sophisticated strategies.

The QUIC/LAW system was from the start conceived as something larger than the Treaty Project of Professor Lawford; it was to be developed into a national legal information service. But the development period was rather stormy; several of the original supporters withdrew, among them the Federal Department of Justice after a test in 1973. The result was the establishment of a new company, QL-Systems Ltd, with Professor Lawford, von Briesen and Canada Law Books Ltd as the original shareholders.

One of the first ventures of the new QL-Systems was to sell their program to West. I believe IBM also used INFORM/360 to develop STAIRS, a general text retrieval system which became the workhorse for many legal information systems, the first installation probably being the PRODASEN system of Brazil in 1972.

We return therefore to the United States, where West in 1975 launched its own computerised legal information service, Westlaw, based on the QL-Systems program. West had many advantages, including its long established relation to the judiciary and the legal community. But West made at least one dubious choice in entering the market: the database only included the editorial headnotes. The headnotes were written by the editors, and it was believed that in restricting retrieval to these, retrieval performance would be enhanced. This was a presupposition contrary to known facts; such a document design would impair recall, though it might have a positive effect on precision.
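The trade-off between recall and precision can be made concrete with a small calculation. The sketch below uses the conventional definitions (recall as the share of relevant cases that are retrieved, precision as the share of retrieved cases that are relevant). The case labels and result sets are invented purely for illustration and do not describe any actual Westlaw or LEXIS search.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant cases actually found
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical figures, invented for illustration only.
relevant_cases = {"A", "B", "C", "D"}

# Headnote-only search: finds fewer cases, but all of them relevant.
headnote_search = {"A", "B"}

# Full-text search: finds more of the relevant cases, plus some noise.
full_text_search = {"A", "B", "C", "E", "F"}

headnote_scores = recall_and_precision(headnote_search, relevant_cases)
full_text_scores = recall_and_precision(full_text_search, relevant_cases)

print("headnotes only:", headnote_scores)
print("full text:     ", full_text_scores)
```

In this invented example the headnote-only search has the higher precision but the lower recall, which is the pattern described above.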

I believe that West looked towards the use of the LEXIS system, where the computerised system was mainly used as a retrieval tool, while the cases were read from the books of the conventional library - books which actually were to a great extent published by West. West believed that by offering a superior tool for researching the headnotes lawyers were used to, they would in the computerised system open their conventional reporter system through a more efficient channel. West did not appreciate that though LEXIS was used as a research tool, the relevance function depended upon the ability to dip into the case at several points. Restricting the access to the headnotes did in some way "blind" the user.

Therefore, it came as no surprise that West changed its policy in 1978 and included also the authentic text of the cases. Since then, Westlaw and LEXIS have competed in the market with comparable services. The services are different in detail with respect to coverage and features. But the monopoly of West in the paper-based world has been broken; there is now a duopoly - and there are many specialised services.

The remarkable success of LEXIS also impressed operators in other markets. LEXIS decided to move into the French market in 1982, and with considerable success, but also with a lesson learned: the whole database had to be converted to a character representation permitting the French accents. LEXIS had then already moved into the United Kingdom, [30] and this was to some extent controversial. Butterworths, a major British publisher, was contracted to co-operate with LEXIS. One of the directors of Butterworths was Professor Colin Tapper, who had pioneered computerised systems. One might have expected that a joint project with West would have been an obvious solution, as both companies were legal publishers and with a somewhat similar culture. The co-operation with LEXIS was therefore a surprise. I have learned that Butterworths in fact approached West and suggested a joint venture, but was turned down - West would not take any interest in activities outside its home jurisdictions.

In the UK market, the European Law Centre Ltd had taken an initiative in 1979 for a computerised service with one of the originators of the STATUS program, Norman Nunn-Price, as its director. The EUROLEX effort had a European perspective, and in 1981 a new and more aggressive phase was initiated with David Worlock as head of the organisation. The major legal publisher Sweet and Maxwell made an exclusive agreement with EUROLEX in 1982, which also made an agreement with Westlaw for making US material available to European users. The Canadian-based international publisher Thomson acquired EUROLEX, and the competition between LEXIS and EUROLEX in the UK market was fierce, but brief. Legal policy arguments favoured EUROLEX, which was a "national" company compared to the LEXIS service, which actually serviced its UK customers out of its facilities in Dayton, Ohio. But overnight the EUROLEX service was closed down by Thomson; the CEO, David Worlock, was told about this one hour before the rest of the company. It really brought home that legal information services were no longer things academics or enthusiasts fiddled around with in their spare time; they had become part of the more ruthless world of business.

The international publishing industry has now taken over both the US major services. Reed Elsevier owns LEXIS, and Butterworths is also part of that company. West - which for a long time remained a family company - has been taken over by Thomson, which has interests in a large number of legal information services throughout the world.

6. The Vision Receding

In understanding the early developments in Europe, it is also necessary to appreciate the role played by a small number of institutions. These forged the persons working with legal information services into a rather close-knit community, helped to communicate test results and experiences in an informal way, and played a large part in reciprocal political support for the policies adopted.

First, the Council of Europe played an essential role in the early developments. On the initiative of the "Committee of Experts on the Publication of state practices in the field of public international law", a "Committee of experts on the harmonisation of the means of programming legal data into computers" started its work in 1969. I believe no one will be offended by me saying that the longish name of the committee reveals that it was formed without a clear understanding of its objective or the means to achieve such an objective. And the committee changed its name to the more acceptable "Committee on Legal Data Processing" in 1974. [31] For the rest of the century, this Committee was a central forum for an exchange of ideas and experiences with respect to computers and law. The substantive law was not part of the area for this committee - but it explored legal information services and justice administrative systems as well as teaching in the area of computers and law. Members of the Committee were a mixture of bureaucrats, policy makers and academics - and there would be annual international meetings with rather ambitious programs. Often the success of international committees is measured in the number of legal instruments adopted - the Committee certainly adopted such instruments, [32] but its main achievement was the communication it facilitated between European institutions, not only at the meetings of the Committee itself, but at the annual international events, which were organised in different member countries. Around the Committee grew a loose-knit community of experts within public administration and universities with a strong, though informal, communication.

It is not possible to understand the co-ordinated development of legal information services in the different European jurisdictions without awareness of the exchanges taking place through the network built by this Committee. The Committee also strongly supported academic activity, not least through the adoption of recommendations to make an introduction to computerised systems a compulsory part of legal education, and by suggesting a curriculum for the teaching of computers and law.

One may see the Committee on Legal Data Processing as the pivot of a wheel with many spokes. Mention has already been made of the congresses of the Corte Suprema di Cassazione, which attracted large audiences. There were also considerable activity and conferences centred on the Istituto per la Documentazione Giuridica in Florence, sustained by the enthusiasm with which the Italian legal community embraced the whole of Europe, inviting it to join the march towards the future of law. In the United Kingdom, the British Society for Computers and Law [33] was founded; its meetings were also of an international nature and included barristers and solicitors as well as lawyers within government - all excited about legal information retrieval and how to bring its advantages to the UK (which would by no means prove easy).

In Germany, societies of the same nature were formed, which are still very much active, and which addressed policy issues with considerable heat. These meetings perhaps did not contribute as much to the general international discussion - as German was the conference language, this tended to exclude a wider international audience - but they had an integrating effect on the German language areas of Europe.

The main point of this small paragraph is to convey the feeling of enthusiasm and comradeship that developed at this time - from the early 1970s and onwards to 1990. The European developments cannot really be understood without considering this swell of common purpose - carrying us, it was believed, towards national, integrated - and probably monolithic - information services.

This was not realised. The obvious reason was the introduction of the PC and office automation. For the vision of the one, integrated national information service was to a large extent the shadow of the available architecture for computer systems: mainframes with terminal networks. When office automation was introduced, this did not in the first years stimulate communication. Even the establishment of a local area network was not without its problems. The philosophy led to the development of rather isolated islands: the PC on your desktop might be linked to some local resources like a printer - but not to central files like a national information system. When the CD-ROM was introduced in 1984, systems based on this became popular. Though the storage capacity of a CD-ROM seemed large compared to other media at this time, it was obviously insufficient for a truly national information system. Instead, it was more suitable for sector-oriented systems, for instance tax law. But CD-ROMs were well suited for publishing and management of rights according to the same model as for books, which - it may be argued - made publishers more interested in the field, an interest which carried over into the next phase, as already indicated above.

7. A Changed Technical Context Resulting in New Legal Policies

For the next phase came - communication was sorted out, LANs were linked into wider area networks. And then - at the beginning of the 1990s - the control of the Internet was relaxed, permitting other institutions than those related to research to have access to this international infrastructure. Nearly at the same time, the World Wide Web was realised within the Internet, web browsers became available and content could be reached from your desktop computer. This was the time when Content was crowned as King - computer technology had matured sufficiently to make vast libraries of text, images and sound available.

But again this did not bring back the vision of integrated, national legal information services. There may be several reasons for this, but one certainly was that as the threshold of publishing material on the web was lowered, many institutions wanted their own home page and to make their own material available through this page rather than supply the material to some central facility.

As the threshold for publishing went down, new parties took an interest in the legal material. The new environment hungered for content. A possibility was to convert existing material for re-utilisation on the web. This strategy had the attractive advantage that a lot of material could be made available in a short time. But there usually would be formalities to be met before such material could be uploaded; an obvious formality - which usually also cost money - was clearing the copyrights associated with the material. However, in the United States copyright was not claimed in the primary legal sources like statutes, regulations and case law. [34] Therefore, such material was available to furnish a basis for new services supplementing the established services or challenging them in the market place. One of the United States systems was JURIS (an acronym for "Justice retrieval and inquiry system"), developed in the early 1970s to serve the attorneys of the Department of Justice. In launching the service, it was emphasised that "minimal standards of due process and equal protection of law" were to be extended to all citizens, and that "fulfilment of these requirements depends on timely access to reliable and up-to-date information". [35]

The major objective of JURIS was to make available the legal material generated within the department itself - we recognise this need from the origin of the European systems discussed above. In addition, JURIS included the total text of the United States Code from FLITE. And since 1982, under a contractual arrangement with West, JURIS received weekly updates of case law for its federal and digest files which otherwise were only available through the commercial Westlaw service. Unlike the "raw" legal sources, the West material was subject to copyright, at least the material created by their editorial staff, such as the headnotes. West had also successfully claimed copyright in the pagination system [36] and other elements. The contractual arrangement with the Department of Justice was designed to avoid third parties using JURIS to gain access to Westlaw material and in this way avoiding paying fees or in other ways circumventing the policies of West.

The Department of Justice as a federal agency falls within the scope of the freedom of information legislation. Carole D. Hafner, herself a major figure in the history of legal information retrieval, [37] requested in 1991 samples of legislative texts from JURIS for research in computational linguistics. The request was denied. Public interest groups such as the Taxpayers Asset Project (TAP), National Technical Information Services (NTIS) and the American Association of Law Libraries (AALL) queried West on its willingness to make its database available to public access. In a press release of 30 September 1993 West announced that it would not seek renewal of the contract with the Department of Justice. The Clinton administration announced that the National Science Foundation would fund a project to enhance future access to government information. This announcement was made on a Friday; the following Monday the administration announced the permanent shut-down of JURIS from 1 January 1994.

The story is highlighted by a decision of the US District Court for the District of Columbia. [38] After it had become known that the JURIS service would be discontinued, the information service Tax Analysts requested access to parts of the database containing West material. The court concurred with the Department of Justice, and held that "the West-provided data in JURIS is not an 'agency record' under [Freedom of Information Act] and this Court lacks jurisdiction to compel Defendant [Department of Justice] to disclose the information sought by Plaintiff".

The example of JURIS demonstrates some of the explosive policy power of the web technology, blowing away part of the older infrastructure designed and determined by technological circumstances. The exclusive arrangement between the department and West was discontinued - at least in this respect - and the money which used to go into the maintenance of JURIS would partly be used to purchase legal information services from West or LEXIS in the market place. At the same time, the court decided that a legal source was not an "agency record", and therefore not subject to the freedom of information legislation.

8. The Legal Information Institutes

Another major example of the new possibilities stimulating new initiatives is provided by the Legal Information Institutes.

In 1992, the LII of Cornell Law School was launched by Peter Martin and Tom Bruce. As Martin states:

"The legal information industry in the U.S. in the mid 90s had focused totally on judges and lawyers and hadn't paid attention to the information needs of others … One of our powerful early discoveries was how much demand outside those professional sectors there was - ordinary citizens trying to make sense of laws that impinge on their lives." [39] "The Cornell LII offers the United States Code, an organised compilation of current federal laws; and the collections of all recent opinions of the US Supreme Court and New York State Court of Appeals … Making information accessible on the web in a manageable format has been a challenge - there are 13 US Circuit Courts, each putting its decisions on the web. The problem is that data structures and formats differ from site to site: researchers need some solution, for instance a search engine that reaches across those structures." [40]

The Cornell Law School LII was the first service for the provision of free legal information on the Web, [41] and Legal Information Institute has become a generic term to indicate a certain type of operation on the Web. [42] There are namesakes as far-flung as New Zealand, Zambia and Kazakhstan.

Partly, the LIIs represent a reaction to a restrictive and protective attitude towards making legal material available to the public. As mentioned, primary legal sources are generally not protected by copyright; this is permitted under the Berne Convention art 2(4).

But there still is often a reluctance to abolish traditional exclusive arrangements with publishers or similar restrictive policies. Primary legal material ought to be available for anyone who would take it as a basis for value-added services, and it should be made available free of charge or at marginal cost. This underlies the EU re-utilisation directive. [43] Perhaps LIIs have been both most successful and most needed in jurisdictions where there has been a formal control of the publishing of legal sources, for instance by applying Crown Copyright. [44]

One of the more remarkable LIIs is the Australasian Legal Information Institute (AustLII), jointly established by the University of New South Wales and the University of Technology, Sydney with Professor Graham Greenleaf and Professor Andrew Mowbray taking the initiative in 1995. This is an effort with an impressive ambition, and a background in the policies of legal information services in Australia, where the doctrine of Crown Copyright prevails. AustLII is based on the belief that it is in the public interest that authorities should aim to maximise access to the "public legal information" that they control. AustLII argues that unless governments and agencies positively co-operate with non-commercial bodies by providing them with raw data in computerised form, non-commercial bodies are unlikely ever to be able to publish the data in any form. [45]

There are several characteristics of AustLII that make the service remarkable - the scope of the database is one, and the programs developed to enhance the service and support search strategies are another. But perhaps most important are the standards AustLII sets itself for making legal resources available in a complete and authentic form, a service to integrate material and to be trusted. [46]

AustLII has also many offspring, one of them being the World Legal Information Institute (WorldLII), a co-operation between the LIIs worldwide. AustLII has taken upon itself to attempt to create a truly international information resource; not only are the materials made available by the LIIs listed under WorldLII, but a search engine has been developed to index legal sites around the world. There is a toolbar available for most browsers, and lawyers should download this - it will provide a reminder on-screen of future possibilities for legal information retrieval.

The enthusiasm for the LIIs should not obscure some important policy tensions. The needs of the professional user of legal sources require an efficient research tool. Certainly, the public should be given as easy access as possible to statutes and other important legal material. But use of the authentic legal sources is not trivial. There may be - and in my mind I am convinced there are - different requirements for a service catering for the public and a service meeting the requirements of the professionals. And I am not at all certain that the specialised tools needed by a rather small number of professionals should be paid for by the public at large.

The services offered by the LIIs are often supported by the policy of publicatio legis and a reference to the basic right of all citizens to know the law. This justification is not challenged. But it is challenged that it is wise, or even possible, to satisfy both the needs of the lay user and the needs of the professional user by the same information service. Even though much of the authentic material would be identical, the user requirements for a friendly service do differ. The tension between these two objectives can be discerned in several aspects of services from LIIs. For instance, Cornell LII integrates its services with legal education, and AustLII has several features to help lay users. [47] And for the professional user we would like to see further developments, for instance more sophisticated ways of presenting search results, better integration with in-house services (for instance for litigation support) etc.

9. The Vision Upgraded

Above, with a certain tristesse, it was observed that the vision of a consolidated national information service had been disrupted by advances in information technology, first the introduction of office automation, and then of web services. Of course, the vision was never realistic. A jurisdiction is too complex - there are too many possible perspectives to be contained within one system. The only way to ensure objectivity and a sufficient diversity is to support several systems.

The limiting factor may be the economic constraints within a jurisdiction. We have seen how a large market like the United States may support several large scale and general legal information services like LEXIS and Westlaw. In other jurisdictions, there may be a need for the public sector to provide the necessary economic basis for a national service.

Of course, legal information services are about more than the market; they also raise issues of what services have to be available for ensuring due process and the other ideal policies in a society ruled by law. From a national perspective, one should be sensitive to requirements and restrictions.

But there is also a need to look towards an international solution.

We need to utilise the advantages of global spread of legal information. We need to find possibilities of exploiting the advantage of other jurisdictions having legal material which may be of interest. Current principles of using material across frontiers have been forged in a situation where it has been difficult to exploit case law or legislative reviews from other countries. Today, there are regional legal systems where it would make good sense to access decisions and other material from other countries. The European Union may serve as an example: regulations and directives are issued for a large number of jurisdictions, and it would be useful if the material generated by courts and other institutions in applying these provisions were available for the other countries within the union. There are examples of services offering such solutions, like CaseLex [48] reporting on supreme court decisions relating to European legal instruments.

But these attempts are still in the making. We should be guided by the vision of WorldLII, and look for knowledge-based solutions that seek out and consolidate material upon request of the professional user. And computational linguistics seems to have progressed sufficiently to offer the user the possibility of having the material rendered in a language he or she may understand, at least sufficiently to determine whether that material may be relevant.

If this is realised, we will see that the dynamics of the legal system itself, where a legal argument takes into consideration prior decisions, may over time work itself into a more harmonised view as courts and other institutions puzzle together not only the pieces of their national systems, but also try to make them fit within a bigger, international picture.




[1] Institutt for rettsinformatikk, Norwegian Research Center for Computers and Law, Universitetet i Oslo, Oslo. jon.bing@jus.uio.no. Much of the historical background until 1984 can be found, though organised in a different form, in Jon Bing et al, Handbook of Legal Information Retrieval (Amsterdam: North-Holland 1984), also available at http://www.lovdata.no/litt/index.html (visited 15 September 2009). However, I have also relied on personal notes which are not documented elsewhere.

[2] Cf Norman Macrae, John von Neumann (New York: Pantheon Books 1992) 190.

[3] The historical background is set out in Jon Bing et al, Handbook of Legal Information Retrieval (Amsterdam: North-Holland 1984), also available at http://www.lovdata.no/litt/index.html.

[4] Also known as 'inverted file' or 'concordance'.

[5] LITE is an acronym for 'Legal Information Thru Electronics', and it was launched on 13 November 1963 under the inevitable slogan Let there be LITE! The service was in 1975 renamed FLITE - 'F' for 'Federal'.

[6] The paragraph is based on private communication from Bryan Niblett to the author.

[7] Bryan Niblett therefore combines the two aspects of computers and law - later he became Reader in Law at the University of Kent at Canterbury, going from there to the chair of Professor in Computer Science at Swansea.

[8] For a review of his work, see Jon Bing, "The policies of legal information services: a perspective of three decades" in Peter Mirfield and Roger Smith (eds), Essays for Colin Tapper (London: LexisNexis UK 2003) 147-158.

[9] Cf Colin Tapper, "Legal Information and Computers: Great Britain" (1968) Law and Computer Technology January, 18-19. Here is mentioned the 'Office for Scientific and Technical Information' at Oxford, which was the name of the framework within which Tapper continued his work from the LSE. Colin Tapper is well known for his reluctance to have his photograph taken; it gives me mischievous pleasure to point out that his portrait appears with the article.

[10] An acronym for Centre de documentation juridique, the system was established by L'assemblee des bâtonniers de Belgique and La fédération des notaires.

[11] This is argued in more detail in Jon Bing, 'Legal information services: some trends and characteristics', Colin Campbell (ed), Data Processing and the Law (London: Sweet & Maxwell 1984) 29-45.

[12] Some confusion may arise from the use of the acronym JURIS also used for the US Justice Retrieval and Inquiry System, but the Bundesministerium der Justiz consulted with their American colleagues, who agreed to the German use of the name. The US service is now discontinued, see below.

[13] See his review in Borruso R., Civita' del computer (2 vol) (Sesto San Giovanni: Ipsoa Informatica 1978).

[14] Vittorio Frosini, Cibernetica diritto e società (Milan: Edizioni di Comunità 1967).

[15] He is currently at the University of Piemonte Orientale.

[16] Turin: Einaudi 1969.

[17] In 2001, the institute was renamed L'Istituto di Teoria e Tecniche dell'Informazione Giuridica (ITTIG).

[18] Bibent M., L'informatique appliquée à la jurisprudence (Université de Montpellier 1972).

[19] Professor Michel Vivant, whose work in substantive information law is prominent, is also from Montpellier, though he does not work within the sector discussed here.

[20] Mehl is the first known to have contributed a paper on computers and law in Europe, offered to a conference at the Institut technique des administrations publiques 21 May 1957, "La Cybernétique et l'administration".

[21] An acronym for 'Norske, rettsinformatiske studier'.

[22] The program is documented in many papers, and in Jon Bing and Trygve Harvold, Legal Decisions and Information Systems (Oslo: Scandinavian University Press 1975) and Jon Bing (ed), International Handbook in Legal Information Systems (Amsterdam: North-Holland 1984).

[23] The notion of relevance is discussed at some length in both sources mentioned above.

[24] See above, the Norwegian version was known as NOVA*STATUS.

[25] Colin Tapper, An experiment in the use of citation vectors in the area of legal data, CompLex 2/82 (Oslo: Scandinavian University Press 1982).

[26] But pure Boolean operators were demonstrated to be less efficient, cf Jon Bing, "Performance of Legal Text Retrieval Systems: The Curse of Boole" (1987) Law Library Journal 79(2) 187-202. It is therefore reasonable to reflect critically on the use of simple Boolean strategies offered by most search engines.

[27] SIFT was first used by the Council of Europe, Strasbourg.

[28] Cf Susanne Bjørner and Stephanie C. Ardito, "An interview with Richard Giering" (January 2004) Searcher, http://connection.ebscohost.com/content/article/1036116093.html;jsessionid=D0437073C6647A193B3E827575CC0AE2.ehctc1 (visited 17 July 2008).

[29] Cf Susanne Bjørner and Stephanie C. Ardito, cited above.

[30] This decision was announced at the 1978 conference of the British Society of Computers and Law.

[31] Formally, this was a new committee succeeding the former. I served as chair for this committee 1981-82.

[32] An example is R(83)3 on the 'protection of users' of legal information services.

[33] The Society was founded 11 December 1973 based on an initiative of the Scottish Legal Computer Research Trust, which itself was founded in January 1970.

[34] It is not quite clear how the doctrine of Crown Copyright applies to the different jurisdictions of the United States. It is reported that in 1984, Crown Copyright was used as a basis for state legislation in New York restricting the sale of data from the Legal Retrieval Service of the Bill Drafting Commission of the state legislature to competing services. But this is an exception; in general copyright in primary legal sources is not claimed. Cf Jon Bing, "The policies of legal information services: a perspective of three decades", in Peter Mirfield and Roger Smith (eds), Essays for Colin Tapper (London: LexisNexis UK 2003) 153.

[35] George R. Kondos, "Introduction to JURIS - Justice retrieval and inquiry system", Abidjan World Conference on World Peace through Law, 1973.

[36] LEXIS was paying US$ 50,000 annually in license fees to West for incorporating the pagination system, based on West Pub Co v Mead Data Cent., Inc, 616 F Supp. 1571 (D Minn 1985), aff'd, 799 F 2d 1219 (8th Cir), cert denied, 479 US 1070 (1986). In a subsequent case, Matthew Bender and HyperLaw v West (SDNY 94 Civ 0589, 19 May 1997, United States District Court), Judge John Martin determined that West could not claim copyright in its enhanced versions of decisions as included in its reporters. However, Matthew Bender was acquired by Reed Elsevier in 1998; therefore the decisions were not pursued. It is doubtful whether the copyright in the pagination system would be upheld according to the Supreme Court's interpretation of the copyright originality test in Feist Publications, Inc, v Rural Telephone Service Co, 499 US 340 (1991).

[37] Carole D. Hafner, An Information Retrieval System based on a Computer Model of Legal Knowledge (Ann Arbor: UMI Research Press, 1981).

[38] Tax Analysts, Plaintiff, v United States Department of Justice, Defendant, and West Publishing Company, Defendant-Intervenor, 913 F Supp. 599.

[39] Cf Linda Myers, "CU Law institute web site has latest legal information, from Miranda to Elian", http://www.news.cornell.edu/Chronicle/00/4.27.00/Legal_Info_Inst.html [25 July 2002].

[40] Cf http://www.law.cornell.edu/.

[41] One will appreciate that 1992 is very early indeed for such a service.


[42] The term 'Legal Information Institute' (LII) refers to a provider of legal information that is independent of government, and provides free access on a non-profit basis to multiple sources of essential legal information, cf Graham Greenleaf, Philip Chung and Andrew Mowbray, "Free access to law via internet as a condition of the rule of law in Asian societies: HKLII and WorldLII", http://www2.austlii.edu.au/-~graham/publications/2002/HKLII_WorldLII_Jan02/HKLII_WorldLII.html#Heading3 [25 July 2002]. See also Greenleaf, this volume.

[43] Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information.

[44] One may refer to the experience of Niblett when trying to introduce a statutory information service. In general, see Stephen John Saxby, Public Policy and Legal Regulation of the Information Market in the Digital Network Environment, CompLex 2/1996 (Oslo: Norwegian Research Center for Computers and Law).

[45] Graham Greenleaf, Andrew Mowbray, Geoffrey King and Peter van Dijk, "Public access to law via internet: the Australasian Legal Information Institute", http://www.austlii.edu.au/austlii/articles/libs_paper.html#RTFToC11 [25 July 2002].

[46] Other LIIs have similar standards; for Cornell LII, see Thomas R Bruce, "Some Thoughts on the Constitution of Public Legal Information Providers", http://www4.law.cornell.edu/working-papers/open/bruce/warwick.html [26 July 2002].

[47] One of the innovative features of AustLII is an expert system integrated in the information service. When the user has identified a provision in a statute, the user may (where available) switch to an expert system mode, which guides the user through a series of questions in order to advise whether the provision applies to the user's problem.

[48] Cf http://www.caselex.com/.