Chris Cieri
University of Pennsylvania School of Law
ccieri@oyez.law.upenn.edu
The University of Pennsylvania Law School has been involved in several digital library projects over the past two years. Although the requirements of each project have varied, one constant has been our emphasis on throughput and the quality of the final product. The majority of our electronic material, approximately 90,000 pages, consists of digital images, raw text, and an interface that allows the user to search the raw text but view the perfectly rendered images. This database includes the archive of the American Law Institute (ALI) and two collections of Uniform Commercial Code drafts, also issued by ALI. The papers of Judge Bazelon are next to be added. As a participant in the RLG Studies in Scarlet Project, we are converting over 100,000 pages of text into digital images, which will be accessed via SGML-encoded indices. A small subset of the most important material will also appear as SGML-encoded full text. As a participant in the RLG Webdoc project, we are using Adobe Capture to create PDF files, with mixed text and images, of the last five years of the Law School's journals. Finally, to produce an archive of past Law School exams with sample answers, we used optical character recognition to convert the paper exams into ASCII files, which we then edited and converted into HTML documents mounted on our intranet. In this presentation we will review our experiences with each of the projects and summarize what we see as the advantages and costs of each approach.