
Digital Initiatives : Glossary

Questions? Additions?

If you see a TERM that is not clear or needs refreshing or if you have a TERM you would like to add to the GLOSSARY, please send email to . Thanks!



API

an application programming interface “is the interface that a computer system, library or application provides in order to allow requests for services to be made of it by other computer programs, and/or to allow data to be exchanged between them”


automatic document feeder

an attachable part that automatically sends multiple pages through the scanner


automatic document reader

a scanner that has the ability to process many documents


bento box 

Bento boxes are traditional Japanese lunch boxes which come in different sizes and shapes.  They have little containers that hold a variety of lunch time foods.  The organizational structure of a bento box is now used when designing a user interface on a web site, thus creating several box shapes to hold different yet related information. 


best practices

a method, process, or activity that is more effective at delivering a particular outcome than any other technique, method, process, etc., with fewer problems and unforeseen complications



born digital

an asset that originated in digital form. Some examples include websites, wikis, e-books, digital sound recordings, and email.


cascading style sheets

describe and customize the presentation, such as colors and fonts, of a document or a web page written in HTML



CCITT

an image compression scheme named for the Comité Consultatif International Téléphonique et Télégraphique (CCITT), a telecommunications standards body established under that name in 1956



checksum

a value used for validating data integrity. An algorithm or formula is applied to the source (typically a file and its content, such as the image of a scanned page from a book) to generate a fixed-length hash value, often called a checksum. A commonly used algorithm is MD5 (Message-Digest algorithm 5), which produces a 128-bit value. In digital preservation processes, the checksum recorded when the content was created is compared with another checksum generated after the content has been received or stored over a period of time. If the two values match, this indicates that the data (e.g. the scanned page image) is intact and has not been altered.
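The comparison described above can be sketched in Python using the standard library's hashlib module. This is a minimal illustration, not a prescribed preservation procedure; the function names and the sample bytes are assumptions made for the example.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string (128 bits)."""
    return hashlib.md5(data).hexdigest()


def is_intact(data: bytes, recorded_checksum: str) -> bool:
    """Compare a freshly computed checksum with the one recorded earlier."""
    return md5_checksum(data) == recorded_checksum.lower()


# Hypothetical stand-in for the bytes of a scanned page image.
original = b"scanned page image bytes"
recorded = md5_checksum(original)  # checksum recorded at creation time

print(is_intact(original, recorded))            # True: content unchanged
print(is_intact(original + b"\x00", recorded))  # False: content altered
```

In practice the recorded checksum would be stored alongside the file (for example in its preservation metadata) and recomputed after each transfer or at regular intervals.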



consortium

a group of organizations with a common purpose, formed to meet a goal that would normally be beyond the capabilities of a single member



copyright

exclusive rights regulating the use of a particular expression of an idea, materials, or information. In other words, "the right to copy" an original creation.


copyright holders

only the copyright holder, whether the person, estate, or representative, is permitted to exercise the rights restricted by copyright; all others are prohibited from using the work or materials without the consent of the copyright holder



Dublin Core is a metadata standard for cross-domain information resource description, created by OCLC, a library consortium, based in Dublin, Ohio.



digitization is the process or series of software programs used to make a representation of an object, an image, or a signal (when dealing with audio) by a discrete set of its points or samples. The result is usually called a digital image for the object, and digital form for the signal.



EAD

Encoded Archival Description is an XML standard for encoding archival finding aids, maintained by the Library of Congress in partnership with the Society of American Archivists.


fair use

the concept from United States copyright law that permits limited use of copyrighted material without requiring permission from the copyright holders, such as use for scholarly research or review.



FEDORA

Flexible Extensible Digital Object Repository Architecture is a software framework to construct and maintain repositories of digital objects.



FTP

File Transfer Protocol: a standard network protocol for transferring files between computers



GPL

The GNU General Public License (GNU GPL or simply GPL) is a widely used free software license, originally written by Richard Stallman for the GNU project and first released in 1989.



IPR

Intellectual Property Rights – copyright information related to materials published and later held in libraries



ISO

International Organization for Standardization



JPG

Joint Photographic Experts Group: the name of the group that developed the standard. JPG (also written JPEG) is a compression method for images, most commonly used in its lossy form.


JPEG 2000

JPEG 2000 is a wavelet-based image compression standard. It was created by the Joint Photographic Experts Group committee in the year 2000 with the intention of superseding their original discrete cosine transform-based JPEG standard (published in 1992). The standardized filename extension is .jp2.



LZW

LZW is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch.
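As a rough illustration of how LZW works, the sketch below builds a dictionary of previously seen strings and emits one integer code per longest known match. It is a teaching sketch only: it assumes the input contains only characters with code points below 256, and it returns a list of integers rather than the packed bit stream that real file formats use. The function names are chosen for this example.

```python
def lzw_compress(text: str) -> list:
    """Compress text into a list of integer codes (input assumed to be Latin-1)."""
    dictionary = {chr(i): i for i in range(256)}  # start with single characters
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the match
        else:
            codes.append(dictionary[current])  # emit the longest known match
            dictionary[candidate] = next_code  # learn the new string
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list) -> str:
    """Rebuild the original text from codes produced by lzw_compress."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[0]  # code refers to the string being built
        else:
            raise ValueError("invalid LZW code: %d" % code)
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


codes = lzw_compress("ABABABA")
print(codes)                  # [65, 66, 256, 258]
print(lzw_decompress(codes))  # ABABABA
```

Because the decompressor rebuilds the same dictionary as it reads the codes, no dictionary needs to be stored with the compressed data, and the original is recovered exactly (lossless).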



MARC (MAchine-Readable Cataloging) is bibliographic and related information in machine-readable form; the standard was developed by Henriette Avram at the Library of Congress in the 1960s.



METS

the Metadata Encoding and Transmission Standard is an encoding standard for descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language



MODS

the Metadata Object Description Schema is a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications, to define elements in datasets often used in digital libraries



NISO

National Information Standards Organization



OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting): a protocol for collecting metadata from repositories. Harvesters are software programs that request metadata conforming to published OAI standards from repositories on the Internet.
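A harvester's request is an ordinary HTTP URL carrying a protocol verb and its arguments. The sketch below only builds such a URL; it does not contact any server. The base URL is a hypothetical placeholder and the function name is made up for this example, while ListRecords, metadataPrefix, set, and the oai_dc prefix (unqualified Dublin Core) come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint, used for illustration only.
url = build_list_records_url("https://repository.example.org/oai")
print(url)
# https://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```

The repository answers such a request with an XML document containing the metadata records, which the harvester then parses and stores.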



OCR

Optical Character Recognition: computer software designed to convert images of text (usually captured by a scanner) into machine-editable text



PDF

Portable Document Format: a file format, created by Adobe Systems, for document exchange in a manner independent of the application software, hardware, and operating system


Persistent URL

a Uniform Resource Locator that remains unique and intact regardless of its object’s location or state



portal

usually a site on the Internet that offers its visitors specially designed features and brings together services from a number of different sources



PREMIS

Preservation Metadata: Implementation Strategies, a core set of preservation metadata, applicable across a wide range of digital preservation contexts and supported by guidelines.



repository

a central place where databases or files are located or distributed over a network, providing persistence of access and preservation of the digital objects



RFP

Request for Proposal: a request for a written bid from an outsourcing vendor



Sakhr

A software company based in Egypt, active in the IT industry since the early 1980s. Sakhr Automatic Reader, produced by Sakhr Software Company, is OCR software specialized for converting Arabic text.



TIFF

Tagged Image File Format: widely regarded as a preferred format for preservation and technical longevity



TXT

A filename extension for files consisting of text that usually contains very little formatting


Unicode

A character coding system to support the worldwide exchange, processing, and display of the written texts of the diverse languages and technical disciplines of the modern world


union list

a unified listing of materials held distinctly or in common by a group of libraries.  Materials represented often reflect a given subject area of mutual interest to the participating institutions and to others beyond that group.



USB

Universal Serial Bus is a serial bus standard to interface devices, such as flash or external drives.



UTF-8

UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode that is backwards compatible with ASCII. The encoding standard is capable of displaying in email and in Internet browsers the standard 128 ASCII characters for English as well as Latin alphabet characters with diacritics, Greek, Cyrillic, Coptic, Armenian, Hebrew, and Arabic characters.
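The ASCII compatibility mentioned above can be seen by encoding a few characters and looking at the resulting bytes. This is a small illustrative sketch; the helper name and the sample characters are choices made for this example.

```python
def utf8_bytes(text: str) -> list:
    """Return (character, byte count, hex bytes) for each character in text."""
    result = []
    for ch in text:
        encoded = ch.encode("utf-8")
        result.append((ch, len(encoded), encoded.hex()))
    return result


# One ASCII letter, one Latin letter with a diacritic, one Greek and one Arabic letter.
for ch, size, hex_bytes in utf8_bytes("AéΩع"):
    print(ch, size, hex_bytes)
# A 1 41
# é 2 c3a9
# Ω 2 cea9
# ع 2 d8b9

# An ASCII character has the same single byte in UTF-8 as in ASCII.
print("A".encode("utf-8") == "A".encode("ascii"))  # True
```

ASCII characters keep their one-byte values, so existing ASCII text is already valid UTF-8, while other characters are encoded with two or more bytes.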



Verus

OCR software specialized for converting Arabic text, produced by NovoDynamics Inc., headquartered in Ann Arbor, Michigan.


Web services

“a software system designed to support interoperable machine-to-machine interaction over a network”; for example, systems on the Internet that use open standards can communicate much as different software systems on one computer can interact.



workflow

the movement of documents and/or tasks through a process to accomplish a goal, e.g. a digitization workflow involves scanning, processing, and OCR conversion; a repository workflow can include ingest, indexing, searching, retrieval, and presentation.



XML

Extensible Markup Language is a specification for defining customized mark-up languages. For example, MARCXML, created by the Library of Congress, is an XML schema for representing MARC 21 records.