If you see a TERM that is not clear or needs refreshing, or if you have a TERM you would like to add to the GLOSSARY, please send email to elizabeth.beaudin@yale.edu. Thanks!
API
an application programming interface “is the interface that a computer system, library or application provides in order to allow requests for services to be made of it by other computer programs, and/or to allow data to be exchanged between them”
automatic document feeder
an attachable part that automatically sends multiple pages through the scanner
automatic document reader
a scanner that has the ability to process many documents
bento box
Bento boxes are traditional Japanese lunch boxes which come in different sizes and shapes. They have little containers that hold a variety of lunch time foods. The organizational structure of a bento box is now used when designing a user interface on a web site, thus creating several box shapes to hold different yet related information.
best practices
a method, process, or activity that is more effective at delivering a particular outcome than any other technique, method, process, etc., with fewer problems and unforeseen complications
born-digital
an asset that originated in digital form. Some examples include websites, wikis, e-books, digital sound recordings, and email.
cascading stylesheets
describe and customize the presentation, such as colors and fonts, of a document or a web page written in HTML
CCITT Group IV
an image compression scheme named for the Comité Consultatif International Téléphonique et Télégraphique (CCITT), a telecommunications standards body created in 1956
checksum
a function used for validating data integrity. A common example is MD5 (Message-Digest algorithm 5). An algorithm or formula is applied against the source (typically a file and its content, such as the image of a scanned page from a book) in order to generate a 128-bit hash value often called a checksum. In digital preservation processes, the MD5 checksum from when the content was created is compared to another checksum created after the content has been received or stored over a period of time. If the two values match, this indicates that the data (e.g. the scanned page image) is intact and has not been altered.
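The comparison of checksums described in this entry can be sketched in Python with the standard hashlib module. This is only an illustrative sketch: the byte strings stand in for the contents of a scanned page image, and the helper name md5_checksum is hypothetical.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the 128-bit MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for the bytes of a scanned page image.
original = b"scanned page image bytes"
checksum_at_creation = md5_checksum(original)

# Later, after transfer or storage, the checksum is computed again.
received_intact = b"scanned page image bytes"
received_altered = b"scanned page image bytez"  # one byte differs

# Matching values indicate that the content has not changed.
intact_ok = md5_checksum(received_intact) == checksum_at_creation
altered_ok = md5_checksum(received_altered) == checksum_at_creation

print("intact copy matches:", intact_ok)
print("altered copy matches:", altered_ok)
```

In practice the checksum recorded at creation would be stored alongside the file and the file would be read from disk, rather than held in memory as here.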
consortium/consortia
group of organizations with a common purpose to meet a goal that would normally be beyond the capabilities of a single member
copyright
exclusive rights regulating the use of a particular expression of an idea, materials, or information. In other words, "the right to copy" an original creation.
copyright holders
only the copyright holder (whether the person, estate, or representative) is permitted to exercise the rights reserved by copyright; all others are prohibited from using the work or materials without the consent of the copyright holder
DC
Dublin Core is a metadata standard for cross-domain information resource description. It takes its name from a 1995 workshop hosted by OCLC, a library cooperative based in Dublin, Ohio.
digitization
digitization is the process or series of software programs used to make a representation of an object, an image, or a signal (when dealing with audio) by a discrete set of its points or samples. The result is usually called a digital image for the object, and digital form for the signal.
EAD
Encoded Archival Description is an XML standard for encoding archival finding aids, maintained by the Library of Congress in partnership with the Society of American Archivists.
fair use
the concept from United States copyright law that permits limited use of copyrighted material without requiring permission from the copyright holders, such as use for scholarly research or review.
Fedora
Flexible Extensible Digital Object Repository Architecture is a software framework to construct and maintain repositories of digital objects.
FTP
File Transfer Protocol
GNU
The GNU General Public License (GNU GPL or simply GPL) is a widely used free software license, originally written by Richard Stallman for the GNU project and first released in 1989.
IPR
Intellectual Property Rights – copyright information related to materials published and later held in libraries
ISO
International Organization for Standardization
JPG
Joint Photographic Experts Group – the name of the group that developed the standard. JPG is a compression method for images.
JPG 2000
JPEG 2000 is a wavelet-based image compression standard. It was created by the Joint Photographic Experts Group committee in the year 2000 with the intention of superseding their original discrete cosine transform-based JPEG standard (created about 1991). The standardized filename extension is JP2.
LZW
LZW is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch.
MARCXML
MARC (MAchine-Readable Cataloging) is bibliographic and related information in machine-readable form; the standard was developed by Henriette Avram at the Library of Congress in the 1960s. MARCXML expresses MARC records in XML.
METS
an encoding standard for descriptive, administrative, and structural metadata regarding objects within a digital library, using XML schema language
MODS
a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications, to define elements in datasets often used in digital libraries
NISO
National Information Standards Organization
OAI
Open Archives Initiative. OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is its protocol for collecting metadata from repositories. Harvesters are software programs that request metadata from repositories conforming to published OAI standards.
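As a rough illustration of what a harvester sends, the sketch below builds an OAI-PMH request URL in Python. The verb ListRecords and the metadataPrefix value oai_dc come from the OAI-PMH specification; the base URL and the function name build_list_records_url are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hypothetical repository endpoint, used only for illustration.
url = build_list_records_url("https://repository.example.org/oai")
print(url)
```

A real harvester would fetch this URL, parse the XML response, and follow any resumption token the repository returns to retrieve further batches of records.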
OCR
Optical Character Recognition: computer software designed to convert images of text (usually captured by a scanner) into machine-editable text
PDF
Portable Document Format: a file format, created by Adobe Systems, for document exchange in a manner independent of the application software, hardware, and operating system
Persistent URL
a Uniform Resource Locator that remains unique and intact regardless of its object’s location or state
Portal
usually a site on the Internet that uses specially designed features to offer its visitors services from a number of different sources.
PREMIS
Preservation Metadata: Implementation Strategies, a core set of preservation metadata, applicable across a wide range of digital preservation contexts and supported by guidelines.
repository
a central place where databases or files are located or distributed over a network, providing persistence of access and preservation of the digital objects
RFP
Request for Proposal: a request for a written bid from an outsourcing vendor
Sakhr
A software company based in Egypt, active in the IT industry since the early 1980s. Sakhr Automatic Reader is the company's OCR software, specialized for converting Arabic text.
TIFF
Tagged Image File Format: widely acknowledged as a preferred format for preservation and technical longevity
TXT
A filename extension for files consisting of text that usually contain very little formatting
Unicode
A character coding system to support the worldwide exchange, processing, and display of the written texts of the diverse languages and technical disciplines of the modern world
union list
a unified listing of materials held distinctly or in common by a group of libraries. Materials represented often reflect a given subject area of mutual interest to the participating institutions and to others beyond that group.
USB
Universal Serial Bus is a serial bus standard to interface devices, such as flash or external drives.
UTF-8
8-bit UCS/Unicode Transformation Format, a character encoding that is backwards compatible with ASCII. The encoding standard is capable of representing, for display in email and in Internet browsers, the standard 128 ASCII characters for English as well as Latin alphabet characters with diacritics, Greek, Cyrillic, Coptic, Armenian, Hebrew, and Arabic characters.
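The backwards compatibility with ASCII can be seen in a short Python sketch. The sample characters below are chosen only for illustration: an ASCII letter encodes to a single byte identical to its ASCII value, while the other characters shown each take two bytes.

```python
# One sample character per script, written as escapes to avoid ambiguity.
samples = {
    "ASCII": "A",
    "Latin with diacritic": "\u00e9",  # e with acute accent
    "Greek": "\u03b1",                 # alpha
    "Cyrillic": "\u044f",              # ya
    "Hebrew": "\u05d0",                # alef
    "Arabic": "\u0639",                # ain
}

# Number of bytes each character occupies when encoded as UTF-8.
byte_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# ASCII text produces exactly the same bytes in UTF-8 as in ASCII.
ascii_compatible = "OCR".encode("utf-8") == "OCR".encode("ascii")

for name, length in byte_lengths.items():
    print(f"{name}: {length} byte(s)")
print("ASCII compatible:", ascii_compatible)
```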
VERUS
OCR software specialized for converting Arabic text, produced by NovoDynamics Inc., headquartered in Ann Arbor, Michigan.
Web services
“a software system designed to support interoperable machine-to-machine interaction over a network”; e.g., systems using open standards can communicate over the Internet much as different software systems on a single computer can interact.
Workflow
the movement of documents and/or tasks through a process to accomplish a goal, e.g. digitization workflow involves scanning, processing, and OCR conversion; repository workflow can include ingest, indexing, searching, retrieval and presentation.
XML
Extensible Markup Language is a specification for defining customized markup languages. Thus, MARCXML, created by the Library of Congress, is an XML-based standard for representing the original LC MARC format.
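To show what a customized markup language looks like in practice, the sketch below parses a small XML fragment with Python's standard xml.etree.ElementTree module. The element and attribute names (glossary, entry, term) are invented for this example and do not belong to any published schema.

```python
import xml.etree.ElementTree as ET

# A tiny document in a made-up markup vocabulary.
glossary_xml = """<glossary>
  <entry term="OCR">Optical Character Recognition</entry>
  <entry term="FTP">File Transfer Protocol</entry>
</glossary>"""

# Parse the text into an element tree and collect term/definition pairs.
root = ET.fromstring(glossary_xml)
entries = {entry.get("term"): entry.text for entry in root.findall("entry")}

for term, definition in entries.items():
    print(f"{term}: {definition}")
```

Standards such as MARCXML, METS, and MODS work the same way, except that their element names and structure are fixed by a published schema.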