Archon

Archon: A Unified Information Storage and Retrieval System for Lone Archivists, Special Collections Librarians and Curators

Scott W. Schwartz
Archivist for Music and Fine Arts
University of Illinois at Urbana-Champaign

Christopher J. Prom
Assistant University Archivist
University of Illinois at Urbana-Champaign

Christopher A. Rishel
Programmer
University of Illinois at Urbana-Champaign

Kyle J. Fox
Assistant Programmer
University of Illinois at Urbana-Champaign

Abstract

The University of Illinois developed an open-source collections management software program and in August 2006 began making it freely available to archivists, curators, and special collections librarians. This program gives those with limited technological resources and knowledge the ability to easily mount a variety of on-line access tools to their historical collections using description standards compliant with ISAD(G) [1] and DACS [2]. Archon was created with robust interoperability using a single web-based platform for the management of collections of documents and artifacts held by archives, museums and libraries. It was developed as a “plug and play” application for easy installation on any web server or on any web hosting service. It uses common web-browser input mechanisms and SQL data storage to produce dynamic data output in the form of searchable collections websites, MARC bibliographic records (Smiraglia 1990), EAD finding aids (Pitti 268-293), and long-term preservation TXT data files. The article discusses the design concepts that led to the University of Illinois’ creation of Archon and the challenges faced by the archives community when providing descriptive access to large bodies of historical papers and records. It also describes Archon’s public and administrative interfaces as well as future plans for additional developments to this software program.

Keywords: Archon, encoded archival description, archival information systems, databases, web interfaces

Archives, museums, and libraries strive to promote open and equitable access to historical and documentary records of enduring value in their care, and they recognize their responsibility to promote the use of those records as their fundamental purpose. (Society of American Archivists)

Introduction of Design Concept

The University of Illinois’ Archivist for Music and Fine Arts and the Assistant University Archivist developed Archon, a new automated collections management program, because several units in our Library hold archival materials and we needed an easy-to-use archival information system that could be adapted to any institutional setting. [3] We also believed that many of our colleagues in North America had the same need. We wanted our application to be particularly useful to small “one-person” repositories that have been unable to take full advantage of current archival descriptive standards and other complex collections management software tools under development. Our objective was to create an application in which the entry of collection information would be through a single web form, but with the power to output this data in many different formats. In addition, updates or corrections to our repository’s on-line collections information would propagate automatically to their related output formats in order to ensure the public’s access to the most current data about our collections without any manual intervention.

Optimistic skepticism from many colleagues was the most common reaction we encountered when explaining our initial idea to them. However, a demonstration of Archon 1.0 to a standing-room-only audience during the August 2006 annual meeting of the Society of American Archivists has tempered much of this uncertainty among the members of the archives community. Since this initial presentation we have had 880 downloads of the application from our Archon website [4] and 117 completed installations of the program by a variety of repositories including: Archivistica Dominicana Inc., Auburn University, Church USA Archives-North Newton, Edinburgh University, Lawrence Massachusetts Historical Society, Purdue University, Simmons College Archives, Southern Baptist Theological Society, Southern Illinois University Carbondale, University of Akron, University of Houston, University of Illinois at Springfield, University of West Florida and William and Mary College. [5] As more archives, museums and libraries in North America begin to use Archon, we hope a user group of archivists, curators and librarians will join the University of Illinois in the future collaborative development of this application, which we believe will better serve our communities’ preservation and access needs.

Challenges of Archival Description and Access

Archival records, personal papers and artifact collections are among the most valuable materials held by libraries and other cultural repositories.  These historical papers, objects and administrative documents have great evidential, informational, intrinsic and monetary value.  Taken collectively, they comprise the most significant resources that many cultural institutions hold for primary research.  In addition, many of these groups of documents and artifacts continue to have administrative and cultural significance to those who created them long after they have been deposited in these repositories. 

These compilations of material culture also create unique descriptive and access challenges.  Most published books and journals come to libraries with specific titles, identifiable authors, and other standardized descriptive information about their general content.  This information is usually entered into either a paper-based or digital public access catalog.  Most collections of historical papers, records and artifacts do not typically come to archives, special collections and museums with these same types of supplied descriptive information.  In addition, many of these groups of documents and objects frequently do not bear a coherent pattern of arrangement.  In most instances, the many parts of an archival aggregation of documents retain their significance only when they are described as a whole within the context of their original creation.

Archivists have attempted to alleviate these special challenges by balancing the need to describe the original context of a body of records (i.e., the evidence of transactions between individuals and organizations) against the desire to describe the specific products of those transactions (i.e., the information about the documents). [6] Prior to 1990, technological obstacles restricted most institutions from sharing up-to-date descriptions of their holdings outside of their immediate repositories. This situation contributed to a proliferation of local descriptive practices, particularly among archives, special collections and museums. However, remedies to many of these same obstacles were made possible with the development of more advanced desktop, data storage, and data distribution technologies during the 1990s.

Significant steps taken by archivists and librarians to create unified descriptive standards for archives and special collections have included the development of MAchine Readable Cataloging Format for Archives and Manuscripts Control (MARC-AMC) in 1985. MARC-AMC established a uniform bibliographic format for archival materials. Encoded Archival Description (EAD) was developed in 1997 and initiated a uniform data structure for encoding on-line finding aids to these materials. [7]

However, MARC-AMC and its current successor, MARC21, continue to lack the flexibility to efficiently describe the full context and content of archival materials. In addition, archivists have identified several EAD implementation problems, including archivists’ difficulty using currently available encoding tools (Prom, “EAD Cookbook” 257-75), incompatible encoding of documents which defeats the easy exchange of collections information across repositories (Prom, “Does EAD Play Well” 52-72), and web pages generated from EAD documents that are not optimally accessible to users of commonly available web browsers (Prom, “User Interactions” 234-68). All of the current encoding tools also lack the ability to seamlessly publish those EAD finding aids on-line without the use of third-party software and compatible web hosting services (e.g., Research Libraries Group’s EAD Conversion Services). [8] Furthermore, the encoding standards themselves do not provide guidance to repositories regarding what information to include in a descriptive record; that guidance comes from separate content standards. In the United States, the current content standard is Describing Archives: A Content Standard (DACS). In Canada, most archives use a corresponding standard, Rules for Archival Description (RAD). [9] Archon was developed to accommodate both of these descriptive standards.

Another significant challenge faced by archivists, curators and librarians involves the approaches that are typically used for the arrangement and description of most historical materials. Traditional archival practices have argued that until a collection of papers or records is processed (i.e., either identifying or establishing an intellectual and physical arrangement for these materials) it cannot be described, and until that body of papers and artifacts is described, that description cannot be made accessible to users through paper-based and online tools. Some repositories have used this circular argument as justification for the existence of their extensive unprocessed and undescribed collections. Greene and Meissner have suggested these backlogs are the result of frequent over-processing of collections by individuals seeking “perfect” custodial practice (208-63). However, recent research on arrangement and descriptive practices used by today’s college and university archives attributes these unnecessary accumulations of historical materials to an obsessive application of a variety of descriptive standards, and to the complexity of most of today’s EAD coding and online publication options that are available to libraries, archives and museums (Prom and Swain).

The initial allure of EAD as a possible panacea for these growing backlogs of old and new collections of documents and artifacts has turned to frustration for many archivists and special collections librarians in North America.  While there are no current studies that have identified an acceptable timeframe as an encoding norm for the creation of EAD finding aids, recent informal studies conducted by the University of Illinois of its arrangement and description practices have demonstrated that skilled staff members typically take an average of 20 hours to encode a 100-page finding aid using currently available XML markup tools.  This is in addition to the time that is needed to write a general description and develop a collection-level box and folder listing using a standard word processing application.  We feel that these are time and staffing resources that we can ill afford to expend, given the nature of our growing backlog of unprocessed collections at the University of Illinois.  The same could be said for many of our other colleagues in North America.

Backlogs of unprocessed collections of documents are an endemic product of most traditional arrangement and description practices.  However, their existence threatens the core objective of good custodial practice, which is the provision of access to historical information and artifacts in the care of libraries, archives and museums.  Until the creation of descriptive aids flows seamlessly from archival processing, and the sharing of collections information across repositories is as easy as word processing, it is unlikely that significant improvements will be achieved in terms of providing better public access to these growing backlogs of papers, records and artifacts.  This is particularly true for small, underfunded, and understaffed repositories.  Archon was developed to address these problems.

Developmental Considerations

The initial development of Archon included collection- and folder-level descriptive functions.  However, to keep the application robust and current with the needs of today’s archival practice, the programming was refined to accommodate description at any level, including the repository/division, record group/fonds, collection, series, file/folder, and document/item levels.  In addition, further levels of description, when required by a particular repository, can be defined by that repository. 
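
The multilevel model described above can be pictured as a simple tree in which each node carries a level name. The following Python sketch is illustrative only and is not Archon’s code (Archon itself is written in PHP); the class name `Component`, the helper functions, and the sample titles are all hypothetical. It assumes only what the paragraph states: a default set of levels plus the option for a repository to define its own.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Level names taken from the article; a repository may define further ones.
DEFAULT_LEVELS = ["repository", "record group", "collection",
                  "series", "file", "item"]


@dataclass
class Component:
    """One unit of description at any level (hypothetical structure)."""
    level: str
    title: str
    children: List["Component"] = field(default_factory=list)

    def add(self, child: "Component") -> "Component":
        """Attach a child component and return it so calls can be chained."""
        self.children.append(child)
        return child


def walk(node: Component, depth: int = 0) -> Iterator[Tuple[int, Component]]:
    """Yield (depth, component) pairs in top-down display order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def outline(root: Component) -> List[str]:
    """Render the hierarchy as indented text lines."""
    return [f"{'  ' * depth}{node.level}: {node.title}"
            for depth, node in walk(root)]


# Sample data is invented for illustration.
collection = Component("collection", "Example Family Papers")
series = collection.add(Component("series", "Correspondence"))
box = series.add(Component("box", "Box 1"))  # a locally defined level
box.add(Component("file", "Folder 4"))

for line in outline(collection):
    print(line)
```

Because the level is stored as plain data rather than fixed in the structure, a locally defined level such as "box" needs no special handling.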

We felt that because Archon could be used to describe a wide variety of collections across different repositories within a single institution, it was crucial to support authority control functions. These features enabled our collections to be dynamically searched, grouped and re-grouped by provenance, creator, subject and genre by researchers using controlled vocabularies commonly utilized by libraries and archives (e.g., Library of Congress Subject Headings and Art and Architecture Thesaurus). In addition, we included in Archon’s authority control system the flexibility to accommodate local subject access terms when these were needed for a particular repository or collection. [10] We also decided that this authority information should be managed separately from collections information, but agreed that it was essential that the system have the ability to link these authority records to appropriate individual collection-level descriptions entered into the Archon system.
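
One common way to keep authority records separate from collection records while still linking them is a many-to-many link table in a relational database. The sketch below uses Python’s built-in `sqlite3` module purely for illustration; the table names, column names, and sample rows are assumptions and do not reproduce Archon’s actual schema.

```python
import sqlite3

# In-memory database; the schema is a hypothetical illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collections (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE subjects (
    id     INTEGER PRIMARY KEY,
    term   TEXT NOT NULL UNIQUE,
    source TEXT NOT NULL CHECK (source IN ('lc', 'local'))
);
-- Link table: authority terms live apart from collection descriptions.
CREATE TABLE collection_subjects (
    collection_id INTEGER NOT NULL REFERENCES collections(id),
    subject_id    INTEGER NOT NULL REFERENCES subjects(id),
    PRIMARY KEY (collection_id, subject_id)
);
""")

# Sample rows are invented for illustration.
conn.executemany("INSERT INTO collections (id, title) VALUES (?, ?)",
                 [(1, "Example Band Papers"),
                  (2, "Example Music Club Records")])
conn.executemany("INSERT INTO subjects (id, term, source) VALUES (?, ?, ?)",
                 [(1, "Bands (Music)", "lc"),
                  (2, "Campus concerts", "local")])
conn.executemany("INSERT INTO collection_subjects VALUES (?, ?)",
                 [(1, 1), (2, 1), (2, 2)])


def collections_for_subject(term):
    """Return titles of all collections linked to one subject term."""
    rows = conn.execute(
        """SELECT c.title
             FROM collections c
             JOIN collection_subjects cs ON cs.collection_id = c.id
             JOIN subjects s ON s.id = cs.subject_id
            WHERE s.term = ?
            ORDER BY c.title""",
        (term,),
    ).fetchall()
    return [row[0] for row in rows]


print(collections_for_subject("Bands (Music)"))
```

With this arrangement a term is corrected once in the authority table, and every linked collection picks up the change.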

Several platforms were considered as we mapped out the initial development of Archon, but we believed the final product would function best as a web-based application that could run on any web server supporting PHP (Hypertext Preprocessor) 5.0 or higher [11] and linked to a relational database. [12] Relying on open-source rather than proprietary software development tools enabled us to tightly control development costs expended from our limited budget. [13] In addition, the resulting open-source programming that was one of the outcomes of the project also made it easier to package Archon to meet the needs of other institutions.

Constructing Archon around a relational database structure enabled us to easily integrate the authority control and digital content library functions via a series of internal lookup tables.  The relational database structure also eased the process of converting, inputting and outputting collections information in multiple formats.  This was particularly relevant for the handling of EAD files.  While most repositories store their EAD finding aids in their native XML format, we felt from past experience that XML is much better suited as a data exchange rather than data storage format.  Since Archon automatically produces EAD files after data is entered via a webform, archival staff do not need to understand XML tagging or the details of the EAD standard.  This is also true for all MARC-AMC records produced by Archon.
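
The principle of storing a description once and generating exchange formats on demand can be sketched as follows. This Python example is an illustration under stated assumptions, not Archon’s implementation: the record fields and function names are hypothetical, and the XML uses only a handful of EAD element names (`ead`, `archdesc`, `did`, `unitid`, `unittitle`, `physdesc`, `extent`, `scopecontent`) in a simplified fragment that is not a complete or schema-valid finding aid.

```python
import xml.etree.ElementTree as ET

# A single stored record; field names and values are invented.
record = {
    "identifier": "EX-001",
    "title": "Example Family Papers",
    "extent": "2.5 cubic feet",
    "scope": "Correspondence and photographs.",
}


def to_ead(rec):
    """Build a simplified, EAD-like XML fragment from the stored record."""
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc", {"level": "collection"})
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unitid").text = rec["identifier"]
    ET.SubElement(did, "unittitle").text = rec["title"]
    physdesc = ET.SubElement(did, "physdesc")
    ET.SubElement(physdesc, "extent").text = rec["extent"]
    scope = ET.SubElement(archdesc, "scopecontent")
    ET.SubElement(scope, "p").text = rec["scope"]
    return ET.tostring(ead, encoding="unicode")


def to_text(rec):
    """Build a plain-text rendering of the same record."""
    return "\n".join([
        f"Identifier: {rec['identifier']}",
        f"Title: {rec['title']}",
        f"Extent: {rec['extent']}",
        f"Scope: {rec['scope']}",
    ])


print(to_ead(record))
print(to_text(record))
```

Because both outputs are derived from the same stored record, a correction to the record appears in every format the next time it is generated, and the person entering the data never writes XML by hand.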

Archon’s Public Interface

Archon’s public “search screens and results” pages can be integrated easily into any repository’s existing website utilizing one of several default templates supplied with the software. New templates and style sheets can also be easily created, or existing sheets can be adapted to match the graphical appearance of a particular repository’s existing web pages. [14] Archon’s “Administrative Interface” is available to staff members who are authenticated to log into it through a link at the bottom of any public page generated by Archon [see figure 1].

Figure 1.  Example of link to administrative interface login screen that appears at the bottom of each public page generated by Archon.

This feature provides staff the ability to easily move between the system’s public and administrative interfaces. It has been quite useful for our staff who are responsible for processing and describing archival collections, especially when they need to quickly view the public output of information as it is entered into Archon. [15] In addition, specific collections’ location information, not available through the public interface, can be displayed to reference room staff through Archon’s administrative interface.

The public web pages generated by Archon’s scripts allow users to search collection descriptions and digital content simultaneously within a single repository or across multiple repositories. [16] In addition to its general keyword search function, Archon provides users with the ability to browse collections by title, name of creator, subject, digital object title and archival record group [see figure 2].

Figure 2.  Archon navigation bar used by University Archives website.

General search results provided by Archon’s public interface are returned for both collections and digital objects that have been entered into the system as well as related creator and subject authority records. Users can expand and contract the displayed lists of associated links to specific material (i.e., box, folder and item content) within one or more collections.  This feature provides users the ability to narrow searches to specific content found across different collections of materials (e.g., John Philip Sousa’s Washington Post March) [see figure 3] as well as the ability to broaden their search from a specific piece of content in order to understand its context within larger groups of related documents and records (e.g., box-, series- and record-group level). 

Figure 3. Search results page for Sousa Archives and Center for American Music website.

The ability to browse collections of archival materials by record group/fonds is another unique feature of Archon.  Repositories that utilize provenance-based descriptive practices can use this function to group collections by a common creator or agency.  Archon dynamically generates provenance-based lists directly from the search queries through the public interface [see figure 4]. While provenance-based description is not a substitute for good subject indexing, it provides staff with the ability to fill in contextual and informational gaps when traditional subject indexing proves inadequate.  These automated browsing and inter-collection search features are unique to Archon and are unavailable among other archival software packages under development.

Figure 4.  Record group browsing results for University Archives.

Another crucial feature of Archon’s public interface is its dynamic production of collection-level descriptive records [see figure 5]. 

Figure 5.  Collections level record for the John Philip Sousa Music and Personal Papers.

These furnish preliminary information about specific collections (i.e., scope, size and arrangement). The most important data is delivered to the user’s desktop and related information is loaded in the background. If users need to access detailed information for a particular collection (e.g., a biographical note, a list of subject terms, or administrative information) they can open the “show” links illustrated in figure 5. The system also produces access links to formatted, printer-friendly, EAD and Portable Document Format (PDF) finding aids to these collections when they are available. Links are also provided to associated digital content which can be stored either directly in the Archon database, or in other systems [see figure 6]. [17]

Figure 6.  Truncated finding aid for the John Van Fossen Papers with a green arrow link to associated digital content (e.g., a sample image from box 1, folder 4).

Archon’s finding aid and digital object pages also render links to other collections or digital objects that are related either by provenance, subject or creator.  This feature is particularly useful to researchers seeking contextual information about specific objects contained in Archon’s digital library, because it shows in which collection the item is found and provides links to related collections and digital objects in the repository [see figure 7]. 

Archon’s Administrative Interface

Archon’s administrative interface can be used as easily as its public interface because the two are tied together.  Once an administrative member has successfully logged into Archon’s administrative interface, a series of “pencil icons” will appear next to the various data elements of the public display.   These symbols provide access to editable content for that specific record or finding aid [see figure 8].  Clicking on this pencil icon loads the content for editing in the administrative interface [see figure 9].   Administrative access to basic descriptive information for a specific collection is provided at the top of the collections manager window with more detailed information fields located in the bottom half of this window (e.g., location and creator information, collection description, subjects).  The level of editorial access granted to the staff member may vary depending on their individual service responsibilities within a repository.  For example, one individual’s level of access could include only read and write access while another may have read, write and delete access to collection information contained in Archon. 
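
Graduated access of the kind described above is often modeled as a set of combinable permission flags. The short Python sketch below is a generic illustration, not Archon’s actual permission system; the names `Permission`, `EDITOR`, `MANAGER`, and `can` are hypothetical.

```python
from enum import Flag, auto


class Permission(Flag):
    """Individual rights that can be combined per staff member."""
    READ = auto()
    WRITE = auto()
    DELETE = auto()


# Two example access levels matching the paragraph's example.
EDITOR = Permission.READ | Permission.WRITE
MANAGER = Permission.READ | Permission.WRITE | Permission.DELETE


def can(granted: Permission, needed: Permission) -> bool:
    """True when every needed right is included in the granted set."""
    return (granted & needed) == needed


print(can(EDITOR, Permission.WRITE))
print(can(EDITOR, Permission.DELETE))
print(can(MANAGER, Permission.DELETE))
```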

Figure 7.  A digital object record for a photograph included in the John Van Fossen Papers, held by the Sousa Archives and Center for American Music, University of Illinois at Urbana-Champaign.

Figure 8.  Edit icon (pencil symbol) displayed to the logged-in administrative member.

Figure 9.  Top portion of collection manager module in Archon’s administrative interface for the John Van Fossen Papers.

One of the more innovative aspects of the administrative interface is the ease with which complex operations, such as identifying and applying controlled subject terms to a specific collection, can be implemented through Archon’s programming. If a staff member needs to apply a term to a collection or to digital content, he/she opens the “subject” module of the collections manager and begins typing any portion of a term he/she wishes to use. Archon will immediately filter the term against the existing list of terms until the appropriate one is displayed. Once the term is displayed it can be easily linked to the appropriate collection or digital content with a simple click on the term. If no term exists within the controlled vocabulary list, a staff member can then begin constructing either an appropriate Library of Congress (LC) or local heading using the subject manager and load it to his/her repository’s controlled vocabulary list. The subject manager enables the staff member to identify new terms as either LC or local headings. Once this has been done, the new term can then be linked to either the collection record or digital object. All subject, genre, and creator headings are displayed as LC or local headings in the MARC bibliographic records for these collections.
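
The filter-as-you-type behavior can be approximated by a case-insensitive substring match against the controlled vocabulary. The Python sketch below is an assumption about how such filtering might work in general, not a description of Archon’s code; the function name `filter_terms` and the sample vocabulary are invented.

```python
def filter_terms(vocabulary, typed):
    """Return vocabulary terms containing the typed text, ignoring case.

    Matching on any portion of the term (not only its start) mirrors the
    behavior described in the text. Empty input returns no suggestions.
    """
    needle = typed.strip().casefold()
    if not needle:
        return []
    return sorted(term for term in vocabulary if needle in term.casefold())


# Sample vocabulary invented for illustration.
vocabulary = [
    "Bands (Music)",
    "Marches (Band)",
    "Music--Manuscripts",
    "Campus concerts",
]

print(filter_terms(vocabulary, "band"))
print(filter_terms(vocabulary, "mus"))
```

An empty result is the point at which a staff member would turn to the subject manager to add a new heading.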

Archon’s administrative interface includes many other features designed to simplify the management of information related to specific collections and digital content. Archon’s programming includes a “Content Manager” which provides administrative members with the ability to create series-, box-, and folder-level content descriptions of collections as easily as creating a word processing document. Once these descriptions have been added to Archon they are automatically displayed in the public interface as finding aids and other access tools correctly encoded as EAD, HTML, TXT and MARC files. If administrative members wish to restrict public access to this information until a collection is fully arranged and described, they can do this by clicking the “no” button for the “web-enable” function at the top of the collection manager window. This will disable all public display and search functions for this collection until the web-enable feature is implemented by an administrative member.
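
The effect of a web-enable switch on public searching can be illustrated with a small filter: a collection is returned to the public only when its flag is set. This Python sketch is hypothetical throughout (the `Collection` class, the `web_enabled` field, and the sample records are assumptions) and simply shows the gating logic the paragraph describes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Collection:
    """Minimal collection record with a public-visibility flag."""
    title: str
    description: str
    web_enabled: bool = False  # hidden from the public until switched on


def public_search(collections: List[Collection], keyword: str) -> List[str]:
    """Return titles of web-enabled collections matching the keyword."""
    needle = keyword.casefold()
    return [
        c.title
        for c in collections
        if c.web_enabled
        and (needle in c.title.casefold() or needle in c.description.casefold())
    ]


# Sample records invented for illustration.
holdings = [
    Collection("Example Band Papers", "Scores and programs", web_enabled=True),
    Collection("Example Club Records", "Minutes and band programs"),
]

print(public_search(holdings, "band"))  # only the web-enabled match
holdings[1].web_enabled = True          # processing finished
print(public_search(holdings, "band"))
```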

Once collection-level information has been entered or edited, a staff member can provide a more detailed description of the components of the collection (e.g., series, boxes, folders or items) by using the “Content Manager” [see figure 10].  Using the content manager is nearly as easy as creating a word processing document, and once these descriptions have been added to Archon they are automatically displayed in the public interface as finding aids, and other access tools are correctly encoded as EAD and MARC files.  All additions, changes, and deletions to Archon’s collections information are done in real time.

In cases where it is not possible to fully enter “legacy” finding aids (i.e., existing word-processing box lists) into the content manager, Archon provides staff members with the ability to link collection-level descriptive records to external digital files, such as PDF documents, by entering a URL into the appropriate field of the content manager. In these cases, a collection-level MARC record and EAD file are still dynamically produced by the system, ensuring that an institution can share general descriptive data about these collections with other institutions.

Finally, Archon supports batch import of collection information from a variety of different data formats that are typically used by archives, museums and libraries (e.g., static database, spreadsheet, word processing, HTML and EAD/XML). Archon can export data in these formats as well, so there is minimal risk if an institution decides to migrate away from Archon in the future. The MARC records, EAD-, HTML- and TXT-formatted finding aids, and digital content files that are dynamically generated by Archon can be uploaded as individual documents into any automated stand-alone system an institution chooses to use.
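
As an illustration of what a batch import from a spreadsheet export might involve, the Python sketch below reads comma-separated rows and loads them into a database table in one transaction. It is not Archon’s import script; the column names, the `import_collections` function, and the rule of skipping rows without a title are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export; the second data row lacks a title.
SAMPLE_CSV = """identifier,title,extent
EX-001,Example Family Papers,2.5 cubic feet
EX-002,,1 cubic foot
EX-003,Example Club Records,0.3 cubic feet
"""


def import_collections(conn, csv_text):
    """Insert rows that have a title; return the number imported."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        title = (row.get("title") or "").strip()
        if not title:
            continue  # skip incomplete rows rather than store empty records
        rows.append((
            (row.get("identifier") or "").strip(),
            title,
            (row.get("extent") or "").strip(),
        ))
    with conn:  # commit all rows together, or roll back on error
        conn.executemany(
            "INSERT INTO collections (identifier, title, extent) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections (identifier TEXT, title TEXT, extent TEXT)"
)
imported = import_collections(conn, SAMPLE_CSV)
titles = [r[0] for r in conn.execute(
    "SELECT title FROM collections ORDER BY identifier")]
print(imported, titles)
```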

Figure 10.  Archon’s content manager window for the John Van Fossen Papers.

Future Development Features

The University of Illinois Archives and Sousa Archives and Center for American Music currently have 5,614 collections of historical documents, administrative records and artifacts described in their installation of Archon 1.11. [18] This provides public access to over 20,846 cubic feet of materials in a variety of formats. While only 170 different digital objects are presently stored in our Archon system, we have completed the creation of a special import script that will automatically load over 2,500 digital images and their related metadata directly into our system in the summer of 2007. All of these images will be dynamically linked to their associated collection-level record. This is nearly a gigabyte of data that will be stored directly in the Archon database structure. [19]

We are now developing new multilingual support for Archon’s administrative interface so individuals from other countries can easily use the program in languages other than English.  This new functionality is being built around Archon’s ability to handle all language character sets.  Currently we are working on translations of the system into Spanish, French and Polish, but it will be possible to add other languages as well.  We are also creating several new administrative reports that will enable users to better track usage of specific collections, boxes and items.  This will assist with annual holdings maintenance and condition surveys of our collections at the University of Illinois.  In addition, this same programming feature will enable users to identify, tag, and request specific content from our various collections.  These online submissions will help our reference staff and researchers better prepare for these visits by ensuring requested collections are available and ready to use by these individuals.  

Conclusion

This article does not provide a thorough description of Archon’s many other features. A detailed explanation of Archon’s programming also falls outside the scope of this article. Those who wish to learn more can download the User and Administrator Manuals from the Archon website [20] (www.archon.org/reports.php). In addition, anyone can download and install Archon free of charge or spend time testing all of Archon’s user and administrative features using our sandbox site. The site also provides a link to the Archon Users Group/Listserv that allows individuals to discuss specific features with other institutions currently using this system. Any questions that cannot be answered by the Users Group can be sent directly to Archon’s Programming and Development staff.

Works Cited

Greene, Mark A. and Dennis Meissner. “More Product, Less Process: Revamping Traditional Archival Processing.” American Archivist 68.2 (2005): 208-63.

Pitti, Daniel V. “Encoded Archival Description: the Development of an Encoding Standard for Archival Finding Aids.” American Archivist 60 (Summer 1997): 268-293.

Prom, Christopher J. “The EAD Cookbook: A Survey and Usability Study.” American Archivist 65.2 (2002): 257-75.

Prom, Christopher J. “Does EAD Play Well with other Metadata Standards?” Journal of Archival Organization 1 (2002): 52-72.

Prom, Christopher J. “User Interactions with Electronic Finding Aids in a Controlled Setting.” American Archivist 67.2 (2004): 234-68.

Prom, Christopher J. “Optimum Access? Processing in College and University Archives.” Forthcoming in College and University Archives: Selected Readings. Ed. Christopher J. Prom and Ellen D. Swain. Chicago, Illinois: Society of American Archivists Press. Preliminary draft available at http://web.library.uiuc.edu/ahx/workpap/

Smiraglia, Richard P. Describing Archival Materials: the Use of MARC AMC Format. Binghamton, New York: Haworth Press Inc, 1990.

NB



[1] General International Standard for Archival Description (http://www.ica.org/biblio/cds/isad_g_2e.pdf).

[2] Describing Archives: A Content Standard (Chicago: Society of American Archivists, 2004).  (http://www.archivists.org/catalog/pubDetail.asp?objectID=1279)

[3] The initial design concept was brought to the University of Illinois by Scott Schwartz, the newly hired Archivist for Music and Fine Arts, who had been developing a dynamic online archives finding aid tool for the Smithsonian Institution’s National Museum of American History prior to his arrival at the Illinois campus in 2003.  A student programmer, Christopher Rishel, was hired in 2004 to help develop a prototype program for the Sousa Archives and Center for American Music to test and implement the initial design concept.  After a careful evaluation of the early prototype in 2005, the Assistant University Archivist, Christopher Prom, agreed to help further develop the program with a special focus on developing EAD-compliant finding aids while Schwartz focused on the creation of MARC bibliographic records.  Rishel took the lead on the development of database and PHP programming.  A second programmer was hired in the fall of 2006 to begin working on a new foreign language component for Archon 2.0 and learning the programming protocols used for Archon in preparation for taking over for the head programmer, Rishel, when he left for medical school in June 2007.

[4] The University of Illinois’ Archon website address can be found at (http://www.archon.org).

[5] All downloads and installations of Archon 1.0 and 1.1 beginning in January 2007 are free to government, public, academic and private “not-for-profit” archives, museums, and libraries.

[6] ISAD(G) defines the purpose of archival description as the identification of the context and the content of archival materials in order to facilitate their access. 

[7] For further information on EAD consult the Library of Congress website at (http://www.loc.gov/ead/).

[8] Further information on Research Libraries Group’s EAD Conversion Services can be found at (http://www.rlg.org/en/page.php?Page_ID=448).

[9] For further information on Rules for Archival Description consult (http://www.cdncouncilarchives.ca/archdesrules.html).

[10] The University Archives and Sousa Archives and Center for American Music utilize 6,000 Library of Congress and local subject terms as part of their controlled vocabulary.

[11] PHP is a server-side, HTML embedded scripting language used to create dynamic Web content.

[12] Archon’s data storage functions can be run using a standard relational database such as MySQL or SQL Server.  It can run on any webserver which supports PHP 5.0 or higher, which is widely supported by webhosting companies and other web service providers.

[13] The budget expenditures for Archon’s initial programming between January and August 2005 were $9,260 provided through an internal University Library project development grant.   

[14] For further information on how this and other simple programming modifications can be achieved consult the Archon technical manual at (http://www.archon.org/UserManualv1.11.pdf).

[15] All new and updated information loaded into Archon is now done in real time so new and revised information can be made available immediately to users through Archon’s public interface. 

[16] While Archon was originally developed to provide access to collection information from multiple repositories within a single institution, current discussions with the University of Illinois’ development team have identified a crucial new feature: a federated access mechanism, created for an Archon users group, that would enable the public to search across different institutions’ holdings. This feature may be included in the 3.0 upgrade release of Archon in either 2008 or 2009.

[17] Archon’s digital library functions accommodate a variety of image, moving image and audio file formats.  Unlike other digital library systems (e.g., ContentDM), all of Archon’s digital content is stored directly in its relational database structure.  This ensures that all links to digital content are managed automatically through Archon’s programming.  Archon’s data storage is limited only by the storage capacity of the database server.

[18] Archon 1.11 is the current version available to the public, but a newer version, Archon 2.5, will be made available to select archives for beta testing beginning July 15, 2007, and a final release of version 2.5 will be made available to all archives beginning August 15.

[19] All of the uncompressed preservation TIF files for these Archon digital library images will continue to reside externally from the Archon system in order to ensure their long-term preservation.

[20] For further information on Archon’s report generators consult (www.archon.org/reports.php).



Copyright (c) 2016 Partnership: The Canadian Journal of Library and Information Practice and Research

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Partnership: the Canadian Journal of Library and Information Practice and Research (ISSN: 1911-9593)