Computer scientists who work with digital data that has long-term preservation value, archivists and librarians whose responsibilities include preserving digital materials, and other stakeholders in digital preservation have long called for the development and adoption of open standards in support of long-term digital preservation. Over the past fifteen years, preservation experts have defined “trust” and a “trustworthy” digital repository; defined the attributes and responsibilities of a trustworthy digital repository; defined the criteria and created a checklist for the audit and certification of a trustworthy digital repository; evolved these criteria into a standard; and defined a standard for bodies that wish to provide audit and certification to candidate trustworthy digital repositories. This literature review discusses the development of standards for the audit and certification of a trustworthy digital repository.
Ward, J.H. (2012). Managing Data: Preservation Standards & Audit & Certification Mechanisms (i.e., “policies”). Unpublished Manuscript, University of North Carolina at Chapel Hill. (pdf)
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Table of Contents
Requirements for Bodies Providing Audit and Certification of Candidate Trustworthy Digital Repositories Recommended Practice
Table of Figures
Figure 2 – Audit and Certification of Trustworthy Digital Repositories Recommended Practice, 3.1.1 (CCSDS, 2011).
Computer scientists who work with digital data that has long-term preservation value, archivists and librarians whose responsibilities include preserving digital materials, and other stakeholders in digital preservation have long called for the development and adoption of open standards in support of long-term digital preservation (Lee, 2010; Science and Technology Council, 2007; Waters & Garrett, 1996). However, Hedstrom (1995) cautions that standards will provide a high-level solution to some of the obstacles that may prevent the preservation of digital materials only “if” three conditions hold: the standards provide the conditions for the archive to conform to standard archival practices; software and hardware designers comply with the standards; and producers and users select and use the standards. The development of standards for the audit and certification of digital repositories as “trustworthy” is a major step towards ensuring that digital data will be curated and preserved for the indefinite long-term, as such standards provide the conditions under which all three of Hedstrom’s criteria may be met.
In 1996, the Commission on Preservation and Access and the Research Libraries Group released the now-seminal report, “Preserving Digital Information” (Waters & Garrett, 1996). The Research Libraries Group (RLG) (2002) noted three key points that led to the interest in developing standards for the “attributes and responsibilities” of a “trusted digital repository”: the requirement for ‘a deep infrastructure capable of supporting a distributed system of digital archives’; ‘the existence of a sufficient number of trusted organizations capable of storing, migrating, and providing access to digital collections’; and ‘a process of certification is needed to create an overall climate of trust about the prospects of preserving digital information’. A few years later, the Consultative Committee for Space Data Systems (CCSDS) released the “Reference Model for an Open Archival Information System (OAIS)” (CCSDS, 2002). This document defined a set of common terms, components, and concepts for a digital archive. It not only provided a technical reference but also outlined the organization of people and systems required to preserve information for the indefinite long-term and make it accessible (RLG, 2002).
However, experts and other stakeholders with an interest in preserving information for the long-term recognized that as part of defining an archival system, they also needed to form a consensus on the responsibilities and characteristics of a sustainable digital repository. In other words, they needed a method to “prove” (i.e., “trust”) that an organization’s systems were, in fact, OAIS-compliant. First, they would have to define the attributes and responsibilities of a “trusted” digital repository. Next, they would have to develop a method to audit and certify that a repository may be “trusted”. And, finally, they would have to create an infrastructure to certify and train the auditors.
The essay “Managing Data: the Emergence & Development of Digital Curation & Digital Preservation Standards” contains sections that provide the motivations for the development of standards, as well as an overview and example applications of the “Audit and Certification of Trustworthy Digital Repositories Recommended Practice” (CCSDS, 2011). That essay also covers the definitions of “reliable”, “authentic”, “integrity”, and “trustworthy”, among others. A very short discussion of this Recommended Practice and a detailed discussion of the OAIS Reference Model are available in the essay “Managing Data: Preservation Repository Design (the OAIS Reference Model)”.
This essay on “preservation standards and audit and certification mechanisms” is an overview of “trust”; the types of audit and certification available generally; the development of standards for the audit and certification of a repository as “trustworthy”; a brief overview of the standards themselves; and, a very brief overview of the requirements for the certification of bodies that certify the auditors of said trusted digital repositories. Thus, the scope of this particular literature review is deliberately narrow to avoid the duplication of previously discussed topics.
Jøsang and Knapskog (1998) discussed “trust” as a “subjective belief” when they described a metric for a “trusted system”, while Lynch (2000) described “trust” as an elusive and subjective probability. Both wrote that a user trusts the evaluation of the certifier, not the actual system component. Jøsang and Knapskog drew attention to the fact that an evaluator only certifies that a system has been checked against a particular set of criteria; whether or not a user should or will trust those criteria is another matter. The two researchers pointed out that most end users of a certified system do not have the necessary expertise to evaluate the appropriateness and quality of the criteria used to audit the system. They must trust that the people who established the criteria chose relevant components, and that the evaluator had the skill and knowledge to assess the system.
This is similar to Lynch (2001), who wrote that users tend to assume digital system designers and content creators have users’ best interests at heart, which is not always the case; yet the idea of creating a formal system of trust “is complex and alien to most people”. Ross & McHugh (2006) posit that “trust” may be established with the various stakeholders affiliated with a repository by providing quantifiable “evidence” such as annual financial reports, business plans, policy documents, procedure manuals, and mission statements, so that a system’s “trustworthiness” is believable. Jøsang & Knapskog’s (1998) and Ross & McHugh’s (2006) research goal was to provide a methodical evaluation of system components to define “trust” in a system that in and of itself was trustworthy (RLG, 2002).
Finally, Merriam-Webster (Trust, 2011) defines “trust” as “one in which confidence is placed”; “a charge or duty imposed in faith or confidence or as a condition of some relationship”; and, “something committed or entrusted to one to be used or cared for in the interest of another”.
Jøsang and Knapskog (1998) described four types of roles generally assigned to “government driven evaluation schemes”: accreditor, certifier, evaluator, and sponsor. They defined the accreditor as the body that accredits the evaluator and the certifier and that sometimes evaluates the system itself. They noted that the certifier is accredited based on “documented competence level, skill, and resources”. They stipulated that the certifier might also be a “government body issuing…certificates based on the evaluation reports from the evaluators”. They defined the evaluator as “yet another government agency” that is “accredited by the accreditor”, and “the quality of the evaluator’s work will be supervised by the certifier”. They described the sponsor as the party interested in having their system evaluated (Jøsang & Knapskog, 1998). In other words, the authors wrote that someone who would like their system audited and certified against a particular set of evaluation criteria (“the sponsor”) hires an auditor (“the evaluator”) who has been certified (“the certifier”) by an accredited agency (“the accreditor”).
RLG (2002) defined four approaches to certification: individual, program, process, and data. They described “individual” as personnel certification. This is also called professional certification or accreditation, and it is often given to an individual when they meet some combination of work experience, education, and professional competencies. RLG noted that at the time of writing, there were no professional certifications for digital repository management or electronic archiving. They cited “program” as a type of certification for an institution or a program achieved through a combination of site visits and “self-evaluation using standardized checklists and criteria”.
RLG explained that the assessment areas included access, outreach, collection preservation and development, staff, facilities, governing and legal authority, and financial resources. They provided examples of this type of certification that included museums, schools and programs within a university, etc. They defined “process” as “quantitative or qualitative guidelines…to internal and external requirements” that use various methods and procedures, such as the ISO 9000 family of standards (RLG, 2002).
Finally, the authors designated the “data” approach to certification as addressing “the persistence or reliability of data over time and data security”. They wrote that this certification requires adherence to procedures manuals and international standards, such as ISO, that ensure both external and internal quality control. They noted that certification will require the managers of a repository to document migration processes, create and maintain metadata, authenticate new copies, and update the data or files (RLG, 2002).
RLG (2002) defined a “trusted digital repository” as “one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future”. They described the “critical component” as “the ability to prove reliability and trustworthiness over time”. The authors’ stated goal for the report was to create a framework for large and small institutions that could cover different responsibilities, architectures, materials, and situations yet still provide a foundation with which to build a sustainable “trusted repository” (RLG, 2002).
The authors of the RLG document noted that repositories may be contracted to a third party or locally designed and maintained; regardless, the expectations for trust require that a digital repository must:
- “Accept responsibility for the long-term maintenance of digital resources on behalf of its depositors and for the benefit of current and future users;
- Have an organizational system that supports not only long-term viability of the repository, but also the digital information for which it has responsibility;
- Demonstrate fiscal responsibility and sustainability;
- Design its system(s) in accordance with commonly accepted conventions and standards to ensure the ongoing management, access, and security of materials deposited within it;
- Establish methodologies for system evaluation that meet community expectations of trustworthiness;
- Be depended upon to carry out its long-term responsibilities to depositors and users openly and explicitly;
- Have policies, practices, and performance that can be audited and measured; and
- Meet the responsibilities detailed in Section 3 [sic] of this paper” (RLG, 2002).
Per the OAIS Reference Model (CCSDS, 2002), they noted that the repository’s “designated community” will be the primary determining factor in how the content is accessed and disseminated; how it is managed and preserved; and what, including content and format, is deposited. The authors of the report discussed and defined “trust”, noting, “most cultural institutions are already trusted”. Regardless, they outlined three levels of trust that administrators of a repository must consider in order to be a “trusted repository”: the trust a cultural institution must earn from its designated community; the trust cultural institutions must have in third-party providers; and the trust users of the repository must have in the digital objects provided to them by the repository owner via the repository software.
The report authors wrote that archives, libraries, and museums must simply keep doing what they have been doing for centuries in order to maintain the trust of their user community; they do not need to develop that trust because, as institutions, they have already earned it. RLG (2002) explained that while librarians, archivists, and others are loath to use third-party providers who have not proven their reliability, the establishment of a certification program with periodic re-audits may overcome their reluctance. Finally, the authors stated that users must be able to trust that the digital items they receive from a repository are both authentic and reliable. In other words, the objects the users access must be unaltered and they must be what they purport to be (Bearman & Trant, 1998).
They established that this can be accomplished by the use of checksums and other forms of validation that are common in the computer science and digital security communities, although security does not equal integrity (Lynch, 1994). Waters & Garrett (1996) put forth that the “central goal” of an archival repository must be “to preserve information integrity”; this includes content, fixity, reference, provenance, and context.
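As a minimal sketch of the checksum approach mentioned above, the following Python example records a digest when an object is deposited and later recomputes it to detect alteration. The choice of SHA-256 and the function names `sha256_digest` and `verify_fixity` are assumptions made for illustration; the sources cited here do not prescribe a particular algorithm or interface.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """Return True if the bytes still match the digest recorded at deposit."""
    return sha256_digest(data) == recorded_digest


# Hypothetical deposited object and the digest stored alongside it.
original = b"An example digital object."
recorded = sha256_digest(original)

# Re-checking the untouched bytes succeeds; a one-character change fails.
unchanged_ok = verify_fixity(original, recorded)
altered_ok = verify_fixity(b"An example digital object!", recorded)
```

A matching digest shows only that the bits are unchanged since the digest was recorded. It says nothing about reference, provenance, or context, which is one reason a checksum alone does not establish information integrity in the sense Waters & Garrett describe.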
For a discussion of “reliable”, “authentic”, “integrity”, and “trustworthy”, please see the essay “Managing Data: the Emergence & Development of Digital Curation & Digital Preservation Standards”.
RLG (2002) identified seven primary attributes of a trusted digital repository. They were and are: compliance with the OAIS Reference Model; administrative responsibility; organizational viability; financial sustainability; technological and procedural suitability; system security; and procedural accountability.
The authors defined “compliance with the OAIS” as the repository owners/administrators ensuring that the “overall repository system conforms” to the OAIS Reference Model. They described “administrative responsibility” as the repository administrators adhering to “community-agreed” best practices and standards, particularly with regard to sustainability and long-term viability. RLG (2002) explained “organizational viability” as creating and maintaining an organization and structure that is capable of curating the objects in the repository and providing access to them for the indefinite long-term. As part of this, they included maintaining trained staff, legal status, transparent business practices, succession plans, and relevant policies and procedures.
RLG (2002) designated “financial sustainability” as maintaining financial fitness and engaging in financial planning, with an ongoing commitment to remain financially viable over the long-term. The authors outlined “technological and procedural suitability” as the repository owners/administrators keeping the archive’s software and hardware up to date, as well as complying with applicable best practices and standards for technical digital preservation. They traced an outline for “system security” by describing the minimal requirements a repository must follow regarding best practices for risk management, including written policies and procedures for disaster preparedness, redundancy, firewalls, backup, authentication, and data loss and corruption.
Finally, RLG (2002) defined “procedural accountability” as the repository owners/administrators being accountable for all of the above. That is, the authors wrote that maintaining a trusted digital repository is a complex set of “interrelated tasks and functions”; the maintainer of the repository is responsible for ensuring that all required functions, tasks, and components are carried out (RLG, 2002).
RLG (2002) described two primary responsibilities for the owners and administrators of a trusted digital repository: high-level organizational and curatorial responsibilities, and operational responsibilities. They subdivided organizational and curatorial responsibilities into three levels. The authors noted that organizations must understand their local requirements, which other organizations may have similar requirements, and how these responsibilities may be shared.
The authors of the report summarized five primary areas in support of those three levels: the scope of the collections, preservation and lifecycle management, the wide range of stakeholders, the ownership of material and other legal issues, and, cost implications (RLG, 2002).
- The scope of the collections: the repository owners and administrators must know exactly what they have in their digital collection, and how to adequately preserve the integrity and authenticity of the properties and characteristics of the individual items.
- Preservation and lifecycle management: the repository owners and administrators must commit to proactive planning with regards to preserving and curating the items in the repository.
- The wide range of stakeholders: the repository owners and administrators must take into account the interests of all stakeholders when planning for long-term access to the materials. In some instances, they will have to act in spite of their stakeholders’ wishes, as some stakeholders tend to have short-term views, and they will not care about the long-term preservation of, and access to, the materials. Other stakeholders will have a differing point of view, and they will want the material preserved in the long-term. The repository owners and administrators will have to balance these competing interests.
- The ownership of material and other legal issues: digital librarians and archivists will have to take a proactive role with content producers. They must seek to preserve materials by curating the data early in its life cycle, while being cognizant of the copyright and intellectual property concerns of the content producers and owners.
- Cost implications: repository owners and administrators must commit financial resources to maintaining the content over the indefinite long-term, while bearing in mind that the true costs of doing so are variable.
In sum, RLG (2002) recommended incorporating preservation planning into the everyday management of the preservation repository.
Next, the authors of this RLG report defined operational responsibilities in more detail than the organizational and curatorial responsibilities, above. They wrote the operational responsibilities based on the OAIS Reference Model, and added to that the “critical role” of a repository in the “promotion of standards” (RLG, 2002). They defined these areas as:
- Negotiates for and accepts appropriate information from information producers and rights holders: this responsibility covers the submission agreement between a content Producer and the OAIS Archive. These responsibilities include preservation metadata, record keeping, authenticity checks, and legal issues. As part of fulfilling this role, a repository will have policies and procedures in place to cover collection development, copyright and intellectual property rights concerns, metadata standards, provenance and authenticity, appropriate archival assessment, and, records of all transactions with the Producer.
- Obtains sufficient control of the information provided to support long-term preservation: this responsibility refers to the “staging” process, where submitted content is stored after submission from a Producer and before the material is ingested into the archive. The responsibilities of a repository administrator at this point encompass best practices for the ingest of materials, which include an analysis of the digital content itself, including its “significant properties”; what requirements must be fulfilled to provide access to the material continuously; a metadata check against the repository’s standards (including adding metadata to bring the current metadata up to par); the assignment of a persistent and unique identifier; integrity/fixity/authentication checks; the creation of an OAIS Archival Information Package (AIP); and storage into the OAIS Archive.
- Determines, either by itself of [sic] with others, the users that make up its designated community, which should be able to understand the information provided: the repository administrators and owners must determine who their user base is so that they may understand how best to serve their Designated Community.
- Ensures that the information to be preserved is “independently understandable” to the designated community; that is, the community can understand the information without needing the assistance of experts: the repository owner and administrator must make the information available using generic tools that are available to the Designated Community. For example, documents might be made available via .pdf or .rtf because the software to render these documents is available for free to most users. A repository owner and/or administrator may not wish to preserve documents in the .pages file format, as this Apple file format is not commonly used and the software to render it is not free beyond a limited trial period.
- Follows documented policies and procedures that ensure the information is preserved against all reasonable contingencies and enables the information to be disseminated as authenticated copies of the original or as traceable to the original: the repository owners and administrators will document any unwritten policies and procedures, and follow best practice recommendations and standards where possible. These policies must include policies to define the Designated Community and its knowledge base; policies for material storage, including service-level agreements; policies for authentication and access control; a collection development policy, including preservation planning; a policy to keep policies updated with current recommendations, standards, and best practices; and, finally, links between procedures and policies, to ensure compliance across all collections in the repository.
- Makes the preserved information available to the designated community: the repository owners and administrators must comply with legal responsibilities such as licensing, copyright, and intellectual property regarding access to the content in the repository. Within that framework, however, they should plan to provide user support, record keeping, pricing (where applicable), authentication, and, most importantly, a method for resource discovery.
- Works closely with the repository’s designated community to advocate the use of good and (where possible) standard practice in the creation of digital resources; this may include an outreach program for potential depositors: the repository owners and administrators should work with all stakeholders to advocate the use of standards and recommended best practices (RLG, 2002). As the Science and Technology Council (2007) noted, using standards will reduce costs for all parties involved and better ensure the longevity of the material.
In conclusion, the OAIS Reference Model has provided a useful framework “for identifying the responsibilities of a trusted digital repository” (RLG, 2002).
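The staging responsibilities listed above (a metadata check, a fixity check, the assignment of a unique identifier, and the creation of an AIP) can be sketched in Python as follows. Everything named here is hypothetical: the `REQUIRED_METADATA` fields, the `ArchivalPackage` class, and the `ingest` function are illustrative assumptions, not part of the OAIS Reference Model or the RLG report. A UUID is unique, but persistence of an identifier depends on organizational commitment rather than on the code.

```python
import hashlib
import uuid
from dataclasses import dataclass, field

# Hypothetical minimum metadata fields; a real repository would define its own.
REQUIRED_METADATA = ("title", "creator", "date")


@dataclass
class ArchivalPackage:
    """Illustrative stand-in for an OAIS Archival Information Package (AIP)."""
    identifier: str
    content: bytes
    metadata: dict
    sha256: str
    audit_trail: list = field(default_factory=list)


def missing_metadata(metadata: dict) -> list:
    """List the required fields that are absent or empty."""
    return [name for name in REQUIRED_METADATA if not metadata.get(name)]


def ingest(content: bytes, metadata: dict, producer_sha256: str) -> ArchivalPackage:
    """Run the staging checks and return a package ready for archival storage."""
    trail = []

    # Metadata check against the repository's (assumed) minimum standard.
    missing = missing_metadata(metadata)
    if missing:
        raise ValueError(f"metadata incomplete, missing: {missing}")
    trail.append("metadata checked")

    # Fixity check against the digest supplied by the Producer.
    digest = hashlib.sha256(content).hexdigest()
    if digest != producer_sha256:
        raise ValueError("fixity check failed: content differs from producer digest")
    trail.append("fixity verified")

    # Assign a unique identifier to the new package.
    identifier = f"urn:uuid:{uuid.uuid4()}"
    trail.append("identifier assigned")

    return ArchivalPackage(identifier, content, dict(metadata), digest, trail)


submission = b"example submitted content"
package = ingest(
    submission,
    {"title": "Example", "creator": "A. Producer", "date": "2012"},
    hashlib.sha256(submission).hexdigest(),
)
```

The sketch rejects a submission outright when a check fails. A repository might instead hold the material in staging and add the missing metadata, as the report suggests when it mentions bringing metadata "up to par".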
As part of the certification framework, the authors of the RLG report intended to support Waters & Garrett’s (1996) assertion that archival repositories “must be able to prove that they are who they say they are by meeting or exceeding the standards and criteria of an independently-administered program for archival certification”.
RLG (2002) described two types of certification then in use within the libraries and archives community: the standards model and the audit model. The “standards” model is an informal process. They stated that standards are created when best practices and guidelines are established by the consensus of the expert community and then “certified” by other practitioners’ acceptance and/or use of the “standard”. In other words, librarians, archivists, and computer scientists who work with libraries decide what constitutes a “standard”; only rarely does a standard become formalized via ISO or another international organization. The authors described the audit model as an output of legislation or policies and procedures established by national agencies, such as the U.S. Department of Defense. That is, a governing body passes laws or policies, and the information repository’s policies must conform to the governing body’s requirements (RLG, 2002).
For a discussion of other approaches to certification, please see an earlier section, “Types of Audit and Certifications”.
RLG (2002) described a framework for a trusted digital repository’s responsibilities and attributes. They noted that these apply to repositories both large and small that hold a wide variety of content. The authors summarized their work above with several recommendations.
- Recommendation 1: Develop a framework and process to support the certification of digital repositories.
- Recommendation 2: Research and create tools to identify the attributes of digital materials that must be preserved.
- Recommendation 3: Research and develop models for cooperative repository networks and services.
- Recommendation 4: Design and develop systems for the unique, persistent identification of digital objects that expressly support long-term preservation.
- Recommendation 5: Investigate and disseminate information about the complex relationship between digital preservation and intellectual property rights.
- Recommendation 6: Investigate and determine which technical strategies best provide for continuing access to digital resources.
- Recommendation 7: Investigate and define the minimal-level metadata required to manage digital information for the long term. Develop tools to automatically generate and/or extract as much of the required metadata as possible (RLG, 2002).
The remainder of this essay focuses on the results of Recommendation 1, above, regarding the development of certification standards for digital repositories.
Several researchers have addressed the problem of audit and certification. For example, Ross & McHugh (2006) created the Digital Repository Audit Method Based On Risk Assessment (DRAMBORA) to provide a self-audit method for repository administrators that provided quantifiable results (Digital Curation Centre, 2011). Dobratz, Schoger, and Strathmann (2006) created nestor, the Network of Expertise in Long-Term Storage of Digital Resources. Other lesser-known researchers such as Becker, et al. (2009) described a decision-making procedure for preservation planning that provides a means for repository administrators to consider various alternatives.
This section will examine the audit and certification method known as the “Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist” and its follow-up document, the “Audit and Certification of Trustworthy Digital Repositories Recommended Practice”. Researchers and practitioners across the globe (including Ross, McHugh, Dobratz, and others) combined their efforts and contributed their expertise to developing TRAC from a draft into a final version (Research Libraries Group, 2005; Dale, 2007). Their efforts have led to the development and refinement of TRAC into a CCSDS “Recommended Practice”; this may eventually become an ISO standard.
The essay, “Managing Data: the Emergence & Development of Digital Curation & Digital Preservation Standards” describes some of the related work in this area not covered below.
The authors of TRAC created it as part of a larger international effort to define an audit and certification process to ensure the longevity of digital objects. They defined a checklist that any repository manager could use to assess the trustworthiness of the repository. The checklist provided examples of the required evidence, but the list is considered “prescriptive”; the authors did not try to list every possible type of example. It contained three sections: “organizational infrastructure”, “digital object management”, and, “technologies, technical infrastructure, and security”.
The authors provided a spreadsheet-style “audit checklist” called “Criteria for Measuring Trustworthiness of Digital Repositories and Archives”. They noted that the criteria are applicable to any kind of repository and rest on documentation (evidence), transparency (both internal and external), adequacy (individual context), and measurability (i.e., objective controls). The authors stated that a full certification process must include not just an external audit, but tools to allow for self-examination and planning prior to an audit (OCLC & CRL, 2007). The terminology in the audit checklist conformed to the OAIS Reference Model.
A typical policy in TRAC followed the model of statement, explanation, and evidence (see Figure 1, below).
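The statement, explanation, and evidence model lends itself to a simple record structure for self-assessment. The Python sketch below is one assumed way to represent it; the `ChecklistItem` class, its field names, and the two sample items are hypothetical placeholders and are not quoted from TRAC.

```python
from dataclasses import dataclass, field


@dataclass
class ChecklistItem:
    """One criterion, following the statement/explanation/evidence model."""
    statement: str
    explanation: str
    evidence_examples: list = field(default_factory=list)  # kinds of evidence suggested
    evidence_provided: list = field(default_factory=list)  # documents actually supplied

    def has_evidence(self) -> bool:
        """Return True if at least one piece of evidence has been supplied."""
        return len(self.evidence_provided) > 0


def items_lacking_evidence(items: list) -> list:
    """Return the statements of all criteria with no supporting evidence yet."""
    return [item.statement for item in items if not item.has_evidence()]


# Placeholder criteria written for this example only.
checklist = [
    ChecklistItem(
        statement="Placeholder: the repository has a mission statement",
        explanation="Shows a commitment to long-term preservation",
        evidence_examples=["mission statement"],
        evidence_provided=["mission-2012.pdf"],
    ),
    ChecklistItem(
        statement="Placeholder: the repository has a succession plan",
        explanation="Shows what happens to content if the repository closes",
        evidence_examples=["succession plan", "contingency plan"],
    ),
]

gaps = items_lacking_evidence(checklist)
```

Keeping the suggested evidence separate from the evidence actually supplied lets a repository manager see which criteria still need documentation before an external audit.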
The authors of TRAC considered the organizational infrastructure to be as critical a component as the technical infrastructure (OCLC & CRL, 2007). This reflected the view of the authors of the OAIS Reference Model, who considered an OAIS to be “an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community” (CCSDS, 2002). OCLC & CRL (2007) considered “organizational attributes” to be a characteristic of a trusted digital repository, and these characteristics are reflected in RLG’s (2002) grouping of financial sustainability, organizational viability, procedural accountability, and administrative responsibility as four of the seven attributes of a trusted digital repository.
The authors of TRAC considered the following ten elements to be part of organizational infrastructure, but they did not limit it to only these elements.
- Organizational structure
- Mandate or purpose
- Roles and responsibilities
- Policy framework
- Funding system
- Financial issues, including assets
- Contracts, licenses, and liabilities
- Transparency (OCLC & CRL, 2007).
In addition, they grouped the above elements into five areas:
- Governance and organizational viability: the owners and managers of a repository must commit to established best practices and standards for the long term. This includes mission statements, and succession/contingency plans.
- Organizational structure and staffing: the repository owners and managers must commit to hiring an appropriate number of qualified staff that receives regular ongoing professional development.
- Procedural accountability and policy framework: the repository owners and managers must provide transparency with regard to documentation related to the long-term preservation of, and access to, the archival data. This requirement provides evidence to stakeholders of the repository’s trustworthiness. This documentation may define the Designated Community, what policies and procedures are in place, legal requirements and obligations, reviews, feedback, self-assessment, provenance and integrity, and operations and management.
- Financial sustainability: the repository owners and administrators must follow solid business practices that provide for the long-term sustainability of the organization and the digital archive. This includes business plans, annual reviews, financial audits, risk management, and possible funding gaps.
- Contracts, licenses, and liabilities: the repository owners and administrators must make contracts and licenses “available for audits so that liabilities and risks may be evaluated”. This requirement includes deposit agreements, licenses, preservation rights, collection maintenance agreements, intellectual property and copyright, and, ingest (OCLC & CRL, 2007).
The authors described the digital object management section as a combination of technical and organizational aspects. They organized the requirements for this section to align with six of the seven OAIS Functional Entities: Ingest, Archival Storage, Preservation Planning, Data Management, Administration, and Access (OCLC & CRL, 2007; CCSDS, 2002). The authors of the TRAC criteria and checklist defined these six sections as follows.
- The initial phase of ingest that addresses acquisition of digital content.
- The final phase of ingest that places the acquired digital content into the forms, often referred to as Archival Information Packages (AIPs), used by the repository for long-term preservation.
- Current, sound, and documented preservation strategies along with mechanisms to keep them up to date in the face of changing technical environments.
- Minimal conditions for performing long-term preservation of AIPs.
- Minimal-level metadata to allow digital objects to be located and managed within the system.
- The repository’s ability to produce and disseminate accurate, authentic versions of the digital objects (OCLC & CRL, 2007).
The authors further elucidated the above areas as follows.
- Ingest: acquisition of content
This section covered the process required to acquire content; this generally falls under the realm of a Submission Agreement between the Producer and the repository. The Producer may be external or internal to the repository’s governing organization. The authors recommended considering the object’s properties; any information that needs to be associated with the submitted object(s); mechanisms to authenticate the materials and to verify the integrity of each ingested object; maintaining control of the bits so that none may be altered at any time; regular contact with the Producer as appropriate; a formal acceptance process with the Producer for all content; and an audit trail of the Ingest process.
- Ingest: creation of the archival package
The actions in this section covered the creation of an AIP. These actions involved documentation of each AIP preserved by the repository; of the adequacy of each AIP for preservation purposes; of the process of constructing an AIP from a SIP; of the actions performed on each SIP (deletion or creation as an AIP); of the use of persistent and unique naming schemas/identifiers or, failing that, of the preservation of the existing unique naming schema; of the context for each AIP; of an audit trail of the metadata records ingested; of associated preservation metadata; of testing the ability of current tools to render the information content; of the verification of completeness of each AIP; of an integrity audit mechanism for the content; and of any actions and processes related to AIP creation.
- Preservation planning
The authors recommended four simple actions a repository administrator may take to keep the archive current. The administrator must document the current preservation strategies; monitor formats and related technologies for obsolescence; adjust the preservation plan if or when conditions change; and provide evidence that the preservation plan used is actually effective.
- Archival storage & preservation/maintenance of AIPs
The actions in this section covered what is required to ensure that an AIP is actually being preserved. This involved examining multiple aspects of object maintenance, including, but not limited to, storage, tracking, checksums, migration, transformations, and copies/replicas. The repository administrator must be able to demonstrate the use of standard preservation strategies; that the repository actually implements these strategies; that the Content Information is preserved; that the integrity of the AIP is audited; and that there is an audit trail of any actions performed on an AIP.
- Information management
This section addressed the requirements related to descriptive metadata. The repository owner must identify the minimal metadata required for retrieval by the Designated Community; create a minimal amount of descriptive metadata and attach it to the described object; and prove that referential integrity between each AIP and its associated metadata is both created and maintained.
- Access management
The authors designed this section to address methods for providing the Designated Community with access to the content (i.e., DIPs) in the repository; they wrote that the degree of sophistication of this access would vary based on the context of the repository itself and the requirements of the Designated Community. They further subdivided this section into four areas: access conditions and actions, access security, access functionality, and provenance. In order to fulfill the requirements presented in this section, a repository owner must: provide information to the Designated Community as to what access and delivery options are actually available; require an audit of all access actions; only provide access to particular Designated Community members as agreed to with the Producer; ensure access policies are documented and comply with deposit agreements; fully implement the stated access policy; log all access failures; demonstrate the DIP generated is what the user requested; prove that access success or failure is made known to the user within a reasonable length of time; and show that all DIPs generated may be traced to an authentic original and are themselves authentic (OCLC & CRL, 2007).
In summary, OCLC & CRL (2007) designed this section to make it mandatory for a trustworthy digital repository to be able to produce a DIP, “however primitive”.
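The ingest and archival storage requirements above call for verifying the integrity of each object and for keeping an audit trail of the actions performed on an AIP. The following is a minimal sketch of how those two requirements might fit together, not a method prescribed by TRAC. The choice of SHA-256, the `AuditTrail` class, the in-memory `registry` dictionary, and the identifier `aip-001` are all assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class AuditTrail:
    """Hypothetical append-only log of actions performed on AIPs."""

    def __init__(self):
        self.entries = []

    def record(self, aip_id: str, action: str, outcome: str) -> None:
        self.entries.append({
            "aip_id": aip_id,
            "action": action,
            "outcome": outcome,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })


def ingest(aip_id: str, content: bytes, registry: dict, trail: AuditTrail) -> None:
    """Store the digest computed at ingest and log the action."""
    registry[aip_id] = sha256_digest(content)
    trail.record(aip_id, "ingest", "digest recorded")


def audit_integrity(aip_id: str, content: bytes, registry: dict, trail: AuditTrail) -> bool:
    """Recompute the digest and compare it with the one stored at ingest."""
    ok = sha256_digest(content) == registry[aip_id]
    trail.record(aip_id, "integrity audit", "pass" if ok else "FAIL")
    return ok


if __name__ == "__main__":
    registry, trail = {}, AuditTrail()
    ingest("aip-001", b"archival content", registry, trail)
    # An unchanged copy passes; a copy with one altered byte fails.
    print(audit_integrity("aip-001", b"archival content", registry, trail))
    print(audit_integrity("aip-001", b"archival c0ntent", registry, trail))
    print(json.dumps(trail.entries, indent=2))
```

In this sketch every audit, whether it passes or fails, leaves an entry in the trail, so the log itself can serve as evidence that integrity checks were actually carried out.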
The authors of TRAC did not want to make specific software and hardware requirements, as many of these would fall under standard computer science best practices and they are covered by other standards. Therefore, they addressed general information technology areas as related to digital preservation. These areas fall under one of three categories: system infrastructure, appropriate technologies, and security (OCLC & CRL, 2007).
- System infrastructure
This section addressed the basic infrastructure required to ensure the trustworthiness of any actions performed on an AIP. This meant that the repository administrator must be able to demonstrate that the operating systems and other core software are maintained and updated; the software and hardware are adequate to provide backups; the number and location of all digital objects, including duplicates, are managed; all known copies are synchronized; audit mechanisms are in place to discover bit-level changes; any such bit-level changes are reported to management, including the steps taken to prevent further loss and to replace or repair the current corruption and loss; processes are in place for hardware and software changes (e.g., migration); a change management process is in place to mitigate changes to critical processes; there is a process for testing the effect of critical changes prior to an actual implementation; and software security updates are implemented with an awareness of the risks versus benefits of doing so.
- Appropriate technologies
The authors recommended that a repository administrator should look to the Designated Community for relevant standards and strategies. They proposed that the hardware and software technologies in place are appropriate for the Designated Community, and that appropriate monitoring is in place to update hardware and software as appropriate.
- Security
This section addressed non-IT security, as well as IT security. The authors recommended that a repository administrator conduct a regular risk assessment of internal and external threats; ensure controls are in place to address any assessed threats; decide which staff members are authorized to do what and when; and have an appropriate disaster preparedness plan in place, including off-site copies of the recovery plan (OCLC & CRL, 2007).
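The system infrastructure requirements above ask that all known copies be kept in sync and that bit-level changes be discovered and reported. One way this might be approached is sketched below, under stated assumptions: the copies are compared by SHA-256 digest, and a copy is flagged when it disagrees with a strict majority of the others. Majority voting is an illustrative choice, not something TRAC specifies; a repository could equally compare each copy against a digest recorded at ingest. The function name and the location names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def find_divergent_replicas(replicas: dict) -> dict:
    """Compare replica digests and flag copies that disagree with the majority.

    `replicas` maps a hypothetical storage-location name to the bytes held there.
    Returns a small report suitable for passing on to repository management.
    """
    if not replicas:
        raise ValueError("at least one replica is required")

    digests = {location: digest(data) for location, data in replicas.items()}
    majority_digest, majority_count = Counter(digests.values()).most_common(1)[0]

    # Without a strict majority there is no basis for choosing a good copy.
    if majority_count <= len(replicas) / 2:
        return {"status": "no majority", "reference": None,
                "divergent": sorted(digests)}

    divergent = sorted(loc for loc, d in digests.items() if d != majority_digest)
    status = "corruption detected" if divergent else "ok"
    return {"status": status, "reference": majority_digest, "divergent": divergent}


if __name__ == "__main__":
    report = find_divergent_replicas({
        "site-a": b"archival content",
        "site-b": b"archival content",
        "site-c": b"archival c0ntent",  # simulated bit-level change
    })
    print(report["status"], report["divergent"])
```

The report only identifies which copies differ; repairing them, and recording the steps taken to prevent further loss, would be separate actions.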
In conclusion, the archivists, librarians, computer scientists, and other experts who contributed to the development of TRAC created a document that encompassed the minimum requirements for an OAIS Archive to be considered “trustworthy”.
The CCSDS released the “Audit and Certification of Trustworthy Digital Repositories Recommended Practice” (v. CCSDS 652.0-M-1, the “Magenta Book”) in September 2011 (CCSDS, 2011). This section will discuss the Recommended Practice only with regard to its major differences from TRAC (OCLC & CRL, 2007), above, because the two documents are similar enough that repeating a description of each section would be redundant.
The CCSDS described the purpose of the Recommended Practice as that of providing the documentation “on which to base an audit and certification process for assessing the trustworthiness of digital repositories” (CCSDS, 2011). The essay “Managing Data: the Emergence & Development of Digital Curation & Digital Preservation Standards” contains an overview of this Recommended Practice. This section will cover areas not covered by the overview in that essay or earlier in this document.
The three major sections of the Recommended Practice are the same as for TRAC, except that the last section has been renamed. The first two sections, “organizational infrastructure” and “digital object management”, kept their names, while the authors of the Recommended Practice renamed the last section from “technologies, technical infrastructure, & security” to “infrastructure and security risk management”. Within that technology section, the sub-sections were reduced from three to two: instead of “system infrastructure”, “appropriate technologies”, and “security”, the Recommended Practice contains sub-sections on “technical infrastructure risk management” and “security risk management”. The sub-sections for “organizational infrastructure” and “digital object management” remained the same. The CCSDS re-worded, re-organized, and expanded the content of the sub-sections, but the general ideas behind each section stayed in place. For example, Figure 2, below, is the Recommended Practice version of the same content in the same section in TRAC from Figure 1, above.
In short, the members of the CCSDS evolved and expanded the original TRAC checklist to create the Recommended Practice, but overall, the ideas in the original version have held up well during the four-year transition to a Recommended Standard.
Both Waters & Garrett (1996) and RLG (2002) recommended the creation of a certification program for trusted digital repositories. As a result, librarians, archivists, computer scientists and other experts and stakeholders in digital preservation created the “Trustworthy repositories audit & certification: criteria and checklist” in order to create a common set of standards and terminology by which a repository may be certified. These experts and others then took TRAC, via the CCSDS, and created the “Audit and Certification of Trustworthy Digital Repositories (CCSDS 652.0-M-1) Recommended Practice”. As part of the process of creating this Recommended Practice, these experts also determined the requirements for bodies that will provide the audit and certification of “candidate” trustworthy digital repositories.
They created a second Recommended Practice, “Requirements for bodies providing audit and certification of candidate trustworthy digital repositories CCSDS 652.1-M-1”. This Recommended Practice for bodies providing audit and certification is a supplement to an existing ISO Standard that outlines the requirements for a body performing audit and certification, “Conformity assessment — Requirements for bodies providing audit and certification of management systems” (ISO/IEC 17021, 2011).
The authors of this standard covered seven primary areas: principles, general requirements, structural requirements, resource requirements, information requirements, process requirements, and management system requirements for certification bodies. They defined “principles” as covering impartiality, competence, responsibility, openness, confidentiality, and responsiveness to complaints. They described “general requirements” as covering legal and contractual matters, management of impartiality, and liability and financing. They kept “structural requirements” simple: these cover the organizational structure and top management, and a committee for safeguarding impartiality.
The authors detailed “resource requirements” as covering the competence of management and personnel, the personnel involved in the certification activities, the use of individual auditors and external technical experts, personnel records, and outsourcing. They outlined “information requirements” as publicly accessible information, certification documents, directory of certified clients, reference to certification and use of marks, confidentiality, and the information exchange between a certification body and its clients. The authors delineated “process requirements” as covering general requirements, audit and certification, surveillance activities, recertification, special audits, suspending, withdrawing or reducing the scope of certification, appeals, complaints, and, the records of applicants and clients.
Finally, the authors provided three options for “management systems requirements for certification bodies” that include general management requirements and management system requirements that are in accordance with ISO 9001. In the document’s appendices, the authors discussed the knowledge and skills required of an auditor, the possible types of evaluation methods, an example of a process flow for determining and maintaining competence, desired personal behaviors, the requirements for a third-party audit and certification process, and considerations for the audit programme, scope, or plan (ISO/IEC 17021, 2011).
Requirements for Bodies Providing Audit and Certification of Candidate Trustworthy Digital Repositories Recommended Practice
This section of this essay will address the areas in which the Recommended Practice for bodies providing audit and certification differs from “ISO/IEC 17021 Conformity Assessment”.
The CCSDS created the Recommended Practice, “Requirements for bodies providing audit and certification of candidate trustworthy digital repositories” as a supplement to “Conformity assessment — Requirements for bodies providing audit and certification of management systems” (ISO/IEC 17021, 2011). They created the document to provide additional information on which an organization that is assessing a digital repository for certification as trustworthy may base their operations for issuance of such certification (CCSDS, 2011). In other words, the CCSDS (2011) created the document to support the accreditation of bodies providing certification. They created the document with a secondary purpose of providing repository owners with documentation by which they may understand the processes involved in achieving certification. They wrote the document using terminology from the OAIS Reference Model.
The authors defined a “Primary Trustworthy Digital Repository Authorisation Body” (PTAB) as an organization that accredits training courses for auditors, accredits other certification bodies, and provides audit and certification of candidate trustworthy digital repositories. The membership consists of “internationally recognized experts in digital preservation” (CCSDS, 2011). They defined the primary tasks of the organization as: accrediting other trustworthy digital repository certification bodies; certifying auditors; making certification decisions; accrediting auditor qualifications; undertaking audits; and having a mechanism to add new experts to PTAB as needed. They noted that PTAB will also be accredited by ISO and will become a member of the International Accreditation Forum (IAF). With respect to possible conflicts of interest, the authors designated two activities that are not considered conflicts for members who are certifiers: lecturing, including in training courses, and identifying areas of improvement during the course of an audit (CCSDS, 2011).
The CCSDS outlined the criteria for the training of audit team members. This training must include: an understanding of digital preservation, including the technical aspects related to the audited activity; an understanding of knowledge management systems; a general knowledge of the regulatory requirements related to trustworthy digital repositories; an understanding of the basic principles related to auditing, per ISO standards; an understanding of risk management and risk assessment with regard to digitally encoded information; and an understanding of the Recommended Practice, “Audit and Certification of Trustworthy Digital Repositories (CCSDS 652.0-M-1)”.
Furthermore, the authors specified that the audit team should have or find members with appropriate technical knowledge for the scope of the digital repository certification, the necessary comprehension of any applicable regulatory requirements for that repository, and knowledge of the repository owner’s organization, such that an appropriate audit may be conducted. The CCSDS wrote that the audit team might be supplemented with the necessary technical expertise, as needed. As well, the authors charged PTAB with assessing the conduct of auditors and experts and monitoring their performance, as well as selecting these experts and auditors based on appropriate experience, competence, training, and qualifications (CCSDS, 2011).
The CCSDS outlined the required levels of work experience for a trusted digital repository auditor. They required these auditors to have completed five days of training via PTAB or an accredited agency; to have some prior experience assessing trustworthiness, including participation in two audit certifications for a total of 20 days; to have four years of workplace experience focusing on digital preservation; to have remained current with regard to digital preservation best practices and standards; to have current experience; and to have received certification from PTAB. The authors stipulated three additional requirements for audit team leaders. They must be able to communicate effectively in writing and orally; have previously served as an auditor on two completed trustworthy digital repository audits; and have the capability and knowledge to manage an audit certification process (CCSDS, 2011).
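The auditor requirements listed above read much like a checklist, and the sketch below shows one way they might be encoded so that a candidate's unmet requirements can be listed. It is an illustration only: the `AuditorRecord` fields and the `unmet_requirements` function are hypothetical, and treating each figure as a minimum ("at least five days", and so on) is an assumption about how the Recommended Practice is applied. The additional requirements for audit team leaders are not modeled.

```python
from dataclasses import dataclass


@dataclass
class AuditorRecord:
    """Hypothetical record of one candidate auditor's background."""
    training_days: int
    audits_participated: int
    audit_days_total: int
    preservation_years: float
    current_on_best_practices: bool
    has_current_experience: bool
    ptab_certified: bool


def unmet_requirements(record: AuditorRecord) -> list:
    """Return labels for every requirement the candidate does not yet meet."""
    checks = [
        (record.training_days >= 5, "five days of training"),
        (record.audits_participated >= 2 and record.audit_days_total >= 20,
         "two audits totalling 20 days"),
        (record.preservation_years >= 4,
         "four years of digital preservation work"),
        (record.current_on_best_practices,
         "current with best practices and standards"),
        (record.has_current_experience, "current experience"),
        (record.ptab_certified, "PTAB certification"),
    ]
    return [label for passed, label in checks if not passed]


if __name__ == "__main__":
    candidate = AuditorRecord(
        training_days=3,
        audits_participated=2,
        audit_days_total=12,
        preservation_years=6,
        current_on_best_practices=True,
        has_current_experience=True,
        ptab_certified=False,
    )
    for gap in unmet_requirements(candidate):
        print("Not yet met:", gap)
```

Returning the list of gaps, rather than a single pass or fail value, lets the same check double as guidance on what a candidate still needs to complete.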
The authors outlined additional recommendations, including a requirement that the auditor must have access to the client organization’s records. If these records cannot be accessed, the audit may not be able to proceed. The CCSDS defined the criteria against which an audit is performed as those defined in the Recommended Practice, “Audit and Certification of Trustworthy Digital Repositories (CCSDS 652.0-M-1)”. They require two auditors to be present on site; other auditors may work remotely. The authors note in an appendix on security that all auditors must maintain confidentiality with respect to an organization’s systems, content, structure, data, etc., as required (CCSDS, 2011).
In conclusion, the CCSDS has created a method for a larger umbrella organization — PTAB — to certify the certifiers of a trusted digital repository by creating a “Recommended Practice for bodies providing audit and certification” as a supplement to the existing ISO/IEC standard for “Conformity assessment — Requirements for bodies providing audit and certification of management systems”. By creating both a certification program and the criteria for certification of trustworthiness, these experts believe they have ensured the availability of digital information over the indefinite long-term.
Gladney (2005; 2004) has been a vocal critic of the repository-centric approach to digital preservation, which he considers “unworkable”. He has proposed, instead, the creation of durable digital objects that encode all required preservation information within the digital object itself. R. Moore has reservations about the “top-down” approach, in which standards are handed down from a body of experts to be used by practitioners. He would like to know what policies preservation data grid administrators are actually implementing at the machine level (Ward, 2011).
Echoing R. Moore’s concerns, Thibodeau (2007) supports the development of standards for digital preservation, but he believes these standards should be supplemented by empirical data regarding the purpose of each repository. For example, practitioners should not assess a repository based solely on whether or not the repository is OAIS-compliant. He writes that practitioners should consider the purpose of the repository, its mission, and its user base, and whether or not the repository’s owners are fulfilling those requirements. Thibodeau (2007) defined a five-point framework for repository evaluation that considers service, collaboration, “state”, orientation, and coverage. He believes that this broader context, along with the OAIS Reference Model and the Recommended Practice for the Audit and Certification of Trustworthy Repositories, provides a more realistic determiner of a repository’s “success” or “failure”.
Archivists, librarians, computer scientists and other stakeholders and experts in digital preservation wanted to create certification standards for trustworthy digital repositories, and they voiced this desire in a 1996 report, “Preserving Digital Information” (Waters & Garrett, 1996). As one part of this enthusiasm for standards, the CCSDS released the OAIS Reference Model (CCSDS, 2002). Experts recognized that a technical framework was only part of a preservation repository, and so they worked to define the attributes and responsibilities of a trusted digital repository (RLG, 2002). They created an audit and certification checklist based on these attributes and responsibilities, called TRAC (OCLC & CRL, 2007). After receiving feedback from the preservation community, the CCSDS evolved TRAC into the Recommended Practice for the Audit and Certification of Trustworthy Digital Repositories (2011), and released the Recommended Practice for Requirements for Bodies Providing Audit and Certification of Candidate Trustworthy Digital Repositories (2011).
Thus, after many years of work, stakeholders with an interest in the preservation of digital material now have criteria against which to judge whether or not a repository and its contents are likely to last for the indefinite long-term, as well as an umbrella organization that will provide certified and trained auditors. To reiterate these accomplishments, over the past fifteen years, preservation experts have defined “trust” and a “trustworthy” digital repository; defined the attributes and responsibilities of a trustworthy digital repository; defined the criteria and created a checklist for the audit and certification of a trustworthy digital repository; evolved these criteria into a standard; and defined a standard for bodies that wish to provide audit and certification to candidate trustworthy digital repositories.
The significance of these accomplishments cannot be overstated — at stake in the concerns over the preservation of digital objects and information are the cultural and scientific heritage, and personal information, of humanity.
Bearman, D. & Trant, J. (1998). Authenticity of digital resources: towards a statement of requirements in the research process. D-Lib Magazine. Retrieved April 14, 2009, from http://www.dlib.org/dlib/june98/06bearman.html
Becker, C., Kulovits, H., Guttenbrunner, M., Strodl, S., Rauber, A., & Hofman, H. (2009). Systematic planning for digital preservation: evaluating potential strategies and building preservation plans. International Journal of Digital Libraries, 10(4), 133-157.
CCSDS. (2011). Requirements for bodies providing audit and certification of candidate trustworthy digital repositories recommended practice (CCSDS 652.1-M-1). Magenta Book, November 2011. Washington, DC: National Aeronautics and Space Administration (NASA).
CCSDS. (2011). Audit and certification of trustworthy digital repositories recommended practice (CCSDS 652.0-M-1). Magenta Book, September 2011. Washington, DC: National Aeronautics and Space Administration (NASA).
CCSDS. (2002). Reference model for an Open Archival Information System (OAIS) (CCSDS 650.0-B-1). Washington, DC: National Aeronautics and Space Administration (NASA). Retrieved April 3, 2007, from http://nost.gsfc.nasa.gov/isoas/
Dale, R. (2007). Mapping of audit & certification criteria for CRL meeting (15-16 January 2007). Retrieved September 11, 2007, from http://wiki.digitalrepositoryauditandcertification.org/pub/Main/ReferenceInputDocuments/TRAC-Nestor-DCC-criteria_mapping.doc
Digital Curation Centre. (2011). DRAMBORA. Retrieved December 9, 2011, from http://www.dcc.ac.uk/resources/tools-and-applications/drambora
Dobratz, S., Schoger, A., & Strathmann, S. (2006). The nestor Catalogue of Criteria for Trusted Digital Repository Evaluation and Certification. Paper presented at the workshop on “digital curation & trusted repositories: seeking success”, held in conjunction with the ACM/IEEE Joint Conference on Digital Libraries, June 11-15, 2006, Chapel Hill, NC, USA. Retrieved December 1, 2011, from http://www.ils.unc.edu/tibbo/JCDL2006/Dobratz-JCDLWorkshop2006.pdf
Gladney, H.M. & Lorie, R.A. (2005). Trustworthy 100-Year digital objects: durable encoding for when it is too late to ask. ACM Transactions on Information Systems, 23(3), 229-324. Retrieved December 29, 2011, from http://eprints.erpanet.org/7/
Gladney, H.M. (2004). Trustworthy 100-Year digital objects: evidence after every witness is dead. ACM Transactions on Information Systems, 22(3), 406-436. Retrieved July 12, 2008, from http://doi.acm.org/10.1145/1010614.1010617
Hedstrom, M. (1995). Electronic archives: integrity and access in the network environment. American Archivist, 58(3), 312-324.
ISO/IEC 17021. (2011.) Conformity assessment — Requirements for bodies providing audit and certification of management systems. Retrieved December 30, 2011, from http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=56676
Jøsang, A. & Knapskog, S.J. (1998). A metric for trusted systems. In Proceedings of the 21st National Information Systems Security Conference (NISSC), October 6-9, 1998, Crystal City, Virginia. Retrieved December 27, 2011, from http://csrc.nist.gov/nissc/1998/proceedings/paperA2.pdf
Lee, C. (2010). Open archival information system (OAIS) reference model. In Encyclopedia of Library and Information Sciences, Third Edition. London: Taylor & Francis.
Lynch, C. (2001). When documents deceive: trust and provenance as new factors for information retrieval in a tangled web. Journal of the American Society for Information Science and Technology, 52(1), 12-17.
Lynch, C. (2000). Authenticity and integrity in the digital environment: an exploratory analysis of the central role of trust. Authenticity in a digital environment. Washington, DC: Council in Library and Information Resources. Retrieved April 14, 2009, from http://www.clir.org/pubs/reports/pub92/pub92.pdf
Lynch, C. A. (1994). The integrity of digital information: mechanics and definitional issues. Journal of the American Society for Information Science, 45(10), 737-744.
OCLC & CRL. (2007). Trustworthy repositories audit & certification: criteria and checklist version 1.0. Dublin, OH & Chicago, IL: OCLC & CRL. Retrieved September 11, 2007, from http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf
Research Libraries Group. (2005). An audit checklist for the certification of trusted digital repositories, draft for public comment. Mountain View, CA: Research Libraries Group. Retrieved April 14, 2009, from http://worldcat.org/arcviewer/1/OCC/2007/08/08/0000070511/viewer/file2416.pdf
Research Libraries Group. (2002). Trusted digital repositories: attributes and responsibilities an RLG-OCLC report. Mountain View, CA: Research Libraries Group. Retrieved September 11, 2007, from http://www.oclc.org/programs/ourwork/past/trustedrep/repositories.pdf
Ross, S. & McHugh, A. (2006). The role of evidence in establishing trust in repositories. D-Lib Magazine 12(7/8). Retrieved May 6, 2007, from http://www.dlib.org/dlib/july06/ross/07ross.html
Science and Technology Council. (2007). The digital dilemma strategic issues in archiving and accessing digital motion picture materials. The Science and Technology Council of the Academy of Motion Picture Arts and Sciences. Hollywood, CA: Academy of Motion Picture Arts and Sciences.
Thibodeau, K. (2007). If you build it, will it fly? Criteria for success in a digital repository. Journal of Digital Information, 8(2). Retrieved December 27, 2011, from http://journals.tdl.org/jodi/article/view/197/174
Trust. (2011). Merriam-Webster.com. Encyclopaedia Britannica Company. Retrieved December 30, 2011, from http://www.merriam-webster.com/dictionary/trust
Ward, J.H. (2011). Classifying Implemented Policies and Identifying Factors in Machine-Level Policy Sharing within the integrated Rule-Oriented Data System (iRODS). In Proceedings of the iRODS User Group Meeting 2011, February 17-18, 2011, Chapel Hill, NC.
Waters, D. and Garrett, J. (1996). Preserving Digital Information. Report of the Task Force on Archiving of Digital Information. Washington, DC: CLIR, May 1996.