Long-Lived Digital Collections
These continuity factors are based on CRL's case studies of long-lived digital repositories, supported by the National Science Foundation. The case studies identified a number of strategies and practices that successful repositories use to help mitigate risks. Ten strategic factors in particular appear to promote continuity in the repositories studied thus far.
Those continuity factors are:
1. Quality Management: The baseline definition of sustainability is embodied in the requirements of ISO 9001:2000, the standard for process-based quality management systems. This standard calls for organizations to continually monitor and improve their services and products to satisfy user requirements while optimizing available resources. The standard identifies basic traits, such as clear lines of authority and accountability, robust policy infrastructure, and well-designed processes and mechanisms that preserve the value of an organization’s services.
2. Domain Dominance: Some repositories reach scale quickly and rapidly surpass or co-opt alternatives or competitors in the field. Such repositories rapidly capture a critical mass or comprehensive share of the existing data for a discipline or a domain. The ability of a repository to eliminate redundant or rival efforts is essential in avoiding the high costs that often accompany unproductive competition for resources and clients. Domain dominance also eliminates the waste of resources that arises from ambiguity and uncertainty in the market.
Clear pre-eminence of a repository or its controlling organization in a particular domain can also enable the repository to set standards for the production, exchange and use of data; impose uniformity in the instrumentation and tools employed to produce and use the data; and otherwise shape the landscape to the repository’s advantage. Exemplars: Associated Press, Chemical Abstracts Service, UMI.
3. Concentration / Clarity of Purpose: Long-lived repositories clearly delineate their territory and user communities; precisely specify services and other outputs; and articulate their missions clearly and consistently to stakeholders. Some repositories focus relentlessly on a particular type of content or data and thus are able to realize economies of scale in key processes. Others focus on single fields of endeavor or research, and are thereby able to capture as large a share of the contributor/user populations as possible. Exemplars: ICPSR, CAS, General Social Survey.
4. Market Diversity: Enduring repositories tend to cultivate and support multiple communities of interest that reside in different geographic, demographic or economic sectors. This enables a repository to survive downturns in any single sector of its market. Repositories that rely predominantly on public funding, for example, are vulnerable in periods of political change and nation- or region-based economic crises. Exemplars: CAS, AP.
5. Multiple Versioning and Outputs: The ability of a repository to generate multiple derivative products and services from its core content and activities also seems to promote longevity. Exemplars: AP, CAS. Associated Press (AP) has been able to generate revenue from uses of its older text archives for longitudinal studies of business performance. CAS has built an array of databases and published indexes based upon its traditional core activity: the abstracting and analysis of chemical literature.
6. Control of Supply and Distribution Channels: Some repositories are able to control the amount and complexity of the content or data they accept. Since amount and complexity of content are cost drivers, a repository must ensure that its intake and processing costs remain commensurate with the value of its outputs to its user communities.
7. Incentive Systems: Effective repositories offer powerful incentives for the preparation and deposit of data, and for support of the repository. Incentives can be monetary, functional, material, or reputational returns, offered in exchange for the contribution of content, services, and other forms of support to the repository. Simply put, scientists and researchers who realize a return from making data available are likely to find ways to continue to contribute. This dynamic is demonstrated not only by proprietary, subscription-based data collections like ICPSR but also in the open access models explored by AP, where impact and relevance earn advertising revenues for the publisher. Reputational capital and its corollary, professional advancement, are powerful rewards in the sciences. Exemplar: Chemical Abstracts Service. The early CAS editors generated and maintained a sense of community among CAS abstractors, which enabled the organization to build a cohort of thousands of volunteers.
8. Environmental Sensitivity: To be effective, repositories must put in place the means to detect changes, positive or negative, in the technology, business and legal environments in which they operate. These means should be not just reactive but, to the extent possible, predictive as well. This enables a repository to anticipate changes that might threaten its viability, degrade the usefulness or functionality of its content, or undermine its value to stakeholders. Such means often go beyond simple technology watch mechanisms, and ideally position the repository to shape its economic, legal, regulatory, and technology environments. Exemplars: AP, UMI, CAS.
9. Robust Feedback Mechanisms: One form of environmental sensitivity is a repository’s responsiveness to its producer and user communities. This requires effective mechanisms for obtaining feedback from stakeholders, responsiveness to that feedback, and a perception among stakeholders that the repository is responsive. Together these enable the repository to adapt promptly and effectively to changes in the practices, behaviors and expectations of users and producers of data. The most direct and efficient feedback loops occur when those who maintain the data are also its producers and/or users. Exemplars: CAS, UMI, GSS.
10. Structural Accountability: The repository has in place governance and internal organizational processes and structures that ensure the continual disclosure of key information to stakeholders, and all but guarantee organizational responsiveness to stakeholder concerns. In concrete terms, this means that the primary governance bodies are duly constituted and empowered, and mirror the populations of predominant stakeholder sectors. It can also be evidenced by a sound, formal, written escalation path for the organization, consisting of progressive reporting and upward referral of unresolved issues. Exemplars: CAS, ICPSR.