Long-Lived Digital Collections

Release Date: Monday, September 20, 2010

Funded by the National Science Foundation under its Strategic Technologies for Cyberinfrastructure Program, CRL initiated a two-year project (Jan. 2008-Dec. 2009) to analyze established, "long-lived" collections of data and digital resources, and to create tools and metrics for developing and assessing new repositories. The CRL case studies identified the practices, strategies and mechanisms that have enabled those repositories to sustain massive data collections over substantial periods of time.

The sciences and social sciences now create and collect massive amounts of digital data, generating stewardship demands that traditional libraries and archival organizations cannot fully meet. During the past three decades, large new repositories of digital data have emerged to meet the needs of scientists and of researchers in the social sciences and humanities. Data stewardship is now undertaken by federal agencies, discipline-based consortia of scientists and researchers, supercomputer centers, universities, institutes, and for-profit corporations like ProQuest, ExxonMobil, and Google. Some emerging data repositories have flourished and persisted; others have not.

The project generated and disseminated innovative models, risk assessment tools, cost data, and metrics to enable informed planning and prudent investment in cyberinfrastructure by the NSF and other federal agencies, universities, scientific consortia and institutes, corporations, publishers, and other stakeholders across the science, social science, and humanities communities.

The subjects of the case studies include the following repositories: