PCWA Purpose

The CRL Political Communications Web Archiving (PCWA) project, undertaken in 2003-2004, explored methodologies for the systematic, sustainable preservation of Web-based political communications. These communications are a valuable source of information for historical studies and the social sciences, but they are by nature fugitive and susceptible to loss, so it is important to ensure their long-term survival and broad availability for research. The cooperative effort drew upon the expertise of technology and subject specialists at New York University, Cornell University, Stanford University, and the University of Texas–Austin, and built upon investigations underway at those institutions. It also drew from the broader community, including the Library of Congress, the California Digital Library, and the Internet Archive, to identify methodologies that can be applied generally by the larger research community and across regions.


The project had three principal objectives:

  • Determine the organizational and economic framework necessary to support the archiving of Web-based political materials on an ongoing basis and the persistent availability of those resources for long-term research use.
  • Identify the optimal curatorial regimes, practices, and tools for ongoing identification, targeting, and capture of the various types of Web-based political communications to be archived. Develop a growth plan, reconciling Web archiving and curatorial methodologies with traditional collection development activities and the regimens appropriate to the capture of various kinds of communications.
  • Identify and specify the most appropriate technology architecture(s), tools, and techniques for gathering and preserving Web-based political communications and the associated costs, benefits, characteristics, and risk factors.


The archived materials are intended to support the following uses:

  • Scholarly research and teaching, in particular by historians and political scientists.
  • Study and informational use by members of the international development, policy, diplomatic, and journalism communities and lay individuals.
  • Inclusion in noncommercial publications/aggregations (definition to come later).

Scope of Archive

The project focused on Web sites (as defined in the wireframe document), including those created by individuals and institutions. These included sites of political parties, movements, radical organizations, and NGOs in Latin America, Sub-Saharan Africa, Southeast Asia, and Europe. Related materials under the political communications rubric that might be addressed by subsequent investigations included listserv digests, RSS feeds, databases, and deeper Web sites that are password-protected or otherwise designed to be robot-restricted.