Portico is a digital repository that intends to archive scholarly electronic materials for the long term. Its primary intended customers are libraries and publishers who subscribe to the repository’s services. Titles within Portico are made available to subscribers when access is not possible through the publisher’s usual distribution channels.
The information in this report is based on interviews with the repository's staff and users of the databases, as well as the ITHAKA and Portico websites. In particular we would like to thank Amy Kirchoff, Archive Service Product Manager at ITHAKA. Secondary sources such as books and magazine or journal articles were also consulted. Any information that is not cited was found on the websites of ITHAKA and Portico.
Mission and History
Portico is a service of ITHAKA, a charitable trust (IRC 4947a1) started in 2003 and funded by the Andrew W. Mellon Foundation. ITHAKA’s mission is to accelerate the productive uses of information technologies for the benefit of higher education around the world. Portico shares ITHAKA’s mission:
ITHAKA is a not-for-profit organization dedicated to helping the academic community take full advantage of rapidly advancing information and networking technologies. We serve scholars, researchers, and students by providing the content, tools, and services needed to preserve the scholarly record and to advance research and teaching in sustainable ways. We are committed to working in collaboration with other organizations to maximize benefits to our stakeholders. (from ITHAKA website)
ITHAKA’s mission includes preservation as one of its goals and this is Portico’s primary service. Fulfilling a mission goal is important because it strengthens Portico’s relevance within the Ithaka organization.
Portico was created to fill a gap in the scholarly record. Staff from JSTOR (short for Journal Storage), a digital content service, were concerned that versions of electronic journal issues were sometimes different from the print but were not being systematically preserved. They believed that preservation of the electronic version was important for the future scholarly record. This concern developed into the JSTOR “Electronic-Archiving Initiative” project, funded with a grant from The Andrew W. Mellon Foundation. The project, beginning in 2002, intended for JSTOR to build the organizational and technical infrastructure necessary to ensure long-term preservation and access to e-journals. In 2004, the Electronic-Archiving Initiative was reassigned from a JSTOR project to an ITHAKA Harbors initiative. Renamed Portico, the new enterprise was located in Princeton, New Jersey. The system was completed in 2005.
After completing the technical system, Portico began to build its administrative infrastructure and funding model. JSTOR, ITHAKA Harbors, and the Andrew W. Mellon Foundation provided funding for Portico’s development. In 2006 Portico introduced a funding model with the goal of becoming a self-sustaining service. The subscription model was developed with input from both publishers and librarians.
With the system in place, Portico began to acquire content from publishers; the first to sign on were the American Mathematical Society and Elsevier. By April 2006, nine publishers had committed more than 3,200 journals to Portico1. Elsevier increased Portico’s holdings significantly when it deposited seven million articles in 2006. By the end of 2007, the number of titles had almost doubled to six thousand and more than forty publishers were participating. Currently Portico’s holdings include eleven thousand e-journals and thirty-three thousand e-book titles, a total of twenty-eight-and-a-half terabytes of data.
JSTOR and ITHAKA merged in January 2009 to form a new organization called Ithaka, which provides three services: JSTOR, an online database that offers scholarly resources; Portico; and Ithaka S+R, a research service focused on new technologies for the scholarly community. At the time of the merger, ITHAKA stated, “As one organization, Ithaka will explore how to use its combined knowledge and experience to help its constituents in new ways.2” ITHAKA intends to use its knowledge base to create services for the academic community. This may affect Portico’s staff as key members are moved to new projects, but based on CRL’s 2010 audit, there is every reason to assume Portico is a stable organization that will continue to provide preservation services to the library community.
Governance and Staffing
Portico is a service of ITHAKA, which provides administrative oversight for Portico. ITHAKA is responsible for human resources, financial control, legal counsel, information technology, and tool building for Portico subscribers. Portico’s staff focuses on two areas: content management and customer service. Content management includes the acquisition, ingestion, and dissemination of content. Customer service includes outreach to current and prospective subscribers and communication with content owners.
The ITHAKA Board of Trustees provides formal governance for Portico. The board meets four times a year. A subcommittee of the ITHAKA Board led by Deanna Marcum, Associate Librarian for Library Services at the Library of Congress, oversees the implementation of the preservation aspects of Ithaka’s mission, including Portico. Ms. Marcum is the only permanent member of the committee, with other ITHAKA board members serving on the committee as needed3.
Planning effectively is an important aspect of repository administration. In the CRL 2010 audit, we saw evidence of Portico’s active planning process. Portico effectively used the stages of planning, strategy development, and implementation to develop new services, such as an e-books project, and held planning sessions with senior staff twice a year. The Portico Leadership Group (PLG), made up of Portico staff and invited guests, holds a strategic planning session over two days each summer and fall. These meetings set goals and objectives for Portico departments and staff. The time Portico sets aside for bringing leadership staff together has been an effective mechanism for setting future activities and monitoring change within the enterprise.
A trusted digital repository must obtain outside feedback from important communities. Portico seeks outside opinions and ideas on particular topics through ad-hoc committees, which ensure that the needs of the community are being met. One such committee, the Portico Advisory Committee (PAC), comprised of representatives from the scholarly publishing and academic library communities, met from 2005 to 2008 to advise Portico on its business model, content acquisition, and services.
It is essential that Portico staff understand the needs of the communities they serve. Staff at Portico come primarily from the publishing, information technology, and library fields. Many of the original Portico administrative staff came from JSTOR or scholarly publishers. Portico’s Executive Director Eileen Fenton is a librarian and was previously Director of Production at JSTOR. Information technology staff at the senior level were largely hired from JSTOR.
Staff from Portico, JSTOR, and Ithaka manage different aspects of Portico’s repository system. The Portico hardware and software systems were designed by JSTOR and Portico staff. The shared IT unit of ITHAKA manages much of the technical infrastructure, including networking; hardware installation, configuration, and administration; and database administration. Access management, should a trigger event occur, will be provided through JSTOR’s web interface, leveraging its extensive existing delivery infrastructure. The Princeton University Computing Center stores the systems hardware and the master (primary) copies of all electronic journals in its server room.
There have been some changes within the Portico leadership since 2006. With the merger, Eileen Fenton’s title changed from Executive Director to Managing Director of Portico and she was given additional responsibilities, overseeing ITHAKA’s technology services and content management unit. The merger also added Bruce Heterick, Vice President for Outreach & Participation Services at JSTOR, to Portico’s administrative staff. Mr. Heterick’s responsibilities for JSTOR’s global outreach and access services were expanded to include Portico. The departure of Evan Owens in March 2010 was another change within Portico’s administrative staff. Mr. Owens participated in building Portico’s system and in building tools, such as JHOVE2 and the NLM-DTD, that advanced the Portico archiving system.
Funding and Planning
A trusted digital repository has to prove it is financially stable. One way a repository can do this is by establishing and maintaining diverse sources of income. Portico has three main sources of income: Ithaka, grants, and subscriptions from libraries and publishers.
Portico’s goal is to be supported through subscription. A tiered subscription model was chosen as a result of feedback from a committee of publishers and libraries. For publishers the annual financial contribution is based on a publisher’s total revenues, including print and electronic subscriptions, licensing, and advertising. Current yearly rates for publishers are between $250 and $78,000. For libraries, the subscription is based on the total library materials expenditures (LME) (Portico uses the definition of LME provided by the Association of Research Libraries (ARL) to determine payment). Rates for most libraries range from $1,500 to $25,000 annually.
Subscriptions from libraries and publishers continue to grow. In January 2011, Portico had over 700 library members worldwide, including libraries on every continent but Africa. The illustration below details the growth of library subscribers. Publisher subscriptions have also increased. Today 121 publishers hold formal archiving agreements with Portico.
Grant funding, another source of income, does not sustain regular Portico activities, but is usually earmarked for development of a particular aspect of Portico and its services. This is a sound financial practice because grant funding is unreliable. Portico’s grants have come from a variety of funders, including the Library of Congress, the Institute of Museum and Library Services (IMLS), and the National Endowment for the Humanities (NEH). The Library of Congress has been especially generous, contributing a startup grant of $3 million. Other grants have supported the development of specific repository tools. Portico was one of three partners in a grant project of the Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) to develop JHOVE2 (JHOVE and JHOVE2 are repository tools for validating content). The tools Portico has helped create have been shared throughout the repository community and improve repository processes.
It is difficult to ascertain exactly how much money Ithaka has invested in Portico, as ITHAKA’s finances are publicly available but Portico’s are not. ITHAKA is a solvent organization willing to contribute to the development and sustenance of Portico services, and having a willing parent organization with a solid cash flow helps to ensure Portico will not fail.
Stakeholders and Designated Community
If a repository cannot provide usable content to its designated community when it is needed, then it is not a trusted digital repository. How a repository will accomplish preservation of content will vary, but the intended outcome of digital preservation is that the data is usable when it is needed. Meeting the needs of the designated community (the users for whom the content is being preserved) is of primary importance when determining the trustworthiness of Portico or any repository. In the 2010 CRL audit, Portico's designated community is defined as, "publishers, librarians, scholars and students in participating institutions.”
Portico does not preserve the content for the publisher, but for the use of the designated community. This important distinction implies the content is being preserved for eventual use within Portico’s systems, rather than through publishers’ websites. In the long run this goal better serves the needs of the university community rather than those of the publishers. Portico nonetheless provides publishers with some useful benefits. Portico services allow publishers to reduce (or eliminate) internal archiving costs, address customer concerns by providing a trusted, third-party archive, and provide libraries with perpetual access to content they have purchased. Publishers may also choose to provide post-cancellation access to their archived content through Portico’s platform.
ITHAKA and the publisher sign a Publication License Agreement that allows Portico to preserve the original content within the object. In the agreement, Portico states some of the actions it will take to preserve the content, including:
* Creating an archival version that will preserve the textual, audiovisual, and other content of publication(s)
* Facilitating preservation (including migration to new file formats and technologies)
* Verification and management of the content
Publishers agree to allow Portico to preserve the original content within the file. Portico does not commit to preserving all aspects of the file. The look and feel, links to outside content, and other features of the original online version may not be rendered when the file is eventually made available for use. This commitment is reasonable, and many other digital repositories commit to a similar level of preservation. Portico preserves the original file(s) so restoration of the original file may be possible given the appropriate technology.
Portico provides library subscribers with assurance that the electronic content they purchase from publishers will be available through the Internet in the event of an established trigger event. The Portico service is not a replacement for traditional library databases that provide access to new or current scholarly content. Libraries may also benefit from Portico’s services because it may allow them to reduce or eliminate redundant print or digital content within their facilities.
Other stakeholders who benefit from Portico include organizations using digital repository technologies and the designated community: the scholars and researchers of the future. This work has an impact far beyond current subscribers and future users. Organizations using digital repository technologies include archives, libraries, science data centers, and many others. These organizations have benefitted from access to the tools and standards that Portico helped develop. Portico’s participation in building important digital repository tools, including JHOVE2 and the NLM DTD, has helped the digital archiving community. The designated community’s future users are guaranteed access to Portico’s content for research when it is needed. Many of today’s scholars are not aware of digital preservation efforts, but this is likely to change as digital publications become more widely accepted as a scholarly resource.
Content and Services
The Portico collection consists of e-journals, e-books, and d-collections (digital collections). Portico aims to preserve the scholarly record. With this goal, it is likely they will continue to expand the types of content they will preserve. In February 2011, Portico’s website listed approximately 12,000 e-journal titles, 66,000 e-book titles, and 39 d-collections. Existing Portico subscribers may choose to subscribe to either the e-journals or e-books collections, or both when they renew their Portico agreement. This subscription model allows libraries the flexibility to limit spending, should they be interested in narrowing their preservation goals. D-collections have a different business model, as the publishers are paying all the storage costs and, should a trigger event occur, the publisher’s subscribers will be the only ones with access.
When a trigger event occurs, Portico subscribers are given access to whatever licensed content they have purchased from its publisher. Depending on Portico’s agreement with a publisher, it will provide access to a title when trigger events occur or a title with post-cancellation access status is cancelled. Publishers rely on Portico’s post-cancellation access (PCA) to provide this service to former customers, eliminating the requirement to build their own delivery system, and ensuring they can meet the terms of their agreements. For libraries, the perpetual access service ensures that the investment they have made in content is long-term, and that future scholars and researchers will continue to be served. Publishers may not select particular titles for PCA; only all or none of their titles may be designated as PCA. Eighty-eight percent of e-journal publishers and 87 percent of e-book publishers have committed their content in Portico with post-cancellation access.
Portico trigger events are defined as when:
1. A publisher ceases operations and no entity purchases and makes its titles accessible
2. A publisher ceases to publish a title and it is not offered by another entity
3. Back issues are removed from a publisher’s site and are not available elsewhere
4. Catastrophic and sustained failure disables the publisher’s delivery platform
5. A publisher opts to rely upon Portico to meet perpetual access obligations
Prior to providing access to a publisher’s content due to the occurrence of a trigger event, Portico sends written notice to the licensor of its intention to release the content. Portico will not make the content accessible if it is reasonably assured by the publisher that a trigger event has not taken place, or that the content is now available through the publisher or another entity. After receiving notice, publishers may specify the date when triggered content becomes accessible on Portico’s delivery platform, so long as the delay does not exceed 60 days. Publishers may also choose to release their content via the Portico delivery platform should an interruption occur. When the content is once again available through the publisher or its designated successor, Portico will stop access.
There are currently 117 publishers contributing e-journals and approximately 12,500 e-journal titles (the Portico website has a list of all titles archived or committed). Portico also provides a downloadable Excel spreadsheet of all titles and issues, which is a useful tool for those who wish to analyze the titles and holdings. Subscribing libraries can receive a customized comparison of Portico holdings against their own. Eighty-eight percent of the e-journals within the repository have perpetual content-access designations. Most of the journal titles available in Portico originate in the United States and Northern Europe. Languages are predominantly Western European, with English being the major language.
D-collections started in 2008 and encompass digitized historical collections. They predominantly consist of primary source content, such as newspapers and correspondence. Many of these digital collections originated as microfilm sets, such as "19th Century U.S. Newspapers." Currently two publishers, Adam Matthew Digital, and Gale, a part of Cengage Learning, contribute content. Libraries do not pay to subscribe to D-collection content, which is supported solely by the individual publishers that have committed their collections to the archive. If a trigger event requires access to a D-collection, the access is limited to a publisher’s previous customers.
Portico began preserving e-books at the beginning of 2011. Currently Springer and Taylor & Francis Group are the largest contributors. Nine publishers deposit e-books into Portico: Brill; De Gruyter; Duke; Elsevier; John Wiley and Sons, Inc.; Palgrave MacMillan; SPIE; Springer; and Taylor & Francis Group. Of these publishers, four do not offer post-cancellation access to e-books. Subject areas for e-books have a scholarly focus, with science, law, medicine, and the liberal arts all strongly represented, and some of the individual titles share a series or a society affiliation.
Current holdings for e-books within the Portico archive by publisher, from data acquired 2/8/2011
Portico is a dark archive. Access to content within the repository is not available to subscribers unless a trigger event occurs. In November 2007 the first trigger event occurred when the journal Graft: Organ and Cell Transplantation (SAGE Publications) was removed from SAGE’s online platform. Portico then made the title available to its library participants through the Portico website. Currently, there are four titles available through post-cancellation access to Portico subscribers on their website: Auto/Biography, Brief Treatment and Crisis Intervention, Graft: Organ and Cell Transplantation, and Pain Reviews.
Portico provides a web database for subscribers who want to audit the archive content. This tool is restricted to access for one to four users from a subscribing institution. Users have access to all content and metadata within the repository. The verification tool is not meant to be used as a substitute for regular library delivery channels such as ILL or document delivery. It is not the same access platform that Portico provides to deliver triggered content. The interface allows users to download an HTML rendering of a requested Archived Information Package (AIP) for verification. The subscriber is not given access to the actual AIP within the Portico digital repository. This is important because providing access to the actual AIPs within the archive would put the content at risk, violating Portico’s dark archive status.
Portico uses OpenURL links and works with linking and A–Z list vendors like CrossRef, CUFTS, EBSCO, ExLibris, OCLC Openly Informatics, SerialsSolutions, and TDnet to route users of library catalogs and other knowledge bases to the Portico delivery platform. This helps libraries and other knowledge institutions by connecting patrons to licensed content without additional work on their local catalogs.
Technical Systems and Analysis
Publishers submit source files to Portico to ingest new content into the repository. Portico normalizes these source files to an archival format and provides long-term archival management and format migration as needed. When specific conditions or “trigger events” occur, subscribing libraries receive online, campuswide access to relevant titles and holdings.
The Portico workflow involves verifying and normalizing publisher content, then zipping it into an archival information package (AIP) for storage. Portico’s policy is to accept what the publisher has available, regardless of its file formats or structure. Publishers have little incentive to spend labor on post-processing published content; requiring them to normalize content would discourage the flow of content. As a result, publisher content submitted to Portico is never uniform and must be normalized before ingestion into the archive. Archival objects are usually submitted in either TIFF or PDF format, along with publisher full-text or bibliographic (header-only) metadata in an XML file. Supplemental files (including spreadsheets, executable files, audio, and video) may also be included. These supplemental files are in formats commonly associated with the information they hold. Delivery to Portico is usually accomplished through FTP pull or push from the publisher.
Before beginning batch ingestion workflows, Portico staff spends time working with samples of new content streams. The investigation involves running their tools on sample content, and then evaluating the results. This approach enables them to develop specific plans (called transforms) that create versions of the publisher’s content suitable for the Portico Archived Information Package (AIP). For example, a publisher may use a character that Portico’s system does not recognize, so the transform will include mapping this unknown character to a Unicode character understood by Portico’s system. Portico is mindful that these types of actions have an impact on a file’s authenticity. For this reason, any transformation where data is lost or added is marked within the repository. In addition, all the original publisher files are included in the AIP.
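The character-mapping transform described above can be illustrated with a minimal sketch. This is not Portico's code: the function name, the mapping table, and the shape of the change log are all assumptions made for illustration. The sketch shows only the general idea of replacing an unrecognized publisher character with a Unicode character while recording every change, so that the alteration can be marked in the repository.

```python
# Illustrative sketch only; names and record layout are hypothetical,
# not taken from Portico's actual transform tooling.

def apply_transform(text, char_map):
    """Replace characters listed in char_map and log each replacement.

    Returns the normalized text and a list of change records, so that
    any transformation that alters the data can be flagged later.
    """
    normalized = []
    changes = []
    for offset, ch in enumerate(text):
        if ch in char_map:
            replacement = char_map[ch]
            normalized.append(replacement)
            changes.append({
                "offset": offset,
                "from": f"U+{ord(ch):04X}",
                "to": f"U+{ord(replacement):04X}",
            })
        else:
            normalized.append(ch)
    return "".join(normalized), changes


# Hypothetical example: a publisher uses a private-use code point
# where a Greek small letter alpha is intended.
publisher_map = {"\ue000": "\u03b1"}
normalized_text, change_log = apply_transform("x = \ue000 + 1", publisher_map)
```

Keeping the change log separate from the normalized text mirrors the practice described above of retaining the original publisher files alongside any transformed version.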
Portico Archival Processes6
Portico’s Content Preparation System (ConPrep) is the platform for both management of provider content and workflow processing. ConPrep manages the steps that lead to the creation of an AIP that will be sent to the archive. Some of the ConPrep workflow is automated and some is manual. The following steps have been excerpted in part from the Portico Content Preparation System Architecture document:
1. Data is delivered to ConPrep either manually or automatically by a publisher.
2. Submitted content is divided into batches.
3. Batches are scheduled for workflow processing.
4. Batches are automatically processed. Files are unzipped and verified with checksums, and the JHOVE file format verification tools are run on the contents. Metadata is extracted and relationships between files are recorded.
5. After auto-processing is completed, the batch undergoes a Quality Check (QC) to ensure there are no errors. This determines if the batch can be released to the archive.
6. After QC, a PorticoMETS file (Portico’s modified version of the METS schema) is generated to describe the structure of the content, and the data is released from ConPrep to be loaded into the archive (an Oracle database) for long-term storage.
7. Prior to loading into the archive additional steps are taken. The files are checked for: an authentic agreement ID, format types, and preservation level. The following are validated: asset inventory and checksums. An ingest event is recorded in the PorticoMETS file.
8. After this step, the batch is loaded into the archive.7
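The eight steps above form a strictly ordered pipeline in which a batch cannot reach the archive without passing each earlier stage. A minimal sketch of that ordering follows. The stage names and the `Batch` class are assumptions introduced for illustration; they are not identifiers from ConPrep.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names, one per numbered step above."""
    DELIVERED = 1          # step 1: data delivered by the publisher
    BATCHED = 2            # step 2: content divided into batches
    SCHEDULED = 3          # step 3: batch scheduled for processing
    AUTO_PROCESSED = 4     # step 4: unzip, checksum, format checks
    QC_PASSED = 5          # step 5: quality check completed
    RELEASED = 6           # step 6: metadata file generated, batch released
    PRELOAD_VALIDATED = 7  # step 7: agreement ID, inventory, checksums
    LOADED = 8             # step 8: batch loaded into the archive


class Batch:
    """Tracks one batch and refuses to skip or repeat a stage."""

    def __init__(self, batch_id):
        self.batch_id = batch_id
        self.stage = Stage.DELIVERED
        self.history = [Stage.DELIVERED]

    def advance(self, to_stage):
        # Only the immediately following stage is a legal transition.
        if to_stage.value != self.stage.value + 1:
            raise ValueError(
                f"{self.batch_id}: cannot move from "
                f"{self.stage.name} to {to_stage.name}"
            )
        self.stage = to_stage
        self.history.append(to_stage)


batch = Batch("batch-001")
for next_stage in list(Stage)[1:]:
    batch.advance(next_stage)
```

Recording the history of stages alongside the current stage gives a simple audit trail, in the same spirit as the ingest event recorded in step 7.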
Portico has made the choice to keep all versions and formats of the object it is preserving. Once normalized, a Portico AIP contains publisher-supplied files; components of the files, such as graphics and media files; format specifications; page images, and text files that may contain full text of an article and metadata headers or full records. Their decision is to write the AIP once and then add additional or new content versions to the existing AIP. This collection of different versions of files will provide alternatives for future rendering or migrating an object. The intent is that each preserved object can be reconstituted as a file system object, using non-platform-specific readers, completely independent of the Portico archive system. Portico packages each archived object in a ZIP file with all original publisher-provided digital artifacts, along with any Portico-created digital artifacts and XML metadata associated with it.
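As a rough illustration of this packaging approach, the sketch below builds a ZIP-based package from original and derived files and later appends a new version without rewriting the existing entries. The directory layout (`original/`, `derived/`, `metadata/`) and the function names are assumptions for the example; the source does not describe the internal layout of a Portico AIP.

```python
import io
import zipfile


def create_aip(original_files, derived_files, metadata_xml):
    """Build a ZIP package in memory (hypothetical layout)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in original_files.items():
            zf.writestr(f"original/{name}", data)
        for name, data in derived_files.items():
            zf.writestr(f"derived/{name}", data)
        zf.writestr("metadata/pmets.xml", metadata_xml)
    return buffer.getvalue()


def add_version(aip_bytes, entry_name, data):
    """Append a new entry; existing entries are never overwritten."""
    buffer = io.BytesIO(aip_bytes)
    with zipfile.ZipFile(buffer, "a", zipfile.ZIP_DEFLATED) as zf:
        if entry_name in zf.namelist():
            raise ValueError(f"{entry_name} already exists in the package")
        zf.writestr(entry_name, data)
    return buffer.getvalue()


aip = create_aip(
    {"article.pdf": b"%PDF-1.4 placeholder"},
    {"article.xml": b"<article/>"},
    "<mets/>",
)
aip = add_version(aip, "derived/article_v2.xml", b"<article version='2'/>")
```

Because the result is an ordinary ZIP file, any standard ZIP reader can list and extract its contents, which matches the stated goal of independence from the archive system.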
A Portico METS (PMETS) file is created for every object within the archive. An object’s PDF files are converted to XML and then wrapped in a Metadata Encoding and Transmission Standard (METS) wrapper. The PMETS file is an XML file in a modified version of the METS schema. The content within the archive can be reassembled because these PMETS files provide the structure for recreating the original object by linking all the component source files necessary to render the object.
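To make the linking role of such a file concrete, here is a minimal sketch that builds a small document using standard METS elements (`fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`). It uses the plain METS vocabulary only; Portico's modifications to the schema are not described in this report, so nothing here should be read as the actual PMETS format. The function name, object identifier, and file names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(object_id, files):
    """Build a minimal METS tree.

    files is a list of (file_id, href, mimetype) tuples. Each file is
    declared once in the file section and referenced from the
    structural map, which is what allows the object to be reassembled.
    """
    root = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    file_sec = ET.SubElement(root, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "article"})
    for file_id, href, mimetype in files:
        file_el = ET.SubElement(
            file_grp, f"{{{METS_NS}}}file",
            {"ID": file_id, "MIMETYPE": mimetype},
        )
        ET.SubElement(
            file_el, f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href},
        )
        ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return root


mets_root = build_mets(
    "example-object-1",
    [
        ("F1", "original/article.pdf", "application/pdf"),
        ("F2", "derived/article.xml", "application/xml"),
    ],
)
mets_xml = ET.tostring(mets_root, encoding="unicode")
```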
Portico considers the metadata object(s) to be of equal value to the content it describes. This is a good preservation policy because in the future it may be difficult to render a file and use it without sufficient metadata. The Portico metadata file includes structural, technical, descriptive, and events metadata. Descriptive metadata is usually extracted from a publisher-provided XML document or header. Portico does not correct or curate the publisher’s descriptive metadata, except for those fields necessary to provide access (ISSN, title, publication date, publisher name, journal title, volume and issue, author names, copyright information). Portico defines the syntax, element definitions, and attributes represented in its XML document. Some technical metadata is generated during the JHOVE validation process. To reduce redundant information within its metadata records, Portico created a master record for batch processes and links individual objects to that master record rather than recording details of each batch event at the object level.
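The master-record approach to batch events can be sketched as a simple data structure: each batch-level event is stored once, and object records hold only a reference to it. The class and field names below are assumptions for illustration and do not reflect Portico's actual metadata schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BatchEvent:
    """One batch-level event, stored once (hypothetical fields)."""
    event_type: str
    tool: str
    timestamp: str


@dataclass
class ObjectRecord:
    """Per-object metadata that references batch events by ID."""
    object_id: str
    batch_event_ids: list = field(default_factory=list)


class EventRegistry:
    """Holds master event records and resolves references to them."""

    def __init__(self):
        self._events = {}

    def register(self, event_id, event):
        if event_id in self._events:
            raise ValueError(f"duplicate event id: {event_id}")
        self._events[event_id] = event

    def link(self, record, event_id):
        if event_id not in self._events:
            raise KeyError(event_id)
        record.batch_event_ids.append(event_id)

    def events_for(self, record):
        return [self._events[eid] for eid in record.batch_event_ids]


registry = EventRegistry()
registry.register(
    "evt-1",
    BatchEvent("format-validation", "JHOVE", "2011-02-08T10:00:00Z"),
)
article_a = ObjectRecord("article-a")
article_b = ObjectRecord("article-b")
registry.link(article_a, "evt-1")
registry.link(article_b, "evt-1")
```

With this layout, a thousand objects processed in one batch share a single event record instead of carrying a thousand identical copies.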
Portico chooses to use commonly available hardware and software, such as that sold by Oracle, to build its technology infrastructure. It commits to updating the technology as it evolves. The hardware is stored at the Princeton University campus data center. Portico staff have run several large-scale tests of their systems and data, including fixity verifications and a web penetration test. System problems are captured and stored using JIRA issue-tracking software.
Preservation of Repository Content
Although Portico is a “dark” archive, it serves content when certain conditions are met. It must assure there are access-ready objects available, because this affects how the content is managed and how it is stored. Portico’s preservation strategies include several offsite backups, using open standard file formats, policies and procedures developed to ensure content is properly archived, fixity verification of content, and monitoring of file formats and technologies.
Off-line copies of archived content are stored with a combination of commercial storage providers and strategic partners. Portico maintains four copies of its archive. The master copy is stored near Portico’s main office in a data center located on the campus of Princeton University in New Jersey. Three backups of its content are stored offsite. Copies of the repository are maintained in Ann Arbor, Michigan, and in Colorado in the Amazon cloud. Tape backups kept at the Koninklijke Bibliotheek (KB) in the Netherlands are replaced on a six-month rotating schedule. The KB maintains a clean, secure environment and the tapes are not available for access.
Portico commits to using open standard file formats to archive content. These include the NLM DTD for XML metadata files, and storing objects in PDF format. Open-standard file formats make it more likely a future user will be able to recover data because the code is known. When the code is available, software can be written to recover the content of the files.
Portico has developed a series of policies and procedures that outline how it ingests, stores, and delivers content. For example, among the documents required by the TRAC Checklist, the Portico Disaster Recovery Plan helps to illuminate how it will ensure its content is preserved should the archive fail. This plan is reviewed on a regular basis and provides a step-by-step guide to how the repository will act in the event of a disaster.
To maintain content within the repository, Portico monitors its ingested data. It runs a biannual fixity check of the master copy of the archive against the checksums in the metadata database. Comparing a file’s current checksum with its recorded value reveals whether the file has changed. Portico uses the SHA-512 hash function, which produces the longest digest (512 bits) in the SHA-2 family. Portico plans to incorporate a CRC checksum into the fixity check within the next year. The TRAC Checklist requires that repositories run an independent fixity check on their content on a regular basis.
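A fixity check of this kind can be sketched in a few lines using Python's standard `hashlib` module. The function names and the shape of the recorded-checksum table (a mapping from relative path to hexadecimal SHA-512 digest) are assumptions for illustration; the report does not describe how Portico's metadata database stores checksums.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_check(root, recorded):
    """Compare files under root against recorded checksums.

    recorded maps a relative path to its expected SHA-512 hex digest.
    Returns a list of (relative_path, problem) tuples; an empty list
    means every recorded file is present and unchanged.
    """
    failures = []
    for relative_path, expected in sorted(recorded.items()):
        candidate = Path(root) / relative_path
        if not candidate.is_file():
            failures.append((relative_path, "missing"))
        elif sha512_of(candidate) != expected:
            failures.append((relative_path, "checksum mismatch"))
    return failures
```

Reading in chunks keeps memory use flat regardless of file size, which matters when the check runs across an archive measured in terabytes.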
In addition to fixity checking, Portico has done other testing of its system, including a 2008 security vulnerability assessment performed by an outside security firm. Portico addressed the vulnerabilities found after the completion of the test. This type of testing helps illustrate the level of commitment Portico has to preserving content.
Portico also has appropriate staff monitor information sources for changes in the software, hardware, and file formats used at Portico. This activity is referred to as a Technology Watch in the TRAC Checklist. Repositories need to monitor and migrate software, hardware, and file formats as they approach obsolescence. To monitor changes in file formats, as well as other technology, Portico staff monitors a large number of relevant RSS feeds and listservs, ensuring they are up-to-date on the standards for relevant file formats.
Portico also uses metadata as a preservation strategy. The metadata includes descriptive information from the publisher and preservation metadata that records every action taken on archived content. Portico records events performed on the AIP in the PMETS file. Examples of actions taken on an object include extracting descriptive metadata from publisher-provided mark-up files, ingesting the package into the archive, verifying the file format using JHOVE, and extracting technical metadata from JHOVE output to include in the Portico metadata file. Portico also records important information about the file, such as the MIMETYPE field. This field stores the original file name extension and identifies the format of a file. This information could be useful should Portico need to use emulation to render a file.
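The idea of recording every action taken on an archived object can be sketched as an append-only event list. The sketch below is purely illustrative and does not reproduce the PMETS schema: the class names, field names, and event labels are hypothetical, chosen only to mirror the kinds of actions listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class PreservationEvent:
    """One action performed on an archived object (hypothetical fields)."""
    event_type: str   # e.g. "ingest" or "format-verification"
    agent: str        # tool or workflow that performed the action
    outcome: str      # short result label
    timestamp: str    # UTC time in ISO 8601 form


@dataclass
class ArchivedObject:
    """An archived file plus the history of actions taken on it."""
    identifier: str
    mimetype: str
    events: List[PreservationEvent] = field(default_factory=list)

    def record(self, event_type: str, agent: str, outcome: str = "success") -> PreservationEvent:
        """Append a new event; earlier events are never modified or removed."""
        event = PreservationEvent(
            event_type, agent, outcome, datetime.now(timezone.utc).isoformat()
        )
        self.events.append(event)
        return event
```

For example, calling `record("format-verification", "JHOVE")` on an object after `record("ingest", "ingest-workflow")` leaves a two-entry history that a later curator can read in order.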
Portico does not commit to preserving the look and feel of the original object. The final object that renders may be minimal, such as a text file. Subscribers agree to this at the time of signing the Portico agreement.
Portico commits to being an OAIS-compliant repository. OAIS compliance helps to define the preservation strategies to which Portico commits. To maintain its status, Portico must continually review its preservation choices, including the standards to which it adheres, the policies it has in place, and its work processes. This commitment helps keep Portico current in terms of software, hardware, adherence to standards, and preservation practices.
Portico is a solution to the digital preservation needs of university libraries and the publishers who supply them. Its services continue to develop as digital content and the needs of the community change. Portico has a lot of room to expand its customer base and its holdings.
Subscribing libraries need to be aware of their role in building the Portico repository. Many libraries have a “perpetual access” clause within their contracts, and this clause can be satisfied through Portico’s PCA service. Library subscribers should therefore encourage their publishers to deposit electronic content into Portico and subscribe to PCA services.
The service is not as widely subscribed to across U.S. academic libraries as it should be. According to the National Center for Education Statistics (NCES), there are currently 911 academic libraries at U.S. universities that offer at least one doctoral-level degree.8 Portico subscribers should include a significant portion of these 911 doctoral universities. Portico currently has a total of 367 U.S. subscribers, comprising libraries of all types and sizes as well as some non-library organizations. Increasing participation by academic libraries continues to be an important goal for Portico, so that it can serve the needs of the academic community.
There are issues at work within the academic community and the larger information environment that may affect Portico. The first is competition from other repositories that offer preservation services to the academic community. Among these is CLOCKSS, whose mission is to store academic e-journals on behalf of publishers for the academic library community. The two services have been compared a great deal, though it is unclear whether the market will choose one service over the other or whether it can support both.
Another library repository service to which Portico is often compared is HathiTrust. However, HathiTrust does not focus on acquiring the same content as Portico. HathiTrust’s holdings may eventually be similar to Portico’s, but they will never be the same because the sources and the content are acquired differently. In addition, HathiTrust does not have the same rights permissions from publishers. So while Portico will always be able to serve content as soon as it is needed, HathiTrust must be mindful of the rights it has set up for each digital object.
Preserving content within a digital repository helps the library protect its investment. Libraries are seeing a huge increase in the costs for e-journals, as well as other electronic materials. As one can see in the illustration, costs for Hope College’s e-journals have increased from less than $50,000 to $400,000 in ten years. This is a large portion of the library’s budget. In addition, e-journals are a different type of library asset than print versions. Print journal subscriptions were paid for up front, and the issues sat on a shelf until ready for use. Agreements with publishers for e-journals are not always guaranteed for the long term; in some cases libraries are paying for only one year of service.
Portico helps publishers solve several delivery problems. First, Portico has built a system that provides publishers with an easy method for providing perpetual access. Offering this service encourages libraries to sign agreements with publishers and should encourage publishers to sign on with Portico. In addition, Portico provides publishers with an appropriately preserved outside copy of their digital assets. This strengthens the publisher’s internal preservation strategies. Frank Menchaca, Gale’s executive vice president for publishing, stated in a December 2009 press release that Portico provides Gale with “a secure, permanent back-up.”10
ITHAKA’s recent merger with JSTOR may have repercussions for Portico. The merger is likely to strengthen ITHAKA’s revenue. JSTOR is a fiscally healthy organization that has a worldwide customer base and turns an annual profit. Should ITHAKA need to make difficult economic decisions regarding what services to support, it is likely to favor sustaining JSTOR activities over those of Portico. Selling access to content also requires different decisions than housing it for the long term. A digital repository does not need to be a separate entity, but it does need to focus on distinct preservation goals. ITHAKA’s mission includes preservation, and that part of the mission needs to be safeguarded so that Portico’s preservation goals are not marginalized.
Portico’s role as a market for and developer of digital repository tools and services has advanced the digital archiving community. Its participation in projects like NDIIPP and the development of the NLM DTD has helped it develop and build a digital archiving system that is current and reliable. It is important that Portico continue to work with the larger repository community, which allows it to share in that community’s work and obtain funding for the development of useful tools.
Library subscribers must understand their role as advocates for Portico’s future users. Librarians need to make sure important resources are not lost. Subscribers must communicate what content Portico preserves and use the audit tool provided to ensure the content is complete and usable. Subscribers should require Portico to act transparently and submit to outside evaluations on a regular cycle.
2 Portico, "ITHAKA and JSTOR Merge, Uniting Efforts to Serve the Scholarly Community," (December 17, 2010). Accessed February 28, 2011, http://www.ithaka.org/about-ithaka/announcements/ithaka-and-jstor-merge-uniting-efforts-to-serve-the-scholarly-community.
3 Amy Kirchoff, “RE: Portico leadership,” e-mail message to Marie Waltz, February 8, 2011.
4 Ken DiFiore, "Evolving Preservation Needs: Portico Response," (paper presented at Ithaka Sustainable Scholarship Conference 2010, New York, NY, September 27, 2010). Accessed September 27, 2010, www.ithaka.org/about-ithaka/events/Preservation_K.Difiore.pptx.
6 Sheila Morrissey et al., “Portico: A Case Study in the Use of XML for the Long-Term Preservation of Digital Artifacts” (paper presented at International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML, Montréal, Canada, August 2, 2010). In Proceedings of the International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML. Balisage Series on Markup Technologies, vol. 6 (2010). Accessed February 28, 2011, doi:10.4242/Balisage.
7 “Mission and background,” ITHAKA. Accessed February 28, 2011, http://www.ithaka.org/about/mission.htm.
8 Tai Phan, Laura Hardesty, and Denise Davis, Academic Libraries: 2008, First Look (Washington, D.C.: U.S. Department of Education, 2009). Accessed February 28, 2011, http://nces.ed.gov/pubs2010/2010348.pdf.
9 Kelly Jacobsma, "Re: Shapeshifting: What’s Happening with Scholarly Journals and What Faculty Can Do," Common Knowledge: News from Hope College Libraries Blog (July 1, 2010). Accessed February 28, 2010, http://libblog.hope.edu/2010/07/shapeshifting-whats-happening-with.html.
10 "Gale and Portico Enter into an Agreement To Preserve Gale Digital Collections," Portico, December 2, 2009. Accessed February 2, 2011, http://www.portico.org/digital-preservation/news-events/news/newly-signe....
Governance: ITHAKA Board of Trustees
- Henry S. Bienen
Chairman, ITHAKA Board of Trustees
President Emeritus, Northwestern University
- Paul A. Brest
Vice-Chairman, ITHAKA Board of Trustees
President, The William and Flora Hewlett Foundation
- William G. Bowen
President Emeritus, The Andrew W. Mellon Foundation
- Nancy M. Cline
Roy E. Larsen Librarian, Harvard College
- Ira H. Fuchs
Executive Director, Next Generation Learning Challenges, EDUCAUSE
- Kevin M. Guthrie
- Eugene Y. Lowe Jr.
Assistant to the President, Northwestern University
- Deanna B. Marcum
Associate Librarian for Library Services, Library of Congress
- W. Drake McFeely
President, W. W. Norton & Company, Inc.
- Michele Tolela Myers
President Emeritus, Sarah Lawrence College
- David Pakman
- Judith Shapiro
President and Professor of Anthropology Emerita, Barnard College
- Michael Spinella
EVP, Global Content Alliances, ITHAKA
JSTOR Managing Director
- Stephen M. Stigler
Ernest DeWitt Burton Distinguished Service Professor of Statistics, University of Chicago
- Charles M. Vest
President, National Academy of Engineering
- Herbert S. Winokur, Jr
Chairman and Chief Executive Officer, Capricorn Holdings, Inc.
Portico Advisory Committee
- John Ewing
American Mathematical Society
- Kevin Guthrie
- Daniel Greenstein
University of California
- Anne R. Kenney
Cornell University Library
- Clifford Lynch
- Carol Mandel
New York University
- David M. Pilachowski
- Rebecca Simon
University of California Press
- Michael Spinella
- Suzanne E. Thorin
Syracuse University Library
- Mary Waltham
- Craig Van Dyck
John Wiley & Sons, Inc