Center for Research Libraries

Slavic and East European Microform Project

Business Meeting Minutes

American Association for the Advancement of Slavic Studies
National Convention, Crystal City, VA
November 16, 2001

Present: Janet Crayne (U of Michigan), June Farris (U of Chicago), Dima Frangulov (East View Publications), Diana Greene (New York U), Carl Horne (Indiana U), Jared Ingersoll (Columbia U), Sandra Levy (U of Chicago), Tatjana Lorkovic (Yale U, Chair), Mike Markiw (Arizona State U/U of Arizona), Larry Miller (U of Illinois, Urbana-Champaign), Michael Neubert (Library of Congress), Dan Pennell (U of Pittsburgh), Karen Rondestvedt (Stanford U; Secretary), James Simon (Center for Research Libraries), Grazyna Slanda (Harvard U), Nina Shapiro (Princeton U), Mary Stevens (U of Toronto), Olga Tabolina (East View Publications), Patricia Thurston (Yale U), Nadia Zilper (U of North Carolina), Janet Zmroczek (British Library)

The meeting was called to order by Tatjana Lorkovic, who had just been elected chair, at approximately 2:00 p.m. She introduced members of the Executive Committee. The minutes from the 2000 meeting were approved.

Budget Report (James Simon):

James distributed a financial statement for SEEMP for fiscal years 2001 (ended 6/30/01) and 2002 up to the end of October. In FY 2001 the project had revenues from membership fees of $20,400, acquisitions expenses of $26,880.57, travel expenses of $641.11 and cataloging expenses of $84.92. SEEMP ended the fiscal year with a fund balance of $37,900.71. As of October, an additional $17,400 had come in as membership dues; travel expenses were $149.63. The ending fund balance as of October was $55,151.08, with commitments of $38,963.37. Nothing had yet been spent on acquisitions in this fiscal year; CRL’s $3,000 contribution was expected in the fourth quarter.
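As a quick arithmetic check on the figures reported above, the following minimal Python sketch recomputes the October balance from the FY 2001 ending balance, the dues received and the travel expenses, and then derives the amount not yet committed. The variable names are illustrative only. The uncommitted figure is a derived number that does not appear in the financial statement, and the sketch assumes that no other revenue or expense was posted in the period.

```python
from decimal import Decimal

# Figures as reported in the minutes (US dollars).
fy2001_ending_balance = Decimal("37900.71")
fy2002_dues_to_october = Decimal("17400.00")
fy2002_travel_to_october = Decimal("149.63")
commitments = Decimal("38963.37")

# Balance at the end of October: prior balance plus dues, minus travel.
october_balance = (
    fy2001_ending_balance + fy2002_dues_to_october - fy2002_travel_to_october
)

# Derived figure (not in the minutes): balance remaining after commitments.
uncommitted = october_balance - commitments

print(october_balance)  # 55151.08, matching the reported balance
print(uncommitted)      # 16187.71
```

Decimal is used rather than floating point so that the cents are computed exactly.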

James’s financial statement also included a collection-by-collection summary of commitments, expenditures and balances; and a list of material received since October 2000. The list of material received has been incorporated into the holdings listed on SEEMP’s website <http://www.crl.edu/areastudies/SEEMP/index.htm>.

CRL News (James Simon):

  • Bernard Reilly is CRL’s new president. He comes to CRL from the Chicago Historical Society. Melissa Trevvett, who previously worked at Loyola University, started in March as Vice President/Director of Programs and Services. Both are very supportive of foreign materials and area studies groups.
  • A task force has been at work assessing CRL’s collections. The Center has approximately 5 million volumes, most of which are not cataloged, including 800,000 foreign dissertations. James said he would distribute the task force’s report to the SEEMPLIST. It is also available in Focus on the Center for Research Libraries 21, no. 2 (Dec. 2001/Jan. 2002), and online at http://www.crl.edu/content.asp?l1=1&l2=9&l3=13&l4=1.
  • There was a cooperative collection development conference in November 1999, and there will be another one in November 2002. Information on the upcoming conference can be found at http://www.crl.edu/info/awcc2002/02confinfo.htm. Its focus is general, not specifically Slavic.

Reports on current projects:

Luczkiw Collection. Mary Stevens (University of Toronto) reported that 86 reels have been completed. This is the end of the project. The Table of Contents at the beginning of each reel gives a list of the original monographs' accession numbers. The originals are held at the Thomas Fisher Rare Book Library, U of Toronto.
The guide and index to the microfilm collection [Boshyk, Yury and Kiebalo, Wlodzimierz -- Publications by Ukrainian "Displaced Persons" and Political Refugees, 1945-1954 in the John Luczkiw Collection, Thomas Fisher Rare Book Library, University of Toronto: a Bibliography. Edmonton: Canadian Institute of Ukrainian Studies, 1988. 398 p.] will be available in electronic format sometime in 2003 on the following web page:
http://www.pjrc.library.utoronto.ca

Newspapers from Former Yugoslav lands. Janet Crayne (University of Michigan) reported that Norman Ross (Norman Ross Publishing, Inc.) had an informal agreement to film Vreme, but nothing had happened. She will send a letter on behalf of SEEMP. Originally the material to be filmed was to come from several institutions in the U.S., but Ross said he could do it more cheaply. She asked for SEEMP’s approval to continue, which was granted. James asked about filming after 1996; a proposal should be developed by February.

Newspapers of the October Revolution. Michael Neubert (Library of Congress) reported that LC Photoduplication has been filming the material. It will cost $750 instead of the $500 approved by SEEMP, but LC will cover the difference. LC Photodup has had a big staff cut, and they are now outsourcing a great deal.

Russian Regional Newspapers, pt. II (1996-2000). Olga Tabolina (East View Publications) reported that East View is currently working on this period for the same 16 newspapers that were filmed in the first part of the project, a total of 80 newspaper years. Filming of two papers has been completed so far, and the material was sent to CRL.

Russian Right-Wing Extremist Press 1990-1999, pt. II. Larry Miller (University of Illinois) reported that the University of Illinois had delivered its part to be filmed, but that UC Berkeley had not, probably because of lack of staff. The Hoover Institution is willing to cooperate, especially to fill gaps in runs. Hoover has a huge collection of these materials. These materials will be listed on the web, with holdings, rather than having individual titles cataloged. They will also go into CRL’s Foreign Newspaper Database. Berkeley’s material is already listed on their website, along with other material from the independent press. The list’s URL: http://sunsite.berkeley.edu/Bibliographies/RussianNewspapers/.

Soviet Eurasian Pamphlets. The report by Rob Davis (New York Public Library) and James Simon was given by James. The material has been filmed and has arrived at CRL. NYPL cataloged the titles, and CRL will tapeload their records if possible. There are 25 reels, and the project is essentially finished.

New project proposals:

Russian Regional Newspapers, Part III. The project consists of continuing to film the 16 papers currently being filmed for 2001-2003 and adding 8 new titles, to be filmed for the same period. The complete proposal can be found at http://www.crl.edu/info/seemp/seempregional3.htm. Total cost would be $24,472. Some of the newspapers covered are in East View’s Universal Databases; others are not.

East View would like half the cost at the beginning, then half after all the material has been delivered. James stated that this was not SEEMP’s usual way of paying; CRL usually pays as the films are delivered.

SEEMP did not vote on this proposal when it was scheduled to in February. James suggested a special ballot on the proposal after members return to their offices.

It was then suggested that this project be made into a standing order. The present proposal was then divided into two parts, to be voted on separately: (a) Continue the 16 papers currently being filmed, on a standing order basis. This would cost approximately $5,000 per year. (b) Add the 8 new titles selected by the proposers. The vote will be by e-mail after the meeting.

Discussion ensued about ongoing evaluation of this project. One criterion might be to determine if the papers selected were still independent and objective. But “independent and objective” is new language that was not in the original proposal. How does one show that a project is a success: by the number of borrowings? CRL hasn’t tracked this sort of thing before, but perhaps it could.

East View offers a big discount on this material for SEEMP members and Russian libraries: $115 per reel. LC will buy approximately 8 papers, but few others have. These papers would not have been filmed at all without SEEMP.

PJDA: Prerevolutionary Journals Digital Access Project (Draft). The proposal is to digitize some significant 19th-century Russian journals in order to improve access to them. These journals have already been filmed and are held by a number of libraries. The original proposal can be found at http://www.crl.edu/areastudies/SEEMP/news/pjdaprop.pdf. Since Miranda Remnek (University of Minnesota), the primary proposer, was not able to attend this meeting, she sent a list of discussion points to SEEMPLIST before the meeting. That document is appended to these minutes. It was planned that a vote would be held on the proposal in February.

The fact that the by-laws don’t include projects like this is not a big problem; the by-laws can be revised if the membership wants to do this project. Other area studies microform projects have not revised their by-laws, but CAMP and LAMP have approved digital projects anyway.

A digital project involves not only the initial costs, but migrating it, etc. If SEEMP is going to do digital projects, we need to have a policy for preserving the results.

What is the goal of this project? SEEMP doesn’t have enough money to do the whole thing. We would need outside funding. The proposers could write a proposal to digitize one journal as a pilot, then SEEMP could apply for a grant. They could do a pilot with Option 3 in the proposal (5-year segments of two journals), to show what they can do.

The pilot should include full OCR. OCR-ing this material will be difficult because of the old orthography, among other things. East View experimented on some of the material. There is a program to convert old orthography to new. For a pilot it wouldn’t even be necessary to do a long run of a journal.
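The minutes do not name the conversion program mentioned above. Purely as an illustration of what such a conversion involves, here is a minimal Python sketch (not the program referred to) that applies the letter substitutions of the 1918 spelling reform and drops the word-final hard sign. The function name `modernize` is hypothetical, and the sketch deliberately ignores other reform changes, such as the adjective endings -аго/-яго and the spelling of some prefixes.

```python
import re

# Letters removed by the 1918 reform and their modern replacements.
LETTER_MAP = str.maketrans({
    "\u0463": "\u0435",  # yat (ѣ) -> е
    "\u0462": "\u0415",  # capital yat (Ѣ) -> Е
    "\u0456": "\u0438",  # decimal i (і) -> и
    "\u0406": "\u0418",  # capital decimal i (І) -> И
    "\u0473": "\u0444",  # fita (ѳ) -> ф
    "\u0472": "\u0424",  # capital fita (Ѳ) -> Ф
    "\u0475": "\u0438",  # izhitsa (ѵ) -> и
    "\u0474": "\u0418",  # capital izhitsa (Ѵ) -> И
})

# A hard sign (ъ) at the end of a word was dropped; a hard sign inside
# a word (as in "объявление") is kept because no word boundary follows it.
FINAL_HARD_SIGN = re.compile("[\u044a\u042a]\\b")


def modernize(text: str) -> str:
    """Convert pre-reform Russian spelling to an approximation of modern spelling."""
    text = text.translate(LETTER_MAP)
    return FINAL_HARD_SIGN.sub("", text)


print(modernize("Вѣстникъ Европы"))  # Вестник Европы
```

A conversion of this kind could be applied to the OCR output to build a search index in modern spelling while the page images continue to show the original orthography.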

NEH is interested in digitization of foreign language material. There is also Title VI. For either of them a good proposal will be necessary.

The proposal doesn’t include decisions about what to film, e.g., letters to the editor, answers, reviews? What gets tagged? SEEMP needs to give Miranda input.

Miranda should consult with Andy Spencer, if she hasn’t already. (He is involved in Indiana’s digitizing of Letopis’ zhurnal’nykh statei.)

The group decided to have further discussion of this proposal on SEEMPLIST.

Russian regional archival guides - proposal idea (Jared Ingersoll, Columbia University). This could be done like the Russian Regional Newspapers project, with East View. Princeton has some; Jared should consult with Pat Grimsted. There was good support from the group for this idea, and Jared will work up a proposal.

The final deadline for proposals is February 15, 2002.

Other business:

• The need for a microform registry was discussed again (after the discussion at last year’s meeting). We need a central place where we can find who is filming what and who plans to film what. If a vendor plans to film something, that information isn’t entered into any national database. There is a problem of vendors saying they will film something in order to pre-empt others. We need to ask vendors if they would be willing to send information to such a registry.

• The cameras that LC placed in the Russian State Library are now owned by RSL. The only funding RSL receives from LC for filming comes from LC’s purchases of film. So far they’ve filmed mostly manuscript materials. Mike will make a list of what LC bought from them.

By-law amendments for next year:

(a) The newly-elected chair will assume the position after the annual meeting, not before it or at it.

(b) Add digital material.

The meeting was adjourned at 4:00 p.m.

Respectfully submitted,
Karen Rondestvedt
Secretary

PREREVOLUTIONARY JOURNALS DIGITAL ACCESS PROPOSAL
UPDATE (November 10, 2001)
Miranda Remnek
m-remn@tc.umn.edu

This project, if taken beyond the pilot, might become rather ambitious. So the proposal should be considered a work in progress, deserving more discussion. Many questions are not included in this update (programming, serving, maintenance, etc.). I took a stab at only eight basic issues…

1. Should SEEMP monies be used for a digital, as opposed to a microfilm, project?

Comment: Several feel that this is appropriate. Most other colleagues will agree that Slavic librarians will want to work together in the digital arena as they have on microfilms. In addition, the SEEMP structure is the most visible structure we have for community support of Slavic library projects.

If we can agree on this, the next question prompted by the proposal becomes:

2. Why would we want to deal with a body of material that is widely available in fiche format?

Comment: Certainly a body of wisdom touts new availability of marginalized material as a digital advantage. For example, a corpus of 18th-C women’s writing digitized at Virginia in the mid 1990s for use in course work was obtainable only with difficulty before. More recently the concept of digitization for “access only” (i.e. access to collections already available) was criticized at the March 2001 NEDCC seminar on Preservation Options in a Digital World, the idea being that digital projects should produce high-quality copies of material needing to be preserved.

However, some feel that librarians should look at general, not special collections. Mike Neubert has pointed us to a new CLIR report that takes this approach. And this is what I’ve done with projects at Minnesota, since, as a researcher as well as a librarian, I think technology presents a wonderful opportunity for making research material more widely useable, even when it is available in print/fiche format. In the Russian area we have a precedent with Indiana’s work on the Letopis’ zhurnal’nykh statei. In the case of 19th-C journals, IDC fiche are also widely available (though not “preservation quality” fiche, as I learned from NEDCC), but if greater access is provided, no body of material is more likely to benefit researchers of 19th-C Russia.

If we can agree on the materials we want to include (at least at a general level), the next questions involve presentation of the options, and balloting:

3. Why was such a lengthy proposal developed, with so many options?

Comment: We thought most SEEMP members would appreciate a document that discussed selection and processing options at some length, and provided a range of estimates. However, the resulting document and websites (all the result of considerable work) did not allow for simple balloting. Some people still feel the proposal should be balloted as is, but it’s probably best to revise it first to take account of new thinking, and to make the balloting simpler.

If we agree that the proposal should be revised before balloting next spring, the next group of questions involves aims and methodology. The first question might be:

4. Do we want to save money by producing materials to read and print online but not search?

Comment: People tell me that many librarians take this view, especially if it means being able to afford a large body of material. Maybe I’m spoiled from having used SGML-aware Dynaweb for several years, but it seems to me that full-text searching is essential.

If we agree that our output should be searchable, the next question is:

5. Do we want to spend money on accurate OCR?

Comment: The proposal implies that if we pursued OCR for full-text searching, we would produce accurate OCR. This would also enable correct display of the searchable text. However, producing clean OCR is expensive, whether it’s done by scanning/proofreading, or offshore keyboarding (see below). Another option--which should probably appear in the revised proposal--is to produce dirty OCR, and hide it from view. The user would see only the GIF page image (as in our option 3), and links to the pages with text. One reason for doing this is that, according to recent data from the Information Science Research Institute (which routinely surveys document retrieval engines worldwide), the search results from a dirty OCR file are no worse than those from clean OCR, especially since retrieval engines still don’t do well with Cyrillic or Unicode.

But if we decide that we want to stay with accurate OCR, the next question is:

6. How do we want to achieve accurate OCR: by scanning or keyboarding?

Comment: My unit scans in-house from original pages; for Cyrillic text we use Finereader. But since the data for this project is available in IDC fiche, we suggested in the proposal that we experiment with outsourced scanning/recognition (fiche scanners are expensive). However, scanning is always tricky; when we presented the sample pages we deliberately left the OCR uncorrected, to demonstrate this. Another solution--not yet included in the proposal--would involve offshore keyboarding. Here there are options: we could outsource to East View (specifically its Moscow unit), or perhaps, if we needed more capacity, to vendors like Pacific Data Conversion Corporation (which uses offshore labor and has experience with Cyrillic text).

But whether we decide to produce dirty or accurate OCR, the next question is:

7. What do we use as our source, IDC fiche or printed volumes?

Comment: NEDCC recommends using the original if possible, and we may decide to do this (the pilot project aimed to test both approaches). We first thought it might be easier to produce TIFF images from fiche, but there are two problems: the varying quality of the fiche, and the question of fleshing out financial arrangements with IDC if we use their product.

8. Finally, what about presentation?

Comment: When we developed the proposal there was much support for PDF files. One advantage is that one can encapsulate and print off an article, etc., as a unit. So we developed a search feature to use with Cyrillic PDF files (admittedly a bit clunky, but with more time…). Oracle’s Intermedia may also be an option here. But for easy access and searching it may be better to follow our 3rd option: to present the page (as JSTOR does) as a GIF rather than a PDF image. Then we could accompany it with a searchable HTML page (highlighting to be developed)--or hide the page text in a database (MySQL or Oracle) if the OCR is not to be corrected.
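The "hidden OCR" arrangement described in this update (uncorrected text kept in a database, with the user shown only the GIF page image) can be sketched roughly as follows. This is a minimal illustration in Python, not the proposers' design. It uses the standard-library sqlite3 module as a stand-in for MySQL or Oracle, and the table layout, function names, file paths and sample pages are all hypothetical.

```python
import sqlite3

# Made-up sample pages: (journal, page number, GIF path, uncorrected OCR text).
# The OCR text deliberately contains a recognition error ("н" for "и").
SAMPLE_PAGES = [
    ("Journal A", 12, "journal_a/0012.gif", "Пушкинъ н его время"),
    ("Journal A", 13, "journal_a/0013.gif", "обзоръ литературы"),
    ("Journal B", 5, "journal_b/0005.gif", "письмо Пушкина къ издателю"),
]


def build_index(pages):
    """Load page records into an in-memory database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        "journal TEXT, page_no INTEGER, gif_path TEXT, ocr_text TEXT)"
    )
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", pages)
    conn.commit()
    return conn


def search(conn, term):
    """Return (journal, page_no, gif_path) for pages whose hidden OCR text contains term.

    The OCR text itself is never returned, so the user sees only page images.
    """
    rows = conn.execute(
        "SELECT journal, page_no, gif_path FROM page "
        "WHERE ocr_text LIKE ? ORDER BY journal, page_no",
        ("%" + term + "%",),
    )
    return rows.fetchall()


conn = build_index(SAMPLE_PAGES)
for journal, page_no, gif_path in search(conn, "Пушкин"):
    print(journal, page_no, gif_path)
```

Two limitations of this sketch: SQLite's LIKE operator folds case only for ASCII letters by default, so Cyrillic matching here is case-sensitive, and a search term containing % or _ would be treated as a wildcard. A production system would more likely use the full-text indexing of whichever database is chosen.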

Last updated 05/24/2004