American Association for the Advancement of Slavic Studies
National Convention, Crystal City, VA
November 16, 2001
Present: Janet Crayne (U of Michigan), June Farris (U of
Chicago), Dima Frangulov (East View
Publications), Diana Greene (New York U), Carl Horne (Indiana
U), Jared Ingersoll (Columbia U), Sandra Levy (U of Chicago),
Tatjana Lorkovic (Yale U, Chair), Mike Markiw (Arizona State
U/U of Arizona), Larry Miller (U of Illinois, Urbana-Champaign),
Michael Neubert (Library of Congress), Dan Pennell (U of
Pittsburgh), Karen Rondestvedt (Stanford U; Secretary),
James Simon (Center for Research Libraries), Grazyna Slanda
(Harvard U), Nina Shapiro (Princeton U), Mary Stevens (U
of Toronto), Olga Tabolina (East View Publications), Patricia
Thurston (Yale U), Nadia Zilper (U of North Carolina), Janet
Zmroczek (British Library)
The meeting was called to order by Tatjana Lorkovic, who
had just been elected chair, at approximately 2:00 p.m.
She introduced members of the Executive Committee. The minutes
from the 2000 meeting were approved.
Budget Report (James Simon):
James distributed a financial statement for SEEMP for fiscal
years 2001 (ended 6/30/01) and 2002 up to the end of October.
In FY 2001 the project had revenues of $20,400 from membership
fees and expenses of $26,880.57 for acquisitions, $641.11 for
travel and $84.92 for cataloging. SEEMP ended
the fiscal year with a fund balance of $37,900.71. As of
October, an additional $17,400 had come in as membership
dues; travel expenses were $149.63. The ending fund balance
as of October was $55,151.08, with commitments of $38,963.37.
Nothing had been spent in this fiscal year for acquisitions
yet; CRL’s $3000 contribution was expected in the
fourth quarter.
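The October figures reported above can be checked with a short calculation. This is only a sketch: the variable names are illustrative, and the "uncommitted" figure is derived here from the reported balance and commitments rather than taken from James's statement.

```python
from decimal import Decimal

# Figures as reported in the budget report above.
fy2001_ending_balance = Decimal("37900.71")
fy2002_membership_dues = Decimal("17400.00")
fy2002_travel = Decimal("149.63")
commitments = Decimal("38963.37")

# Balance as of October = prior balance + dues received - travel spent.
october_balance = fy2001_ending_balance + fy2002_membership_dues - fy2002_travel

# Derived figure (not in the minutes): balance left after commitments.
uncommitted = october_balance - commitments

print(october_balance)  # 55151.08
print(uncommitted)      # 16187.71
```

The computed October balance matches the reported $55,151.08. The expected $3000 CRL contribution is not included.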
James’s financial statement also included a collection-by-collection
summary of commitments, expenditures and balances; and a
list of material received since October 2000. The list of
material received has been incorporated into the holdings
listed on SEEMP’s website <http://www.crl.edu/areastudies/SEEMP/index.htm>.
CRL News (James Simon):
- Bernard Reilly is CRL’s new president. He comes
to CRL from the Chicago Historical Society. Melissa Trevvett,
who previously worked at Loyola University, started in
March as Vice President/Director of Programs and Services.
Both are very supportive of foreign materials and area
studies groups.
- A task force has been at work assessing CRL’s
collections. The Center has approximately 5 million volumes,
most of which are not cataloged, including 800,000 foreign
dissertations. James said he would distribute the task
force’s report to the SEEMPLIST. It is also available
in Focus on the Center for Research Libraries 21, no.
2 (Dec. 2001/Jan. 2002), and online at http://www.crl.edu/content.asp?l1=1&l2=9&l3=13&l4=1.
- There was a cooperative collection development conference
in November 1999, and there will be another one in November
2002. Information on the upcoming conference can be found
at http://www.crl.edu/info/awcc2002/02confinfo.htm. Its
focus is general, not specifically Slavic.
Reports on current projects:
• Luczkiw Collection. Mary Stevens
(University of Toronto) reported that 86 reels have been
completed. This is the end of the project. The Table of
Contents at the beginning of each reel gives a list of the
original monographs' accession numbers. The originals are
held at the Thomas Fisher Rare Book Library, U of Toronto.
The guide and index to the microfilm collection: [Boshyk,
Yury and Kiebalo, Wlodzimierz -- Publications by Ukrainian
"Displaced Persons" and Political Refugees, 1945-1954
in the John Luczkiw Collection, Thomas Fisher Rare Book
Library, University of Toronto: a Bibliography. Edmonton:
Canadian Institute of Ukrainian Studies, 1988. 398 p.] will
be available in electronic format sometime in 2003 on the
following web page:
http://www.pjrc.library.utoronto.ca
• Newspapers from Former Yugoslav lands.
Janet Crayne (University of Michigan) reported that Norman
Ross (Norman Ross Publishing, Inc.) had an informal agreement
to film Vreme, but nothing had happened. She will send a
letter on behalf of SEEMP. Originally the material to be
filmed was to come from several institutions in the U.S.,
but Ross said he could do it more cheaply. She asked for
SEEMP’s approval to continue, which was granted. James
asked about filming after 1996; a proposal should be developed
by February.
• Newspapers of the October Revolution.
Michael Neubert (Library of Congress) reported that LC Photoduplication
has been filming the material. It will cost $750 instead
of the $500 approved by SEEMP, but LC will cover the difference.
LC Photodup has had a big staff cut, and they are now outsourcing
a great deal.
• Russian Regional Newspapers, pt. II (1996-2000).
Olga Tabolina (East View Publications) reported that East
View is currently working on this period for the same 16 newspapers
that were filmed in the first part of the project, a total
of 80 newspaper years. Filming of two papers has been completed
so far, and the material was sent to CRL.
• Russian Right-Wing Extremist Press 1990-1999,
pt. II. Larry Miller (University of Illinois) reported
that the University of Illinois had delivered its part to
be filmed, but that UC Berkeley had not, probably because
of lack of staff. The Hoover Institution is willing to cooperate,
especially to fill gaps in runs. Hoover has a huge collection
of these materials. These materials will be listed on the
web, with holdings, rather than having individual titles
cataloged. They will also go into CRL’s Foreign Newspaper
Database. Berkeley’s material is already listed on
their website, along with other material from the independent
press. The list’s URL: http://sunsite.berkeley.edu/Bibliographies/RussianNewspapers/.
• Soviet Eurasian Pamphlets. The report by Rob
Davis (New York Public Library) and James Simon was given
by James. The material has been filmed
and has arrived at CRL. NYPL cataloged the titles, and CRL
will tapeload their records if possible. There are 25 reels,
and the project is essentially finished.
New project proposals:
• Russian Regional Newspapers, Part III.
The project would continue filming the 16 papers currently
being filmed for 2001-2003 and add 8 new titles,
also filmed for the 2001-2003 period. The complete
proposal can be found at http://www.crl.edu/info/seemp/seempregional3.htm.
Total cost would be $24,472. Some of the newspapers covered
are in East View’s Universal Databases, others are
not.
East View would like half the cost at the beginning, then
half after all the material has been delivered. James stated
that this was not SEEMP’s usual way of paying; CRL
usually pays as the films are delivered.
SEEMP did not vote on this proposal when it was scheduled
to in February. James suggested a special ballot on the
proposal after members return to their offices.
It was then suggested to make this project into a standing
order. The present proposal was then divided into two parts,
to be voted on separately: (a) Continue the 16 papers currently
being filmed, on a standing order basis. This would cost
approximately $5000 per year. (b) Add the 8 new titles selected
by the proposers. The vote will be by e-mail after the meeting.
Discussion ensued about ongoing evaluation of this project.
One criterion might be to determine if the papers selected
were still independent and objective. But “independent
and objective” is new language that was not in the
original proposal. How does one show that a project is a
success: by the number of borrowings? CRL hasn’t tracked
this sort of thing before, but perhaps it could.
East View offers a big discount on this material for SEEMP
members and Russian libraries: $115 per reel. LC will buy
approximately 8 papers, but few others have. These papers
would not have been filmed at all without SEEMP.
• PJDA: Prerevolutionary Journals Digital
Access Project (Draft). The proposal is to digitize
some significant 19th-century Russian journals in order
to improve access to them. These journals have already been
filmed and are held by a number of libraries. The original
proposal can be found at http://www.crl.edu/areastudies/SEEMP/news/pjdaprop.pdf.
Since Miranda Remnek (University of Minnesota), the primary
proposer, was not able to attend this meeting, she sent
a list of discussion points to SEEMPLIST before the meeting.
That document is appended to these minutes. It was planned
that a vote would be held on the proposal in February.
The fact that the by-laws don’t include projects
like this is not a big problem; the by-laws can be revised
if the membership wants to do this project. Other area studies
microform projects have not revised their by-laws, but CAMP
and LAMP have approved digital projects anyway.
A digital project involves not only the initial costs
but also ongoing ones, such as migration. If SEEMP is going to do digital projects,
we need to have a policy for preserving the results.
What is the goal of this project? SEEMP doesn’t have
enough money to do the whole thing. We would need outside
funding. The proposers could write a proposal to digitize
one journal as a pilot, then SEEMP could apply for a grant.
They could do a pilot with Option 3 in the proposal (5-year
segments of two journals), to show what they can do.
The pilot should include full OCR. OCR-ing this material
will be difficult because of the old orthography, among
other things. East View experimented on some of the material.
There is a program to convert old orthography to new. For
a pilot it wouldn’t even be necessary to do a long
run of a journal.
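The minutes do not identify the conversion program mentioned above. As a rough illustration of what such a conversion involves, here is a minimal sketch that applies only the letter-level changes of the 1918 reform. The function name is hypothetical, and grammatical changes (for example adjective endings such as -аго) are deliberately ignored.

```python
import re

# Letters abolished by the 1918 reform, mapped to their modern equivalents.
# Written as escapes because some of them look identical to Latin letters.
OLD_TO_NEW = str.maketrans({
    "\u0463": "\u0435", "\u0462": "\u0415",  # yat -> e
    "\u0456": "\u0438", "\u0406": "\u0418",  # decimal i -> i
    "\u0473": "\u0444", "\u0472": "\u0424",  # fita -> f
    "\u0475": "\u0438", "\u0474": "\u0418",  # izhitsa -> i
})

# A hard sign at the end of a word was dropped; inside a word it is kept.
FINAL_HARD_SIGN = re.compile(r"[\u044a\u042a]\b")


def modernize(text):
    """Apply letter-level 1918 spelling changes (a simplification)."""
    text = text.translate(OLD_TO_NEW)
    return FINAL_HARD_SIGN.sub("", text)


print(modernize("Вѣстникъ Европы"))
```

A mapping like this could normalize OCR output or search queries so that old and new spellings match, but a production converter would need many more rules.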
NEH is interested in digitization of foreign language material.
There is also Title VI. For either of them a good proposal
will be necessary.
The proposal doesn’t include decisions about what
to film, e.g., letters to the editor, answers, reviews?
What gets tagged? SEEMP needs to give Miranda input.
Miranda should consult with Andy Spencer, if she hasn’t
already. (He is involved in Indiana’s digitizing of
Letopis’ zhurnal’nykh statei.)
The group decided to have further discussion of this proposal
on SEEMPLIST.
• Russian regional archival guides - proposal
idea (Jared Ingersoll, Columbia University). This
could be done like the Russian Regional Newspapers project,
with East View. Princeton has some; Jared should consult
with Pat Grimsted. There was good support from the group
for this idea, and Jared will work up a proposal.
The final deadline for proposals is February 15, 2002.
Other business:
• The need for a microform registry was discussed
again (after the discussion at last year’s meeting).
We need a central place where we can find who is filming
what and who plans to film what. If a vendor plans to film
something, that information isn’t entered into any
national database. There is a problem of vendors saying
they will film something in order to pre-empt others. We
need to ask vendors if they would be willing to send information
to such a registry.
• Cameras from LC in the Russian State Library are
now owned by RSL. The only funding RSL receives from LC for
filming is payment for what LC buys. So far RSL has filmed mostly
manuscript materials. Mike will make a list of what LC bought
from them.
• By-law amendments for next year:
(a) The newly-elected chair will assume the position after
the annual meeting, not before it or at it.
(b) Add digital material.
The meeting was adjourned at 4:00.
Respectfully submitted,
Karen Rondestvedt
Secretary
PREREVOLUTIONARY JOURNALS DIGITAL ACCESS PROPOSAL
UPDATE (November 10, 2001)
Miranda Remnek
m-remn@tc.umn.edu
This project, if taken beyond the pilot, might become rather
ambitious. So the proposal should be considered a work in
progress, deserving more discussion. Many questions are
not included in this update (programming, serving, maintenance,
etc). I took a stab at only eight basic issues…
1. Should SEEMP monies be used for a digital, as opposed
to a microfilm, project?
Comment: Several feel that this is appropriate. Most
other colleagues will agree that Slavic librarians will
want to work together in the digital arena as they have
on microfilms. Finally, the SEEMP structure is the most
visible structure we have for community support of Slavic
library projects.
If we can agree on this, the next question prompted by
the proposal becomes:
2. Why would we want to deal with a body of material that
is widely available in fiche format?
Comment: Certainly a body of wisdom touts new availability
of marginalized material as a digital advantage. For example,
a corpus of 18th-C women’s writing digitized at Virginia
in the mid 1990s for use in course work was obtainable only
with difficulty before. More recently the concept of digitization
for “access only” (i.e. access to collections
already available) was criticized at the March 2001 NEDCC
seminar on Preservation Options in a Digital World, the
idea being that digital projects should produce high-quality
copies of material needing to be preserved.
However, some feel that librarians should look at general,
not special collections. Mike Neubert has pointed us to
a new CLIR report that takes this approach. And this is
what I’ve done with projects at Minnesota, since, as a
researcher as well as a librarian, I think technology presents
a wonderful opportunity for making research material more
widely useable, even when it is available in print/fiche
format. In the Russian area we have a precedent with Indiana’s
work on the Letopis’ zhurnal’nykh statei. In the case
of 19thC journals, IDC fiche are also widely available (though
not “preservation quality” fiche, as I learned
from NEDCC), but if greater access is provided, no body
of material is more likely to benefit researchers of 19thC
Russia.
If we can agree on the materials we want to include (at
least at a general level), the next questions involve presentation
of the options, and balloting:
3. Why was such a lengthy proposal developed, with so many
options?
Comment: We thought most SEEMP members would appreciate
a document that discussed selection and processing options
at some length, and provided a range of estimates. However,
the resulting document and websites (all the result of considerable
work) did not allow for simple balloting. Some people still
feel the proposal should be balloted as is, but it’s
probably best to revise it first to take account of new
thinking, and to make the balloting simpler.
If we agree that the proposal should be revised before
balloting next spring, the next group of questions involves
aims and methodology. The first question might be:
4. Do we want to save money by producing materials to read
and print online but not search?
Comment: People tell me that many librarians take this view,
especially if it means being able to afford a large body
of material. Maybe I’m spoiled from having used SGML-aware
Dynaweb for several years, but it seems to me that full-text
searching is essential.
If we agree that our output should be searchable, the next
question is:
5. Do we want to spend money on accurate OCR?
Comment: The proposal implies that if we pursued OCR for
full-text searching, we would produce accurate OCR. This
would also enable correct display of the searchable text.
However, producing clean OCR is expensive, whether it’s
done by scanning/proofreading, or offshore keyboarding (see
below). Another option--which should probably appear in
the revised proposal --is to produce dirty OCR, and hide
it from view. The user would see only the GIF page image
(as in our option 3), and links to the pages with text.
One reason for doing this is that, according to recent data
from the Information Science Research Institute (which routinely
surveys document retrieval engines worldwide), the search
results from a dirty OCR file are no worse than those from
clean OCR, especially since retrieval engines still don’t
do well with Cyrillic or Unicode.
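The "hidden dirty OCR" option can be sketched as a small search index in which the uncorrected text is used only for matching and the user is returned page images. This is an illustration, not the proposers' implementation: the file names, sample text, and function names are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(ocr_pages):
    """Map each lowercased word to the set of page images containing it."""
    index = defaultdict(set)
    for image_name, ocr_text in ocr_pages.items():
        for word in re.findall(r"\w+", ocr_text.lower()):
            index[word].add(image_name)
    return index


def search(index, query):
    """Return page images whose hidden OCR text contains every query word."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical uncorrected OCR text keyed by page-image file name.
ocr_pages = {
    "journal_p001.gif": "Вестник Европы журнал истории и политики",
    "journal_p002.gif": "Внутреннее обозрение политики",
}

index = build_index(ocr_pages)
print(search(index, "политики"))
print(search(index, "Вестник Европы"))
```

The search returns only image names, so OCR errors never appear on screen. The cost is that a word misread on a page cannot be found on that page.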
But if we decide that we want to stay with accurate OCR,
the next question is:
6. How do we want to achieve accurate OCR: by scanning or
keyboarding?
Comment: My unit scans in-house from original pages; for
Cyrillic text we use Finereader. But since the data for
this project is available in IDC fiche, we suggested in
the proposal that we experiment with outsourced scanning/recognition
(fiche scanners are expensive). However, scanning is always
tricky; when we presented the sample pages we deliberately
left the OCR uncorrected, to demonstrate this. Another solution--not
yet included in the proposal--would involve offshore keyboarding.
Here there are options: we could outsource to East View
(specifically its Moscow unit), or perhaps, if we needed
more capacity, to vendors like Pacific Data Conversion Corporation
(uses offshore labor and has experience with Cyrillic text).
But whether we decide to produce dirty or accurate OCR,
the next question is:
7. What do we use as our source, IDC fiche or printed volumes?
Comment: NEDCC recommends using the original if possible,
and we may decide to do this (the pilot project aimed to
test both approaches). We first thought it might be easier
to produce TIFF images from fiche, but there are two problems:
the varying quality of the fiche, and the question of fleshing
out financial arrangements with IDC if we use their product.
8. Finally, what about presentation?
Comment: When we developed the proposal there was much support
for PDF files. One advantage is that one can encapsulate
and print off an article, etc, as a unit. So we developed
a search feature to use with Cyrillic PDF files (admittedly
a bit clunky, but with more time…) There’s also
Oracle’s Intermedia, maybe an option here. But for
easy access and searching it may be better to follow our
3rd option: to present the page (as does JSTOR) as a GIF
rather than PDF image. Then we could accompany it with a
searchable html page (highlighting to be developed)--or
hide the page in a database (MySQL or Oracle) if the OCR
is not to be corrected.