The Times of India
In 2010 ProQuest, LLC released a digital version of The Times of India, the highest-circulating English-language newspaper printed in India, covering the years 1838-2001 and made available through the ProQuest Historical Newspapers digital collection.
Sources for this review include information publicly posted or obtained directly from the publisher, data collected by staff at CRL and its member institutions, and examination of the digital collection when possible. Other sources are noted where cited.
Center for Research Libraries
- Carolyn Ciesla, Research Assistant
- Virginia Kerr, Digital Program Manager
- James Simon, Director, Global Resources Network
The Times of India was founded on November 3, 1838. It became a daily in 1850 and is the highest-circulating English-language paper in India. It is published simultaneously in multiple editions (Mumbai, Delhi, Calcutta, and Ahmedabad among them), with some content differences between editions.
The Times of India is published by Bennett, Coleman & Co. Ltd. as part of The Times Group. The Times Group also publishes the business daily The Economic Times, the tabloid-style newspaper Mirror in several cities, and other newspapers in Hindi and Marathi. It owns Radio Mirchi, an FM radio network, as well as the business TV channel ET NOW and the English-language news channel TIMES NOW. The group’s Times Internet Limited offers email, social networking, and a host of other services and sites. The group is also significant in book publishing, music publishing, outdoor media, and event management.
Source of the digitized content
ProQuest reports that the digital content was derived entirely from the microfilm produced by The Times of India's publisher, Bennett Coleman & Co., Ltd. For a description of the various microfilm efforts and holdings, see Appendix A: “International Coalition on Newspapers (ICON) Title Report: The Times of India microfilm holdings.” Microfilm holdings are likely to be consolidated at local institutions as a result of the search and browse features provided by digitization.
While the browse list of publications in the database describes the chief content as the New Delhi edition of the Times from 1861 onwards, the actual scanned content is from the Bombay (later Mumbai) edition, together with two preceding titles, beginning with the Bombay Times and Journal of Commerce in 1838. ProQuest reports a plan to provide a “moving wall” of contemporary content: the most recent issues in the initial release are from 2001, and one further year will be added to the database each subsequent year (2002 issues will be added in 2011, etc.). An analysis of the product’s holdings, as well as a comparison to available microfilm, is currently being conducted by the Center for Research Libraries and will be available in the coming weeks.
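The moving-wall schedule described above amounts to simple arithmetic. The following is a minimal sketch, assuming the nine-year lag implied by the two data points in this review (2001 content in 2010, 2002 content in 2011) stays constant; ProQuest has not stated this as a fixed rule, and the function and constant names are illustrative only.

```python
# Sketch of the "moving wall" arithmetic. ASSUMPTION: the lag implied by the
# review (2001 issues in the 2010 release, 2002 issues added in 2011) remains
# constant; this is an inference, not a published ProQuest policy.
INITIAL_RELEASE_YEAR = 2010  # year of the initial release
INITIAL_LATEST_YEAR = 2001   # most recent issues in the initial release
LAG_YEARS = INITIAL_RELEASE_YEAR - INITIAL_LATEST_YEAR


def latest_year_available(calendar_year: int) -> int:
    """Return the most recent publication year expected in the database."""
    if calendar_year < INITIAL_RELEASE_YEAR:
        raise ValueError("the collection was first released in 2010")
    return calendar_year - LAG_YEARS


if __name__ == "__main__":
    for year in (2010, 2011, 2015):
        print(year, "->", latest_year_available(year))
```

Under this assumption a library can estimate which recent years it would still need to cover from microfilm or other sources at any given date.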
Overall, the layout and content are similar to those of the other titles in the ProQuest Historical Newspapers product. In general, the image quality is adequate: images can be zoomed in detail, and the page map images portray the entire page accurately enough. A few problems were noted, including some fuzziness, gutters that are too dark or too light, sections of faded text, and tearing in what appears to be the original paper. All of these are to be expected when working with older newspaper titles and microfilm. However, on occasion one also sees skewed or cut-off content in ProQuest’s presentation of individual articles. How any of these issues could affect OCR or result sets is unclear.
Timetable for release of the database
A preliminary release was available in August 2010. The full collection through 2001 is expected to be accessible by the end of 2010.
ProQuest provides information on all of the fields that may be attached to each object in its digital collections. ProQuest also reports that it will permit crawling of the Historical Newspapers databases under special circumstances and at an additional charge.
The level of OCR accuracy seems at least average for historical material. Any level of text searching will significantly improve access, since The Times of India has not been comprehensively indexed anywhere.
The value of indexing for a South Asian newspaper cannot be over-emphasized: based on some preliminary research, no South Asian newspaper has undergone comprehensive indexing, even in print. There have of course been efforts to index newspapers (Guide to Indian Periodical Literature, Index Indiana, Hindu Index, etc.), and other online newspaper products are emerging (the most notable recent contribution being the World Newspaper Archive), but these have been selective in their content and/or in their chronological coverage. Furthermore, having even one newspaper indexed for a large span of years (the entirety of The Times of India in this case) can benefit access to a host of other papers: one can locate major news items in the online source and then use that information (date, etc.) to approach other sources, including those not in English, those archived in print, and those in microform. Of special value in ProQuest Historical Newspapers is that, unlike many traditional and/or print-based newspaper indices, it indexes each and every component of the newspaper (articles, editorials, advertisements), which allows new kinds of queries and comparisons to be conducted efficiently.
Technical platform & interface
The Times expands the international content of the ProQuest Historical Newspapers digital collection, which includes The Guardian and The Observer, as well as U.S. titles such as The New York Times and The Wall Street Journal. ProQuest reports that The Times will be migrated to ProQuest’s "new unified platform, allowing content to be cross-searched and integrated within a library’s entire ProQuest collection." The public release of this interface is anticipated in January 2011.
The ProQuest “classic” interface available at the time of release in 2010 is simple and allows for fast and efficient searching of the full text of the newspaper. Search results are presented in an organized and easily sorted format. As with all of the other titles within the Historical Newspapers product, searches can be limited by date range, document type, location within the document, and page number. Display results are presented as three options: abstract, page map, and article image.
Every part of the newspaper is indexed, so users can quickly access articles, book reviews, editorials, advertisements, etc. It would nevertheless be helpful if ProQuest provided a list of the “content types” available for selection in the advanced search pull-downs, as these are not completely clear. If further indexing were possible, adding granularity to some of the categories (for example, “matrimonials” and/or “entertainment” within the “classifieds”) would be beneficial.
The default choices for search types, results, and browsing options assume a particular kind of researcher, generally using discrete searches to find articles on known subjects. For the researcher wanting to browse the full contents of the paper, these assumptions may be less than ideal. While it is easy to link out to see a retrieved article in context as it appears on the page (“page view”), linking from an article or a page to an entire daily issue is less intuitive. Page browsing or selection is not available from article images, but only when viewing page images. While not impossible, “reading” a paper from cover to cover is not especially easy in this presentation.
Alternatively, the user can select an issue by browsing by date under the “Publications” listing tab. Once an issue is selected, however, one needs to “page sort” in order to have its contents presented in page-number order, and the user must then return to the master list to select the next issue to view.
The ProQuest classic platform is an OpenURL target and source compliant with the San Antonio profile level 1 (SAP1).
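To make the OpenURL compliance concrete, the sketch below builds a key/encoded-value (KEV) OpenURL of the kind the San Antonio Level 1 profile describes, using the standard Z39.88-2004 journal-format keys. The resolver address, the date, and the article title are placeholders invented for illustration; they are not ProQuest endpoints or records from the collection.

```python
from urllib.parse import urlencode

# HYPOTHETICAL resolver address; a library would substitute its own link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(title, date, article_title=None, start_page=None):
    """Build a KEV OpenURL (Z39.88-2004, journal metadata format)."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.jtitle", title),
        ("rft.date", date),
    ]
    # Optional keys are added only when the citation supplies them.
    if article_title is not None:
        params.append(("rft.atitle", article_title))
    if start_page is not None:
        params.append(("rft.spage", str(start_page)))
    return RESOLVER_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder citation values, not an actual article in the database.
    print(build_openurl("The Times of India", "1947-08-15",
                        article_title="Example headline", start_page=1))
```

Because the platform acts as both source and target, links of this shape could in principle lead from a citation elsewhere into the newspaper, or from a retrieved article out to a library's resolver.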
Every institution that purchases the Perpetual Archive License (as opposed to subscribing) has the option to obtain, at the cost of transfer, the full set of files for local hosting. The set includes page image TIFF files of articles and full pages, metadata, ASCII text, and edited ASCII text for headlines and captions, but not the database, search engine, or user interface. As of 2009, ProQuest no longer offers a Permanent Archive Addendum, which provided assurances on access to the archival files ProQuest maintains at Iron Mountain.
For its Historical Newspapers product ProQuest offers a Perpetual Archive License (with ongoing Continuing Service Fee) and subscription models. The cost of purchase includes access to the non-public domain portion of the content. Major ARL libraries have been offered widely different prices, apparently depending on FTE numbers and on the extent of other ProQuest products held by the library or offered as a purchase bundle. Some portion of the annual fees is expected to help support updated content.
A single licensing agreement is used for all ProQuest databases (see http://www.proquest.com/en-US/site/terms_conditions.shtml). The lengthy agreement is fairly basic in its provisions, indicating: the products to be purchased or subscribed to, subscription start and end dates, price of products/subscriptions, authorized users (e.g., staff and students or patrons), type of access (e.g., on-site and/or remote access, simultaneous users, etc.), permitted uses (e.g., fair use, digital and print copies, e-reserves, ILL, etc.), and conditions for termination. It also contains the standard contractual provisions for limited warranty and disclaimer of warranty, limitation of liability, and privacy. Note that since copyright restrictions for original sources vary, ProQuest’s policy on outside use of the database (such as e-reserves) varies with the particular database.
It will be important to obtain more information from ProQuest on their future plans for archiving the content of this collection and ensuring sustainable access.
Direct from Publisher
|Subjects covered||India: Newspapers|
|P||Geographic coverage||Mumbai, Delhi, Calcutta, Ahmedabad|
|Total pages||1 million|
|Digital collection launch date||2010|
|Browser compatibility||Internet Explorer 7.0 or higher; Firefox 3.5 or higher; Safari 4 (Mac); Google Chrome|
|Authentication options||Secure access via the following authentication methods: IP Recognition; Password; Barcode; Referring URL; Embedded URL|
|Archiving solution – master files||NA|
|P||Archiving solution – derivative files||NA|
|P||Availability in web discovery tools||N (not at present time)|
|Open URL target||Y|
|Federated searching, z39.50||Y|
|P||Local host option||Y|
|P||Search full text||Y|
|P||Advanced search (fielded)||Y|
|P||Search within results||N|
|Limit results by dates and/or document types||Y|
|Display highlighted search terms||Y|
|P||Display snippet -- search term in context||Y|
|P||Download PDF||Y (PDFs are available for single pages or articles, not for entire issues)|
|Print full document||Y|
|Restrictions on use||Y|
|P||Publisher / Distributor||Bennett Coleman and Co, Ltd.; ProQuest, LLC|
|P||Address||Dr. Dadabhoy Naoroji Road, The Times of India Building, Fort, Mumbai, 400001, India|
|CRL Profile of Publisher|
In 2008 Alison Jones of Tufts University investigated the uses of large digital collections of retrospective newspapers in scholarly research, noting the challenges they can present specifically to linguistic analysis:
One major difficulty…was that the searching defaults are set up with historians in mind. “Historical Newspapers caters mainly to historians and other social scientists who are looking to find as many references as possible to a themes or keywords…to maximize the number of hits . . . the Proquest search engine automatically includes a plural look-up feature, conflating hits for the plural form of any singular word entered. This obviously presents an obstacle to linguists looking to distinguish inflected from bare forms.” Despite these caveats, [a linguist] concluded that these databases “offer invaluable information about language usage in…newspaper writing across a period that is not yet well covered by principled linguistic corpora” and they offer great insight into understanding changing patterns of standard usage in English.
Source: Alison Jones, “The Many Uses of Newspapers,” Technical Report for IMLS Project on “The Richmond Daily Dispatch”, June 20, 2008. http://dlxs.richmond.edu/d/ddr/docs/papers/usesofnewspapers.pdf (viewed 4/8/09)
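The effect of the plural look-up that Jones describes can be illustrated with a small sketch. This is not ProQuest's search implementation, which is not documented here; the naive "add an s" rule, the function names, and the sample sentence are all assumptions made purely for illustration.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_exact(tokens, word):
    """Count only the bare form, as a linguist might want."""
    return sum(1 for token in tokens if token == word)


def count_with_plural_lookup(tokens, word):
    """Count the bare form and a naive plural (word + 's') together.

    ASSUMPTION: a deliberately simplistic stand-in for a plural look-up
    feature; real search engines use more elaborate rules.
    """
    forms = {word, word + "s"}
    return sum(1 for token in tokens if token in forms)


if __name__ == "__main__":
    # Invented sample text, not taken from the newspaper.
    sample = ("The mill reopened today. Two mills remain closed, "
              "and a third mill may follow.")
    tokens = tokenize(sample)
    print("exact matches:", count_exact(tokens, "mill"))
    print("with plural look-up:", count_with_plural_lookup(tokens, "mill"))
```

The conflated count is larger than the exact count, which suits a historian maximizing hits but hides the inflected/bare distinction a linguist would need to recover by post-filtering the results.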
Appendix A: International Coalition on Newspapers (ICON) Title Report: The Times of India microfilm holdings (PDF)