10th Prof. S. Dasgupta Lecture (2008)

From Automation to Transformation: Looking at Our Professional Meadow by Dr. Usha Mujoo Munshi


The President of the Association, other executive members, visionaries and dignitaries, distinguished guests, and dear students. At the outset, I would like to congratulate the Delhi Library Association (DLA) on its successful completion of 70 years of existence: an organization with firm commitment, a glorious past and a bright future.

It is both an honour and privilege for me to participate in the 70th Foundation Day Celebration of the Delhi Library Association. I also take this opportunity to thank the Association for inviting me to deliver the 10th Professor S Dasgupta Lecture. I am indeed honoured to be here this morning.

When I was asked to deliver this lecture with an open invitation as to its theme, I wondered what I could speak about that would make sense to this august gathering. I came up with the theme 'From Automation to Transformation: Looking at Our Professional Meadow'. The reason for this (at least I thought) was that it would encompass a much wider perspective of the professional metamorphosis that we witness today, whose founding pillars were persons like Professor S Dasgupta, who were personalities personified, and whose images were larger than life. Well! Whether it puts the message across, and whether I succeed in my endeavour, I leave to you.

1.0 PROLOGUE TO TODAY'S THEME

Libraries have been constantly endeavoring to provide information services to the users they serve. The range and flavor of these services have changed over the years, primarily owing to developments in Information Technology, more appropriately Internet technologies. Libraries, particularly academic and special libraries, are committed to improving the accessibility and retrieval of high quality, authoritative, published literature. Thus working out solutions that help users efficiently locate and retrieve information supporting their areas of activity has been high on the agenda of libraries and librarians.

Change is inevitable - for sustenance and development. Libraries have evolved into a new avatar, from being collectors, lenders and locators to being innovators and innovative service providers, undergoing several changes to accommodate a suite of information services in a global environment and to satisfy users' needs. The ways in which information technology developments have changed information dissemination in academic libraries over the last few decades suggest that further change will perhaps be an ongoing process.

2.0 KEY PHASES OF TRANSFORMATION FROM AUTOMATION

Information technology has profoundly changed all aspects of higher education and scholarship, and these changes continue to unfold even today. Innovation and transformation for libraries, particularly in the higher academic sector, take place within this broader context; libraries cannot be considered in isolation from the information technology revolution. The changes are characterized by three blocks, namely:

Automation: In terms of modernization of the library that started with a specific purpose to facilitate access.

Innovation: Researching, reaching out and experimenting with new capabilities that the technology makes possible

Transformation: Through the application of these innovative capabilities, resulting in altering the characteristics of the organization

These changes can be broadly viewed in multi-pronged contexts that evolved to meet the changes in user information needs. Figure I broadly depicts this scenario.

It progressed from individual regional facilities to a centralized and to virtual environment offering Web-based solutions. This model has evolved to effectively meet the changes in user information needs. User service has become the goal of the information rendering services. These emerging changes and platforms provide the library staff with greater opportunity to provide high-touch research and reference services to targeted user groups.


"When simple change becomes transformational change, the desire for continuity becomes a dysfunctional mirage." The Mirage of Continuity (1999) Hawkins & Battin

The considerable range of changes created a lot of questions for libraries, most essentially questions about what constitutes the core of scholarly communication that they must manage, provide access to, organize, acquire, and preserve.

[Figure I. Automation, Augmentation, Transformation. The figure traces the progression from computerizing library operations, public access catalogues, host-based and regionally accessible library database management systems, online catalogues with automated workflows and the shift to electronic content (e-publishing); through IT infrastructure at all levels, access to online products, consolidation of regional and worldwide resources, intranet portals with intranet search, and catalogue, content management and document management systems; to electronic, digital and virtual libraries offering seamless information through federated search, harvesting and gathering, user interaction through intranet/internet portals, interoperability/standards, and Web 2.0 and Lib 2.0 tools (blogs, wikis, RSS syndication, tagging, social networking), all integrated (IT, IS, IR). Callouts in the figure: "Trend changed? Automation gone beyond existing library services and activities" and "Can library systems work like Google or Yahoo?"]

"The future belongs to neither the conduit or content players, but to those who control the filtering, searching and sense-making tools we will rely on to navigate through the expanses of cyberspace"-Paul Saffo, Institute for the Future

Besides, what makes up the raw material of future scholarship that must also be collected, organized, and archived? Undoubtedly, this goes far beyond the output of the traditional scholarly publishers and beyond the concept of fixed, published, printed works. There is so much new content outside the library and outside the entire system of publishing that it is unclear how much responsibility libraries can or should take for this material, or how they should go about taking that responsibility. While on the one hand this depicts a colossal information explosion, on the other it highlights the academic libraries' dilemma of managing these resources with scarce means. Blending traditional resources with new electronic resources, and keeping pace with more and more quality, relevant and core resources, is a real problem, challenge, and opportunity for libraries and library managers. Harnessing the benefits of technological tools and techniques, digital libraries are coming to the forefront with effective search mechanisms and harvesting services across repositories in the true sense of the term. Hence there is an undulating situation formed by these factors, which is schematically depicted below.

Fig II. Changing Information Delivery Landscape

Hence, working in an environment that is rich in information and tools, libraries and librarians are expected to learn many processes for retrieving information from different systems inside and outside the organization. Users do not have time to learn new processes or systems to get the information they need; they just need the information when they need it. They have little patience for excessive browsing and navigation to find promising resources, and perhaps expect to search the Google way. Libraries have to facilitate effective search mechanisms and provide access to multifaceted and multivariate resources from a single source, perhaps one window into the thinking of the time (a one-stop shop). This will facilitate effective searching across digital repositories from a single point, sparing users the frustration and struggle of reaching the right resources.

2.1 THREE PHASES OF TRANSFORMATION FROM AUTOMATION

The changes from automation to transformation are marked by three phases. The levels of automation over time have been described in terms of a tri-pronged approach [Lynch].

Phase I: Automation in Libraries. By automation we refer to the modernization of libraries. Libraries applied a growing range of information technologies to the management of collections that were primarily print. This phase marked the computerization of library operations, beginning with the retro-conversion of catalogue records. It was also a phase that posed a significant management challenge for libraries. Though the 1950s and early 60s marked the initiation of information technology in libraries, it was only in the late 1960s or early 70s that this technology arrived in force for some (academic) libraries, in the form of locally developed or commercial products intended to automate library processes. Computers were introduced to automate circulation; computer-based ordering systems were also introduced.

The most significant accomplishment was the development of shared copy-cataloguing systems, which continued until the early 1980s. These systems for collaboration and cooperation within the library community set up a path for computer networking and other developments in the 1980s and 90s.

Shared cataloguing was pioneered by a number of library consortia in the 1960s and 1970s, which have now consolidated into two major shared cataloguing systems, one operated by OCLC in Ohio, and the other by the Research Libraries Information Network in California.


Phase II: The hallmark of this phase was greater visibility of library services to the user clientele. With the rise of public access, the central databases began to reflect the collective holdings of the major research libraries. Individual libraries had machine-readable bibliographic records for significant percentages of their holdings. The online catalogue became a powerful tool and a huge advance. Particularly for scholars in the humanities, the availability of online catalogues and electronic mail ushered in a new era of access and communication. The growth of campus networking made it possible to locate holdings anytime, anywhere on the campus. Union catalogues surfaced, facilitating identification of resources across several libraries by creating a virtual combined collection. Next in the pipeline were the abstracting and indexing services, like Index Medicus of the 1960s, an expansive resource available through commercial online services like DIALOG or BRS, which turned to interactive public access such as MEDLINE in the late 1980s and early 90s. These developments made the concept of anytime, anywhere remote access a reality. The 1980s and 90s saw major investments in resource sharing, marked by union catalogues and computer-assisted interlibrary loan systems piggybacking on shared national union catalogues.

* Lynch, Clifford: From automation to transformation: Forty years of libraries and information technology in higher education. EDUCAUSE Review, Jan-Feb 2000.

Phase III: This phase was marked by a spurt in electronic publishing activities (particularly journals); print content started going the electronic way by the late 80s and early 90s. Identification and location of resources from online union catalogues went a step further and demanded full-text access as well. This, coupled with faster developments in information technology, led to e-publishing. It was further buttressed by the reduced cost of IT in terms of storage and bitmapped display technology; the availability of formats such as Adobe PDF or ASCII text; and HTML offering alternate publishing solutions. The assorted collections offered by aggregators as a one-stop shop, and the trend of making web-based search engines the first stop for access, thereby putting print on the back burner, are also the essence of this phase. The shift to electronic content has now gone beyond the automation of existing library services and activities. Libraries, while purchasing e-resources, are also converting their legacy collections, including special collections (manuscripts, archival materials), to digital format. These treasures of the library, which are of immense value but, being very old, were hitherto kept away from the public domain, are now available to users.

The phase is also marked by the networked information revolution as a result of transformation and innovation. The 90s marked the popularization of the digital library. Multimedia became a part of content. User expectations for personalized information services with value addition (filtering, CAS, etc.) became the center of attention. Resource discovery across federated heterogeneous digital resources became a central parameter for successful navigation with high precision and relevance. The interoperability of digital libraries is a result of these innovations. Some questions cropped up in the process about intellectual property rights, legislation, public policy, etc.

From here on, high on the agenda of libraries will be addressing the effects and implications of technological change rather than the management of technology.

In a nutshell, the first phase was modernization by conversion (computerizing library operations); the second phase focused on online public access mechanisms (visibility of library services to the users); and the third phase saw e-publishing in full swing with the networked information revolution.

These three phases are marked by three significant challenges, and opportunities, that are described below and that is what constitutes the major part of this presentation.

3.0 THREE AREAS OF CHANGE: SIGNIFICANT CHALLENGES/ OPPORTUNITIES

The changes resulted in three significant challenges:

1. How to make information available in bits and bytes and have everything (symbolic) in 0s and 1s (Digital Libraries)?

2. How to get effective and efficient access to what is available in digital form (Information Retrieval Systems)?

3. How to make systems user centric and involve users in content development as much as we are involved (Social Networking, Web 2.0 & Library 2.0)?


These challenges provided vast opportunities for re-defining the role of libraries, librarians and thereby library services: accepting the challenges of the IT revolution, operating in an electronic environment, bringing innovations in services and endeavoring to stand tall in meeting users' expectations. The further discussion revolves around these three key areas.

3.1 DIGITAL LIBRARY [DL]

A digital library is a type of information retrieval system in which collections are stored in digital formats and are accessible by computers. The digital content may be stored locally or accessed remotely via computer networks.

The terms electronic library and virtual library are often used synonymously. The elements identified as common to these definitions are: a DL is not a single entity; it requires technology to link the distributed resources; and these linkages to several DLs and information services are transparent to the end users.

DL collections are truly multifaceted in nature and not limited to document surrogates and above all the ultimate goal is universal access to digital libraries and information services.

3.11 DL Basic Services

The basic services that a DL is expected to facilitate may broadly encompass the following:

  • Acquisition
  • Submission
  • Repository
  • Search/Browsing/Retrieval
  • Dissemination
  • User Interface

3.12 Types

Several types of digital libraries exist. Many of the best known digital libraries are older than the Web, including Project Perseus, Project Gutenberg, and ibiblio. Nevertheless, as a result of the development of the Internet and its search potential, digital libraries such as the European Library and the Library of Congress are now developing in a Web-based environment.

A distinction is often made between born-digital content and information that has been converted from a physical medium, e.g., paper, by digitizing. The term 'hybrid library' is sometimes used for libraries that have both physical collections and digital collections, for instance American Memory, a digital library within the Library of Congress. There are also some important digital libraries that serve as long-term archives, such as the e-Print arXiv and the Internet Archive.

The essence of developing digital library (programmes) lies in approaches such as campus-wide initiative to develop as a leader in the use of information technology; modernizing overall university services to attract better students; participate in the digital library programmes being developed at other core institutions; and above all as a commitment to the delivery of high-quality library services. Most of the DL programmes and activities are at some level deploying innovative technologies to deliver traditional library services.

Although digital libraries have a long way to go before they reach their full potential, there has been significant development in the past decade. Nonetheless, referring to the digital library generically masks the fact that digital libraries exist in diverse forms and with quite different functions, priorities, and aims. For example, Harvard's Library Digital Initiative is preparing to collect and preserve scholarly and cultural outputs that happen to be in digital form, and to encourage their use in research and teaching. New York University's digital library programme is supporting an institution that has a strong cross-disciplinary interest in theoretical and applied aspects of the performing arts. Michigan is supporting the development and conservation of out-of-copyright monograph and serial holdings and efforts to provide highly functional access to digital content. Indiana is using streaming audio to deliver listening assignments to students in its School of Music.

Digital libraries are likely to retain their distinctiveness even as they become more deeply integrated and build upon commonly available collections and services to meet users' needs. Users of electronic resources do not care where their information comes from, as long as it is authoritative and authentic.

New roles and responsibilities are emerging for libraries as they enter an increasingly networked digital age.

3.13 Innovation in DL

The origin of DLs lies in the interlocking of library systems and the Web. Digital libraries and new technologies are truly exploring roles and dislocations, particularly for academic institutions. Academic libraries are being transformed by technology. New technologies, information technologies and digital libraries are not just tools to be learned and used; IT is an environment, an ecology, in which we operate and are immersed. Startup digital libraries were fundamentally experimental learning experiences that helped digital libraries focus their aims and build their technology expertise. These experiences also helped libraries learn to present their funding cases convincingly, in a language that was comprehensible to those outside the library community and in a way that demonstrated the broader significance of their institutions' own efforts. Very early programmes, those from the early to mid-1990s, were centered on the library's effort to harness the Internet to fulfill historic roles; greater innovation became apparent from the mid-1990s, for instance in experimentation with digital reformatting and the like.

3.14 Transformation by Innovation

Three key transformations in the making of digital libraries are [Ridley]:

  • From database/repository to environment, signifying a managed digital space. This environment will be omnipresent for users: seamlessly integrated with digital learning, research, etc.; community based, involving resources, people, interaction, process, activities, services, etc.; and dynamic and natural, that is, social networking where users will construct it as much as we will.

  • From information management to knowledge management, characterized by coherence with value-added outcomes; people-centric, where understanding the potential user has more bearing than the data itself; covering both explicit and tacit knowledge; and hallmarked by trusted information systems.

  • From people finding information to information finding people, where the hallmark of a system will lie in the prudence with which it handles people, its potential users. This would, for instance, be personified by intelligent agents and personal information systems; where control over users and not systems will be more pervasive; whose hallmark will be all-time, everywhere (wireless) communication; and which would facilitate smart information.

Without necessarily understanding the limitations of purchased content, the lack of information provider integration, or the differences between information retrieval for commercial online databases and freely available Web content, users have always wanted to know how to retrieve information quickly from resources within the organization and outside it. Owing to ever rising user expectations, coupled with the information explosion, a continuously expanding collection, and the need for increased return on investment for licensed content, librarians and information managers have been at a crossroads in addressing single-point information retrieval from varied resources (internal and external) using a strategic search solution. This calls for upgrading the search solution to cover a larger scope of materials: a single search interface that includes all our own information resources within its search scope and at the same time demonstrates the same features and performance provided by popular Internet search engines. Perhaps such a solution demands "interoperability", a very important component in the DL environment for facilitating search in highly decentralized systems, which is explained in detail in this document.

The role of the Library in the virtual and collaborative world thus lies with an emphasis on how libraries can collaborate to better serve their patrons. What is hence important in the digital environment is to provide for effective search in information retrieval systems.

4.0 SEARCHES IN INFORMATION RETRIEVAL SYSTEMS

An effective search facility is the heart and soul of an effective and efficient information retrieval system. There are multiple types of search options with which we are fairly conversant. For instance, options like free text search, fielded search, monolingual and cross-language search, similarity search, search using the document structure, and search on annotations are in vogue. However, the granularity of the search options can be further extended to the key search features given below for reference purposes.

4.1 ARCHITECTURE OF INFORMATION RETRIEVAL SYSTEMS

There are different models for DL structure, such as: the centralized model (access to the DL is at a single address and all services are centralized); the replicated model (access to the DL is at a single address, but for each service there are one or more modules); the distributed model (the system redirects the request to the appropriate location, and services are distributed over the network); and the federated model (the system redirects the request to the most suitable server that can resolve it, and services are replicated and distributed over the network).

In a distributed environment there could be several approaches in architecture for effective information retrieval. However the three broad approaches of distributed architecture (that would find relevance in our future discussions also in this document) are:

4.11 Standard Search Protocols

Key search features (see Section 4.0):

  • Field
  • Boolean
  • Exact Term
  • Proximity
  • Wild Cards
  • Fuzzy
  • Range
  • Boosting Terms

Field: Use fielded search to reduce or limit the range of a query in order to increase the relevance of the search results. A fielded search is an advanced query feature that enables users to select the document fields to which they wish to limit the query, and then to use the required keywords within these fields.

Boolean: Boolean searching allows users to narrow down the search by using special terms before the keywords. Boolean search has three connectors: 'AND', 'OR' and 'NOT'. AND makes sure a keyword is included, NOT makes sure a keyword is not included, and OR gives alternative keywords.
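As an illustration of fielded and Boolean search, the following is a minimal sketch in Python. The `Record` class, its two fields and the sample catalogue are hypothetical and serve only to show how a query can be limited to one field and combined with AND, OR and NOT conditions; it is not the query engine of any particular library system.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A hypothetical catalogue record with two searchable fields."""
    title: str
    author: str


def terms(text):
    """Lower-case a field value and split it into a set of words."""
    return set(text.lower().split())


def fielded_boolean_search(records, field, must=(), any_of=(), must_not=()):
    """Return records whose chosen field satisfies the AND / OR / NOT conditions."""
    hits = []
    for rec in records:
        words = terms(getattr(rec, field))
        if not all(t in words for t in must):                # AND: every term present
            continue
        if any_of and not any(t in words for t in any_of):   # OR: at least one present
            continue
        if any(t in words for t in must_not):                # NOT: none present
            continue
        hits.append(rec)
    return hits


catalogue = [
    Record("digital library interoperability", "arms"),
    Record("library automation history", "lynch"),
    Record("digital preservation", "lynch"),
]

# title contains "library" AND NOT "automation"
title_hits = fielded_boolean_search(catalogue, "title",
                                    must=["library"], must_not=["automation"])
# author is "lynch"
author_hits = fielded_boolean_search(catalogue, "author", any_of=["lynch"])
```

Limiting the query to the `title` or `author` field is what makes it a fielded search; the three keyword arguments correspond to the three Boolean connectors.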

Exact term: This returns results that match an exact keyword phrase in a particular order. You can use punctuation marks, such as quotation marks, to achieve this. For instance, if you want to search for Information Retrieval systems, you put these terms within double quotes ("Information Retrieval systems"); the results will contain all three words in the same sequence.

Proximity: In text processing, a proximity search looks for documents where two or more separately matching term occurrences are within a specified distance, where distance is the number of intermediate words or characters. For example, a search could be used to find "red brick house", and match phrases such as "red house of brick" or "house made of red brick".
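A proximity test can be sketched in a few lines of Python. The function name `within_distance` and the choice to measure distance in intermediate words (rather than characters) are assumptions made for illustration only.

```python
def within_distance(text, term_a, term_b, max_gap):
    """True if term_a and term_b occur with at most max_gap words between them."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    # abs(i - j) - 1 is the number of intermediate words between two positions
    return any(
        abs(i - j) - 1 <= max_gap
        for i in pos_a
        for j in pos_b
        if i != j
    )


adjacent = within_distance("house made of red brick", "red", "brick", 0)
too_far = within_distance("red house of brick", "red", "brick", 1)
close_enough = within_distance("red house of brick", "red", "brick", 2)
```

In "red house of brick" two words separate "red" and "brick", so the pair matches only when the permitted gap is at least two.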

Wild cards: Wild cards in a search are characters that will match any character in the field. You can use them where you are trying to find something like 'a code that starts with S and includes an x'. For example, a?e matches 'are', 'database' and 'Ale', while a*e matches 'Alvechurch' and 'database'.
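Wild card matching can be sketched by translating the pattern into a regular expression. The sketch below assumes that `?` stands for exactly one character and `*` for any run of characters, and that matching is case-insensitive and may occur anywhere in the field; real systems differ on all three points.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wild card pattern into a compiled, case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # any run of characters, possibly empty
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def wildcard_match(pattern, value):
    """True if the pattern matches anywhere inside value."""
    return wildcard_to_regex(pattern).search(value) is not None


single = [w for w in ["are", "database", "Ale", "apple"] if wildcard_match("a?e", w)]
multi = [w for w in ["Alvechurch", "database", "sky"] if wildcard_match("a*e", w)]
```

Here 'database' matches `a?e` through its substring "ase", and 'Alvechurch' matches `a*e` through "Alve".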

 

Fuzzy: Fuzzy searching will find a word even if it is misspelled. For example, a fuzzy search for apple will find appple. Fuzzy searching can be useful when you are searching text that may contain typographical errors, or text that has been scanned.
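One common way to implement fuzzy matching is edit distance, i.e., the number of single-character insertions, deletions and substitutions needed to turn one word into another. The sketch below uses the classic Levenshtein distance; the function names and the threshold of one edit are illustrative assumptions, and production search engines typically use faster indexed variants.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_find(query, words, max_edits=1):
    """Return the words lying within max_edits edits of the query."""
    return [w for w in words
            if edit_distance(query.lower(), w.lower()) <= max_edits]


matches = fuzzy_find("apple", ["appple", "aple", "maple", "orange"])
```

Both "appple" (one extra letter) and "aple" (one missing letter) are one edit away from "apple", whereas "maple" needs two edits and is not returned.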

 

Boosting terms: Boosting provides the relevance level of matching documents based on the terms found. The higher the boost factor, the more relevant the term will be. Boosting allows you to control the relevance of a document by boosting its term. For example, if you are searching for jakarta apache and you want the term "jakarta" to be more relevant, boost it using the ^ symbol along with the boost factor next to the term. You would type: jakarta^4 apache
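The effect of a boost factor can be illustrated with a deliberately simplified scoring function. The sketch assumes the caret syntax `term^factor` used by Lucene-style query parsers and scores a document as the boost-weighted count of matching terms; this is not the scoring formula of any real search engine, only a way to see how a boost changes the ranking.

```python
def parse_boosted_query(query):
    """Split 'term^factor' tokens into a {term: boost} mapping (default boost 1.0)."""
    boosts = {}
    for token in query.lower().split():
        term, _, factor = token.partition("^")
        boosts[term] = float(factor) if factor else 1.0
    return boosts


def score(document, boosts):
    """Sum of (occurrences of term) x (boost of term) over all query terms."""
    words = document.lower().split()
    return sum(words.count(term) * boost for term, boost in boosts.items())


boosts = parse_boosted_query("jakarta^4 apache")
docs = ["apache apache server", "apache jakarta project"]
ranked = sorted(docs, key=lambda d: score(d, boosts), reverse=True)
```

Without the boost, the document mentioning "apache" twice would rank first; with "jakarta" boosted by a factor of 4, the document containing "jakarta" moves to the top.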

Range: Range search problems arise in database and geographic information system (GIS) applications. Any data object with d numerical fields, such as a person with height, weight, and income, can be modeled as a point in d-dimensional space. For example, asking for all people with income between $0 and $10,000, height between 6'0" and 7'0", and weight between 50 and 140 lbs defines a box containing people whose bodies and wallets are both thin.
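The box query in the example can be sketched as follows. The field names (`income`, `height_in`, `weight_lb`) and the sample people are hypothetical, and the sketch uses a plain linear scan; systems handling large collections would normally use a spatial index such as a kd-tree instead.

```python
def in_box(point, box):
    """True if every field of the point lies inside its [low, high] range."""
    return all(low <= point[field] <= high for field, (low, high) in box.items())


def range_search(points, box):
    """Linear scan returning all points that fall inside the box."""
    return [p for p in points if in_box(p, box)]


people = [
    {"name": "A", "income": 8000, "height_in": 74, "weight_lb": 135},
    {"name": "B", "income": 52000, "height_in": 75, "weight_lb": 130},
    {"name": "C", "income": 9000, "height_in": 66, "weight_lb": 120},
]

# income $0-$10,000, height 6'0"-7'0" (72-84 inches), weight 50-140 lbs
box = {"income": (0, 10000), "height_in": (72, 84), "weight_lb": (50, 140)}
thin = [p["name"] for p in range_search(people, box)]
```

Each person is a point in three-dimensional space, and the query keeps only the points lying inside the box on all three axes at once.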

Turning to standard search protocols, strict adherence to standards allows any user interface to search any conforming search service. Examples include the Z39.50 family of standards for searching library catalogues. Z39.50 principles specify servers storing a set of databases with searchable indexes, where interactions are based on a session.

* Ridley, Michael. "Digital Libraries and New Technologies: Exploring Roles and Dislocations." University of Western Ontario. November 22, 2005. http://www.uoguelph.ca/~mridley/digital-libraries/UWO-Nov2005.ppt

Besides, the client opens a connection with the server(s), carries out a sequence of interactions and then closes the connection. During the course of the session, both the server and the client remember the state of their interaction. The Z39.50 family of standards has proved successful in a tightly knit community where there is a strong tradition of standardization, with many professionally trained people, and where the categories of material change gradually, allowing a slow-moving standardization process. The standardization approach has failed where these two criteria are not met.

4.12 Broadcast Search

In this approach an interface server broadcasts a query to each collection, combines the results and returns them to the user. Examples of this type are Dienst (a digital library protocol), Web metasearch services, etc. Ordinarily, in the simple version, each collection must support the same standards and protocols (e.g., Z39.50, HTTP, etc.).

However, the problems with broadcast search relate to performance, recall, ranking and duplicates. For instance, if any collection does not respond, the interface server waits for a timeout, and the documents in that collection are not found.
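The behaviour described above, including the timeout and duplicate problems, can be sketched with Python's standard thread pool. The three collection functions are stand-ins invented for the example (one of them deliberately responds too slowly), and the timeout value is arbitrary; a real interface server would issue network requests over a protocol such as Z39.50 or HTTP.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def broadcast_search(query, collections, timeout=0.2):
    """Send the query to every collection and merge whatever comes back in time."""
    merged, missing = [], []
    with ThreadPoolExecutor(max_workers=len(collections)) as pool:
        futures = {name: pool.submit(search, query)
                   for name, search in collections.items()}
        for name, future in futures.items():
            try:
                # wait up to `timeout` seconds for each collection in turn
                results = future.result(timeout=timeout)
            except FutureTimeout:
                missing.append(name)      # its documents are simply not found
                continue
            for item in results:
                if item not in merged:    # drop duplicates across collections
                    merged.append(item)
    return merged, missing


def fast_a(query):
    return ["doc1", "doc2"]


def fast_b(query):
    return ["doc2", "doc3"]


def slow_c(query):
    time.sleep(1.0)                       # simulates a collection that responds late
    return ["doc4"]


merged, missing = broadcast_search(
    "digital library", {"fast_a": fast_a, "fast_b": fast_b, "slow_c": slow_c})
```

The merged list contains "doc2" only once although two collections returned it, and "doc4" is lost because its collection did not answer within the timeout, which is exactly the recall problem noted above.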

4.13 Centralized Search Services

The third type is that of Batch indexing, real time searching, gathering by web crawling and harvesting services that are briefly described here.

Batch indexing: Here, metadata about all items is accumulated in a central system.

Real-time searching: Here the user searches the central system, and retrieves items from collections. Examples include Union catalogues, Web search services.

Gathering by web crawling: This is totally automatic, low cost and at the same time highly efficient at gathering very large amounts of material. However, the limitation is that it can gather only openly accessible materials. It cannot gather material held in databases unless explicit URLs are known, and it cannot make use of metadata provided by collections. Examples include Web search services.

Harvesting: In this case, each collection makes a copy of its metadata available from a server associated with the collection. A search service harvests metadata from all collections on a regular cycle and builds a central search system.

Harvesting has some advantages over gathering, in that it can index material from databases without explicit URLs. It also allows authentication and selection of material. At the same time, it requires that collections have metadata and support a harvesting protocol, for instance the Open Archives Initiative Protocol for Metadata Harvesting.
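A harvesting cycle under the Open Archives Initiative protocol can be sketched as follows. The `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces are part of the protocol; the repository address, the sample response and the function names are made up for the example, and the response is parsed from a string so that the sketch runs without a network connection.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a harvester would send to a collection's OAI server."""
    return base_url + "?" + urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix})


def harvest(xml_text):
    """Extract {identifier: title} pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    index = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        index[identifier] = title
    return index


# A hand-written sample response standing in for a real repository's reply.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title one</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("http://repository.example.org/oai")
central_index = harvest(SAMPLE_RESPONSE)
```

A search service would repeat this request for every participating collection on a regular cycle and load the harvested metadata into its central search system.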

Therefore, in order to facilitate an effective search environment, it is important to accomplish interoperability among digital libraries.

4.2 EFFECTIVE SEARCH ENVIRONMENTS IN DLs

Digital libraries need to be integrated so that users can perform a cross-repository search and thereby see them as a single entity. Such a retrieval system for specialized digital libraries is often required to provide a "one-stop" service to users who would otherwise need to access different search engines and databases to obtain information across multiple services. For the Web environment, meta-search engines have been around for a while to provide the same convenience of using a single query interface and gathering search results from multiple sources. But in a true digital library environment (an organized/controlled collection of resources with a higher degree of adherence to standards), for a meta-searcher or integrator system to perform cross-searching (or meta-searching) with a query, there are three major problems [Gravano]:


  • Choosing the best sources or services to which the query should be sent,

  • Translating a user query into the ones processed by the sources, and

  • Merging the multiple results from the multiple sources.

The first problem can be handled if the integrator has some knowledge about what individual sources contain or specialize in. The second problem has something to do with the query models and the interfaces of the underlying sources, while the third one is related to the algorithms used for retrieval and the result formats.
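The three problems can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the two sources, their topic descriptions and query syntaxes are invented, source selection is done by simple topic overlap, and results are merged by dividing each score by the top score from the same source. It is not the method of the protocol proposal cited above.

```python
# Hypothetical descriptions of two sources known to the integrator.
SOURCES = {
    "physics_repo": {"topics": {"physics", "optics"}, "and_operator": " AND "},
    "medical_index": {"topics": {"medicine", "health"}, "and_operator": " & "},
}


def choose_sources(query_terms, sources):
    """Problem 1: pick the sources whose declared topics overlap the query."""
    return [name for name, info in sources.items()
            if info["topics"] & set(query_terms)]


def translate(query_terms, source):
    """Problem 2: rewrite the query in the syntax the source understands."""
    return source["and_operator"].join(query_terms)


def merge(result_lists):
    """Problem 3: normalize each source's scores, then rank all documents together."""
    best = {}
    for results in result_lists:
        top = max((s for _, s in results), default=0) or 1
        for doc, s in results:
            best[doc] = max(best.get(doc, 0.0), s / top)
    return sorted(best, key=lambda d: (-best[d], d))


query = ["optics", "lasers"]
selected = choose_sources(query, SOURCES)
translated = translate(query, SOURCES["physics_repo"])
ranking = merge([[("a", 10), ("b", 5)], [("b", 2), ("c", 1)]])
```

Normalizing scores per source is one simple way of making ranks comparable when each source uses its own retrieval algorithm and result format.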

* Gravano, L., Chang, C.K., Garcia-Molina, H. and Paepcke, A. Stanford Protocol Proposal for Internet Retrieval and Search, http://www-db.stanford.edu/~gravano/starts.html, January 1997.

To handle these three problems, dynamic exchange and sharing of information between the integrator and the sources is critical. Interoperability is therefore an important factor, since cooperation depends on shared representations.

4.3 DIGITAL LIBRARY INTEROPERABILITY (DLI)

Before we deliberate on the issue of DLI, let us briefly touch upon what interoperability is all about. Interoperability is the ability of one system to communicate and interact with another system. There are three levels of interoperability [Shepherd]:

~ Basic: common tools and interfaces that provide uniformity for navigation and access.

~ Middle: syntactic interoperability that allows the interchange of metadata.

~ Highest: deep semantic interoperability that allows users to access similar classes of objects and services across multiple sites.

4.31 Approaches to Digital Library Interoperability

There are many approaches to achieving Digital Library (DL) interoperability for resource discovery, each with its own advantages and disadvantages.

4.32 Current DL Interoperability Approaches

Current DL interoperability approaches have been categorized into three types by the NSDL program [NSDL]: federation, harvesting, and gathering [MetaCrawler].

Based on the levels of interoperability described above, the current approaches thus fall into three types of interoperable search and retrieval mechanisms:

~ Federation,

~ Harvesting, and

~ Gathering.

Federation: Federation provides the most complete form of interoperability, but requires great efforts from participants.

Federated search is the simultaneous search of multiple online databases and is an emerging feature of automated, Web-based library and information retrieval systems. It is often referred to as a portal, as opposed to simply a Web-based "search engine". An example of federated search is MERLOT Federated Search (for Physics): http://fedsearch.merlot.org/

In a federated digital library, the aggregator enforces a custodial contract governing the relationship between contributors and subscribers using an access manager.

The purpose of federated search is to transform a query and broadcast it to a group of disparate databases with the appropriate syntax, then to merge the results collected from the databases and present them in a succinct and unified format with minimal duplication, and finally to provide a means, performed either automatically or by the portal user, of sorting the merged result set. Federated search portals can be classified as:

. Commercial, or

. Open access.
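The transform, broadcast, merge and sort cycle of a federated search can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the two database functions, their query syntaxes and their records are hypothetical stand-ins for real remote sources, and duplicates are detected simply by normalized title and year.

```python
from concurrent.futures import ThreadPoolExecutor


def catalogue_db(query):
    # Hypothetical source that expects a field-prefixed query such as ti:(...)
    return [{"title": "Digital Libraries", "year": 2004, "source": "catalogue"}]


def eprint_db(query):
    # Hypothetical source that expects plus-separated keywords
    return [{"title": "digital libraries ", "year": 2004, "source": "eprints"},
            {"title": "Metadata Harvesting", "year": 2006, "source": "eprints"}]


# Each database is paired with a function that rewrites the user query into its syntax.
DATABASES = [
    (catalogue_db, lambda q: f"ti:({q})"),
    (eprint_db, lambda q: q.replace(" ", "+")),
]


def federated_search(query, databases=DATABASES, sort_key="year"):
    # Broadcast: send the translated query to every database at the same time.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(db, translate(query)) for db, translate in databases]
        batches = [future.result() for future in futures]

    # Merge with minimal duplication: same normalized title and year means same item.
    seen, merged = set(), []
    for batch in batches:
        for record in batch:
            key = (record["title"].strip().lower(), record["year"])
            if key not in seen:
                seen.add(key)
                merged.append(record)

    # Sort the merged result set (here: newest first).
    return sorted(merged, key=lambda record: record[sort_key], reverse=True)


hits = federated_search("digital libraries")
```

Because the sources are queried in real time, the merged list is only as current and as fast as the slowest database that answers.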

In comparison with traditional search engines and metasearch, which are uncooperative, isolated environments, federated search provides a cooperative and integrated environment. Metasearch users can search, retrieve and access only those sources that have been indexed by the search engine's crawler technology. The large volume of documents housed in databases is not open to traditional Internet search engines because of limitations in crawler technology. Federated searching resolves this issue and makes these deep Web documents searchable without having to visit each database individually. As a consequence, the result rankings produced by metasearch are less homogeneous than those produced by federated search.

Federated search need not place any requirements or burdens on owners of the individual data sources, other than handling increased traffic. Federated searches are inherently as current as the individual data sources, as they are searched in real time.

Gathering: At the other end, gathering requires little from participants, but to provide the same quality of service as federation, extra work needs to be done by the interoperability service provider.

* MetaCrawler. http://www.metacrawler.com

* NSDL, National SMETE Digital Library. http://www.smete.org/nsdl

* Shepherd, Michael. Interoperability for Digital Libraries. https://drtc.isibang.ac.in/bitstream/1849/95/2C shepherd _interoperabi lity.pdf (visited February 6, 2008)

Harvesting: Harvesting lies between federation and gathering. We have heard so much about metadata harvesting that it may be worthwhile to focus a little on this issue, as it has become very important in our search strategies and facilitates access to the deep web, also called the hidden web or embedded web.

Before I talk about metadata harvesting, let me mention that metadata is data about data, with which I am sure all of you are familiar. There are primarily two important initiatives in this direction that are worth mentioning here.

~ Dublin Core Metadata Initiative (DCMI): The DCMI is an organization dedicated to promoting the widespread adoption of interoperable metadata standards and developing specialized metadata vocabularies for describing resources that enable more intelligent information discovery systems. The DCMI's mission is to provide simple standards to facilitate the finding, sharing and management of information, which it does broadly by [Dublin]:

. Developing and maintaining international standards for describing resources;

. Supporting a worldwide community of users and developers; and

. Promoting widespread use of Dublin Core solutions.
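To show what such a simple standard looks like in practice, here is a minimal Python sketch that builds a Dublin Core description. The namespace URI and the element names (title, creator, date, subject, type) belong to the Dublin Core element set; the helper function, the "metadata" wrapper element and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a simple Dublin Core description from a dict of element -> value(s)."""
    root = ET.Element("metadata")            # hypothetical wrapper element
    for element, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:                 # Dublin Core elements are repeatable
            ET.SubElement(root, f"{{{DC_NS}}}{element}").text = value
    return root


record = dublin_core_record({
    "title": "From Automation to Transformation",
    "creator": "Munshi, Usha Mujoo",
    "date": "2008",
    "subject": ["Digital libraries", "Library 2.0"],
    "type": "Text",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Every element is optional and repeatable, which is why the two subject values simply become two dc:subject elements.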

~ Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH): The OAI-PMH provides an application-independent interoperability framework based on metadata harvesting, with roots in e-print archives. It is a harvesting approach to interoperability at the metadata level. The OAI-PMH divides the world into two players:

Data provider (or metadata provider): holds the repository and administers systems that support the OAI-PMH as a means of exposing metadata.

Service provider: harvests metadata with a harvester and uses the metadata harvested via the OAI-PMH as a basis for building value-added services.

* Dublin Core Metadata Initiative. http://dublincore.org/about/ (visited February 10, 2008)

The other concepts that need a mention here are the repository and the harvester.

Repository: A repository is a network-accessible server that can process OAI-PMH requests.

Harvester: A harvester is a client application that issues OAI-PMH requests. A harvester is operated by a service provider as a means of collecting metadata from repositories. It supports selective harvesting.
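A harvester's two basic tasks, issuing a request and reading the reply, can be sketched in Python as follows. The ListRecords verb, the metadataPrefix, from and set arguments, and the XML namespaces are part of the OAI-PMH; the base URL, the sample response and the helper names are hypothetical, and the sketch parses a canned reply instead of contacting a real repository.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build a ListRecords request; from_date and set_spec give selective harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract identifier and title from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{OAI}record"):
        identifier = rec.findtext(f"{OAI}header/{OAI}identifier")
        title = rec.findtext(f".//{DC}title")
        records.append({"identifier": identifier, "title": title})
    return records


# Hypothetical repository address and a hand-written sample response.
url = list_records_url("http://example.org/oai", from_date="2008-01-01", set_spec="physics")

SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

harvested = parse_records(SAMPLE_RESPONSE)
```

A production harvester would also follow resumption tokens for long result lists and handle protocol errors, which this sketch leaves out.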

The Open Archives Initiative (OAI) presents a technical and organizational metadata-harvesting framework designed to facilitate the discovery of content stored in distributed archives. ARC (http://arc.cs.odu.edu) is the first federated search service based on the OAI-PMH; it is hence the first OAI service provider that harvests metadata from a number of OAI data providers and implements an end-user federated search service.

Institutional Initiatives: Libraries increasingly need to facilitate searching and querying across heterogeneous federated digital libraries in order to provide effective services to users and make the library portal a knowledge hub. With the increasing acceptance of interoperability standards such as the OAI-PMH, it is becoming feasible to build federated discovery services that aggregate metadata from different digital libraries (data providers) and provide a unified search interface to users. I would like to give an example here, and I thought I would take my own institution. At the Indian Statistical Institute, Kolkata, we have attempted to provide a one-stop info-shop by facilitating federation, harvesting, searching and OAI-compliant institutional repositories. Though the initiative is not yet fully mature, you can access it on our IR portal: http://ir.isical.ac.in

4.33 Harvested vs Federated Search

In a federated digital library system all participants use the same DL protocol. Every aspect of interoperability is formally defined and every organization commits to follow the standards exactly and builds its services according to the common specifications. In practice this forces all organizations to use the same platform or software suites and enhance them to the same schedule.

In the harvesting approach, a service provider collects metadata from each of the DLs and then indexes it to provide a federated search service.
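The indexing step of the harvesting approach can be illustrated with a small inverted index in Python. This is a sketch under stated assumptions: the HarvestIndex class, the provider names and the record titles are hypothetical, and a real service provider would index many more metadata fields than the title.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class HarvestIndex:
    """Central index built by a service provider from harvested metadata."""

    def __init__(self):
        self.postings = defaultdict(set)   # term -> keys of records containing it
        self.records = {}                  # key -> title

    def add(self, provider, record_id, title):
        key = f"{provider}:{record_id}"
        self.records[key] = title
        for term in tokenize(title):
            self.postings[term].add(key)

    def search(self, query):
        """Return the keys of records that contain every term of the query."""
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(matches)


index = HarvestIndex()
index.add("isi", "1", "Sampling theory")
index.add("arxiv", "7", "Sampling in quantum theory")
index.add("arxiv", "9", "Metadata harvesting")

found = index.search("sampling theory")
```

The user searches one local index, so the answer is fast, but it is only as current as the last harvest from each data provider.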

If the various organizations are not prepared to cooperate in any formal manner, a base level of interoperability is still possible by gathering openly accessible information. The most common examples of this approach are the meta Web search engines. Because there is no cost to participating, gathering can provide services that embrace large numbers of digital libraries. However, if no extra work is done to control quality, gathering-based services are usually of poorer quality than can be achieved by partners who cooperate more fully.

4.34 Access Policies

Access policies depend primarily upon the nature, flavour and character of the system, since in a distributed system we are handling a heterogeneous network of digital libraries, where all participating players may not have a uniform code of conduct. In general, however, we may have a standards-based approach for supporting dynamic access policies for a federated digital library. There may be instances where we have:

~ Data-provider-restrictive systems that may have:

. Content based restrictions

. Provisional actions (defined by the data provider), which may need authentication and authorization

~ Non-restrictive systems (of some sort)
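One way to picture these policy types is the following Python sketch. It is purely illustrative and does not describe any particular standard or the ISI portal: the Policy class, its two fields and the "entitlements" key of the user dictionary are hypothetical names assumed for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Access policy declared by a data provider (hypothetical structure)."""
    restricted_types: set = field(default_factory=set)   # content-based restriction
    requires_login: bool = False                          # authentication needed first


def can_access(policy, record_type, user=None):
    """Decide whether a user (None means anonymous) may see a record of this type."""
    if policy.requires_login and user is None:
        return False                                      # authentication step failed
    if record_type in policy.restricted_types:
        # Authorization: restricted content needs an explicit entitlement.
        return user is not None and record_type in user.get("entitlements", ())
    return True


open_policy = Policy()                                    # non-restrictive system
restrictive = Policy(restricted_types={"thesis"})         # data-provider-restrictive system
member = {"name": "reader", "entitlements": {"thesis"}}

decisions = [
    can_access(open_policy, "thesis"),
    can_access(restrictive, "thesis"),
    can_access(restrictive, "thesis", member),
    can_access(restrictive, "article"),
]
```

In a federated setting the integrator would have to evaluate a different policy of this kind for every participating data provider before showing a result.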

Libraries increasingly need to facilitate searching and querying across heterogeneous federated digital libraries in order to provide effective services to users and make the library portal a knowledge hub. With the increasing acceptance of interoperability standards such as the OAI-PMH, it is becoming feasible to build federated discovery services that aggregate metadata from different digital libraries (data providers) and provide a unified search interface to users. The Indian Statistical Institute (ISI) has also taken the initiative to provide access to various kinds of digital library resources. These include the institution's own resources (the so-called institutional repository), both pedagogical and (primarily) non-pedagogical; resources harvested from OAI-compliant digital libraries in the areas of the institution's interest; and cross access, from a single point, to the repositories that ISI has access to, through a modest federated search facility.


The portal accessible at http://ir.isical.ac.in is a one-stop shop for all information resources for the users. However, this is still at an early stage and is under further developmental process.

5.0 WEB 2.0 TO LIBRARY (LIB) 2.0: TAKING A U TURN

Web 2.0 is around the corner and we have already started focusing on Lib 2.0 for resource discovery and information sharing, but what implications it will really have for libraries is yet to be deciphered fully. Before we talk about this issue, let me first touch upon what Web 2.0 is all about and then deliberate on Library 2.0 and its implications for libraries.

5.1 WEB 2.0

The term Web 2.0 was popularized by Tim O'Reilly in October 2004. The concept of "Web 2.0" began with a conference brainstorming session between O'Reilly and MediaLive International. Dale Dougherty, web pioneer and O'Reilly VP, noted that far from having "crashed", the web was more important than ever, with exciting new applications and sites popping up with surprising regularity.

The term Web 2.0 is widely defined, used and interpreted. Web 2.0 is essentially not a web of textual publication but a web of multi-sensory communication; not a collection of monologues but a matrix of dialogues; and a user-centered Web in ways it has not been thus far.

In very general terms, Jeff Bezos has defined it thus: "Web 1.0 was making the Internet for people, Web 2.0 is making the Internet better for computers."

Web 2.0 applications are Web-based applications that (i) allow for collaboration and sharing of information, (ii) are easy to use, (iii) encourage users to help build the information environment, and (iv) allow for the reuse of data.

5.11 Principles of Web 2.0

Web 2.0 is built upon trust, whether that trust be placed in individuals, in assertions, or in the uses and reuses of data. It is generally said to be an attitude rather than a technology, one that presages a freeing of data.

» Permits the building of virtual applications: These draw data and functionality from a number of different sources as appropriate. They tend to be small and relatively rapid to deploy. Examples include the various applications of Google Maps.

» Is participative: Facilitates blogs, file sharing, or equivalents, e.g. Wikipedia.

~ Applications are modular

~ Is about sharing

~ Is about communication and facilitating community

~ Is about remix: Remixing, also called "mashups", is perhaps the most important concept (find the relevant snippets and make them ours as well as the originator's).

~ Is smart: Uses knowledge about us to deliver services that meet our needs and to deliver rich user experiences.

~ Opens up the Long Tail: Leveraging the long tail through customer self-service.

5.2 LIB 2.0

Library 2.0 envisages disruptive change that challenges our conception of library services and the current forms in which we offer information services to our users. This is perhaps (partly) true, as it will bring about deviation from our regular practices. But there is nothing to get bogged down by; changes will come around, and they have to come around.

In September 2005, Michael Casey [Casey] coined the term Library 2.0, describing it as Web 2.0 concepts and applications in the LIS realm (though there is no agreement on the definition).

Among the many interpretations offered by bloggers, one defines "Library 2.0" as "the application of interactive, collaborative, and multimedia web-based technologies to web-based library services and collections."

~ Library 2.0 is all about library users: keeping those we have while actively seeking those who do not currently use our services.

~ It is about embracing those ideas and technologies that can assist libraries in delivering services to these groups, and

~ It is about creating the user-friendly services that people expect, and encouraging participation, and

* Casey, M. (2006). LibraryCrunch: bringing you a library 2.0 perspective. Accessed February 13, 2008, from http://www.librarycrunch.com

~ Where the areas of change will be: policy, programming, physical spaces, and technology.

5.21 Essential Elements of Library 2.0

* User centric, * Multimedia character, * Social networking, * Community oriented

User centric: Users participate in the creation of the content and services they view within the library's web-presence, OPAC, etc. The consumption and creation of content is dynamic, and thus the roles of librarian and user are not always clear.

Multimedia character: Both the collections and services of Library 2.0 contain video and audio components. This is not often cited as a function of Library 2.0, but it ought to be.

Social networking: It is socially rich; the library's web-presence includes users' presence. There are both synchronous (e.g. IM) and asynchronous (e.g. wikis) ways for users to communicate with one another and with librarians.

Community oriented: It is communally (reciprocally) innovative. This is perhaps the single most important aspect of Library 2.0. It rests on the foundation of libraries as a community service, but understands that as communities change, libraries must not only change with them, they must allow users to change the library. It seeks to continually change its services, to find new ways to allow communities, not just individuals, to seek, find, and utilize information.

While these conceptual tenets of Library 2.0 might be rather dependable, envisioning the technological specifics of the next generation of electronic library services is at once both fraught with inevitable error and absolutely necessary. How the applications so common to Web 2.0 will continue to evolve, and how libraries might utilize and leverage them for their patrons, are inherently hidden, for they are wholly about innovation. But the conceptual underpinning of a library's web-presence, and how it must evolve into a multimedia presence that allows users to be present as well, both with the library or librarian and with one another, are clearly in need of development.

5.22 Implications of Web 2.0 on Lib 2.0

The implications of the Web 2.0 revolution for libraries are enormous. Recent thinking describing the changing Web as "Web 2.0" will have substantial implications for libraries; while these implications keep very close to the history and mission of libraries, they still necessitate a new paradigm for librarianship. Librarians are only beginning to acknowledge and write about it, primarily in the "biblioblogosphere" (web logs written by librarians). Journals and other more traditional literatures have yet to fully address the concept, but the application of Web 2.0 thinking and technologies to library services and collections has been widely framed as Library 2.0.

Library 2.0 demands that libraries focus less on secured inventory systems and more on collaborative discovery systems. There is perhaps a great synchronicity between librarianship and Web 2.0. Viewed holistically, however, Library 2.0 will revolutionize the profession: rather than creating systems and services for patrons, librarians will enable users to create them for themselves. A profession steeped in decades of a culture of control and predictability will need to continue moving toward embracing facilitation and ambiguity. This shift corresponds to similar changes in library history, including the opening of book stacks and the inclusion of fiction and paperbacks in the early 20th century. Library 2.0 is therefore not about searching, but finding; not about access, but sharing. Library 2.0 recognizes that human beings do not seek and utilize information as individuals, but as communities. Some examples of the move from Library 1.0 to Library 2.0 (the Library 2.0 counterpart is given in parentheses) include:

~ Email reference/Q&A pages (chat reference)

~ Text-based tutorials (streaming media tutorials with interactive databases)

~ Email mailing lists, webmasters (blogs, wikis, RSS feeds)

~ Controlled classification schemes (tagging coupled with controlled schemes)

~ OPAC (personalized social network interface)

~ Catalogue of largely reliable print and electronic holdings (catalogue of reliable and suspect holdings, web pages, blogs, wikis, etc.)

5.23 Evolving: Web 2.0 and the Lib 2.0

~ Synchronous Messaging

~ Streaming Media

~ Blogs and Wikis

~ Social Networks

~ Tagging

~ RSS Feeds

~ Mashups

Synchronous Messaging: This technology has already been embraced quite rapidly by the library community. More widely known as instant messaging (IM), it allows real-time text communication between individuals. Libraries have begun employing it to provide "chat reference" services, where patrons can communicate synchronously with librarians much as they would in a face-to-face reference context. This is consistent with the tenets of Library 2.0 because it allows (i) user presence within the library web-presence; (ii) collaboration between patrons and librarians; and (iii) a more dynamic experience than the fundamentally static, created-then-consumed nature of 1.0 services. It is also considered 2.0 because it is becoming a more web-based application, and the software used by chat reference services is usually much more robust than the simplistic IM applications that are so popular: it often allows co-browsing, file sharing, screen capturing, and data sharing and mining of previous transcripts.

Streaming Media: Streaming of video and audio media is another application that many might consider Web 1.0, as it also predates Web 2.0 thinking. For libraries to begin maximizing streaming media's usefulness for their patrons, however, 2.0 thinking will be necessary. Library instruction delivered online has begun incorporating more interactive, media-rich facets. The static, text-based explanation coupled with a handout to be downloaded is being supplemented by more experiential tutorials. For instance, Peer Reviewed Instructional Materials Online (PRIMO), from the Instruction Section of the Association of College and Research Libraries, provides a database of tutorials, many of which are Web 2.0 in nature: http://www.ala.org/ala/acrlbucket/is/iscommittees/webpages/emergingtech/primo/index.htm. Perhaps these tutorials are the first library services to migrate into the more socially rich Web 2.0. Many of these tutorials use Flash programming, screen-cast software, or streaming audio or video, and couple the media presentation with interactive quizzing; users respond to questions and the system responds in kind. Future tutorials could take the form of multimedia chat rooms or wikis, in which users interact with one another and with the learning object at hand, much as they would in a classroom or instruction lab. Another implication of streaming media for libraries concerns collections rather than services. As media is created, libraries will inevitably be the institutions responsible for archiving it and providing access to it.

Libraries are already beginning to explore providing such access through digital repository applications and digital asset management technologies. Yet these applications are generally separate from the library's catalogue, and this fracture will need to be mended. Library 2.0 will show no distinction between or among formats and the points at which they may be accessed.

Blogs and Wikis: Blogs and wikis are fundamentally 2.0, and their global proliferation has enormous implications for libraries. The most obvious implication of blogs for libraries is that they are another form of publication and need to be treated as such. They lack editorial governance and the security this provides, but many are nonetheless integral productions in a body of knowledge, and their absence from a library collection could soon become unthinkable.

Wikis are essentially open web pages, where anyone registered with the wiki can publish to it, amend it, and change it. Much as blogs, a library wiki as a service can enable social interaction among librarians and patrons, essentially moving the study group room online. Blogs and wikis are relatively quick solutions for moving library collections and services into Web 2.0.

This beginning of Library 2.0 makes collections and services more interactive and user-centered, enabling information consumers to contact information producers and become co-producers themselves. It could be that Library 2.0 blurs the line between librarian and patron, creator and consumer, authority and novice. The potential for this dramatic change is very real and immediate, a fact that places an incredible amount of importance on information literacy. In a world where no information is inherently authoritative and valid, the critical thinking skills of information literacy are paramount to all other forms of learning.

Social Networks: Social networks are perhaps the most promising and embracing technology discussed here. No imagination is required to begin seeing a library as a social network itself. Much of libraries' role throughout history has been as a communal gathering place, one of shared identity, communication, and action. Social networking could enable librarians and patrons not only to interact, but to share and change resources dynamically in an electronic medium. Users could create accounts with the library network, see what other users have in common with their information needs, and recommend resources to one another; the network, in turn, could recommend resources to users based on similar profiles, demographics, previously accessed sources, and a host of data that users provide. It could also enable users to choose what is public and what is not, a notion that could help circumvent the privacy issues Library 2.0 raises. Of all the social aspects of Web 2.0, it could be that the social network and its successors most greatly mirror the traditional library. Social networks, in some sense, are Library 2.0. The face of the library's web-presence in the future may look very much like a social network interface.
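How a library network might recommend resources on the basis of previously accessed sources can be sketched in Python. This is one simple technique (Jaccard similarity between reading histories) that I assume for illustration, not a description of any existing library system; the user names and item labels are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, histories, top_n=3):
    """Suggest items that similar users have accessed but this user has not."""
    mine = histories[user]
    scores = {}
    for other, theirs in histories.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        for item in theirs - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical access histories: user -> set of resources already accessed.
histories = {
    "asha": {"A", "B"},
    "ravi": {"A", "B", "C"},
    "meena": {"D"},
}

suggestions = recommend("asha", histories)
```

Only histories that users have chosen to make public would be fed into such a calculation, which is where the privacy choice mentioned above comes in.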

Tagging: Tagging essentially enables users to create subject headings for the object at hand. It is essentially Web 2.0 because it allows users to add and change not only content (data), but content describing content (metadata). Tags and standardized subjects are not mutually exclusive. The catalogue of Library 2.0 would enable users to follow both standardized and user-tagged subjects, whichever makes most sense to them. In turn, they can add tags to resources. The user responds to the system, and the system to the user. This tagged catalogue is an open catalogue, a customized, user-centered catalogue. It is library science at its best. Tagging simply makes lateral searching easier. The often-cited example of the U.S. Library of Congress Subject Heading "cookery", which no English speaker would use when referring to "cookbooks", illustrates the problem of standardized classification. Tagging would turn the useless "cookery" into the useful "cookbooks" instantaneously, and lateral searching would be greatly facilitated.
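A catalogue that follows both standardized subjects and user tags can be pictured with this small Python sketch. The TaggedCatalogue class and the item identifier are hypothetical, assumed only to illustrate the cookery and cookbooks example.

```python
from collections import defaultdict


class TaggedCatalogue:
    """Catalogue searchable by controlled subject headings and by user tags."""

    def __init__(self):
        self.subjects = defaultdict(set)   # controlled heading -> item ids
        self.tags = defaultdict(set)       # user tag -> item ids

    def add_item(self, item_id, headings):
        """Cataloguer assigns standardized subject headings."""
        for heading in headings:
            self.subjects[heading.lower()].add(item_id)

    def tag(self, item_id, tag):
        """User adds a free-text tag to an existing item."""
        self.tags[tag.lower()].add(item_id)

    def find(self, term):
        """Return items matching the term as either a heading or a tag."""
        term = term.lower()
        return sorted(self.subjects.get(term, set()) | self.tags.get(term, set()))


catalogue = TaggedCatalogue()
catalogue.add_item("b1", ["Cookery"])      # standardized heading
catalogue.tag("b1", "cookbooks")           # user-supplied tag

by_tag = catalogue.find("cookbooks")
by_heading = catalogue.find("Cookery")
```

The controlled heading is kept, so nothing is lost for users who search by it; the tag simply adds a second route to the same item.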

RSS Feeds: Syndication of content is another Web 2.0 application that is already having an impact on libraries, and could continue to do so in remarkable ways. RSS feeds and other related technologies provide users a way to syndicate and republish content on the Web. Users republish content from other sites or blogs on their sites or blogs, aggregate content on other sites in a single place, and ostensibly distill the Web for their personal use.

Libraries are creating RSS feeds for users to subscribe to, including updates on new items in a collection, new services, and new content in subscription databases. They are also republishing content on their sites.

However, libraries have yet to explore ways of using RSS more pervasively.
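A feed of new items in a collection can be produced with a few lines of Python. The rss, channel, item, title, link and description elements are those of RSS 2.0; the library name, the addresses and the helper function are hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET


def new_items_feed(library_name, link, items):
    """Return an RSS 2.0 document listing newly added items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0
    ET.SubElement(channel, "title").text = f"{library_name}: new items"
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Recently added titles"
    for item in items:
        entry = ET.SubElement(channel, "item")
        ET.SubElement(entry, "title").text = item["title"]
        ET.SubElement(entry, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


feed = new_items_feed(
    "Example Library",
    "http://library.example.org",
    [{"title": "Metadata Basics", "link": "http://library.example.org/items/1"}],
)
parsed = ET.fromstring(feed)
```

Users who subscribe to such a feed in their reader see new acquisitions without visiting the library site, and other sites can republish the same list.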

Mashups: Mashups are perhaps the single conceptual underpinning of all the technologies discussed here. They are ostensibly hybrid applications, in which two or more technologies or services are conflated into a completely new, novel service. Retrievr (http://labs.systemone.at/retrievr/), for example, conflates Flickr's image database and an experimental information architecture algorithm to enable users to search images not by metadata, but by the data itself: users search for images by sketching images. Another example is WikiBios (http://www.wikibios.com/), a site where users create online biographies of one another, essentially blending blogs with social networks.

The following features broadly depict the working of Library 2.0 in libraries:

» OPAC 2.0: record tagging, RSS for search results, acquisitions and alerts, user agents, OpenURL, federated search, user reviews, OpenSearch, recommendations, communities (Googlezon model)

» Subject based wikis

» Bloglines trusted feeds

» Library blogs

» IM reference

» RSS alerts for library news

» Pod- and video-casting guides to library services

» Personal search engines for reference (Swiki, Gigablast)

» Collaborative web (MySpace, Protopage, Netvibes...) for communicating with users.

5.24 Future of these Technologies in the Library Arena

The future is going to be fascinating and attention grabbing. By providing interactive Web services such as chat reference, libraries have positioned themselves to adopt their successors quickly and expertly. The text-based nature of IM applications is changing into a more multimedia experience; audio and video messaging is becoming more common. These will become ubiquitous, as they provide more multi-sensory experiences, and will hence be available throughout the library's web-presence.

Reference service, which will be a hallmark of these technologies, will deliver more than ever. It is conceivable that, should a user allow such a service, chat reference can be prompted when certain user seeking behaviors are detected. Library 2.0 will know when users are lost, and will offer immediate, real-time assistance. The time will almost certainly soon come when Web reference is nearly indistinguishable from face-to-face reference; librarians and patrons will see and hear each other, and will share screens and files. The transcripts these sessions already provide will serve library science in ways that face-to-face reference never did. For the first time in the history of libraries, there will be a continuously collected transcription of the reference transaction, always awaiting evaluation, analysis, cataloguing, and retrieval for future reference.

6.0 WHAT CAN THESE CHANGES DO TO LIBRARIES?

Libraries have a lot to gain. For instance:

~ Improved discovery of institutional and remote resources

~ Subject-specific aggregation, especially for the "Academic Internet"

~ Different entry points to collections

~ User-sensitive display of resources (perhaps using XSL style sheets)

Library 2.0 is therefore indeed a mashup. It is a hybrid of blogs, wikis, streaming media, content aggregators, instant messaging, and social networks. Library 2.0 remembers a us