Research Motivation

 

My research lies in knowledge organization, metadata creation and quality control, digital libraries, social networks, and information retrieval. Specifically, it has covered metadata quality evaluation, automatic metadata generation, the properties of user-created metadata (social annotations), and social-annotation-based web organization and information retrieval. My goal is to examine the properties of metadata created by experts, automatic tools, authors, and users, and to investigate how metadata from these different sources can improve the performance of current knowledge organization and information retrieval systems.


With the explosive development of the Web and the rapid growth of digital libraries, digital documents are becoming ubiquitous. Finding the information resources relevant to a user’s specific information need among this enormous volume of documents is a non-trivial problem. Knowledge organization and information retrieval are two ways to address it, and metadata plays an essential role in both. My research focuses on metadata-based web organization and web search. In particular, my dissertation research is dedicated to improving the organization and retrieval of web documents by leveraging community-contributed metadata, namely social annotations.

 

Dissertation Research

 

Metadata is essential for organizing and searching information resources. It is traditionally created by information professionals based on metadata standards or controlled vocabularies. Professionally created metadata is generally considered to be of high quality, but it is costly to produce and difficult to scale. In the web environment especially, the enormous volume of online digital documents makes alternative ways of generating metadata a critical need.


Social tagging, or social annotation, is a major characteristic of Web 2.0 and has gained popularity since the first social bookmarking system, del.icio.us, was started in 2003. The purpose of social tagging systems is to help users share, store, organize, and retrieve the digital documents they are interested in. The social annotations that users create therefore provide a special type of metadata that can be utilized for classifying and retrieving web documents.


Social tagging has several advantages over traditional metadata creation methods. First, it lowers the barrier to entry for metadata creation. Any web user who is familiar with the content of a resource can be a tagger, so many more resources are tagged at little cost. Second, social annotation adapts quickly to changes in user needs and vocabulary. New terms and their related resources can be absorbed into a social annotation system quickly and with little maintenance cost. Third, social annotations represent users’ perspectives on the resources. In traditional metadata creation methods, users are disconnected from the process. In a social tagging system, however, taggers are simultaneously indexers and searchers, which makes it easier to attain indexer-searcher consistency, a prerequisite of effective retrieval. Finally, social tagging has a valuable social network property. By exploiting the interrelations among documents, tags, and users, we are able to learn not only the topics of web documents, but also the semantics of tags and the information interests of different users.


Aware of the potential value of social annotations for web organization and information retrieval, I devoted myself to studying the following topics in my dissertation research: