Seventh International Conference on Information and Knowledge Management
Nov. 3, 1998, Washington, D.C., USA.
Sponsored by ACM SIGIR and SIGMIS.
Rajeev Rastogi and Kyuseok Shim
Fredric C. Gey, Ph.D.
Il-Yeol Song
Sandra Heiler and Gail Mitchell

T1. DATA MINING ON LARGE DATABASES
Rajeev Rastogi and Kyuseok Shim

Tutorial Outline

Biography of Instructors :

T2. MODELS IN INFORMATION RETRIEVAL
Fredric C. Gey, Ph.D.

COURSE DESCRIPTION: Information retrieval algorithms have emerged as the key to effective search of large collections of unstructured text such as those found on the Internet. Vector space algorithms are used by Lycos and AltaVista, while Inktomi uses a probabilistic document retrieval algorithm. The three major theoretical models in information retrieval are Boolean/logic, vector space, and probabilistic. This tutorial will explain the unique characteristics and problems of each model and how each model has evolved along different lines. Modern variants of the basic models are explained. Attendees will obtain a basic understanding of the major theoretical models upon which modern text retrieval software is based. The tutorial should provide each participant with a starting point for further self-education.
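
As a rough illustration of how two of the models named above differ, the following sketch contrasts Boolean matching with vector space ranking on a toy corpus. It is not taken from the tutorial materials: the documents, the query, the function names, and the choice of TF-IDF weighting with cosine similarity are all assumptions made for the example.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration only.
DOCS = {
    "d1": "probabilistic retrieval of text documents",
    "d2": "vector space retrieval of text",
    "d3": "relational database design",
}


def tokenize(text):
    """Split on whitespace after lowercasing (a deliberately naive tokenizer)."""
    return text.lower().split()


def boolean_and(query, docs):
    """Boolean model: return ids of documents that contain every query term."""
    terms = set(tokenize(query))
    return sorted(d for d, text in docs.items() if terms <= set(tokenize(text)))


def tfidf_vectors(docs):
    """Vector space model: represent each document as a sparse tf * idf vector."""
    tokens = {d: tokenize(text) for d, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    idf = {term: math.log(n_docs / count) for term, count in doc_freq.items()}
    vectors = {
        d: {term: tf * idf[term] for term, tf in Counter(toks).items()}
        for d, toks in tokens.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Order document ids by decreasing cosine similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {term: tf * idf.get(term, 0.0) for term, tf in Counter(tokenize(query)).items()}
    scores = {d: cosine(q, vec) for d, vec in vectors.items()}
    return sorted(scores, key=lambda d: scores[d], reverse=True)


print(boolean_and("vector retrieval", DOCS))
print(rank("vector space retrieval", DOCS))
```

The Boolean function returns an unordered set of exact matches, while the vector space function scores every document, so partially matching documents still appear in the ranking.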

Materials: 110 course overheads and 4 pages of bibliographic references will be provided.

WHO SHOULD ATTEND:

ABOUT THE INSTRUCTOR:

T4. DATA WAREHOUSING DESIGN TECHNIQUES FOR ROLAP
Il-Yeol Song, Ph.D.

Level : Beginning to Intermediate.
Intended Audience : Professionals who are working on or
considering data warehousing based on relational database
systems.

Tutorial Abstract : A data warehouse is an integrated data repository
containing historical data of a corporation for supporting
decision-making processes. Recently, data warehouses have
become the focus of corporate information management with
the most advanced database technology. The basic strategy
for accessing individual and aggregate data in a data
warehouse using relational databases is known as ROLAP
(Relational OLAP). This tutorial presents a technology
overview for the development of data warehousing. It
compares ROLAP and MOLAP (Multidimensional OLAP), then
discusses techniques for designing star schemas. We will
look at the multiple variations of the star schema that
exist and the differences in the properties of these
schemas. The tutorial also discusses techniques for
optimizing the performance of data warehouse systems based
on relational database systems. Specifically, the discussion
includes storage, parallel processing technology, indexing
technology (including bitmap indexes, join indexes, and
multi-table join indexes), indexing strategies, query
optimization based on the star schema, and partitioning
techniques. It concludes with a survey of commercial
markets, tools, trends, research issues, and challenges.
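
To make the star schema and bitmap indexing ideas mentioned in the abstract more concrete, here is a minimal sketch. It is not drawn from the tutorial itself: the tables, column names, and helper functions are hypothetical, and the index shown (bitmaps over fact rows keyed by a dimension attribute, a form sometimes called a bitmap join index) is only one of the variants the abstract lists.

```python
# Hypothetical star schema: one fact table referencing two dimension tables.
PRODUCT = {1: {"category": "book"}, 2: {"category": "music"}}
STORE = {10: {"region": "east"}, 20: {"region": "west"}}
SALES = [  # fact rows: (product_id, store_id, amount)
    (1, 10, 30.0),
    (2, 10, 12.5),
    (1, 20, 20.0),
    (2, 20, 7.5),
]


def build_bitmap_index(facts, key_pos, dimension, attribute):
    """Map each attribute value to an int whose bit i is set when fact row i matches."""
    index = {}
    for i, row in enumerate(facts):
        value = dimension[row[key_pos]][attribute]
        index[value] = index.get(value, 0) | (1 << i)
    return index


def sum_amount(facts, bitmap):
    """Aggregate the amount measure over the fact rows selected by the bitmap."""
    return sum(row[2] for i, row in enumerate(facts) if (bitmap >> i) & 1)


by_category = build_bitmap_index(SALES, 0, PRODUCT, "category")
by_region = build_bitmap_index(SALES, 1, STORE, "region")

# A query with two dimension predicates becomes a bitwise AND of two bitmaps.
book_east = by_category["book"] & by_region["east"]
print(sum_amount(SALES, book_east))
```

Combining predicates with bitwise operations before touching the fact rows is the property that makes this kind of index attractive for queries that filter on several dimensions at once.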

Biography of Instructor : Il-Yeol Song is an associate professor in the College of
Information Science and Technology at Drexel University,
Philadelphia, PA. He received his M.S. and Ph.D. degrees in
Computer Science from Louisiana State University in 1984 and
1988, respectively. His current research areas include
database modeling and design, data warehousing,
object-oriented database systems, and object-oriented
analysis and design. He has published over 60 refereed
technical articles in various journals, international
conferences, and books. In 1992, he received an exemplary
teaching award as well as a research scholar award from
Drexel University. He has won eight Sigma Xi research awards
from the Drexel Sigma Xi scientific research competition. He
has served as a program committee member for over
twenty-five international conferences and workshops. He was
the guest editor for a 1995 special issue of Journal of
Computer and Software Engineering entitled "Methodologies
and Tools for Intelligent Information Systems." He will be
the guest editor for a special issue of Journal of Computer
Science and Information Management entitled "Applications
and Technologies for Next Generation Database Systems,"
scheduled for early 1999. He is the program co-chair of the
First ACM Int'l Workshop on Data Warehousing and OLAP
(DOLAP), which will be held with CIKM98 on November 7 in
Washington, D.C.

T5. METADATA REPOSITORIES: ENABLING INFORMATION ASSET MANAGEMENT
Sandra Heiler and Gail Mitchell

Metadata repositories have long been used by software engineering tools to store and manage descriptions of system components, and by data administrators to document information stores. More recently, they are being used to support the integration of various tools, databases, and applications, and their use is being expanded to manage metadata for many more kinds of applications, including data warehousing. In this half-day tutorial, we present an industrial perspective on repository technology and its uses in managing an enterprise's information assets.

The tutorial starts with a description of repository technology. It examines requirements for managing metadata and describes how these are met by the technology. In particular, we discuss repository architectures, integration mechanisms, repository metamodels, and associated tools for populating, accessing, maintaining, and administering the repository. We identify various implementation strategies for repositories, and look at the state of the art in repository products.

The second part of the tutorial examines the use of repositories. We begin with a discussion of issues in populating a repository and in implementing applications using repositories. We then describe a number of applications of repository technology, including software lifecycle support, production planning and management, and decision support systems and data warehousing. Finally, we look at how the repositories supporting these applications combine to provide for enterprise-wide information asset management, and we identify research issues in moving to this broader use.

Instructors: Sandra Heiler is the Principal Investigator of the Data and Database Research project at GTE Laboratories, where her research focuses on the use of metadata repositories to support enterprise-wide management of information and software components. In particular, her work is directed toward the use of metadata and repository technology to integrate distributed, heterogeneous systems and databases, and to support data warehousing. She is also involved in the application of this technology to legacy system migration and data archiving in a large SAP rollout. Ms. Heiler's earlier work at GTE Laboratories was with the Distributed Object Management Department, where she did research on object model integration and interoperability frameworks, and on object views and identifiers. She joined GTE from CCA and Xerox Advanced Information Technology, where she developed object models and object management systems for VLSI and software engineering environments, as well as transaction models to support cooperative work in those environments.