COLLOQUIUM
Computer Science Department, Boston University

Speaker: Yi Chen
University of Pennsylvania
Date: Friday, April 1
Time: 9:00
Place: Room MCS 135, 111 Cummington Street
(for directions, see www.cs.bu.edu/colloquium)

Title: Managing XML Data Effectively in Relational Databases

Abstract:
XML has become a standard data representation format in various applications, from bioinformatics, healthcare and astronomy, to finance, legislative documents and government data. With the large amount of data now being represented in XML, the question is raised of how to effectively store, index, and access it to retrieve information. Since relational databases have matured through more than 30 years of development, it is natural to investigate whether this technology can be leveraged to manage XML data.

In this talk, I present techniques for storing and querying XML data using relational databases in different scenarios. When the schema of XML data is not available, a generic storage mapping to relational databases is proposed. Based on a novel bi-labeling scheme, an XML query is translated to an SQL query. Compared with previous work, the generated SQL query contains fewer joins and requires fewer disk accesses to execute.

When a schema is available, structural and semantic constraint information of an XML document is used to guide the mapping design. In contrast with previous work that only considers structural information, the proposed mapping reduces the redundancy of the XML data in its relational storage and enables efficient validation of data correctness with respect to constraints.

Biography:
Yi Chen is a Ph.D. candidate in the Department of Computer and Information Science at the University of Pennsylvania. She received her M.S. in Computer and Information Science from the University of Pennsylvania in 2000 and her B.S. in Computer Science from Central South University, China in 1999.
Her research interests include database systems, web data and scientific data management, and query processing and optimization techniques for databases and streams.

Host: George Kollios