-------------------------------------------------------------------------------
                     B O S T O N   U N I V E R S I T Y
                        Computer Science Department
                           C O L L O Q U I U M

                  Indexing and Mining Multimedia Databases

                                Philip Korn
                         Dept. of Computer Science
                           University of Maryland

                     Thursday, February 12th, 4:00 pm
                        (Coffee served at 3:45 pm)
                          Seminar Room / MCS 135
-------------------------------------------------------------------------------

This talk focuses on similarity searching and data mining in large
multimedia databases.

Similarity search involves the retrieval of multimedia objects (e.g.,
images, time series) that are most "similar" to a query object, for
example: `Find images from a given collection of X-Rays that contain a
nodule similar to the given tumor shape,' and `Find all stocks with
movement similar to that of IBM.' Using concepts from mathematical
morphology and tools from state-of-the-art indexing, we developed a system
that efficiently searches for similar tumor shapes while attaining correct
output (i.e., no false dismissals). The system is 27 times faster than
sequential scanning, and exhibits excellent precision (80%) at perfect
recall (100%).

The second part of the talk examines data mining. The goal is to support
ad hoc queries on large data matrices that might not fit on disk. Such a
matrix could have, e.g., customers for rows and days of the year for
columns, with each cell value representing the amount spent on products.
The target queries are single-cell queries ('Find the amount spent by
Smith on 1/1/96') and aggregate queries ('Find the sales of customers from
New York on December 1st'). We propose a compression format that permits
random access, and thus efficiently supports ad hoc queries.
Towards this end, we developed SVDD, a novel lossy compression method for
very large data matrices, which reduces the matrix to 2% of the original
space (i.e., a 50:1 compression ratio) and achieves 0.5% reconstruction
error, as experiments on real data (e.g., AT&T customer sales) showed.

(This is a faculty candidate colloquium.)

Host: Wayne Snyder (snyder@cs)
-------------------------------------------------------------------------------
For colloquium info, including directions, see
http://cs-www.bu.edu/colloquium
-------------------------------------------------------------------------------