-----------------------------------------------------------------------
B O S T O N   U N I V E R S I T Y
Computer Science Department

C O L L O Q U I U M

Wednesday, April 24, 11:00 AM
(Coffee served at 10:45 AM)
Seminar Room / MCS 135

Probabilistic Model Structure from Data

Dimitris Margaritis
Carnegie Mellon University

Abstract

Probabilistic models are useful for modeling non-deterministic data
generation processes. Examples can be found in genetic domains, where
they represent gene expression interactions; in socio-economic domains,
where they represent stock market prices influenced by current events;
and in many others. The greatest problem in modeling such processes is
determining the structure of the model.

In my talk I will present work I have done at Carnegie Mellon towards
inferring the structure of a specific class of models called Bayesian
networks (BNs). I will present the GS ("grow-shrink") algorithm, which
uses conditional independence tests to determine the BN structure. I
will also present a statistical independence test that shows progress
towards a conditional independence test for domains with continuous
variables, a problem currently unsolved in its generality.

I will also show some results of an application that uses Bayesian
network models to answer count queries from very large databases. My
approach features constant time in the size of the database, a small
space overhead, and linear preprocessing time. Moreover, it is easily
parallelizable and can be readily used for data-mining purposes at no
extra cost.

Host: George Kollios
-------------------------------------------------------------------------
For colloquium info, including directions, see
http://cs-www.bu.edu/colloquium
-------------------------------------------------------------------------