COLLOQUIUM
Computer Science Department, Boston University

Speakers: Mike Malyutov and Irosha Wickromasinghe, Northeastern University
Date: Wednesday, April 27
Time: 14:00
Place: Room MCS 135, 111 Cummington Street (for directions, see www.cs.bu.edu/colloquium)

Title: Conditional Complexity of Compression (CCC) application for attributing texts

Abstract: We use an empirical CCC minimization principle for the focused attribution of texts among a few candidate authors, for each of whom a large corpus of texts is available. CCC approximates the approach to complexity of Kolmogorov et al. It has recently been applied to constructing phylogenetic trees of species or languages and to clustering music. Some asymptotic results will be mentioned.

The novelty of our approach lies in partitioning the disputed text into various numbers of approximately equal parts and comparing the statistics of the lengths of their compressed versions, where the compressor is trained on works of the competing authors that are close in style and topic, after adequate preprocessing of the texts. When the parts are small, the variance dominates; for larger parts, a bias appears because the compressor adapts to the disputed text itself. Various compressors show comparable performance, while spelling errors can distort the analysis, which otherwise seems amazingly accurate in the famous cases studied.

Host: Leonid Levin
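
For readers unfamiliar with the idea, the partition-and-compare scheme described in the abstract can be sketched roughly as follows. This is an illustration only, not the speakers' implementation. It assumes that CCC of a part given a training corpus is estimated as the growth in compressed length when the part is appended to the corpus, that zlib stands in for the compressor, and that the candidate with the smallest mean CCC is chosen. All function names are hypothetical.

```python
import zlib
from statistics import mean
from typing import Dict, List


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ccc(part: bytes, training: bytes) -> int:
    """Assumed CCC estimate: extra compressed bytes needed for `part`
    once the compressor has already seen `training`."""
    return compressed_len(training + part) - compressed_len(training)


def split_parts(text: bytes, k: int) -> List[bytes]:
    """Partition `text` into k contiguous parts of approximately equal size."""
    n = len(text)
    bounds = [i * n // k for i in range(k + 1)]
    return [text[bounds[i]:bounds[i + 1]] for i in range(k)]


def part_cccs(disputed: bytes, training: bytes, k: int) -> List[int]:
    """CCC of each of the k parts of the disputed text, given one corpus."""
    return [ccc(p, training) for p in split_parts(disputed, k)]


def attribute(disputed: bytes, corpora: Dict[str, bytes], k: int) -> str:
    """Return the candidate whose corpus gives the smallest mean CCC."""
    scores = {author: mean(part_cccs(disputed, corpus, k))
              for author, corpus in corpora.items()}
    return min(scores, key=scores.get)
```

Varying k and looking at the spread of the per-part values returned by part_cccs is one way to observe the trade-off between small and large parts mentioned in the abstract. Note that zlib's DEFLATE window is 32 KiB, so only the tail of a large training corpus influences the result in this sketch; a serious experiment would need a compressor with a longer memory.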