
Greedy Learning of Binary Latent Trees

2011

Article



Inferring latent structures from observations helps to model, and possibly also to understand, the underlying data-generating processes. Hierarchical latent class (HLC) models form a rich class of latent structures. Zhang (2004) proposed a search algorithm for learning such models that can find good solutions but is often computationally expensive. As an alternative, we investigate two greedy procedures. The BIN-G algorithm determines both the structure of the tree and the cardinality of the latent variables in a bottom-up fashion. The BIN-A algorithm first determines the tree structure using agglomerative hierarchical clustering and then determines the cardinality of the latent variables as in BIN-G. We show that, even when restricting ourselves to binary trees, we obtain HLC models of comparable quality to Zhang's solutions while being faster to compute. This claim is validated by a comprehensive comparison on several datasets. Furthermore, we demonstrate that our methods are able to estimate interpretable latent structures on real-world data with a large number of variables. Applied to a restricted version of the 20 newsgroups data, the learned models turn out to be related to topic models, and on data from the PASCAL Visual Object Classes (VOC) 2007 challenge we show how such tree-structured models help us understand how objects co-occur in images.
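
The structure step of BIN-A, as summarized in the abstract, can be illustrated with a minimal sketch. It assumes empirical mutual information as the pairwise similarity and average linkage between clusters; the abstract specifies neither, and the function names `mutual_information` and `agglomerative_binary_tree` are hypothetical. The sketch only builds the shape of a binary tree over the observed variables. It does not introduce latent variables, fit parameters, or choose cardinalities.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def agglomerative_binary_tree(columns):
    """Greedily merge the two most similar clusters until one tree remains.

    `columns` maps a variable name to its list of observed values.
    Returns a nested tuple such as (("a", "b"), "c").
    Similarity is average linkage over pairwise mutual information;
    this choice is an assumption made for illustration only.
    """
    names = list(columns)
    mi = {
        frozenset((u, v)): mutual_information(columns[u], columns[v])
        for u, v in combinations(names, 2)
    }
    # Each cluster is a pair: (subtree, tuple of leaf names under it).
    clusters = [(name, (name,)) for name in names]

    def similarity(a, b):
        pairs = [(u, v) for u in a[1] for v in b[1]]
        return sum(mi[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]
```

Calling `agglomerative_binary_tree` on a dictionary of binary columns returns nested pairs in which each internal pair marks a position where a latent variable could be placed. In a full HLC learner, a separate step would then select the cardinality of each such variable.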

Author(s): Harmeling, S. and Williams, CK.
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 33
Number (issue): 6
Pages: 1087-1097
Year: 2011
Month: June

Department(s): Empirical Inference
Bibtex Type: Article (article)

DOI: 10.1109/TPAMI.2010.145
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik


BibTeX

@article{6671,
  title = {Greedy Learning of Binary Latent Trees},
  author = {Harmeling, S. and Williams, CK.},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {33},
  number = {6},
  pages = {1087-1097},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = jun,
  year = {2011},
  month_numeric = {6}
}