F. Han and S. Zhu, “Bottom-Up/Top-Down Image Parsing with Attribute Grammar,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 59-73, Jan. 2009, doi: 10.1109/TPAMI.2008.65.
This paper presents a simple attribute graph grammar as a generative representation for man-made scenes, such as buildings, hallways, kitchens, and living rooms, and studies an effective top-down/bottom-up inference algorithm for parsing images in the process of maximizing a Bayesian posterior probability or equivalently minimizing a description length (MDL). Given an input image, the inference algorithm computes (or constructs) a parse graph, which includes a parse tree for the hierarchical decomposition and a number of spatial constraints. In the inference algorithm, the bottom-up step detects an excessive number of rectangles as weighted candidates, which are sorted in a certain order and activate top-down predictions of occluded or missing components through the grammar rules. In the experiment, we show that the grammar and top-down inference can largely improve the performance of bottom-up detection.
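
The equivalence between maximizing the posterior and minimizing a description length, stated in the abstract, can be spelled out in one short derivation. The notation is an assumption made here for illustration and is not taken from the abstract: I denotes the input image, G a candidate parse graph, and G* the selected parse graph.

```latex
% Assumed notation: I = input image, G = candidate parse graph.
\begin{align}
G^{*} &= \arg\max_{G} \; p(G \mid I) \\
      &= \arg\max_{G} \; p(I \mid G)\, p(G)
         && \text{(Bayes' rule; $p(I)$ does not depend on $G$)} \\
      &= \arg\min_{G} \; \bigl[ -\log p(I \mid G) - \log p(G) \bigr]
         && \text{(the logarithm is monotone)}
\end{align}
```

Under the usual coding interpretation, -log p(I | G) is the number of bits needed to encode the image given the parse graph, and -log p(G) is the number of bits needed to encode the parse graph itself, so the final line is a total description length.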