This paper demonstrates how to use generative models as data mining tools. Our key insight is that a generative model learns an accurate model of its training data, and that this learned model can itself be analyzed to summarize and understand the data. This analysis-by-synthesis approach to data mining has two key advantages. First, it scales much better than traditional correspondence-based approaches because it does not require explicitly comparing all pairs of visual elements. Second, generative models can disentangle factors of variation within the training data (such as appearance vs. geographic location), which are nearly always entangled within the data itself. In this work, we train a location-conditioned diffusion model on worldwide street-view data to mine for geographical visual patterns. Using this model, we synthesize a dataset of parallel images depicting the same scene layouts across different locations. We then define typicality measures that assess how characteristic a visual element is of a geographic location, either for a specific country or in terms of cross-country variability.
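As an illustration of how such a typicality measure could be formulated (this particular form is an assumption for exposition, not necessarily the exact definition used in the paper): if conditioning the diffusion model on a location label c improves its denoising of an image x, then x is plausibly typical of c, so typicality can be scored as the expected reduction in denoising loss relative to a null (unconditional) prompt,

T(x \mid c) \;=\; \mathbb{E}_{t,\,\epsilon}\!\left[ \left\lVert \epsilon - \epsilon_\theta(x_t, t, \varnothing) \right\rVert^2 \;-\; \left\lVert \epsilon - \epsilon_\theta(x_t, t, c) \right\rVert^2 \right],

where x_t is the noised image at timestep t, epsilon_theta is the learned denoiser, and larger values of T indicate that x is more characteristic of location c.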