Detecting Housing Submarkets using Unsupervised Learning of Finite Mixture Models

Modeling housing prices has attracted considerable attention because of its importance for households' wealth and for public revenues through taxation. One of the main concerns raised in both the theoretical and the empirical literature is the existence of spatial association between prices, which can be attributed to, among other factors, unobserved neighborhood effects. In this paper, a model of spatial association for housing markets is introduced. Spatial association is treated in the context of spatial heterogeneity, which is explicitly modeled in both a global and a local framework. The global form of heterogeneity is incorporated in a Hedonic Price Index model that includes a nonlinear function of the geographical coordinates of each dwelling. The local form of heterogeneity is subsequently modeled as a Finite Mixture Model for the residuals of the Hedonic Index. The identified mixture components are treated as the different spatial housing submarkets. The main advantage of the approach is that submarkets are recovered from the housing price data rather than imposed by administrative or geographical criteria. The Finite Mixture Model is estimated using the Figueiredo and Jain (2002) approach because it identifies the number of submarkets endogenously and is computationally efficient enough to handle large datasets. The different submarkets are subsequently identified using the Maximum Posterior Mode algorithm. The overall ability of the model to identify spatial heterogeneity is validated through a set of simulations. The model is applied to Los Angeles County housing price data for the year 2002. The results suggest that the number of statistically identified submarkets, after taking into account the dwellings' structural characteristics, is considerably smaller than the number imposed by either geographical or administrative boundaries.
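
The residual-mixture step described above can be illustrated with a minimal sketch. It is not the paper's estimator: it runs plain EM on a one-dimensional Gaussian mixture with a fixed number of components `k`, whereas the Figueiredo and Jain (2002) approach selects the number of components endogenously. The synthetic residuals, the function names (`fit_mixture`, `map_labels`), and the reading of the final assignment as "pick the component with the highest posterior probability" are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(residuals, k, iters=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by plain EM (k is fixed here)."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Deterministic start: means at evenly spaced sample quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(residuals) / n
    spread = sum((r - centre) ** 2 for r in residuals) / n
    variances = [max(spread, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each residual.
        resp = []
        for r in residuals:
            dens = [w * normal_pdf(r, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from the posteriors.
        for j in range(k):
            nj = sum(row[j] for row in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(row[j] * r for row, r in zip(resp, residuals)) / nj
            var_j = sum(row[j] * (r - means[j]) ** 2
                        for row, r in zip(resp, residuals)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


def map_labels(residuals, weights, means, variances):
    """Assign each residual to the component with the largest posterior."""
    labels = []
    for r in residuals:
        post = [w * normal_pdf(r, m, v)
                for w, m, v in zip(weights, means, variances)]
        labels.append(max(range(len(post)), key=post.__getitem__))
    return labels


# Synthetic hedonic residuals from two hypothetical submarkets.
random.seed(0)
true_labels = [0] * 200 + [1] * 200
residuals = ([random.gauss(-2.0, 0.5) for _ in range(200)]
             + [random.gauss(2.0, 0.5) for _ in range(200)])

weights, means, variances = fit_mixture(residuals, k=2)
labels = map_labels(residuals, weights, means, variances)
```

In an application along the lines of the abstract, `residuals` would come from the fitted hedonic regression rather than from a random generator, and the number of components would be chosen by the estimation procedure instead of being passed in.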