Direct log-density gradient estimation with Gaussian mixture models and its application to clustering

Zhang Qi (1651211)


Some machine learning algorithms rely on accurate estimation of the gradient of a probability density. An example is mode-seeking clustering, which assigns cluster labels by associating data samples with their nearest modes (local maxima) of the density. We propose a method to estimate the gradient of the log-density that can be used for mode identification. Our work extends the least-squares log-density gradient (LSLDG), a method that directly estimates the gradient of the log-density without going through density estimation, by parameterizing it with Gaussian mixture models (GMMs). The advantage of our approach is that GMMs can capture correlation information in the gradient, which gives the model more flexibility to estimate locally correlated gradients and leads to higher clustering accuracy, especially on clusters with strong local correlation. The method is further extended to hierarchical clustering. We demonstrate its validity through experiments on both artificial data and benchmark datasets.
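
The sketch below illustrates only the mode-seeking step described in the abstract, assuming the log-density gradient has already been estimated. The function `log_density_gradient` is a stand-in computed analytically from a known toy Gaussian mixture; it is not the LSLDG-GMM estimator proposed in this work, and the step size, iteration count, and grouping tolerance are illustrative choices.

import numpy as np

# Toy mixture used only as a stand-in gradient source (not the proposed estimator):
# two isotropic Gaussian components with equal weights.
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
var = 0.5  # shared isotropic variance


def log_density_gradient(x):
    """Gradient of log p(x) for the toy isotropic GMM."""
    diffs = means - x                        # (K, d): mu_k - x
    sq = np.sum(diffs ** 2, axis=1)          # squared distances to component means
    resp = weights * np.exp(-0.5 * sq / var)
    resp /= resp.sum()                       # component responsibilities
    return (resp[:, None] * diffs).sum(axis=0) / var


def mode_seek(x, step=0.1, n_iter=200):
    """Gradient ascent on log p(x), driving a sample toward its nearest mode."""
    for _ in range(n_iter):
        x = x + step * log_density_gradient(x)
    return x


def cluster(samples, tol=0.5):
    """Assign the same label to samples whose ascent paths end near the same mode."""
    modes = np.array([mode_seek(x) for x in samples])
    labels = -np.ones(len(samples), dtype=int)
    found = []
    for i, m in enumerate(modes):
        for j, f in enumerate(found):
            if np.linalg.norm(m - f) < tol:
                labels[i] = j
                break
        else:
            found.append(m)
            labels[i] = len(found) - 1
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal([-2.0, 0.0], 0.5, (50, 2)),
                      rng.normal([2.0, 0.0], 0.5, (50, 2))])
    print(cluster(data))  # samples split into two groups, one per mode

In the proposed method, `log_density_gradient` would be replaced by the GMM-parameterized LSLDG estimate fitted to the data, so that no intermediate density estimate is required.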