Normalized Cuts without Eigenvectors: A Multilevel
Approach
Inderjit S. Dhillon
University of Texas at Austin
Graph partitioning is widely used in scientific computing and parallel
processing.
Typically the goal is to partition the nodes of a graph so that the
graph cut
between the partitions is minimized while constraining the partitions
to be
(nearly) equal in size. Recently, graph partitioning (also called graph
clustering)
has been used in varied data clustering problems in data mining and
machine learning.
However, in these applications, there is no inherent need for the graph
partitions
to be equal in size. Hence weighted graph cuts, such as Normalized
Cuts, have been
used as the objectives to be minimized. Typically, spectral clustering
algorithms
that require eigenvector computations are used for this task. In this
talk, we first
show a theoretical equivalence between the objective function of
minimizing
weighted graph cuts, and the objective function used in the weighted
kernel k-means
clustering formulation. However, kernel k-means is prone to local
minima and
initialization problems. To avoid these problems, we have developed a
multilevel
approach for efficient and robust minimization of weighted graph cuts
that exploits
the above equivalence. Experimental results on large data sets show the
effectiveness
of the multilevel approach.