Authors: Hansen, Pierre
Brimberg, Jack
Urošević, Dragan 
Mladenović, Nenad 
Affiliations: Mathematical Institute of the Serbian Academy of Sciences and Arts 
Title: Solving large p-median clustering problems by primal-dual variable neighborhood search
Journal: Data Mining and Knowledge Discovery
Volume: 19
Issue: 3
First page: 351
Last page: 375
Issue Date: 1-Dec-2009
Rank: M21a
ISSN: 1384-5810
DOI: 10.1007/s10618-009-0135-4
Data clustering methods are used extensively in the data mining literature to detect important patterns in large datasets in the form of densely populated regions in a multi-dimensional Euclidean space. Due to the complexity of the problem and the size of the dataset, obtaining quality solutions within reasonable CPU time and memory requirements becomes the central challenge. In this paper, we solve the clustering problem as a large scale p-median model, using a new approach based on the variable neighborhood search (VNS) metaheuristic. Using a highly efficient data structure and local updating procedure taken from the OR literature, our VNS procedure is able to tackle large datasets directly without the need for data reduction or sampling as employed in certain popular methods. Computational results demonstrate that our VNS heuristic outperforms other local search based methods such as CLARA and CLARANS even after upgrading these procedures with the same efficient data structures and local search. We also obtain a bound on the quality of the solutions by solving heuristically a dual relaxation of the problem, thus introducing an important capability to the solution process.
Keywords: Data clustering | Variable neighborhood search
Publisher: Springer Link

Show full item record


checked on May 18, 2024

Page view(s)

checked on May 9, 2024

Google ScholarTM




Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.