Authors: | Hansen, Pierre Brimberg, Jack Urošević, Dragan Mladenović, Nenad |
Affiliations: | Mathematical Institute of the Serbian Academy of Sciences and Arts | Title: | Solving large p-median clustering problems by primal-dual variable neighborhood search | Journal: | Data Mining and Knowledge Discovery | Volume: | 19 | Issue: | 3 | First page: | 351 | Last page: | 375 | Issue Date: | 1-Dec-2009 | Rank: | M21a | ISSN: | 1384-5810 | DOI: | 10.1007/s10618-009-0135-4 | Abstract: | Data clustering methods are used extensively in the data mining literature to detect important patterns in large datasets in the form of densely populated regions in a multi-dimensional Euclidean space. Due to the complexity of the problem and the size of the dataset, obtaining quality solutions within reasonable CPU time and memory requirements becomes the central challenge. In this paper, we solve the clustering problem as a large scale p-median model, using a new approach based on the variable neighborhood search (VNS) metaheuristic. Using a highly efficient data structure and local updating procedure taken from the OR literature, our VNS procedure is able to tackle large datasets directly without the need for data reduction or sampling as employed in certain popular methods. Computational results demonstrate that our VNS heuristic outperforms other local search based methods such as CLARA and CLARANS even after upgrading these procedures with the same efficient data structures and local search. We also obtain a bound on the quality of the solutions by solving heuristically a dual relaxation of the problem, thus introducing an important capability to the solution process. |
Keywords: | Data clustering | Variable neighborhood search | Publisher: | Springer Link |
Show full item record
SCOPUSTM
Citations
66
checked on Dec 20, 2024
Page view(s)
22
checked on Dec 22, 2024
Google ScholarTM
Check
Altmetric
Altmetric
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.