VNS-BASED ALGORITHMS FOR THE CENTROID-BASED CLUSTERING PROBLEM

Ivan P. Rozhnov; Victor I. Orlov; Lev A. Kazakovtsev

doi:10.22190/FUMI1905957R

VNS-BASED ALGORITHMS FOR THE CENTROID-BASED CLUSTERING PROBLEM

Ivan P. Rozhnov, Victor I. Orlov, Lev A. Kazakovtsev

DOI Number

https://doi.org/10.22190/FUMI1905957R

First page

957

Last page

972

Abstract

The k-means algorithm with the corresponding problem formulation is one of the ﬁrst methods that researchers use when solving a new automatic grouping (clus-tering) problem. Its improvement, modiﬁcation and combination with other algorithms are described in the works of many researchers. In this research, we propose new al-gorithms of the Greedy Heuristic Method, which use an idea of the search in variable neighborhoods for solving the classical cluster analysis problem, and allows us to obtain a more accurate and stable result of solving in comparison with the known algorithms. Our computational experiments show that the new algorithms allow us to obtain re-sults with better values of the objective function value (sum of squared distances) in comparison with classical algorithms such as k-means, j-means and genetic algorithms on various practically important datasets. In addition, we present the ﬁrst results for the GPU realization of the Greedy Heuristic Method.

Keywords

Greedy Heuristic; clustering problem; GPU; k-Means Problem; variable neighborhoods.

Full Text:

PDF

References

S. Vempala and G. Wang: A spectral algorithm for learning mixtures of distributions. FOCS. (2002), 841-860.

L. Kazakovtsev and A. Antamoshkin: Genetic Algorithm with Fast Greedy Heuristic for Clustering and Location Problems. Informatica, 38 (3) (2014), 229-240.

V. I. Orlov, D. V. Stashkov, L. A. Kazakovtsev, I. P. Rozhnov, O. B. Kazakovtseva and I. R. Nasyrov: Improved method of forming production batchs of electronic components with special quality requirements. Modern high technology. 1 (2018), 37-42.

H. Steinhaus: Sur la division des corps materiels en parties. Bull. Acad. Polon. Sci. Cl. III. IV (1956), 801-804.

S. P. Lloyd: Least Squares Quantization in PCM. IEEE Transactions on Information Theory. 28 (1982), 129-137.

J. B. MacQueen: Some Methods of Classication and Analysis of Multivariate Observations. Proceedings of the 5th Berkley Symposium on Mathematical Statistics and Probability, 1 (1967), 281-297.

V. I. Orlov, L. A. Kazakovtsev, I. P. Rozhnov, N. A. Popov and

V. V. Fedosov: neighbourhood search algorithm for k-means clustering. IOP Conf. Series: Materials Science and Engineering. 450 Article ID 022035 (2018), DOI:10.1088/1757-899X/450/2/022035.

B. B. Bhusare and S. M. Bansode: Centroids Initialization for K-Means Clustering using Improved Pillar Algorithm. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET). 3 Issue 4 (2014).

S. Wang: An Improved K-means Clustering Algorithm Based on Dissimilarity. International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC), Shenyang, China IEEE, (2013).

S. Mahmud, M. Rahman and N. Akhtar: Improvement of K-means clustering algorithm with better initial centroids based on weighted averagen. 7th International Conference on Electrical and Computer Engineering. IEEE. (2012), ISBN 9781467314367. DOI:10.1109/icece.2012.6471633.

K. A. Abdul Nazeer and M. P. Sebastian: Improving the Accuracy and Efficiency of the k-means Clustering Algorithm. Proceedings of the World Congress on Engineering I WCE 2009, July 1 - 3, 2009, London, U.K.,(2009).

P. Hansen and N. Mladenovic: J-Means: a new local search heuristic for minimum sum of squares clustering. Pattern Recognition. 34 Issue. 2,(2001) 405-413 DOI:10.1016/s0031-3203(99)00216-2.

A. Dempster, N. Laird and D. Rubin: Maximum likelihood estimation from incomplete data. Journal of the Royal Statistical Society, Series B. 39 (1977), 1-38.

D. V. Stashkov, V. I. Orlov, I. R. Nasyrov and L. A. Kazakovtsev: Application of the EM algorithm with a spherical Gaussian distribution to the problem of industrial product classification. Economics and Management Management Systems. 23 (1.1) (2017), 185-193.

D. V. Stashkov, M. N. Gudyma, L. A. Kazakovtsev, I. P. Rozhnov and V. I. Orlov: Algorithm for Series of Mixture Distribution Separation Problems. Reshetnevskie chteniya Krasnoyarsk, 1 (21) (2017), 327{328. (In Russ.).

P. Hansen and N. Mladenovic: Variable Neighborhood Search. Search Methodology /E.K.Bruke, G.Kendall [eds.]. Springer US. P. (2005), 211-238. DOI:10.1007/0-387-28356-0-8.

L. A. Kazakovtsev, A. A. Stupina and V. I. Orlov: Genetic algorithm modification with greedy heuristics for continuous allocation and classification problems. Management systems and information technology 2(56) (2014), 35-39.

R. Farahani, M. Hekmatfar and (eds.): Facility location: Concepts, models, algorithms and case studies. Berlin Heidelberg:Springer-Verlag. (2009), 429.

L. A. Kazakovtsev: The greedy heuristics method for systems of automatic grouping of objects. Diss ... Dr. tech. of science. Krasnoyarsk, (2016), 429.

L. A. Kazakovtsev, A. N. Antamoshkin and I. S. Masich : Fast Deterministic Algorithm for EEE Components Classification. IOP Conf. Series: Materials Science and Engineering. 94 (2015), article ID 012015, 10 P. DOI: 10.1088/1757-899X/04/1012015.

P. Hansen, J. Brimberg, D. Urosevic and N. Mladenovic: Solving large p-median clustering problems by primal dual variable neighborhood search. Data Mining and Knowledge Discovery. 19 (3) (2009), 351-375.

P. Hansen and N. Mladenovic. Neighborhood Search. Search Methodology /E.K. Bruke, G. Kendall [eds.]. Springer US. (2005), 211-238 DOI: 10.1007/0-387-28356-0-8.

P. Hansen and N. Mladenovic : Variable neighborhood search: principles and applications Eur. J. Oper. Res, 130 (2001), 449-467.

Y. Kochetov, N. Mladenovic and P. Hansen: Local search with alternating surroundings. Discrete. analysis and research oper. 2, 10:1 (2003), 11-43.

UCI Machine Learning Repository [http://archive.ics.uci.edu/ml], access date 28.03.2019.

Clustering basic benchmark [http://cs.joensuu./sipu/datasets], access date 28.03.2019.

W. Sheng, and X. Liu: A genetic k-medoids clustering algorithm. Journal of Heuristics. 12. 6. (2006). 447-466.

DOI: https://doi.org/10.22190/FUMI1905957R

Refbacks

There are currently no refbacks.

ISSN 2406-047X (Online)

Username
Password
Remember me