Feature Selection Methods for Data Dimensionality Reduction

Authors

    Sara Dehghani Department of Computer Engineering, Yas.C., Islamic Azad University, Yasuj, Iran
    Razieh Malekhosseini * Department of Computer Engineering, Yas.C., Islamic Azad University, Yasuj, Iran malekhoseini.r@iau.ac.ir
    Karamollah Bagherifard Department of Computer Engineering, Yas.C., Islamic Azad University, Yasuj, Iran
    Seyed Hadi Yaghoubyan Department of Computer Engineering, Yas.C., Islamic Azad University, Yasuj, Iran

Keywords:

Big data, Feature clustering, Feature selection, Optimization algorithms, Classification

Abstract

Large-scale data platforms create opportunities but also pose significant computational challenges. One problem with high-dimensional data is that, in many cases, not all features are important for uncovering the knowledge hidden in the data. For this reason, dimensionality reduction remains an important topic in many areas of data mining. Feature selection is one effective way to reduce dimensionality: it selects a subset of the original features by eliminating irrelevant and redundant ones. This article analyzes and categorizes feature selection techniques from several perspectives. It then gives an overview of data clustering concepts and categorizes clustering algorithms. It also investigates the use of optimization algorithms in feature selection and presents methods based on this approach. Finally, it compares and analyzes feature selection methods, emphasizing their strengths and weaknesses.
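As a rough illustration of what eliminating irrelevant and redundant features can mean in practice, the sketch below implements a simple correlation-based filter. It is not taken from the article: the function names, the toy data, and both thresholds (`min_relevance`, `max_redundancy`) are assumptions chosen for the example, and Pearson correlation stands in for whatever relevance and redundancy measures a given method actually uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_features(columns, target, min_relevance=0.3, max_redundancy=0.9):
    """Greedy filter: keep relevant features, skip ones redundant with kept ones.

    columns: dict mapping a feature name to its list of values.
    target:  list of class labels or target values, same length as each column.
    Both thresholds are illustrative assumptions, not recommended settings.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    # Visit features from most to least relevant.
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    selected = []
    for name in ranked:
        if relevance[name] < min_relevance:
            continue  # irrelevant: weakly related to the target
        if any(abs(pearson(columns[name], columns[kept])) > max_redundancy
               for kept in selected):
            continue  # redundant: nearly duplicates an already selected feature
        selected.append(name)
    return selected


# Hypothetical toy data: "size_scaled" duplicates "size"; "noise" is unrelated to the target.
toy_columns = {
    "size": [1, 2, 3, 7, 8, 9],
    "size_scaled": [2, 4, 6, 14, 16, 18],
    "noise": [1, 5, 3, 3, 5, 1],
}
toy_target = [0, 0, 0, 1, 1, 1]

print(select_features(toy_columns, toy_target))
```

On this toy data the filter keeps only one of the two duplicated features and drops the unrelated one. A filter of this kind scores features without training a classifier, which is why it is cheap; the optimization-based approaches mentioned above would instead search over candidate subsets.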

Published

2026-07-01

Submitted

2025-09-02

Revised

2026-01-01

Accepted

2026-01-05

Section

Articles

How to Cite

Dehghani, S., Malekhosseini, R., Bagherifard, K., & Yaghoubyan, S. H. (2026). Feature Selection Methods for Data Dimensionality Reduction. Management Strategies and Engineering Sciences, 1-11. https://msesj.com/index.php/mses/article/view/336