Combination of Feature Selection and Learning Methods for IoT Data Fusion

Document Type: Research Article

Authors

Dept. of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract

In this paper, we propose five data fusion schemes for Internet of Things (IoT) scenarios:
Relief and Perceptron (Re-P), Relief and Genetic Algorithm Particle Swarm Optimization (Re-GAPSO),
Genetic Algorithm and Artificial Neural Network (GA-ANN), Rough and Perceptron (Ro-P),
and Rough and GAPSO (Ro-GAPSO). Each scheme consists of four stages: preprocessing
the data set by curve fitting, reducing the data dimension and identifying the most effective feature
sets according to data correlation, training a classification algorithm, and finally predicting new data
with the trained classifier. The five compound schemes are evaluated and compared with each other
using three metrics: Quality of Train (QoT), Accuracy (Ac), and Storage Capacity (SC).
While the Re-P scheme can only separate classes that are linearly separable,
Re-GAPSO is a dynamic method suited to constantly changing real-world problems.
GA-ANN, on the other hand, is a wrapper method and, unlike Relief, can adapt itself to the machine
learning algorithm. The Ro-P scheme is useful for analyzing vague and imprecise information and, unlike
GA-ANN, has a lower computational cost. Among the five schemes, Ro-GAPSO is a more precise scheme that
has a lower computational cost and does not become stuck in local minima. Experimental results show that
Re-P outperforms the other proposed schemes and existing methods in terms of computational time complexity.
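
To make the stage structure concrete, the following is a minimal sketch of the last three stages for a Re-P-style scheme (Relief feature weighting, then a perceptron). It is an illustration under stated assumptions, not the authors' implementation: the synthetic data, the function names (`relief_weights`, `train_perceptron`, `predict`), the choice of two selected features, and all parameter values are hypothetical. The curve-fitting preprocessing stage is omitted because the abstract does not specify the curve model.

```python
import numpy as np


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief: a feature gains weight when it differs on the
    nearest miss (other class) and loses weight when it differs on the
    nearest hit (same class)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # guard against constant features
    Xs = (X - X.min(axis=0)) / span            # scale each feature to [0, 1]
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                       # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n_iter


def train_perceptron(X, y, epochs=50, lr=1.0):
    """Rosenblatt perceptron for labels in {-1, +1}; it converges only when
    the classes are linearly separable."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:         # misclassified: update
                w += lr * yi * xi
                b += lr * yi
                errors += 1
        if errors == 0:                        # a full pass without mistakes
            break
    return w, b


def predict(X, w, b):
    return np.where(X @ w + b >= 0, 1, -1)


# Hypothetical sensor data: features 0 and 1 carry the class signal,
# features 2 to 4 are pure noise.
rng = np.random.default_rng(42)
n = 200
y = np.where(rng.random(n) < 0.5, -1, 1)
X = rng.normal(size=(n, 5))
X[:, 0] = 2.0 * y + 0.5 * rng.normal(size=n)
X[:, 1] = 2.0 * y + 0.5 * rng.normal(size=n)

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Stage 2: rank features with Relief and keep the two highest-weighted ones.
weights = relief_weights(X_train, y_train)
selected = np.sort(np.argsort(weights)[-2:])

# Stage 3: train the perceptron on the reduced feature set.
w, b = train_perceptron(X_train[:, selected], y_train)

# Stage 4: predict unseen data.
accuracy = float(np.mean(predict(X_test[:, selected], w, b) == y_test))
print("selected features:", selected.tolist(), "test accuracy:", accuracy)
```

The sketch also shows why Relief is a filter method: the feature weights are computed without consulting the classifier, so the same selected subset could be passed to any of the learning stages.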


