The Evolution of Data Mining Techniques to Big Data Analytics: An Extensive Study with Application to Renewable Energy Data Analytics
Keywords:
Big Data Analytics, Data Mining, Renewable Energy, Wind Energy, Wind Farms
Abstract
Big data has recently become a buzzword, pushing researchers to extend existing data mining techniques to cope with the changing nature of data and to develop new analytic techniques. Big data analytic techniques now serve many domains. In this paper, we provide a comprehensive analysis and discussion of data mining techniques, studying the changes introduced to those that have been successfully developed into big data analytic techniques. The analysis also investigates why the remaining data mining techniques could not be evolved into big data analytics. A detailed study is also presented of the application of big data analytics in the field of renewable energy.
License
- Papers must be submitted on the understanding that they have not been published elsewhere (except in the form of an abstract or as part of a published lecture, review, or thesis) and are not currently under consideration by another journal published by any other publisher.
- It is also the authors' responsibility to ensure that articles emanating from a particular source are submitted with the necessary approval.
- The authors warrant that the paper is original and that they are the authors of the paper, except for material that is clearly identified as to its original source, with permission notices from the copyright owners where required.
- The authors ensure that all references have been checked carefully and that they are accurate in the text as well as in the list of references (and vice versa).
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
- The journal/publisher is not responsible for subsequent uses of the work. It is the author's responsibility to bring an infringement action if so desired by the author.