Machine Learning for Big Data Analytics: A Comprehensive Review

Authors

  • Tsendayush Erdenetsogt, University of the Potomac, USA
  • Mehtab Jamal, Gomal University, Pakistan

DOI:

https://doi.org/10.70445/gtst.2.2.2026.199-218

Keywords:

Machine learning, Big data analytics, Deep learning, Data Mining, Scalable systems, AI, Real time processing

Abstract

This review examines the convergence of machine learning and big data analytics, showing how machine learning supports the extraction of valuable insights from large-scale data. It covers the properties of big data, the principles of machine learning, and the scalable algorithms and data pipelines that run on distributed and cloud computing platforms. It surveys use cases in healthcare, finance, retail, social media, and smart cities, illustrating the broad reach of data science. It discusses challenges including scalability, data quality, privacy, and interpretability, as well as emerging areas such as federated learning, explainable AI, and edge computing. The review concludes that this integration is critical to future "smart, fast and fair" analytics.

Published

2026-04-25