Data-Driven Leukemia Classification: Enhancing Diagnosis with AI

Sirajddola Nadaf 1 and Shivaji Lamani 2
1 Senior Lecturer, Government Polytechnic Vijayapur
2 Senior Lecturer, Government Polytechnic Vijayapur

https://aegaeum.com/index.php/volume-13-issue-03-2025/ Volume 17, Issue 03, March/2025 ISSN NO: 0886-9367 Page No:423

Abstract: Leukemia, a life-threatening blood cancer, disrupts the normal production and function of white blood cells, leading to severe health complications. Early and accurate diagnosis is crucial to improving patient outcomes, yet traditional diagnostic methods often fall short in detecting leukemia in its early stages. This research explores the potential of Artificial Intelligence (AI) and Machine Learning (ML) algorithms in enhancing leukemia detection. By analyzing key white blood cell (WBC) parameters, our AI-driven system not only determines the presence of leukemia but also classifies its specific type, such as Acute Lymphoblastic Leukemia (ALL) or Chronic Myeloid Leukemia (CML). Additionally, we compare several AI/ML models in terms of accuracy, precision, and recall, aiming to establish the most effective algorithm for leukemia classification. Our findings highlight the potential of AI in medical diagnostics, paving the way for faster, more reliable, and less invasive leukemia detection methods.

Keywords: Leukemia, AIML, Classification, Diagnosis

1. Introduction:

Leukemia is a malignant disorder that affects the hematopoietic system, leading to uncontrolled proliferation of abnormal white blood cells. These cancerous cells interfere with normal immune function, resulting in recurrent infections, anemia, and bleeding disorders. Early diagnosis is critical, as delays can lead to severe complications and reduced survival rates. However, conventional diagnostic techniques, such as blood tests and bone marrow biopsies, often fail to provide timely and precise identification of leukemia types.

With advancements in AI and ML, there is growing interest in leveraging computational methods for medical diagnostics (Tabasum et al. 2024). AI-driven models can analyze key blood parameters, including leucocyte count, hematocrit, monocyte count, and lymphocyte count, to detect leukemia with greater accuracy and efficiency. This study focuses on the development of an intelligent diagnostic system that not only identifies leukemia but also assigns each case to a diagnostic class; in addition to ALL and CML, these classes include Hodgkin Lymphoma (HL) and Non-Hodgkin Lymphoma (NHL).

Furthermore, this research presents a comparative analysis of multiple AI algorithms based on their diagnostic performance metrics, such as accuracy, precision, and recall. By integrating AI into leukemia detection, we aim to reduce diagnostic delays, minimize invasive procedures, and ultimately improve patient care. The study underscores the potential of AI in cancer diagnostics and precision medicine.

To address the challenges associated with leukemia diagnosis, we propose an AI-powered automated learning system designed to detect and classify leukemia types based on key white blood cell (WBC) parameters. This system uses Machine Learning (ML) algorithms to enhance diagnostic precision and reduce reliance on manual analysis. Section 2 presents the literature review, followed by Section 3, which outlines the methodology. Section 4 discusses the results and analysis, while Section 5 concludes the study.

2. Literature Review

In this section we briefly present the literature review. The following table presents an overview of research studies conducted by different scholars in the field of leukemia detection using AI and Machine Learning techniques. These studies explore diverse approaches, including Support Vector Machines (SVM), K-Means Clustering, Decision Tree Algorithms, and Hybrid Models, to enhance the accuracy and efficiency of leukemia diagnosis.
By analyzing blood smear images, gene expression data, and key blood parameters, these researchers have contributed significantly to improving early detection and classification of leukemia subtypes. This table provides a structured summary of their findings, highlighting the advancements made in automated and AI-driven leukemia detection methods.

Table 2.1 Leukemia Detection Using AI & Machine Learning: Research Insights

Sl. No | Research Title | Authors & Year | Brief Description
1 | Leukemia Image Segmentation Using K-Means Clustering Algorithm | Hannah Inbarani H., Ahmad Taher Azar, Jothi G (2020) | Uses K-means clustering to segment leukemia images, isolating and identifying leukemic cells for better diagnosis.
2 | Machine Learning-Based Classification of Leukemia from Blood Smear Images | Kokeb Dese, Hakkins Raj, Gelan Ayana (2021) | Implements ML techniques for blood smear image classification, improving accuracy in detecting leukemic cells.
3 | Detection of Blood Cancer (Leukemia) Using K-Means Algorithm | Ranjitha P, Sudharshan, Duth P (2021) | Investigates K-means clustering for identifying leukemic cells and enhancing early detection accuracy.
4 | Machine Learning-Based System for Automatic Detection of Leukemia Cancer | Supriya Mandal, Vani Daivajna, Rajagopalan V (2021) | Uses ML-driven Decision Tree classification for automatic and highly accurate leukemia detection.
5 | Blood Cancer Prediction Using Leukemia Microarray Gene Data and Hybrid Logistic Vector Trees Model | Vaibhav Rupapara, Furqan Rustam, Hina Fatima (2022) | Introduces a hybrid logistic vector trees model that combines logistic regression and decision trees for improved leukemia diagnosis.
6 | Innovative Approach for Leukemia Detection Using Microscopic Images | Anuj Sharma, Deepak Prashar, Faizan Ahmed Khan (2022) | Develops an automated image analysis system to isolate leukemic cells and improve blood cancer diagnosis accuracy.
7 | Artificial Intelligence Model for Leukemia Detection Using Decision Tree Algorithm | Mohammad Akter Hossain, K.M. Muzahidul Islam (2022) | Develops an AI-powered Decision Tree model to classify leukemia cases with high diagnostic accuracy.
8 | Classification for Leukemia Detection Using Multi-Class SVM Classifier | Pranav More, Rekha Sugandhi (2023) | Proposes a multi-class SVM for leucocyte detection and leukemia classification, improving early diagnosis efficiency.
9 | Classification of Peripheral Haemoglobin Smudge Pictures Using K-Means | Ahmed I. Saleh, H. Arafat Ali, Mohamed M. (2023) | Uses K-means clustering to classify peripheral blood smear images, automating blood cell classification for leukemia detection.
10 | Leukemia Detection Using Decision Tree Algorithm | Nakul Magotra (2023) | Implements Decision Tree classification for automated blood cell analysis, simplifying leukemia detection.
11 | An Efficient Acute Lymphocyte Leukemia Quantification in Blood Cell Images Using SVM Algorithm | Iskandarova S.N., Tulaganova F.K., Akbarova M.A., Khaitov Kh.S. (2024) | Uses SVM to quantify Acute Lymphoblastic Leukemia (ALL) in blood cell image analysis, achieving high accuracy in classifying leukemic cells.

3. Methodology

This approach focuses on predicting leukemia types using WBC parameters and evaluating machine learning algorithms to identify the most effective model. The system classifies each case as CML, ALL, NHL, HL, or benign based on five parameters: leucocyte, hematocrit, monocyte, lymphocyte, and eosinophil counts. The models are trained on a labeled dataset. To assess reliability, four algorithms (SVC, Decision Tree, K-Means, and Dense Forest) were compared using accuracy, precision, recall, and F1-score. By combining classification and algorithm evaluation, this method aims to develop an efficient tool for leukemia diagnosis. The entire methodology is illustrated in the following flowchart.

Figure 3.1 Summary of the methodology

3.1 Working of the Algorithms

1. Dense Forest: An advanced variation of the Random Forest algorithm, Dense Forest uses an ensemble of decision trees to improve classification accuracy. Instead of relying on traditional feature selection, it incorporates deep learning-inspired techniques to enhance decision-making. Each tree in the forest votes, and the majority decision determines the final classification, making it robust and efficient for complex datasets.

2. K-Means: A clustering algorithm that groups data points based on similarity. It assigns each data point to the nearest cluster center (centroid) and iteratively updates these centroids to minimize intra-cluster variance. In leukemia classification, K-Means helps identify patterns in WBC parameters by grouping similar cases together, making it useful for unsupervised learning tasks.

3. Decision Tree: A supervised learning method that classifies data by recursively splitting it based on feature values, forming a tree-like structure. It selects the most significant feature using criteria like Gini Index or Information Gain and creates decision nodes that lead to different branches, ultimately reaching leaf nodes that represent final classifications.
In leukemia diagnosis, the algorithm evaluates WBC parameters such as leucocyte and monocyte levels to systematically determine whether a case falls into categories like CML, ALL, NHL, HL, or benign. Decision Trees are interpretable and effective for structured medical data but require pruning to prevent overfitting.

4. Support Vector Machine (SVM): A supervised learning algorithm that finds the optimal hyperplane to separate different classes in high-dimensional space. SVM works by maximizing the margin between different categories, which supports better generalization to unseen data. In leukemia diagnosis, it classifies cases into specific leukemia types by mapping their WBC parameters into a higher-dimensional space using kernel functions.

4. Results and discussion

The table below shows a patient report with five basic CBC parameters.

Table 4.1 CBC Report of patient-1

Total Leucocyte count | Hematocrit | Lymphocyte | Monocyte | Eosinophils | Class
19200 | 38 | 21 | 3 | 0 | Non-Hodgkin Lymphoma

The output observed in the MATLAB IDE is shown in the figures below; each value was entered in turn to obtain the predicted result.

Figure 4.1 Input and Output of Leucocyte count
Figure 4.2 Input and Output of Hematocrit count
Figure 4.3 Input and Output of Lymphocyte count
Figure 4.4 Input and Output of Monocyte count
Figure 4.5 Input and Output of Eosinophil

We input all five CBC parameters one by one, and the output predicted the class as Non-Hodgkin Lymphoma (NHL). This indicates that the patient is likely affected by NHL. The diagnosis was medically confirmed, so the program's prediction was correct for this case.

4.1. Comparative analysis of algorithms

To compare the effectiveness of our algorithms, we use three key performance metrics: accuracy, precision, and recall.

Accuracy: We use the formula:
Accuracy = (Number of Correct Predictions) / (Total Number of Predictions)

We plotted the accuracy of each algorithm against the number of neighbors (k) to see how it performs. The graphs are shown below:

Figure 4.6 Accuracy of Dense Forest Algorithm
Figure 4.7 Accuracy of K-means Algorithm
Figure 4.8 Accuracy of Decision Tree
Figure 4.9 Accuracy for SVM

Precision: Precision indicates how accurate the algorithm is when it predicts a positive result. It is the ratio of correct positive predictions to the total number of positive predictions made.

The Precision Formula:
Precision = True Positives (correctly predicted) / (True Positives + False Positives)

Recall: Recall, also called sensitivity, gauges how well the algorithm detects all relevant instances (true positives) within a dataset. It is the ratio of correct positive predictions to the actual number of positive cases.
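The three ratios described in this section can be sketched in a few lines of code. The example below is only an illustration: it is written in Python rather than the MATLAB environment used in the study, the function names and label lists are hypothetical, and the one-vs-rest treatment of a single class as "positive" is an assumption, since the paper does not state how precision and recall were aggregated across the five classes.

```python
def accuracy(y_true, y_pred):
    """Correct predictions divided by total predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall with `positive` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against empty denominators (class never predicted / never present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for illustration only (not data from this study).
y_true = ["NHL", "NHL", "ALL", "CML", "NHL", "ALL", "HL", "NHL"]
y_pred = ["NHL", "ALL", "ALL", "CML", "NHL", "NHL", "HL", "CML"]

acc = accuracy(y_true, y_pred)
prec_nhl, rec_nhl = precision_recall(y_true, y_pred, positive="NHL")
print(f"accuracy={acc:.3f} precision(NHL)={prec_nhl:.3f} recall(NHL)={rec_nhl:.3f}")
```

With these made-up labels, two of the three NHL predictions are correct and two of the four true NHL cases are found, so precision and recall differ even though both are computed from the same true-positive count.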
The Recall Formula:
Recall = True Positives (correctly predicted) / (True Positives + False Negatives)

Figure 4.11 Precision and Recall of Dense Forest Algorithm
Figure 4.12 Precision and Recall of K-means Algorithm
Figure 4.13 Precision and Recall for Decision Tree Algorithm
Figure 4.14 Precision and Recall for SVM Algorithm

Table 4.2 Comparison of algorithms

Algorithm | Accuracy | Precision | Recall
Dense Forest | 86.67% | 87.00% | 86.67%
K-Means | 83.89% | 83.31% | 85.00%
Decision Tree | 81.67% | 79.44% | 81.67%
SVM | 71.67% | 57.12% | 70.00%

We compared four algorithms (Dense Forest, K-Means, Decision Tree, and SVM) using three metrics: accuracy, precision, and recall. Dense Forest outperformed the others on all three metrics, with 86.67% accuracy, 87.00% precision, and 86.67% recall, while SVM scored lowest on all three.

5. Conclusion:

Our study highlights the role of AI and machine learning in enhancing leukemia diagnosis by analyzing WBC parameters. We compared four algorithms (Dense Forest, K-Means, SVM, and Decision Tree) using accuracy, precision, and recall as evaluation metrics. Among them, Dense Forest outperformed the others, demonstrating the highest accuracy and reliability in classifying leukemia types. This result supports the potential of AI-assisted technologies in early detection, improved patient outcomes, and enhanced diagnostic support for pathologists.

REFERENCES

Ahmed I. Saleh, H. Arafat Ali, Mohamed M., "Classification of Peripheral Haemoglobin Smudge Pictures Using K-Means" (2023).

Anuj Sharma, Deepak Prashar, Faizan Ahmed Khan, "Innovative Approach for Leukemia Detection Hemoglobin Cancer Classification Using Microscopic

Iskandarova S.N., Tulaganova F.K., Akbarova M.A., Khaitov Kh.S., "An Efficient Acute Lymphocyte Leukemia Quantification in Blood Cell Images Using SVM Algorithm", International Journal of Theoretical and Applied Issues of Digital Technologies (2024).

Kokeb Dese, Hakkins Raj, Gelan Ayana, "Machine-Learning-Based Classification of Leukemia from Blood Smear Images" (2021).

Magotra, Nakul, "A Step Toward the Future: Using Machine Learning to Detect Leukemia" (2023).

Mohammad Akter Hossain, K.M. Muzahidul Islam, "Artificial Intelligence Model for Leukemia Detection Using Decision Tree Algorithm" (2022).

Supriya Mandal, Vani Daivajna, Rajagopalan V, "Machine Learning Based System for Automatic Detection of Leukemia Cancer" (2021).

Pranav More, Rekha Sugandhi, "Classification for Leukemia Detection Using Multi-Class SVM Classifier" (2023).

Vaibhav Rupapara, Furqan Rustam, Hina Fatima, "Blood Cancer Prediction Using Leukemia Microarray Gene Data and Hybrid Logistic Vector Trees Model" (2022).

Ranjitha P, Sudharshan, Duth P, "Detection of Blood Cancer - Leukemia Using K-Means Algorithm".

Hannah Inbarani H., Ahmad Taher Azar and Jothi G, "Leukemia Image Segmentation Using K-Means Clustering Algorithm" (2020).

Tabasum Guledgudd, Noorullah Shariff and S.A. Quadri, "A Comparative Study of K-Means, GMM, SVM, and Random Forest for Enhancing Machine Learning in Leukemia Diagnosis", African Journal of Biomedical Research, Vol. 27(4s), 4257-4268, 2024.

Tabasum Guledgudd, Noorullah Shariff, C., & Quadri, S. A., "A comprehensive review: State of art integrated technologies in IoHT applications", African Journal of Science, Technology, Innovation and Development, 17(1), 32-55. https://doi.org/10.1080/20421338.2024.2417447, 2024.

Tabasum Guledgudd, Noorullah Shariff, S.A. Quadri and Sayed Abdulhayan, "Leukemia Disease: Overview and Detection Approaches", Journal of Systems Engineering and Electronics (ISSN NO: 1671-1793), Vol. 34, Issue 6, pp. 126-142, 2024.