Please use this identifier to cite or link to this item: http://dspace.cas.upm.edu.ph:8080/xmlui/handle/123456789/2681
Full metadata record
DC Field | Value | Language
dc.contributor.author | Bachini, Tristan Paul | -
dc.date.accessioned | 2024-05-06T05:25:07Z | -
dc.date.available | 2024-05-06T05:25:07Z | -
dc.date.issued | 2023-06 | -
dc.identifier.uri | http://dspace.cas.upm.edu.ph:8080/xmlui/handle/123456789/2681 | -
dc.description.abstract | Breast cancer remains the most frequent of all cancers and a major cause of death among women worldwide. A major reason why breast cancer diagnosis from Fine Needle Aspiration results still relies on manual review by doctors is the lack of explainability of traditional black-box machine learning models. This paper aims to incorporate a simple web user interface and explainability through the LIME Python package. The performance of four machine learning models (K-Nearest Neighbors, Logistic Regression, Random Forest, and Support Vector Machine) was compared using the metrics (accuracy, precision, F1-score, and area under the curve) produced when predicting breast cancer diagnosis, and by their applicability with the LIME Python package. The four models were applied to the Breast Cancer Wisconsin Diagnostic Dataset under 10 different configurations: a) scaling only; b) scaling, then random oversampling; c) scaling, random oversampling, then feature extraction; d) scaling, then feature extraction; e) scaling, feature extraction, then random oversampling. Configurations f-j are similar, except that they do not include scaling. The results show that, in terms of both metrics and applicability with LIME, random forest with random oversampling produced the best results. As such, random forest with random oversampling was the model and configuration chosen for the web application. | en_US
dc.subject | LIME | en_US
dc.subject | Random oversampling | en_US
dc.subject | Accuracy | en_US
dc.subject | Precision | en_US
dc.subject | f1-score | en_US
dc.subject | Area-under-curve | en_US
dc.subject | Explainability | en_US
dc.subject | Support vector machine | en_US
dc.subject | Logistic regression | en_US
dc.subject | Random forest | en_US
dc.subject | K-nearest-neighbors | en_US
dc.subject | Fine needle aspiration | en_US
dc.title | Machine Learning-Driven Breast Cancer Diagnosis Software Integrated with Explainable Artificial Intelligence Based on Fine Needle Aspirate Findings | en_US
dc.type | Thesis | en_US
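
The ten preprocessing configurations named in the abstract can be spelled out as ordered lists of steps. The sketch below is illustrative only and is not taken from the thesis: the step names, the function names build_configurations and random_oversample, and the pairing of f-j with a-e in order are all assumptions. The oversampler is the generic textbook technique (duplicating randomly chosen minority-class samples until the classes are balanced), written with the standard library only; the abstract does not say which implementation the thesis used.

```python
import random
from collections import Counter

# Configurations a-e as listed in the abstract (ordered preprocessing steps).
# The step names are illustrative labels, not identifiers from the thesis.
SCALED_CONFIGS = {
    "a": ["scaling"],
    "b": ["scaling", "random_oversampling"],
    "c": ["scaling", "random_oversampling", "feature_extraction"],
    "d": ["scaling", "feature_extraction"],
    "e": ["scaling", "feature_extraction", "random_oversampling"],
}


def build_configurations():
    """Return all ten configurations, a-j, as ordered lists of step names."""
    configs = {label: list(steps) for label, steps in SCALED_CONFIGS.items()}
    # ASSUMPTION: f-j pair with a-e in order, with the scaling step removed.
    for scaled, unscaled in zip("abcde", "fghij"):
        configs[unscaled] = [s for s in SCALED_CONFIGS[scaled] if s != "scaling"]
    return configs


def random_oversample(rows, labels, seed=0):
    """Balance classes by duplicating randomly chosen minority-class rows.

    Generic random oversampling; not necessarily the thesis's implementation.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        # Draw with replacement until this class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels
```

Under this reading, configuration f applies no preprocessing at all and configuration g applies random oversampling alone. In practice, oversampling is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data.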
Appears in Collections: Computer Science SP

Files in This Item:
File | Description | Size | Format
CD-CS105.pdf | | 2.2 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.