Please use this identifier to cite or link to this item:
http://dspace.cas.upm.edu.ph:8080/xmlui/handle/123456789/2681
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bachini, Tristan Paul | - |
dc.date.accessioned | 2024-05-06T05:25:07Z | - |
dc.date.available | 2024-05-06T05:25:07Z | - |
dc.date.issued | 2023-06 | - |
dc.identifier.uri | http://dspace.cas.upm.edu.ph:8080/xmlui/handle/123456789/2681 | - |
dc.description.abstract | Breast cancer remains the most frequent type of cancer and a major cause of death in women worldwide. Diagnosis of breast cancer from Fine Needle Aspiration results is still made after manual review by doctors, largely because traditional black-box machine learning models lack explainability. This paper aims to provide a simple web user interface together with explainability through the LIME Python package. Four machine learning models (K-Nearest Neighbors, Logistic Regression, Random Forest, and Support Vector Machine) were compared on the metrics they produced when predicting breast cancer diagnosis (accuracy, precision, F1-score, and area under the curve) and on their applicability with the LIME Python package. The four models were applied to the Breast Cancer Wisconsin Diagnostic Dataset under 10 configurations: a) scaling only; b) scaling, then random oversampling; c) scaling, random oversampling, then feature extraction; d) scaling, then feature extraction; e) scaling, feature extraction, then random oversampling. Configurations f-j mirror a-e but omit scaling. The results show that, in terms of both metrics and applicability with LIME, random forest with random oversampling produced the best results. Accordingly, random forest with random oversampling was the model and configuration chosen for the web application. | en_US |
dc.subject | LIME | en_US |
dc.subject | Random oversampling | en_US |
dc.subject | Accuracy | en_US |
dc.subject | Precision | en_US |
dc.subject | f1-score | en_US |
dc.subject | Area-under-curve | en_US |
dc.subject | Explainability | en_US |
dc.subject | Support vector machine | en_US |
dc.subject | Logistic regression | en_US |
dc.subject | Random forest | en_US |
dc.subject | K-nearest-neighbors | en_US |
dc.subject | Fine needle aspiration | en_US |
dc.title | Machine Learning-Driven Breast Cancer Diagnosis Software Integrated with Explainable Artificial Intelligence Based on Fine Needle Aspirate Findings | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | Computer Science SP |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
CD-CS105.pdf | - | 2.2 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.