Abstract
Breast cancer is one of the leading causes of death among women worldwide. Accurate and early detection of breast cancer can improve patients' chances of long-term survival. However, traditional classification algorithms usually aim only to maximize classification accuracy and fail to take into account the differing misclassification costs between categories. In particular, the cost of missing a cancer case (false negative) is much higher than that of mislabeling a benign one (false positive).
To overcome this drawback and further improve the classification accuracy of breast cancer diagnosis, this work presents several machine learning algorithms: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM). All phases of the work that require data treatment and machine learning techniques are carried out with a single tool. In technical terms, the intended output of this work, which supports the objectives described above, is to identify the algorithms that classify the different types of breast cancer most efficiently.
The models are evaluated by computing the accuracy of each one. Random Forest achieved an accuracy of 0.9857 (98.57%), SVM 0.9714 (97.14%), Logistic Regression 0.9643 (96.43%), and Decision Tree 0.9571 (95.71%). In other words, Random Forest achieved the highest accuracy of the four models.
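The point made above, that accuracy alone ignores the unequal costs of false negatives and false positives, can be illustrated with a small sketch. Everything in it is an assumption introduced for illustration: the confusion counts are hypothetical and are not results from this work, and the 10:1 cost ratio, the `ConfusionCounts` class, and the `average_cost` helper are invented names and values, not the method used by the authors.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # malignant cases correctly flagged
    fn: int  # malignant cases missed (false negatives)
    fp: int  # benign cases flagged as malignant (false positives)
    tn: int  # benign cases correctly cleared

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases that were classified correctly."""
    return (c.tp + c.tn) / c.total


def average_cost(c: ConfusionCounts, cost_fn: float = 10.0, cost_fp: float = 1.0) -> float:
    """Mean misclassification cost per case.

    cost_fn and cost_fp are hypothetical weights; a 10:1 ratio is assumed
    here only to show the effect of penalizing missed cancers more heavily.
    """
    return (c.fn * cost_fn + c.fp * cost_fp) / c.total


# Hypothetical counts for two classifiers on 100 cases (50 malignant, 50 benign).
model_a = ConfusionCounts(tp=45, fn=5, fp=0, tn=50)  # more accurate, misses 5 cancers
model_b = ConfusionCounts(tp=50, fn=0, fp=7, tn=43)  # less accurate, misses none

for name, counts in [("model_a", model_a), ("model_b", model_b)]:
    print(f"{name}: accuracy={accuracy(counts):.2f}, "
          f"average cost={average_cost(counts):.2f}")
```

Under these assumed counts, the classifier with the higher accuracy also has the higher average cost, because all of its errors are false negatives. This is why a comparison based on accuracy alone may rank models differently from a cost-sensitive comparison.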