Abstract
Introduction. Lung adenocarcinoma is a prevalent form of lung cancer, and mutations in the epidermal growth factor receptor (EGFR) gene are known to play a crucial role in its pathogenesis. This study aimed to develop a machine-learning model to predict EGFR mutations in lung adenocarcinoma patients using clinical and radiological features.
Methods. This case-control study used a dataset of 160 patients with lung adenocarcinoma. Several machine learning algorithms (decision tree, linear regression, Naive Bayes, support vector machine, K-nearest neighbor, and random forest) were used to predict EGFR mutations from the following variables: smoking status, tumor diameter, tumor location, bubble-like appearance on CT scan, air bronchogram on CT scan, and tumor distribution.
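To make the idea of predicting a binary mutation status from categorical clinical and radiological features more concrete, the following is a minimal sketch of one of the listed algorithms, a categorical Naive Bayes classifier with Laplace smoothing, written in plain Python. It is not the authors' implementation. The feature encoding, the function names `fit` and `predict_proba`, and the toy records are all assumptions made for illustration; the records are invented and carry no clinical meaning.

```python
import math
from collections import Counter

# Hypothetical toy records: (smoking_status, bubble_like, air_bronchogram) -> label
# (1 = EGFR mutation, 0 = wild type). Invented values, NOT data from the study.
TRAIN = [
    (("never", "yes", "yes"), 1),
    (("never", "yes", "no"), 1),
    (("never", "no", "yes"), 1),
    (("former", "yes", "yes"), 1),
    (("current", "no", "no"), 0),
    (("current", "no", "yes"), 0),
    (("former", "no", "no"), 0),
    (("current", "yes", "no"), 0),
]


def fit(records, alpha=1.0):
    """Count classes and per-feature values; alpha is the Laplace smoothing constant."""
    class_counts = Counter(label for _, label in records)
    n_features = len(records[0][0])
    # value_counts[label][feature_index][value] -> number of occurrences
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    # Distinct values observed per feature, needed for the smoothing denominator.
    vocab = [set() for _ in range(n_features)]
    for features, label in records:
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return {
        "class_counts": class_counts,
        "value_counts": value_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n": len(records),
    }


def predict_proba(model, features):
    """Return posterior class probabilities under the naive independence assumption."""
    log_scores = {}
    for c, n_c in model["class_counts"].items():
        log_p = math.log(n_c / model["n"])  # class prior
        for i, value in enumerate(features):
            count = model["value_counts"][c][i][value]
            denom = n_c + model["alpha"] * len(model["vocab"][i])
            log_p += math.log((count + model["alpha"]) / denom)
        log_scores[c] = log_p
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}


model = fit(TRAIN)
print(predict_proba(model, ("never", "yes", "yes")))
```

A continuous variable such as tumor diameter would need either binning or a Gaussian likelihood; the abstract does not say which approach the study took, so the sketch leaves it out.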
Results. Most study subjects were over 50 years old (83.75%) and female (53.13%). The random forest model demonstrated the best performance, achieving an accuracy of 83.33%, precision of 86.96%, recall of 80.00%, and an area under the curve (AUC) of 90.0%. The Naive Bayes model also performed well, with an accuracy of 85.42%, precision of 82.61%, recall of 86.36%, and an AUC of 91.0%.
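As a reminder of how accuracy, precision, and recall relate to a confusion matrix, the short sketch below computes them from raw counts. The abstract does not report confusion matrices or the size of the test set, so the counts used here are hypothetical: they were chosen only because they reproduce the reported percentages, and each set of counts is an independent illustration rather than a reconstruction of the study's data. The function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all cases
        "precision": tp / (tp + fp),     # true positives over predicted positives
        "recall": tp / (tp + fn),        # true positives over actual positives
    }


# Hypothetical counts (NOT from the study) that match the reported figures.
examples = {
    "random forest": classification_metrics(tp=20, fp=3, fn=5, tn=20),
    "Naive Bayes": classification_metrics(tp=19, fp=4, fn=3, tn=22),
}

for name, metrics in examples.items():
    formatted = ", ".join(f"{k} {100 * v:.2f}%" for k, v in metrics.items())
    print(f"{name}: {formatted}")
```

The AUC cannot be recovered from a single confusion matrix, because it depends on how the model ranks cases across all decision thresholds.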
Conclusions. The study highlights the potential of machine learning techniques, particularly random forest and Naive Bayes, in accurately predicting EGFR mutations in lung adenocarcinoma patients based on readily available clinical and radiological features. These findings could contribute to the development of non-invasive, cost-effective, and efficient tools for EGFR mutation detection, ultimately facilitating personalized treatment approaches for lung adenocarcinoma patients.
Keywords: Lung adenocarcinoma, early detection, Indonesia, EGFR mutation, prevention
Recommended Citation
Njoto, Edwin Nugroho; Pamungkas, Yuri; Putri, Atina I.W.; Haykal, Muhammad Najib; Eljatin, Dwinka Syafira; and Djaputra, Edith Maria (2024) "Analisis Prediktif Mutasi EGFR pada Adenokarsinoma Paru Menggunakan Pendekatan Pembelajaran Mesin," Jurnal Penyakit Dalam Indonesia: Vol. 11: Iss. 4, Article 6.
DOI: 10.7454/jpdi.v11i4.1641
Available at: https://scholarhub.ui.ac.id/jpdi/vol11/iss4/6