Metagenomic data analysis with machine learning to discover colorectal cancer-associated enzymes

dc.contributor.authorErsoz, Nur Sebnem
dc.contributor.authorKuzudisli, Cihan
dc.contributor.authorYousef, Malik
dc.contributor.authorBakir-Gungor, Burcu
dc.date.accessioned2024-09-02T10:27:17Z
dc.date.available2024-09-02T10:27:17Z
dc.date.issued2024en_US
dc.departmentHKÜ, Faculty of Engineering, Department of Computer Engineeringen_US
dc.description.abstractThe human gut microbiome comprises over 10 trillion microbes and plays important roles in maintaining metabolism and body homeostasis and in shaping immune function. Metagenomics, which studies genomic data from clinical and environmental samples, is crucial for understanding the interplay between the host and the gut microbiome. Functional profiling of metagenomes has recently helped to identify alterations in microbial functions, particularly in enzyme-encoding genes. Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths. In this study, we aimed to find CRC-associated enzymes by analyzing metagenomic data with different machine learning methods. A total of 1262 samples, including CRC and control groups from different countries, were used. The dataset was obtained by functionally profiling the metagenomic data and estimating community-level enzyme commission (EC) abundance values. To analyze this dataset, RCE-IFE and SVM-RCE, which are group-based feature selection methods, were compared with six individual feature selection methods. Monte Carlo cross-validation with 10 repetitions was used in our experiments. RCE-IFE, Extreme Gradient Boosting, and Select K Best performed comparably and achieved the best results. Notably, in addition to its high performance, the group-based feature selection method RCE-IFE, unlike TFS, grouped enzymes into clusters and then identified biologically relevant CRC-associated enzymes. © 2024 IEEE.en_US
dc.identifier.citationErsoz N.S., Kuzudisli C., Yousef M. & Bakir-Gungor B. (2024). Metagenomic data analysis with machine learning to discover colorectal cancer-associated enzymes. 32nd IEEE Conference on Signal Processing and Communications Applications, SIU 2024 - Proceedings. https://doi.org/10.1109/SIU61531.2024.10601144.en_US
dc.identifier.doi10.1109/SIU61531.2024.10601144
dc.identifier.isbn979-835038896-1
dc.identifier.orcid0000-0003-4774-152Xen_US
dc.identifier.scopus2-s2.0-85200856780
dc.identifier.scopusqualityN/A
dc.identifier.urihttps://doi.org/10.1109/SIU61531.2024.10601144
dc.identifier.urihttps://hdl.handle.net/20.500.11782/4361
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherInstitute of Electrical and Electronics Engineers Inc.en_US
dc.relation.ispartof32nd IEEE Conference on Signal Processing and Communications Applications, SIU 2024 - Proceedings
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/restrictedAccessen_US
dc.subjectcolorectal cancer diagnosisen_US
dc.subjectcommunity-level enzyme commission (EC) abundance valuesen_US
dc.subjectgrouping based feature selectionen_US
dc.subjectmachine learningen_US
dc.subjectmetagenomics data analysisen_US
dc.titleMetagenomic data analysis with machine learning to discover colorectal cancer-associated enzymes
dc.typeArticle

Files

Original bundle

Name: 101109SIU61531202410601144.pdf
Size: 418.04 KB
Format: Adobe Portable Document Format
Description: Article file
