| Ovarian cancer is the second most dangerous gynecologic cancer with a high mortality
rate. Classifying high-dimensional, small-sample gene expression data is a challenging
task. The discovery of miRNAs, small non-coding RNAs of 18–25 nucleotides that regulate
gene expression, has revealed a new layer of gene regulation, and miRNAs have been reported
to play an important role in cancer. Using LASSO and Elastic Net as embedded feature
selection techniques, the present study identified 10 miRNAs that were differentially
regulated in ovarian cancer serum samples compared with non-cancer samples in the publicly
available dataset GSE106817: hsa-miR-5100, hsa-miR-6800-5p, hsa-miR-1233-5p,
hsa-miR-4532, hsa-miR-4783-3p, hsa-miR-4787-3p, hsa-miR-1228-5p, hsa-miR-1290,
hsa-miR-3184-5p, and hsa-miR-320b. Further, we implemented state-of-the-art machine
learning classifiers, such as logistic regression, random forest, artificial neural network,
XGBoost, and decision trees, to build clinical prediction models. Next, the diagnostic
performance of these models with the identified miRNAs was evaluated on the internal
(GSE106817) and external (GSE113486) validation datasets by ROC analysis. The
results showed that the first four prediction models (logistic regression, random forest,
artificial neural network, and XGBoost) consistently yielded an AUC of 100%.
Our findings provide significant evidence that the serum miRNA profile represents a
promising diagnostic biomarker for ovarian cancer. |
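
The workflow summarized above (embedded feature selection with a LASSO or Elastic Net penalty, followed by classifier training and ROC analysis) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's actual code: the data are synthetic stand-ins rather than GSE106817 or GSE113486, a held-out split plays the role of the validation set, and the standardization step, the penalty strength `C=0.1`, the helper name `embedded_selector`, and the choice to keep exactly 10 features by coefficient magnitude are all assumptions made for the example. Only two of the five classifiers are shown for brevity.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a high-dimensional, small-sample expression matrix
# (NOT the GEO data): 200 samples, 500 features, 10 of them informative.
X, y = make_classification(n_samples=200, n_features=500, n_informative=10,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)


def embedded_selector(l1_ratio, max_features=10):
    """Hypothetical helper: penalized logistic regression as an embedded selector.

    l1_ratio=1.0 gives a pure L1 (LASSO-style) penalty; values between 0 and 1
    give an Elastic Net penalty. With threshold=-inf, SelectFromModel keeps the
    max_features features with the largest absolute coefficients.
    """
    model = LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=l1_ratio, C=0.1, max_iter=5000,
                               random_state=0)
    return SelectFromModel(model, max_features=max_features, threshold=-np.inf)


classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

aucs = {}
n_selected = {}
for name, clf in classifiers.items():
    # Scaling and selection are fitted on the training split only, so the
    # held-out samples never influence which features are chosen.
    pipe = make_pipeline(StandardScaler(), embedded_selector(l1_ratio=0.5), clf)
    pipe.fit(X_train, y_train)
    n_selected[name] = int(pipe.named_steps["selectfrommodel"].get_support().sum())
    scores = pipe.predict_proba(X_test)[:, 1]
    aucs[name] = roc_auc_score(y_test, scores)
    print(f"{name}: {n_selected[name]} features, AUC = {aucs[name]:.3f}")
```

Wrapping the selector and the classifier in one pipeline is a design choice for the sketch: it keeps feature selection inside the training step, which avoids leaking information from the evaluation samples into the selected feature panel.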