Kan Yu

Author Identifier

Kan Yu

Date of Award


Document Type

Thesis - ECU Access Only


Edith Cowan University

Degree Name

Master of Science (Computer Science) Research


School of Science

First Supervisor

Jitian Xiao

Second Supervisor

Leisa Armstrong

Third Supervisor

Brad (Guicheng) Zhang


Congenital heart defects (CHDs) are the most common type of human congenital anomaly, affecting 0.8% to 1.2% of infants at birth and accounting for over 40% of prenatal deaths. Although establishing their exact aetiology remains a significant challenge, epigenetic modifications such as deoxyribonucleic acid (DNA) methylation are thought to contribute to the pathogenesis of CHDs.

We aimed to investigate the value of machine learning (ML) in enhancing CHD diagnosis, particularly for identifying susceptibility genes by exploring high-throughput DNA methylation data. The Illumina Human Methylation EPIC BeadChip was used to screen the genome-wide DNA methylation profiles of 24 infants diagnosed with CHDs and 24 healthy infants without heart disease. Primary preprocessing was conducted using the RnBeads and limma packages. The significantly differentially methylated CpG sites in the 660 genes with the lowest p-values were selected and further investigated using a random forest (RF) algorithm.
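The selection step described above can be illustrated with a minimal sketch. It is not the RnBeads/limma pipeline used in the study: it assumes a plain Welch t-statistic as a stand-in for limma's moderated statistics, ranks sites rather than genes, and runs on synthetic beta values. The names `welch_t`, `top_sites`, `simulate`, and `SHIFTED` are hypothetical and introduced only for illustration.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def top_sites(cases, controls, k):
    """Return indices of the k sites with the largest |t| between groups.

    cases, controls: lists of samples, each a list of per-site beta values.
    """
    n_sites = len(cases[0])
    scores = []
    for j in range(n_sites):
        t = welch_t([s[j] for s in cases], [s[j] for s in controls])
        scores.append((abs(t), j))
    scores.sort(reverse=True)  # largest absolute statistic first
    return [j for _, j in scores[:k]]


# Synthetic data (not from the study): 24 cases, 24 controls, 50 sites.
rng = random.Random(0)
N_SITES = 50
SHIFTED = {3, 17}  # hypothetical sites given a methylation shift in cases


def simulate(n, shifted):
    """Draw n samples of beta values clipped to [0, 1]."""
    samples = []
    for _ in range(n):
        row = []
        for j in range(N_SITES):
            mean = 0.7 if (shifted and j in SHIFTED) else 0.5
            row.append(min(1.0, max(0.0, rng.gauss(mean, 0.05))))
        samples.append(row)
    return samples


cases = simulate(24, True)
controls = simulate(24, False)
selected = top_sites(cases, controls, k=2)
print(sorted(selected))
```

In a real analysis the ranked sites would then be mapped to genes and passed to a classifier such as a random forest; that step is omitted here.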

After training, the RF classifiers were applied to a validation dataset of the testing samples and achieved an accuracy of 100%. Three genes (MIR663, FGF3 and FAM64A) were identified by the RF model as markers not only for diagnosing CHDs but also for predicting them, with an average sensitivity and specificity of 85% and 95%, respectively. This finding highlights that aberrant DNA methylation plays a significant role in the pathogenesis of CHDs and may offer a potential approach to understanding them. The sample size of the current study is limited, so future research may consider replicating and refining our key findings in large-scale studies.
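For readers unfamiliar with the metrics reported above, sensitivity is the fraction of true CHD cases that a classifier flags, and specificity is the fraction of controls it correctly clears. The sketch below computes both from predicted and true labels. The labels are hypothetical and are not the study's data; `sensitivity_specificity` is an illustrative helper name.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = case, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 4 cases and 4 controls, with one case missed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 1.0
```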