Genetic classification of various familial relationships using the stacking ensemble machine learning approaches

  • Su Jin Jeong (Statistics Support Part, Medical Science Research Institute, Kyung Hee University Medical Center)
  • Hyo-Jung Lee (Product Development HQ, Dong-A ST)
  • Soong Deok Lee (Department of Forensic Medicine, College of Medicine, Seoul National University)
  • Ji Eun Park (Department of Statistics, Korea University)
  • Jae Won Lee (Department of Statistics, Korea University)
  • Received: 2023.05.24
  • Accepted: 2024.01.12
  • Published: 2024.05.31


Familial searching is a useful technique in forensic investigations. Genetic information makes it possible to identify individuals, determine familial relationships, and obtain racial/ethnic information. The total number of shared alleles (TNSA) and likelihood ratio (LR) methods have traditionally been used for this purpose, and data-mining classification methods have recently been applied as well. However, these methods are difficult to apply when identifying familial relationships above the third degree (e.g., uncle-nephew and first cousins). We therefore propose applying a stacking ensemble machine learning algorithm to improve the accuracy of familial relationship identification. In an analysis of real data, meta-classifiers built with a stacking algorithm identified relationships more accurately than the traditional TNSA or LR methods and data mining techniques.
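
As a rough illustration of the TNSA statistic mentioned above, the following minimal sketch counts the alleles two individuals share at each locus and sums the counts across loci. It assumes each genotype is an unordered pair of alleles and that sharing is counted as a multiset overlap (0, 1, or 2 per locus). The function names `shared_alleles` and `tnsa` and the two example profiles are hypothetical and are not taken from the paper, so this is not necessarily the exact procedure the authors used.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Count alleles shared at one locus (0, 1, or 2).

    Each genotype is treated as a multiset, so a homozygote (22, 22)
    shares only one allele with a heterozygote (22, 24).
    """
    overlap = Counter(genotype_a) & Counter(genotype_b)
    return sum(overlap.values())


def tnsa(profile_a, profile_b):
    """Sum the shared-allele counts over the loci typed in both profiles."""
    common_loci = profile_a.keys() & profile_b.keys()
    return sum(
        shared_alleles(profile_a[locus], profile_b[locus])
        for locus in common_loci
    )


# Hypothetical genotypes, made up for illustration only.
person_1 = {"D8S1179": (12, 14), "TH01": (6, 9.3), "FGA": (22, 22)}
person_2 = {"D8S1179": (12, 13), "TH01": (6, 9.3), "FGA": (22, 24)}

print(tnsa(person_1, person_2))  # prints 4 (1 + 2 + 1)
```

A higher TNSA suggests a closer relationship, but the distributions for distant relatives and unrelated pairs overlap heavily, which is consistent with the difficulty the abstract describes for more distant relationships.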



This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00208882) and by the Korean National Police Agency [Project Name: Advancing the Appraisal Techniques of Forensic Entomology / Project Number: PR10-04-000-22].


