Genomic data Analysis System using GenoSync based on SQL in Distributed Environment

  • Seine Jang (Graduate School of Smart Convergence, Kwangwoon University)
  • Seok-Jae Moon (Graduate School of Smart Convergence, Kwangwoon University)
  • Received : 2024.07.19
  • Accepted : 2024.07.31
  • Published : 2024.09.30

Abstract

Genomic data plays a transformative role in medicine, biology, and forensic science, offering insights that drive advancements in clinical diagnosis, personalized medicine, and crime scene investigation. Despite its potential, the integration and analysis of diverse genomic datasets remain challenging due to compatibility issues and the specialized nature of existing tools. This paper presents the GenomeSync system, designed to overcome these limitations by utilizing the Hadoop framework for large-scale data handling and integration. GenomeSync enhances data accessibility and analysis through SQL-based search capabilities and machine learning techniques, facilitating the identification of genetic traits and the resolution of forensic cases. By pre-processing DNA profiles from crime scenes, the system calculates similarity scores to identify and aggregate related genomic data, enabling accurate prediction models and personalized treatment recommendations. GenomeSync offers greater flexibility and scalability, supporting complex analytical needs across industries. Its robust cloud-based infrastructure ensures data integrity and high performance, positioning GenomeSync as a crucial tool for reliable, data-driven decision-making in the genomic era.
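The abstract does not say which similarity measure or SQL engine the system uses. As a rough illustration only, the sketch below assumes Jaccard similarity over (locus, allele) pairs and uses an in-memory SQLite database in place of a Hadoop-backed SQL layer. The table name `alleles`, the profile IDs, and the STR values are all hypothetical.

```python
import sqlite3

# Hypothetical STR profiles: locus -> pair of allele repeat counts.
# These values are made up for illustration, not taken from the paper.
PROFILES = {
    "scene_01": {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)},
    "ref_A": {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24)},
    "ref_B": {"D3S1358": (15, 16), "vWA": (17, 18), "FGA": (20, 22)},
}


def build_db(profiles):
    """Pre-process profiles into one row per distinct (profile, locus, allele)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alleles (profile_id TEXT, locus TEXT, allele INTEGER)"
    )
    rows = [
        (pid, locus, allele)
        for pid, loci in profiles.items()
        for locus, pair in loci.items()
        for allele in set(pair)  # a homozygous locus contributes a single row
    ]
    conn.executemany("INSERT INTO alleles VALUES (?, ?, ?)", rows)
    return conn


def similarity_scores(conn, query_id):
    """Rank all other profiles by Jaccard similarity to the query profile.

    Jaccard is an assumption made for this sketch: shared (locus, allele)
    pairs divided by the size of the union of both profiles' pairs.
    """
    sql = """
    WITH q AS (
        SELECT locus, allele FROM alleles WHERE profile_id = :qid
    ),
    sizes AS (
        SELECT profile_id, COUNT(*) AS n FROM alleles GROUP BY profile_id
    ),
    shared AS (
        SELECT a.profile_id, COUNT(*) AS n
        FROM alleles a
        JOIN q ON a.locus = q.locus AND a.allele = q.allele
        WHERE a.profile_id != :qid
        GROUP BY a.profile_id
    )
    SELECT s.profile_id,
           COALESCE(sh.n, 0) * 1.0
           / (s.n + (SELECT COUNT(*) FROM q) - COALESCE(sh.n, 0)) AS score
    FROM sizes s
    LEFT JOIN shared sh ON sh.profile_id = s.profile_id
    WHERE s.profile_id != :qid
    ORDER BY score DESC, s.profile_id
    """
    return conn.execute(sql, {"qid": query_id}).fetchall()


if __name__ == "__main__":
    conn = build_db(PROFILES)
    for profile_id, score in similarity_scores(conn, "scene_01"):
        print(f"{profile_id}: {score:.2f}")
```

Keeping the scoring in a single SQL query mirrors the SQL-based search idea in the abstract. In a distributed deployment the same query shape could in principle run on a SQL-on-Hadoop engine, but that mapping is an assumption rather than something the abstract confirms.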

Acknowledgement

This paper was supported by the KwangWoon University Research Grant of 2024.
