Resolving CTGAN-based data imbalance for commercialization of public technology

공공기술 사업화를 위한 CTGAN 기반 데이터 불균형 해소

  • Received : 2021.11.22
  • Accepted : 2021.12.03
  • Published : 2022.01.31

Abstract

Commercialization of public technology is the transfer of government-led scientific and technological innovation and R&D results to the private sector, and it is recognized as a key driver of economic growth. To promote technology transfer, various machine learning methods are being studied to identify success factors or to match public technologies that have high commercialization potential with the companies that need them. However, public technology commercialization data is tabular and imbalanced, with a large gap between the numbers of success and failure cases, so machine learning performance on it is low. In this paper, we present a method that uses CTGAN to resolve the imbalance in tabular public technology data. To verify the effectiveness of the proposed method, we ran a comparative experiment against SMOTE, a statistical approach, using actual public technology commercialization data. In many of the experimental cases, we confirmed that the CTGAN-based approach reliably predicts public technology commercialization success cases.
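To make the SMOTE baseline concrete, here is a minimal sketch of SMOTE-style oversampling, which creates synthetic minority-class rows by interpolating between a row and one of its nearest minority neighbors. It is an illustration only, not the authors' implementation: the function name `smote_like_oversample`, the toy feature values, and the assumption that success is the minority class are all hypothetical. A CTGAN-based approach would instead train a conditional generator on the table and sample new rows from it, which is not shown here.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base row (excluding the row itself)
        neighbors = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: numeric feature rows for failed and successful cases.
failures = [[0.1, 5.0], [0.2, 4.5], [0.3, 5.5], [0.2, 6.0], [0.4, 5.2], [0.1, 4.8]]
successes = [[0.8, 1.0], [0.9, 1.5], [0.7, 1.2]]

# Generate enough synthetic successes to match the failure count.
new_rows = smote_like_oversample(successes, len(failures) - len(successes))
balanced_successes = successes + new_rows
print(len(failures), len(balanced_successes))
```

Because each synthetic row lies on a line segment between two real rows, this kind of interpolation only suits numeric columns. Tabular data with categorical columns is one setting where a generative model such as CTGAN may be a better fit.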
