• Title/Summary/Keyword: industry/occupation automatic coding (산업/직업 자동코딩)

An Automated Industry and Occupation Coding System using Deep Learning (딥러닝 기법을 활용한 산업/직업 자동코딩 시스템)

  • Lim, Jungwoo; Moon, Hyeonseok; Lee, Chanhee; Woo, Chankyun; Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.4 / pp.23-30 / 2021
  • An automated industry and occupation coding system assigns statistical classification codes to the large volume of natural-language responses in which people describe their industry and occupation. Unlike previous studies that applied information retrieval, we propose a system that needs no index database and assigns the proper code regardless of the classification level. We also show that our model, which uses KoBERT, a deep learning model that achieves high performance on downstream natural language tasks, outperforms the baseline. Our method achieves 95.65% and 91.51% on occupation and industry code classification for the Population and Housing Census, respectively, and 97.66% on industry code classification for the Census on Basic Characteristics of Establishments. We also identify directions for future improvement through an error analysis covering both data and modeling.

An Automatic Coding System of Korean Standard Industry/Occupation Code Using Example-based Learning (예제기반의 학습을 이용한 한국어 표준 산업/직업 자동 코딩 시스템)

  • Lim Heui-Seok
    • The Journal of the Korea Contents Association / v.5 no.4 / pp.169-179 / 2005
  • Standard industry and occupation codes are usually assigned manually in the Korean census. Manual coding is a labor-intensive and expensive task. Furthermore, coding is inconsistent because it depends on the skill of the human experts and on their working environments. This paper proposes an automatic code classification system that converts natural-language responses on survey questionnaires into the corresponding numeric codes using a manually constructed rule base and example-based machine learning. The system was trained on 400,000 records to which standard codes had been assigned. It was evaluated with 10-fold cross-validation on three code sets: the population occupation set, the industry set, and the industry survey set. The proposed system showed 76.63%, 82.24%, and 99.68% accuracy on these code sets, respectively.
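
The example-based idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's system: the word-overlap (Jaccard) similarity, the weighted vote among the `k` nearest examples, the function names, and the toy examples with made-up codes are all assumptions chosen for illustration, and the rule base is omitted.

```python
from collections import Counter

def tokens(text):
    """Lowercased set of whitespace-separated words in a response."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def classify(response, examples, k=3):
    """Similarity-weighted vote among the k stored examples closest to the response.

    Returns None when no stored example shares a word with the response,
    so that such a record could be left for a human coder.
    """
    query = tokens(response)
    scored = sorted(
        ((jaccard(query, tokens(text)), code) for text, code in examples),
        reverse=True,
    )
    if not scored or scored[0][0] == 0.0:
        return None
    votes = Counter()
    for score, code in scored[:k]:
        votes[code] += score
    return votes.most_common(1)[0][0]

# Toy examples with made-up codes (hypothetical, not real classification codes).
EXAMPLES = [
    ("elementary school teacher", "T01"),
    ("high school teacher", "T01"),
    ("taxi driver", "D02"),
    ("bus driver", "D02"),
    ("software developer", "S03"),
]

print(classify("middle school teacher", EXAMPLES))
print(classify("truck driver", EXAMPLES))
```

A new response is coded by looking up previously coded responses that resemble it, so accuracy in such a scheme depends heavily on how many labeled examples are stored and on the similarity measure used.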

Automatic Coding System for Industry/Occupation Classification (산업/직업 분류 자동코딩 시스템)

  • 강유경
    • Proceedings of the Korean Association for Survey Research Conference / 2001.11a / pp.33-45 / 2001
  • The Korean standard industrial/occupational classification has, since 1960, been the basis for producing accurate statistics on Korea's industrial structure and the distribution of industries and occupations. However, coding several million records is costly in time and manpower and suffers from problems of accuracy and consistency. We therefore developed an automatic coding system to address these problems of manual coding. This paper presents the structure of the system and the results of an experiment on survey data from the 2000 Census.

Standard Industrial Classification in Short Sentence Based on Machine Learning Approach (기계학습 기반 단문에서의 문장 분류 방법을 이용한 한국표준산업분류)

  • Oh, Kyo-Joong; Choi, Ho-Jin; An, Hweongak
    • Annual Conference on Human and Language Technology / 2020.10a / pp.394-398 / 2020
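  • An automatic industry/occupation classification coding system takes the varied user inputs collected in employment surveys and similar studies, such as business information, job duties, job rank, and department name, and returns the matching code under the standard industry/occupation classification. We train an unsupervised index-term extraction model on the input data, extract input vectors through an index-term embedding model that uses subword embeddings, encode the output classification codes, and train a supervised model on the result. Using the classification results of the existing system as data, a high-accuracy model can be built at the major, middle, minor, and detailed classification levels, which shows that machine learning techniques can be applied to this system.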
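
As a rough illustration of the subword step mentioned in the abstract above, the sketch below splits each word into character n-grams with boundary markers (the convention used by fastText-style subword models) and hashes them into a fixed-length count vector. This is only a stand-in under stated assumptions: the function names, the n-gram range, the vector size, and the use of hashed counts instead of learned embeddings are all hypothetical and are not taken from the paper.

```python
import zlib

def subword_ngrams(word, n_min=2, n_max=3):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    marked = f"<{word}>"
    return [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]

def hashed_vector(text, dim=16):
    """Fixed-length count vector over the hashed subword n-grams of each token.

    zlib.crc32 is used instead of the built-in hash() so that bucket
    assignments are stable across Python processes.
    """
    vec = [0] * dim
    for word in text.split():
        for gram in subword_ngrams(word):
            bucket = zlib.crc32(gram.encode("utf-8")) % dim
            vec[bucket] += 1
    return vec

print(subword_ngrams("ab", 2, 2))      # ['<a', 'ab', 'b>']
print(len(hashed_vector("식당 운영")))  # 16
```

Because the features come from pieces of words, an unseen or misspelled survey response still shares most of its n-grams with known responses, which is the usual motivation for subword representations on short free-text input.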
