Korean Patent ELECTRA: a pre-trained Korean Patent language representation model for the study of Korean Patent natural language processing (KorPatELECTRA)

Min, Jae-Ok;Jang, Ji-Mo;Jo, Yu-Jeong;Noh, Han-Sung;

Proceedings of the Korean Society of Computer Information Conference (한국컴퓨터정보학회:학술대회논문집)

2021.07a / Pages 69-71 / 2021

Korean Society of Computer Information (한국컴퓨터정보학회)


Min, Jae-Ok (Korea Institute of Patent Information) ;
Jang, Ji-Mo (Korea Institute of Patent Information) ;
Jo, Yu-Jeong (Korea Institute of Patent Information) ;
Noh, Han-Sung (Korea Institute of Patent Information)


Published : 2021.07.14


Abstract

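Natural language processing tasks in the patent domain are difficult to solve because of the linguistic peculiarities of patent documents, so research on a language model optimized for Korean patent documents is urgently needed. This paper proposes the Korean Patent ELECTRA model, optimally pre-trained on a large volume of Korean patent document data, together with a tokenization method. Comparative experiments against existing general-purpose pre-trained models confirm the potential for advancing natural language processing of Korean patent documents.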
