Search | Korea Science

Naver AI Platform CLOVA and Hyperscale AI HyperCLOVA (네이버 AI플랫폼 CLOVA 그리고 초대규모 AI HyperCLOVA)

Ha, Jeong-U;Park, Heung-Seok;Lee, Ba-Do;Hwang, Min-Je
- Korea Information Processing Society Review / v.28 no.3 / pp.56-66 / 2021

Extract Snippets Suitable for Search Intent (검색의도에 적합한 스니펫 추출)

Lee, Hyeon-gu;Yang, Yunyeong;Kim, Eunbyul;Cha, Woojune;Roh, Yunyoung;Kim, Eunyoung;Choi, Gyuhyeon;Shin, Dongwook;Park, Chanhoon;Kang, Inho
- Annual Conference on Human and Language Technology / 2021.10a / pp.241-246 / 2021
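In information retrieval, snippet extraction presents key document information as a short passage, helping users review search results more efficiently. However, conventional snippets show sentences that lexically match the query, so they rarely reflect search intent. Question-answering methods have also been applied to find semantically correct answers, but their quality is low in open-domain settings. To address these problems, this paper proposes a three-stage method (snippet extraction, intent attachment, and verification) so that the extracted snippet fits the query intent. Experiments showed higher satisfaction than with traditional snippets, and accuracy improved by 0.3165 when intent attachment and verification were applied, compared with snippet extraction alone.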

Multi-labeled Domain Detection Using CNN (CNN을 이용한 발화 주제 다중 분류)

Choi, Kyoungho;Kim, Kyungduk;Kim, Yonghe;Kang, Inho
- Annual Conference on Human and Language Technology / 2017.10a / pp.56-59 / 2017
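We perform multi-label classification of utterance topics with a CNN (Convolutional Neural Network) using a multi-labeling approach and a cluster approach, and evaluate each approach with MSE (Mean Square Error), softmax cross-entropy, and sigmoid cross-entropy. The network tokenizes the input at the syllable level and takes as input a sequence in which part-of-speech information is added to each token, together with named entity information obtained from the Naver DB. In the experiments, the best performance, an F1 of 0.9873, was obtained when the problem was recast with the cluster approach, sigmoid was used as the activation function of the output layer, and the network was trained with a cross-entropy cost function.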
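The sigmoid cross-entropy setting mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the logits, targets, the 0.5 decision threshold, and the function names are all assumptions made for illustration. It only shows why a sigmoid output layer suits multi-label classification: each topic is scored independently, so several topics can be predicted for one utterance.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_cross_entropy(logits, targets):
    """Mean binary cross-entropy, treating every label as an independent yes/no decision."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)

def predict_labels(logits, threshold=0.5):
    """Return every label whose probability clears the (assumed) threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]

# Hypothetical logits for three topics; the utterance belongs to topics 0 and 2.
logits = [2.0, -1.5, 0.8]
targets = [1, 0, 1]
loss = sigmoid_cross_entropy(logits, targets)
labels = predict_labels(logits)
```

With a softmax output, by contrast, the probabilities compete and sum to one, which fits single-label problems more naturally.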

Query Normalization Using P-tuning of Large Pre-trained Language Model (Large Pre-trained Language Model의 P-tuning을 이용한 질의 정규화)

Suh, Soo-Bin;In, Soo-Kyo;Park, Jin-Seong;Nam, Kyeong-Min;Kim, Hyeon-Wook;Moon, Ki-Yoon;Hwang, Won-Yo;Kim, Kyung-Duk;Kang, In-Ho
- Annual Conference on Human and Language Technology / 2021.10a / pp.396-401 / 2021
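Few-shot learning with hyperscale language models has performed well on many natural language processing problems. However, defining a problem through few-shot construction in a discrete space, rather than inferring through additional training on data, limits how far performance can improve. To address this, data-driven additional training methods such as P-tuning have emerged, which infer in a continuous space by further training only part of the hyperscale language model's parameters rather than all of them, or by attaching another neural network. In this paper, we define the context-dependent query normalization problem specifically for a conversational voice search service, and show that additionally training a hyperscale language model with P-tuning raises accuracy over few-shot learning.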

Self-supervised Learning Method using Heterogeneous Mass Corpus for Sentence Embedding Model (이종의 말뭉치를 활용한 자기 지도 문장 임베딩 학습 방법)

Kim, Sung-Ju;Suh, Soo-Bin;Park, Jin-Seong;Park, Sung-Hyun;Jeon, Dong-Hyeon;Kim, Seon-Hoon;Kim, Kyung-Duk;Kang, In-Ho
- Annual Conference on Human and Language Technology / 2020.10a / pp.32-36 / 2020
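Various unsupervised and supervised methods are being studied to build sentence encoders that embed sentence meaning well. Supervised approaches are limited by the difficulty of building a sufficient amount of labeled data. Unsupervised approaches to date, on the other hand, have been restricted to a single-format corpus and have defined the problem as generating or predicting the sentence that follows the current input sentence. This paper proposes a self-supervised learning method that uses large heterogeneous corpora of varied composition, such as the Knowledge iN Q&A service and search click logs, in addition to document-type corpora such as Wikipedia, news, and knowledge encyclopedias. When a self-supervised task suited to each corpus type was designed and trained, performance in the unsupervised model evaluation on the KorSTS dataset improved by about 7 points over the baseline model.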

Employment Effects Evaluation of Naver Shopping in 2018 (2018년 네이버 쇼핑의 고용영향 평가)

KIM, Heung-Kyu;JUNG, Yeon-Sung
- The Journal of Industrial Distribution & Business / v.10 no.5 / pp.27-36 / 2019
Purpose - Naver has emerged as a new leader in the open market. While existing open markets such as Gmarket and 11th Street suffer from deteriorating profitability, Naver attracts sellers with low commissions and a powerful search engine. Because the market's reaction to Naver's strength as a leader in online shopping is mixed, we analyze the impact of Naver Shopping on the national economy, especially on employment. Research Design, Data, and Methodology - Using demand-driven inter-industry analysis, we estimate the employment inducement effect of Naver Shopping's transactions. Using supply-driven inter-industry analysis, we then estimate the employment inducement effect of its low commissions and powerful search engine. The analysis uses the 2014 inter-industry table (extension table) from the Bank of Korea, the most recently announced table as of 2018. Results - First, Naver Shopping is expected to generate KRW 7.8 trillion in transactions in 2018, resulting in job inducement of 244,225 and employment inducement of 158,598. In addition, Naver Shopping is estimated to provide its sellers with a benefit of KRW 213 billion through low commissions and a powerful search function, resulting in job inducement of 8,667 and employment inducement of 5,655. Second, for job and employment inducement due to Naver Shopping's transactions, the top-ranked sectors were transportation, business support services, information and communication, broadcasting, and restaurants and lodging. Third, for job and employment inducement due to Naver Shopping's low commissions and powerful search function, the top-ranked sectors were restaurants and hospitality, food, beverage and cigarette manufacturing, construction, and transportation equipment manufacturing.
Conclusions - In 2018, the low commissions and powerful search engine of Naver Shopping resulted in job inducement of 8,667 (3.7% of the 244,225 induced by Naver Shopping transactions in 2018) and employment inducement of 5,655 (3.7% of the 158,598 induced by Naver Shopping transactions in 2018). These can be considered additional employment impacts of Naver Shopping compared with other online shopping operators.
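The demand-driven side of the inter-industry analysis described above can be sketched with a textbook Leontief model. The sketch below is an illustration only: it assumes a two-sector economy, and the input coefficients, employment coefficients, and demand figures are hypothetical, not taken from the Bank of Korea table or from the study. Output induced by final demand f is x = (I - A)^-1 f, and induced employment is each sector's employment coefficient multiplied by its induced output.

```python
def leontief_inverse_2x2(a):
    """Return (I - A)^-1 for a 2x2 input coefficient matrix A."""
    m00, m01 = 1.0 - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], 1.0 - a[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det],
            [-m10 / det, m00 / det]]

def induced_output(a, final_demand):
    """Total output per sector needed to satisfy the final demand: x = (I - A)^-1 f."""
    inv = leontief_inverse_2x2(a)
    return [sum(inv[i][j] * final_demand[j] for j in range(2)) for i in range(2)]

def induced_employment(a, employment_coeff, final_demand):
    """Workers induced per sector: employment coefficient times induced output."""
    output = induced_output(a, final_demand)
    return [employment_coeff[i] * output[i] for i in range(2)]

# Hypothetical figures: A[i][j] is the input from sector i per unit of sector j's output.
A = [[0.2, 0.1],
     [0.3, 0.1]]
employment_coeff = [5.0, 8.0]   # assumed workers per unit of output
demand = [100.0, 0.0]           # assumed final demand, all in sector 0

x = induced_output(A, demand)
jobs = induced_employment(A, employment_coeff, demand)
```

The supply-driven analysis mentioned in the abstract uses a different model and is not covered by this sketch.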
https://doi.org/10.13106/ijidb.2019.vol10.no5.27.

Natural question generation based on consistency between generated questions and answers (생성된 질의응답 간 일관성을 이용한 자연어 질의 생성)

Jaehong Lee;Hwiyeol Jo;Sookyo In;Sungju Kim;Kiyoon Moon;Taehong Min;Kyungduk Kim
- Annual Conference on Human and Language Technology / 2022.10a / pp.109-114 / 2022
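Question generation models are used in a variety of services, including smart speakers, chatbots, QA systems, and machine reading comprehension. To apply a model well across such services, it is important to generate natural questions that reflect the characteristics of real user queries. This paper introduces a model that automatically generates concise, natural questions reflecting those characteristics. The proposed model is given generative freedom through topic keywords, and uses a chain-of-thought-style multi-output structure that links keyword-style query → natural-language question → answer, so that the results take causal relationships into account. Finally, high-quality questions are selected through MRC filtering and consistency filtering. Compared with the baseline model, the proposed model substantially increased the validity of the generated questions.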

A Study on NaverZ's Metaverse Platform Scaling Strategy

Song, Minzheong
- International journal of advanced smart convergence / v.11 no.3 / pp.132-141 / 2022
We examine the rocket life stages of NaverZ's metaverse platform scaling and investigate the ignition and scale-up stages of its metaverse platform brand, Zepeto, based on the Rocket Model (RM). The results are as follows. First, in the 'attract' function, NaverZ uses an event strategy of collaborating with K-pop artists, a piggybacking strategy of utilizing other SNSs, and a VIP strategy of investing in game and entertainment content genres. Second, in the 'match' function, under Zepeto's matching rule, users can create their own characters and "World" with Zepeto Studio; to strengthen matching quality, NaverZ is consistently investing in artificial intelligence (AI) companies. Third, in the 'connect' function, NaverZ maximizes positive interaction in the ignition stage by inducing feed activities on Zepeto and other SNSs and by uploading attractive content for viral effects. To facilitate this, NaverZ expands to other regions such as Southeast Asia and the Middle East with a localization strategy that includes investment. Lastly, in the 'transact' function, building on three monetization experiments in the ignition stage (Coin & ZEM, user-generated content (UGC) fees, and advertising revenue), NaverZ has begun investing in NFT platforms and overseas blockchain companies.
https://doi.org/10.7236/IJASC.2022.11.3.132

Question, Document, Response Validator for Question Answering System (질의 응답 시스템을 위한 질의, 문서, 답변 검증기)

Tae Hong Min;Jae Hong Lee;Soo Kyo In;Kiyoon Moon;Hwiyeol Jo;Kyungduk Kim
- Annual Conference on Human and Language Technology / 2022.10a / pp.604-607 / 2022
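This paper describes the QDR validator, which verifies, in a question answering system that provides answers to user questions, whether a provided answer responds to the user's question correctly and on the basis of the document. The task resembles natural language inference (NLI), which judges a claim about a document, but it is harder than NLI because it takes three kinds of input: the question (Q) in addition to the document (D) and the claim (R). To carry out the QDR validation task, about 16,000 examples were created, and through experiments with various input formats, additional training on NLI task data, and threshold tuning, a final performance of 83.05% was achieved.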
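The threshold tuning step mentioned in the abstract can be sketched as a simple sweep. This is only an illustration under stated assumptions: the validator scores, the gold labels, the candidate thresholds, the `[SEP]` input format, and the use of accuracy as the selection metric are all hypothetical, since the abstract does not specify them.

```python
def format_input(question, document, response):
    """One hypothetical way to join the three inputs; the separator is an assumption."""
    return f"{question} [SEP] {document} [SEP] {response}"

def accuracy_at(threshold, scores, labels):
    """Accuracy when a response counts as valid if its score reaches the threshold."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)

def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest accuracy (first one wins ties)."""
    return max(candidates, key=lambda t: accuracy_at(t, scores, labels))

# Hypothetical validator scores on a small development set (1 = valid response).
scores = [0.92, 0.35, 0.61, 0.48, 0.80, 0.15]
labels = [1, 0, 1, 0, 1, 0]
candidates = [i / 10 for i in range(1, 10)]

threshold = best_threshold(scores, labels, candidates)
```

In practice the threshold would be chosen on held-out development data and then fixed before measuring test performance.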

Self-learning Method Based Slot Correction for Spoken Dialog System (자기 학습 방법을 이용한 음성 대화 시스템의 슬롯 교정)

Choi, Taekyoon;Kim, Minkyoung;Lee, Injae;Lee, Jieun;Park, Kyuyon;Kim, Kyungduk;Kang, Inho
- Annual Conference on Human and Language Technology / 2021.10a / pp.353-360 / 2021
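In spoken dialog systems, the system sometimes gives a response that does not match the user's intent because the user says the wrong slot name or a speech recognition error occurs. Query correction methods that use corpora or dictionary data have been proposed to address this, but they require people to intervene continually and supply data. This paper proposes a self-learning model that uses accumulated log data to correct the slots needed for music playback without human intervention. The model exploits situations in which a user repeats similar queries to play a particular piece of music, learns in an unsupervised manner, and corrects slots for which music playback failed. In addition, to resolve uncertainty about the accuracy of the trained model's output, a query-slot relation similarity model verifies the correction results and ensures the stability of slot correction. The training datasets are extracted from session data of consecutive user queries: music playback slot session data and query-slot relation similarity data are built separately and used to train the slot correction model and the query-slot relation similarity model, respectively. Analysis of the corrected slots showed that the model corrects not only to slots with similar pronunciation but also to semantically related slots, enabling more varied types of correction than dictionary-based methods. A music playback slot correction model trained on three months of collected log data improved 12% of music playback failures, measured on unique queries repeated over one week.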
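The idea of learning corrections from repeated queries in session logs can be illustrated with a simple counting heuristic. This is a deliberately simplified stand-in, not the paper's method: the paper trains a slot correction model and verifies results with a query-slot relation similarity model, whereas the sketch below just counts how often a failed slot is immediately followed by a successful one in the same session. The session format, the slot strings, and the `min_count` cutoff (a crude substitute for verification) are all assumptions.

```python
from collections import Counter, defaultdict

def mine_corrections(sessions, min_count=2):
    """Map each failed slot to the successful slot that most often follows it.

    sessions: list of sessions, each a list of (slot_text, playback_succeeded) pairs
    in the order the user issued them.
    """
    pair_counts = defaultdict(Counter)
    for session in sessions:
        # Look at consecutive queries: a failure followed by a success is a candidate pair.
        for (slot, ok), (next_slot, next_ok) in zip(session, session[1:]):
            if not ok and next_ok and slot != next_slot:
                pair_counts[slot][next_slot] += 1

    corrections = {}
    for failed, counter in pair_counts.items():
        target, count = counter.most_common(1)[0]
        if count >= min_count:  # keep only pairs seen often enough
            corrections[failed] = target
    return corrections

# Hypothetical session logs of music playback slots.
sessions = [
    [("dinamite", False), ("dynamite", True)],
    [("dinamite", False), ("dynamite", True)],
    [("dinamite", False), ("butter", True)],
    [("buter", False), ("butter", True)],
]
corrections = mine_corrections(sessions)
```

Here "buter" is not corrected because the pair was observed only once, which mirrors the need to verify a correction before trusting it.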

