Search | Korea Science

Factual consistency checker through a question-answer test based on the named entity (개체명 기반 질문-답변 검사를 통한 요약문 사실관계 확인)

Jung, Jeesu;Ryu, Hwijung;Chang, Dusung;Chung, Riwoo;Jung, Sangkeun
- Annual Conference on Human and Language Technology / 2021.10a / pp.112-117 / 2021
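When summaries are generated with machine learning, a tool for measuring their accuracy is essential. To assess the factual consistency of a summary with respect to its source text, methods based on named-entity similarity and on question-answer generation using machine reading comprehension have been tried, but they either required large amounts of data or lacked accuracy. This paper proposes a method that uses a deep-learning-based named entity recognizer and a question-answer pair accuracy scorer to generate and filter question-answer pairs, and then scores their degree of agreement. The validity of the method is demonstrated by comparing the distribution of these automatic factual-consistency scores with that of human evaluation scores.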

Valid Conversation Recognition for Restoring Entity Ellipsis in Chat Bot (대화 시스템의 개체 생략 복원을 위한 유효 발화문 인식)

So, Chan Ho;Wang, Ji Hyun;Lee, Chunghee;Lee, Yeonsoo;Kang, Jaewoo
- Annual Conference on Human and Language Technology / 2019.10a / pp.54-59 / 2019
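This paper proposes a valid utterance recognition model to raise the precision of ellipsis restoration, a technique for improving the performance of chatbot dialogue systems. Ellipsis restoration recovers information omitted from a chatbot user's current utterance using previous utterances. The valid utterance recognition model identifies the previous utterance that holds the information omitted from the current utterance. It is a BERT-based binary classifier, and the BERT model used is a Korean pre-trained BERT newly trained on Korean documents. Token embeddings of the user's current utterance and previous utterances are obtained from the Korean BERT, and a CNN extracts local information from each token to produce a representation of the utterance pair, which is used to decide whether the previous utterance contains the omitted entity value. To verify the effect of the proposed model, only the previous utterances judged valid by the valid utterance recognition model were fed to the ellipsis restoration model, and the precision of the ellipsis restoration model rose by about 5%.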

Conversation Context Annotation using Speaker Detection (화자인식을 이용한 대화 상황정보 어노테이션)

Park, Seung-Bo;Kim, Yoo-Won;Jo, Geun-Sik
- Journal of Korea Multimedia Society / v.12 no.9 / pp.1252-1261 / 2009
One notable challenge in video search and summarization is extracting semantics from video content and annotating its context. Video semantics or context can be obtained by extracting objects from the video and by extracting the context between those objects. However, a method that extracts only objects does not express enough semantics for a shot or scene, because it does not describe the relations and interactions between objects. To be more effective, after objects are extracted, context such as the relations and interactions between them needs to be extracted from the conversation situation. This paper studies how to detect the speaker and how to compose the conversational context in order to annotate conversation context. To this end, we propose methods in which characters are recognized through face recognition, the speaker is detected through mouth motion, the conversation context is extracted using rules based on the presence of a speaker, the number of characters, and the presence of subtitles, and, finally, the scene context is converted to an XML file and saved.

Neighbor-Stranger Discrimination of Yellow-throated Buntings (Emberiza elegans) and Gray-headed Buntings (Emberiza fucata) to Playback of Song (노랑턱멧새 (Emberiza elegans)와 붉은뺨멧새의 (Emberiza fucata)에서 Song의 Playback을 통한 이웃-낯선 개체의 인식)

황보연;박시룡
- The Korean Journal of Zoology / v.39 no.1 / pp.89-97 / 1996
Songs of the Yellow-throated Bunting (Emberiza elegans) and the Gray-headed Bunting (Emberiza fucata) in allopatric populations in Gangnae-meon, Cheongwon-gun, Chungbuk, Korea, were recorded during the breeding season and analyzed with sound spectrographs. Males of E. elegans and E. fucata were tested to investigate whether territorial males can discriminate between neighbors and strangers based on playback of natural and artificial song repertoires. In addition, E. fucata was stimulated by playback of only the individually specific section of the song as well as of only the posterior portion. Males of E. elegans were able to discriminate between individual neighbors and strangers in response to natural song repertoires, but they did not respond to playback of the artificial song repertoires of neighbors and strangers. Males of E. fucata were able to discriminate between individual neighbors and strangers in response to natural song repertoires, artificial song repertoires, and the anterior section of the song, but they did not respond to playback of the posterior section of the song.

Re-defining Named Entity Type for Personal Information De-identification and A Generation method of Training Data (개인정보 비식별화를 위한 개체명 유형 재정의와 학습데이터 생성 방법)

Choi, Jae-hoon;Cho, Sang-hyun;Kim, Min-ho;Kwon, Hyuk-chul
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.206-208 / 2022
As the big data industry has grown significantly in recent years, concern about privacy violations caused by personal information leakage has increased. There have been attempts to automate de-identification through named entity recognition in natural language processing. In this paper, named entity recognition data is constructed semi-automatically by identifying sentences that contain de-identification information in Korean Wikipedia. Compared with general named entity recognition data, this can reduce the cost of learning information that is not subject to de-identification. It also has the advantage of minimizing the additional rule-based and statistical systems needed to classify de-identification information in the output. The named entity recognition data proposed in this paper is classified into twelve categories, which include de-identification information such as medical records and family relationships. In experiments using the generated dataset, KoELECTRA achieved a performance of 0.87796 and RoBERTa 0.88.

A Fuzzy Weights Decision Method based on Degree of Contribution for Recognition of Insect Footprints (곤충 발자국 인식을 위한 기여도 기반의 퍼지 가중치 결정 방법)

Shin, Bok-Suk;Cha, Eui-Young;Woo, Young-Woon
- Journal of the Korea Society of Computer and Information / v.14 no.12 / pp.55-62 / 2009
This paper proposes a method for determining fuzzy weights using degrees of contribution, in order to classify insect footprint patterns whose species are difficult to distinguish clearly. Because insects are very small, their footprints appear as faint, scattered spots. It is therefore not easy to define the shape of the footprints, unlike those of other species, and the footprint patterns contain so much noise that it is difficult to distinguish noise from correct data. For these reasons, the extracted feature set contains clear feature values along with some uncertain ones, so we estimate weights according to degrees of contribution. If one of the feature values differs distinctly enough to separate a class from the other classes, it is assigned a high weight for classification. The calculated weights determine the membership values through fuzzy functions, and objects are classified into the class with the highest value. We present experimental results for insect footprints with noise using the proposed method.
https://doi.org/10.9708/jksci.2009.14.12.055

Automatic Construction of Restaurant Menu Dictionary (음식메뉴 개체명 인식을 위한 음식메뉴 사전 자동 구축)

Gu, Yeong-Hyeon;Yoo, Seong-Joon
- Annual Conference on Human and Language Technology / 2013.10a / pp.102-106 / 2013
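Recognizing food menu named entities is very important for analyzing restaurant reviews. However, existing named entity dictionaries are insufficient for expressing specific and complex menu item names, and they are hard to update continuously, so menu names reflecting new trends are not covered. In this paper, menu information and dish names for each restaurant were collected from restaurant-specialized sites and recipe sites with a wrapper-based web crawler. Menu items with low frequency and those not used in online restaurant reviews were then removed to automatically build a restaurant menu dictionary. Online restaurant review documents were then used to find which types of restaurant reviews each dictionary entity appears in, and the frequencies and the resulting ratio per category were added to the dictionary. This information makes it possible to disambiguate menu items that belong to several categories. In the experiments, 1,104 menu names from the Korea Tourism Organization's foreign-language usage dictionary were actually used in restaurant reviews, whereas 1,602 menu names from the dictionary built in this paper were, showing that it covers 498 more terms. In addition, the precision and recall of the automatically collected menu items are analyzed: precision was 96.2, recall was 78.4, and the F-score was 86.4.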
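As a quick consistency check on the figures reported in this abstract, the F-score can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch that assumes the reported F-Score is the balanced F1 measure (the abstract does not say which variant was used); the function name `f1_score` is ours, not the authors'.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F1 is an assumption)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract, in percent.
precision, recall = 96.2, 78.4
print(round(f1_score(precision, recall), 1))  # 86.4
```

Under that assumption the recomputed value agrees with the reported F-score of 86.4.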

Playback Experiments on Bush Warblers (Cettia diphone): Their Song Recognition of Intra- and Inter-Population (휘파람새의 Intra-and Inter-Population Songs 인식에 관한 Playback실험)

박시룡;박대식;김수일;윤무부
- The Korean Journal of Zoology / v.38 no.4 / pp.443-448 / 1995
Playback experiments were performed to clarify the degree of song recognition using inter- and intra-population songs of the Bush Warbler in the Cheongwon, Chungbuk area. Six territorial males responded strongly to inter- as well as intra-population songs. Their responses to the inter- and intra-population songs did not differ significantly in any of the measures of latency time, staying time, and closest distance. This result implies that Bush Warblers in the region did not discriminate between intra- and inter-population songs. The reason may be that the regional males have few interactions in song exchange with neighbors because they keep a long individual distance. To investigate the signal value as a species recognition releaser, playbacks of partial songs prepared from two distinct regional populations of the species were presented to males of the study area. The partial songs were made of two portions for each presentation: a whistle portion only and a complex syllable portion only. Territorial males responded more strongly to the complex syllable portion than to the whistle portion of the song. This result indicates that the complex syllable portion conveys more information for species recognition. As the 'releaser' hypothesis previously suggested, the function of the complex syllable portion in Bush Warbler song is understood to be conveying most of the species-identifying information. Thus, the results of these playback experiments support the releaser hypothesis.

Trends in International Standardization of Telebiometrics in ITU-T SG17 (ITU-T SG17 텔레바이오인식 국제표준화 동향)

Jason Kim;Taehyun Kim;Eunjeong Park
- Review of KIISC / v.33 no.4 / pp.89-93 / 2023
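In the post-COVID era, telebiometric technology is expected to grow in importance as a non-face-to-face authentication method for enabling safe and convenient cyber economic activity in the coming digital-infrastructure society. Accordingly, this article closely analyzes the status of telebiometrics standardization under way in ITU-T SG17 Q10 (ID Management & Telebiometrics), the international standardization body for biometrics, covering companion animal identification and authentication services, healthcare application services, and biometric information protection technology for artificial intelligence services. It then reviews the progress of international standardization of telebiometric technology, which is being applied broadly in daily life and continues to evolve with the transition to a digital society, in areas such as animal identification and authentication services, digital healthcare application services, and autonomous driving application services.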

Automatic Named Entities Extraction Using the Graph-based Measurement Technique of the Mutual Importance (그래프 기반의 상호 중요도 측정 기법을 이용한 영역별 개체명 자동 추출)

Bae, Sangjoon;Ko, Youngjoong
- Annual Conference on Human and Language Technology / 2008.10a / pp.17-22 / 2008
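This paper proposes a method for automatically extracting named entities per domain that uses seed words and ranks named entity candidates by measuring the mutual importance between web pages and the candidates. The method proceeds in three main steps. First, web pages are retrieved using the seed words, and pattern rules are extracted from the retrieved pages and the seed words. Next, the extracted pattern rules are applied to the web pages to extract named entity candidates. Finally, the mutual importance between the candidates and the web pages is computed recursively to determine the final ranking of the candidates. When the proposed technique was applied to Korean and English named entity domains, it achieved a MAP of 78.72% for Korean and 96.48% for English. In particular, the performance on English named entity recognition was higher than the results of Google Sets provided by Google.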
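The recursive mutual-importance computation mentioned in this abstract can be illustrated with a HITS-style mutual reinforcement loop. The abstract does not give the actual update rule, so this is only an assumed stand-in: the function `rank_candidates`, the toy page names, and the normalization step are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def rank_candidates(page_to_candidates, iterations=20):
    """Rank candidates by HITS-style mutual reinforcement.

    Assumed stand-in for the paper's measure: a candidate is important if it
    appears on important pages, and a page is important if it contains
    important candidates.
    """
    candidates = sorted({c for cs in page_to_candidates.values() for c in cs})
    page_score = {p: 1.0 for p in page_to_candidates}
    cand_score = {c: 1.0 for c in candidates}

    for _ in range(iterations):
        # Update candidate scores from the pages they appear on.
        new_cand = defaultdict(float)
        for page, cs in page_to_candidates.items():
            for c in cs:
                new_cand[c] += page_score[page]
        total = sum(new_cand.values()) or 1.0
        cand_score = {c: new_cand[c] / total for c in candidates}

        # Update page scores from the candidates they contain.
        new_page = {
            page: sum(cand_score[c] for c in cs)
            for page, cs in page_to_candidates.items()
        }
        total = sum(new_page.values()) or 1.0
        page_score = {page: score / total for page, score in new_page.items()}

    return sorted(candidates, key=lambda c: cand_score[c], reverse=True)


# Toy input: which candidates the pattern rules extracted from each page.
pages = {
    "page_a": {"Seoul", "Busan", "Daegu"},
    "page_b": {"Seoul", "Busan"},
    "page_c": {"Seoul", "banner"},
}
ranking = rank_candidates(pages)
print(ranking)
```

In this toy input, candidates that co-occur with other well-supported candidates rise in the ranking, while a spurious string that appears on a single weak page sinks to the bottom.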

Search Result 447, Processing Time 0.026 seconds