Search | Korea Science

Sentence model based subword embeddings for a dialog system

Chung, Euisok;Kim, Hyun Woo;Song, Hwa Jeon
- ETRI Journal / v.44 no.4 / pp.599-612 / 2022
This study focuses on improving a word embedding model to enhance the performance of downstream tasks, such as those of dialog systems. To improve traditional word embedding models, such as skip-gram, it is critical to refine the word features and expand the context model. In this paper, we approach the word model from the perspective of subword embedding and attempt to extend the context model by integrating various sentence models. Our proposed sentence model is a subword-based skip-thought model that integrates self-attention and relative position encoding techniques. We also propose a clustering-based dialog model for downstream task verification and evaluate its relationship with the sentence-model-based subword embedding technique. The proposed subword embedding method produces better results than previous methods in evaluating word and sentence similarity. In addition, the downstream task verification, a clustering-based dialog system, demonstrates an improvement of up to 4.86% over the results of FastText in previous research.
https://doi.org/10.4218/etrij.2020-0245 KSCI
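The abstract above compares against FastText-style subword embeddings. As a rough illustration of that general idea only (not the authors' skip-thought-based model), the sketch below splits a word into character n-grams and averages whatever n-gram vectors are available. The function names `char_ngrams` and `subword_vector`, the n-gram range, and the toy lookup table are all assumptions made for this example.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def subword_vector(word, ngram_vectors, dim, n_min=3, n_max=6):
    """Average the vectors of the word's n-grams that exist in ngram_vectors.

    ngram_vectors is a hypothetical dict mapping n-gram strings to lists of floats.
    Returns a zero vector when none of the word's n-grams is known.
    """
    total = [0.0] * dim
    hits = 0
    for gram in char_ngrams(word, n_min, n_max):
        vec = ngram_vectors.get(gram)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
            hits += 1
    return [t / hits for t in total] if hits else total
```

Because the vector is built from n-grams rather than whole words, a word never seen in training can still receive a representation as long as some of its n-grams are known.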

Captive Portal Recommendation System Based on Word Embedding Model (단어 임베딩 모델 기반 캡티브 포털 메뉴 추천 시스템)

Dong-Hun Yeo;Byung-Il Hwang;Dong-Ju Kim
- Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.11-12 / 2023
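This paper proposes a system that recommends menu items a user is likely to prefer, based on order data collected through an in-store captive portal. The system vectorizes menu names with a word embedding model trained on a public food-related dataset and recommends menu items whose vectors are similar. Because data collected through a captive portal has users' personal information de-identified and carries only limited information about the selected items, this technique is advantageous compared with applying a conventional word embedding model to a recommendation system. Using validation data built from the purchase records of stores that actually run the same system, we show that the proposed recommendation system is meaningful for purchase prediction as measured by Precision@k (k=3).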
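The recommendation step and the Precision@k metric described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the menu names, the two-dimensional vectors, and the function names `cosine`, `recommend`, and `precision_at_k` are all made up for the example, and cosine similarity is assumed as the similarity measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query, menu_vectors, k=3):
    """Return the k menu names whose vectors are most similar to the query item's vector."""
    scores = [
        (cosine(menu_vectors[query], vec), name)
        for name, vec in menu_vectors.items()
        if name != query
    ]
    scores.sort(reverse=True)
    return [name for _, name in scores[:k]]


def precision_at_k(recommended, purchased, k=3):
    """Fraction of the top-k recommendations that appear in the purchased set."""
    return sum(1 for item in recommended[:k] if item in purchased) / k


# Toy, hand-written vectors standing in for a trained embedding model.
menu_vectors = {
    "latte": [1.0, 0.0],
    "cappuccino": [0.9, 0.1],
    "mocha": [0.8, 0.3],
    "salad": [0.0, 1.0],
    "sandwich": [0.1, 1.0],
}
top3 = recommend("latte", menu_vectors, k=3)
score = precision_at_k(top3, {"cappuccino", "mocha"}, k=3)
```

With k=3, two relevant items among the three recommendations give a Precision@3 of 2/3.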

Preliminary Studies on Embedding Qualitative Reasoning into Qualitative Analysis and Laboratory Simulation

Pang, Jen-Sen;Syed Mustapha, S.M.F.D;Mohd.Zain, Sharifuddin
- Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.230-236 / 2001
In this paper, we explore the possibility of embedding a Qualitative Reasoning technique, Qualitative Process Theory (QPT), and its implementation in the field of inorganic chemistry. The target fields of implementation are Qualitative Chemical Analysis and Laboratory Simulation. By embedding such techniques in educational software, we aim to combine theory and practice in a single package. The system is able to generate reasoning and explanations based on chemical theories, helping students master basic chemistry knowledge as well as practical skills. We also review the suitability of embedding QPT techniques into chemistry in general by comparing examples from both fields.

An Intelligence Embedding Quadruped Pet Robot with Sensor Fusion (센서 퓨전을 통한 인공지능 4족 보행 애완용 로봇)

Lee Lae-Kyoung;Park Soo-Min;Kim Hyung-Chul;Kwon Yong-Kwan;Kang Suk-Hee;Choi Byoung-Wook
- Journal of Institute of Control, Robotics and Systems / v.11 no.4 / pp.314-321 / 2005
This paper describes a quadruped pet robot with embedded intelligence. It has 15 degrees of freedom and carries various sensors, including a CMOS image sensor, voice recognition and sound localization, an inclinometer, a thermistor, a real-time clock, tactile touch sensors, PIR, and IR, which allow owners to interact with the pet robot according to their intentions while it retains the original features of pet animals. The architecture is flexible and adopts several embedded processors for handling sensors, giving it a modular structure. The pet robot can also be used for additional purposes such as security, gaming, visual tracking, and as a research platform. It can generate various actions and behaviors and download voice or music files to maintain a close relationship with users. Using cost-effective sensors, the pet robot is able to find its recharge station and recharge itself when its battery runs low. To facilitate programming of the robot, we support several development environments. The developed system is therefore a low-cost programmable entertainment robot platform.
https://doi.org/10.5302/J.ICROS.2005.11.4.314 KSCI

Proper Noun Embedding Model for the Korean Dependency Parsing

Nam, Gyu-Hyeon;Lee, Hyun-Young;Kang, Seung-Shik
- Journal of Multimedia Information System / v.9 no.2 / pp.93-102 / 2022
Dependency parsing is the problem of deciding the syntactic relations between words in a sentence. Recently, deep learning models have been used for dependency parsing based on word representations in a continuous vector space. However, this causes mislabeling of proper nouns that rarely appear in the training corpus, because it is difficult to represent out-of-vocabulary (OOV) words in a continuous vector space. To solve the OOV problem in dependency parsing, we explored proper noun embedding methods according to the embedding unit. Before representing words in a continuous vector space, we replace proper nouns with a special token and train on their contextual features using a multi-layer bidirectional LSTM. Two models, one syllable-based and one morpheme-based, are proposed for proper noun embedding, and an ensemble of the two improves dependency parsing performance more than either the syllable or the morpheme embedding model alone. The experimental results showed that our ensemble model improved UAS by 1.69%p and LAS by 2.17%p over the Malt parser, which uses the same arc-eager approach.
https://doi.org/10.33851/JMIS.2022.9.2.93 KSCI
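The abstract above reports gains in UAS and LAS. For readers unfamiliar with these metrics, the sketch below computes them in their usual sense: the unlabeled attachment score counts tokens whose predicted head is correct, and the labeled attachment score additionally requires the correct relation label. The function name `attachment_scores` and the toy sentence are assumptions for illustration; the paper's own evaluation setup is not described in the abstract.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) for one or more sentences flattened into token lists.

    gold and predicted are equal-length lists of (head_index, relation_label)
    tuples, one tuple per token.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    total = len(gold)
    # UAS: the predicted head matches, regardless of the label.
    head_hits = sum(1 for (g_head, _), (p_head, _) in zip(gold, predicted) if g_head == p_head)
    # LAS: both the head and the relation label match.
    labeled_hits = sum(1 for g, p in zip(gold, predicted) if g == p)
    return head_hits / total, labeled_hits / total


# Toy three-token sentence; head index 0 denotes the root.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj")]
uas, las = attachment_scores(gold, pred)
```

In this toy case every head is correct but one label is wrong, so UAS is 1.0 while LAS is 2/3.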

Gated Multi-channel Network Embedding for Large-scale Mobile App Clustering

Yeo-Chan Yoon;Soo Kyun Kim
- KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.6 / pp.1620-1634 / 2023
This paper studies the task of embedding nodes using multiple graphs that represent multiple information channels, which is useful in many network clustering tasks. By learning a node from multiple graphs, various characteristics of the node can be represented and embedded stably. Existing studies using multi-channel networks either integrate heterogeneous graphs or constrain common nodes appearing in multiple graphs to have similar embeddings. Although these methods represent nodes effectively, they are limited by the assumption that all networks provide the same amount of information. This paper proposes a method to overcome this limitation. The proposed method assigns different weights according to the source graph when embedding nodes, so that the characteristics of graphs carrying more important information are reflected more strongly in the node. To this end, a novel method incorporating a multi-channel gate layer is proposed that gives more weight to important channels and ignores unnecessary data when embedding a node with multiple graphs. Empirical experiments demonstrate the effectiveness of the proposed multi-channel embedding method.
https://doi.org/10.3837/tiis.2023.06.005

An Exploratory Approach to Discovering Salary-Related Wording in Job Postings in Korea

Ha, Taehyun;Coh, Byoung-Youl;Lee, Mingook;Yun, Bitnari;Chun, Hong-Woo
- Journal of Information Science Theory and Practice / v.10 no.spc / pp.86-95 / 2022
Online recruitment websites discuss job demands in various fields, and job postings contain detailed job specifications. Analyzing this text can elucidate the features that determine job salaries. Text embedding models can learn the contextual information in a text, and explainable artificial intelligence frameworks can be used to examine in detail how text features contribute to the models' outputs. We collected 733,625 job postings using the WORKNET API and classified them into low, mid, and high-range salary groups. A text embedding model that predicts job salaries based on the text in job postings was trained with the collected data. Then, we applied the SHapley Additive exPlanations (SHAP) framework to the trained model and discovered the significant words that determine each salary class. Several limitations and remaining words are also discussed.
https://doi.org/10.1633/JISTaP.2022.10.S.9 KSCI

Korean Phoneme Sequence based Word Embedding (한국어 음소열 기반 워드 임베딩 기술)

Chung, Euisok;Jeon, Hwa Jeon;Lee, Sung Joo;Park, Jeon-Gue
- Annual Conference on Human and Language Technology / 2017.10a / pp.225-227 / 2017
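This paper addresses subword-based word embedding for Korean. To apply to Korean a new word embedding technique that can replace existing word embedding techniques, which suffer from the out-of-vocabulary problem, we evaluate phoneme-sequence-based subword features. Existing subword features use character n-grams. In Korean, the pronunciation of a given single syllable varies depending on the word. Here, phoneme-sequence n-grams have the advantage of securing the discriminative power of specific subword features. We reimplement the subword embedding technique and achieve better performance than existing word embedding approaches in an English setting. In addition, experiments using Korean phoneme-sequence features show that semantically more similar words are placed closer together in the vector space.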
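To give a feel for sub-syllable n-gram features in Korean, the sketch below breaks Hangul syllables into their component letters (jamo) using Unicode NFD normalization and then takes n-grams over that sequence. This is only a stand-in under a loud assumption: jamo decomposition follows spelling, whereas the paper uses phoneme sequences that reflect actual pronunciation, which would require a grapheme-to-phoneme step not shown here. The function names `to_jamo` and `jamo_ngrams` are made up for the example.

```python
import unicodedata


def to_jamo(text):
    """Decompose precomposed Hangul syllables into conjoining jamo via NFD.

    Note: this is an orthographic decomposition, not a phonemic transcription.
    """
    return unicodedata.normalize("NFD", text)


def jamo_ngrams(word, n=3):
    """Return all length-n substrings of the word's jamo sequence."""
    jamo = to_jamo(word)
    return [jamo[i:i + n] for i in range(len(jamo) - n + 1)]


# Each of the two syllables has an initial, a medial, and a final jamo,
# so the word becomes a six-jamo sequence.
grams = jamo_ngrams("한국", n=3)
```

Replacing `to_jamo` with a pronunciation-aware converter would bring the sketch closer to the phoneme-sequence features the abstract describes.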

Tea Leaf Disease Classification Using Artificial Intelligence (AI) Models (인공지능(AI) 모델을 사용한 차나무 잎의 병해 분류)

K.P.S. Kumaratenna;Young-Yeol Cho
- Journal of Bio-Environment Control / v.33 no.1 / pp.1-11 / 2024
In this study, five artificial intelligence (AI) models, Inception v3, SqueezeNet (local), VGG-16, Painters, and DeepLoc, were used to classify tea leaf diseases. Eight image categories were used: healthy, algal leaf spot, anthracnose, bird's eye spot, brown blight, gray blight, red leaf spot, and white spot. The software used in this study was Orange 3, a Python-based visual programming tool that operates through an interface in which workflows are built to manipulate and analyze data visually. The precision of each AI model was recorded to select the ideal model. All models were trained using the Adam solver, the rectified linear unit activation function, 100 neurons in the hidden layers, a maximum of 200 iterations in the neural network, and a regularization value of 0.0001. The functionality of Orange 3 can be extended by installing add-ons, and in this study the Image Analytics add-on, which is required for image analysis, was added. For model training, the import images, image embedding, neural network, test and score, and confusion matrix widgets were used, whereas the import images, image embedding, predictions, and image viewer widgets were used for prediction. The precisions of the neural networks for the five AI models (Inception v3, SqueezeNet (local), VGG-16, Painters, and DeepLoc) were 0.807, 0.901, 0.780, 0.800, and 0.771, respectively. Finally, the SqueezeNet (local) model was selected as the optimal AI model for detecting tea diseases from tea leaf images owing to its high precision and good performance throughout the confusion matrix.
https://doi.org/10.12791/KSBEC.2024.33.1.001

Search Result 80, Processing Time 0.023 seconds
