• Title/Summary/Keyword: Korean text classification

Design of Artificial Intelligence Course for Humanities and Social Sciences Majors

  • KyungHee Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.187-195 / 2023
  • This study proposes an artificial intelligence liberal arts course for college students majoring in the humanities and social sciences, built around the Entry artificial intelligence model. A group of experts in computing, artificial intelligence, and pedagogy was formed, and the final course was developed using an analysis of previous research and the Delphi technique. The educational topics fell into four broad categories: image classification, image recognition, text classification, and sound classification. The training consisted of 1) understanding the principles of artificial intelligence, 2) practice using the Entry artificial intelligence model, 3) identifying the ethical impact, and 4) team idea meetings to solve real-life problems based on what was learned. Through this course, students can understand the principles of core artificial intelligence technologies and implement them directly with the Entry artificial intelligence model. Furthermore, the experience of solving various real-life problems with artificial intelligence is expected to contribute positively to understanding the technology and exploring the ethics needed in the artificial intelligence era.

An OTT Interlocked Platform with Emotion Classification based on Polar Coordinates for Customized Healing Contents (맞춤형 힐링 콘텐츠용 극좌표 기반 감정분류 OTT 연동 플랫폼)

  • Min-goo Kang
    • Journal of Internet Computing and Services / v.25 no.6 / pp.1-7 / 2024
  • This paper presents a platform design for polar coordinate-based emotion classification, aimed specifically at enhancing OTT healing contents. The platform uses the BERT model to analyze emotional words in user-provided text and maps these emotions onto a polar coordinate system (distance r, angle θ) to classify emotional intensity and type accurately. It recommends personalized healing contents tailored to individual user needs. Additionally, Wi-Fi beacons and geofencing technologies are employed to detect user locations, enabling the provision of location-based contents. The platform incorporates multi-factor authentication (MFA) methods, integrating biometric recognition, passwords, and FIDO (Fast ID Online) authentication to ensure precise user identification. The system combines emotion analysis algorithms, location detection, and user identification mechanisms to deliver personalized healing experiences while ensuring reliable user identification within OTT services.
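
The polar mapping described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the two underlying axes are obtained from BERT, so the sketch assumes a hypothetical (valence, arousal) score pair as input; the quadrant labels and the `strong_threshold` value are likewise illustrative assumptions, not the paper's categories.

```python
import math

# Hypothetical labels, one per 90-degree quadrant (counter-clockwise from 0 degrees).
# The paper's actual emotion categories are not given in the abstract.
SECTORS = ["joy", "anger", "sadness", "calm"]


def to_polar(valence, arousal):
    """Convert an assumed (valence, arousal) pair to (r, theta in degrees, 0 <= theta < 360)."""
    r = math.hypot(valence, arousal)
    theta = math.degrees(math.atan2(arousal, valence)) % 360.0
    return r, theta


def classify(valence, arousal, strong_threshold=0.5):
    """Use the angle for the emotion type and the distance for the intensity."""
    r, theta = to_polar(valence, arousal)
    emotion = SECTORS[int(theta // 90) % 4]  # % 4 guards against theta rounding up to 360
    intensity = "strong" if r >= strong_threshold else "mild"
    return emotion, intensity
```

Under these assumptions, `classify(0.6, 0.6)` falls in the first quadrant with a large radius, while `classify(-0.2, -0.1)` falls in the third quadrant close to the origin.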

On-line dynamic hand gesture recognition system for the Korean sign language (KSL) (한글 수화용 동적 손 제스처의 실시간 인식 시스템의 구현에 관한 연구)

  • Kim, Jong-Sung;Lee, Chan-Su;Jang, Won;Bien, Zeungnam
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.2 / pp.61-70 / 1997
  • Human hand gestures have long been used as a means of communication among people, being interpreted as streams of tokens for a language. Signed language is a method of communication for hearing-impaired people. Articulated gestures and postures of hands and fingers are commonly used for signed language. This paper presents a system which recognizes the Korean sign language (KSL) and translates the recognition results into normal Korean text and sound. A pair of data-gloves is used as the sensing device for detecting motions of hands and fingers. In this paper, we propose a dynamic gesture recognition method that employs a fuzzy feature analysis method for efficient classification of hand motions and applies a fuzzy min-max neural network to on-line pattern recognition.

A computational algorithm for F0 contour generation in Korean developed with prosodically labeled databases using K-ToBI system (K-ToBI 기호에 준한 F0 곡선 생성 알고리듬)

  • Lee YongJu;Lee Sook-hyang;Kim Jong-Jin;Go Hyeon-Ju;Kim Yeong-Il;Kim Sang-Hun;Lee Jeong-Cheol
    • MALSORI / no.35_36 / pp.131-143 / 1998
  • This study describes an algorithm for an F0 contour generation system for Korean sentences and its evaluation results. The study used 400 K-ToBI labeled utterances read by one male and one female announcer. The F0 contour generation system uses two classification trees to predict K-ToBI labels for the input text and 11 regression trees to predict F0 values for the labels. Evaluation of the system showed 77.2% prediction accuracy for IP boundaries and 72.0% prediction accuracy for AP boundaries. Voicing and duration information for the segments was left unchanged for F0 contour generation and its evaluation. In the F0 generation experiment using labelling information from the original speech data, the evaluation showed an RMS error of 23.5 Hz and a correlation coefficient of 0.55.
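
The two evaluation measures quoted here, RMS error and correlation coefficient, can be computed as in the following minimal sketch. It assumes the correlation is the Pearson coefficient and that predicted and observed F0 values are already aligned point by point; the abstract states neither, and the function names are illustrative.

```python
import math


def rms_error(predicted, observed):
    """Root-mean-square error between two aligned sequences (in Hz for F0 values)."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two aligned sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)
```

The two measures are complementary: RMS error reflects absolute deviation in Hz, while the correlation reflects how well the shape of the generated contour follows the original regardless of a constant offset.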

Research on ITB Contract Terms Classification Model for Risk Management in EPC Projects: Deep Learning-Based PLM Ensemble Techniques (EPC 프로젝트의 위험 관리를 위한 ITB 문서 조항 분류 모델 연구: 딥러닝 기반 PLM 앙상블 기법 활용)

  • Hyunsang Lee;Wonseok Lee;Bogeun Jo;Heejun Lee;Sangjin Oh;Sangwoo You;Maru Nam;Hyunsik Lee
    • KIPS Transactions on Software and Data Engineering / v.12 no.11 / pp.471-480 / 2023
  • South Korea's construction order volume grew significantly, from 91.3 trillion won in public orders in 2013 to a total of 212 trillion won in 2021, particularly in the private sector. As the domestic and overseas markets grew, the scale and complexity of EPC (Engineering, Procurement, Construction) projects increased, and risk management of project management and ITB (Invitation to Bid) documents became a critical issue. The time granted to construction companies in the bidding process following the EPC project award is limited, and reviewing all the risk terms in the ITB document is extremely challenging because of manpower and cost constraints. Previous research attempted to categorize the risk terms in EPC contract documents and detect them with AI, but practical use was limited by data-related problems such as the limited availability of labeled data and class imbalance. Therefore, this study aims to develop an AI model that categorizes contract terms in detail based on the FIDIC Yellow 2017 (Federation Internationale Des Ingenieurs-Conseils contract terms) standard, rather than defining and classifying risk terms as in previous research. A multi-text classification function is necessary because the contract terms that need to be reviewed in detail may vary depending on the scale and type of the project. To enhance the performance of the multi-text classification model, we developed an ELECTRA PLM (Pre-trained Language Model) capable of efficiently learning the context of text data from the pre-training stage, and conducted a four-step experiment to validate the model's performance. As a result, the ensemble of the self-developed ITB-ELECTRA model and Legal-BERT achieved the best performance, with a weighted average F1-score of 76% in the classification of 57 contract terms.
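
The weighted average F1-score reported above is commonly computed as the per-class F1 weighted by each class's share of the true labels. The sketch below shows that computation under this assumption; it is not the authors' evaluation code, and the function name is illustrative.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = true_pos / n_pred if n_pred else 0.0
        recall = true_pos / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total  # weight by the class's share of true labels
    return score
```

Weighting by support matters when classes are imbalanced, as the abstract notes for contract-term data: frequent classes dominate the score, so a model can reach a high weighted F1 while still performing poorly on rare classes.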

E-Mail Classification Using Text and Domain Name (텍스트와 도메인 네임을 이용한 메일 분류)

  • 김원화;이일병
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.256-258 / 2003
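  • In the information age, most human activity takes place over the Internet, and e-mail accounts for a very large share of it. Corporate advertisements aimed at attracting customers, lectures for learning, and information about one's own areas of interest will increasingly be received by e-mail. In this situation, people waste a lot of time separating the mail they need from the mail they do not need. To reduce this wasted time, people use mail classification systems. The mail classification systems currently in use are based on spam mail. However, some messages are misclassified, so users sometimes have to look through their spam mail again, which shows the limits of this approach. In this paper, to separate the mail a user wants from the mail the user does not want, e-mail is first classified using positive and negative words and then, in a second stage, classified using the domain name.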
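
The two-stage idea in this abstract can be sketched as follows. The abstract does not specify the word lists, the domain lists, or how the two stages are combined, so everything concrete here is an assumption: the word and domain sets are hypothetical, and the sketch lets a known domain settle the result while an unknown domain keeps the first-stage decision.

```python
# Hypothetical word and domain lists; a real system would build these per user.
POSITIVE_WORDS = {"lecture", "invoice", "meeting"}
NEGATIVE_WORDS = {"free", "prize", "lottery"}
TRUSTED_DOMAINS = {"example.edu"}
BLOCKED_DOMAINS = {"spam.example"}


def classify_by_text(body):
    """Stage 1: count positive and negative words in the message body."""
    words = body.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "wanted"
    if score < 0:
        return "unwanted"
    return "undecided"


def classify_mail(sender, body):
    """Stage 2: refine the stage-1 result using the sender's domain name."""
    first_stage = classify_by_text(body)
    domain = sender.rsplit("@", 1)[-1].lower()
    if domain in TRUSTED_DOMAINS:
        return "wanted"
    if domain in BLOCKED_DOMAINS:
        return "unwanted"
    return first_stage  # unknown domain: keep the text-based decision
```

For example, under these assumptions `classify_mail("prof@example.edu", "free lecture notes")` is undecided after the first stage, because one positive and one negative word cancel out, and is resolved to `"wanted"` by the trusted domain.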

Reputation Analysis of Document Using Probabilistic Latent Semantic Analysis Based on Weighting Distinctions (가중치 기반 PLSA를 이용한 문서 평가 분석)

  • Cho, Shi-Won;Lee, Dong-Wook
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.3 / pp.632-638 / 2009
  • Probabilistic Latent Semantic Analysis has many applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. In this paper, we propose an algorithm that uses a weighted Probabilistic Latent Semantic Analysis model to find contextual phrases and opinions in documents. Traditional keyword search is unable to find the semantic relations of phrases. Overcoming this obstacle requires techniques for automatically classifying the semantic relations of phrases. Through experiments, we show that the proposed algorithm works well for discovering semantic relations of phrases and presents those relations in the vector-space model. The proposed algorithm can perform a variety of analyses, such as document classification, online reputation, and collaborative recommendation.

Translation study on the Gageum's Sanghanbuik (가금(柯琴) "상한부익(傷寒附翼)" 번역(飜譯) 연구(硏究))

  • Jeong, Chang-Hyun;Jang, Woo-Chang
    • Journal of Korean Medical classics / v.18 no.3 s.30 / pp.183-206 / 2005
  • 'Sanghallonju' (傷寒論注) reorganized the formation according to the method of 'the classification of similar symptoms' and annotated the text of Sanghallon, introducing his new methodology, and 'Sanghallonik' (傷寒論翼) proclaimed his new findings in the science of the Sanghan. Meanwhile, 'Sanghanbuik' (傷寒附翼) explains various prescriptions in the 'Sanghallon'. It categorizes prescriptions according to the six Meridians and sums up Gageum's research by commenting on the target symptoms and the use of medicine for each prescription. Gageum's study is consistent in its desire to embody, in the medical scene, the universality of the differentiation of syndromes in accordance with the theory of the six Meridians (六經辨證). From his work, the substantiality of the six 'Sanghanbuik' is a publication that shows the essence of Gageum's medical science from his inclination, conclusion and concrete methodology.

Efficient dimension reduction using QR-decomposition and its application to text categorization (QR-분해를 이용한 효율적인 차원 감소 방법과 문서 분류에의 응용)

  • Lee Moon-Hwi;Park Cheong-Hee
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.358-360 / 2006
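  • LDA is a linear dimension reduction method that improves classification performance in the reduced-dimension space by finding a linear transformation that maximizes the distance between groups and minimizes the variance within groups. In this paper, we propose an efficient dimension reduction method using QR decomposition so that LDA can be applied to undersampled problems. In particular, the proposed method has the advantage that it can be applied effectively even when the independence of the data is not guaranteed, for example when a single document belongs to several categories at once, as in text categorization problems.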

A Distributed Processing Model for Automatic Classification of Text Documents based Personalized Information Using RTI (RTI 통신을 이용한 개인환경기반 자동문서 분산처리 기술)

  • In, Joo-Ho;Kim, Myung-Kyu;Chae, Soo-Hoan
    • Proceedings of the Korean Information Science Society Conference / 2007.06d / pp.365-369 / 2007
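  • As the Internet has become widespread and the amount of text information available online has grown rapidly, effective information management and retrieval for scattered documents is required. Automatic document classification is the task of automatically assigning documents to predefined categories based on their content, which enables efficient information management and retrieval. However, automatic document classification requires a distributed environment for collecting and storing vast amounts of data. In this paper, in building a distributed environment for automatic document classification, we configured a distributed system environment using RTI (Run Time Infrastructure).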
