• Title/Summary/Keyword: Korean text classification

Liu Hui and the Nine Chapters on the Mathematical Art (유휘와 구장산술)

  • 홍성사;홍영희
    • Journal for History of Mathematics / v.11 no.1 / pp.27-35 / 1998
  • Just as Chinese philosophy developed through commentaries on the original texts, the Nine Chapters was greatly improved by the commentary of Liu Hui, which transformed it from an arithmetic text into a work of mathematics. Comparing his commentary with the development of Chinese philosophy up to his time, we conclude that Liu Hui was able to make such a great leap through his thorough understanding of that philosophical development.

Text Classification Using Heterogeneous Knowledge Distillation

  • Yu, Yerin;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information / v.27 no.10 / pp.29-41 / 2022
  • Recently, with the development of deep learning technology, a variety of huge models with excellent performance have been devised by pre-training on massive amounts of text data. However, for such a model to be applied to real-life services, inference must be fast and the amount of computation must be low, so model compression technology is attracting attention. Knowledge distillation, a representative model compression technique, is widely used as a method of transferring the knowledge already learned by a teacher model to a relatively small student model. However, knowledge distillation has a limitation: because the teacher model learns only the knowledge necessary for solving a given problem, and distillation to the student model is performed from the same point of view, it is difficult to solve problems with low similarity to previously learned data. Therefore, we propose a heterogeneous knowledge distillation method in which the teacher model learns a higher-level concept rather than the knowledge required for the task that the student model needs to solve, and then distills this knowledge to the student model. In addition, through classification experiments on about 18,000 documents, we confirmed that the heterogeneous knowledge distillation method outperformed traditional knowledge distillation in both learning efficiency and accuracy.
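
The traditional knowledge distillation that this abstract uses as its baseline can be sketched as follows. This is a minimal illustration of the commonly used soft-target loss, not the heterogeneous method the authors propose; the function names, the temperature of 2.0, and the weighting `alpha` are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Soft term: KL(teacher || student) on temperature-softened outputs,
    scaled by temperature squared. Hard term: cross-entropy with the true label.
    (temperature and alpha are illustrative assumptions.)
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * (temperature ** 2) * kl + (1 - alpha) * hard
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the hard-label cross-entropy remains.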

A Study on Environmental research Trends by Information and Communications Technologies using Text-mining Technology (텍스트 마이닝 기법을 이용한 환경 분야의 ICT 활용 연구 동향 분석)

  • Park, Boyoung;Oh, Kwan-Young;Lee, Jung-Ho;Yoon, Jung-Ho;Lee, Seung Kuk;Lee, Moung-Jin
    • Korean Journal of Remote Sensing / v.33 no.2 / pp.189-199 / 2017
  • This study quantitatively analyzed research trends in the use of ICT in the environmental field using text mining. To that end, the study collected 359 papers published over the past two decades (1996-2015) from the National Digital Science Library (NDSL) using 38 environment-related keywords and 16 ICT-related keywords. It processed the natural language of the environment and ICT fields in the papers and reorganized the classification system into the unit of corpus. Based on the above-mentioned keywords of the classification system, it applied the text mining techniques of frequency analysis, keyword analysis, and association rule analysis of keywords. As a result, the keywords 'general environment' and 'climate' accounted for 77% of the total, and the keywords 'public convergence service' and 'industrial convergence service' in the ICT field accounted for approximately 30% of the total. According to the time series analysis, studies using ICT in the environmental field increased rapidly over the past 5 years (2011-2015), and the number of such studies more than doubled compared to the earlier period (1996-2010). Based on the association rules generated among the keywords in the environmental field, the keyword 'general environment' was found to use 16 ICT-based technologies and 'climate' to use 14 ICT-based technologies.
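
The frequency analysis and association rule analysis of keywords mentioned above can be sketched as follows. This is a minimal illustration restricted to pairwise rules, assuming each paper is represented as a list of keywords; the function names and thresholds are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def keyword_frequencies(papers):
    """Count in how many papers each keyword appears."""
    counts = Counter()
    for keywords in papers:
        counts.update(set(keywords))
    return counts

def association_rules(papers, min_support=0.3, min_confidence=0.6):
    """Return pairwise rules (antecedent, consequent, support, confidence).

    support    = share of papers that contain both keywords
    confidence = papers containing both / papers containing the antecedent
    (the thresholds are illustrative assumptions)
    """
    n = len(papers)
    item_counts = keyword_frequencies(papers)
    pair_counts = Counter()
    for keywords in papers:
        for a, b in combinations(sorted(set(keywords)), 2):
            pair_counts[(a, b)] += 1
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules
```

With keyword lists built from the paper corpus, counting the rules whose antecedent is an environmental keyword would give the number of ICT technologies associated with it.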

Effective Text Question Analysis for Goal-oriented Dialogue (목적 지향 대화를 위한 효율적 질의 의도 분석에 관한 연구)

  • Kim, Hakdong;Go, Myunghyun;Lim, Heonyeong;Lee, Yurim;Jee, Minkyu;Kim, Wonil
    • Journal of Broadcast Engineering / v.24 no.1 / pp.48-57 / 2019
  • The purpose of this study is to understand the intention of the inquirer from a single text question in goal-oriented dialogue. A goal-oriented dialogue system is a dialogue system that satisfies the user's specific needs via text or voice. Intention analysis is the step of analyzing the user's intention before answer generation, and it has a great influence on the performance of the entire goal-oriented dialogue system. The proposed model was applied to a daily chemical products domain, using Korean text data related to that domain. The analysis is divided into speech-act, which is independent of a specific field, and concept-sequence, which depends on a specific field. We propose a classification method using a word embedding model and a CNN for analyzing speech-act and concept-sequence. The semantic information of each word is abstracted through the word embedding model, and concept-sequence and speech-act classification are performed by the CNN based on this abstracted semantic information.
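
The general idea of classifying a question with a CNN over word embeddings can be sketched as follows. This is a toy forward pass only (convolution, ReLU, max-over-time pooling, linear scoring) with hand-set weights; the function names, filter shapes, and weights are assumptions for illustration and do not reproduce the authors' architecture.

```python
def conv1d_max_pool(embeddings, filt):
    """Slide one filter over the word sequence and keep the strongest response.

    embeddings: list of word vectors, one per token
    filt:       list of k weight vectors, covering a window of k words
    """
    k = len(filt)
    activations = []
    for start in range(len(embeddings) - k + 1):
        window = embeddings[start:start + k]
        score = sum(w * x
                    for filt_vec, word_vec in zip(filt, window)
                    for w, x in zip(filt_vec, word_vec))
        activations.append(max(0.0, score))  # ReLU
    # Sequences shorter than the window produce no activation.
    return max(activations) if activations else 0.0

def classify(embeddings, filters, class_weights):
    """Pool one feature per filter, then pick the class with the highest linear score."""
    features = [conv1d_max_pool(embeddings, f) for f in filters]
    scores = [sum(w * f for w, f in zip(weights, features))
              for weights in class_weights]
    return scores.index(max(scores))
```

In a trained model the filters and class weights would be learned, and separate output layers could be used for speech-act and concept-sequence labels.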

A Study on "HuatuoXuanmenNeizhaotu" ("화타현문내조도(華陀玄門內照圖)"에 대(對)한 연구(硏究))

  • Sim, Hyun-A;Keum, Kyung-Soo;Jung, Hyen-Young;Choi, Hyun-Bae;Eom, Dong-Myung
    • Journal of the Korean Institute of Oriental Medical Informatics / v.18 no.1 / pp.1-63 / 2012
  • Objective: "Huatuoxuanmenneizhaotu" is a work ascribed to Huatuo that is estimated to date from about the 5th to 6th century. It presents the first anatomical pictures and laid the foundation for the later expansion of anatomical knowledge. The book consists of two volumes and is broadly divided into six parts. We examine its content and features. Method: Through translation of the text of "Huatuoxuanmenneizhaotu", we classify the book in two ways: 1) the first volume, Chapter 1, Pictures; 2) the second volume, divided into four parts, with Chapter 2 (Viscera Disease), Chapter 3 (Viscera Metastasis), and Chapter 4 (Viscera and Bowel Metastasis) each explained. Result: We examined the classification of disease symptoms, the classification of medicines, and the processing of medicinals. Viscera disease symptoms were each classified as wind pattern(風證), qi pattern(氣證), heat pattern(熱證), cold pattern(冷證), or deficiency pattern(虛證). The same method was used to explain Viscera into Viscera and Viscera into Bowel. Bowel disease symptoms that are not mentioned under Viscera Disease were also found. Conclusion: These results show that the contents of "HuatuoXuanmenNeizhaotu" are diverse.

Classification of Characters in Movie by Correlation Analysis of Genre and Linguistic Style

  • You, Eun-Soon;Song, Jae-Won;Park, Seung-Bo
    • Journal of the Korea Society of Computer and Information / v.24 no.1 / pp.49-55 / 2019
  • Character dialogue created by AI is unnatural compared with human-made dialogue, and despite the remarkable development of AI it cannot properly reveal a character's personality. The purpose of this paper is to classify characters by linguistic style and to investigate the relation between specific linguistic styles and personality. We analyzed the dialogues of 92 characters selected from a total of 60 movies categorized into four movie genres (romantic comedy, action, comedy, and horror/thriller) using Linguistic Inquiry and Word Count (LIWC), a text analysis software. As a result, we confirmed that each genre has a unique language style. In particular, we found that emotional tone and analytical thinking are two important features for classification. They proved to be very important features, as precision and recall were over 78% for romantic comedy and action. However, precision and recall were 66% and 50% for comedy and horror/thriller, so their impact on classification was smaller than for the romantic comedy and action genres. Characters in romantic comedy, which deals with affection between men and women, show a much higher value of emotional tone than of analytical thinking. Characters in the action genre, who need rational judgment to perform missions, show much greater analytical thinking than emotional tone. Additionally, for comedy and horror/thriller, we found that these genres have many kinds of characters and that characters often change their personalities during the story.
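
The per-genre precision and recall figures quoted above follow the standard definitions, which can be sketched as follows. The function name and the label lists in the usage note are hypothetical and are not data from the paper.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for one class (for example, one movie genre).

    precision = correct predictions of target / all predictions of target
    recall    = correct predictions of target / all true instances of target
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For instance, with true genres `["action", "action", "comedy", "romance"]` and predictions `["action", "comedy", "comedy", "action"]`, the call `precision_recall(true, pred, "action")` counts one true positive, one false positive, and one false negative.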

Korean Mobile Spam Filtering System Considering Characteristics of Text Messages (문자메시지의 특성을 고려한 한국어 모바일 스팸필터링 시스템)

  • Sohn, Dae-Neung;Lee, Jung-Tae;Lee, Seung-Wook;Shin, Joong-Hwi;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.7 / pp.2595-2602 / 2010
  • This paper introduces a mobile spam filtering system that considers the style of short text messages sent to mobile phones when detecting spam. The proposed system not only relies on the occurrence of content words, as previously suggested, but also leverages style information to reduce critical cases in which legitimate messages containing spam words are misclassified as spam. Moreover, the accuracy of spam classification is improved by normalizing the messages through the correction of word spacing and spelling errors. Experimental results using real-world Korean text messages show that the proposed system is effective for Korean mobile spam filtering.

Adipose Tumor, Fibroblastic/Myofibroblastic Tumors, So-called Fibrohistiocytic Tumors, Smooth Muscle Tumors, Pericytic Tumors and Skeletal Muscle Tumors: An Update Based on the New WHO Soft Tissue Classification (연조직종양의 새로운 WHO 분류를 중심으로: 지방세포종, 섬유모세포성/근육섬유모세포성종, 소위섬유조직구종, 평활근종, 혈관주위종과 근골격종에 대하여)

  • Suh, Kyung-Jin
    • The Journal of the Korean bone and joint tumor society / v.14 no.1 / pp.1-9 / 2008
  • Soft tissue tumor classifications should be an important part of radiology and oncology, and for clinicians and pathologists they provide diagnostic instruction and prognostic guidelines. Among soft tissue tumor classification systems, the World Health Organization (WHO) classifications have become dominant, enabled by the timely publication of new 'blue books' that included detailed text and numerous good illustrations. The new WHO classification of soft tissue tumors was introduced in 2002. Because the classification represents a broad consensus concept, it has gained widespread acceptance around the globe. This article reviews the changes introduced in the adipose tumors, fibroblastic/myofibroblastic tumors, so-called fibrohistiocytic tumors, smooth muscle tumors, pericytic tumors and skeletal muscle tumors that have been first recognized or properly classified during the past decade.

Vascular Tumors, Chondroid-osseous Tumors, Tumors of Uncertain Differentiation: An Update Based on the New WHO Soft Tissue Classification (연조직종양의 새로운 WHO 분류를 중심으로: 혈관종, 연골-골종과 불확실한분화종에 대하여)

  • Suh, Kyung-Jin
    • The Journal of the Korean bone and joint tumor society / v.14 no.2 / pp.79-85 / 2008
  • Soft tissue tumor classifications should be an important part of radiology and oncology, and for orthopedic clinicians and pathologists they provide diagnostic instruction and prognostic guidelines. Among soft tissue tumor classification systems, the World Health Organization (WHO) classifications have become dominant, enabled by the timely publication of new blue books that included detailed text and numerous good illustrations. The new WHO classification of soft tissue tumors was introduced in 2002. Because the classification represents a broad consensus concept, it has gained widespread acceptance around the globe. This article reviews the changes introduced in the vascular tumors, chondroid-osseous tumors and tumors of uncertain differentiation that have been first recognized or properly classified during the past decade.

A Study on the Improvement of Retrieval Efficiency Based on the CRFMD (공통기술표현포맷에 기반한 다매체자료의 검색효율 향상에 관한 연구)

  • Park, Il-Jong;Jeong, Ki-Tai
    • Journal of the Korean Society for information Management / v.23 no.3 s.61 / pp.5-21 / 2006
  • In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems and have progressed quickly with the rapid progress in data processing speeds. This study proposes a common representation format for multimedia documents (CRFMD) composed of both images and text to form a single data structure. It also shows that image classification of a given test set is dramatically improved when text features are encoded together with image features. CRFMD might be applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval.