• Title/Summary/Keyword: text classification (텍스트분류)


An Analysis of School Life Sensibility of Students at Korea National College of Agriculture and Fisheries Using Unstructured Data Mining(1) (비정형 데이터 마이닝을 활용한 한국농수산대학 재학생의 학교생활 감성 분석(1))

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Song, C.Y.;Shin, Y.K.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research
    • /
    • v.21 no.1
    • /
    • pp.99-114
    • /
    • 2019
  • In this study we examined the preferences of students at Korea National College of Agriculture and Fisheries (KNCAF) for eight college life factors. The unstructured data were analyzed with opinion mining and text mining techniques, and the text mining results were visualized as word clouds. The college life factors comprised eight topics closely related to students: 'my present', 'my 10 years later', 'friendship', 'college festival', 'student restaurant', 'college dormitory', 'KNCAF', and 'long-term field practice'. From the text submitted by the students, we built a dictionary of positive and negative words and evaluated preference by classifying emotions as positive or negative. As a result, KNCAF students showed more than 85% positive emotions about 'student restaurant' and 'friendship'. By contrast, 'long-term field practice' and 'college dormitory' showed the lowest satisfaction rates, not exceeding 60%. The remaining topics showed satisfaction of 69.3~74.2%. By gender, male students showed higher positive emotions for 'my present', 'my 10 years later', 'friendship', 'college dormitory' and 'long-term field practice', while female students showed higher positive emotions for 'college festival', 'student restaurant' and 'KNCAF'. In addition, the main positive and negative words were extracted with text mining, and word clouds were created to visualize the results.
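
The dictionary-based classification described in this abstract can be sketched as follows. This is only a minimal illustration: the word lists, the `classify` and `positive_rate` helpers, and the sample responses are hypothetical, and the study's actual dictionaries and scoring rules are not given in the abstract.

```python
from collections import Counter

# Hypothetical mini-lexicons; the study's actual dictionaries are not shown.
POSITIVE = {"good", "delicious", "fun", "happy"}
NEGATIVE = {"bad", "tired", "noisy", "boring"}


def classify(text):
    """Label one response by counting positive and negative lexicon hits."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def positive_rate(responses):
    """Share of positive labels among responses labeled positive or negative."""
    counts = Counter(classify(r) for r in responses)
    decided = counts["positive"] + counts["negative"]
    return counts["positive"] / decided if decided else 0.0


# Hypothetical responses for a single topic such as 'student restaurant'.
responses = [
    "the food is good and delicious",
    "the room is noisy",
    "lunch was fun",
    "nothing to say",
]
print(positive_rate(responses))
```

Responses with no lexicon hits are treated as neutral and left out of the rate, which is one possible design choice among several.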

Analysis of causal factors and physical reactions according to visually induced motion sickness (시각적으로 유발되는 어지럼증(VIMS)에 따른 신체적 반응 및 유발 요인 분석)

  • Lee, Chae-Won;Choi, Min-Kook;Kim, Kyu-Sung;Lee, Sang-Chul
    • Journal of the HCI Society of Korea
    • /
    • v.9 no.1
    • /
    • pp.11-21
    • /
    • 2014
  • We present an experimental framework to analyze the physical reactions and causal factors of Visually Induced Motion Sickness (VIMS) using electroencephalography (EEG) signals and vital signs. We studied eleven subjects who voluntarily participated in the experiments, and we conducted online and offline surveys. To simulate videos with global motions that could cause motion sickness, we extracted global motions by optical flow estimation from hand-held video recordings containing intense motions. We then applied the extracted global motions to our test videos, which consisted of action movies and texts. Each genre of video includes three levels of motion that differ in intensity. EEG signals and vital signs were measured in real time by a portable electrocorticography device and an electronic manometer while the subjects watched the videos, including those with the extracted motions. We analyze the EEG signals using a Distance Map (DM) calculated from the correlation among the channels of the brain signal. We also analyze the vital signs and the survey results to obtain the relationship between VIMS and its causal factors. As a result, we clustered the subjects into three groups based on the analysis of physical reactions using the DM and on the correlation between vital signs and survey results, which shows a strong relationship between VIMS and the intensity of motions.
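
A correlation-based distance map between channels could look like the sketch below. The abstract does not define the DM precisely, so the choice of distance as `1 - r` (with `r` the Pearson correlation) and the helper names `pearson` and `distance_map` are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant signals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance_map(channels):
    """Return a k x k matrix of 1 - r for every pair of channels (assumed metric)."""
    k = len(channels)
    return [
        [1.0 - pearson(channels[i], channels[j]) for j in range(k)]
        for i in range(k)
    ]


# Toy signals standing in for EEG channels (hypothetical values).
channels = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with channel 0
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with channel 0
]
dm = distance_map(channels)
for row in dm:
    print([round(v, 3) for v in row])
```

With this metric, identical or perfectly correlated channels have distance 0 and perfectly anti-correlated channels have distance 2.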

VRML Model Retrieval System Based on XML (XML 기반 VRML 모델 검색 시스템)

  • Im, Min-San;Gwun, O-Bong;Song, Ju-Whan
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.07a
    • /
    • pp.709-711
    • /
    • 2005
  • With advances in computer graphics, the number of 3D models is growing exponentially. Existing systems that search only text or 2D images make accurate retrieval of 3D models difficult. Consequently, the need for 3D model retrieval systems has emerged, and research is under way in many fields to develop 3D model descriptors and retrieval algorithms that improve accuracy and speed. The main goal of this paper is to convert VRML models into XML data and use them for 3D model retrieval. Two retrieval methods are used: retrieval of primitive shapes through classification of VRML nodes, and retrieval using the mass center generated during conversion to XML. That is, by building a 3D model database, the system takes advantage of classification via VRML nodes and of support for a labeled 3D model database. Key values (descriptors) are generated for each 3D model and stored as classified XML data, so that as the number of comparison targets and similarity comparisons grows, models can be retrieved directly from the database, which improves retrieval speed and performance. For similarity comparison of more complex 3D models, we use the LFD (Light Field Descriptor)[6], the method rated most accurate in the Princeton Shape Benchmark (PSB)[1]. This method retrieves models by obtaining 2D images from a 3D model; its drawback is the heavy computation required by the many 2D image viewpoints and by comparing the fit of the observed 2D images. We therefore propose a method that obtains viewpoints along the x, y, and z axes for 2D image observation, which reduces the number of 2D image viewpoints and greatly reduces the amount of computation.
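
As a rough illustration of the mass-center key and the axis-aligned viewpoints mentioned in this abstract, the sketch below averages vertex coordinates and places one viewpoint along each axis. Both choices are assumptions: the paper's actual mass-center definition and viewpoint placement are not given here, and `mass_center` and `axis_viewpoints` are hypothetical helper names.

```python
def mass_center(vertices):
    """Average of vertex coordinates, used here as a simple mass-center estimate."""
    n = len(vertices)
    return tuple(sum(v[i] for v in vertices) / n for i in range(3))


def axis_viewpoints(center, radius):
    """One viewpoint along each of the x, y, and z axes at a given distance."""
    cx, cy, cz = center
    return [
        (cx + radius, cy, cz),
        (cx, cy + radius, cz),
        (cx, cy, cz + radius),
    ]


# Vertices of a unit cube standing in for a parsed VRML model (hypothetical data).
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
center = mass_center(cube)
views = axis_viewpoints(center, radius=2.0)
print(center)
print(views)
```

Three viewpoints are far fewer than the many views a light-field style descriptor would render, which is the source of the reduced computation the abstract describes.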


A Study on Construction of Digital Museum Archiving Regarding Dance Costume (무용공연작품 의상을 위한 디지털 뮤지엄 아카이빙 구축)

  • Jeong, Yu-Jin;Yoo, Ji-Young;Baek, Hyun-Soon
    • Journal of Korea Entertainment Industry Association
    • /
    • v.13 no.1
    • /
    • pp.81-88
    • /
    • 2019
  • This article aims to identify the characters and theme shown in dance costume and utilize them from an educational perspective by constructing digital museum archiving, which can be systematically collected, classified and stored from dance costume. It deals with definition of digital museum archiving as theoretical background and examples of how to create digital museum archiving as research content. The role that archiving plays in digital museum and effectiveness have been demonstrated. Archive is a term used to indicate extensive material and its storage and referred to as an integrative model of display in the computer-generated space. When it comes to producing dance costume as a form of digital museum, the museum is to be made in the computer-generated area of dance costume. The museum shows each division of major, medium and minor classification. The major classification divides genre of dance performance into Korean dance, modern dance and ballet. The middle involves choreographers, costume designers. The minor categorization includes newspaper, interviews, performance pictures, and programs. Digital museum has the value of space utilization, creation, culture, utilization of multiple educational programs, offering of digital museum content, two-way communication, and program development of the new display form.

Analysis of Municipal Ordinances for Smart Cities of Municipal Governments: Using Topic Modeling (지방자치단체의 스마트시티 조례 분석: 토픽모델링을 활용하여)

  • Hyungjun Seo
    • Informatization Policy
    • /
    • v.30 no.1
    • /
    • pp.41-66
    • /
    • 2023
  • This study aims to reveal the direction of municipal ordinances for smart cities, while focusing on 74 municipal ordinances from 72 municipal governments through topic modeling. As a result, the main keywords that show a high frequency belong to establishment and operations of the Smart City Committee. From the result of topic modeling Latent Dirichlet Allocation(LDA), it classifies municipal ordinances for smart cities into eight topics as follows: Topic 1(security for process of smart cities), Topic 2(promotion of smart city industry), Topic 3(composition of a smart city consultative body for local residents), Topic 4(support system for smart cities), Topic 5(management for personal information), Topic 6(use of smart city data), Topic 7(implementation for intelligent public administration), and Topic 8(smart city promotion). As for topic categorization by region, Topics 5, 6, and 8 which are mostly related to the practical operation of smart cities have a significant portion of municipal ordinances for smart cities in the Seoul metropolitan area. Then, Topics 2, 3, and 4 which are mostly related to the initial implementation of smart cities have a significant portion of municipal ordinances for smart cities in provincial areas.
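
For readers unfamiliar with LDA, the following is a minimal collapsed Gibbs sampler in plain Python. It is a teaching sketch, not the study's procedure: the abstract does not say which software or hyperparameters were used, so `lda_gibbs`, its defaults (`alpha`, `beta`, `n_iter`), and the toy documents are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # tokens per topic, per doc
    topic_word = [{w: 0 for w in vocab} for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Hypothetical keyword lists standing in for tokenized ordinances.
docs = [
    ["data", "privacy", "information", "data"],
    ["industry", "promotion", "support", "industry"],
    ["privacy", "information", "data"],
    ["support", "promotion", "committee"],
]
doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
print(doc_topic)
```

Each row of `doc_topic` counts how many tokens of a document were assigned to each topic; normalizing a row gives that document's topic mixture.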

An Analysis of Trends in Natural Language Processing Research in the Field of Science Education (과학교육 분야 자연어 처리 기법의 연구동향 분석)

  • Cheolhong Jeon;Suna Ryu
    • Journal of The Korean Association For Science Education
    • /
    • v.44 no.1
    • /
    • pp.39-55
    • /
    • 2024
  • This study aimed to examine research trends related to Natural Language Processing (NLP) in science education by analyzing 37 domestic and international documents that utilized NLP techniques in the field of science education from 2011 to September 2023. In particular, the study systematically analyzed the content, focusing on the main application areas of NLP techniques in science education, the role of teachers when utilizing NLP techniques, and a comparison of domestic and international perspectives. The analysis results are as follows: Firstly, it was confirmed that NLP techniques are significantly utilized in formative assessment, automatic scoring, literature review and classification, and pattern extraction in science education. Utilizing NLP in formative assessment allows for real-time analysis of students' learning processes and comprehension, reducing the burden on teachers' lessons and providing accurate, effective feedback to students. In automatic scoring, it contributes to the rapid and precise evaluation of students' responses. In literature review and classification using NLP, it helps to effectively analyze the topics and trends of research related to science education and student reports. It also helps to set future research directions. Utilizing NLP techniques in pattern extraction allows for effective analysis of commonalities or patterns in students' thoughts and responses. Secondly, the introduction of NLP techniques in science education has expanded the role of teachers from mere transmitters of knowledge to leaders who support and facilitate students' learning, requiring teachers to continuously develop their expertise. Thirdly, as domestic research on NLP is focused on literature review and classification, it is necessary to create an environment conducive to the easy collection of text data to diversify NLP research in Korea. Based on these analysis results, the study discussed ways to utilize NLP techniques in science education.

Enhancing Empathic Reasoning of Large Language Models Based on Psychotherapy Models for AI-assisted Social Support (인공지능 기반 사회적 지지를 위한 대형언어모형의 공감적 추론 향상: 심리치료 모형을 중심으로)

  • Yoon Kyung Lee;Inju Lee;Minjung Shin;Seoyeon Bae;Sowon Hahn
    • Korean Journal of Cognitive Science
    • /
    • v.35 no.1
    • /
    • pp.23-48
    • /
    • 2024
  • Building human-aligned artificial intelligence (AI) for social support remains challenging despite the advancement of Large Language Models. We present a novel method, the Chain of Empathy (CoE) prompting, that utilizes insights from psychotherapy to induce LLMs to reason about human emotional states. This method is inspired by various psychotherapy approaches-Cognitive-Behavioral Therapy (CBT), Dialectical Behavior Therapy (DBT), Person-Centered Therapy (PCT), and Reality Therapy (RT)-each leading to different patterns of interpreting clients' mental states. LLMs without CoE reasoning generated predominantly exploratory responses. However, when LLMs used CoE reasoning, we found a more comprehensive range of empathic responses aligned with each psychotherapy model's different reasoning patterns. For empathic expression classification, the CBT-based CoE resulted in the most balanced classification of empathic expression labels and the text generation of empathic responses. However, regarding emotion reasoning, other approaches like DBT and PCT showed higher performance in emotion reaction classification. We further conducted qualitative analysis and alignment scoring of each prompt-generated output. The findings underscore the importance of understanding the emotional context and how it affects human-AI communication. Our research contributes to understanding how psychotherapy models can be incorporated into LLMs, facilitating the development of context-aware, safe, and empathically responsive AI.

Definition and Division in Intelligent Service Facility for Integrating Management (지능화시설의 통합운영관리를 위한 정의 및 구분에 관한 연구)

  • PARK, Jeong-Woo;YIM, Du-Hyun;NAM, Kwang-Woo;KIM, Jin-Young
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.19 no.4
    • /
    • pp.52-62
    • /
    • 2016
  • Smart City is urban development for complex problem solving that provides convenience and safety for citizens, and it is a blueprint for future cities. In 2008, the Korean government defined the construction, management, and government support of U-Cities in the legislation, Act on the Construction, Etc. of Ubiquitous Cities (Ubiquitous City Act), which included definitions of terms used in the act. In addition, the Minister of Land, Infrastructure and Transport has established a "ubiquitous city master plan" considering this legislation. The concept of U-Cities is complex, due to the mix of informatization and urban planning. Because of this complexity, the foundation of relevant regulations is inadequate, which is impeding the establishment and implementation of practical plans. Smart City intelligent service facilities are not easy to define and classify, because technology is rapidly changing and includes various devices for gathering and expressing information. The purpose of this study is to complement the legal definition of the intelligent service facility, which is necessary for integrated management and operation. The related laws and regulations on U-City were analyzed using text-mining techniques to identify insufficient legal definitions of intelligent service facilities. Using data gathered from interviews with officials responsible for constructing U-Cities, this study identified problems generated by implementing intelligent service facilities at the field level. This strategy should contribute to improved efficiency management, the foundation for building integrated utilization between departments. Efficiencies include providing a clear concept for establishing five-year renewable plans for U-Cities.

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.1-19
    • /
    • 2018
  • Large amount of data is now available for research and business sectors to extract knowledge from it. This data can be in the form of unstructured data such as audio, text, and image data and can be analyzed by deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. Especially, fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engine, and automatic product recommendation. The core model of these applications is the image classification using Convolutional Neural Networks (CNN). CNN is made up of neurons which learn parameters such as weights while inputs come through and reach outputs. CNN has layer structure which is best suited for image classification as it is comprised of convolutional layer for generating feature maps, pooling layer for reducing the dimensionality of feature maps, and fully-connected layer for classifying the extracted features. However, most of the classification models have been trained using online product image, which is taken under controlled situation such as apparel image itself or professional model wearing apparel. This image may not be an effective way to train the classification model considering the situation when one might want to classify street fashion image or walking image, which is taken in uncontrolled situation and involves people's movement and unexpected pose. Therefore, we propose to train the model with runway apparel image dataset which captures mobility. This will allow the classification model to be trained with far more variable data and enhance the adaptation with diverse query image. To achieve both convergence and generalization of the model, we apply Transfer Learning on our training network. As Transfer Learning in CNN is composed of pre-training and fine-tuning stages, we divide the training step into two. 
First, we pre-train our architecture with a large-scale dataset, ImageNet, which consists of 1.2 million images in 1000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture because it achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. Because we could not find any existing public runway image dataset, we collected one from Google Image Search, obtaining 2426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We suggest the idea of training the model with images capturing all possible postures, which we denote as mobility, by using our own runway apparel image dataset. Moreover, by applying Transfer Learning and using the checkpoint and parameters provided by Tensorflow Slim, we reduced the time spent training the classifier to 6 minutes per experiment. This model can be used in many business applications where the query image can be a runway image, product image, or street fashion image. 
To be specific, runway query image can be used for mobile application service during fashion week to facilitate brand search, street style query image can be classified during fashion editorial task to classify and label the brand or style, and website query image can be processed by e-commerce multi-complex service providing item information or recommending similar item.
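
One way to read the "10-fold experiments" mentioned in this abstract is as 10-fold cross-validation over the 2426 collected images. The sketch below builds such splits by index; the shuffling seed and the `k_fold_indices` helper are assumptions for illustration, and the authors' actual splitting procedure is not described in the abstract.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# 2426 is the dataset size reported in the abstract.
splits = list(k_fold_indices(2426, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each image appears in exactly one test fold, so averaging accuracy over the ten runs uses every image for evaluation once.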

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

  • Kam, Miah;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.53-77
    • /
    • 2012
  • This study analyzes differences in the content and tone of arguments among three major Korean newspapers, the Kyunghyang Shinmun, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that newspapers in Korea explicitly deliver their own tone of arguments when they cover sensitive issues and topics. It could be controversial if readers read the news without being aware of the tone of arguments, because the contents and the tones of arguments can easily affect readers. Thus it is very desirable to have a new tool that can inform readers of what tone of argument a newspaper has. This study presents the results of clustering and classification techniques as part of text mining analysis. We focus on six main subjects in newspapers, namely Culture, Politics, International, Editorial-opinion, Eco-business and National issues, and attempt to identify differences and similarities among the newspapers. The basic unit of text mining analysis is a paragraph of a news article. This study uses a keyword-network analysis tool and visualizes relationships among keywords to make the differences easier to see. Newspaper articles were gathered from KINDS, the Korean integrated news database system. KINDS preserves news articles of the Kyunghyang Shinmun, the HanKyoreh and the Dong-A Ilbo, and these are open to the public. This study used these three major Korean newspapers from KINDS. About 3,030 articles from 2008 to 2012 were used. The International, National issues and Politics sections were gathered for specific issues. The International section was collected with the keyword 'Nuclear weapon of North Korea.' The National issues section was collected with the keyword '4-major-river.' The Politics section was collected with the keyword 'Tonghap-Jinbo Dang.' All of the articles from April 2012 to May 2012 in the Eco-business, Culture and Editorial-opinion sections were also collected. 
All of the collected data were split and edited into paragraphs. We removed stop-words using the Lucene Korean Module. We calculated keyword co-occurrence counts from the paired co-occurrence list of keywords in a paragraph, and built a co-occurrence matrix from the list. Once the co-occurrence matrix was built, we used the Cosine coefficient matrix as input for PFNet (Pathfinder Network). To analyze these three newspapers and find the significant keywords in each paper, we analyzed the list of the 10 highest-frequency keywords and the keyword-networks of the 20 highest-frequency keywords to closely examine the relationships and show the detailed network map among keywords. We used NodeXL software to visualize the PFNet. After drawing all the networks, we compared the results with the classification results. Classification was first performed to identify how the tone of argument of one newspaper differs from the others. Then, to analyze tones of arguments, all the paragraphs were divided into two types of tone, Positive and Negative. To identify and classify the tones of all the paragraphs and articles we had collected, a supervised learning technique was used. The Naïve Bayesian classifier algorithm provided in the MALLET package was used to classify all the paragraphs in the articles. After classification, Precision, Recall and F-value were used to evaluate the classification results. Based on the results of this study, three subjects, Culture, Eco-business and Politics, showed some differences in contents and tones of arguments among these three newspapers. In addition, for the National issues, tones of arguments on the 4-major-rivers project differed from each other. It seems the three newspapers have their own specific tone of argument in those sections. The keyword-networks also showed different shapes from each other in the same period and the same section. 
This means that the frequently appearing keywords in the articles differ and that their contents are composed of different keywords. The Positive-Negative classification also showed that a newspaper's tone of argument can be distinguished from that of the others. These results indicate that the approach in this study is promising and can be extended into a new tool for identifying the different tones of arguments of newspapers.
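
The co-occurrence and cosine steps of the pipeline in this abstract can be sketched in plain Python as below. The paragraph keyword lists and the helper names (`cooccurrence`, `cooccurrence_matrix`, `cosine`) are hypothetical; the Pathfinder Network pruning, the NodeXL visualization, and the MALLET classifier are not reproduced here.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence(paragraphs):
    """Count how often each unordered keyword pair shares a paragraph."""
    counts = Counter()
    for keywords in paragraphs:
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence_matrix(counts):
    """Build a symmetric keyword-by-keyword matrix from the pair counts."""
    vocab = sorted({w for pair in counts for w in pair})
    index = {w: i for i, w in enumerate(vocab)}
    m = [[0] * len(vocab) for _ in vocab]
    for (a, b), c in counts.items():
        m[index[a]][index[b]] = c
        m[index[b]][index[a]] = c
    return vocab, m


def cosine(u, v):
    """Cosine coefficient between two co-occurrence rows."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical keyword lists, one per paragraph, after stop-word removal.
paragraphs = [
    ["river", "project", "budget"],
    ["river", "project"],
    ["budget", "election"],
]
counts = cooccurrence(paragraphs)
vocab, m = cooccurrence_matrix(counts)
sim = [[cosine(r1, r2) for r2 in m] for r1 in m]
print(vocab)
print([round(v, 3) for v in sim[vocab.index("project")]])
```

The resulting `sim` matrix plays the role of the cosine coefficient matrix that the abstract describes as the input to PFNet.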