• Title/Summary/Keyword: 텍스트수준 (text level)

Analysis of global trends on smart manufacturing technology using topic modeling (토픽모델링을 활용한 주요국의 스마트제조 기술 동향 분석)

  • Oh, Yoonhwan;Moon, HyungBin
    • Journal of Korea Society of Industrial Information Systems / v.27 no.4 / pp.65-79 / 2022
  • This study identified smart manufacturing technologies using patents and topic modeling, and compared technology development trends across countries including the United States, Japan, Germany, China, and South Korea. For this purpose, the study collected patents in the United States and Europe between 1991 and 2020, processed the patent abstracts, and identified topics by applying a latent Dirichlet allocation model to the data. As a result, technologies related to smart manufacturing were divided into seven categories. At the global level, the proportion of patents in 'data processing system' and 'thermal/fluid management' technologies was found to be increasing. Given that South Korea is relatively competitive in thermal/fluid management technologies related to smart manufacturing, promoting smart manufacturing in the heavy and chemical industries would be a successful strategy for South Korea. This study is significant in that it overcomes the limitations of quantitative technology level evaluation and proposes a new methodology that applies text mining.

An Experimental Study on the Automatic Classification of Korean Journal Articles through Feature Selection (자질선정을 통한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.39 no.1 / pp.69-90 / 2022
  • To provide basic data that can systematically support and evaluate R&D activities and help set current and future research directions by identifying specific trends in domestic academic research, I sought efficient ways to assign standardized subject categories (controlled keywords) to individual journal papers. To this end, I conducted various experiments on the major factors affecting automatic classification performance, focusing on feature selection techniques, with the aim of automatically assigning categories from the National Research Foundation of Korea's Academic Research Classification Scheme to domestic journal papers. The results showed that, for domestic journal papers, which form imbalanced real-world datasets, a fairly good level of performance can be expected from simpler classifiers, feature selection techniques, and relatively small training sets.

The Discourse associated with mental illness on TV documentaries : The Completion of Distinction (TV 다큐멘터리가 생성한 정신장애 담론 : 구별짓기의 완성)

  • Chang, Hae Kyung;Woo, Ah Young
    • Korean Journal of Social Welfare Studies / v.42 no.1 / pp.179-217 / 2011
  • This paper discusses the types of discourse associated with mental illness and individuals with mental illness in the context of TV documentaries. Discourse is a linguistic product that prescribes and interprets reality and systematically reconstructs it. TV documentary content therefore illuminates the dominant discourse associated with mental illness through diverse types of representation. We selected four TV documentaries from each public channel and analyzed them using Fairclough's Critical Discourse Analysis. Fairclough proposes an analytical framework consisting of three levels. The analysis reveals that TV documentaries produce the discourse of "the Completion of Distinction" associated with mental illness and individuals with mental illness. At the textual level, TV documentaries present the reasons why we distinguish them from us. At the discourse practice level, they present the method and the principal agent of distinction. At the social practice level, TV documentaries reinforce the dual attitude of viewers. An alternative discourse associated with mental illness and individuals with mental illness will be constructed when individuals with mental illness recover their status as principal agents and speak strongly for themselves.

A Self-Guided Approach to Enhance Korean Text Generation in Writing Assistants (A Self-Guided Approach을 활용한 한국어 텍스트 생성 쓰기 보조 기법의 향상 방법)

  • Donghyeon Jang;Jinsu Kim;Minho Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.541-544 / 2023
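  • Fine-tuning an SLM (Small Language Model) on the outputs of very large models such as ChatGPT and GPT-4 has attracted attention as a cost-effective way to improve the performance of LLMs (Large-scale Language Models). However, this approach is mainly used as a training method for general-purpose instruction-following models, and there is room for further performance improvement in restricted, specific domains. This study proposes the Self-Guided Approach, a new method for improving performance in a specific domain (Writing Assistant). The Self-Guided Approach (1) uses an LLM to score seed data on domain-specific metrics (helpfulness, relevance, accuracy, and level of detail), and (2) fine-tunes the SLM in a supervised manner using both the scored and the unscored data. The performance of SLMs trained with the Self-Guided Approach was evaluated using the GPT-4-based automatic evaluation framework proposed in Vicuna. The evaluation showed that the Self-Guided Approach improves performance over existing training methods that tune on generated instruction data, such as Self-instruct and alpaca. Performance improvements from the Self-Guided Approach were confirmed on Korean open-source LLMs of various scales (Polyglot1.3B, PolyGlot3.8B, PolyGlot5.8B). Evaluation was carried out automatically with GPT-4: in the Korean Novel Generation domain, the test-set score improved from 4.547 to 6.286, and in the Korean scenario Generation domain, it improved from 4.038 to 5.795, with comparable score gains in other similar domains. These results confirm the potential of the Self-Guided Approach to improve SLM performance in a specific domain (Writing Assistant); the approach is expected to greatly reduce the cost burden of LLMs while maintaining performance in the restricted domain, and to provide practical help for LLM-based application services.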

BIM-based visualization technology for blasting in Underground Space (지하공간 BIM 기반 발파진동 영향 시각화 기술)

  • Myoung Bae Seo;Soo Mi Choi;Seong Jong Oh;Seong Uk Kim;Jeong Hoon Shin
    • Smart Media Journal / v.12 no.11 / pp.67-76 / 2023
  • We propose a visualization method for responding to civil complaints through analysis of the impact of blasting. To analyze the impact of blasting during tunnel excavation, we propose a simulation visualization method that considers the mutual influence of construction infrastructure by linking measurement data with a 3D BIM model. First, the level of BIM modeling required for simulation was defined. In addition, vibration measurement data were collected at the GTX-A construction site, terrain and structure BIM models were created, and a method for visualizing measurement data using blast vibration estimation was developed. Next, a spherical blasting influence source library was developed to visualize the blasting influence source, and a specification table that can be linked with Revit Dynamo automation logic was constructed. Based on these results, a method for easily visualizing blasting vibration impact analysis in 3D was proposed.

Cost Performance Evaluation Framework through Analysis of Unstructured Construction Supervision Documents using Binomial Logistic Regression (비정형 공사감리문서 정보와 이항 로지스틱 회귀분석을 이용한 건축 현장 비용성과 평가 프레임워크 개발)

  • Kim, Chang-Won;Song, Taegeun;Lee, Kiseok;Yoo, Wi Sung
    • Journal of the Korea Institute of Building Construction / v.24 no.1 / pp.121-131 / 2024
  • This research explores the potential of leveraging unstructured data from construction supervision documents, which contain detailed inspection insights from independent third-party monitors of building construction processes. With the evolution of analytical methodologies, such unstructured data has been recognized as a valuable source of information, offering diverse insights. The study introduces a framework designed to assess cost performance by applying advanced analytical methods to the unstructured data found in final construction supervision reports. Specifically, key phrases were identified using text mining and social network analysis techniques, and these phrases were then analyzed through binomial logistic regression to assess cost performance. The study found that predictions of cost performance based on unstructured data from supervision documents achieved an accuracy rate of approximately 73%. The findings of this research are anticipated to serve as a foundational resource for analyzing various forms of unstructured data generated within the construction sector in future projects.
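
As a rough illustration of the binomial logistic regression step named in the abstract above, the following sketch fits a two-class model on key-phrase indicator features using plain gradient descent. The phrase names, the toy data, and the label meaning (1 = cost overrun) are all hypothetical and are not taken from the paper, and the paper's actual features and fitting procedure may differ.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit binomial logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0

# Hypothetical indicators: does a report mention each key phrase?
PHRASES = ["rework", "delay", "approved"]
X = [[1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 0]]
y = [1, 1, 1, 0, 0, 0]  # hypothetical label: 1 = cost overrun

weights, bias = fit_logistic(X, y)
predictions = [predict(weights, bias, x) for x in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(dict(zip(PHRASES, weights)), accuracy)
```

The sign of each fitted weight indicates whether a phrase pushes the prediction toward or away from the positive class. In practice, accuracy would be measured on held-out reports, not on the training data as in this toy example.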

A Study on Current State of Web Content Accessibility on General Hospital Websites in Korea (국내 종합병원의 웹 접근성 실태에 관한 연구)

  • Kim, Yong-Seob;Oh, Kun-Seok
    • Journal of Internet Computing and Services / v.11 no.3 / pp.87-103 / 2010
  • In this study, we introduce domestic and international trends in web accessibility, as well as the legal systems that ensure it. Based on the Korean Web Content Accessibility Guidelines (KWCAG) 1.0, we investigated the web content accessibility of 80 tertiary health-care hospitals and general hospitals in Korea. We evaluated accessibility by combining accessibility-based criteria (ABC) with usability-based criteria (UBC). ABC was limited to alternative text for Guideline 1, and to the use of a small number of frames and keyboard accessibility for Guideline 2. UBC checked for a voice service (TTS), text resizing, multilingual websites, and disclosure of a web accessibility policy. KADO-WAH2.0 was used to represent the compliance rate. The evaluation results showed considerable improvement over previous results, although the rate of compliance with web accessibility was generally insufficient. There was a significant difference between medical centers that complied with web accessibility and those that did not. Incidentally, many hospitals were found to have attempted to address web accessibility. Going forward, the following are advisable for medical centers of a public nature or serving the public interest: actively promoting the establishment of independent accessibility guidelines to secure web accessibility, improving the implementation of web accessibility, providing ongoing education and promotion, and supplementing institutional measures.

The Effect of the Quality of Pre-Assigned Subject Categories on the Text Categorization Performance (학습문헌집합에 기 부여된 범주의 정확성과 문헌 범주화 성능)

  • Shim, Kyung;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.23 no.2 / pp.265-285 / 2006
  • In text categorization, a certain level of correctness of the labels assigned to training documents is assumed, without solid knowledge of the correctness found in real-world collections. Our research explores the quality of pre-assigned subject categories in a real-world collection and identifies the relationship between the quality of category assignment in the training set and text categorization performance. In particular, we are interested in the extent to which performance can be improved by enhancing the quality (i.e., correctness) of category assignment in training documents. A collection of 1,150 abstracts in computer science was re-classified by an expert group and divided into 907 training documents and 227 test documents (15 duplicates were removed). The performance before and after re-classification, using the Initial set and the Recat-1/Recat-2 sets respectively, was compared using a kNN classifier. The average correctness of subject categories in the Initial set is 16%, and categorization performance with the Initial set is 17% in $F_1$ value. In contrast, the Recat-1 set achieves an $F_1$ value of 61%, which is 3.6 times that of the Initial set.
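
For readers unfamiliar with the $F_1$ measure cited in the abstract above, the short sketch below shows the standard definition (the harmonic mean of precision and recall) and checks the reported ratio between the two $F_1$ values. The function name and the sample precision/recall pair are illustrative assumptions; only the 17% and 61% figures come from the abstract.

```python
def f1_score(precision, recall):
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative precision/recall pair (not from the paper).
example_f1 = f1_score(0.5, 0.5)

# F1 values reported in the abstract.
initial_f1 = 0.17
recat1_f1 = 0.61
ratio = recat1_f1 / initial_f1

print(example_f1)
print(round(ratio, 1))
```

The ratio rounds to 3.6, which matches the figure stated in the abstract.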

Analysis of Students' Understanding of the Terms Presented on the Information Board of Jinan-Muju National Geopark (진안-무주 국가지질공원의 안내 표지판에 제시된 용어에 대한 학생들의 이해도 분석)

  • Cho, Kyu Seong;Park, Kyeong-Jin
    • Journal of the Korean Earth Science Society / v.41 no.5 / pp.520-530 / 2020
  • The purpose of this study was to investigate students' understanding of the terms presented on the information boards in the Jinan-Muju National Geopark. To this end, a survey was conducted with 219 students (147 elementary, 41 middle, and 31 high school students) to determine their perceptions of the geopark, their views on the usefulness of the information boards, and their understanding of the terms presented on them. To assess the students' understanding of terms, 10 representative information boards were selected and their entire content was converted into text. Then, 256 key terms were extracted from the text through discussions with three experts, and these terms were presented to students to gauge their level of understanding. The results were as follows. First, the students' level of awareness of the geopark was very low, so publicity and educational approaches are needed. Second, students were not interested in the information boards and had a low level of understanding owing to the large amount of information and reading difficulties. Third, among the 256 terms, the number that students found difficult to understand tended to decrease with increasing school level: 80 for elementary school students, 53 for middle school students, and 31 for high school students. Elementary school students had difficulty because they had not yet learned the terms in the curriculum, whereas middle and high school students had difficulty with technical terms and Chinese characters. Therefore, the Chinese-character terms on the geopark's information boards need to be rephrased in plainer language, or additional explanations of technical terms need to be provided, so that visitors can understand the concepts more easily.

The Recognition Comparison for the Utilization State of Smart Devices and Culinary Education Application Development of High School Students (고등학생의 스마트 기기 활용 실태와 조리교육 애플리케이션 개발에 대한 인식 비교 연구)

  • Kang, Keoung-Shim
    • Journal of Digital Convergence / v.10 no.11 / pp.619-626 / 2012
  • The purpose of this study is to compare and analyze how general high school and specialized high school students use smart devices and how they perceive the development of educational applications. Specialized high school students preferred using smart devices more, and spent more time on them daily, than general high school students. As for learning fields, language ranked highest for general high school students and qualification certificates for specialized high school students. The main merit of smart device use was making use of spare time, and infrastructure was the most needed element. The most anticipated content was video lectures for general high school students and cooperative learning for specialized high school students, and the most satisfying aspect was mobility. Specialized high school students felt a greater need for application development for culinary education, were more inclined to use such an application, and showed a stronger preference for practice videos. As for the types of food to be covered, general high school students wanted simple food and specialized high school students wanted cooking technician food; both groups wanted the application to be made available on portal sites and the department homepage. Application development for culinary education should focus on simulation learning, including practice videos and cooking recipes, and add an evaluation function to check academic achievement levels. It should provide the subject goals of each course and concrete information on solving problems. Content including video, music, and text needs to be included to improve learning immersion. Lessons should have an introduction and a development phase, and the flow of wrap-up and communication among learners should be improved through SNS-based cooperative learning services.