• Title/Summary/Keyword: text research (텍스트 연구)

Advancing Societal Statistics Processing Methodology through Artificial Intelligence: A Case Study on Household Trend Survey and Time Use Survey (인공지능 기반 사회 통계 생산 방법론 고도화 방안: 가계동향조사와 생활시간조사 사례)

  • Kyo-Joong Oh;Ho-Jin Choi;Ilgu Kim;Seungwoo Han;Kunsoo Kim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.563-567 / 2023
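  • This study is an attempt to innovate the data processing procedures and methods of the Household Trend Survey and the Time Use Survey conducted by Statistics Korea. It aims at artificial-intelligence-based statistics production that overcomes the limitations of existing statistics production methodology and enables effective management and analysis of large-scale data. Conducted at the intersection of data science and statistics, the study uses artificial intelligence techniques, in particular natural language processing and deep learning, to verify the performance of unstructured text classification methods, and explores the scalability of AI-based statistical classification methodology and the possibility of extending it to additional surveys. The results contribute to improving the quality and reliability of statistical data and provide a deeper and more accurate understanding of people's life patterns and behavior.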

Web Accessibility of Healthcare Websites of Korean Government and Public Agencies: Automated and Expert Evaluations (정부 및 공공기관의 보건 관련 웹 사이트의 웹 접근성 - 자동 및 전문가 평가 -)

  • Yi, Yong Jeong
    • Journal of the Korean BIBLIA Society for library and Information Science / v.26 no.4 / pp.283-304 / 2015
  • The purpose of this study was to identify Web accessibility issues of healthcare websites of the Korean government and public agencies by evaluating these websites' accessibility in accordance with the Korean Web Contents Accessibility Guideline. This study conducted both automated and expert testing to assess the accessibility of a total of 27 health-related websites. The results of the two-stage assessment indicated that institutions such as the National Hospital and National Rehabilitation Center demonstrated almost no Web accessibility errors. In addition, the Korea Health Insurance Review and Assessment Service, the Ministry of Health and Welfare, the Health Services Agency, the Ministry of Food and Drug Safety, and the Korea Medical Dispute Mediation and Arbitration Agency attained very high Web accessibility. However, the expert evaluation revealed considerable errors in providing appropriate alternative text, which were not found in the automated test, and the color contrast of the text content did not comply with the Web accessibility standard. Therefore, these websites did not support Web accessibility for the sight-impaired. Furthermore, the present study found that it was difficult to deliver accurate information to users due to errors in the default language display and markup; issues of skipping repeated content, content linearization, and compliance with keyboard use were also identified as challenges that might arise for people with sight, cognitive, and mobility impairments. This is the first study to evaluate the accessibility of healthcare websites of the Korean government and public agencies based on the Korean Web Contents Accessibility Guideline. The present study contributes to research on Web accessibility by conducting expert testing, which provided a more complete assessment that identified the degree and specific issues of accessibility errors when compared to automated testing.

Discovering Interdisciplinary Convergence Technologies Using Content Analysis Technique Based on Topic Modeling (토픽 모델링 기반 내용 분석을 통한 학제 간 융합기술 도출 방법)

  • Jeong, Do-Heon;Joo, Hwang-Soo
    • Journal of the Korean Society for information Management / v.35 no.3 / pp.77-100 / 2018
  • The objective of this study is to present a process for discovering interdisciplinary convergence technologies using text mining of big data. For the convergence research of biotechnology (BT) and information communications technology (ICT), the following steps were performed. (1) Collecting sufficient metadata of research articles based on a BT terminology list. (2) Generating the intellectual structure of emerging technologies by using a Pathfinder network scaling algorithm. (3) Analyzing contents with topic modeling. The next three steps were then used to derive items of BT-ICT convergence technology. (4) Expanding the BT terminology list into superior concepts of technology to obtain ICT-related information from BT. (5) Automatically collecting metadata of research articles of the two fields by using an OpenAPI service. (6) Analyzing the contents of BT-ICT topic models. Our study yields the following findings. First, a terminology list can be an important knowledge base for discovering convergence technologies. Second, the analysis of a large quantity of literature requires text mining, which facilitates the analysis by reducing the dimension of the data. The methodology we suggest here for processing and analyzing data is efficient for discovering technologies with a high possibility of interdisciplinary convergence.

Analyzing Comments of YouTube Video to Measure Use and Gratification Theory Using Videos of Trot Singer, Cho Myung-sub (YouTube 동영상 의견분석을 통한 사용과 충족 이론 측정 : 트로트 가수 조명섭 동영상을 중심으로)

  • Hong, Han-Kook;Leem, Byung-hak;Kim, Sam-Moon
    • The Journal of the Korea Contents Association / v.20 no.9 / pp.29-42 / 2020
  • The purpose of this study is to present a qualitative research method for extracting and analyzing the comments written by YouTube video users. To do this, we used YouTube users' feedback to measure the hedonic, social, and utilitarian gratification of use and gratification theory (UGT) using analysis and topic modeling. The measurement showed that the primary reason users watch videos of the trot singer Cho Myung-sub on the KBS Korean broadcasting channel is hedonic gratification, which appeared with high frequency. In the word-document network analysis, degree centrality was high for words such as 'cheering', 'thank you', 'fighting', and 'best'. Betweenness centrality shows a pattern similar to degree centrality. Eigenvector centrality also shows that words such as 'love', 'heart', and 'thank you' are the most influential words in users' opinions. The results of the centrality analysis indicate that the majority of video users express 'love', 'heart', and 'thank you' for the video. This indicates that the words ranking high in the centrality analysis are consistent with the high-frequency words of the hedonic and social gratification dimensions of UGT. The study has a methodological implication: it sheds light on the motivations for watching YouTube videos with UGT using text mining techniques that automate qualitative analysis, rather than following a survey-based structural equation model.
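The degree centrality mentioned in this abstract can be illustrated with a minimal sketch. It assumes a bipartite word-document network in which a word is linked to every comment it appears in, and it normalises a word's degree by the number of comments. The function name and the sample comments are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def word_degree_centrality(comments):
    """Degree centrality of each word in a bipartite word-document network.

    A word is linked to every comment (document) it appears in. Its degree is
    divided by the number of comments, so values fall between 0 and 1.
    """
    n_docs = len(comments)
    if n_docs == 0:
        return {}
    docs_per_word = defaultdict(set)
    for doc_id, text in enumerate(comments):
        # set() so that a word repeated inside one comment counts once
        for word in set(text.lower().split()):
            docs_per_word[word].add(doc_id)
    return {word: len(ids) / n_docs for word, ids in docs_per_word.items()}


# Hypothetical comments, not data from the study.
comments = [
    "thank you best singer",
    "love this song thank you",
    "best voice love",
    "cheering for you",
]
centrality = word_degree_centrality(comments)
# Rank by centrality (descending), breaking ties alphabetically.
top = sorted(centrality.items(), key=lambda kv: (-kv[1], kv[0]))[:3]
print(top)
```

Betweenness and eigenvector centrality need the full graph structure and are usually computed with a graph library rather than by hand.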

A Study on Consumer Value Perception through Social Big Data Analysis: Focus on Smartphone Brands (소셜 빅데이터 분석을 통한 소비자 가치 인식 연구: 신규 스마트폰을 중심으로)

  • Kim, Hyong-Jung;Kim, Jin-Hwa
    • The Journal of Society for e-Business Studies / v.22 no.1 / pp.123-146 / 2017
  • The information that consumers share on SNS (Social Networking Services) has a great influence on consumers' purchases. Therefore, it is necessary to pay attention to new research methodologies and advertising strategies that use Social Big Data. In this context, the purpose of this study is to quantitatively analyze customer value through Social Big Data. In this study, we analyzed the value structure of consumers for three smartphone brands through text mining and positive/negative image analysis. The analysis made it possible to distinguish the emotional aspects (sensitivity) and rational aspects (rationality) of customer value for each brand. In the case of the Galaxy S7 and iPhone 6S, emotional aspects were important before the launch, but rational aspects were important after the release date. On the other hand, in the case of the LG G5, emotional aspects were important both before and after launch. We can propose two core advertising strategies based on the analyzed consumer value. When developing an advertising strategy for the Galaxy S7, there is a need to emphasize the rational aspects of product attributes and differentiated functions. In the case of the LG G5, it is necessary to consider in the advertising strategy the emotional aspects of happiness, excitement, pleasure, and fun that are felt when using the product. As a result, this study will provide a good standard for actual advertising strategy through consumer value analysis. Advertising strategies are primarily driven by intuition or experience. Therefore, it is important to develop advertising strategies by analyzing consumer value through social big data analysis.

Analysis of Film 〈Obaltan〉 focused on Narratology's Viewpoint (서사학적 관점으로 분석한 영화〈오발탄〉의 서사구조 연구)

  • Kim, Jong-Wan
    • The Journal of the Korea Contents Association / v.11 no.11 / pp.111-119 / 2011
  • Film research after the structuralism of the 1980s shows a tendency to move away from studying the director or the text and toward analyzing the spectator and the act of viewing. These post-structuralist approaches shift interest from analysis based on the director's intention or the form of the text to the interpretive conventions of spectators. It is an age in which the spectator, the act of viewing, and the viewing subject carry more weight than the director, the work, and the text itself. However, film viewing can be a private matter, since it depends on the director's narrative strategy, the spectator's interpretive disposition, and the interaction between text and spectator, and the only way to acquire universality is to adopt a thorough analytic discourse as a principle. To represent events through narrative, a film must be structured by a sign system composed of film language, and this sign system, the director's narrative strategy, can create a certain aesthetic distance from the spectator. Research that analyzes this distance needs to maintain enough universal validity to be accepted as an effort to narrow the gap between director and spectator. Therefore, this study follows the approach of narratology, the so-called poetics of narrative, which analyzes a film's narrative mode in units of the 'narrated' and the 'narrating' and studies its form and function. The consistent argument of this poetics of narrative is that a story consists of events, and that these events are structured into units called sequences through their arrangement with adjacent events. In addition, the director needs a consensus with the spectator in order to represent the connections between these events logically, and the form of this representation is shaped by the narrative strategy of the direction. Accordingly, this study attempts a narrative structure analysis of the film in terms of the narrative, which concerns the structure of events, and the 'narrating' (narrative acts), which concerns the way the story is represented to the spectator. In the course of the research, the concepts of 'narrative function' and 'narrative acts' favored by Roland Barthes and his followers are applied where appropriate.

A Comparative Study of Emotional Response to Korean Drama among Countries: With Drama 'Goblin' (한국 드라마 수용에 있어서 국가별 감정 반응 분석: 드라마 <도깨비>를 중심으로)

  • Lee, Yewon;Woo, Sungju
    • Science of Emotion and Sensibility / v.20 no.4 / pp.31-40 / 2017
  • This research aims to investigate the 'Hallyu' content consumption tendencies of consumers from Korea, Japan, and the United States by analyzing their emotional responses. With the development of social media, research on emotion analysis by reviewing text materials has grown. Whereas environmental variables affect consumer demand for 'Hallyu' content, few comparative analyses have been conducted on the emotional responses of consumers from different countries. In this research, the emotional prototype model proposed by Russell (1980) was used to extract and distinguish emotional words to clarify how people in the three countries perceive the Korean drama "Goblin" differently. First, SNS reviews were collected during a two-month period (February 12 to April 12). Second, significant factors were identified in the collected data according to Russell's emotion model. Third, random forest was applied to organize the selected variables in order of variable importance. Fourth, the correlations among the emotional words were compared. Lastly, the accuracy of the trained model was measured using the test dataset. The results show that "Happy" was the greatest factor in Korea and in the United States, and "Pleased" in Japan. Correlations among emotional words showed that when watching the drama "Goblin", "passive unpleasure" was the main factor associated with individuals' interest in Korea, whereas "passive pleasure" was associated with individuals' interest in Japan and in the United States. Based on the results, this research suggests the possibility of developing evaluation guidelines for the emotional responses of different countries toward 'Hallyu' content.

Analysis of Elementary School Students' Visual Representation Competence for Shadow Phenomenon (그림자 현상에 대한 초등학생의 시각적 표상 능력)

  • Yoon, Hye-Gyoung
    • Journal of The Korean Association For Science Education / v.39 no.2 / pp.295-305 / 2019
  • In a previous study, a visual representation competence taxonomy (VRC-T), which is composed of two dimensions, was developed for the purpose of promoting effective visual representation use and research in science education. In this study, elementary school students' visual representation competence for the shadow phenomenon was investigated using the VRC-T. In terms of visual representation competence, 'interpretation' received the highest score, followed by 'construction' and 'integration'. The results also showed that students' visual representation competence was not high even after learning shadow-related units in the regular curriculum. On the other hand, text-based scientific knowledge was not correlated with all categories of visual representation competence. This indicates that there is a need to emphasize visual representation more in science class. Finally, the hierarchical relationship among the cognitive processes of the VRC-T was explored according to ordering theory. When the tolerance level was somewhat loosened, a linear hierarchical relationship was found among the six cognitive processes. This suggests that the VRC-T is an analytical framework that can be useful when designing assessment tools, tasks, and science class activities to enhance visual representation competence.
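The role of the tolerance level in ordering theory can be sketched as follows. The sketch assumes the common formulation in which item A is treated as a prerequisite of item B when the share of students who pass B while failing A does not exceed the tolerance level. The function name, the three-item layout, and the response data are hypothetical. The study itself covers six cognitive processes, and its exact procedure is not given in the abstract.

```python
def prerequisite_pairs(responses, tolerance):
    """Return (a, b) pairs where item a is treated as a prerequisite of item b.

    responses: list of per-student lists of 0/1 scores, one entry per item.
    A pair is kept when the proportion of students who pass b but fail a
    (the disconfirming pattern) does not exceed the tolerance level.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    pairs = []
    for a in range(n_items):
        for b in range(n_items):
            if a == b:
                continue
            disconfirming = sum(1 for r in responses if r[a] == 0 and r[b] == 1)
            if disconfirming / n_students <= tolerance:
                pairs.append((a, b))
    return pairs


# Hypothetical scores for 10 students on 3 items (not data from the study).
responses = [[1, 1, 1]] * 4 + [[1, 1, 0]] * 3 + [[1, 0, 0]] * 2 + [[0, 1, 0]]

strict = prerequisite_pairs(responses, tolerance=0.05)
loose = prerequisite_pairs(responses, tolerance=0.15)
print(strict)
print(loose)
```

With the stricter tolerance, only items 0 and 1 are found to precede item 2. Loosening it also admits item 0 before item 1, which gives a linear chain 0, 1, 2.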

A Classification Model for Illegal Debt Collection Using Rule and Machine Learning Based Methods

  • Kim, Tae-Ho;Lim, Jong-In
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.93-103 / 2021
  • Despite the efforts of financial authorities to directly manage and supervise collection agents and to issue debt-collection guidelines, illegal and unfair debt collection still exists. To effectively prevent such illegal and unfair debt collection activities, we need a method for strengthening the monitoring of illegal collection activities even with little manpower, using technologies such as machine learning on unstructured data. In this study, we propose a classification model for illegal debt collection that combines machine learning, such as a Support Vector Machine (SVM), with a rule-based technique, using collection transcripts obtained from loan companies and converted into text data to identify illegal activities. The study also compares identification accuracy across machine learning algorithms. The study shows that combining rule-based detection rules with machine learning for classification yields higher accuracy than the classification model of the previous study, which applied only machine learning. This study is the first attempt to classify illegalities by combining rule-based illegal detection rules with machine learning. If further research is conducted to improve the model's completeness, it will contribute greatly to preventing consumer damage from illegal debt collection activities.
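One simple way to combine rule matches with a learned score is to let the rules decide first and fall back to the model otherwise. The sketch below illustrates that arrangement only. The abstract does not say how the paper combines the two, and the patterns, the `keyword_score` stand-in for a trained model such as an SVM, the threshold, and the English sample sentences are all hypothetical.

```python
import re

# Hypothetical rule patterns; the paper's actual detection rules are not given.
ILLEGAL_PATTERNS = [
    re.compile(r"\btell your (family|employer)\b"),
    re.compile(r"\bcome to your (home|workplace) at night\b"),
]


def keyword_score(text):
    """Stand-in for a trained model's score (for example an SVM decision value)."""
    risky = {"threat", "threaten", "harass", "shame"}
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in risky for w in words) / max(len(words), 1)


def classify(text, model_score=keyword_score, threshold=0.2):
    """Flag a transcript if any rule matches; otherwise defer to the model score."""
    lowered = text.lower()
    if any(p.search(lowered) for p in ILLEGAL_PATTERNS):
        return "illegal (rule)"
    if model_score(text) >= threshold:
        return "illegal (model)"
    return "legal"


print(classify("We will tell your employer about this debt"))
print(classify("I will threaten and shame you"))
print(classify("Please call us back about your balance"))
```

Passing the scorer in as `model_score` keeps the rule layer independent of the model, so different learning algorithms can be compared behind the same rules.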

Study on the EDA based Statistics Attributes Discovery and Utilization for the Maritime Safety Statistics Items Diversification (해상안전 통계 항목 다양화를 위한 EDA 기반 통계 속성 도출 및 활용에 관한 연구)

  • Kang, Seong Kyung;Lee, Young Jai
    • Journal of the Korean Society of Marine Environment & Safety / v.26 no.7 / pp.798-809 / 2020
  • Evidence-based policymaking and assessments for scientific administration have increased the importance of statistics (data) utilization. Statistics can explain specific phenomena by providing numerical values and are a public resource for national decision making. Due to these inherent attributes, statistics are utilized as baseline and base data for government policy determinations and the analysis of various phenomena. However, relative to their importance, the role of statistics is limited, and statistics are often used as simple abstracts, produced mainly from the supplier's perspective rather than the consumer's perspective of creating value. This study explores the statistical data and other attributes that can be utilized for policies or research to address the problems mentioned above. The baseline statistical data used in this study are from the Maritime Distress Accident Statistical Yearbook published by the South Korean Coast Guard, and the additional attributes are from text analyses of vessel casualty situation reports from the South Korean Maritime Police. Collecting 56 attributes drawn from the text analysis and executing an EDA resulted in 88 attribute unions: 18 attribute unions had a satisfactory significance probability (p-value < .05) and a strong correlation coefficient above 0.7, and 70 attribute unions had a moderate correlation (over 0.4 and under 0.7). Additionally, to utilize the extra attributes discovered from the EDA for policy purposes, a keyword analysis for each detailed strategy of the disaster preparation basic plan was executed, the usability of the attributes was assessed through keyword matching, and the EDA-derived attributes were examined.
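The correlation bands quoted in this abstract (above 0.7, and over 0.4 and under 0.7) can be illustrated with a minimal sketch that computes Pearson's r for each pair of attributes and assigns a band. The attribute names and values are hypothetical. Using the absolute value of r and the handling of the exact boundaries are assumptions, and the p-value check mentioned in the abstract is omitted here.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def band(r):
    """Assign a band from |r|. Boundary handling at 0.7 and 0.4 is an assumption."""
    strength = abs(r)
    if strength > 0.7:
        return "strong"
    if strength > 0.4:
        return "moderate"
    return "weak"


# Hypothetical attribute series, not data from the study.
attributes = {
    "wind_speed": [1, 2, 3, 4, 5],
    "accidents": [2, 4, 5, 4, 6],
    "crew_size": [3, 1, 4, 1, 5],
}

# One band per unordered pair of attributes.
bands = {
    (a, b): band(pearson(attributes[a], attributes[b]))
    for a, b in combinations(attributes, 2)
}
for pair, label in bands.items():
    print(pair, label)
```

A significance test would normally accompany each coefficient, typically through a statistics library rather than a hand-written routine.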