• Title/Summary/Keyword: 텍스트 수집 (text collection)


Study on Participants' Perceptions of Sharing Economy Policies: A Text Mining Approach to Online Community Posts (공유경제 참여자의 비즈니스 등록정책에 대한 인식과 심적기재: 온라인 발화에 대한 텍스트마이닝)

  • Park, Soo Kyung
    • Journal of Digital Convergence / v.20 no.2 / pp.47-56 / 2022
  • With the advent of online platforms, individuals have been able to trade small resources, such as a room, in the market. However, because these economic activities are not clearly regulated, various side effects have emerged. In response, the government revised the related policies to address these unintended consequences. The policy has not yet been implemented, however, and many participants do not comply with it. This study therefore examines participants' perceptions in detail by applying a text mining technique. Posts and comments were collected from major online communities, and topic modeling yielded 5 topics. Because compliance with the government's policy is a voluntary decision, an in-depth understanding of the policy's targets is necessary. The findings are expected to inform future discussion of ways to encourage participants to comply with the policy.

A Study on Data Cleansing Techniques for Word Cloud Analysis of Text Data (텍스트 데이터 워드클라우드 분석을 위한 데이터 정제기법에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.745-750 / 2021
  • In big data visualization analysis of unstructured text, the raw data is usually large, and analysis techniques cannot be applied until it has been cleansed. Therefore, unnecessary data is first removed from the collected raw data through a heuristic cleansing step, and stopwords are then removed through a machine cleansing step. Next, word frequencies are calculated and visualized as a word cloud, from which key issues are extracted, turned into information, and analyzed. In this study, we propose a new stopword cleansing technique that uses an external stopword set (DB) with a Python word cloud, and we identify the problems and effectiveness of the technique through a practical case analysis. Based on this verification, we present the practical utility of word cloud analysis with the proposed cleansing technique.
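The two-stage cleansing described in this abstract can be illustrated roughly as follows. This is only a minimal sketch, not the authors' implementation: the stopword set, the regular expressions, the English sample sentence, and the function names `heuristic_clean` and `word_frequencies` are all assumptions made for illustration (the paper works with its own external stopword DB and data).

```python
import re
from collections import Counter

# Hypothetical external stopword set; in practice it would be loaded from a file or DB.
EXTERNAL_STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}

def heuristic_clean(text):
    """First pass: drop URLs and non-letter characters, then lowercase."""
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # keep only letters and whitespace
    return text.lower()

def word_frequencies(text, stopwords=EXTERNAL_STOPWORDS):
    """Second pass: remove stopwords, then count the remaining tokens."""
    tokens = heuristic_clean(text).split()
    return Counter(t for t in tokens if t not in stopwords)

sample = "The cleansing of raw data is the key step: see https://example.com and the data!"
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

A frequency table of this kind is the usual input to a word cloud renderer; for Korean text the tokenization step would need a morphological analyzer rather than whitespace splitting.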

Research on Tourist Perception of Grand Canal Cultural Heritage Based on Network Text Analysis : The Pingjiang Historical and Cultural District of Suzhou City as an example (네트워크 텍스트 분석을 통한 대운하 문화유산에 대한 관광객 인식 연구 : 쑤저우시 핑장역사문화지구의 예)

  • Chengkang Zheng;Qiwei Jing;Nam Kyung Hyeon
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.215-231 / 2023
  • Taking the Pingjiang Historical and Cultural District in Suzhou as an example, this paper collects 1,436 tourist comments from Ctrip.com using Python and applies network text analysis to high-frequency words, the semantic network, and sentiment in order to evaluate the characteristics and levels of tourist perception of the Grand Canal cultural heritage. The study found that natural and humanistic landscapes, historical and cultural deposits, and the style of the Jiangnan Canal are fully reflected in visitors' perceptions of the Pingjiang Historical and Cultural District. Tourists hold strongly positive emotions toward the district; however, there is still room for its transformation and upgrading. Finally, suggestions for improving tourists' perception of the Grand Canal cultural heritage are given in terms of conservation first, cultural integration, and innovative utilization.

Identifying Consumer Response Factors in Live Commerce : Based on Consumer-Generated Text Data (라이브 커머스에서의 소비자 반응 요인 도출 : 소비자 생성 텍스트 데이터를 기반으로)

  • Park, Jae-Hyeong;Lee, Han-Sol;Kang, Ju-Young
    • Informatization Policy / v.30 no.2 / pp.68-85 / 2023
  • In this study, we collected data from live commerce streams. The streaming data were then categorized by the degree of chat activity, and the distribution of consumer-generated text responses was analyzed. From a total of 2,282 streams on NAVER Shopping Live, which has the largest share of the domestic live commerce market, we selected the 200 streams with the most active viewer responses and finally chose those that showed a steep increase or decrease in viewer responses. We synthesized variables from the existing literature on live commerce viewing intentions and participation motivations into a table of variables for the study, and then matched them with events in the broadcasts. Through this study, we identified which components of a broadcast stimulate the consumer response variables found in previous studies and empirically identified consumers' motivations for participating in live commerce.

A Sentiment Analysis of Customer Reviews on the Connected Car using Text Mining: Focusing on the Comparison of UX Factors between Domestic-Overseas Brands (텍스트 마이닝을 활용한 커넥티드 카 고객 리뷰의 감성 분석: 국내-해외 브랜드간 UX 요인 비교를 중심으로)

  • Youjung Shin;Junho Choi;Sung Woo Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.4 / pp.517-528 / 2023
  • The purpose of this study is to analyze and compare the UX factors of the connectivity systems of domestic and overseas car brands. Using text mining, the UX factors of domestic and overseas brands were compared through a positive-negative sentiment index. After 120,000 reviews of Hyundai Motor Group (Hyundai, Kia, Genesis) and 190,000 reviews of Tesla, BMW, and Mercedes were collected, pre-processing was performed. Keywords were classified into 11 UX factors across 3 dimensions: system connection, information, and service. For domestic brands, the sentiment index for 'safety' was the highest. For overseas brands, 'entertainment' was the most positive UX factor.

Analyzing data-related policy programs in Korea using text mining and network cluster analysis (텍스트 마이닝과 네트워크 군집 분석을 활용한 한국의 데이터 관련 정책사업 분석)

  • Sungjun Choi;Kiyoon Shin;Yoonhwan Oh
    • Journal of Korea Society of Industrial Information Systems / v.28 no.6 / pp.63-81 / 2023
  • This study classifies and categorizes similar policy programs through network clustering analysis, using textual information from data-related policy programs in Korea. To this end, descriptions of data-related budgetary programs in South Korea in 2022 were collected, and keywords were extracted from the program contents. The similarity between programs was then derived using TF-IDF, and a policy program network was constructed accordingly. Next, the structural characteristics of the network were analyzed, and similar policy programs were clustered and categorized through network clustering. From a total of 97 programs, 7 major clusters were identified, indicating that programs with analogous themes or objectives were grouped by application area or by the services that use the data. The findings illuminate the current status of data-related policy programs in Korea, provide policy implications for a strategic approach to planning future national data strategies and programs, and contribute to the establishment of evidence-based policies.
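The pipeline in this abstract (keywords, TF-IDF similarity, network, clusters) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' method: the abstract does not name a similarity measure, a threshold, or a clustering algorithm, so cosine similarity, a cutoff of 0.2, and connected components (as a simple stand-in for network clustering) are all assumptions here, as are the toy keyword lists.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_edges(vectors, threshold):
    """Link every pair of programs whose similarity reaches the threshold."""
    return [(i, j)
            for i in range(len(vectors))
            for j in range(i + 1, len(vectors))
            if cosine(vectors[i], vectors[j]) >= threshold]

def connected_components(n, edges):
    """Group nodes with union-find; a stand-in for a real network clustering method."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy keyword lists standing in for program descriptions (hypothetical data).
programs = [
    ["health", "data", "platform"],
    ["health", "data", "hospital"],
    ["traffic", "sensor", "city"],
    ["traffic", "sensor", "road"],
]
vectors = tfidf_vectors(programs)
edges = similarity_edges(vectors, threshold=0.2)
clusters = connected_components(len(programs), edges)
print(clusters)
```

With real program descriptions, a modularity-based community detection method would likely separate clusters better than connected components, which merge any groups joined by a single edge.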

A Study on Unstructured text data Post-processing Methodology using Stopword Thesaurus (불용어 시소러스를 이용한 비정형 텍스트 데이터 후처리 방법론에 관한 연구)

  • Won-Jo Lee
    • The Journal of the Convergence on Culture Technology / v.9 no.6 / pp.935-940 / 2023
  • Most text data collected through web scraping for artificial intelligence and big data analysis is large and unstructured, so a refining process is required before analysis. The data becomes structured, analyzable data through a heuristic pre-processing refining step and a machine post-processing refining step. In this study, the machine post-processing step uses a Korean dictionary and a stopword dictionary to extract the vocabulary for the frequency analysis behind word cloud analysis. In this step, "user-defined stopwords" are used to efficiently remove stopwords that were not otherwise removed. We propose a methodology for applying a "user-defined stopword thesaurus", which is intended to complement the problems of the existing "stopword dictionary" method, and examine its pros and cons through a case analysis using R's word cloud technique. We present a comparative verification and suggest the effectiveness of applying the proposed methodology in practice.

A Study on the Purchasing Factors of Color Cosmetics Using Big Data: Focusing on Topic Modeling and Concor Analysis (빅데이터를 활용한 색조화장품의 구매 요인에 관한 연구: 토픽모델링과 Concor 분석을 중심으로)

  • Eun-Hee Lee;Seung-Hee Bae
    • Journal of the Korean Applied Science and Technology / v.40 no.4 / pp.724-732 / 2023
  • In this study, we collected data on consumers' online information interests in the color cosmetics market after COVID-19 and used text mining to analyze the characteristics of color cosmetics information search and the main information of interest. In the empirical analysis, text mining was performed on all documents, such as news, blogs, cafes, and web pages, that included the word "color cosmetics". The analysis showed that online information searches for color cosmetics after COVID-19 focused mainly on purchase information, information on skin and mask-related makeup methods, and major topics such as brands of interest and event information. Consequently, post-COVID-19 color cosmetics buyers are expected to become more sensitive to purchase information such as product value, safety, price benefits, and store information through active online information search, so a response strategy is required.

A Sustainable Development Issues and Trends in Myanmar: A Text Network Analysis (미얀마의 지속가능발전에 대한 이슈 및 트렌드 분석: 텍스트 네트워크 분석)

  • Phyo Su Thwe;EuiBeom Jeong;DonHee Lee
    • Journal of Korea Society of Industrial Information Systems / v.29 no.4 / pp.105-122 / 2024
  • Myanmar succeeded in raising its sustainable development index during the three-year period from 2018 to 2020. However, the index has been declining since 2021. This study aims to analyze both the success factors and the obstacles for sustainable development in Myanmar. Using the search terms 'Myanmar' and 'sustainability', online news items published from January 2018 to December 2023 were collected and examined through text network analysis. The study identified the following success factors that contribute to sustainable development in Myanmar: foreign investments, private companies' participation in the effort, human resource development projects, and the use of new and renewable energy. The inhibiting factors identified were the government's coercive/restrictive policies, labor rights violations, and forest degradation. The findings of this study provide useful insights for understanding the current status of sustainability in Myanmar from academic and practical perspectives. The results also present benchmarking information for policy-makers in Myanmar and other similar developing countries that are searching for strategic directions in their sustainable development efforts.

Tourism Information Contents and Text Networking (Focused on Formal Website of Jeju and Chinese Personal Blogs) (온라인 관광정보의 내용 및 텍스트 네트워크 (제주 공식 웹사이트와 중국 개인블로그를 중심으로))

  • Zhang, Lin;Yun, Hee Jeong
    • The Journal of the Korea Contents Association / v.18 no.1 / pp.19-30 / 2018
  • The main purpose of this study is to analyze the contents and text network of online tourism information. For this purpose, Jeju Island, one of the representative tourist destinations in South Korea, was selected as the study site. The study collects the contents of both the official Jeju tourism website and personal blogs on Sina Weibo, one of the most popular social network services in China, and analyzes this online text using ROST Content Mining System, a Chinese big data mining system. The content analysis shows that the official Jeju website mainly includes nouns related to natural, geographical, and physical resources, verbs related to the existence of resources, and adjectives related to the beauty, cleanness, and convenience of resources. Meanwhile, personal blogs mainly include nouns related to the Korean Wave, food, local products, other destinations, and shopping, verbs related to activities and feelings in Jeju, and adjectives related to bloggers' experiences and feelings. Finally, the text network results show strong centrality and network structure in the online tourism information on the official website but weak relationships in personal blogs. The results of this study may contribute to the development of demand-based marketing strategies for tourist destinations.