• Title/Summary/Keyword: Text Data Analysis

A Study on Sentiment Score of Healthcare Service Quality on the Hospital Rating (의료 서비스 리뷰의 감성 수준이 병원 평가에 미치는 영향 분석)

  • Jee-Eun Choi;Sodam Kim;Hee-Woong Kim
    • Information Systems Review / v.20 no.2 / pp.111-137 / 2018
  • Considering the increase in health insurance benefits and the aging baby boomer population, healthcare spending in 2020 is expected to account for 20% of US GDP. As the healthcare industry develops, competition among hospitals' medical services intensifies, and hospitals' need to manage the quality of those services increases. In addition, interest in online hospital reviews has grown as such reviews have become a tool for predicting hospital quality. Consumers tend to refer to online reviews even when choosing healthcare service providers and to evaluate service quality online afterward. This study analyzes the effect of the sentiment score of healthcare service quality on hospital rating using Yelp hospital reviews. It first classifies a large amount of text data collected online into the five service quality dimensions of SERVQUAL theory. Sentiment scores of the reviews are then derived for each SERVQUAL dimension, and an econometric analysis is conducted to determine the effects of the sentiment scores of the five service quality dimensions on hospital reviews. The results shed light on how to manage online hospital reputation, to the benefit of managers in the healthcare and medical industry.

Semantic Visualization of Dynamic Topic Modeling (다이내믹 토픽 모델링의 의미적 시각화 방법론)

  • Yeon, Jinwook;Boo, Hyunkyung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.131-154 / 2022
  • Research on unstructured data analysis has recently been active, driven by the development of information and communication technology. In particular, topic modeling is a representative technique for discovering core topics in massive text data. Early topic modeling studies focused only on topic discovery. As the field matured, studies began to examine how topics change over time. Accordingly, interest is growing in dynamic topic modeling, which handles changes in the keywords that constitute a topic. Dynamic topic modeling identifies major topics from the data of the initial period and tracks the change and flow of topics by using the topic information of the previous period to derive topics in subsequent periods. However, the results of dynamic topic modeling are very difficult to understand and interpret. Traditional dynamic topic modeling simply reveals changes in keywords and their rankings, and this information is insufficient to show how the meaning of a topic has changed. Therefore, in this study, we propose a method for visualizing topics by period that reflects the meaning of the keywords in each topic. In addition, we propose a method for intuitively interpreting changes in topics and the relationships among topics. The detailed method of visualizing topics by period is as follows. In the first step, dynamic topic modeling is performed to derive the top keywords of each period and their weights from the text data. In the second step, we obtain vectors for the top keywords of each topic from a pre-trained word embedding model and perform dimension reduction on the extracted vectors. We then form a semantic vector for each topic by calculating the weighted sum of its keyword vectors, using the topic weight of each keyword. In the third step, we visualize the semantic vector of each topic using matplotlib and analyze the relationships among topics based on the visualized result. Changes in a topic are interpreted as follows. From the result of dynamic topic modeling, we identify the top 5 rising keywords and the top 5 descending keywords for each period to show the change in the topic. Many existing topic visualization studies visualize the keywords of each topic, but our approach differs in that it attempts to visualize each topic itself. To evaluate the practical applicability of the proposed methodology, we performed an experiment on 1,847 abstracts of artificial intelligence-related papers, divided into three periods (2016-2017, 2018-2019, 2020-2021). We selected seven topics based on the consistency score and used a pre-trained Word2vec word embedding model trained on Wikipedia, an Internet encyclopedia. Based on the proposed methodology, we generated a semantic vector for each topic and, by reflecting the meaning of keywords, visualized and interpreted the topics by period. Through these experiments, we confirmed that the rise and fall of a keyword's topic weight is useful for interpreting the semantic change of the corresponding topic and for grasping the relationships among topics. In this study, to overcome the limitations of dynamic topic modeling results, we used word embedding and dimension reduction techniques to visualize topics by period. The results are meaningful in that they broaden the scope of topic understanding through the visualization of dynamic topic modeling results. In addition, the study makes an academic contribution in that it lays the foundation for follow-up studies that use various word embeddings and dimensionality reduction techniques to improve the performance of the proposed methodology.
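The second step of the abstract above (a topic's semantic vector as a weighted sum of its keyword vectors) can be illustrated with a minimal sketch. The keyword vectors and topic weights below are hypothetical 2-D values chosen for illustration, not data from the paper, and the function name `topic_semantic_vector` is an assumption; in the study the vectors would come from a pre-trained Word2vec model after dimension reduction.

```python
def topic_semantic_vector(keyword_vectors, keyword_weights):
    """Return the weighted sum of keyword vectors for one topic.

    keyword_vectors: dict mapping keyword -> list of floats (already reduced).
    keyword_weights: dict mapping keyword -> topic weight of that keyword.
    """
    dims = len(next(iter(keyword_vectors.values())))
    total = [0.0] * dims
    for word, weight in keyword_weights.items():
        vec = keyword_vectors[word]
        for i in range(dims):
            total[i] += weight * vec[i]
    return total


# Hypothetical reduced vectors and topic weights (illustration only).
vectors = {"learning": [1.0, 0.0], "network": [0.0, 1.0], "image": [1.0, 1.0]}
weights = {"learning": 0.5, "network": 0.25, "image": 0.25}

topic_vec = topic_semantic_vector(vectors, weights)
print(topic_vec)  # one 2-D point per topic, ready to plot
```

Each topic then becomes a single point that can be drawn in a scatter plot, which is what allows topics themselves, rather than their individual keywords, to be compared across periods.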

6G Technology Competitiveness and Network Analysis: Focusing on GaN Integrated Circuit Patent Data (6G의 기술경쟁력 및 네트워크 분석: GaN 집적회로 특허 데이터 중심)

  • Woo-Seok Choi;Jin-Yong Kim;Jung-Hwan Lee;Sang-Hyun Choi
    • Journal of Industrial Convergence / v.21 no.3 / pp.1-15 / 2023
  • Wireless communication technology has moved beyond being used only for communication services, and expectations for it are rising as a base technology that promotes innovation across industries in line with the 21st-century paradigm of digital transformation. To compare the 6G technological competitiveness of Korea with that of leading countries, this study assessed technological competitiveness through PFS, CPP, and network analysis based on GaN integrated circuit patent data. Korea's 6G technological competitiveness was 0.62 in PFS and 3.93 in CPP, which were 32.8% and 19.9%, respectively, compared to leading countries. In addition, network analysis showed that the collaboration rate in the 6G field was 7.2% and that the collaboration ecosystem was very insufficient in most countries. In contrast, Korea, unlike leading countries, was found to have established a small-scale collaboration ecosystem linking industry and academia. Thus, a national-level strategy for 6G communication technology is needed so that the technology can advance on the basis of this relatively well-established collaborative ecosystem.

Perception on the Education Practicum of Pre-service School Librarian Teachers: Focusing on the Analysis of In-depth Interview Data (예비 사서교사의 교육실습에 대한 인식 조사 - 심층 면담자료 분석을 중심으로 -)

  • Jeonghoon Lim;Bong-Suk Kang;Juhyeon Park;Sang Woo Han
    • Journal of the Korean Society for Library and Information Science / v.57 no.4 / pp.75-95 / 2023
  • This study investigated the overall perceptions of pre-service school librarian teachers regarding the current education practicum through semi-structured in-depth interviews and suggested improvements to the educational practicum system. For this purpose, interview data were collected from 28 pre-service school librarian teachers (6 from teachers' colleges, 14 taking teaching qualification courses, and 8 from graduate schools of education) who participated in educational practicum in school libraries, and a research method combining qualitative analysis techniques with text network analysis was applied. The results showed that pre-service school librarian teachers believe that the educational practicum can prepare them for various field experiences and cultivate their ability to cope with situations they will encounter in the future. Through qualitative inquiry, we were able to identify their perceptions of school field practicum as a whole, their perceptions of the school field practicum, and their perceptions of educational service activities. Based on this, to address the current problems of educational practice, we suggested extending the period of the school internship program, distributing the time, establishing a full-time practice system, holding continuous discussions with field teachers, and developing a systematic school field practicum.

Analysis of Generative AI Technology Trends Based on Patent Data (특허 데이터 기반 생성형 AI 기술 동향 분석)

  • Seongmu Ryu;Taewon Song;Minjeong Lee;Yoonju Choi;Soonuk Seol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.1 / pp.1-9 / 2024
  • This paper analyzes the trends in generative AI technology based on patent application documents. To achieve this, we selected 5,433 generative AI-related patents filed in South Korea, the United States, and Europe from 2003 to 2023, and analyzed the data by country, technology category, year, and applicant, presenting it visually to find insights and understand the flow of technology. The analysis shows that patents in the image category account for 36.9%, the largest share, with a continuous increase in filings, while filings in the text/document and music/speech categories have either decreased or remained stable since 2019. Although the company with the highest number of filings is a South Korean company, four out of the top five filers are U.S. companies, and all companies have filed the majority of their patents in the U.S., indicating that generative AI is growing and competing centered around the U.S. market. The findings of this paper are expected to be useful for future research and development in generative AI, as well as for formulating strategies for acquiring intellectual property.

What Concerns Does ChatGPT Raise for Us?: An Analysis Centered on CTM (Correlated Topic Modeling) of YouTube Video News Comments (ChatGPT는 우리에게 어떤 우려를 초래하는가?: 유튜브 영상 뉴스 댓글의 CTM(Correlated Topic Modeling) 분석을 중심으로)

  • Song, Minho;Lee, Soobum
    • Informatization Policy / v.31 no.1 / pp.3-31 / 2024
  • This study aimed to examine public concerns in South Korea, considering the country's unique context, triggered by the advent of generative artificial intelligence such as ChatGPT. To achieve this, comments on 102 YouTube news videos related to ethical issues were collected using a Python scraper, and morphological analysis and preprocessing were carried out on 15,735 comments using Textom. These comments were then analyzed using a Correlated Topic Model (CTM). The analysis identified six primary topics within the comments: "Legal and Ethical Considerations"; "Intellectual Property and Technology"; "Technological Advancement and the Future of Humanity"; "Potential of AI in Information Processing"; "Emotional Intelligence and Ethical Regulations in AI"; and "Human Imitation." Structuring these topics based on a correlation coefficient value of over 10% revealed three main categories: "Legal and Ethical Considerations"; "Issues Related to Data Generation by ChatGPT (Intellectual Property and Technology, Potential of AI in Information Processing, and Human Imitation)"; and "Fear for the Future of Humanity (Technological Advancement and the Future of Humanity, Emotional Intelligence, and Ethical Regulations in AI)." The study confirmed the coexistence of various concerns along with the growing interest in generative AI like ChatGPT, including worries specific to the historical and social context of South Korea. These findings suggest the need for national-level efforts to ensure data fairness.
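The abstract above does not say how topics were structured into categories beyond the 10% correlation cutoff. One plausible reading, sketched here as an assumption, is to link any two topics whose correlation exceeds 0.10 and treat each connected group as a category. The topic labels `T1` to `T6` and the correlation values are hypothetical and are not taken from the study.

```python
def group_by_correlation(topics, correlations, threshold=0.10):
    """Group topics into connected components of the 'correlation > threshold' graph.

    correlations: dict mapping (topic_a, topic_b) -> correlation coefficient.
    """
    parent = {t: t for t in topics}

    def find(t):
        # Follow parent links to the group representative (with path halving).
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for (a, b), r in correlations.items():
        if r > threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for t in topics:
        groups.setdefault(find(t), []).append(t)
    return sorted(groups.values())


# Hypothetical topic labels and pairwise correlations (illustration only).
topics = ["T1", "T2", "T3", "T4", "T5", "T6"]
correlations = {
    ("T2", "T4"): 0.15,
    ("T4", "T6"): 0.12,
    ("T3", "T5"): 0.20,
    ("T1", "T2"): 0.05,  # below the cutoff, so T1 stays on its own
}

categories = group_by_correlation(topics, correlations)
print(categories)
```

With these made-up values, six topics collapse into three groups, which mirrors the shape of the result reported in the abstract without reproducing its actual coefficients.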

A Study on the Product Planning Model based on Word2Vec using On-offline Comment Analysis: Focused on the Noiseless Vertical Mouse User (온·오프라인 댓글 분석이 활용된 Word2Vec 기반 상품기획 모델연구: 버티컬 무소음마우스 사용자를 중심으로)

  • Ahn, Yeong-Hwi
    • Journal of Digital Convergence / v.19 no.10 / pp.221-227 / 2021
  • In this paper, we used Word2Vec to conduct a word-to-word similarity analysis of standardized datasets collected through web crawling for 10,000 noiseless vertical mice, had 92 computer engineering students use the presented products for 5 days, and conducted a self-report questionnaire analysis. The questionnaire collected words in narrative form and asked respondents to select from the top 50 words extracted by the word frequency analysis and the word similarity analysis. The similarity analysis of e-commerce users' product reviews identified pain (.985) and design (.963) as the advantages among click keywords, and vertical (.985) and adaptation (.948) as the disadvantages. In the descriptive frequency analysis, the most frequently selected items were Vertical (123) and Pain (118). In the selection of merit/demerit similar words, Vertical (83) and Pain (75) were selected as advantages, and adaptation (89) and buttons (72) as disadvantages. Therefore, the method applied in this study is expected to provide important data for decision making by decision makers and product planners at small and medium-sized enterprises when it is incorporated into new product development processes and review strategies for existing products.
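Similarity scores such as .985 between Word2Vec word vectors are conventionally cosine similarities, although the abstract does not name the measure. The sketch below shows that computation on hypothetical 2-D vectors; real Word2Vec vectors typically have hundreds of dimensions, and the values here are not from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical word vectors (illustration only).
vec_click = [3.0, 4.0]
vec_pain = [4.0, 3.0]

similarity = cosine_similarity(vec_click, vec_pain)
print(round(similarity, 3))
```

A value near 1 means the two words appear in very similar contexts in the review corpus, which is how a keyword can rank highly as both an advantage and a disadvantage term.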

Trend Forecasting and Analysis of Quantum Computer Technology (양자 컴퓨터 기술 트렌드 예측과 분석)

  • Cha, Eunju;Chang, Byeong-Yun
    • Journal of the Korea Society for Simulation / v.31 no.3 / pp.35-44 / 2022
  • In this study, we analyze and forecast quantum computer technology trends. Previous research on quantum computer technology trends has focused mainly on technology-centered application fields. Therefore, for a more market-driven technical analysis and prediction, this paper analyzes important quantum computer technologies and performs future signal detection and prediction by analyzing words used in news articles to identify rapid market changes and public interest. This paper extends the conference presentation of Cha & Chang (2022). The research is conducted by collecting domestic news articles from 2019 to 2021. First, we organize the main keywords through text mining. Next, we explore future quantum computer technologies through analysis of Term Frequency-Inverse Document Frequency (TF-IDF), the Key Issue Map (KIM), and the Key Emergence Map (KEM). Finally, the relationship between future technologies and supply and demand is identified through random forests, decision trees, and correlation analysis. The results show that interest in artificial intelligence was the highest in the frequency, keyword diffusion, and visibility analyses. For cyber-security, the rate of mention in news articles is becoming overwhelmingly higher than that of other technologies. Quantum communication, resistant cryptography, and augmented reality also showed a high rate of increase in interest. These results show that expectations are high for applying trend technologies in the market. The results of this study can be applied to identifying areas of interest in the quantum computer market and establishing a response system related to technology investment.
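For readers unfamiliar with the TF-IDF weighting mentioned above, the following is a minimal sketch of one common variant (term frequency normalized by document length, multiplied by the natural log of inverse document frequency). The abstract does not state which variant the authors used, and the tokenized documents below are hypothetical stand-ins for news articles.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    documents: list of token lists. Score = (count / doc length) * ln(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document

    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized articles (illustration only).
articles = [
    ["quantum", "computer", "security"],
    ["quantum", "communication"],
    ["quantum", "security", "security"],
]

scores = tf_idf(articles)
print(scores[2])
```

A term that appears in every article ("quantum" here) scores zero, so the weighting surfaces terms that distinguish individual articles rather than terms that define the whole corpus.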

A Study on Commodity Asset Investment Model Based on Machine Learning Technique (기계학습을 활용한 상품자산 투자모델에 관한 연구)

  • Song, Jin Ho;Choi, Heung Sik;Kim, Sun Woong
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.127-146 / 2017
  • Services using artificial intelligence have begun to emerge in daily life. Artificial intelligence is applied to consumer electronics and communications products such as artificial intelligence refrigerators and speakers. In the financial sector, Goldman Sachs improved its stock trading process using Kensho's artificial intelligence technology. For example, two stock traders could handle the work of 600 stock traders, and analytical work that took 15 people 4 weeks could be processed in 5 minutes. In particular, big data analysis through machine learning, one field of artificial intelligence, is actively applied throughout the financial industry. Stock market analysis and investment modeling based on machine learning theory are also actively studied. The linearity limitations of financial time series studies are overcome by using machine learning approaches such as artificial intelligence prediction models. Quantitative studies based on past stock market-related numerical data widely use artificial intelligence to forecast future movements of stock prices or indices. Various other studies predict the future direction of the market or of company stock prices by learning from large amounts of text data, such as news and comments related to the stock market. Investment in commodity assets, one class of alternative assets, is usually used to enhance the stability and safety of a traditional stock and bond portfolio. There is relatively little research on investment models for commodity assets compared with mainstream assets such as equities and bonds. Recently, machine learning techniques have been widely applied in the financial world, especially to stock and bond investment models, producing better trading models and changing the financial field as a whole. In this study, we built an investment model using the Support Vector Machine (SVM), one of the machine learning models. Some studies on commodity assets focus on price prediction for a specific commodity, but it is hard to find research on investment models that treat commodities as part of asset allocation using machine learning. We propose a method of forecasting four major commodity indices, a portfolio of commodity futures, and individual commodity futures using the SVM model. The four major commodity indices are the Goldman Sachs Commodity Index (GSCI), the Dow Jones UBS Commodity Index (DJUI), the Thomson Reuters/Core Commodity CRB Index (TRCI), and the Rogers International Commodity Index (RI). We selected two individual futures from each of three sectors (energy, agriculture, and metals) that are actively traded on the CME market and have sufficient liquidity: Crude Oil, Natural Gas, Corn, Wheat, Gold, and Silver futures. We built an equally weighted portfolio of the six commodity futures for comparison with the commodity indices. Because commodity assets are closely related to macroeconomic activity, we used 19 macroeconomic indicators as model inputs, including stock market indices, export and import trade data, labor market data, and composite leading indicators. They are 14 US economic indicators, two Chinese economic indicators and two Korean economic indicators. The data period is from January 1990 to May 2017. We used the first 195 monthly observations as training data and the remaining 125 monthly observations as test data. In this study, we verified that the equally weighted commodity futures portfolio rebalanced by the SVM model performs better than the commodity indices. The prediction accuracy of the model for the commodity indices does not exceed 50% regardless of the SVM kernel function. On the other hand, the prediction accuracy for the equally weighted commodity futures portfolio is 53%. The prediction accuracy of the individual commodity futures models is better than that of the commodity index models, especially in the agriculture and metal sectors. The individual commodity futures portfolio excluding the energy sector outperformed the individual commodity futures portfolio covering all three sectors. For the model to be valid, the analysis results should remain similar when the data period varies. We therefore also used odd-numbered-year data as training data and even-numbered-year data as test data and confirmed that the results are similar. As a result, when allocating commodity assets to a traditional portfolio composed of stocks, bonds, and cash, more effective investment performance can be obtained by investing in commodity futures rather than in commodity indices. In particular, better performance can be obtained with a rebalanced commodity futures portfolio designed by the SVM model.
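The robustness check described above (training on odd-numbered years, testing on even-numbered years) can be sketched as a simple partition of monthly records. The function name `split_by_year_parity` and the `(year, month)` record layout are assumptions for illustration; the calendar months generated here do not reproduce the 320 monthly observations reported in the abstract.

```python
def split_by_year_parity(records):
    """Split (year, month, ...) records into odd-year training and even-year test sets."""
    train = [r for r in records if r[0] % 2 == 1]
    test = [r for r in records if r[0] % 2 == 0]
    return train, test


# Hypothetical monthly index covering January 1990 to May 2017 (illustration only).
months = [
    (year, month)
    for year in range(1990, 2018)
    for month in range(1, 13)
    if (year, month) <= (2017, 5)
]

train, test = split_by_year_parity(months)
print(len(months), len(train), len(test))
```

Unlike the chronological split, this partition interleaves training and test periods, so similar results under both schemes suggest the findings are not driven by one particular market regime.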

Analysis of Literatures Related to Crop Growth and Yield of Onion and Garlic Using Text-mining Approaches for Develop Productivity Prediction Models (양파·마늘 생산성 예측 모델 개발을 위한 텍스트마이닝 기법 활용 생육 및 수량 관련 문헌 분석)

  • Kim, Jin-Hee;Kim, Dae-Jun;Seo, Bo-Hun;Kim, Kwang Soo
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.4 / pp.374-390 / 2021
  • Growth and yield of field vegetable crops are affected by climate conditions, which cause relatively large fluctuations in crop production and consumer prices over the years. A yield prediction system for these crops would support decision-making on policies to manage supply and demand. The objectives of this study were to compile literature related to onion and garlic and to perform a data-mining analysis that would shed light on the development of crop models for these major field vegetable crops in Korea. The literature on crop growth and yield was collected from the databases operated by Research Information Sharing Service, National Science & Technology Information Service and SCOPUS. The keywords were chosen to retrieve research outcomes related to crop growth and yield of onion and garlic. The collected literature was analyzed using text mining approaches including word clouds and semantic networks. The number of publications was found to be considerably smaller for the field vegetable crops than for rice. Still, specific patterns among previous research outcomes were identified using the text mining methods. For example, climate change and remote sensing were major topics of interest for growth and yield of onion and garlic. The impact of temperature and irrigation on crop growth was also assessed in previous studies. It was also found that the yield of onion and garlic would be affected by both environmental and crop management conditions, including sowing time, variety, seed treatment method, irrigation interval, fertilization amount and fertilizer composition. Among meteorological conditions, temperature, precipitation, solar radiation and humidity were found to be the major factors in the literature. These findings indicate that crop models need to take into account both environmental conditions and crop management practices for reliable prediction of crop yield.