• Title/Summary/Keyword: wordcloud

100 Article Paper Text Minning Data Analysis and Visualization in Web Environment (웹 환경에서 100 논문에 대한 텍스트 마이닝, 데이터 분석과 시각화)

  • Li, Xiaomeng;Li, Jiapei;Lee, HyunChang;Shin, SeongYoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.157-158 / 2017
  • Python can be used to analyze and text-mine big data drawn from articles. Python is a programming language that is easy to use. We study and use Python to build a web environment in which the analysis results are shown directly in the browser. This paper uses 100 articles from Altmetric, which tracks a range of sources. Such big data must be collected and analyzed with an effective method. Once the results are available, Python wordcloud is used to produce an image that directly shows the most frequent words.
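
The word-frequency step behind such a word cloud can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation; the function name `word_frequencies` and the sample text are hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=3):
    """Return the top_n most frequent lowercase words in text (hypothetical helper)."""
    # Tokenize on runs of ASCII letters; a real pipeline would likely need more care.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(top_n)


# Hypothetical sample text, not data from the paper.
sample = "text mining text data text mining"
top = word_frequencies(sample)
print(top)
```

A dictionary built from these pairs, `dict(top)`, is the kind of input that the `wordcloud` package's `WordCloud.generate_from_frequencies` accepts, assuming that package is installed.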

Intelligent Wordcloud Using Text Mining (텍스트 마이닝을 이용한 지능적 워드클라우드)

  • Kim, Yeongchang;Ji, Sangsu;Park, Dongseo;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.325-326 / 2019
  • This paper proposes an intelligent word cloud that improves on the existing method of building word clouds from noun frequencies obtained through text mining. We propose a method to visually present word clouds focused on other parts of speech, such as verbs, by adding newly coined words and the like to the dictionary used to extract nouns in text mining. In the experiment, the KoNLP package was used to extract the frequency of existing nouns, and 80 new words that it did not support were added manually after examining their frequency.

A Study on the Use of Stopword Corpus for Cleansing Unstructured Text Data (비정형 텍스트 데이터 정제를 위한 불용어 코퍼스의 활용에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.891-897 / 2022
  • In big data analysis, raw text data mostly exists in various unstructured forms, so it becomes structured, analyzable data only after heuristic pre-processing and computer post-processing cleansing. In this study, to apply the wordcloud technique of R, one of the text data analysis techniques, unnecessary elements are removed from the collected raw data during pre-processing, and stopwords are removed during post-processing. A case study of wordcloud analysis was then conducted, which calculates word frequencies and presents high-frequency words as key issues. To address the problems of the "nested stopword source code" method, the existing stopword processing method in R's word cloud technique, we propose the use of a "general stopword corpus" and a "user-defined stopword corpus" and conduct a case analysis. The advantages and disadvantages of the proposed "unstructured data cleansing process model" are comparatively verified and presented, along with a practical application of word cloud visualization analysis using the proposed "external corpus cleansing technique".
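
The idea of cleansing tokens against a general stopword corpus plus a user-defined one can be sketched as below. The study itself uses R; Python is substituted here purely for illustration, and the stopword sets, the helper `clean_tokens`, and the sample sentence are all hypothetical.

```python
from collections import Counter

# Hypothetical corpora: in practice each would be loaded from an external file.
GENERAL_STOPWORDS = {"the", "of", "and", "a", "in", "is"}
USER_STOPWORDS = {"study", "paper"}


def clean_tokens(tokens, *stopword_sets):
    """Drop every token that appears in any of the given stopword sets."""
    stop = set().union(*stopword_sets)
    return [t for t in tokens if t.lower() not in stop]


# Hypothetical input text, not data from the study.
tokens = "the study of the word cloud and the stopword corpus in a paper".split()
cleaned = clean_tokens(tokens, GENERAL_STOPWORDS, USER_STOPWORDS)
freq = Counter(cleaned)
print(cleaned)
```

Keeping the stopword lists outside the analysis code, rather than nesting them in the source, means they can be extended without touching the processing logic.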

A Study on Improving the Satisfaction of Non-face-to-face Video Lectures Using IPA Analysis (IPA 분석법을 활용한 비대면 동영상 강의 만족도 제고 방안 연구)

  • Jung, Dae-Hyun;Kim, Jin-Sung
    • The Journal of Information Systems / v.29 no.4 / pp.45-56 / 2020
  • Purpose: The purpose of this study is to suggest directions for efficient e-learning education through a survey of the importance and satisfaction perceived by learners of non-face-to-face video lectures. By identifying satisfaction relative to importance through the IPA analysis method, we try to present improvement measures for insufficient education methods. Design/methodology/approach: For the IPA analysis, we conducted an online survey at four universities and analyzed 154 samples. SPSS was used for the analysis, and learners' suggestions about the non-face-to-face lecture method were analyzed with the wordcloud analysis method of R to derive implications for improving the quality of education. Findings: In the overall satisfaction survey for the entire non-face-to-face class, the factors with the greatest dissatisfaction were, in order: complaints about the adequacy of learning materials and activities (quizzes, discussions, assignments, etc.), complaints about how to use the produced content, and complaints about announcements on class management (lecture schedule, lecture method). The factors of dissatisfaction were clear in non-face-to-face classes where interactive communication was impossible or insufficient. In addition to the lack of quick Q&A, there seems to have been some neglect.

Necessity of the Physical Distribution Cooperation to Enhance Competitive Capabilities of Healthcare SCM -Bigdata Business Model's Viewpoint- (의료 SCM 경쟁역량 강화를 위한 물류공동화 도입 필요성 -빅데이터 비즈니스 모델 관점-)

  • Park, Kwang-O;Jung, Dae-Hyun;Kwon, Sang-Min
    • Management & Information Systems Review / v.39 no.3 / pp.17-35 / 2020
  • The purpose of this study is to develop business models for current situational scenarios that reflect customer needs, and to emphasize the need for implementing a logistics cooperation system, by analyzing big data to strengthen SCM competitive capabilities. The healthcare SCM competitiveness factors relevant to the intent to use logistics cooperation were divided into product quality, price leadership, hand-over speed, and process flexibility for examination. In the wordcloud results analyzing the major considerations for achieving work efficiency between medical institutes, words such as unexpected situations, information sharing, delivery, real-time, and convenience were mentioned frequently. This can be interpreted as expressing the need to construct a system that can respond immediately to emergency situations on weekends. Furthermore, in addition to the pursuit of communication and convenience, the importance of real-time information sharing that contributes to efficient inventory management was evident. Accordingly, it is judged necessary to aim for a business model that can enhance the visibility of the logistics pipeline in real time using on-site big data analysis. Analysis of the effects of supply chain network adaptability on healthcare SCM competitiveness revealed that competitive capabilities can be obtained through the implementation of logistics cooperation. Stronger partnerships such as logistics cooperation will lead to SCM competitive capabilities. It will be necessary to strengthen SCM competitiveness by seeking a strategic approach that promotes mutual partnerships among companies using the joint logistics system of medical institutes. In particular, it will be necessary to search for ways to utilize HCSM through big data analysis in line with the construction of a logistics cooperation system.

Text Mining Analysis Technique on ECDIS Accident Report (텍스트 마이닝 기법을 활용한 ECDIS 사고보고서 분석)

  • Lee, Jeong-Seok;Lee, Bo-Kyeong;Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety / v.25 no.4 / pp.405-412 / 2019
  • SOLAS requires that ECDIS be installed on ships of more than 500 gross tonnage engaged in international navigation by the first inspection after July 1, 2018. Several accidents related to the use of ECDIS have occurred since its installation as a new major navigation instrument. The 12 incident reports issued by MAIB, BSU, BEAmer, DMAIB, and DSB were analyzed, and the cause of the accidents was determined to be related to the operation of the navigator and the ECDIS system. The text was analyzed using R to quantitatively analyze words related to the cause of the accidents. We used text mining techniques such as Wordcloud, Wordnetwork, and Wordweight to represent the importance of words according to their frequency. Wordcloud uses the N-gram model as a way of expressing the frequency of used words in cloud form. In the uni-gram analysis of the N-gram model, the word "ECDIS" appeared most often, and the bi-gram analysis showed that "Safety Contour" was used most frequently. Based on the bi-gram analysis, the causative words were classified into those concerning the officer and those concerning the ECDIS system, and the related words were represented by Wordnetwork. Finally, the words related to the officer and to the ECDIS system were each assembled into a word corpus, and Wordweight was applied to analyze the change in corpus frequency by year. Analysis of the corpus variation with a trend line graph shows that, more recently, the corpus of the officer has decreased while the corpus of the ECDIS system is gradually increasing.
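
The uni-gram and bi-gram counting described above can be sketched as follows. The study used R; this Python version is only an illustrative stand-in, and the helper `ngrams` and the sample sentence are hypothetical rather than taken from the incident reports.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sentence, not text from the analyzed reports.
tokens = "the safety contour alarm was off and the safety contour was wrong".split()

unigrams = Counter(ngrams(tokens, 1))  # single-word frequencies
bigrams = Counter(ngrams(tokens, 2))   # adjacent word-pair frequencies

print(bigrams[("safety", "contour")])
```

Counting pairs instead of single words is what lets a two-word term such as "Safety Contour" surface as one unit rather than as two unrelated words.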

A Study on the Development of Korean Defense Standards through Text Mining-Based Trend Analysis of United States Defense Standards (텍스트 마이닝 기반의 미국 국방 표준 동향 분석을 통한 한국 국방 표준의 발전 방안 연구)

  • Chae, Soohwan;Shim, Bohyun;Yeom, Seulki;Hong, Seongdon
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.3 / pp.651-660 / 2021
  • This study examined trends in standards established in the United States to find points that can be applied to Korean defense standards. The titles of various United States defense standard documents registered on the web were selected for this research. The wordcloud was created after analyzing the frequency of words appearing in the titles using text mining. The trend of words appearing in MIL-STD by era was obtained. This study identified words that appear often because of the format of the document itself, words that appear regularly throughout the eras, words that were used frequently in the past but are not used much at present, and words that did not receive attention in the past but appear recurrently at present. In addition, the characteristics of each document were derived through the wordclouds produced for various defense documents. In conclusion, Korean defense standards also require consideration of safe and efficient management, transport, and load design of hazardous materials. Furthermore, the quality of defense standards can be expected to improve if a defense standard document system focused on efficient management can be established.

A case study of Digital humanities lecture on Marcel Proust's À La Recherche du temps perdu (마르셀 프루스트의 『잃어버린 시간을 찾아서』에 대한 디지털인문학적 강의 운영 사례 연구)

  • Jinyoung MIN
    • The Journal of the Convergence on Culture Technology / v.9 no.4 / pp.269-275 / 2023
  • In 2021, the 150th anniversary of Proust's birth, and in 2022, the 100th anniversary of his death, interest in À la recherche du temps perdu increased. We took a digital humanities approach to make these seven novels, known to be difficult, more accessible to Korean students majoring in French literature. We had the students use big data analysis tools and look for clues to understanding the works through the visualized data. We identified the main characters and places that appear in his works with Wordcloud, and examined awareness of Proust domestically and abroad through various big data analysis sites such as Big Kinds and Textom. The students commented that, through the methodology of digital humanities, they gradually broadened their understanding of Proust's 『In Search of Lost Time』 rather than giving up on it as too difficult. This study confirmed that applying big data analysis and digital humanities is an appropriate teaching method for helping students broaden their understanding of French literature.

Data Visualization based on Academic Research Papers (학술 연구논문 데이터에 기반한 시각화)

  • Lee, HyunChang;Shin, SeongYoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.99-100 / 2018
  • Citations of academic research papers are a very important outcome for academic researchers, and their use is becoming an important evaluation factor. Most papers include keywords chosen by the authors. However, in some papers the textual content and the presented keywords have little relevance to each other. Therefore, it is necessary to extract and present important keywords for the titles and abstracts of papers through objective methods. In this paper, we present the results of deriving important keywords through data visualization for academic research papers.

A Study on the Perception of Xiaomi in Korea Using Big Data (빅데이터를 활용한 국내 샤오미에 관한 인식 연구)

  • Jae-Young Moon;Eun-Ji Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.343-344 / 2023
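  • In this paper, we use big data analysis to examine the keyword "Xiaomi", a smart-device company and manufacturer that has recently become a topic of discussion. Xiaomi ranked first in the world among smartphone manufacturers in 2021 and entered the Global Top 100 Brands (2022) for the first time at 84th place, making it one of the most rapidly growing companies. In particular, as its market share in Korea gradually grows, we seek to understand the perceptions of Korean consumers and its future position in Korea. Using data on the keyword "Xiaomi" collected through channels on Korean portals and SNS, we carry out keyword analysis, word cloud analysis, topic modeling, and other analyses to present recent perceptions of Xiaomi in Korea and its future direction.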
