• Title/Summary/Keyword: Text frequency analysis


A User Sentiment Classification Using Instagram image and text Analysis (인스타그램 이미지와 텍스트 분석을 통한 사용자 감정 분류)

  • Hong, Taekeun;Kim, Jeongin;Shin, Juhyun
    • Smart Media Journal / v.5 no.1 / pp.61-68 / 2016
  • With the recent growth in SNS users and the spread of smart devices such as smartphones and tablet PCs, techniques for classifying user emotions from social network information are being actively researched. User emotion classification means identifying a user's emotion from the text and images posted on his or her SNS. This paper proposes a method for classifying user emotions by extracting a representative adjective from the text and, from the images, representative figures using the Canny algorithm and trigonometric functions. The representative adjective is selected as a high-frequency adjective among those extracted from the text, and its positive-negative value is measured with SentiWordNet. The figures extracted from images are reduced to representative figures (triangle, quadrangle, and circle), and user emotions are classified by measuring pleasure-displeasure values according to the type and inclination of the figures. Finally, Plutchik's wheel of emotions is redefined as an x-y graph that represents the pleasure-displeasure and positive-negative values. We expect that classifying user emotions on this redefined wheel of emotions, based on the representative adjectives and figures, can be applied to user-customized services.

Derivation of Green Infrastructure Planning Factors for Reducing Particulate Matter - Using Text Mining - (미세먼지 저감을 위한 그린인프라 계획요소 도출 - 텍스트 마이닝을 활용하여 -)

  • Seok, Youngsun;Song, Kihwan;Han, Hyojoo;Lee, Junga
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.5 / pp.79-96 / 2021
  • Green infrastructure planning represents landscape planning measures to reduce particulate matter. This study aimed to derive factors that may be used in planning green infrastructure for particulate matter reduction using text mining techniques. A range of analyses were carried out by focusing on keywords such as 'particulate matter reduction plan' and 'green infrastructure planning elements'. The analyses included Term Frequency-Inverse Document Frequency (TF-IDF) analysis, centrality analysis, related word analysis, and topic modeling analysis. These analyses were carried out via text mining by collecting information on previous related research, policy reports, and laws. First, TF-IDF analysis results were used to classify major keywords relating to particulate matter and green infrastructure into three groups: (1) environmental issues (e.g., particulate matter, environment, carbon, and atmosphere), (2) target spaces (e.g., urban, park, and local green space), and (3) application methods (e.g., analysis, planning, evaluation, development, ecological aspect, policy management, technology, and resilience). Second, the centrality analysis results were found to be similar to those of TF-IDF; it was confirmed that the central connectors to the major keywords were 'Green New Deal' and 'Vacant land'. The results from the analysis of related words verified that planning green infrastructure for particulate matter reduction required planning forests and ventilation corridors. Additionally, moisture must be considered for microclimate control. It was also confirmed that utilizing vacant space, establishing mixed forests, introducing particulate matter reduction technology, and understanding the system may be important for the effective planning of green infrastructure. Topic analysis was used to classify the planning elements of green infrastructure based on ecological, technological, and social functions.
The planning elements of ecological function were classified into morphological (e.g., urban forest, green space, wall greening) and functional aspects (e.g., climate control, carbon storage and absorption, provision of habitats, and biodiversity for wildlife). The planning elements of technical function were classified into various themes, including the disaster prevention functions of green infrastructure, buffer effects, stormwater management, water purification, and energy reduction. The planning elements of the social function were classified into themes such as community function, improving the health of users, and scenery improvement. These results suggest that green infrastructure planning for particulate matter reduction requires approaches related to key concepts, such as resilience and sustainability. In particular, there is a need to apply green infrastructure planning elements in order to reduce exposure to particulate matter.
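The TF-IDF weighting named in this abstract can be illustrated with a minimal sketch. It assumes the common tf × log(N/df) form, whitespace tokenization, and made-up sample documents; the abstract does not state which TF-IDF variant or preprocessing the authors used, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document, scored as tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Illustrative documents only; they are not taken from the study's corpus.
docs = [
    "green infrastructure reduces particulate matter",
    "urban park planning and green space",
    "particulate matter policy and planning",
]
scores = tf_idf(docs)
```

In this toy corpus, "infrastructure" appears in only one document, so it scores higher in the first document than "green", which appears in two. A term present in every document scores zero.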

Text Mining and Visualization of Unstructured Data Using Big Data Analytical Tool R (빅데이터 분석 도구 R을 이용한 비정형 데이터 텍스트 마이닝과 시각화)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.9 / pp.1199-1205 / 2021
  • In the era of big data, it is very important to effectively analyze not only structured data well organized in databases but also unstructured big data, such as web documents, e-mails, and social data generated in real time on the Internet, on social network services, and in mobile environments. Big data analysis is the process of creating new value by discovering meaningful new correlations, patterns, and trends in stored big data. We summarize and visualize the results of a frequency analysis of unstructured article data using the R language, a big data analysis tool. The data analyzed in this study were a total of 104 papers published in Mon-May 2021 in the journal of the Korea Institute of Information and Communication Engineering. In the final analysis results, the most frequently mentioned keyword was "Data", which ranked first with 1,538 occurrences. Based on the results of the analysis, the limitations of the study and its theoretical implications are suggested.
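A keyword frequency analysis of the kind described here can be sketched in a few lines. The study itself used R; the Python version below is only an illustration, and the stopword list, sample text, and the function name `keyword_frequencies` are assumptions rather than details from the paper.

```python
import re
from collections import Counter

# Illustrative stopword list; the study's actual preprocessing is not described here.
STOPWORDS = {"the", "of", "and", "in", "a", "is", "to"}


def keyword_frequencies(text, top_n=3):
    """Lowercase the text, keep alphabetic tokens, drop stopwords, count the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up sample text, not taken from the analyzed papers.
sample = "Big data analysis creates value. Data patterns and data trends matter."
top = keyword_frequencies(sample)
```

Here `top[0]` is `("data", 3)`, mirroring in miniature how a single keyword can dominate a ranked frequency table.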

Analysis of Public Perception and Policy Implications of Foreign Workers through Social Big Data analysis (소셜 빅데이터분석을 통한 외국인근로자에 관한 국민 인식 분석과 정책적 함의)

  • Ha, Jae-Been;Lee, Do-Eun
    • Journal of Digital Convergence / v.19 no.11 / pp.1-10 / 2021
  • This paper aimed to examine public perception of foreign workers on social platforms using text mining, one of the big data techniques, and to draw policy suggestions regarding foreign workers. To this end, data were collected with the search keyword 'Foreign Worker' for the period from Jan. 1 to Dec. 31, 2020; frequency analysis, TF-IDF analysis, and degree centrality analysis were performed; and the top 100 keywords were extracted for comparison. Furthermore, Ucinet6.0 and Netdraw were used to analyze semantic networks, and through CONCOR analysis, the data were clustered into the following eight groups: foreigner policy issue, regional community issue, business owner's perspective issue, employment issue, working environment issue, legal issue, immigration issue, and human rights issue. Based on these results, the study identified public perception of foreign workers and the main issues, and provided basic data for policy proposals on foreign workers and for related research.
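Degree centrality on a keyword co-occurrence network, one of the measures listed in this abstract, can be sketched as follows. The study used Ucinet and Netdraw; this pure-Python version is an assumption-laden illustration in which two keywords are linked if they appear in the same document, centrality is normalized as degree / (n - 1), and the sample keyword lists and the function name `degree_centrality` are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents):
    """Link keywords that co-occur in a document; return degree / (n - 1) per keyword."""
    neighbors = defaultdict(set)
    for keywords in documents:
        unique = sorted(set(keywords))
        for keyword in unique:
            neighbors[keyword]  # register the node even if it has no links
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Made-up keyword lists, one per collected post; not the study's data.
docs = [
    ["worker", "employment", "policy"],
    ["worker", "rights"],
    ["worker", "employment"],
]
centrality = degree_centrality(docs)
```

In this toy network "worker" is linked to every other keyword and gets a centrality of 1.0, while "rights" is linked only to "worker" and gets 1/3.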

Analysis of Topics Related to Population Aging Using Natural Language Processing Techniques (자연어 처리 기술을 활용한 인구 고령화 관련 토픽 분석)

  • Hyunjung Park;Taemin Lee;Heuiseok Lim
    • Journal of Information Technology Services / v.23 no.1 / pp.55-79 / 2024
  • Korea, which is expected to enter a super-aged society in 2025, is facing the most worrisome crisis worldwide. Efforts are urgently required to examine problems and countermeasures from various angles and to remedy shortcomings. In this regard, we intend to derive useful implications from a new viewpoint by applying recent natural language processing techniques to online articles. More specifically, we pose three research questions: First, what topics are being reported in the online media, and what is the public's response to them? Second, what is the relationship between these aging-related topics and individual happiness factors? Third, what strategic directions and benchmarking implications are discussed for solving the problem of population aging? To answer these questions, we collect Naver portal articles related to population aging together with their classification categories, comments, number of comments, and other numerical data. From the data, we first derive 33 topics with a semi-supervised BERTopic that reflects article classification information not used in previous studies, and we conduct sentiment analysis of the comments on these topics with a current open-source large language model. We also examine the relationship between the derived topics and personal happiness factors extended to Alderfer's ERG dimensions, and carry out additional 3- to 4-gram keyword frequency analysis, trend analysis, text network analysis based on 3- to 4-gram keywords, and so on. Through this multifaceted approach, we present diverse fresh insights from practical and theoretical perspectives.
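The 3- to 4-gram keyword frequency analysis mentioned in this abstract can be illustrated with a small sketch. It assumes pre-tokenized text and contiguous word n-grams; the sample comments and the function name `ngram_counts` are hypothetical, and the paper's actual tokenization (for example, Korean morphological analysis) is not reproduced here.

```python
from collections import Counter


def ngram_counts(token_lists, n):
    """Count contiguous n-grams (as tuples) across a list of token lists."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Made-up, pre-tokenized comments; not taken from the collected Naver data.
comments = [
    "low birth rate policy".split(),
    "low birth rate problem".split(),
    "super aged society policy".split(),
]
trigrams = ngram_counts(comments, 3)
fourgrams = ngram_counts(comments, 4)
```

With these inputs the most frequent 3-gram is `("low", "birth", "rate")`, which occurs twice, while each 4-gram occurs once.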

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.93-96 / 2018
  • Big data analytics technology, which can store and analyze data and obtain new knowledge, has gained importance in many fields of society. Big data is emerging as an important issue in the field of information and communication technology, and interest in the technology continues to rise. R, a tool that can analyze big data, is a language and environment for statistics-based information analysis. In this paper, we use R to analyze Bible data: we investigate the frequency distribution of the text and analyze the Bible through social network analysis.


A Study on Social Issues for Hydrogen Industry Using News Big Data (뉴스 빅데이터를 활용한 수소 이슈 탐색)

  • CHOI, ILYOUNG;KIM, HYEA-KYEONG
    • Transactions of the Korean hydrogen and new energy society / v.33 no.2 / pp.121-129 / 2022
  • With the advent of the post-2020 climate regime, the hydrogen industry is growing rapidly around the world. To build the hydrogen economy, it is important to identify social issues related to hydrogen and prepare countermeasures for them. Accordingly, this study conducted a semantic network analysis of hydrogen news from NAVER. The analysis showed that the number of hydrogen news articles in 2020 increased 4.5-fold compared with 2016 and that, from 2018, the hydrogen issue shifted from an environmental aspect to an economic aspect. In addition, although the initially government-led hydrogen industry is expanding into privately led mobility fields such as fuel cell electric vehicles and hydrogen fuel, terms expressing safety concerns, such as explosions, continue to appear. Thus, it is necessary not only to expand the hydrogen ecosystem through the participation of private companies but also to promote hydrogen safety.

A Study on Domestic Research Trends (2001-2020) of Forest Ecology Using Text Mining (텍스트마이닝을 활용한 국내 산림생태 분야 연구동향(2001-2020) 분석)

  • Lee, Jinkyu;Lee, Chang-Bae
    • Journal of Korean Society of Forest Science / v.110 no.3 / pp.308-321 / 2021
  • The purpose of this study was to analyze domestic research trends over the past 20 years and the future direction of forest ecology using text mining. A total of 1,015 academic papers and keyword data related to forest ecology were collected by the "Research and Information Service Section" and analyzed using big data analysis programs, such as Textom and UCINET. From the results of word frequency and N-gram analyses, we found that domestic studies on forest ecology have increased rapidly since 2011. The most common research topic over the past 20 years was "species diversity," and "climate change" has become a major topic since 2011. Based on CONCOR analysis, study subjects were grouped into eight categories: "species diversity," "environmental policy," "climate change," "management," "plant taxonomy," "habitat suitability index," "vascular plants," and "recreation and welfare." Consequently, species diversity and climate change will remain important topics in the future, and diversifying and expanding domestic research topics in line with global research trends is necessary.

Millennial parents' perception of babywearing products: A text analysis approach (밀레니얼 세대의 Babywearing 제품에 대한 인식: 텍스트 분석 접근)

  • Lee, Wan-Gee;Park, Myung-Ja;Lee, Kyu-Hye
    • Journal of the Korea Fashion and Costume Design Association / v.23 no.2 / pp.17-28 / 2021
  • The baby-tech industry, which combines IT with existing parenting products, is attracting increasing attention. Consequently, various types of baby products incorporating functionality and design are being launched. In recent years, particularly as the market segment for babywearing products grows, parenting products that account for the child's comfort and the parents' convenience are required. Therefore, this study examines the characteristics and consumer perception of babywearing products, which are important for the emotional stability, development, and rearing of children. The study applies text mining and network analysis to collected unstructured text data. An examination of the network, based on the keyword frequencies for each babywearing product and on the degree centrality index, revealed that consumers value convenience and price when purchasing products. The consumer perception and consideration factors specific to each product were also identified. In addition, studying body parts with high TF-IDF values revealed a difference in the body parts considered by consumers for each product. Lastly, visualizations based on the keywords were used to examine those that appeared across products and those that appeared for individual products only. Through SNS, product characteristics as well as a new parenting culture of sharing child-rearing routines were confirmed. This study suggests planning and marketing directions for the development of babywearing products that meet consumer needs.

Research on Tourist Perception of Grand Canal Cultural Heritage Based on Network Text Analysis : The Pingjiang Historical and Cultural District of Suzhou City as an example (네트워크 텍스트 분석을 통한 대운하 문화유산에 대한 관광객 인식 연구 : 쑤저우시 핑장역사문화지구의 예)

  • Chengkang Zheng;Qiwei Jing;Nam Kyung Hyeon
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.215-231 / 2023
  • Taking the Pingjiang Historical and Cultural District in Suzhou as an example, this paper collects 1,436 tourist comments from Ctrip.com using Python and applies network text analysis to high-frequency words, the semantic network, and sentiment, so as to evaluate the characteristics and levels of tourist perception of the Grand Canal cultural heritage. The study found that natural and humanistic landscapes, historical and cultural deposits, and the style of the Jiangnan Canal are fully reflected in visitors' perception of the Pingjiang Historical and Cultural District. Tourists hold strong positive emotions towards the district; however, there is still room for its transformation and upgrading. Finally, suggestions for improving tourist perception of the Grand Canal cultural heritage are given in terms of conservation first, cultural integration, and innovative utilization.