• Title/Summary/Keyword: news data

A Study on Public Awareness of Landslide and Check Dam Using the Big Data Platform 'Hyean' (공공 빅데이터 플랫폼 '혜안'을 통한 산사태 및 사방댐 인식 분석)

  • Sohee Park;Min Jeng Kang;Song Eu
    • Journal of the Society of Disaster Information / v.18 no.4 / pp.687-698 / 2022
  • Purpose: This study was conducted to understand public awareness of landslides and check dams in 2015-2020 using the big data platform 'Hyean' and to assess the platform's usefulness in the disaster prevention field. Method: The total volume, the number of detections by period and by media type, and the affirmative and negative trends of searches for 'landslide' and 'check dam' in 2015-2020 were analyzed using the keyword search of 'Hyean.' Result: Public awareness of check dams is significantly lower than that of landslides, and the difference is most visible in the conspicuous gap in data volume between news and SNS media. The number and timing of searches for 'landslide' coincided with actual landslide occurrences, while detections of 'check dam' were less related to them. A relatively affirmative perception of check dams is inferred, but it was difficult to confirm statistically accurate affirmative and negative trends in the disaster prevention field using 'Hyean.' Conclusion: Contrary to experts' expectation of positive public awareness of check dams, the statistical results show that public awareness of check dams as an effective countermeasure against landslides was extremely low. Active promotion of erosion control projects should come first, and balanced sample surveys should accompany online and periodic field surveys. Since 'Hyean' has limits in capturing actual perceptions in the disaster prevention field, great caution is needed when using it to establish local or governmental policies.

A Study on Establishing a Market Entry Strategy for the Satellite Industry Using Future Signal Detection Techniques (미래신호 탐지 기법을 활용한 위성산업 시장의 진입 전략 수립 연구)

  • Sehyoung Kim;Jaehyeong Park;Hansol Lee;Juyoung Kang
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.249-265 / 2023
  • Recently, the satellite industry has turned its attention to the private-led 'New Space' paradigm, a departure from the traditional government-led industry. The space industry, regarded as a future growth engine, still receives relatively little attention in Korea compared to the global market. Therefore, the purpose of this study is to explore future signals that can help determine the market entry strategies of private companies in the domestic satellite industry. To this end, this study draws on future signal theory and the Keyword Portfolio Map method to analyze keyword potential in patent document data based on keyword growth rate and keyword occurrence frequency. In addition, news data was collected to categorize future signals as either first symptoms or early information. This categorization serves as an interpretive indicator of how the keywords reveal their actual potential outside of patent documents. This study describes the process of data collection and analysis used to explore future signals and, by visualizing keyword maps, traces the evolution of each keyword in the collected documents from a weak signal to a strong signal. The research process contributes methodologically to existing research on future signals and expands its scope, and the results can support new industry planning and research directions in the satellite industry.
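
As a rough illustration of the Keyword Portfolio Map idea described in this abstract, the sketch below places each keyword in a quadrant by comparing its occurrence frequency and growth rate against the median. The function names, the median split, the quadrant labels, and the sample counts are all assumptions made for illustration; none of them are taken from the study.

```python
from statistics import median

def growth_rate(counts):
    """Average period-over-period growth of a keyword's yearly counts."""
    rates = [(b - a) / a for a, b in zip(counts, counts[1:]) if a > 0]
    return sum(rates) / len(rates) if rates else 0.0

def portfolio_map(yearly_counts):
    """Assign each keyword to a quadrant using median splits (an assumption)."""
    freq = {k: sum(v) for k, v in yearly_counts.items()}
    growth = {k: growth_rate(v) for k, v in yearly_counts.items()}
    f_mid, g_mid = median(freq.values()), median(growth.values())
    labels = {}
    for k in yearly_counts:
        high_f, high_g = freq[k] > f_mid, growth[k] > g_mid
        if high_g:
            labels[k] = "strong signal" if high_f else "weak signal"
        else:
            labels[k] = "well-known, not strong" if high_f else "latent signal"
    return labels

# Hypothetical yearly keyword counts, made up for this example.
sample = {
    "launch vehicle": [40, 44, 46],
    "small satellite": [30, 45, 70],
    "debris removal": [2, 4, 9],
    "ground antenna": [5, 5, 4],
}
labels = portfolio_map(sample)
```

In this toy data, a rare but fast-growing keyword ends up as a weak signal, while a frequent and fast-growing one ends up as a strong signal. A real analysis would need to choose its own thresholds and time windows.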

A Study on Tourism Behavior in the New normal Era Using Big Data (빅데이터를 활용한 뉴노멀(New normal)시대의 관광행태 변화에 관한 연구)

  • Kyoung-mi Yoo;Jong-cheon Kang;Youn-hee Choi
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.167-181 / 2023
  • This study utilized TEXTOM, a social network analysis program, to analyze changes in tourism behavior after the travel restrictions imposed following the outbreak of COVID-19 were eased. Data on the keywords 'domestic travel' and 'overseas travel' were collected from blogs, cafes, and news provided by Naver, Google, and Daum. The collection period was set from April to December 2022, after social distancing was lifted, and 2019 and 2020 were each treated as one-year periods and compared with 2022. A total of 80 keywords were extracted through text mining, and centrality analysis was performed using NetDraw. Finally, through CONCOR analysis, the correlated keywords were grouped into four clusters. The results show that tourism behavior in 2022 is characterized by a recovery of tourism to the level before the outbreak of COVID-19, segmentation of travel by each person's preferred theme, and selection of a tourist destination after first considering each country's COVID-19 mitigation policy. The study is expected to provide basic data for the development of tourism marketing strategies and tourism products for the newly emerging tourism ecosystem after COVID-19.

A study on Korean tourism trends using social big data -Focusing on sentiment analysis- (소셜 빅데이터를 활용한 한국관광 트렌드에 관한연구 -감성분석을 중심으로-)

  • Youn-hee Choi;Kyoung-mi Yoo
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.97-109 / 2024
  • In the field of domestic tourism, trend analysis of tourism consumers, both international and domestic tourists, is essential not only for the Korean tourism market but also for local and governmental tourism policy makers. We explore keywords and sentiment on social media in order to establish a marketing strategy and revitalize the domestic tourism industry using communication and information from tourism consumers. This study utilized TEXTOM 6.0 to analyze recent trends in Korean tourism. Data was collected from September 31, 2022, to August 31, 2023, using 'Korean tourism' and 'domestic tourism' as keywords, targeting blogs, cafes, and news provided by Naver, Daum, and Google. Through text mining, the top 100 keywords by frequency and their TF-IDF values were extracted, and then CONCOR analysis and sentiment analysis were conducted. For Korean tourism keywords, words related to tourist destinations, travel companions and behaviors, tourism motivations and experiences, accommodation types, tourist information, and emotional connections ranked high. The results of the CONCOR analysis were categorized into five clusters: tourist destinations, tourist information, tourist activities/experiences, tourism motivation/content, and inbound tourism. Finally, the sentiment analysis showed a high proportion of positive documents and vocabulary. This study analyzes the rapidly changing trends of Korean tourism through text mining and is expected to provide meaningful data for promoting domestic tourism not only for Koreans but also for foreigners visiting Korea.
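
The TF-IDF step mentioned in this abstract can be sketched as follows. This is a minimal illustration assuming raw term frequency and idf = log(N / df); the weighting that TEXTOM actually applies is not stated in the abstract, and the tokenized sample documents are invented.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Uses raw term frequency and idf = log(N / df); other weightings exist.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return scores

# Hypothetical, already-tokenized documents.
docs = [
    ["jeju", "travel", "hotel"],
    ["seoul", "travel", "food"],
    ["jeju", "beach", "travel"],
]
scores = tf_idf(docs)
```

A term that occurs in every document, such as "travel" here, gets a score of zero under this weighting, which is why TF-IDF rankings can differ from plain frequency rankings.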

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

  • Kam, Miah;Song, Min
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.53-77 / 2012
  • This study analyzes the differences in content and tone of argument among three major Korean newspapers, the Kyunghyang Shinmun, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that newspapers in Korea explicitly convey their own tone of argument when they cover sensitive issues and topics. This can be problematic if readers are unaware of a newspaper's tone of argument, because both the content and the tone can easily affect readers. Thus it is desirable to have a new tool that can inform readers of what tone of argument a newspaper has. This study presents the results of clustering and classification techniques as part of text mining analysis. We focus on six main sections, namely Culture, Politics, International, Editorial-opinion, Eco-business and National issues, and attempt to identify differences and similarities among the newspapers. The basic unit of text mining analysis is a paragraph of a news article. This study uses a keyword-network analysis tool and visualizes relationships among keywords to make the differences easier to see. Newspaper articles were gathered from KINDS, the Korean integrated news database system. KINDS preserves news articles of the Kyunghyang Shinmun, the HanKyoreh and the Dong-A Ilbo, and these are open to the public. About 3,030 articles from 2008 to 2012 were used. Articles in the International, National issues and Politics sections were gathered on specific issues. The International section was collected with the keyword 'Nuclear weapon of North Korea.' The National issues section was collected with the keyword '4-major-river.' The Politics section was collected with the keyword 'Tonghap-Jinbo Dang.' All of the articles from April 2012 to May 2012 in the Eco-business, Culture and Editorial-opinion sections were also collected. 
All of the collected data were split and edited into paragraphs. We removed stop-words using the Lucene Korean Module. We calculated keyword co-occurrence counts from the list of keyword pairs co-occurring in a paragraph and built a co-occurrence matrix from that list. Once the co-occurrence matrix was built, we used the cosine coefficient matrix as input for PFNet (Pathfinder Network). In order to analyze the three newspapers and find the significant keywords in each paper, we examined the list of the 10 highest-frequency keywords and the keyword networks of the 20 highest-frequency keywords to closely examine the relationships and show a detailed network map among keywords. We used NodeXL software to visualize the PFNet. After drawing all the networks, we compared the results with the classification results. Classification was first carried out to identify how the tone of argument of one newspaper differs from the others. Then, to analyze tones of arguments, all the paragraphs were divided into two types of tone, Positive and Negative. To identify and classify the tones of all the paragraphs and articles we had collected, a supervised learning technique was used. The Naïve Bayesian classifier algorithm provided in the MALLET package was used to classify all the paragraphs in the articles. After classification, Precision, Recall and F-value were used to evaluate the classification results. Based on the results of this study, three sections, Culture, Eco-business and Politics, showed some differences in content and tone of argument among the three newspapers. In addition, for National issues, the tones of argument on the 4-major-rivers project differed from one another. It seems that the three newspapers have their own specific tone of argument in those sections. The keyword networks also showed different shapes for the same period and the same section. 
This means that the frequently appearing keywords differ across newspapers and that their contents are composed of different keywords. The Positive-Negative classification also showed that a newspaper's tone of argument can be distinguished from that of the others. These results indicate that the approach in this study is promising and could be extended into a new tool for identifying the different tones of arguments of newspapers.
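
The co-occurrence and cosine steps described in this abstract can be sketched as below. This is a simplified illustration, not the authors' pipeline: it assumes paragraphs are already tokenized with stop-words removed, it stops before the Pathfinder Network step, and the function names and sample paragraphs are invented.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence(paragraphs):
    """Count how often each keyword pair appears in the same paragraph."""
    counts = Counter()
    for words in paragraphs:
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts

def cosine_matrix(counts):
    """Cosine similarity between keywords' co-occurrence row vectors."""
    vocab = sorted({w for pair in counts for w in pair})
    index = {w: i for i, w in enumerate(vocab)}
    rows = {w: [0] * len(vocab) for w in vocab}
    for (a, b), c in counts.items():
        rows[a][index[b]] = c
        rows[b][index[a]] = c

    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0

    return {(a, b): cos(rows[a], rows[b]) for a, b in combinations(vocab, 2)}

# Hypothetical tokenized paragraphs.
paragraphs = [
    ["river", "project", "budget"],
    ["river", "project"],
    ["budget", "election"],
]
counts = cooccurrence(paragraphs)
similarity = cosine_matrix(counts)
```

The resulting similarity values would then serve as edge weights for a network-pruning method such as PFNet, which this sketch does not implement.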

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Discovering the trend of an issue in SNS Big Data can serve as an important new source of value because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) Provide the topic keyword set that corresponds to the daily ranking; (2) Visualize the daily time series graph of a topic for the duration of a month; (3) Provide the importance of a topic through a treemap based on the score system and frequency; (4) Visualize the daily time-series graph of keywords by searching for a keyword. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, for processing unrefined, unstructured data. In addition, such analysis requires recent big data technology to rapidly process a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single-node computing to thousands of machines. 
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its main goals are data accessibility and data processing performance. In the Age of Big Data, the visualization of Big Data is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data Driven Documents that bind arbitrary data to the Document Object Model (DOM); interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
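
To make the daily ranking and keyword time-series functions more concrete, here is a small sketch. It swaps in plain term frequency for the topic model used by TITS, uses whitespace tokenization instead of noun extraction, and runs in memory rather than on Hadoop or MongoDB; the stop-word list, function names, and sample tweets are hypothetical.

```python
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "is", "to", "of"}  # illustrative only

def daily_keyword_counts(tweets):
    """Group (date, text) tweets by day and count non-stop-word tokens."""
    per_day = defaultdict(Counter)
    for date, text in tweets:
        tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
        per_day[date].update(tokens)
    return per_day

def daily_ranking(per_day, top_n=3):
    """Top-N keywords for each day, most frequent first."""
    return {day: [w for w, _ in c.most_common(top_n)] for day, c in per_day.items()}

def keyword_series(per_day, keyword):
    """Daily time series of one keyword's frequency, sorted by date."""
    return [(day, per_day[day][keyword]) for day in sorted(per_day)]

# Hypothetical tweets, made up for this example.
tweets = [
    ("2013-03-01", "the election is close"),
    ("2013-03-01", "election debate tonight"),
    ("2013-03-02", "baseball season opens"),
    ("2013-03-02", "election results and baseball"),
]
per_day = daily_keyword_counts(tweets)
```

Calling `keyword_series(per_day, "election")` yields the per-day counts that a time-series chart of a searched keyword would plot.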

UX Methodology Study by Data Analysis Focusing on deriving persona through customer segment classification (데이터 분석을 통한 UX 방법론 연구 고객 세그먼트 분류를 통한 페르소나 도출을 중심으로)

  • Lee, Seul-Yi;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.151-176 / 2021
  • As the information technology industry develops, various kinds of data are being created, and it is now essential to process and use them in industry. Analyzing and utilizing the digital data collected online and offline is a necessary step in providing an appropriate experience for customers. In order to create new businesses, products, and services, it is essential to use customer data collected in various ways to deeply understand potential customers' needs and to analyze behavior patterns that reveal hidden signals of desire. However, research on data analysis and research on UX methodology, which should be conducted in parallel for effective service development, are being conducted separately, and there is a lack of examples of their combined use in industry. In this work, we construct a single process by combining data analysis methods and UX methodologies. This study is important in that its process is highly likely to be adopted, because it applies methodologies that are actively used in practice. We conducted a survey on the topic to identify and cluster the associations between factors in order to establish a customer classification and target customers. The research methods are as follows. First, we conduct factor and regression analyses to determine the associations between factors in the happiness survey data. Respondents are grouped according to the survey results, and we identify the relationships among 34 questions on psychological stability, family life, relational satisfaction, health, economic satisfaction, work satisfaction, daily life satisfaction, and residential environment satisfaction. Second, we classify clusters based on factors affecting happiness and extract the optimal number of clusters. Based on the results, we cross-analyzed the characteristics of each cluster. Third, for service definition, analysis was conducted by correlating the results with keywords related to happiness. 
We leverage keyword analysis of the thumb trend to derive ideas based on the interest in and associations of each keyword. We also collected approximately 11,000 news articles based on the top three keywords that are highly related to happiness, then derived issues between keywords through text mining analysis in SAS, and used them to define services after ideas were conceived. Fourth, based on the characteristics identified through data analysis, we carried out segmentation and targeting appropriate for service discovery. To this end, the factors were grouped by their characteristics into four groups, a profile was drawn up for each, and the main target customers were selected. Fifth, based on the characteristics of the main target customers, interviewees were selected and in-depth interviews were conducted to discover the causes of happiness, the causes of unhappiness, and needs for services. Sixth, we derive customer behavior patterns based on the segment results and detailed interviews, and specify the objectives associated with the characteristics. Seventh, a typical persona based on qualitative surveys and a persona based on data were produced, and the two were compared to analyze the characteristics and the pros and cons of each. Existing market segmentation classifies customers based on purchasing factors, while UX methodology measures users' behavior variables to establish criteria and redefine the classification of users. Applying these segment classification methods to the process of producing user classifications and personas in UX methodology can yield more accurate customer classification schemes. The significance of this study can be summarized in two ways. First, the idea of using data to create a variety of services, a current hot topic, was linked to the UX methodology used to plan IT services. Second, we further enhance user classification by applying segment analysis methods that are not currently well used in UX methodologies. 
To provide a consistent experience in creating a single service, from large to small, it is necessary to define customers with common goals. To this end, it is necessary to derive personas and persuade various stakeholders. Under these circumstances, designing a consistent experience from beginning to end through fast and concrete user descriptions would be a very effective way to produce a successful service.

Sports Celebrities as a Determinant of Sport Media Distribution Contents: Focusing on Tacit Premise of Agenda Setting Theory (스포츠미디어의 유통 콘텐츠 결정요인으로서 스포츠 스타: 의제설정 이론의 암묵적 전제를 중심으로)

  • YOO, Sang-Keon;KIM, Yong-Eun;SEO, Won-Jae
    • Journal of Distribution Science / v.17 no.10 / pp.83-91 / 2019
  • Purpose - Media is a significant distribution channel in sport. In determining who influences the construction of sport media content, recent sport media studies have employed agenda-setting theory, assuming that the media itself is the agenda provider. In real-world situations, however, sports stars have been deemed a key factor determining distribution content in sport. The starting point of this study is the "tacit premise" of agenda-setting theory. Drawing on agenda-setting theory, the current study explores the function of sports stars as agenda providers, a key determinant of sport distribution. Research design, data, and methodology - This study reviewed articles about Yuna Kim, Sang-hwa Lee, and Hyun-jin Ryu from daily newspapers including the Dong-A Ilbo and the JoongAng Ilbo (2013 to 2017). The study collected data in portable document format (PDF) from the online archives of the Dong-A Ilbo and the JoongAng Ilbo. We coded the length of the article, the frequency, the size of the picture, and the structural form of the article. Inter-coder reliability was compared with data previously investigated by the researcher. Inter-coder reliabilities for studies 1 and 2 were .89 and .85. To examine the hypotheses, descriptive analysis, correlations, and cross-tab analysis were performed. Results - The results partially supported the hypotheses proposing a significant role for sports stars as agenda setters in distributing sport media content. Specifically, the study found that the number of articles about sports stars exceeded the number of articles about regular athletes. In addition, the study found that photos were used more frequently in articles about sports stars than in those about regular athletes. In sports newspaper articles, feature stories were used more than straight news articles for news relating to sports stars. Also, newspaper coverage of sports stars contained more information from within an event than from outside it. 
Conclusions - In sports journalism, this study challenges the current theory that the media determines the composition and content of sports coverage. The influence of sports stars as a principle of agenda-setting in sports media should continue to be examined in follow-up studies.

Optimistic Concurrency Control with Update Transaction First for Broadcast Environment : OCC/UTF (방송환경에서 갱신 거래 우선 낙관적 동시성 제어 기법)

  • Lee, Uk-Hyeon;Hwang, Bu-Hyeon
    • The KIPS Transactions:PartD / v.9D no.2 / pp.185-194 / 2002
  • Most mobile computing systems mainly serve read-only transactions from mobile clients that retrieve various types of information such as stock data, traffic information and news updates. Since previous concurrency control protocols do not consider this characteristic, performance degrades when they are applied to the broadcast environment. In this paper, we propose OCC/UTF (Optimistic Concurrency Control with Update Transaction First), which is well suited to the broadcast environment. OCC/UTF lets a query transaction that has already read a data item invalidated by an update transaction read the same data item again, instead of aborting the query transaction due to non-serializability. Therefore, serializable order is maintained and the query transaction is committed safely regardless of the commitment of update transactions. In OCC/UTF, clients need not ask the server to commit their query transactions. Because the validation reports broadcast to clients include recently updated values, the overhead of requesting recent values from the server is reduced and the server need not re-broadcast the newest values. As a result, OCC/UTF makes full use of the asymmetric bandwidth. It can also improve transaction throughput by increasing the commit ratio of query transactions as much as possible.
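
The re-read behavior described in this abstract can be pictured with the toy model below. It is a loose illustration only: the class, its methods, and the report format are hypothetical, and the paper's actual validation rules and serialization-order handling are not given in the abstract and are not reproduced here.

```python
class QueryTransaction:
    """Read-only client transaction that refreshes stale reads instead of aborting."""

    def __init__(self):
        self.read_set = {}   # item -> value read so far
        self.rereads = 0     # how many stale reads were refreshed

    def read(self, item, broadcast):
        """Read an item from the current broadcast cycle."""
        self.read_set[item] = broadcast[item]
        return self.read_set[item]

    def on_validation_report(self, updated):
        """Apply a report of {item: new_value} pairs written by committed updates."""
        for item, value in updated.items():
            if item in self.read_set and self.read_set[item] != value:
                self.read_set[item] = value   # re-read from the report itself
                self.rereads += 1

    def commit(self):
        """Commit locally; no request is sent to the server."""
        return dict(self.read_set)

# Hypothetical broadcast data and validation report.
broadcast = {"stock": 100, "traffic": "clear"}
txn = QueryTransaction()
txn.read("stock", broadcast)
txn.on_validation_report({"stock": 105, "news": "headline"})
result = txn.commit()
```

The point of the sketch is that the updated value arrives inside the validation report, so the client neither aborts nor contacts the server to fetch it.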

Monitoring Seasonal Influenza Epidemics in Korea through Query Search (인터넷 검색어를 활용한 계절적 유행성 독감 발생 감지)

  • Kwon, Chi-Myung;Hwang, Sung-Won;Jung, Jae-Un
    • Journal of the Korea Society for Simulation / v.23 no.4 / pp.31-39 / 2014
  • Seasonal influenza epidemics cause 3 to 5 million cases of severe illness and 250,000 to 500,000 deaths worldwide each year. To better control severe influenza epidemics, many studies have proposed methods for near real-time surveillance of the spread of influenza. The Korea CDC publishes clinical data on influenza epidemics on a weekly basis, typically with a 1-2-week reporting lag. To detect epidemics faster, approaches using unofficial data such as news reports, social media, and search queries have recently been suggested. Such data is cheap to collect and is available in near real time. This research aims to develop regression models for early detection of seasonal influenza outbreaks in Korea using keyword query information provided by the Naver (a representative Korean portal site) trend services for PC and mobile devices. We selected 20 keywords likely to have strong correlations with influenza-like illness (ILI) based on a literature review and proposed a logistic regression model and a multiple regression model to predict ILI outbreaks. In terms of model fit, the multiple regression model shows better results than the logistic regression model. We also find that the mobile-based regression model is better than the PC-based regression model at estimating ILI percentages.
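
A multiple regression of ILI percentage on keyword query volumes can be sketched as follows. This is a minimal ordinary least-squares fit via the normal equations in plain Python, under the assumption of two predictors; the function names and all numbers are made up, and it does not reproduce the study's 20 keywords or its logistic model.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_multiple_regression(features, target):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)

# Made-up weekly query volumes for two keywords and a made-up ILI percentage
# generated exactly as 1 + 2*q1 + 3*q2, so the fit should recover 1, 2, 3.
queries = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ili = [1 + 2 * q1 + 3 * q2 for q1, q2 in queries]
coefficients = fit_multiple_regression(queries, ili)
```

Fitting one such model on PC query volumes and another on mobile query volumes, then comparing their errors, would mirror the comparison the abstract reports. Perfectly collinear predictors make the normal equations singular, so this sketch would fail with a division by zero on them.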