• Title/Summary/Keyword: news bigdata

Abusive Detection Using Bidirectional Long Short-Term Memory Networks (양방향 장단기 메모리 신경망을 이용한 욕설 검출)

  • Na, In-Seop;Lee, Sin-Woo;Lee, Jae-Hak;Koh, Jin-Gwang
    • The Journal of Bigdata / v.4 no.2 / pp.35-45 / 2019
  • Recently, the damage and social cost of malicious comments have been increasing, as seen in news reports of celebrities who took their own lives after being targeted by such comments. The damage from malicious comments, including abusive language and slang, is spreading in various types and forms throughout society. In this paper, we propose a technique for detecting abusive language using a bidirectional long short-term memory (LSTM) neural network model. We collected comments from the web with a web crawler and removed, as stopwords, unused tokens such as English letters and special characters. For the stopword-filtered comments, a bidirectional LSTM model that considers the words both before and after each word in a sentence was used to detect abusive language. To use the bidirectional LSTM, the comments were subjected to morphological analysis and vectorization, and each word was labeled as abusive or not. Experimental results showed a performance of 88.79% on a total of 9,288 screened and collected comments.
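
The stopword step described in this abstract can be illustrated with a minimal sketch. It assumes that "unused" tokens means ASCII letters plus any character that is not Hangul, a digit, or whitespace; the paper's actual stopword list is not given, and the function name `clean_comment` is hypothetical.

```python
import re

# Assumption: Latin letters and all non-Hangul, non-digit symbols are dropped.
# The paper's real stopword list is not stated in the abstract.
_LATIN = re.compile(r"[A-Za-z]+")
# Keep digits, whitespace, Hangul syllables (U+AC00-U+D7A3) and
# Hangul compatibility jamo (U+3131-U+318E); everything else is removed.
_NON_TEXT = re.compile(r"[^0-9\s\uac00-\ud7a3\u3131-\u318e]")
_SPACES = re.compile(r"\s+")


def clean_comment(comment):
    """Drop Latin letters and special characters, then collapse whitespace."""
    text = _LATIN.sub(" ", comment)
    text = _NON_TEXT.sub(" ", text)
    return _SPACES.sub(" ", text).strip()
```

The cleaned text would then go to morphological analysis and vectorization before reaching the bidirectional LSTM; those stages are not sketched here.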

A Study on the Smart Tourism Awareness through Bigdata Analysis

  • LEE, Song-Yi;LEE, Hwan-Soo
    • The Journal of Industrial Distribution & Business / v.11 no.5 / pp.45-52 / 2020
  • Purpose: In the era of the 4th industrial revolution, services that incorporate various smart technologies in the tourism sector have begun to gain popularity. Accordingly, academic discussions on smart tourism have also become active in various fields. Despite recent research, the definition of smart tourism is still ambiguous, and it is not easy to differentiate its scope or characteristics from traditional tourism concepts. Thus, this study aims to analyze the perception of smart tourism expressed online to identify the current state of smart tourism in Korea and present a research direction for conceptualizing smart tourism suited to the domestic situation. Research design, data, and methodology: This study analyzes the perception of smart tourism expressed online based on 20,198 news items from portal sites over the past six years. Data on words used with smart tourism were collected from the leading portal sites Naver, Daum, and Google. Text mining techniques were applied to identify the state of social awareness of smart tourism. Network analysis was used to visualize the relationships between words related to smart tourism, and CONCOR analysis was conducted to derive clusters of similar words. Results: Keyword analysis showed that words related to the development and construction of smart tourism areas occurred frequently. The analysis of connection centrality between words showed that keyword frequencies were similar, and that the words "smartphones" and "China" had relatively high connection centrality. The results of network analysis and CONCOR indicated that the words formed eight groups: related technologies, promotion, globalization, service introduction, innovation, regional society, activation, and utilization guide. Overall, the data analysis showed that the development of smart tourism cities was a prominent issue. Conclusions: This study is meaningful in that it clearly reflects the differences in the perception of smart tourism between online discourse and research trends, despite various efforts to develop smart tourism in Korea. In addition, this study highlights the need to understand smart tourism concepts and enhance academic discussions. It is expected that such academic discussions will contribute to improving the competitiveness of smart tourism research in Korea.
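
The connection centrality reported in this abstract is commonly called degree centrality. As a rough illustration, the sketch below computes normalized degree centrality on a word co-occurrence network. It assumes two words are linked when they appear in the same document; the study's actual network construction is not described, the toy keyword lists are hypothetical, and CONCOR clustering is not shown.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents):
    """Return normalized degree centrality for a word co-occurrence network.

    Two words are linked if they appear together in at least one document.
    """
    neighbors = defaultdict(set)
    for words in documents:
        unique = set(words)
        for word in unique:
            neighbors[word]  # touching the key registers isolated words too
        for a, b in combinations(sorted(unique), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {word: 0.0 for word in neighbors}
    # Degree divided by the maximum possible number of neighbors (n - 1).
    return {word: len(adj) / (n - 1) for word, adj in neighbors.items()}


# Toy keyword lists (hypothetical, not the study's data).
docs = [
    ["smart", "tourism", "smartphone"],
    ["smart", "tourism", "China"],
    ["smartphone", "China", "city"],
]
centrality = degree_centrality(docs)
```

In this toy network, "smartphone" and "China" are linked to every other word, so they receive the highest score.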

COVID-19 Discourse and Social Welfare Intervention through Online News Big Data: Focusing on the Elderly Living Alone (온라인 뉴스 빅데이터를 통한 코로나 19 담론과 사회복지 개입방안: 독거노인을 중심으로)

  • Yeo, Jiyoung
    • Journal of the Korea Gerontological Society (한국노년학) / v.41 no.3 / pp.353-371 / 2021
  • The purpose of this study is to provide clues for social welfare policy making by revealing the discourse on social intervention and response, based on big data on the elderly living alone during the COVID-19 situation. Keyword analysis, network analysis, and topic analysis were used to explore how news media have portrayed the challenges facing older individuals and how central and local governments as well as private organizations have responded to them. The results are as follows. First, networks (degree, closeness, betweenness) were formed around region, delivery, society, support, and vulnerability, suggesting an increased demand for economic assistance and social support as well as stronger service delivery systems. Second, the key topics derived included "Establishing public delivery systems", "Establishing local networks", "Managing care gap", "Establishing a private economic support system", and "Establishing service organization system". Based on these results, the study presents the discourse on the organic role of government, communities, and the private sector, and draws policy and practical implications by proposing a discussion on how to intervene for the elderly living alone in disaster situations such as COVID-19.

Social Factors Affecting Internet Searches on Cyber Bullying in Korea and America Using Social Big Data and Google Search Trends (소셜 빅데이터와 Google 검색트렌드를 활용한 한국과 미국의 사이버불링 검색에 영향을 미치는 요인 분석)

  • Song, Tae-Min;Song, Juyoung;Cheon, Mi-Kyung
    • The Journal of Bigdata / v.1 no.1 / pp.67-75 / 2016
  • The study analyzed big data extracted from Google and social media to identify factors related to searches on cyber bullying in Korea and America. The analysis of cyber bullying in Korea used social big data collected from online news sites, blogs, cafés, social network services, and message boards between January 1, 2011 and March 31, 2013. Google search trends for the search words stress, exercise, drinking, and cyber bullying were obtained for the period from January 1, 2004 to December 22, 2013. The main results of this study were as follows. First, stress was a more significant factor for cyber bullying in Korea than in America. Second, a positive relationship was found between stress and drinking, exercise, and cyber bullying in both Korea and America. Third, significant differences between Korea and America were found for all paths. The study shows that both adults and teenagers are affected in Korea. Because actual exposure to cyber bullying is related to psychological and behavioral characteristics, an online application is needed that can intervene in real time when cyber bullying behavior is predicted.

Examining the Disparity between Court's Assessment of Cognitive Impairment and Online Public Perception through Natural Language Processing (NLP): An Empirical Investigation (Natural Language Processing(NLP)를 활용한 법원의 판결과 온라인상 대중 인식간 괴리에 관한 실증 연구)

  • Seungkook Roh
    • The Journal of Bigdata / v.8 no.1 / pp.11-22 / 2023
  • This research aimed to examine the public's perception of the "rate of sentence reduction for reasons of mental and physical weakness" and investigate if it aligns with the actual practice. Various sources, such as the Supreme Court's Courtnet search system, the number of mental evaluation requests, and the number of articles and comments related to "mental weakness" on Naver News were utilized for the analysis. The findings indicate that the public has a negative opinion on reducing sentences due to mental and physical weakness, and they are dissatisfied with the vagueness of the standards. However, this study also confirms that the court strictly applies the reduction of responsibility for individuals with mental disabilities specified in Article 10 of the Criminal Act based on the analysis of actual judgments and the number of requests for psychiatric evaluation. In other words, even though the recognition of perpetrators' mental disorders is declining, the public does not seem to recognize this trend. This creates a negative impact on the public's trust in state institutions. Therefore, law enforcement agencies, such as the police and prosecutors, need to enforce the law according to clear standards to gain public trust. The judiciary also needs to make a firm decision on commuting sentences for mentally and physically infirm individuals and inform the public of the outcomes of its application.

Korean Food Review Analysis Using Large Language Models: Sentiment Analysis and Multi-Labeling for Food Safety Hazard Detection (대형 언어 모델을 활용한 한국어 식품 리뷰 분석: 감성분석과 다중 라벨링을 통한 식품안전 위해 탐지 연구)

  • Eun-Seon Choi;Kyung-Hee Lee;Wan-Sup Cho
    • The Journal of Bigdata / v.9 no.1 / pp.75-88 / 2024
  • Recently, there have been cases reported in the news of individuals experiencing symptoms of food poisoning after consuming raw beef purchased from online platforms, or reviews claiming that cherry tomatoes tasted bitter. This suggests the potential for analyzing food reviews on online platforms to detect food hazards, enabling government agencies, food manufacturers, and distributors to manage consumer food safety risks. This study proposes a classification model that uses sentiment analysis and large language models to analyze food reviews and detect negative ones, multi-labeling key food safety hazards (food poisoning, spoilage, chemical odors, foreign objects). The sentiment analysis model effectively minimized the misclassification of negative reviews with a low False Positive rate using a 'funnel' model. The multi-labeling model for food safety hazards showed high performance with both recall and accuracy over 96% when using GPT-4 Turbo compared to GPT-3.5. Government agencies, food manufacturers, and distributors can use the proposed model to monitor consumer reviews in real-time, detect potential food safety issues early, and manage risks. Such a system can protect corporate brand reputation, enhance consumer protection, and ultimately improve consumer health and safety.

A Comparative Analysis of the Prediction Models for the Direction of Stock Price Using the Online Company Reviews (기업 리뷰 정보를 활용한 주가 방향 예측 모델 비교 분석)

  • Lim, Yongtaek;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.11 no.8 / pp.165-171 / 2020
  • Most stock price prediction research using text mining relies on news and SNS data. However, a weakness of these sources is that it is difficult to get honest and vivid information about companies from them. This paper addresses the problem of predicting the direction of stock prices by text mining online company reviews written by internal staff, which indicate employee satisfaction. A comparative analysis of prediction models for the direction of stock prices showed that the model that adds internal employee reviews performs better than those that do not. This paper presents a convergence study that applies natural language processing to financial engineering. In the field of stock price prediction, this paper pursued a new methodology that uses employee satisfaction. In practice, it is expected to provide useful information for forecasting stock price direction.

A domain-specific sentiment lexicon construction method for stock index directionality (주가지수 방향성 예측을 위한 도메인 맞춤형 감성사전 구축방안)

  • Kim, Jae-Bong;Kim, Hyoung-Joong
    • Journal of Digital Contents Society / v.18 no.3 / pp.585-592 / 2017
  • As the development of personal devices has made everyday use of the internet much easier than before, finding and sharing information through social media is becoming commonplace. In particular, communities specialized in each field have become so powerful that they can significantly influence society. As a result, businesses and governments pay attention to reflecting these communities' opinions in their strategies. The stock market fluctuates with various social factors. To take social trends into account, many studies have applied big data analysis to stock market research, in addition to traditional approaches based on buzz volume. As a representative example, studies using text data such as newspaper articles are being published. In this paper, we analyzed posts on 'Paxnet', a site specializing in securities, to supplement the limitations of news data. Based on this, we help researchers analyze investor sentiment by generating a domain-specific sentiment lexicon for the stock market.

Big Data Analysis on Daegu-Gyeongbuk Administrative Integration (대구·경북 행정통합에 대한 빅데이터 분석)

  • Song, Hwa Young;Park, Han Woo
    • The Journal of the Korea Contents Association / v.21 no.5 / pp.139-148 / 2021
  • The study examines public attitudes and reactions regarding administrative integration in the Daegu-Gyeongbuk area. Specifically, it employs social big data, including textual comments on online news articles and YouTube video clips. The collected data are analyzed to compare two periods, that is, before and after the inauguration of the Public Opinion Committee for One Daegu-Gyeongbuk. We found that people's favorable response to administrative integration has gradually increased since the launch of the Committee. However, specific administrative procedures and discussion topics are still missing from the frequently used words in the collected data. Thus, the Committee needs to provide a variety of information and materials related to administrative integration.

A Study on Applying Novel Reverse N-Gram for Construction of Natural Language Processing Dictionary for Healthcare Big Data Analysis (헬스케어 분야 빅데이터 분석을 위한 개체명 사전구축에 새로운 역 N-Gram 적용 연구)

  • KyungHyun Lee;RackJune Baek;WooSu Kim
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.391-396 / 2024
  • This study proposes a novel reverse N-Gram approach to overcome the limitations of traditional N-Gram methods and enhance performance in building an entity dictionary specialized for the healthcare sector. The proposed reverse N-Gram technique allows for more precise analysis and processing of the complex linguistic features of healthcare-related big data. To verify the efficiency of the proposed method, big data on healthcare and digital health announced during the Consumer Electronics Show (CES) held each January was collected. Using the Python programming language, 2,185 news titles and summaries mentioned from January 1 to 31 in 2010 and from January 1 to 31 in 2024 were preprocessed with the new reverse N-Gram method. This resulted in the stable construction of a dictionary for natural language processing in the healthcare field.
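
The abstract does not define the reverse N-Gram procedure, so no attempt is made to reproduce it here. For context only, the sketch below shows the conventional word N-gram extraction that the proposed method is contrasted with. It assumes whitespace tokenization, and the function name `word_ngrams` is hypothetical.

```python
def word_ngrams(text, n):
    """Return the conventional left-to-right word n-grams of a text.

    This is the traditional baseline, not the paper's reverse N-Gram method.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.split()  # assumption: simple whitespace tokenization
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical news title, not taken from the study's data.
bigrams = word_ngrams("digital health at CES", 2)
```

A text with fewer than `n` tokens yields an empty list, which is one practical limitation when working with short news titles.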