• Title/Summary/Keyword: Text data

An Analysis of Causes of Marine Incidents at sea Using Big Data Technique (빅데이터 기법을 활용한 항해 중 준해양사고 발생원인 분석에 관한 연구)

  • Kang, Suk-Young;Kim, Ki-Sun;Kim, Hong-Beom;Rho, Beom-Seok
    • Journal of the Korean Society of Marine Environment & Safety / v.24 no.4 / pp.408-414 / 2018
  • Various studies have been conducted to reduce marine accidents, but research on marine incidents remains marginal. Many marine incident reports exist, but existing studies have treated them qualitatively, which makes quantitative analysis difficult. Quantitative analysis is nevertheless necessary to reduce marine incidents. The purpose of this paper is to analyze marine incident data quantitatively by applying big data techniques, in order to predict marine incident trends and reduce marine accidents. To accomplish this, about 10,000 marine incident reports were converted into a unified format through preprocessing. Using the preprocessed data, we first derived major keywords for marine incidents at sea using text mining techniques. Second, time series and cluster analysis were applied to the major keywords, and trends for possible marine incidents were predicted. The results confirmed that quantified data and statistical analysis can be used to address this topic. We also confirmed that big data techniques can provide information on preventive measures by capturing objective tendencies in marine incidents that may occur in the future.
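The keyword and time series steps described in this abstract can be sketched as follows. This is a minimal illustration only: the reports, the stopword list, and the function names `top_keywords` and `yearly_counts` are hypothetical and are not taken from the paper, which does not state its exact tooling.

```python
import re
from collections import Counter, defaultdict

# Hypothetical incident reports as (year, text); not data from the paper.
REPORTS = [
    (2015, "Near collision with fishing vessel in restricted visibility"),
    (2016, "Anchor dragging in strong wind near fishing vessel"),
    (2017, "Near collision while overtaking in narrow channel"),
    (2017, "Engine failure followed by near collision with fishing vessel"),
]

# Assumed stopword list for this toy example.
STOPWORDS = {"with", "in", "by", "while", "followed"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_keywords(reports, k):
    """Return the k most frequent tokens across all reports."""
    counts = Counter()
    for _, text in reports:
        counts.update(tokenize(text))
    return counts.most_common(k)


def yearly_counts(reports, keyword):
    """Count occurrences of one keyword per year, giving a small time series."""
    series = defaultdict(int)
    for year, text in reports:
        series[year] += tokenize(text).count(keyword)
    return dict(sorted(series.items()))


print(top_keywords(REPORTS, 3))
print(yearly_counts(REPORTS, "collision"))
```

A per-year series like the one returned by `yearly_counts` is the kind of input a trend or cluster analysis would start from.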

A Comparative Study on the Social Awareness of Metaverse in Korea and China: Using Big Data Analysis (한국과 중국의 메타버스에 관한 사회적 인식의 비교연구: 빅데이터 분석의 활용)

  • Ki-youn Kim
    • Journal of Internet Computing and Services / v.24 no.1 / pp.71-86 / 2023
  • The purpose of this exploratory study is to use big data analysis to compare how the public in Korea and China perceives the metaverse. Owing to the COVID-19 pandemic, technological progress, and the expansion of new consumer bases such as Generation Z and Generation Alpha, worldwide interest in the metaverse has grown, and related academic studies have been in full swing since 2021. In particular, Korea and China have emerged as leading countries in the metaverse industry. At a time when mentions of the metaverse have skyrocketed, it is timely to ask how social awareness differs, using the big data accumulated in both countries. The analysis identifies the importance of keywords by computing word frequency, N-grams, and TF-IDF on cleaned data through text mining, and analyzes the density and centrality of semantic networks to determine the strength of connections between words and their semantic relevance. Python 3.9 Anaconda data science platform 3 and Textom 6 versions were used, and UCINET 6.759 was used for semantic network analysis, structural CONCOR analysis, and visualization. As a result, four blocks were derived, each of which is a group of similar words. These blocks represent different perspectives that reflect the types of social perception of the metaverse in the two countries. Studies on the metaverse are increasing, but comparative studies between countries from a cross-cultural perspective have not yet been conducted. As an early study of this kind, this work can provide theoretical grounds and meaningful insights for future studies.
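The N-gram and TF-IDF measures named in this abstract can be sketched in plain Python. This is a minimal illustration under assumptions: the tokenized documents are hypothetical, and the weighting shown (term frequency times log(N/df)) is one common TF-IDF variant; tools such as Textom may use a different formula.

```python
import math
from collections import Counter

# Hypothetical, already-tokenized documents; not data from the study.
DOCS = [
    ["metaverse", "game", "platform", "game"],
    ["metaverse", "education", "platform"],
    ["metaverse", "investment", "stock"],
]


def bigrams(tokens):
    """Return adjacent token pairs (N-grams with N = 2)."""
    return list(zip(tokens, tokens[1:]))


def tf_idf(docs):
    """Score each term per document as (count / doc length) * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


scores = tf_idf(DOCS)
print(bigrams(DOCS[1]))
print(sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True))
```

A term that appears in every document, such as "metaverse" here, receives a score of zero under this variant, which is why TF-IDF is used alongside raw frequency rather than instead of it.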

Design and Implementation of Effectively Interactive Data Structure Web Courseware (효과적으로 상호작용하는 자료구조 웹 코스웨어의 설계 및 구현)

  • Cho, Sang-Young;Lee, Hyun-Jung
    • The Journal of Korean Association of Computer Education / v.11 no.1 / pp.75-83 / 2008
  • Earlier data structure courseware has been limited to simple screen layouts with text, pictures, or plain animation, so it has failed to promote useful interaction between learners and instructors and has unnecessarily crowded the screen. To overcome these problems, this paper provides an applet-based simulation environment that lets learners run and control data structure operations on their own data, so that they can participate actively in their studies with this courseware. Instructors can easily deliver educational content to learners by using web simulation suited to IT education media. The courseware can also provide class feedback and the data needed to assess students by recording a log.

Data Empowered Insights for Sustainability of Korean MNEs

  • PARK, Young-Eun
    • The Journal of Asian Finance, Economics and Business / v.6 no.3 / pp.173-183 / 2019
  • This study aims to use big data from news and social media to develop corporate strategy and support global decision-making for multinational enterprises through data mining, especially text mining. Data from two news media (BBC and CNN) and two social media (Facebook and Twitter) were collected for three leading global Korean companies (Samsung, Hyundai Motor Company, and LG) from April 2018 to April 2019. The findings show that both traditional news media and modern social media have become powerful tools for extracting global trends or phenomena for businesses. Moreover, a company can adopt a two-track strategy across the two types of media, deriving key issues or trends from news media channels while grasping consumers' sentiments, preferences, or issues of interest, such as battery or design, from social media. In addition, analyzing the texts of these media and understanding the association rules contributes greatly to comparing the two types of media channels and identifying their differences. Lastly, the study provides meaningful and valuable data-empowered insights for finding a future direction and developing a global strategy for business sustainability.
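The association rules mentioned in this abstract rest on two measures, support and confidence. The sketch below computes them for term pairs; the posts, thresholds, and function names are hypothetical and do not reproduce the study's data or software.

```python
from itertools import combinations

# Hypothetical posts, each reduced to the set of terms it mentions.
POSTS = [
    {"battery", "phone", "price"},
    {"battery", "phone"},
    {"design", "phone"},
    {"battery", "design", "phone"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support, min_confidence):
    """Return rules lhs -> rhs over term pairs as (lhs, rhs, support, confidence)."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found


for rule in pair_rules(POSTS, min_support=0.5, min_confidence=0.9):
    print(rule)
```

Restricting the search to pairs keeps the example short; a full Apriori-style search would also grow larger itemsets from the frequent pairs.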

Trend Analysis on Clothing Care System of Consumer from Big Data (빅데이터를 통한 소비자의 의복관리방식 트렌드 분석)

  • Koo, Young Seok
    • Fashion & Textile Research Journal / v.22 no.5 / pp.639-649 / 2020
  • This study investigates consumer opinions on clothing care and provides fundamental data for decision-making in the future development of clothing care systems. Textom, a web-matrix program, was used to analyze big data collected from Naver and Daum with the keyword "clothing care" from March 2019 to February 2020. A total of 22,187 texts were collected. The collected data were analyzed using text mining, network analysis, and CONCOR analysis. The results were as follows. First, frequency analysis revealed many keywords related to clothing care, such as Style, Dryer, LG Electronics, Product, Customer, Clothing, and Styler. Consumers were well aware of, and interested in, recent information about clothing care systems. Second, various keywords fundamentally related to clothing care, such as product, function, brand, and performance, were linked to each other. Interest in clothing care products was linked to product brands, which in turn were linked to consumer interest. Third, CONCOR analysis showed that the keywords in the network shared similar attributes and could be classified into four groups: characteristics of purchase, product, performance, and interest. Lastly, sentiment analysis showed that positive emotions toward clothing care systems, including goodwill, interest, and joy, were strongly expressed.

Analyzing Box-Office Hit Factors Using Big Data: Focusing on Korean Films for the Last 5 Years

  • Hwang, Youngmee;Kim, Kwangsun;Kwon, Ohyoung;Moon, Ilyoung;Shin, Gangho;Ham, Jongho;Park, Jintae
    • Journal of information and communication convergence engineering / v.15 no.4 / pp.217-226 / 2017
  • Korea has the tenth largest film industry in the world; however, detailed analyses of the factors contributing to successful film commercialization have not been attempted. Using big data, this paper analyzed both internal and external factors (including genre, release date, rating, and number of screenings) that contributed to the commercial success of Korea's top 10 ranking films in 2011-2015. The authors developed a web crawler to collect text data about each movie, implemented a Hadoop system for data storage, and classified the data using the MapReduce method. The results showed that "release date," followed closely by "rating" and "genre," was the most influential factor of success in the Korean film industry. The analysis in this study is considered groundwork for the development of software that can predict box-office performance.
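The MapReduce method named in this abstract follows a map, shuffle, and reduce pattern, sketched below as a single-process word count. This is an illustration only: the records and function names are hypothetical, and a real Hadoop job would distribute these phases across a cluster rather than run them in one Python process.

```python
from collections import defaultdict

# Hypothetical crawled records as (movie, text); not data from the paper.
RECORDS = [
    ("Movie A", "great action great cast"),
    ("Movie B", "action thriller"),
]


def map_phase(record):
    """Emit a (word, 1) pair for every word in one record."""
    _, text = record
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle, and reduce in sequence and return word counts."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(map_reduce(RECORDS))
```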

Applying Keyword Analysis to Predicting Agriculture Product Price Index: The Case of the Chinese Farming Market

  • Wang, Zhi-yuan;Kwon, Ohbyung;Liu, Fan
    • Asia Pacific Journal of Business Review / v.1 no.1 / pp.1-22 / 2016
  • The prediction of prices of agricultural products in the agriculture IT sector plays a significant role in the economic life of consumers and anyone engaged in agricultural business, and because these prices fluctuate more often than other prices, their prediction holds a great deal of research promise. For this reason, the academic literature has provided studies on the factors influencing the prices of agricultural products and the price index. However, as these factors vary, they are difficult to predict, which makes it challenging to acquire quantitative data. China is one example of a country without a reliable prediction system for prices of agricultural products. Fortunately, disclosed heterogeneous data can be found on the Internet, which allows factors related to the prediction of these product prices to be collected effectively through text mining. Data available online are valuable in that they reflect the opinions of the general public in real time. Accordingly, this study aims to use heterogeneous data from the Internet and suggest a model predicting the prices of agricultural products before functional analyses. Toward this end, data analyses were conducted on the Chinese agricultural products market, one of the largest markets in the world.

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.93-96 / 2018
  • Big data analytics technology, which can store and analyze data to obtain new knowledge, has gained importance in many fields of society. Big data is emerging as an important issue in information and communication technology, and the need for continued technical development is growing. R, a tool for analyzing big data, is a language and environment for statistics-based information analysis. In this paper, we use R to analyze Bible data: we investigate the frequency distribution of the text and analyze the Bible through social network analysis.
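The paper works in R; purely for illustration, the same two steps (counting co-occurrences and building a simple network measure from them) are sketched here in Python. The verse data and function names are hypothetical, and the paper does not specify which network measures it computed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical units of text, each reduced to the names it mentions.
VERSES = [
    ["abraham", "isaac"],
    ["isaac", "jacob"],
    ["abraham", "isaac", "jacob"],
    ["moses"],
]


def cooccurrence_edges(units):
    """Count how often each pair of names appears in the same unit."""
    edges = Counter()
    for names in units:
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights attached to each node (a simple centrality)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


edges = cooccurrence_edges(VERSES)
print(edges.most_common())
print(weighted_degree(edges).most_common())
```

Names that never co-occur with another name, such as "moses" in this toy data, produce no edges and therefore do not appear in the network.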

A Speech Homomorphic Encryption Scheme with Less Data Expansion in Cloud Computing

  • Shi, Canghong;Wang, Hongxia;Hu, Yi;Qian, Qing;Zhao, Hong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.5 / pp.2588-2609 / 2019
  • Speech homomorphic encryption has become one of the key components of secure speech storage in public cloud computing. Its major problem is the huge data expansion of the speech ciphertext. To address this issue, this paper presents a speech homomorphic encryption scheme with less data expansion, which is a probabilistic, additively homomorphic cryptosystem. In the proposed scheme, the original digital speech, together with some selected random numbers, is first grouped to form a series of speech matrices. Then, a proposed matrix encryption method is used to encrypt each speech matrix. After that, mutual information in sample speech ciphertexts is reduced to limit the data expansion. Performance analysis and experimental results show that the proposed scheme is additively homomorphic, and that it not only resists statistical analysis attacks but also eliminates some signal characteristics of the original speech. In addition, compared with the Paillier homomorphic cryptosystem, the proposed scheme has less data expansion and lower computational complexity. Furthermore, the time consumption of the proposed scheme is almost the same on a smartphone and a PC. Thus, the proposed scheme is well suited to secure speech storage in public cloud computing.
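To make "additively homomorphic" and "data expansion" concrete, the sketch below implements a toy version of the Paillier cryptosystem, the baseline this abstract compares against. It is not the paper's matrix-based scheme. The primes are tiny and insecure, the sample values are hypothetical, and `pow(x, -1, n)` assumes Python 3.8 or later.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu is the inverse of L(g^lam mod n^2) modulo n, where L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    """Encrypt 0 <= m < n with fresh randomness r coprime to n."""
    n, g = pub
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen(61, 53)
c1, c2 = encrypt(pub, 1200), encrypt(pub, 345)  # two hypothetical speech samples
print(decrypt(pub, priv, add_encrypted(pub, c1, c2)))
```

Ciphertexts here live modulo n squared while plaintexts live modulo n, so each encrypted sample needs roughly twice the bits of the modulus. That overhead is the kind of data expansion the abstract refers to.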

Implementation of Recipe Recommendation System Using Ingredients Combination Analysis based on Recipe Data (레시피 데이터 기반의 식재료 궁합 분석을 이용한 레시피 추천 시스템 구현)

  • Min, Seonghee;Oh, Yoosoo
    • Journal of Korea Multimedia Society / v.24 no.8 / pp.1114-1121 / 2021
  • In this paper, we implement a recipe recommendation system using ingredient combination analysis based on recipe data. The proposed system receives an image of a food ingredient purchase receipt and recommends ingredients and recipes to the user. It preprocesses the receipt image and extracts text using an OCR algorithm. The system recommends recipes based on ingredient combination data: it collects recipe data to calculate the combination score for each food ingredient and extracts the ingredients of the collected recipes as training data. It then obtains vector data by training a natural language processing algorithm, and recommends recipes based on ingredients with high similarity. The system can also recommend recipes using replaceable ingredients, and improves the accuracy of the results through preprocessing and postprocessing. For evaluation, we created a random input dataset to measure the performance of the proposed system and calculated the accuracy of each algorithm. In the performance evaluation, the Word2Vec algorithm achieved the highest accuracy.
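The idea of recommending by ingredient similarity can be sketched without a trained model. The example below deliberately swaps Word2Vec for plain co-occurrence count vectors compared by cosine similarity, so it illustrates the similarity step only. The recipes and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical recipes as ingredient lists; not data from the paper.
RECIPES = [
    ["tomato", "basil", "mozzarella"],
    ["tomato", "basil", "garlic"],
    ["pork", "garlic", "ginger"],
    ["pork", "ginger", "scallion"],
]


def cooccurrence_vectors(recipes):
    """Represent each ingredient by counts of the ingredients it appears with."""
    vectors = defaultdict(Counter)
    for recipe in recipes:
        for a, b in permutations(set(recipe), 2):
            vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def most_similar(vectors, ingredient):
    """Return the other ingredient whose vector is closest to the given one."""
    others = [o for o in vectors if o != ingredient]
    return max(others, key=lambda o: cosine(vectors[ingredient], vectors[o]))


vectors = cooccurrence_vectors(RECIPES)
print(most_similar(vectors, "tomato"))
```

Ingredients that share the same partners score as similar even if they never appear together, which is the property a replaceable-ingredient recommendation relies on. A trained embedding such as Word2Vec learns denser vectors with the same comparison step.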