• Title/Summary/Keyword: Big data Processing

Search Result 1,063, Processing Time 0.031 seconds

A Comparative Analysis of the Changes in Perception of the Fourth Industrial Revolution: Focusing on Analyzing Social Media Data (4차 산업혁명에 대한 인식 변화 비교 분석: 소셜 미디어 데이터 분석을 중심으로)

  • You, Jae Eun;Choi, Jong Woo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.11
    • /
    • pp.367-376
    • /
    • 2020
  • The fourth industrial revolution will greatly contribute to the entry of objects into an intelligent society through technologies such as big data and an artificial intelligence. Through the revolution, we were able to understand human behavior and awareness, and through the use of an artificial intelligence, we established ourselves as a key tool in various fields such as medicine and science. However, the fourth industrial revolution has a negative side with a positive future. In this study, an analysis was conducted using text mining techniques based on unstructured big data collected through social media. We wanted to look at keywords related to the fourth industrial revolution by year (2016, 2017 and 2018) and understand the meaning of each keyword. In addition, we understood how the keywords related to the Fourth Industrial Revolution changed with the change of the year and wanted to use R to conduct a Keyword Analysis to identify the recognition flow closely related to the Fourth Industrial Revolution through the keyword flow associated with the Fourth Industrial Revolution. Finally, people's perceptions of the fourth industrial revolution were identified by looking at the positive and negative feelings related to the fourth industrial revolution by year. The analysis showed that negative opinions were declining year after year, with more positive outlook and future.

Development of an intelligent skin condition diagnosis information system based on social media

  • Kim, Hyung-Hoon;Ohk, Seung-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.8
    • /
    • pp.241-251
    • /
    • 2022
  • Diagnosis and management of customer's skin condition is an important essential function in the cosmetics and beauty industry. As the social media environment spreads and generalizes to all fields of society, the interaction of questions and answers to various and delicate concerns and requirements regarding the diagnosis and management of skin conditions is being actively dealt with in the social media community. However, since social media information is very diverse and atypical big data, an intelligent skin condition diagnosis system that combines appropriate skin condition information analysis and artificial intelligence technology is necessary. In this paper, we developed the skin condition diagnosis system SCDIS to intelligently diagnose and manage the skin condition of customers by processing the text analysis information of social media into learning data. In SCDIS, an artificial neural network model, AnnTFIDF, that automatically diagnoses skin condition types using artificial neural network technology, a deep learning machine learning method, was built up and used. The performance of the artificial neural network model AnnTFIDF was analyzed using test sample data, and the accuracy of the skin condition type diagnosis prediction value showed a high performance of about 95%. Through the experimental and performance analysis results of this paper, SCDIS can be evaluated as an intelligent tool that can be used efficiently in the skin condition analysis and diagnosis management process in the cosmetic and beauty industry. And this study can be used as a basic research to solve the new technology trend, customized cosmetics manufacturing and consumer-oriented beauty industry technology demand.

Analysis of Resident's Satisfaction and Its Determining Factors on Residential Environment: Using Zigbang's Apartment Review Bigdata and Deeplearning-based BERT Model (주거환경에 대한 거주민의 만족도와 영향요인 분석 - 직방 아파트 리뷰 빅데이터와 딥러닝 기반 BERT 모형을 활용하여 - )

  • Kweon, Junhyeon;Lee, Sugie
    • Journal of the Korean Regional Science Association
    • /
    • v.39 no.2
    • /
    • pp.47-61
    • /
    • 2023
  • Satisfaction on the residential environment is a major factor influencing the choice of residence and migration, and is directly related to the quality of life in the city. As online services of real estate increases, people's evaluation on the residential environment can be easily checked and it is possible to analyze their satisfaction and its determining factors based on their evaluation. This means that a larger amount of evaluation can be used more efficiently than previously used methods such as surveys. This study analyzed the residential environment reviews of about 30,000 apartment residents collected from 'Zigbang', an online real estate service in Seoul. The apartment review of Zigbang consists of an evaluation grade on a 5-point scale and the evaluation content directly described by the dweller. At first, this study labeled apartment reviews as positive and negative based on the scores of recommended reviews that include comprehensive evaluation about apartment. Next, to classify them automatically, developed a model by using Bidirectional Encoder Representations from Transformers(BERT), a deep learning-based natural language processing model. After that, by using SHapley Additive exPlanation(SHAP), extract word tokens that play an important role in the classification of reviews, to derive determining factors of the evaluation of the residential environment. Furthermore, by analyzing related keywords using Word2Vec, priority considerations for improving satisfaction on the residential environment were suggested. This study is meaningful that suggested a model that automatically classifies satisfaction on the residential environment into positive and negative by using apartment review big data and deep learning, which are qualitative evaluation data of residents, so that it's determining factors were derived. The result of analysis can be used as elementary data for improving the satisfaction on the residential environment, and can be used in the future evaluation of the residential environment near the apartment complex, and the design and evaluation of new complexes and infrastructure.

Clustering Algorithm using the DFP-Tree based on the MapReduce (맵리듀스 기반 DFP-Tree를 이용한 클러스터링 알고리즘)

  • Seo, Young-Won;Kim, Chang-soo
    • Journal of Internet Computing and Services
    • /
    • v.16 no.6
    • /
    • pp.23-30
    • /
    • 2015
  • As BigData is issued, many applications that operate based on the results of data analysis have been developed, typically applications are products recommend service of e-commerce application service system, search service on the search engine service and friend list recommend system of social network service. In this paper, we suggests a decision frequent pattern tree that is combined the origin frequent pattern tree that is mining similar pattern to appear in the data set of the existing data mining techniques and decision tree based on the theory of computer science. The decision frequent pattern tree algorithm improves about problem of frequent pattern tree that have to make some a lot's pattern so it is to hard to analyze about data. We also proposes to model for a Mapredue framework that is a programming model to help to operate in distributed environment.

Item Recommendation Technique Using Spark (Spark를 이용한 항목 추천 기법에 관한 연구)

  • Yun, So-Young;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.5
    • /
    • pp.715-721
    • /
    • 2018
  • With the spread of mobile devices, the users of social network services or e-commerce sites have increased dramatically, and the amount of data produced by the users has increased exponentially. E-commerce companies have faced a task regarding how to extract useful information from a vast amount of data produced by the users. To solve this problem, there are various studies applying big data processing technique. In this paper, we propose a collaborative filtering method that applies the tag weight in the Apache Spark platform. In order to elevate the accuracy of recommendation, the proposed method refines the tag data in the preprocessing process and categorizes the items and then applies the information of periods and tag weight to the estimate rating of the items. After generating RDD, we calculate item similarity and prediction values and recommend items to users. The experiment result indicated that the proposed method process large amounts of data quickly and improve the appropriateness of recommendation better.

A Study on Adaptive Learning Model for Performance Improvement of Stream Analytics (실시간 데이터 분석의 성능개선을 위한 적응형 학습 모델 연구)

  • Ku, Jin-Hee
    • Journal of Convergence for Information Technology
    • /
    • v.8 no.1
    • /
    • pp.201-206
    • /
    • 2018
  • Recently, as technologies for realizing artificial intelligence have become more common, machine learning is widely used. Machine learning provides insight into collecting large amounts of data, batch processing, and taking final action, but the effects of the work are not immediately integrated into the learning process. In this paper proposed an adaptive learning model to improve the performance of real-time stream analysis as a big business issue. Adaptive learning generates the ensemble by adapting to the complexity of the data set, and the algorithm uses the data needed to determine the optimal data point to sample. In an experiment for six standard data sets, the adaptive learning model outperformed the simple machine learning model for classification at the learning time and accuracy. In particular, the support vector machine showed excellent performance at the end of all ensembles. Adaptive learning is expected to be applicable to a wide range of problems that need to be adaptively updated in the inference of changes in various parameters over time.

Exploring Strategy of Health Contents for Smart Media : Utilizing Information and Data (스마트 미디어 환경에 적합한 헬스 콘텐츠 전략 탐색 : 정보와 데이터 활용을 중심으로)

  • Yoon, Hongsuk;Shin, Dong-Hee
    • Journal of Digital Contents Society
    • /
    • v.16 no.1
    • /
    • pp.85-96
    • /
    • 2015
  • With the emergence of smart media and devices which are able to continuous data monitor, the usage patterns of access and acquire health information are changed. Health contents should be provided to promote intrinsic motivation considering the way of cognizing and processing data in human information interaction. For this, planning for prevention centered health contents is required to utilize personal information and medical big data for engagement with the contents. Therefore, this paper reviews previous studies of health communication dealing with users' information literacy like e-health literacy. In addition, the paper classifies ways of communication when mediate IT as immediacy, interaction and data capturing. In conclusion, strategies of health contents for promoting users' intrinsic motivation are explored and its implications are discussed.

Design of Meteorological Radar Pattern Classifier Using Clustering-based RBFNNs : Comparative Studies and Analysis (클러스터링 기반 RBFNNs를 이용한 기상레이더 패턴분류기 설계 : 비교 연구 및 해석)

  • Choi, Woo-Yong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.24 no.5
    • /
    • pp.536-541
    • /
    • 2014
  • Data through meteorological radar includes ground echo, sea-clutter echo, anomalous propagation echo, clear echo and so on. Each echo is a kind of non-precipitation echoes and the characteristic of individual echoes is analyzed in order to identify with non-precipitation. Meteorological radar data is analyzed through pre-processing procedure because the data is given as big data. In this study, echo pattern classifier is designed to distinguish non-precipitation echoes from precipitation echo in meteorological radar data using RBFNNs and echo judgement module. Output performance is compared and analyzed by using both HCM clustering-based RBFNNs and FCM clustering-based RBFNNs.

An Implementation of Federated Learning based on Blockchain (블록체인 기반의 연합학습 구현)

  • Park, June Beom;Park, Jong Sou
    • The Journal of Bigdata
    • /
    • v.5 no.1
    • /
    • pp.89-96
    • /
    • 2020
  • Deep learning using an artificial neural network has been recently researched and developed in various fields such as image recognition, big data and data analysis. However, federated learning has emerged to solve issues of data privacy invasion and problems that increase the cost and time required to learn. Federated learning presented learning techniques that would bring the benefits of distributed processing system while solving the problems of existing deep learning, but there were still problems with server-client system and motivations for providing learning data. So, we replaced the role of the server with a blockchain system in federated learning, and conducted research to solve the privacy and security problems that are associated with federated learning. In addition, we have implemented a blockchain-based system that motivates users by paying compensation for data provided by users, and requires less maintenance costs while maintaining the same accuracy as existing learning. In this paper, we present the experimental results to show the validity of the blockchain-based system, and compare the results of the existing federated learning with the blockchain-based federated learning. In addition, as a future study, we ended the thesis by presenting solutions to security problems and applicable business fields.

An Analysis on the Safety Accident Network and Risk Level of Construction Machine and Equipment (건설기계·장비의 안전재해 네트워크 및 위험도 분석)

  • Shin, Won-Sang;Son, Chang-Baek
    • Journal of the Architectural Institute of Korea Structure & Construction
    • /
    • v.34 no.5
    • /
    • pp.35-42
    • /
    • 2018
  • In order to seek out methods to reduce safety accidents caused by construction machinery and equipment, this study collects data about safety accidents and draws main risk factors by construction from the data, through SNA. It aimed to suggest safety management points to be used in future construction fields, by analyzing risk index of such factors. The finding can be summarized: First, Backhoe Bucket is the risk factor for crash accidents of average workers in earth works; boring machines-maintenance is the risk factor for fall accidents of construction machinery operators in foundation works; bending machine-reinforcing rod processing is the risk factor for jamming accidents of reinforcing rod engineers in frame works; and mobile crane-hook is the risk factor for crash accidents of average workers in lifting works. Second, works can be arranged in turn, according to the risk index: earth, lifting, frame and foundation works. Risk factors can be also arranged according to the risk index: Backhoe in earth works, pile drivers in foundation works, bending machines in frame works and mobile cranes in lifting works. This study has some limits, in that it only analyzed main machinery/equipment, among various kinds of them, for earth, foundation, frame and temporary works (lifting works) and used data collected over three years. Therefore, it is necessary to conduct an analysis using big data, by collecting additional data about a lot of machinery/equipment in future construction fields.