• Title/Summary/Keyword: bigdata

Korean Text Classification Using Randomforest and XGBoost Focusing on Seoul Metropolitan Civil Complaint Data (RandomForest와 XGBoost를 활용한 한국어 텍스트 분류: 서울특별시 응답소 민원 데이터를 중심으로)

  • Ha, Ji-Eun;Shin, Hyun-Chul;Lee, Zoon-Ky
    • The Journal of Bigdata, v.2 no.2, pp.95-104, 2017
  • In 2014, the Seoul Metropolitan Government launched a response service aimed at responding promptly to civil complaints. The complaints received are categorized by content and sent to the department in charge. Automating this step would reduce time and labor costs. In this study, we collected 17,700 complaints filed over 7 years, from June 1, 2010 to May 31, 2017. We compared XGBoost with RandomForest to assess their suitability for Korean text classification. XGBoost was generally more accurate than RandomForest. RandomForest's accuracy was unstable after upsampling and downsampling on the same sample, while XGBoost showed stable overall accuracy.

Usage Pattern Analysis and Comparative Analysis among User Groups of Web Sites Using Process Mining Techniques (프로세스 마이닝을 이용한 웹 사이트의 이용 패턴 분석 및 그룹 간 비교 분석)

  • Kim, Seul-Gi;Jung, Jae-Yoon
    • The Journal of Bigdata, v.2 no.2, pp.105-114, 2017
  • Today, many services are provided through web sites. Analyzing the usage patterns of web site visitors is very important for optimizing the use and efficiency of those sites. In this study, we analyzed usage patterns and compared user groups using the web access logs provided by BPI Challenge 2016. This data comprises access logs of the web site in the IT system of the Dutch Employee Insurance Agency (UWV), including customer information and click data describing customers' behavior when using the agency's web site. We use process mining techniques to analyze customers' usage patterns and the characteristics of customer groups, with the ultimate aim of improving service quality for customers using web services.

A Study on Energy Management System of Sport Facilities using IoT and Bigdata (사물인터넷과 빅데이터를 이용한 스포츠 시설 에너지 관리시스템에 관한 연구)

  • Kwon, Yong-Kwang;Heo, Jun
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.20 no.3, pp.59-64, 2020
  • Under the Paris Climate Agreement, Korea submitted an ambitious goal of reducing greenhouse gas emissions by 37% relative to the business-as-usual (BAU) forecast by 2030. The smart grid, an intelligent power grid, was presented as one of the countermeasures. Applying the smart grid requires an EMS (Energy Management System) to be installed and operated in various fields, but adoption has been delayed by users' lack of awareness and the limited ROI of such systems. Therefore, various data analysis and control technologies have recently been proposed to increase the efficiency of installed EMSs. In this study, we present a measurement control algorithm that uses a SARIMA model to analyze and predict big data collected by IoT, in order to monitor and manage the energy consumption of public sports facilities.

Sales Volume Prediction Model for Temperature Change using Big Data Analysis (빅데이터 분석을 이용한 기온 변화에 대한 판매량 예측 모델)

  • Back, Seung-Hoon;Oh, Ji-Yeon;Lee, Ji-Su;Hong, Jun-Ki;Hong, Sung-Chan
    • The Journal of Bigdata, v.4 no.1, pp.29-38, 2019
  • In this paper, we propose a sales forecasting model that predicts the sales volume of short sleeves and outerwear as a function of temperature change, using big data accumulated by online shopping mall 'A' over the past five years, with the goal of increasing sales volume and managing inventory efficiently. The proposed model predicts 2018 sales of short sleeves and outerwear according to temperature changes by analyzing their sales volume from 2014 to 2017. Comparing the model's 2018 forecasts with actual sales volume, we found error rates of ±1.5% for short sleeves and ±8% for outerwear.

Business Innovation Through Spatial Data Analysis: A Multi-Case Analysis (공간 데이터 분석 기반의 비즈니스의 혁신: 해외 사례 분석을 중심으로)

  • Ham, YuKun
    • The Journal of Bigdata, v.4 no.1, pp.83-97, 2019
  • With the development of sensor and communication technology, spatial data related to business activities is growing explosively. Spatial data is now evolving away from two-dimensional geographic data into atypical data about space over three dimensions. Together with the Fourth Industrial Revolution, which connects virtual space with real space, this presents a great opportunity for companies. Analysis of recent overseas cases shows that analyzing spatial data makes it possible to analyze customized services by understanding the situation of customers and objects located in the space, to manage risk, and furthermore to innovate business processes. In the future, business innovation that combines spatial data from various sources with real-time analysis of the relationships and situations between people and objects in space is expected to expand to all business fields.

Trends in Social Media Participation and Change in Issues with Meta Analysis Using Network Analysis and Clustering Technique (소셜 미디어 참여에 관한 연구 동향과 쟁점의 변화: 네트워크 분석과 클러스터링 기법을 활용한 메타 분석을 중심으로)

  • Shin, Hyun-Bo;Seon, Hyung-Ju;Lee, Zoon-Ky
    • The Journal of Bigdata, v.4 no.1, pp.99-118, 2019
  • This study used network analysis and clustering techniques to analyze studies on social media participation. Main path analysis extracted 37 major studies, which were divided into two networks: community-related and new media-related. Network analysis and clustering yielded four clusters. The academic significance of this study lies in using academic data to grasp research trends at a macro level and in using network analysis and machine learning as its methodology.

A Study on the Effect of Mobile Cloud Computing Services Characteristics on the Intellectual Convergence and the Performance Expectancy in Construction Project: From the Perspective of the Social Capital (건설프로젝트에서 Mobile-Cloud Computing Service 특성이 정보융합과 기대성과에 미치는 영향에 관한 연구: 사회적 자본의 관점에서)

  • Kim, Youngwoo;Oh, Jay In
    • The Journal of Bigdata, v.4 no.1, pp.129-142, 2019
  • Construction projects have experienced many failures due to incomplete production environments. Thus, the purpose of this study is to use ICT resources leased during the construction period at the construction site and to introduce the Mobile Cloud Computing Service, which utilizes Cloud Computing Service and mobile devices such as smartphones, tablet PCs, and notebooks instead of physically wired communication networks. The characteristics of Mobile Cloud, such as rapid accuracy, shared collaboration, and ubiquity, are expected to affect the social network among the various construction site participants. We conducted empirical research on introducing Mobile Cloud to promote information exchange, convergence, and mutual trust among the participants, ultimately improving project performance.

The Analysis of HPAI Using CDR Data (CDR 자료를 이용한 고병원성 조류인플루엔자 분석)

  • Choi, Dae-Woo;Joo, Jae-Yun;Song, Yu-Han;Han, Ye-Ji
    • The Journal of Bigdata, v.4 no.2, pp.13-22, 2019
  • This study was conducted in 2018 with government funding (Ministry of Agriculture, Food and Rural Affairs) and support from the Agricultural, Food, and Rural Affairs Agency (318069-03-HD040), and is based on artificial intelligence-based HPAI spread analysis and patterning. Highly pathogenic avian influenza (HPAI) enters the country through migratory birds from abroad, but the pathways by which the infection reaches farms are not known exactly. Transmission from an infected farm to other farms is only assumed to be caused mainly by vehicles, and the main cause of the spread is likewise not exactly known. Based on call detail record (CDR) data provided by KT, this study examines how people who visit migratory bird-watching sites, presumed to be the site of the outbreak, flow through infected farms.

Research on Financial Regulations Related RPA(Robotic Process Automation) (금융회사 RPA(로봇자동화) 관련 규제 연구)

  • Han, Taek-Ryong;Lee, Kyung-ho
    • The Journal of Bigdata, v.4 no.2, pp.47-59, 2019
  • RPA (Robotic Process Automation) solutions, which have recently been spreading in Korea and overseas, allow users to easily automate their tasks through an application's GUI (Graphical User Interface), and the number of Korean financial companies that have implemented RPA to automate their business is increasing. However, because the major supervisory regulations that financial institutions must comply with are based on the traditional SDLC (Software Development Life Cycle), they are not suited to being applied directly to RPA, which automates end-user work at the level of the user's system interface. In this paper, we therefore organized the important financial supervisory rules and control items that should be considered for RPA implementation, then surveyed 24 financial companies that have implemented RPA to check how they applied them. Finally, we argue that the related compliance rules need to be revised.

Predicting and Interpreting Quality of CMP Process for Semiconductor Wafers Using Machine Learning (머신러닝을 이용한 반도체 웨이퍼 평탄화 공정품질 예측 및 해석 모형 개발)

  • Ahn, Jeong-Eon;Jung, Jae-Yoon
    • The Journal of Bigdata, v.4 no.2, pp.61-71, 2019
  • The Chemical Mechanical Planarization (CMP) process, which planarizes a semiconductor wafer's surface by polishing, is difficult to manage reliably because it involves various chemicals and physical machinery. In the CMP process, Material Removal Rate (MRR) is often used as a quality indicator, and predicting MRR is important for managing the process stably. In this study, we introduce prediction models that use machine learning techniques to analyze time-series sensor data collected in the CMP process, along with classification models used to interpret process quality conditions. In addition, by analyzing the classification results, we identify meaningful variables affecting process quality and explain the conditions on process variables needed to keep process quality high.