• Title/Summary/Keyword: Big Data Processing

Open Platform for Improvement of e-Health Accessibility (의료정보서비스 접근성 향상을 위한 개방형 플랫폼 구축방안)

  • Lee, Hyun-Jik;Kim, Yoon-Ho
    • Journal of Digital Contents Society / v.18 no.7 / pp.1341-1346 / 2017
  • In this paper, we design an open service platform that integrates individually customized services with intelligent information technology to handle individuals' complex attributes and requests. First, the data collection phase repeats extraction, transformation, and loading quickly and accurately. The data generated by the extraction-transformation-loading module are stored in a distributed data system. The data analysis phase generates a variety of patterns using field-specific analysis algorithms. The data processing phase uses distributed parallel processing to improve performance. Finally, data provisioning operates independently of each device-specific management platform and is exposed as an Open API.
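
A minimal Python sketch of the repeated extraction-transformation-loading cycle described above; extract_records, transform, and the store object are hypothetical placeholders, not the paper's actual modules:

```python
def extract_records(source):
    """Pull raw records from one data source (hypothetical fetch interface)."""
    return source.fetch()

def transform(record):
    """Normalize a raw record into an assumed common schema."""
    return {
        "device_id": record["id"],
        "metric": record["name"].lower(),
        "value": float(record["value"]),
    }

def etl_cycle(sources, store):
    """One extraction-transformation-loading pass over all sources;
    store.put() stands in for writing to the distributed data system."""
    for source in sources:
        for raw in extract_records(source):
            store.put(transform(raw))
```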

Big Data Processing and Performance Improvement for Ship Trajectory using MapReduce Technique

  • Kim, Kwang-Il;Kim, Joo-Sung
    • Journal of the Korea Society of Computer and Information / v.24 no.10 / pp.65-70 / 2019
  • Recently, ship trajectory data consisting of ship position, speed, course, and so on have become available from the Automatic Identification System device with which all ships must be equipped. More than 2 GB of these data are gathered every day at a crowded sea port and are used to analyze ship traffic statistics and patterns. In this study, we propose a method to process ship trajectory data efficiently on distributed computing resources using the MapReduce algorithm. In the data preprocessing phase, ship dynamic and static data are integrated into the target dataset, and trajectories that are not of interest are filtered out. In the mapping phase, each ship position is converted to a Geohash code, and the Geohash and the ship's MMSI are emitted as key and value, respectively. In the reducing phase, key-value pairs are grouped by key and the ship traffic count in each grid cell is computed. To evaluate the proposed method, we implemented it and compared its performance with the IALA Waterway Risk Assessment Program (IWRAP). The data processing performance improves by 1 to 4 times over the existing ship trajectory analysis program.
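
The mapping and reducing phases can be sketched in Python as below; grid_key is a simplified stand-in for a real Geohash encoder, and the sample MMSIs and coordinates are made up:

```python
from collections import defaultdict

def grid_key(lat, lon, precision=2):
    """Round a position into a coarse grid cell (Geohash stand-in)."""
    return (round(lat, precision), round(lon, precision))

def map_phase(trajectories):
    """Emit (cell, MMSI) pairs for every reported ship position."""
    for mmsi, lat, lon in trajectories:
        yield grid_key(lat, lon), mmsi

def reduce_phase(pairs):
    """Group pairs by cell key and count distinct ships per grid cell."""
    cells = defaultdict(set)
    for cell, mmsi in pairs:
        cells[cell].add(mmsi)
    return {cell: len(ships) for cell, ships in cells.items()}

traffic = reduce_phase(map_phase([(440123456, 35.102, 129.041),
                                  (440654321, 35.104, 129.039)]))
print(traffic)
```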

A Multilayer Perceptron-Based Electric Load Forecasting Scheme via Effective Recovering Missing Data (효과적인 결측치 보완을 통한 다층 퍼셉트론 기반의 전력수요 예측 기법)

  • Moon, Jihoon;Park, Sungwoo;Hwang, Eenjun
    • KIPS Transactions on Software and Data Engineering / v.8 no.2 / pp.67-78 / 2019
  • Accurate electric load forecasting is very important for the efficient operation of the smart grid. Recently, owing to the development of IT, many accurate forecasting models have been constructed by applying artificial intelligence techniques to big data. These forecasting models usually use external factors such as temperature, humidity, and historical electric load as independent variables. However, owing to diverse internal and external factors, the historical electric load contains many missing values, which makes it very difficult to construct an accurate forecasting model. To solve this problem, in this paper we propose a random forest-based missing data recovery scheme and construct an electric load forecasting model based on a multilayer perceptron that uses the estimated values of the missing data together with the external factors. We demonstrate the performance of the proposed scheme through various experiments.
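
A hedged sketch of the two-stage idea using scikit-learn; the feature set, model sizes, and hyperparameters here are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

def recover_missing(X_ext, load):
    """Fill missing load values with a random forest trained on external
    factors (e.g., temperature, humidity) where the load is known."""
    missing = np.isnan(load)
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(X_ext[~missing], load[~missing])
    load = load.copy()
    load[missing] = rf.predict(X_ext[missing])
    return load

def fit_forecaster(X, y):
    """Train a multilayer perceptron on recovered load plus factors."""
    mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500)
    return mlp.fit(X, y)
```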

Development of Cloud based Data Collection and Analysis for Manufacturing (클라우드 기반의 생산설비 데이터 수집 및 분석 시스템 개발)

  • Young-Dong Lee
    • Journal of the Institute of Convergence Signal Processing / v.23 no.4 / pp.216-221 / 2022
  • The 4th industrial revolution is accelerating the transition to digital innovation in many aspects of daily life, and manufacturing innovation efforts, such as smart factories, continue throughout the manufacturing industry. Fourth-industrial-revolution technology in manufacturing builds on AI, big data, IoT, cloud computing, and robotics. On this basis, a production facility data collection and analysis system that evolves beyond existing automation is needed to find the causes of defects and minimize the defect rate. In this paper, we implemented a system that collects power, environment, and status data from production facility sites through IoT devices, quantifies the data in real time in a cloud computing environment, and displays them as MQTT-based real-time infographics using widgets. The real-time sensor data transmitted from the IoT devices are stored on the cloud server through a REST API. In addition, the administrator can remotely monitor the data on a dashboard and analyze them on hourly and daily bases.
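
The device-to-cloud path can be sketched as below, assuming the requests package and a hypothetical /api/sensors REST endpoint; the payload fields are illustrative, not the system's actual schema:

```python
import time
import requests

CLOUD_URL = "https://cloud.example.com/api/sensors"   # hypothetical endpoint

def push_reading(device_id, power_w, temp_c, status):
    """Send one sensor reading to the cloud store via the REST API."""
    payload = {
        "device": device_id,
        "power_w": power_w,
        "temp_c": temp_c,
        "status": status,
        "ts": time.time(),           # timestamp for hourly/daily analysis
    }
    resp = requests.post(CLOUD_URL, json=payload, timeout=5)
    resp.raise_for_status()          # fail loudly if the store rejects it
```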

A Data-Consistency Scheme for the Distributed-Cache Storage of the Memcached System

  • Liao, Jianwei;Peng, Xiaoning
    • Journal of Computing Science and Engineering / v.11 no.3 / pp.92-99 / 2017
  • Memcached, commonly used to speed up data access in big-data and Internet-web applications, is system software implementing a distributed-cache mechanism. However, it faces the severe challenge of losing recently uncommitted updates when Memcached servers crash. Although a replica scheme and a disk-log-based replay mechanism have been proposed to overcome this problem, they incur either replica-synchronization overhead or the persistent-storage overhead of flushing the related logs. This paper proposes a scheme that backs up write requests (i.e., set and add) on the Memcached client side, to reduce the overhead of writing disk-log records or maintaining replica consistency. If a Memcached server fails, a timestamp-based recovery mechanism replays the write requests buffered by the relevant clients to regain the lost updates on the rebooted Memcached server, thereby meeting the data-consistency requirement. More importantly, compared with logging write requests to the persistent storage of the master server and with the server-replication scheme, the proposed approach of backing up the logs on the client side decreases the time overhead by up to 116.8% when processing write workloads.
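
A minimal sketch of the client-side backup idea: buffer write requests with timestamps so they can be replayed after a server reboot. The wrapped client is any object exposing set() (pymemcache's Client would fit); the actual recovery protocol is simplified here:

```python
import time

class BufferedWrites:
    """Client-side write log for replay-based recovery (sketch only)."""

    def __init__(self, client):
        self.client = client
        self.log = []                      # (timestamp, key, value) entries

    def set(self, key, value):
        """Record the write locally, then forward it to Memcached."""
        self.log.append((time.time(), key, value))
        self.client.set(key, value)

    def replay_since(self, crash_time):
        """Re-apply buffered writes newer than the last committed state."""
        for ts, key, value in self.log:
            if ts >= crash_time:
                self.client.set(key, value)
```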

Development of Load Profile Monitoring System Based on Cloud Computing in Automotive (클라우드 컴퓨팅 기반의 자동차 부하정보 모니터링 시스템 개발)

  • Cho, Hwee;Kim, Ki-Tae;Jang, Yun-Hee;Kim, Seung-Hwan;Kim, Jun-Su;Park, Keoun-Young;Jang, Joong-Soon;Kim, Jong-Man
    • Journal of Korean Society for Quality Management / v.43 no.4 / pp.573-588 / 2015
  • Purpose: To improve the estimation of remaining useful life in Prognostics and Health Management (PHM), a system that can take large amounts of environmental and load data into account is required. Method: A load profile monitoring system based on cloud computing was presented for gathering and processing raw data that include environmental and load data. Result: Users can access load profile information on the Internet. The developed system provides information such as the distribution of the load data and basic statistics. Conclusion: We developed the load profile monitoring system to account for large amounts of environmental and load data. The system offers advantages such as improved accessibility through smart devices, reduced cost, and coverage of various conditions.
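
As a rough illustration of the "distribution of load data, basic statistics" output, the following sketch summarizes collected load samples; the field names and binning are assumptions, not the system's actual schema:

```python
from collections import Counter
from statistics import mean, stdev

def load_profile(samples, bin_width=10.0):
    """Summarize load samples into basic statistics and a histogram."""
    bins = Counter(int(s // bin_width) * bin_width for s in samples)
    return {
        "count": len(samples),
        "mean": mean(samples),
        "stdev": stdev(samples) if len(samples) > 1 else 0.0,
        "max": max(samples),
        "histogram": dict(sorted(bins.items())),  # load distribution
    }

print(load_profile([12.4, 18.9, 23.1, 41.7, 44.2]))
```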

Recent R&D Trends for 3D Deep Learning (3D 딥러닝 기술 동향)

  • Lee, S.W.;Hwang, B.W.;Lim, S.J.;Yoon, S.U.;Kim, T.J.;Choi, J.S.;Park, C.J.
    • Electronics and Telecommunications Trends / v.33 no.5 / pp.103-110 / 2018
  • Studies on artificial intelligence have been conducted for the past several decades. After a few periods of prosperity and recession, a new machine learning method, so-called deep learning, was introduced. This is the result of high-quality big data, an increase in computing power, and the development of new algorithms. The main targets of deep learning have been 1D audio and 2D images, and its application domain is being extended from discriminative models, such as classification and segmentation, to generative models. Currently, deep learning is also used for processing 3D data. However, unlike the 2D case, 3D learning data are not easy to acquire. Although low-cost 3D data acquisition sensors have become more popular owing to advances in 3D vision technology, the generation and acquisition of 3D data remain very difficult. Moreover, existing network models, such as convolutional networks, cannot be applied directly owing to the variety of 3D data representations. In this paper, we summarize the 3D deep learning technologies that have started to be developed within the last two years.
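
A minimal PyTorch sketch of the representation issue mentioned above: a 3D convolution consumes a dense voxel grid, which is one (memory-hungry) way to make irregular 3D data fit a convolutional model:

```python
import torch
import torch.nn as nn

voxels = torch.zeros(1, 1, 32, 32, 32)     # batch, channel, D, H, W
voxels[0, 0, 10:20, 10:20, 10:20] = 1.0    # a toy occupied region

# A single 3D convolution layer over the voxel grid.
conv = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
features = conv(voxels)                    # -> shape (1, 8, 32, 32, 32)
print(features.shape)
```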

An Effective Reduction of Association Rules using a T-Algorithm (T-알고리즘을 이용한 연관규칙의 효과적인 감축)

  • Park, Jin-Hee;Chung, Hwan-Mook
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.2 / pp.285-290 / 2009
  • Association rule mining has been studied to find hidden patterns in data mining. Fast processing has become a major issue because a huge number of transaction records must be handled, and the time required by association rule mining methods increases geometrically with the number of items in the data. Accordingly, a process that reduces the number of rules is needed. We propose the T-algorithm, an efficient rule reduction algorithm. The T-algorithm effectively reduces the number of association rules because it compares transaction data items in a binary format and improves the support and confidence between items. The performance of the proposed T-algorithm is evaluated through simulation.
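
The support/confidence filtering that underlies such rule reduction can be sketched as follows; this shows generic thresholding on 2-item rules over set-encoded transactions, not the T-algorithm's specific reduction steps:

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def strong_rules(transactions, min_sup=0.3, min_conf=0.6):
    """Enumerate 2-item rules a -> b that pass both thresholds."""
    items = set().union(*transactions)
    rules = []
    for a, b in combinations(sorted(items), 2):
        sup = support({a, b}, transactions)
        base = support({a}, transactions)
        if base and sup >= min_sup and sup / base >= min_conf:
            rules.append((a, b, sup, sup / base))
    return rules

tx = [{"milk", "bread"}, {"milk", "eggs"}, {"bread", "milk"}]
print(strong_rules(tx))
```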

Efficient Keyword Extraction from Social Big Data Based on Cohesion Scoring

  • Kim, Hyeon Gyu
    • Journal of the Korea Society of Computer and Information / v.25 no.10 / pp.87-94 / 2020
  • Social reviews such as SNS feeds and blog articles have been widely used to extract keywords reflecting opinions and complaints from users' perspectives, and they often include proper nouns or new words reflecting recent trends. In general, these words are not included in a dictionary, so conventional morphological analyzers may fail to detect and extract them from the reviews properly. In addition, the analyzers' high processing time makes it difficult to provide analysis results in a timely manner. This paper presents a method for efficient keyword extraction from social reviews based on the notion of cohesion scoring. Cohesion scores can be calculated from word frequencies alone, so keyword extraction can be performed without a dictionary. On the other hand, their accuracy can degrade when the input data have poor spacing. To address this, we present an algorithm that improves the existing cohesion scoring mechanism using the structure of a word tree. Our experimental results show that the proposed method took only 0.008 seconds to extract keywords from 1,000 reviews, with an error ratio of 15.5%, which is better than existing morphological analyzers.
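
Dictionary-free cohesion scoring can be sketched with the common geometric-mean formulation below; the paper's word-tree refinement is not reproduced here, and the toy corpus is made up:

```python
from collections import Counter

def prefix_counts(corpus, max_len=8):
    """Count every word-initial substring in a space-separated corpus."""
    freq = Counter()
    for token in corpus.split():
        for i in range(1, min(len(token), max_len) + 1):
            freq[token[:i]] += 1
    return freq

def cohesion(word, prefix_freq):
    """Score how strongly characters stick together: the geometric mean
    of the word's frequency relative to its first character."""
    if len(word) < 2 or prefix_freq[word[0]] == 0:
        return 0.0
    ratio = prefix_freq[word] / prefix_freq[word[0]]
    return ratio ** (1 / (len(word) - 1))

freq = prefix_counts("bigdata bigdata bigdeal big data")
print(cohesion("bigdata", freq))   # high: 'bigdata' recurs intact
```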

Design and Implementation of Big Data Analytics Framework for Disaster Risk Assessment (빅데이터 기반 재난 재해 위험도 분석 프레임워크 설계 및 구현)

  • Chai, Su-seong;Jang, Sun Yeon;Suh, Dongjun
    • Journal of Digital Contents Society / v.19 no.4 / pp.771-777 / 2018
  • This study proposes a big data based risk analysis framework for more comprehensive analysis of disaster risk and vulnerability. We introduce a distributed and parallel framework that allows large volumes of data to be processed in a short time using an open-source disaster risk assessment tool. A performance analysis shows that the proposed system achieves faster processing than the existing system, enabling prompt response, precise prediction, and guidance for disaster countermeasures. The proposed system supports accurate risk prediction and the mitigation of severe damage, and will therefore be crucial in helping decision makers and experts prepare for emergency or disaster situations and minimize large-scale regional damage.
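
A single-machine sketch of the distributed/parallel idea: partition the records and score them concurrently. The risk_score formula is a hypothetical stand-in for the open-source assessment tool's model, not the paper's method:

```python
from multiprocessing import Pool

def risk_score(region):
    """Toy vulnerability score; the real assessment model goes here."""
    return region["exposure"] * region["hazard"] / (1 + region["capacity"])

def assess(regions, workers=4):
    """Score region records in parallel across worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(risk_score, regions)

if __name__ == "__main__":
    regions = [{"exposure": 0.8, "hazard": 0.5, "capacity": 0.3},
               {"exposure": 0.4, "hazard": 0.9, "capacity": 0.6}]
    print(assess(regions))
```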