The Method for Extracting Meaningful Patterns Over the Time of Multi Blocks Stream Data (시간의 흐름과 위치 변화에 따른 멀티 블록 스트림 데이터의 의미 있는 패턴 추출 방법)

  • Cho, Kyeong-Rae;Kim, Ki-Young
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.3 no.10
    • /
    • pp.377-382
    • /
    • 2014
  • Time-series analysis techniques for data from mobile and IoT environments are mainly used to extract patterns from collected data and find meaningful information. However, existing analytical methods assume that data collection is complete before analysis begins, which makes it difficult to reflect changes in time-series data as time passes. In this paper, we introduce an analysis method for multi-block stream data (AM-MBSD: Analysis Method for Multi-Block Stream Data), aimed at data streams with properties such as pattern variability, large volume, and continuity. Multi-block stream data is defined as a plurality of continuously generated data blocks, and the proposed method is applied to each block to extract meaningful patterns. The extracted patterns are collected together with their generation time and frequency, and errors are taken into consideration. The method is examined through analysis experiments using time series data.

A Design of File Leakage Response System through Event Detection (이벤트 감지를 통한 파일 유출 대응 시스템 설계)

  • Shin, Seung-Soo
    • Journal of Industrial Convergence
    • /
    • v.20 no.7
    • /
    • pp.65-71
    • /
    • 2022
  • With the development of ICT and the arrival of the 4th industrial revolution, the amount of data has become enormous, and as big data technologies emerge, technologies for storing and processing data are becoming important. In this paper, we propose a system that detects events through monitoring and judges them using hash values, because the leakage of important files from industries and public places causes serious national and property damage. As a research method, a selective event method is used: when a file leakage event occurs, the encryption operation is performed and the resulting hash value is compared with the hash value registered in advance to determine whether the file is important. Monitoring only specific events minimizes system load, and analyzing the signature before making the determination improves accuracy. Confidentiality is improved by comparing against hash values pre-registered in the database. Future research is needed on security solutions that prevent file leakage through networks and various other paths.
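
The hash comparison step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: SHA-256 is assumed as the hash function (the abstract does not name one), an in-memory set stands in for the database of pre-registered hash values, and the names `sha256_of`, `REGISTERED`, and `is_important` are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical registry: digests of important files, registered in advance.
# The paper stores these in a database; a set is used here for brevity.
REGISTERED = {sha256_of(b"confidential design document")}


def is_important(data: bytes, registry=REGISTERED) -> bool:
    """Hash the outgoing file content and check it against the registry."""
    return sha256_of(data) in registry
```

In such a sketch, `is_important` would be called only when a monitored event (for example, a file copy) fires, so that hashing cost is paid for specific events rather than for all file activity.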

In-memory Compression Scheme Based on Incremental Frequent Patterns for Graph Streams (그래프 스트림 처리를 위한 점진적 빈발 패턴 기반 인-메모리 압축 기법)

  • Lee, Hyeon-Byeong;Shin, Bo-Kyoung;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.1
    • /
    • pp.35-46
    • /
    • 2022
  • Recently, with the development of network technologies and the active use of IoT and social network service applications, a large amount of graph stream data is being generated. Existing compression techniques have focused on compression rate and runtime. In this paper, we propose an incremental frequent pattern based compression scheme for graph streams that applies graph mining to existing compression techniques and takes the stream graph environment into account. Because the proposed scheme keeps only the latest reference patterns, it increases storage utilization and improves query processing time. To show the superiority of the proposed scheme, various performance evaluations are carried out in terms of compression rate and processing time against the existing method. The proposed scheme is faster than the existing similar scheme when the amount of duplicated data is large.

Design of a High-Speed Data Packet Allocation Circuit for Network-on-Chip (NoC 용 고속 데이터 패킷 할당 회로 설계)

  • Kim, Jeonghyun;Lee, Jaesung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.459-461
    • /
    • 2022
  • One of the big differences between Network-on-Chip (NoC) and existing parallel processing systems based on an off-chip network is that data packet routing is performed using a centralized control scheme. In such an environment, the best-effort packet routing problem becomes a real-time assignment problem in which data packet arrival time and processing time are the cost. In this paper, the Hungarian algorithm, a representative algorithm for reducing the computational complexity of the linear assignment problem, is implemented in the form of a hardware accelerator. In logic synthesis using the TSMC 0.18 µm standard cell library, the circuit designed through case analysis of the cost distribution reduces area by about 16% and propagation delay by about 52%, compared to the circuit implementing the original operation sequence of the Hungarian algorithm.
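
To make the assignment problem in this abstract concrete, the sketch below finds a minimum-cost one-to-one assignment by exhaustive search. Note that this is deliberately not the Hungarian algorithm: it tries all n! permutations, which is exactly the cost the Hungarian algorithm avoids by running in polynomial time. The cost matrix and the function name `best_assignment` are hypothetical.

```python
from itertools import permutations


def best_assignment(cost):
    """Brute-force minimum-cost assignment for a square cost matrix.

    cost[i][j] is the cost of giving packet i to resource j.
    Returns (assignment, total), where assignment[i] is the column
    chosen for row i.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Hypothetical 3x3 cost matrix (e.g. arrival time plus processing time).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
```

For this matrix the search selects columns 1, 0, and 2 for rows 0, 1, and 2, with a total cost of 5.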

A Study on the Reliability Improvement of Blockchain-based Ship Inspection Service (블록체인 기반 선박검사 서비스의 신뢰성 향상에 관한 연구)

  • Chun-Won Jang;Young-Soo Kang;Seung-Min Lee;Jun-Mo Park
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.25 no.1
    • /
    • pp.15-20
    • /
    • 2024
  • In the field of ship inspection in South Korea, outdated workflow processes leave open the possibility of tampering with inspection results. Accordingly, research is being conducted to prevent such tampering by introducing blockchain technology and cloud-based systems that allow real-time tracking and sharing of data, and to establish a transparent and efficient communication system. In this study, unit and integrated processes for overall data management and inspection execution were implemented to automatically collect, manage, and track the various inspection results that arise during the ship inspection process. Through this, the study aimed to increase the overall efficiency of the ship inspection process and induce growth in the ship inspection industry as a whole. The implemented web portal reached a level where trend analysis and comparative analysis with other ships based on inspection results are possible, and subsequent research aims to demonstrate the excellence of the system.

Analysis on the GPU Performance according to Hierarchical Memory Organization (계층적 메모리 구성에 따른 GPU 성능 분석)

  • Choi, Hongjun;Kim, Jongmyon;Kim, Cheolhong
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.3
    • /
    • pp.22-32
    • /
    • 2014
  • Recently, GPGPU has been widely used for general-purpose processing as well as graphics processing, because it provides hardware optimized for parallel processing. The memory system has a large effect on the performance of parallel processing units such as the GPU. In the GPU, a hierarchical memory architecture is implemented to provide high memory bandwidth. Moreover, both memory address coalescing and memory request merging techniques are widely used. This paper analyzes GPU performance under various memory organizations. According to our simulation results, GPU performance improves by 15.5%, 21.5%, 25.5%, and 30.9% when an 8KB, 16KB, 32KB, and 64KB L1 cache is added, respectively, compared to the case without an L1 cache. However, the experimental results also show that performance decreases for some benchmarks, because memory transactions increase due to data dependency. Moreover, when cache misses occur frequently, average memory access latency increases as the depth of the cache hierarchy increases.

Generating and Controlling an Interlinking Network of Technical Terms to Enhance Data Utilization (데이터 활용률 제고를 위한 기술 용어의 상호 네트워크 생성과 통제)

  • Jeong, Do-Heon
    • Journal of the Korean Society for Information Management
    • /
    • v.35 no.1
    • /
    • pp.157-182
    • /
    • 2018
  • As data management and processing techniques have developed rapidly in the era of big data, many companies and researchers have become interested in long tail data that were ignored in the past. This study proposes methods for generating and controlling a network of technical terms, based on text mining techniques, to enhance the utilization of data in the long tail of the distribution. In particular, the edit distance technique from text mining provides an efficient way to automatically create an interlinking network of technical terms in the scholarly field. We also used a linked open data (LOD) system to gather experimental data, and proposed effective methods for using LOD data together with an algorithm for recognizing patterns of terms. Finally, a performance evaluation of the network of technical terms showed that the proposed methods were useful for enhancing the rate of data utilization.
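
As a rough illustration of how edit distance can link near-identical technical terms, the sketch below computes the Levenshtein distance and pairs up terms within a threshold. The threshold, the sample terms, and the function names `edit_distance` and `link_terms` are assumptions for illustration; the abstract does not specify how the paper applies the distance.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def link_terms(terms, max_dist=1):
    """Return pairs of terms whose edit distance is at most max_dist."""
    return [(s, t) for s, t in combinations(terms, 2)
            if edit_distance(s, t) <= max_dist]
```

With the hypothetical input `["datamining", "data mining", "ontology"]` and `max_dist=1`, only the first two terms are linked, since they differ by a single inserted space.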

Dynamic Load Shedding Scheme based on Input Rate of Spatial Data Stream and Data Density (공간 데이터스트림의 입력 빈도와 데이터 밀집도 기반의 동적 부하제한 기법)

  • Jeong, Weonil
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.3
    • /
    • pp.2158-2164
    • /
    • 2015
  • In u-GIS environments, various load shedding techniques have been studied to balance the loads caused by input spatial data streams. However, typical load shedding methods for aspatial data do not consider the characteristics of spatial data, and previous load shedding approaches for spatial data still do not consider spatial data density or the dynamic input data stream, which causes problems in the performance and accuracy of spatial query processing. Therefore, a dynamic load shedding scheme over spatial data streams is proposed, based on the deviation of stored spatial data and the load ratio of the input data stream, to improve the accuracy and performance of spatial continuous queries in u-GIS environments. In the proposed scheme, input data with a high probability of being related to a spatial continuous query have a relatively strong chance of being dropped.

A Study on the Prediction of Strawberry Production in Machine Learning Infrastructure (머신러닝 기반 시설재배 딸기 생산량 예측 연구)

  • Oh, HanByeol;Lim, JongHyun;Yang, SeungWeon;Cho, YongYun;Shin, ChangSun
    • Smart Media Journal
    • /
    • v.11 no.5
    • /
    • pp.9-16
    • /
    • 2022
  • Recently, agricultural sites are being automated into digital smart farms by applying technologies such as big data and the Internet of Things (IoT). These smart farms aim to increase production and improve crop quality by measuring the crop environment and by collecting and processing data. Production prediction is an important research topic in smart farm digital agriculture, a form of high-tech agriculture, and it requires analysis of environmental data using big data as well as further standardization research to manage the quality of growth information data. In this paper, environmental and production data collected from smart farm strawberry farms were analyzed. Based on regression analysis, crop production prediction models were built and compared using Ridge Regression, LightGBM, and XGBoost. Among the three models, XGBoost was the best, with an R² of 82.5 percent explanatory power. The study confirmed a correlation between the amount of nutrient solution absorbed and the environmental data, and obtained significant results for production prediction. In the future, studying nutrient solution absorption together with information such as the crop growing environment and the ingredients of the nutrient solution is expected to contribute to preventing environmental pollution and reducing nutrient solution use through better nutrient solution management.
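
The R² figure quoted in this abstract is the coefficient of determination, R² = 1 − SS_res / SS_tot. A minimal sketch of that calculation is shown below; the function name `r_squared` and any data passed to it are hypothetical and unrelated to the paper's dataset.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

A model that predicts every value exactly scores 1, and a model that always predicts the mean of the observations scores 0, so a value of 0.825 means the model accounts for 82.5 percent of the variance in the observed production.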

Spatial Computation on Spark Using GPGPU (GPGPU를 활용한 스파크 기반 공간 연산)

  • Son, Chanseung;Kim, Daehee;Park, Neungsoo
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.5 no.8
    • /
    • pp.181-188
    • /
    • 2016
  • Recently, as the amount of spatial information increases, interest in spatial information processing has grown. Spatial database systems extended from traditional relational database systems have difficulty handling large data sets because of limited scalability. SpatialHadoop, extended from the Hadoop system, has low performance because its spatial computations require many writes of intermediate results to disk. In this paper, Spatial Computation Spark (SC-Spark), an in-memory distributed processing framework, is proposed. SC-Spark is extended from Spark in order to perform spatial operations efficiently on large-scale data. In addition, a GPGPU-based SC-Spark is developed to further improve performance. SC-Spark takes advantage of Spark holding intermediate results in memory, and the GPGPU-based SC-Spark can perform spatial operations in parallel using the many processing elements of a GPU. To verify the proposed work, experiments were performed on a single AMD system using SC-Spark and GPGPU-based SC-Spark for Point-in-Polygon and spatial join operations. The experimental results showed that SC-Spark and GPGPU-based SC-Spark were up to 8 times faster than SpatialHadoop.
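
For readers unfamiliar with the Point-in-Polygon operation used in these experiments, the sketch below shows one common way to compute it, the ray casting (even-odd) test. It is a plain sequential illustration and makes no claim about how SC-Spark or its GPGPU version implements the operation; the function name and sample polygon are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting test: count crossings of a ray going right from (x, y).

    polygon is a list of (x, y) vertices in order. Points exactly on an
    edge are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical square with corners at (0, 0) and (4, 4).
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

Because each point is tested independently of the others, the operation parallelizes naturally across many points, which is the property that makes it a good fit for GPU processing elements.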