• Title/Summary/Keyword: Big Data Processing

Data Processing Method for Real-time Safety Supervision System in Railway (실시간 철도안전 관제를 위한 데이터 처리 방안 연구)

  • Shin, Kwang-Ho;Jung, Hye-Ran;Ahn, Jin
    • Journal of the Korean Society for Railway
    • /
    • v.19 no.4
    • /
    • pp.445-455
    • /
    • 2016
  • The goal of the real-time railway safety supervision system is to improve safety-oversight efficiency and prevent accidents by integrating the existing distributed monitoring systems for trains, signals, power, and facilities. The system therefore requires better real-time processing performance on big data. The disk-based databases used in existing railway control systems have problems with real-time processing; memory-based databases are limited in big-data capacity; and time-series databases are limited in real-time processing. A new database architecture is therefore needed for real-time processing of big data. In this study, we review existing railway monitoring systems and propose a new database architecture for a real-time railway safety supervision system.
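
A minimal sketch of the hybrid storage idea the abstract motivates: recent telemetry is kept in memory for real-time queries while older records are evicted toward a disk-backed store for big-data analysis. This is illustrative only; the class, field names, and eviction policy are invented and are not the paper's architecture.

```python
import time
from collections import deque

class HybridStore:
    """Hot in-memory window for real-time reads + cold store for batch analytics."""
    def __init__(self, hot_window_s=60.0):
        self.hot = deque()     # newest telemetry, served with low latency
        self.cold = []         # stand-in for a disk-based / time-series store
        self.hot_window_s = hot_window_s

    def ingest(self, record):
        record = dict(record, ts=time.time())
        self.hot.append(record)
        self._evict()

    def _evict(self):
        # Move records older than the hot window to the cold store
        cutoff = time.time() - self.hot_window_s
        while self.hot and self.hot[0]["ts"] < cutoff:
            self.cold.append(self.hot.popleft())  # a batched flush in practice

    def latest(self, n=10):
        """Real-time view, answered from memory only."""
        return list(self.hot)[-n:]

store = HybridStore()
store.ingest({"train": "KTX-101", "speed_kmh": 212})
print(store.latest())
```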

Small Sample Face Recognition Algorithm Based on Novel Siamese Network

  • Zhang, Jianming;Jin, Xiaokang;Liu, Yukai;Sangaiah, Arun Kumar;Wang, Jin
    • Journal of Information Processing Systems
    • /
    • v.14 no.6
    • /
    • pp.1464-1479
    • /
    • 2018
  • In face recognition, the number of available training samples for a single category is sometimes insufficient, so the performance of models trained with convolutional neural networks is not ideal. This paper proposes a small-sample face recognition algorithm based on a novel Siamese network that does not require abundant training samples. The algorithm designs and realizes a new Siamese network model, SiameseFace1, which takes pairs of face images as inputs and maps them to a target space so that the $L_2$ distance in the target space represents the semantic distance in the input space. The mapping is represented by a neural network trained with supervised learning. Moreover, a more lightweight Siamese network model, SiameseFace2, is designed to reduce the network parameters without losing accuracy. We also present a new method to generate training data and expand the number of training samples for a single category in the AR and Labeled Faces in the Wild (LFW) datasets, which improves the recognition accuracy of the models. Four loss functions are used in experiments on the AR and LFW datasets. The results show that the contrastive loss function combined with the new Siamese network model effectively improves the accuracy of face recognition.
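
A minimal sketch of the contrastive loss the abstract refers to, assuming PyTorch; the margin, embedding size, and batch values are illustrative, and this is the generic formulation rather than the paper's exact training code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveLoss(nn.Module):
    """Pulls same-identity pairs together and pushes different-identity
    pairs at least `margin` apart in the target (embedding) space."""
    def __init__(self, margin=1.0):
        super().__init__()
        self.margin = margin

    def forward(self, emb1, emb2, label):
        # label = 1 for same-identity pairs, 0 for different identities
        d = F.pairwise_distance(emb1, emb2)                       # L2 distance
        loss_sim = label * d.pow(2)                               # shrink distance
        loss_dis = (1 - label) * F.relu(self.margin - d).pow(2)   # enforce margin
        return (loss_sim + loss_dis).mean()

emb1, emb2 = torch.randn(8, 128), torch.randn(8, 128)  # outputs of the twin networks
label = torch.randint(0, 2, (8,)).float()
print(ContrastiveLoss(margin=1.0)(emb1, emb2, label))
```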

Blockchain and AI-based big data processing techniques for sustainable agricultural environments (지속가능한 농업 환경을 위한 블록체인과 AI 기반 빅 데이터 처리 기법)

  • Yoon-Su Jeong
    • Advanced Industrial Science
    • /
    • v.3 no.2
    • /
    • pp.17-22
    • /
    • 2024
  • As ICT has come to be used in diverse environments, it has become possible in sustainable agriculture to analyze pests by crop, to use robots for harvesting, and to make predictions from big data. At the same time, sustainable agriculture faces constant demands to address resource depletion, a declining farming population, rising poverty, and environmental degradation. This paper proposes an AI-based big-data processing and analysis method that reduces production costs and increases crop efficiency in a sustainable agricultural environment. The proposed technique strengthens the security and reliability of data by processing crop big data in combination with AI, enabling better decision-making and the extraction of business value. It can drive innovative change across industries and promote the development of data-oriented business models. In experiments, the proposed technique produced accurate answers from only a small amount of labeled data; at farm sites where tagging each correct answer is difficult, it achieved performance similar to training with a large amount of labeled data (error rate within 0.05).
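
The abstract reports near-supervised accuracy from a small amount of labeled data. One common way to achieve this is self-training (pseudo-labeling); the sketch below, assuming scikit-learn and synthetic placeholder features, shows that pattern, though the abstract does not specify the paper's exact method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                 # synthetic crop/sensor features
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic pest / no-pest label
y = y_true.copy()
y[50:] = -1                                    # only 50 labeled samples; -1 = unlabeled

# Confident predictions on unlabeled rows are promoted to pseudo-labels
model = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
model.fit(X, y)
print("accuracy on all data:", model.score(X, y_true))
```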

RMSE Comparison of SVD Algorithms for Tax Accountant Recommendation Service (세무사 추천 서비스를 위한 SVD 알고리즘의 RMSE 비교)

  • Won-Jib Kim;Ji-Hye Huh;Se-Bean Park;Su-Min Lee;Eu-Na Kwon
    • Annual Conference of KIPS
    • /
    • 2023.11a
    • /
    • pp.963-964
    • /
    • 2023
  • In a recommender system, it is important to capture user preferences accurately. To this end, collaborative filtering algorithms that analyze user data to provide recommendations are used. However, as the number of products and customers grows, the accuracy of estimated user preferences drops. Model-based collaborative filtering was proposed to solve this problem: instead of using customer and user information directly for recommendation, it is used to train a model. In this paper, we tune the hyperparameters of an SVD model, a model-based collaborative filtering method frequently used in recommender systems, before training, and measure RMSE, the model's estimation-accuracy metric.
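
A minimal sketch of the experiment's shape, assuming the scikit-surprise library: grid-search SVD hyperparameters before training and report the best cross-validated RMSE. The built-in MovieLens ratings stand in for the tax-accountant rating data, which is not public.

```python
from surprise import SVD, Dataset
from surprise.model_selection import GridSearchCV

data = Dataset.load_builtin("ml-100k")  # placeholder for tax-accountant ratings

# Hyperparameters tuned before training, as in the paper's setup
param_grid = {"n_factors": [50, 100],
              "lr_all": [0.002, 0.005],
              "reg_all": [0.02, 0.1]}
gs = GridSearchCV(SVD, param_grid, measures=["rmse"], cv=3)
gs.fit(data)

print("best RMSE:", gs.best_score["rmse"])
print("best params:", gs.best_params["rmse"])
```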

Implementation of a KoBERT-based Emotion Analysis Communication Platform for Parents of Children with Disabilities (장애아 부모를 위한 KoBERT 기반 감정분석 소통 플랫폼 구현)

  • Jae-Hyung Ha;Ji-Hye Huh;Won-Jib Kim;Jung-Hun Lee;Woo-Jung Park
    • Annual Conference of KIPS
    • /
    • 2023.11a
    • /
    • pp.1014-1015
    • /
    • 2023
  • Many parents of children with disabilities feel heavy psychological pressure from the stress of child-rearing and worries about the future. Compared with the number of people with disabilities, which increases every year, programs addressing the psychological and mental-health problems of these parents and their families are lacking [1]. To address this, we propose an emotion-analysis communication platform. The proposed platform fine-tunes a KoBERT model to analyze the emotions in users' diary entries, helping parents and family members of children with disabilities communicate. For the performance evaluation, we compare the KoBERT-based emotion analysis, the platform's core function, against LSTM, Bi-LSTM, and GRU models, which are widely used for text classification. In the evaluation, KoBERT's accuracy was on average 31.4% higher than that of the other classifiers, and it also recorded comparatively high scores on the other metrics.
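
A minimal sketch of the classification setup, assuming Hugging Face transformers; `klue/bert-base` is used here as a runnable stand-in for a KoBERT checkpoint, and the five-emotion label set and example sentence are illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "klue/bert-base"  # stand-in; swap in a KoBERT checkpoint in practice
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=5)  # e.g. joy / sadness / anger / anxiety / neutral

# One diary sentence; in the platform this head is fine-tuned on labeled diaries
batch = tokenizer(["오늘은 아이와 병원에 다녀와서 마음이 무거웠다."],
                  return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    probs = model(**batch).logits.softmax(-1)
print(probs)  # per-emotion probabilities (head untrained in this sketch)
```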

An Efficient Algorithm of Data Anonymity based on Anonymity Groups (익명 그룹 기반의 효율적인 데이터 익명화 알고리즘)

  • Kwon, Ho Yeol
    • Journal of Industrial Technology
    • /
    • v.36
    • /
    • pp.89-92
    • /
    • 2016
  • In this paper, we propose an efficient anonymization algorithm for personal information protection in big data systems. First, we briefly introduce the fundamental k-anonymity, l-diversity, and t-closeness algorithms. We then propose an anonymization algorithm that controls the size of anonymity groups and exchanges data tuples between groups. Finally, we demonstrate an example to which the proposed algorithm is applied. The proposed scheme provides a simple and efficient algorithm for processing large amounts of data.
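
A minimal sketch of the anonymity-group idea in Python: bucket records by generalized quasi-identifiers and coarsen the generalization until every group reaches size k. The data, field names, and generalization scheme are illustrative, and the paper's tuple-exchange step between groups is omitted.

```python
from collections import defaultdict

def k_anonymize(records, k, generalize):
    """records: list of dicts; generalize(record, level) -> group key."""
    level = 0
    while True:
        groups = defaultdict(list)
        for r in records:
            groups[generalize(r, level)].append(r)
        if all(len(g) >= k for g in groups.values()):
            return groups           # every anonymity group has >= k members
        level += 1                  # coarsen quasi-identifiers and regroup

# Example: generalize age into ever-wider bands (10-year, 20-year, ...)
records = [{"age": a, "zip": "245%02d" % (a % 7)} for a in range(20, 60)]
band = lambda lvl: 10 * (2 ** lvl)
groups = k_anonymize(records, k=5,
                     generalize=lambda r, lvl: r["age"] // band(lvl))
print({key: len(g) for key, g in groups.items()})
```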

Scalable Big Data Pipeline for Video Stream Analytics Over Commodity Hardware

  • Ayub, Umer;Ahsan, Syed M.;Qureshi, Shavez M.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.4
    • /
    • pp.1146-1165
    • /
    • 2022
  • A huge amount of data in the form of videos and images is being produced owing to advancements in sensor technology. Using low-performance commodity hardware with resource-heavy image-processing and analysis approaches to infer and extract actionable insights from this data creates a bottleneck for timely decision-making. Current GPU-assisted and cloud-based video-analysis techniques give significant performance gains, but their use is constrained by cost and highly complex architectural details. In this paper, we propose a data pipeline system that uses open-source tools such as Apache Spark, Kafka, and OpenCV running on commodity hardware for distributed video stream and image processing. Experimental results show that our approach eliminates the need for GPU-based hardware and cloud computing infrastructure while achieving efficient video stream processing for face detection with increased throughput, scalability, and better performance.
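
A minimal sketch of the pipeline's shape, assuming PySpark with the Kafka integration package and OpenCV: Structured Streaming consumes one JPEG frame per Kafka message and a Haar cascade counts faces per frame. The broker address, topic name, and one-frame-per-message encoding are assumptions.

```python
import cv2
import numpy as np
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.appName("video-face-count").getOrCreate()

def count_faces(jpeg_bytes):
    # Decode the frame and run OpenCV's pretrained frontal-face cascade
    img = cv2.imdecode(np.frombuffer(jpeg_bytes, np.uint8), cv2.IMREAD_GRAYSCALE)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    return len(cascade.detectMultiScale(img, 1.1, 5))

faces = udf(count_faces, IntegerType())

frames = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
          .option("subscribe", "video-frames")                  # assumed topic
          .load())

query = (frames.select(faces(frames.value).alias("n_faces"))
         .writeStream.format("console").start())
query.awaitTermination()
```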

Development of the Unified Database Design Methodology for Big Data Applications - based on MongoDB -

  • Lee, Junho;Joo, Kyungsoo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.3
    • /
    • pp.41-48
    • /
    • 2018
  • Big data has recently grown rapidly and is characterized by continuous generation, large volume, and unstructured formats. Existing relational database technologies are inadequate for such big data because of their limited processing speed and the significant cost of storage expansion. Currently implemented solutions are mainly based on relational databases, which are no longer suited to these data volumes. NoSQL solutions allow new approaches to data warehousing, especially from the multidimensional data-management point of view. In this paper, we develop and propose a unified design methodology based on MongoDB for big data applications. The proposed methodology is more scalable than existing methodologies, making big data easier to handle.
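
A minimal sketch of the document-oriented modeling such a methodology produces, assuming pymongo: a 1:N relation that a relational schema would normalize into two joined tables is embedded in a single MongoDB document. Database, collection, and field names are illustrative.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["warehouse"]

# One order document embeds its line items, so the whole aggregate
# is written and read as a unit instead of being joined at query time.
db.orders.insert_one({
    "order_id": 1001,
    "customer": {"name": "Kim", "region": "Seoul"},
    "items": [
        {"sku": "A-1", "qty": 2, "price": 9.5},
        {"sku": "B-7", "qty": 1, "price": 30.0},
    ],
})

print(db.orders.find_one({"order_id": 1001}))  # single read, no join
```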

Big data-based piping material analysis framework in offshore structure for contract design

  • Oh, Min-Jae;Roh, Myung-Il;Park, Sung-Woo;Chun, Do-Hyun;Myung, Sehyun
    • Ocean Systems Engineering
    • /
    • v.9 no.1
    • /
    • pp.79-95
    • /
    • 2019
  • The material analysis of an offshore structure is generally conducted in the contract design phase for the price quotation of a new offshore project. This analysis is performed manually by an engineer, which is time-consuming and can lead to inaccurate results, because the volume of data from previous projects is large and there are many materials to consider. In this study, the piping materials in an offshore structure are analyzed for contract design using a big data framework. The big data technologies used include HDFS (Hadoop Distributed File System) for storage, Hive and HBase as databases over the stored data, Spark and Kylin for data processing, and Zeppelin for the user interface and visualization. The results show that the proposed big data framework can reduce the effort required in contract design for estimating the piping material cost.
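
A minimal sketch of the processing step, assuming PySpark: aggregate piping-material records from past projects into per-grade, per-diameter totals that an estimator could use for a quotation. The HDFS path and column names are invented placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("piping-material").getOrCreate()

# Assumed layout: one row per piping item from previous offshore projects
pipes = spark.read.parquet("hdfs:///offshore/projects/piping/")

summary = (pipes
           .groupBy("project", "material_grade", "nominal_diameter")
           .agg(F.sum("length_m").alias("total_length_m"),
                F.sum("cost_usd").alias("total_cost_usd")))
summary.show()
```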

Transaction Processing Method for NoSQL Based Column

  • Kim, Jeong-Joon
    • Journal of Information Processing Systems
    • /
    • v.13 no.6
    • /
    • pp.1575-1584
    • /
    • 2017
  • As interest in big data has increased, NoSQL, a solution for storing and processing big data, is receiving attention. NoSQL offers high speed, high availability, and high scalability, but it is limited in areas where data integrity is important because it does not support multi-row transactions. To overcome this drawback, many studies have sought to support multi-row transactions in NoSQL, but existing approaches suffer from low transaction throughput and degraded performance. In this paper, we therefore design and implement a multi-row transaction system for data integrity in a big data environment based on HBase, a widely used column-oriented NoSQL database. The system performs multi-row transactions efficiently by adding columns that manage transaction information to every user table. It controls the execution, conflict handling, and recovery of multi-row transactions through a transaction manager, and it exchanges the information needed for multi-row transactions with HBase through a communication manager. Finally, a comparative performance evaluation against HAcid and Haeinsa verified the superiority of the proposed multi-row transaction system.
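
A minimal sketch of the core idea as the abstract describes it, extra columns carrying per-row transaction metadata, assuming the happybase HBase client. Table, column-family, and status names are illustrative, and the paper's transaction manager, conflict detection, and recovery logic are omitted.

```python
import uuid
import happybase

conn = happybase.Connection("localhost")
table = conn.table("user_table")  # assumes a 'txn' column family exists

def multi_row_put(rows):
    txid = uuid.uuid4().hex.encode()
    # Phase 1: write data plus per-row transaction metadata (status = pending)
    for key, data in rows.items():
        table.put(key, dict(data, **{b"txn:id": txid, b"txn:status": b"pending"}))
    # Phase 2: mark every row committed; a crash before this point leaves
    # 'pending' markers that a recovery pass could roll back by txid
    for key in rows:
        table.put(key, {b"txn:status": b"committed"})

multi_row_put({b"row1": {b"cf:balance": b"100"},
               b"row2": {b"cf:balance": b"200"}})
```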