• Title/Summary/Keyword: Big data Processing

Developing an integrated System and Network performance monitoring environment for High-speed Big data transfer on ScienceDMZ technology (ScienceDMZ 기반 초고속 빅데이터 전송을 위한 시스템과 네트워크 통합 성능 모니터링 환경 개발)

  • Kim, Dong-Hak;Moon, Jeong-Hoon;Lee, Sang-gwon;Park, Jong-sun;Kim, Byung-Seo
    • Annual Conference of KIPS / 2018.10a / pp.110-113 / 2018
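  • With the rapid advance of research in data-intensive science and the increasing sophistication of observation, experiment, and analysis equipment, scientific data is growing into high-value big data, and the shift of the research paradigm toward big data is accelerating. Such scientific big data reaches exabyte scale and, rather than being managed in one place, is distributed and operated across the world. Application researchers increasingly demand high-speed transfer, storage, and sharing of this scientific big data, and various ScienceDMZ-based high-speed transfer environments are being built and developed to meet that demand. In this paper, we therefore build an integrated monitoring environment covering the high-bandwidth network and system performance during long-distance big data transfers through the DTN (Data Transfer Node), the core technology of ScienceDMZ.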

Auto Configuration Module for Logstash in Elasticsearch Ecosystem

  • Ahmed, Hammad;Park, Yoosang;Choi, Jongsun;Choi, Jaeyoung
    • Annual Conference of KIPS / 2018.10a / pp.39-42 / 2018
  • Log analysis and monitoring are important in most systems. Log management is especially central to distributed applications, cloud-based applications, and applications designed for big data. These applications produce a large number of log files that contain essential information, which log analytics can use to identify relevant patterns in varying log data. However, tools are needed to parse, store, and visualize log information. The "Elasticsearch, Logstash, and Kibana" (ELK) Stack is one of the most popular tools for log management. Configuration files are key to ingesting log files, as they cover all the services needed to input, process, and output the log files. However, creating configuration files is often complicated and time-consuming, as it requires domain expertise and manual work. In this paper, an auto configuration module for Logstash is proposed that automatically generates Logstash configuration files. The primary purpose of this paper is to provide a mechanism that generates the configuration files for the corresponding log files in less time. The proposed module aims to improve the overall efficiency of the log management system.
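
As a rough illustration of what generating a Logstash configuration file can look like, the sketch below fills a pipeline template (file input, grok filter, Elasticsearch output) from a log path and a log type. It is not the module proposed in the paper: the function name `generate_logstash_config`, the `GROK_PATTERNS` lookup table, and the default host are assumptions made for this example.

```python
from string import Template

# Template for a minimal Logstash pipeline: file input, grok filter,
# Elasticsearch output.
PIPELINE_TEMPLATE = Template("""\
input {
  file {
    path => "$log_path"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "$grok_pattern" }
  }
}
output {
  elasticsearch {
    hosts => ["$es_host"]
    index => "$index_name"
  }
}
""")

# Hypothetical lookup table from a log type to a stock grok pattern.
GROK_PATTERNS = {
    "apache_access": "%{COMBINEDAPACHELOG}",
    "syslog": "%{SYSLOGLINE}",
}


def generate_logstash_config(log_path, log_type, es_host="localhost:9200"):
    """Return the text of a Logstash pipeline config for one log file."""
    if log_type not in GROK_PATTERNS:
        raise ValueError(f"no grok pattern registered for {log_type!r}")
    return PIPELINE_TEMPLATE.substitute(
        log_path=log_path,
        grok_pattern=GROK_PATTERNS[log_type],
        es_host=es_host,
        # Daily index, using Logstash's date-format syntax in the index name.
        index_name=f"{log_type}-%{{+YYYY.MM.dd}}",
    )


if __name__ == "__main__":
    print(generate_logstash_config("/var/log/apache2/access.log", "apache_access"))
```

A real generator would presumably have to infer the log type from sample lines rather than take it as an argument; that step is left out here.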

A Study on the Cooling Energy Saving System for Data Centers Using Multi-Machine Learning (다중 기계 학습을 활용한 데이터 센터의 냉방 에너지 절감 시스템에 관한 연구)

  • Jang, Hyun-Cheol
    • Annual Conference of KIPS / 2019.05a / pp.458-460 / 2019
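  • As cloud system environments expand, more and more data centers (IDCs) are being built. Data centers play an important role in processing the large volumes of data generated in emerging fourth-industrial-revolution areas such as the Internet of Things (IoT) and autonomous vehicles. Operating a data center requires a large amount of energy, and a large amount of electric cooling energy is consumed to handle the heat generated by the many computers. Cooling and air-conditioning operation is therefore critical to data center operation: in some cases the cooling, which runs as an auxiliary facility, consumes more than the cost of running the many computers themselves. Recent research has accordingly focused on making data center cooling operation more efficient. In this paper, we propose a cooling energy saving system for data centers that uses multiple machine learning to make cooling operation more efficient. Instead of building a machine learning model with a single algorithm, the system uses multiple machine learning algorithms to generate an optimized model in a daily batch and makes predictions with it. By operating cooling optimally in advance, the system reduces the overcooling found in existing data centers and thereby saves energy. The results of this study can contribute to the energy efficiency of the rapidly growing number of data centers and suggest an operating system approach that can provide competitiveness in the cloud business.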
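
The idea of a daily batch that trains several algorithms and keeps the best one can be sketched as follows. The abstract does not name the algorithms, features, or error metric, so everything here is an assumption for illustration: two toy candidates (a mean baseline and a least-squares line), outside temperature as the only feature, and mean absolute error on a held-out validation set as the selection rule.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline candidate: always predict the average training load."""
    avg = mean(ys)
    return lambda x: avg


def fit_linear(xs, ys):
    """Candidate: ordinary least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b


def mae(model, xs, ys):
    """Mean absolute error of a fitted model on (xs, ys)."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


# Hypothetical candidate pool; a real system would register more algorithms.
CANDIDATES = {"mean": fit_mean, "linear": fit_linear}


def daily_batch(train, valid):
    """Fit every candidate and return (name, model) with the lowest validation MAE."""
    tx, ty = train
    vx, vy = valid
    fitted = {name: fit(tx, ty) for name, fit in CANDIDATES.items()}
    best = min(fitted, key=lambda name: mae(fitted[name], vx, vy))
    return best, fitted[best]


if __name__ == "__main__":
    # Made-up data: outside temperature (deg C) vs. cooling load (kW).
    train = ([20, 22, 24, 26], [50, 54, 58, 62])
    valid = ([28, 30], [66, 70])
    name, model = daily_batch(train, valid)
    print(name, model(32))
```

Re-running `daily_batch` each day on fresh data lets the selected algorithm change as conditions change, which is the point of choosing among several models rather than fixing one.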

Performance Enhancement and Evaluation of Distributed File System for Cloud (클라우드 분산 파일 시스템 성능 개선 및 평가)

  • Lee, Jong Hyuk
    • KIPS Transactions on Computer and Communication Systems / v.7 no.11 / pp.275-280 / 2018
  • Choosing a suitable distributed file system is necessary for loading large data and processing it at high speed in subsequent applications in a cloud environment. In this paper, we propose a write performance improvement method based on GlusterFS and evaluate the performance of MapRFS, CephFS, and GlusterFS, which are existing distributed file systems, in a cloud environment. The proposed method improves response time by changing the synchronization level used by the synchronous replication method from disk to memory. Experimental results show that the distributed file system to which the proposed method is applied outperforms the other distributed file systems in sequential write, random write, and random read.
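
The difference between a disk-level and a memory-level synchronization point can be illustrated on a single machine with ordinary files. This is only an analogy, not GlusterFS code or the paper's implementation: the function `replicated_write` and its `sync_to_disk` flag are hypothetical, and local files stand in for replicas.

```python
import os
import tempfile


def replicated_write(paths, data, sync_to_disk):
    """Write data to every replica and return only after all have acknowledged.

    sync_to_disk=True : a replica acknowledges after os.fsync, i.e. once the
                        data has been forced to stable storage.
    sync_to_disk=False: a replica acknowledges once the data has been handed
                        to the OS (in memory, the page cache). This is faster,
                        but a crash before write-back can lose the data.
    """
    for path in paths:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()                 # push Python's buffer to the OS
            if sync_to_disk:
                os.fsync(f.fileno())  # ask the OS to persist the file to disk
    return len(paths)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        replicas = [os.path.join(d, f"replica{i}.bin") for i in range(2)]
        replicated_write(replicas, b"block", sync_to_disk=False)
```

Both modes remain synchronous in the sense that the caller waits for every replica; only the point at which a replica counts the write as done moves from disk to memory, which trades durability on crash for lower response time.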

A Transformer-Based Emotion Classification Model Using Transfer Learning and SHAP Analysis (전이 학습 및 SHAP 분석을 활용한 트랜스포머 기반 감정 분류 모델)

  • Subeen Leem;Byeongcheon Lee;Insu Jeon;Jihoon Moon
    • Annual Conference of KIPS / 2023.05a / pp.706-708 / 2023
  • In this study, we apply transfer learning to three pre-trained transformer models to classify five emotions. The KLUE (Korean Language Understanding Evaluation)-BERT (Bidirectional Encoder Representations from Transformers) model performs best among them, and its F1 scores indicate superior learning and generalization on the experimental data. To examine why it performs well, we apply the SHAP (Shapley Additive Explanations) method to the KLUE-BERT model and present the results as a text plot visualization. This approach shows the impact of individual tokens on emotion classification and provides visual evidence to support the predictions of the KLUE-BERT model.
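
SHAP attributions are built on Shapley values, which can be computed exactly when the input is tiny. The sketch below does that for a three-token sentence and a toy scoring function. The scoring function `toy_score`, its weights, and the example tokens are invented for illustration; they stand in for a real classifier such as KLUE-BERT, for which SHAP would approximate these values instead of enumerating every ordering.

```python
from itertools import permutations
from math import factorial

# Hypothetical per-token weights standing in for a real classifier.
TOY_WEIGHTS = {"happy": 2.0, "not": -1.0, "today": 0.0}


def toy_score(tokens):
    """Toy 'joy' score: sum of token weights plus a negation interaction."""
    score = sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    if "not" in tokens and "happy" in tokens:
        score -= 3.0  # negation reverses the positive word
    return score


def shapley_values(tokens, score):
    """Exact Shapley value of each (unique) token: its marginal contribution
    averaged over all orders in which tokens are revealed to the model."""
    values = {t: 0.0 for t in tokens}
    for order in permutations(tokens):
        present = []
        for t in order:
            before = score(present)
            present.append(t)
            values[t] += score(present) - before
    n_orders = factorial(len(tokens))
    return {t: v / n_orders for t, v in values.items()}


if __name__ == "__main__":
    sentence = ["not", "happy", "today"]
    print(shapley_values(sentence, toy_score))
```

The per-token values always sum to the difference between the score of the full sentence and the score of the empty input, which is what lets a text plot show each token's share of the prediction.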

A Study on Networking Technology for Cloud Data Centers (클라우드 데이터센터를 위한 네트워킹 기술에 관한 연구)

  • Choi, Jung-Yul
    • Journal of Digital Convergence / v.14 no.2 / pp.235-243 / 2016
  • Legacy data centers are transforming into cloud data centers with the advance of mobile and Internet of Things technology, the processing of big data, and the development of cloud computing technology. The goal of cloud data centers is to manage energy and facilities efficiently and to meet users' service demands rapidly by operating virtualized ICT (Information and Communication Technology) resources. Accordingly, networks must be configured and operated so that virtualized ICT resources are provided efficiently. This paper analyzes networking technologies suitable for cloud data centers and presents ways to operate the data center efficiently.

A Study on the Development Direction of Medical Image Information System Using Big Data and AI (빅데이터와 AI를 활용한 의료영상 정보 시스템 발전 방향에 대한 연구)

  • Yoo, Se Jong;Han, Seong Soo;Jeon, Mi-Hyang;Han, Man Seok
    • KIPS Transactions on Computer and Communication Systems / v.11 no.9 / pp.317-322 / 2022
  • The rapid development of information technology is bringing many changes to the medical environment. In particular, it is driving rapid change in medical image information systems that use big data and artificial intelligence (AI). The prescription delivery system (OCS), which consists of an electronic medical record (EMR) and a medical image storage and transmission system (PACS), has rapidly changed the medical environment from analog to digital. When combined with multiple solutions, PACS represents a new direction for advancement in security, interoperability, efficiency, and automation. Among these, the combination with AI using big data, which can improve image quality, is actively progressing. In particular, AI PACS, a system that assists in reading medical images using deep learning, was developed through university-industry cooperation and is in use in hospitals. In line with these rapid changes, structural changes in the medical market and changes in medical policy are also necessary. Medical image information is based on the DICOM (Digital Imaging and Communications in Medicine) format and is divided, according to how it is generated, into tomographic images, which are volume images, and cross-sectional images, which are two-dimensional images. In addition, many medical institutions are now rushing to introduce next-generation integrated medical information systems as part of smart hospital services. Such a system is built as a solution that integrates EMR, electronic consent, big data, AI, precision medicine, and interworking with external institutions, and it also aims to support research. Korea's medical image information system is at a world-class level thanks to advanced IT and government policies. In particular, the PACS solution is the only field exporting medical information technology to the world. In this study, along with an analysis of medical image information systems using big data, we examine current trends against the historical background of the introduction of such systems in Korea and predict the future direction of development. In the future, based on DICOM big data accumulated over 20 years, we plan to conduct research to increase the image read rate using AI and deep learning algorithms.

National Awareness of the 2019 World Swimming Championships using Big Data from Social Network Analysis (소셜네트워크 분석의 빅데이터를 활용한 2019세계수영선수권 대회의 국내 인식조사)

  • Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association / v.13 no.4 / pp.173-184 / 2019
  • In this study, word data were collected from social media through Textom and analyzed as big data. Three topics (the 2019 Gwangju World Swimming Championships, the 2019 Gwangju World Swimming Masters Competition, and problems of the 2019 World Swimming Championships) were handled consistently through data collection and refinement in the web environment. We applied the collected words to Ucinet6, visualized them, and conducted a CONCOR analysis to identify similarity relationships among words and clusters of common factors. The analysis showed that the clusters related to the 2019 Gwangju World Swimming Championships consisted of four main areas of recognition and perception, with searches mainly concerning operational aspects of the championships. The clusters related to the 2019 Gwangju World Swimming Masters Competition were divided into two areas, major recognition and peripheral recognition, with searches mainly concerning the promotion of the Masters Competition and aspects of the competition itself. The cluster related to the problems of the 2019 Gwangju World Swimming Championships was divided into five areas, with searches mainly concerning the place, operation, institutions, and events associated with those problems.
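
The core step of a CONCOR (convergence of iterated correlations) analysis can be sketched in a few lines: correlate the columns of a co-occurrence matrix, correlate the columns of the result, and repeat until every entry approaches +1 or -1, which splits the words into two blocks. The code below is a textbook-style sketch under stated assumptions, not Ucinet6's implementation; the six words and their co-occurrence counts are made up for the example.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def correlate_columns(m):
    """Matrix of correlations between every pair of columns of m."""
    cols = list(zip(*m))
    return [[pearson(ci, cj) for cj in cols] for ci in cols]


def concor_split(matrix, max_iter=50, tol=1e-6):
    """Split nodes into two blocks by iterating column correlations
    until every entry is within tol of +1 or -1 (or max_iter is reached)."""
    corr = correlate_columns(matrix)
    for _ in range(max_iter):
        if all(abs(abs(x) - 1.0) < tol for row in corr for x in row):
            break
        corr = correlate_columns(corr)
    first = [i for i, x in enumerate(corr[0]) if x > 0]
    second = [i for i, x in enumerate(corr[0]) if x <= 0]
    return first, second


if __name__ == "__main__":
    # Hypothetical words and co-occurrence counts (symmetric, zero diagonal).
    words = ["venue", "schedule", "ticket", "medal", "record", "athlete"]
    cooccurrence = [
        [0, 5, 5, 0, 0, 0],
        [5, 0, 5, 0, 0, 0],
        [5, 5, 0, 0, 0, 0],
        [0, 0, 0, 0, 5, 5],
        [0, 0, 0, 5, 0, 5],
        [0, 0, 0, 5, 5, 0],
    ]
    a, b = concor_split(cooccurrence)
    print([words[i] for i in a], [words[i] for i in b])
```

Applying `concor_split` again to each resulting block yields four, then more, clusters, which is how a CONCOR analysis arrives at groupings such as the four- and five-area results reported above.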

Ethical and Legal Implications of AI-based Human Resources Management (인공지능(AI) 기반 인사관리의 윤리적·법적 영향)

  • Jungwoo Lee;Jungsoo Lee;Ji Hun Kwon;Minyi Cha;Kyu Tae Kim
    • Journal of the Institute of Convergence Signal Processing / v.25 no.2 / pp.100-112 / 2024
  • This study investigates the ethical and legal implications of utilizing artificial intelligence (AI) in human resource management, with a particular focus on AI interviews in the recruitment process. AI, defined as the capability of computer programs to perform tasks associated with human intelligence such as reasoning, learning, and adapting, is increasingly being integrated into HR practices. The deployment of AI in recruitment, specifically through AI-driven interviews, promises efficiency and objectivity but also raises significant ethical and legal concerns. These concerns include potential biases in AI algorithms, transparency in AI decision-making processes, data privacy issues, and compliance with existing labor laws and regulations. By analyzing case studies and reviewing relevant literature, this paper aims to provide a comprehensive understanding of these challenges and propose recommendations for ensuring ethical and legal compliance in AI-based HR practices. The findings suggest that while AI can enhance recruitment efficiency, it is imperative to establish robust ethical guidelines and legal frameworks to mitigate risks and ensure fair and transparent hiring practices.

Identify the Failure Mode of Weapon System (or equipment) using Machine Learning (Machine Learning을 이용한 무기 체계(or 구성품) 고장 유형 식별)

  • Park, Yun-Kyung;Lee, Hye-Won;Kim, Sang-Moon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.8 / pp.64-70 / 2018
  • During the development of weapon systems (or components), the number of tests is constrained by the limited development period and cost, which reduces the amount of failure-related data that can be accumulated. Nevertheless, because a large amount of failure data and maintenance records from the operational period is managed as computerized data, the causes of failure of weapon systems (or components) can be analyzed from that data. However, analyzing the failure and maintenance records of various weapon systems is difficult because the records vary among groups and companies and the causes of failure are described in unstructured text. Recent developments in big data processing technology, machine learning algorithms, and hardware computing power have supported extensive research into methods for processing such unstructured data. In this paper, doc2vec, a machine learning technique, is applied to unstructured data on the failure and maintenance of defense weapon systems (or components) to analyze failure cases.