• Title/Summary/Keyword: batch processing


Implementation of dual cluster service environment using a job batch scheduler and a container orchestration tool (작업 배치 스케줄러와 컨테이너 오케스트레이션 툴을 활용한 이중 클러스터 서비스 환경 구현)

  • Min-Woo Kwon;Gukhua Lee;Do-Sik An;Taeyoung Hong
    • Annual Conference of KIPS
    • /
    • 2024.10a
    • /
    • pp.13-15
    • /
    • 2024
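  • The KISTI Supercomputing Infrastructure Center has built and operates Neuron, a GPU-based cluster system for AI researchers. Neuron provides resource allocation primarily through SLURM, a job batch scheduler. With the recent growth in demand for cloud services based on container images, Neuron also offers MyKSC, a web-based service built on a container orchestration tool. This paper introduces a technique for implementing a dual cluster service environment that uses both a job batch scheduler and a container orchestration tool.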

PF-GEMV: Utilization maximizing architecture in fast matrix-vector multiplication for GPT-2 inference

  • Hyeji Kim;Yeongmin Lee;Chun-Gi Lyuh
    • ETRI Journal
    • /
    • v.46 no.5
    • /
    • pp.817-828
    • /
    • 2024
  • Owing to the widespread advancement of transformer-based artificial neural networks, artificial intelligence (AI) processors are now required to perform matrix-vector multiplication in addition to the conventional matrix-matrix multiplication. However, current AI processor architectures are optimized for general matrix-matrix multiplications (GEMMs), which causes significant throughput degradation when processing general matrix-vector multiplications (GEMVs). In this study, we proposed a port-folding GEMV (PF-GEMV) scheme employing multiformat and low-precision techniques while reusing an outer product-based processor optimized for conventional GEMM operations. This approach achieves 93.7% utilization in GEMV operations with an 8-bit format on an 8 × 8 processor, thus resulting in a 7.5 × increase in throughput compared with that of the original scheme. Furthermore, when applied to the matrix operation of the GPT-2 large model, an increase in speed by 7 × is achieved in single-batch inferences.
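
The reported figures can be cross-checked with a short calculation. The sketch below assumes (the abstract does not say this) that an unfolded GEMV on an outer-product array keeps only one column of processing elements busy, so the baseline utilization of an 8 × 8 array would be 1/8. `gemv_speedup` is a hypothetical helper name.

```python
def gemv_speedup(rows: int, cols: int, achieved_utilization: float):
    """Return (baseline_utilization, speedup) for a rows x cols PE array.

    Assumption (not from the abstract): without folding, a matrix-vector
    product occupies a single column of the array, i.e. `rows` of the
    `rows * cols` processing elements.
    """
    baseline = rows / (rows * cols)
    # With a fixed clock and array size, throughput scales with utilization.
    speedup = achieved_utilization / baseline
    return baseline, speedup


baseline, speedup = gemv_speedup(8, 8, 0.937)
print(f"baseline utilization: {baseline:.3f}, speedup: {speedup:.2f}x")
```

Under that assumption the ratio comes out close to 7.5, in line with the throughput gain the abstract reports.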

Comparison of Performance Between Incremental and Batch Learning Method for Information Analysis of Cyber Surveillance and Reconnaissance (사이버 감시정찰의 정보 분석에 적용되는 점진적 학습 방법과 일괄 학습 방법의 성능 비교)

  • Shin, Gyeong-Il;Yooun, Hosang;Shin, DongIl;Shin, DongKyoo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.3
    • /
    • pp.99-106
    • /
    • 2018
  • When information is acquired through cyber ISR (Intelligence, Surveillance, and Reconnaissance) and an agent is used to support decision-making, periodic communication between the C&C (Command and Control) server and the agent may not be possible. We studied how to conduct surveillance and reconnaissance effectively in this case. Because of the network configuration, agents planted on infiltrated computers cannot communicate seamlessly with C&C servers. The agent therefore keeps collecting data and must analyze it within the short time when communication with the C&C server is possible, using limited resources and time so that it can continue its mission without being discovered. Experiments show the superiority of the incremental learning method over the batch method. In an experiment with memory restricted to 500 megabytes, the incremental learning method reduced learning time by a factor of 10. However, in an experiment that reused incorrectly classified data, relearning took twice as long.
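
The difference between the two learning modes can be sketched with a toy model. The abstract does not name the learner that was used, so the nearest-centroid classifier below is a stand-in, and `IncrementalCentroids` and `batch_centroids` are hypothetical names. The incremental version updates its state from one sample at a time and never stores the data, which is what makes it attractive under a memory limit. The batch version needs all samples at once.

```python
from collections import defaultdict


class IncrementalCentroids:
    """Nearest-centroid classifier updated one sample at a time."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = {}

    def partial_fit(self, x, label):
        # Running-mean update: only the per-class mean and count are kept.
        self.count[label] += 1
        n = self.count[label]
        if label not in self.mean:
            self.mean[label] = list(x)
        else:
            m = self.mean[label]
            for i, xi in enumerate(x):
                m[i] += (xi - m[i]) / n

    def predict(self, x):
        # Pick the class whose centroid is closest in squared distance.
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.mean[c]))
        return min(self.mean, key=dist)


def batch_centroids(samples):
    """Compute the same centroids from the full data set in one pass."""
    sums, counts = {}, defaultdict(int)
    for x, label in samples:
        counts[label] += 1
        if label not in sums:
            sums[label] = list(x)
        else:
            sums[label] = [s + xi for s, xi in zip(sums[label], x)]
    return {c: [s / counts[c] for s in sums[c]] for c in sums}
```

For this simple model both routes arrive at the same centroids. The trade-offs the abstract reports (faster learning under a memory limit, slower relearning of misclassified data) depend on the actual learner and data.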

Continuous Query Processing in Data Streams Using Duality of Data and Queries (데이타와 질의의 이원성을 이용한 데이타스트림에서의 연속질의 처리)

  • Lim Hyo-Sang;Lee Jae-Gil;Lee Min-Jae;Whang Kyu-Young
    • Journal of KIISE:Databases
    • /
    • v.33 no.3
    • /
    • pp.310-326
    • /
    • 2006
  • In this paper, we deal with a method of efficiently processing continuous queries in a data stream environment. We classify previous query processing methods into two dual categories - data-initiative and query-initiative - depending on whether query processing is initiated by selecting a data element or a query. This classification stems from the fact that data and queries have been treated asymmetrically. For processing continuous queries, only data-initiative methods have traditionally been employed, and thus, the performance gain that could be obtained by query-initiative methods has been overlooked. To solve this problem, we focus on an observation that data and queries can be treated symmetrically. In this paper, we propose the duality model of data and queries and, based on this model, present a new viewpoint of transforming the continuous query processing problem to a multi-dimensional spatial join problem. We also present a continuous query processing algorithm based on spatial join, named Spatial Join CQ. Spatial Join CQ processes continuous queries by finding the pairs of overlapping regions from a set of data elements and a set of queries defined as regions in the multi-dimensional space. The algorithm achieves the effects of both of the two dual methods by using the spatial join, which is a symmetric operation. Experimental results show that the proposed algorithm outperforms earlier methods by up to 36 times for simple selection continuous queries and by up to 7 times for sliding window join continuous queries.
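
The duality idea can be illustrated with a small sketch: each data element becomes a degenerate region (a point) and each selection query becomes a hyper-rectangle, so that matching data to queries is an overlap join. The code below is a naive nested-loop join for illustration only, not the Spatial Join CQ algorithm from the paper. `Box`, `point_as_box`, `overlaps`, and `spatial_join` are hypothetical names.

```python
from typing import List, Sequence, Tuple

# A region is one (low, high) interval per dimension.
Box = Tuple[Tuple[float, float], ...]


def point_as_box(values: Sequence[float]) -> Box:
    """Represent a data element as a zero-width region."""
    return tuple((v, v) for v in values)


def overlaps(a: Box, b: Box) -> bool:
    """Two regions overlap if their intervals intersect in every dimension."""
    return all(lo_a <= hi_b and lo_b <= hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b))


def spatial_join(data: List[Tuple[str, Box]],
                 queries: List[Tuple[str, Box]]) -> List[Tuple[str, str]]:
    """Return every (data id, query id) pair whose regions overlap.

    The operation is symmetric: swapping the two inputs yields the same
    pairs with the ids reversed.
    """
    return [(d_id, q_id)
            for d_id, d in data
            for q_id, q in queries
            if overlaps(d, q)]
```

A practical implementation would replace the nested loop with an index-based or plane-sweep join, but the symmetry between the two inputs is the same.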

Synthesis of Garnet in the Ca-Ce-Gd-Zr-Fe-O System (Ca-Gd-Ce-Zr-Fe-O계에서의 석류석 합성 연구)

  • Chae Soo-Chun;Jang Young-Nam;Bae In-Kook;Yudintsev S.V.
    • Economic and Environmental Geology
    • /
    • v.38 no.2 s.171
    • /
    • pp.187-196
    • /
    • 2005
  • The structural sites that cations can occupy in the garnet structure are the centers of the tetrahedron, the octahedron, and the distorted cube that shares edges with the tetrahedron and octahedron. Among them, the size of the cation occupying the tetrahedral site (the center of the tetrahedron) is closely related to the size of the garnet unit cell. Accordingly, garnet containing iron, which has a relatively large ionic radius, in the tetrahedral site can be considered a promising matrix for immobilizing elements with large ionic radii, such as the actinides in radioactive wastes. We synthesized several garnets with the batch composition $Ca_{1.5}GdCe_{0.5}ZrFeFe_3O_{12}$ and studied their properties and phase relations under various conditions. Mixed samples were pressed into pellets under a pressure of $200 \sim 400\ \mathrm{kg/cm^2}$ and sintered in the temperature range $1100 \sim 1400^{\circ}\mathrm{C}$ in air and under oxygen atmospheres. Phase identification and chemical analysis of the synthesized samples were conducted by XRD and SEM/EDS. Garnet was obtained as the main phase at $1300^{\circ}\mathrm{C}$, the optimum condition in this system, although minor phases such as perovskite and an unknown phase were also present. The compositions of the garnet and perovskite synthesized from the batch composition $Ca_{1.5}GdCe_{0.5}ZrFeFe_3O_{12}$ ranged over $[Ca_{1.2-1.8}Gd_{0.9-1.4}Ce_{0.3-0.5}]^{VIII}[Zr_{0.8-1.3}Fe_{0.7-1.2}]^{VI}[Fe_{2.9-3.1}]^{IV}O_{12}$ and $Ca_{0.1-0.5}Gd_{0.0-0.8}Ce_{0.1-0.5}Zr_{0.0-0.2}Fe_{0.9-1.1}O_3$, respectively. Compared with the initial batch composition, the 8-coordinated site was enriched in Ca and depleted in Ce. This phenomenon was closely related to the Zr and Fe contents of the 6-coordinated site.

A Study on the Estimation of Object's Dimension based on the Vision System Model of Extended Kalman filtering (확장칼만 필터링의 비젼시스템 모델을 이용한 물체 치수 측정에 관한 연구)

  • Jang, W.S.;Ahn, H.C.;Kim, K.S.
    • Journal of the Korean Society for Nondestructive Testing
    • /
    • v.25 no.2
    • /
    • pp.110-116
    • /
    • 2005
  • Reducing computation time is very important for real-time applications of vision systems, such as inspection, determination of an object's dimensions, and welding, because the vision system model involves a large amount of measurement data acquired by a CCD camera. Estimating the parameters of the vision system model also requires a lot of computation time if an iterative batch estimation method such as Newton-Raphson is used. An efficient computation method such as Extended Kalman Filtering (EKF) is therefore needed. The EKF has the advantages that it takes the measurement uncertainties into account explicitly and that it is a simple and efficient recursive procedure. This study develops an EKF algorithm to compute the parameters of the vision system model in real time. The vision system model involves six parameters that account for the camera's inner and outer parameters. The EKF is also applied to estimate the object's dimensions. Finally, the practicality of the EKF-based estimation scheme is verified experimentally by estimating an object's dimensions.
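
The recursive character of the EKF, as opposed to an iterative batch solve over all measurements, can be sketched on a scalar problem. This is a minimal illustration, not the six-parameter vision model of the paper: it assumes a constant unknown parameter (so there is no prediction step and no process noise) and a hypothetical nonlinear measurement model h(θ) = θ². The name `ekf_estimate` and its arguments are illustrative.

```python
def ekf_estimate(measurements, theta0, p0, r,
                 h=lambda t: t * t, dh=lambda t: 2.0 * t):
    """Recursively estimate a constant scalar parameter.

    measurements: observed values z_k = h(theta) + noise
    theta0, p0:   initial estimate and its variance
    r:            measurement noise variance
    h, dh:        measurement model and its derivative (assumed known)
    """
    theta, p = theta0, p0
    for z in measurements:
        H = dh(theta)                    # linearize around the current estimate
        k = p * H / (H * p * H + r)      # Kalman gain
        theta += k * (z - h(theta))      # correct with the innovation
        p = (1.0 - k * H) * p            # shrink the estimate variance
    return theta, p


# Each measurement is processed once and then discarded.
estimate, variance = ekf_estimate([9.0] * 50, theta0=2.0, p0=1.0, r=0.01)
print(estimate, variance)
```

A batch method such as Newton-Raphson would instead revisit the full measurement set on every iteration, which is the cost the abstract aims to avoid.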

Flotation for Improving Grade of Domestic Fine Coal (국내산(國內産) 미립(微粒) 석탄(石炭)의 품위향상(品位向上)을 위한 부유선별(浮遊選別) 연구(硏究))

  • Han, Oh-Hyung;Kim, Min-Gyu;Kim, Byoung-Gon
    • Resources Recycling
    • /
    • v.22 no.6
    • /
    • pp.64-72
    • /
    • 2013
  • The purpose of this study is to confirm the possibility of obtaining high-grade coal from fine coal with 57.39% fixed carbon. The mineralogical, physical/chemical, and liberation characteristics were also identified in order to reduce the ash content during pre-processing for clean coal technology. Batch flotation and CPT column flotation, which are suited to processing fine particles, were used with varying kinds and quantities of frother, collector, and depressant. Air flow rate and feed rate were also examined. Batch flotation at 20% pulp density with DMU-101 collector (100 mL/ton), AF65 frother (300 mL/ton), and sodium metaphosphate (SMP) depressant (1 kg/ton) gave 67.57% ash rejection and 70.90% combustible recovery. CPT column flotation gave 85.59% ash rejection and 88.97% combustible recovery at 5% pulp density with DMU-101 collector (100 mL/ton), AF65 frother (10 L/ton), SMP depressant (1 kg/ton), wash water (100 mL/min), and an air flow rate of 1,200 mL/min.

Preparation of Seaweed Calcium Microparticles by Wet-grinding Process and their Particle Size Distribution Analysis (초미세습식분쇄공정의 공정변수에 따른 해조칼슘의 입자크기 분석)

  • Han, Min-Woo;Youn, Kwang-Sup
    • Food Engineering Progress
    • /
    • v.13 no.4
    • /
    • pp.269-274
    • /
    • 2009
  • The main objective of this study was to establish the optimum conditions of a wet-grinding process for manufacturing microparticulated seaweed calcium. Process parameters such as the concentration of forming agent, rotor speed, bead size, feed rate, and grinding time were varied during wet grinding of seaweed calcium. The particle size of the raw seaweed calcium ranged from 10 to 20 $\mu$m. After grinding, the calcium particles were reduced to under 1 $\mu$m, that is, to the nanoscale. Gum arabic was a suitable forming agent, and a concentration of 5% (w/v) gave the best grinding efficiency. A wet-grinding process operated at a rotor speed of 4,000 rpm, a bead size of 0.4 mm, and a feed rate of 0.4 L/hr produced particles smaller than 600 nm (>90%). In batch systems, 8 cycles of grinding showed higher efficiency, but 20 min of grinding in continuous processing reduced particle size more efficiently than batch processing. Based on these results, the optimum conditions of the wet-grinding process were established: an operation time of 20 minutes, a rotor speed of 4,000 rpm, a bead size of 0.4 mm, a feed rate of 40 mL/min, and a 30% mixing ratio with water. The resulting ultrafine calcium particles ranged between 40 and 660 nm in size.

Implementation of AI-based Object Recognition Model for Improving Driving Safety of Electric Mobility Aids (전동 이동 보조기기 주행 안전성 향상을 위한 AI기반 객체 인식 모델의 구현)

  • Je-Seung Woo;Sun-Gi Hong;Jun-Mo Park
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.23 no.3
    • /
    • pp.166-172
    • /
    • 2022
  • In this study, we photograph driving obstacles such as crosswalks, side spheres, manholes, braille blocks, partial ramps, temporary safety barriers, stairs, and inclined curbs that hinder or inconvenience vulnerable people who use electric mobility aids. We develop an AI model that classifies and automatically recognizes the photographed objects, and we implement an algorithm that can efficiently identify obstacles in front of an electric mobility aid. So that the model could learn to detect objects with high probability, objects were labeled as polygons when the dataset was built. The model was developed with Mask R-CNN in the Detectron2 framework, which can detect objects labeled as polygons. Images were acquired from two groups, the general public and the transportation-vulnerable, in two areas of the test bed. Among the Mask R-CNN training settings, the model trained with IMAGES_PER_BATCH: 2, BASE_LEARNING_RATE: 0.001, and MAX_ITERATION: 10,000 showed the highest performance, 68.532, allowing the user to recognize driving risks and obstacles quickly and accurately.

Implementation of a Position Correction Algorithm for Multiple Objects in Images Using an Object Recognition Model and a Ground Projection Technique (객체 인식 모델과 지면 투영기법을 활용한 영상 내 다중 객체의 위치 보정 알고리즘 구현)

  • Dong-Seok Park;Sun-Gi Hong;Jun-Mo Park
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.24 no.2
    • /
    • pp.119-125
    • /
    • 2023
  • In this study, we photograph driving obstacles such as crosswalks, side spheres, manholes, braille blocks, partial ramps, temporary safety barriers, stairs, and inclined curbs that hinder or inconvenience vulnerable people who use electric mobility aids. We develop an AI model that classifies and automatically recognizes the photographed objects, and we implement an algorithm that can efficiently identify obstacles in front of an electric mobility aid. So that the model could learn to detect objects with high probability, objects were labeled as polygons when the dataset was built. The model was developed with Mask R-CNN in the Detectron2 framework, which can detect objects labeled as polygons. Images were acquired from two groups, the general public and the transportation-vulnerable, in two areas of the test bed. Among the Mask R-CNN training settings, the model trained with IMAGES_PER_BATCH: 2, BASE_LEARNING_RATE: 0.001, and MAX_ITERATION: 10,000 showed the highest performance, 68.532, allowing the user to recognize driving risks and obstacles quickly and accurately.