• Title/Summary/Keyword: and Parallel Processing

GPU-Optimized BVH and R-Triangle Methods for Rapid Self-Intersection Handling in Fabrics

  • Jong-Hyun Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.59-65 / 2024
  • In this paper, we present GPU-based acceleration of the computationally intensive self-collision processing in triangular mesh-based cloth simulation. For Compute Unified Device Architecture (CUDA)-based parallel optimization, we propose 1) an efficient way to build, update, and traverse the Bounding Volume Hierarchy (BVH) tree on the GPU, and 2) a GPU-optimized Representative-Triangle (R-Triangle) technique that minimizes primitive collision checks. As a result, the proposed method handles self-collisions and object collisions of cloth simulation on the GPU faster and more efficiently than CPU-based algorithms, and experiments on various scenes show speedups of 5x to 10x. Since the proposed method is optimized for BVH on the GPU, it can be easily integrated into various algorithms and fields that utilize BVH.
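
The BVH broad phase described in this abstract can be illustrated with a small CPU-side sketch. This is not the paper's CUDA implementation: the median-split build, the names `Node`, `build`, and `self_collision_pairs`, and the rule that skips triangles sharing a vertex are all assumptions made for illustration, and the R-Triangle optimization is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                        # minimum corner of the bounding box
    hi: Vec3                        # maximum corner of the bounding box
    tri: Optional[int] = None       # triangle index (leaves only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tri_bounds(verts, tri):
    """Axis-aligned bounding box of one triangle."""
    pts = [verts[i] for i in tri]
    lo = tuple(min(p[k] for p in pts) for k in range(3))
    hi = tuple(max(p[k] for p in pts) for k in range(3))
    return lo, hi


def build(verts, tris, ids):
    """Build a BVH by median split along the longest box axis (hypothetical choice)."""
    boxes = [tri_bounds(verts, tris[i]) for i in ids]
    lo = tuple(min(b[0][k] for b in boxes) for k in range(3))
    hi = tuple(max(b[1][k] for b in boxes) for k in range(3))
    if len(ids) == 1:
        return Node(lo, hi, tri=ids[0])
    axis = max(range(3), key=lambda k: hi[k] - lo[k])
    # Sort by triangle centroid along the axis (sum of coordinates = 3 x centroid).
    order = sorted(ids, key=lambda i: sum(verts[v][axis] for v in tris[i]))
    mid = len(order) // 2
    return Node(lo, hi,
                left=build(verts, tris, order[:mid]),
                right=build(verts, tris, order[mid:]))


def overlap(a, b):
    """True if the two boxes intersect on every axis."""
    return all(a.lo[k] <= b.hi[k] and b.lo[k] <= a.hi[k] for k in range(3))


def self_collision_pairs(root, tris):
    """Candidate triangle pairs whose boxes overlap, skipping adjacent triangles."""
    pairs = []

    def cross(a, b):
        if not overlap(a, b):
            return
        if a.tri is not None and b.tri is not None:
            # Triangles sharing a vertex always touch, so they are filtered out.
            if not set(tris[a.tri]) & set(tris[b.tri]):
                pairs.append((min(a.tri, b.tri), max(a.tri, b.tri)))
        elif a.tri is not None:
            cross(a, b.left)
            cross(a, b.right)
        else:
            cross(a.left, b)
            cross(a.right, b)

    def within(n):
        if n.tri is not None:
            return
        within(n.left)
        within(n.right)
        cross(n.left, n.right)

    within(root)
    return sorted(pairs)
```

The surviving pairs would still need exact vertex-triangle and edge-edge tests; the sketch only shows how the hierarchy prunes most pairs before that stage.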

The 3-Phase Induction Motor Speed Control by the MRA-DSM controller (MRA-DSM 제어기를 이용한 3상 유도전동기의 속도 제어)

  • 원영진;한완옥;박진홍;이종규;이성백
    • The Proceedings of the Korean Institute of Illuminating and Electrical Installation Engineers / v.9 no.1 / pp.54-62 / 1995
  • This paper studies speed control of an induction motor using an MRA-DSM (Model Reference Adaptive-Discrete Sliding Mode) controller. A DSM algorithm is proposed to make motor speed control robust against disturbances and parameter variation. An MRA-DSM scheme is also proposed; it includes an additional load model reference algorithm that compensates for the discontinuous control inputs in sliding mode and follows the reference model independently of parameter variation in the controlled plant. The control system is built as a microprocessor-based parallel processing control system to maximize control performance and real-time processing. Controlling the system in software also simplifies the hardware and improves system reliability. With MRA-DSM control, the response is 27.2% faster than with DSM control.

Recommendation System Using Big Data Processing Technique (빅 데이터 처리 기법을 적용한 추천 시스템에 관한 연구)

  • Yun, So-Young;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.6 / pp.1183-1190 / 2017
  • With the development of network and IT technology, people search for and purchase the items they want regardless of location. Consequently, there are various studies on how to solve the scalability problem caused by rapidly increasing data in recommendation systems. In this paper, we propose an item-based collaborative filtering method using tag weights and a recommendation technique using MapReduce, a distributed parallel processing method. To improve speed and efficiency, the proposed method classifies items into categories during preprocessing and groups them according to the number of nodes. In each distributed node, data is processed through four MapReduce steps. To recommend better items to users, item tag weights are used in the similarity calculation. The experimental results indicate that the proposed method improves the appropriateness of recommendations compared to the item-based method and runs efficiently on large amounts of data.
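
The idea of computing tag-weighted item similarity in a map and reduce style can be sketched in plain Python. The abstract does not give the similarity formula or the contents of the four MapReduce steps, so everything here is an assumption: a single pass, cosine similarity over co-rating users, a Jaccard tag overlap as the tag weight, and a hypothetical blending factor `alpha`.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def map_user(ratings):
    """Map: for one user, emit ((item_a, item_b), (rating_a, rating_b)) per item pair."""
    for (a, ra), (b, rb) in combinations(sorted(ratings.items()), 2):
        yield (a, b), (ra, rb)


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_pair(values):
    """Reduce: cosine similarity of two items over the users who rated both."""
    dot = sum(x * y for x, y in values)
    norm_a = sqrt(sum(x * x for x, _ in values))
    norm_b = sqrt(sum(y * y for _, y in values))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_weight(tags_a, tags_b):
    """Assumed tag weight: Jaccard overlap of the two items' tag sets."""
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0


def item_similarities(user_ratings, item_tags, alpha=0.5):
    """Blend rating similarity and tag weight; alpha is a hypothetical parameter."""
    emitted = (kv for ratings in user_ratings.values() for kv in map_user(ratings))
    sims = {}
    for (a, b), values in shuffle(emitted).items():
        rating_sim = reduce_pair(values)
        sims[(a, b)] = (1 - alpha) * rating_sim + alpha * tag_weight(item_tags[a], item_tags[b])
    return sims
```

Because each key is reduced independently, the pairs can be partitioned across nodes, which is the property a MapReduce deployment relies on.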

VLSI Implementation of Low-Power Motion Estimation Using Reduced Memory Accesses and Computations (메모리 호출과 연산횟수 감소기법을 이용한 저전력 움직임추정 VLSI 구현)

  • Moon, Ji-Kyung;Kim, Nam-Sub;Kim, Jin-Sang;Cho, Won-Kyung
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.5A / pp.503-509 / 2007
  • Low-power motion estimation is required for video coding in portable information devices. In this paper, we propose a low-power motion estimation algorithm and a 1-D systolic array VLSI architecture using the full search block matching algorithm (FSBMA). The main sources of power dissipation in FSBMA are complex computations and frequent memory accesses for data in the search area. In the proposed algorithm, memory accesses and computations are reduced by using a 1-D PE (processing element) array architecture that performs motion estimation of two neighboring blocks in parallel and by skipping unnecessary computations during motion estimation. The VLSI implementation results show that the proposed architecture reduces power dissipation by 9.3% and operates two times faster than an existing low-power motion estimator.
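
Full search block matching with skipped computations can be sketched in software as follows. This is only an illustration of the general technique, not the paper's systolic array: the sum of absolute differences (SAD) cost and the early exit once a partial SAD can no longer win are assumptions, and the names `sad` and `full_search` are hypothetical. The parallel processing of two neighboring blocks is not modeled.

```python
def sad(cur, ref, bx, by, dx, dy, n, best):
    """SAD between the n x n block at (bx, by) in cur and its displaced copy in ref.

    Stops after any row once the partial sum reaches `best`, since that
    candidate can no longer beat the current best match.
    """
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        if total >= best:
            return total  # skipped the remaining rows
    return total


def full_search(cur, ref, bx, by, n, p):
    """Try every displacement in [-p, p] x [-p, p] and return (motion_vector, cost)."""
    height, width = len(ref), len(ref[0])
    best, best_mv = float("inf"), (0, 0)
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Ignore candidates that would read outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n, best)
            if cost < best:
                best, best_mv = cost, (dx, dy)
    return best_mv, best
```

The early exit never changes the result, because a candidate is abandoned only after its partial cost already equals or exceeds the best complete cost found so far.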

VLSI Architecture Design of Reconstruction Filter for Morphological Image Segmentation (형태학적 영상 분할을 위한 재구성 필터의 VLSI 구조 설계)

  • Lee, Sang-Yeol;Chung, Eui-Yoon;Lee, Ho-Young;Kim, Hee-Soo;Ha, Yeong-Ho
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.12 / pp.41-50 / 1999
  • In this paper, a new VLSI architecture of a reconstruction filter for morphological image segmentation is proposed. The filter, based on the $h_{max}$ operation, simplifies the interior of each region while preserving the boundary information. The proposed architecture adopts a partitioned memory structure and an efficient image scanning strategy to reduce the number of operations. The proposed memory partitioning scheme allows all data required for processing to be read from the memories at the same time, resulting in parallel data processing. By considering extended connectivity, the number of operations is greatly decreased because more simplification is achieved in the scanning stage. The selective raster scan strategy provides satisfactory noise removal capability with a negligible increase in hardware complexity. The proposed architecture is designed using VHDL, and functional evaluation is performed with the CAD tool, Mentor. The experimental results show that the proposed architecture can simplify the image profile with fewer than 18% of the operations of the conventional method.
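
The $h_{max}$ operation mentioned above is commonly defined as morphological reconstruction by dilation of the signal lowered by $h$, constrained by the original signal. A minimal one-dimensional sketch, assuming that standard definition and a naive iterate-until-stable scheme (not the partitioned-memory, selective raster scan architecture of the paper), could look like this; the function names are hypothetical.

```python
def reconstruct_by_dilation(marker, mask):
    """Repeat a 3-sample dilation clipped by `mask` until nothing changes."""
    cur = list(marker)
    while True:
        nxt = []
        for i in range(len(cur)):
            lo, hi = max(i - 1, 0), min(i + 1, len(cur) - 1)
            dilated = max(cur[lo:hi + 1])      # grow from the neighbors
            nxt.append(min(dilated, mask[i]))  # never exceed the original signal
        if nxt == cur:
            return cur
        cur = nxt


def h_max(signal, h):
    """Suppress peaks shallower than h and lower the remaining peaks by h."""
    marker = [v - h for v in signal]
    return reconstruct_by_dilation(marker, signal)
```

For the profile `[1, 5, 1, 3, 4, 3, 1]` with `h = 2`, the tall peak is lowered from 5 to 3 and the shallow bump is flattened to a plateau, which is the interior simplification the abstract refers to.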

Numerical investigations on stability evaluation of a jointed rock slope during excavation using an optimized DDARF method

  • Li, Yong;Zhou, Hao;Dong, Zhenxing;Zhu, Weishen;Li, Shucai;Wang, Shugang
    • Geomechanics and Engineering / v.14 no.3 / pp.271-281 / 2018
  • A jointed rock slope stability evaluation was simulated by a discontinuous deformation analysis numerical method to investigate the process and safety factors for different crack distributions and different overloading situations. An optimized method using Discontinuous Deformation Analysis for Rock Failure (DDARF) is presented to perform numerical investigations on the jointed rock slope stability evaluation of the Dagangshan hydropower station. During the pre-processing stage of establishing the numerical model, an integrated software system including AutoCAD, Screen Capture, and Excel is adopted to facilitate the implementation of the numerical model with a random joint network. These optimizations during the pre-processing stage of DDARF remarkably improve the simulation efficiency, making complex model calculations feasible. In the numerical investigations using the optimized DDARF, three calculation schemes are taken into account in the numerical model: (I) no joints; (II) two sets of regular parallel joints; and (III) multiple sets of random joints. This model is capable of replicating the entire process, including crack initiation, propagation, formation of shear zones, and local failures, and is thus able to provide constructive suggestions for support schemes for the slope. Overloading numerical simulations under the same three schemes have also been performed. The overloading safety factors of the three schemes are 5.68, 2.42, and 1.39, respectively, obtained by analyzing the displacement evolution of key monitoring points during overloading.

InterCom : Design and Implementation of an Agent-based Internet Computing Environment (InterCom : 에이전트 기반 인터넷 컴퓨팅 환경 설계 및 구현)

  • Kim, Myung-Ho;Park, Kweon
    • The KIPS Transactions:PartA / v.8A no.3 / pp.235-244 / 2001
  • The development of network and computer technology has led to many studies on using physically distributed computers as a single resource. These studies have generally focused on developing environments based on message passing. Such environments are mainly used to solve scientific computation problems and process them in parallel using the inherent parallelism of the given problems. They generally provide high parallelism, but they are difficult to program and use, and they require user accounts on the distributed computers. If a given problem can be divided into completely independent subproblems, a more efficient environment can be provided. Such problems arise in bioinformatics, 3D animation, graphics, and other fields, so developing a new environment for them is very important. We therefore propose a new environment called InterCom, based on proxy computing, which can solve these problems efficiently, and explain its implementation. The environment consists of an agent, a server, and a client. Its merits are easy programming, no need for user accounts on the distributed computers, and ease of use through automatic compilation of distributed code.

Ontology and Sequential Rule Based Streaming Media Event Recognition (온톨로지 및 순서 규칙 기반 대용량 스트리밍 미디어 이벤트 인지)

  • Soh, Chi-Seung;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.43 no.4 / pp.470-479 / 2016
  • As the volume of various types of media data such as UCC (User Created Contents) increases, research is actively being carried out in many different fields to provide meaningful media services. Among these studies, a semantic web-based media classification approach has been proposed; however, it encounters some limitations in video classification because its underlying ontology is derived from meta-information such as video tags and titles. In this paper, we define the recognized objects in a video and the activities composed of video objects in a shot, and introduce a reasoning approach based on description logic. We define sequential rules for a sequence of shots in a video and describe how to classify it. To process the large and increasing amount of media data, we utilize Spark Streaming, a distributed in-memory big data processing framework, and describe how to classify media data in parallel. To evaluate the efficiency of the proposed approach, we conducted an experiment using a large amount of media ontology extracted from YouTube videos.
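
One simple way to read "sequential rules for a sequence of shots" is as ordered lists of activities that must appear, in order, among a video's shots. The sketch below assumes that reading; the rule format, the activity labels, and the function names are invented for illustration, and neither the description-logic reasoning nor the Spark Streaming deployment is reproduced.

```python
def matches(rule, shot_activities):
    """True if the activities in `rule` occur in order (gaps allowed) in the shots."""
    remaining = iter(shot_activities)
    # Each rule step consumes shots until it finds its activity.
    return all(any(step == activity for activity in remaining) for step in rule)


def classify(shot_activities, rules):
    """Return the names of all rules the shot sequence satisfies, sorted by name."""
    return sorted(name for name, rule in rules.items() if matches(rule, shot_activities))


# Made-up rules and labels, purely for illustration.
RULES = {
    "cooking_show": ["chop", "stir", "serve"],
    "soccer_goal": ["dribble", "shoot", "celebrate"],
}
```

Because each video is classified independently of the others, a function like `classify` could be applied per record in a parallel stream.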

Realization of the Pulse Doppler Radar Signal Processor with an Expandable Feature using the Multi-DSP Based Morocco-2 Board (다중 DSP 구조의 Morocco-2 보드를 이용한 확장성을 갖는 펄스 도플러 레이다 신호처리기 구현)

  • 조명제;임중수
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.12 no.7 / pp.1147-1156 / 2001
  • In this paper, a new design architecture for a real-time radar signal processor is proposed. It has been designed and implemented to minimize the inter-processor communication overhead and to maintain coherence in the Doppler pulse domain and in the range domain. Its structure can be easily reconfigured and reprogrammed when a function algorithm is added or an operational scenario is modified. Because the task configuration for parallel processing was designed from the measured computation times of the function algorithms and the transmission times of the signal processing results, data exchange between processors during execution of the function algorithms could be fully eliminated. A Morocco-2 board equipped with ADSP-21060 processors from Analog Devices Inc. and APEX-3.2, developed for the SHARC DSP, were used to construct the radar signal processor.

Non-Photorealistic Rendering Using CUDA-Based Image Segmentation (CUDA 기반 영상 분할을 사용한 비사실적 렌더링)

  • Yoon, Hyun-Cheol;Park, Jong-Seung
    • KIPS Transactions on Software and Data Engineering / v.4 no.11 / pp.529-536 / 2015
  • When three-dimensional objects and photo images are rendered together, the non-photorealistic rendering (NPR) results are visually discordant because the two kinds of content have independent color distributions. This paper proposes an NPR technique that renders both three-dimensional objects and photo images in styles such as cartoons and sketches. The proposed technique computes the color distribution property of the photo images and reduces the number of colors of both the photo images and the 3D objects. NPR is performed based on the reduced colormaps and edge features. To present scenes more naturally, image region segmentation is preferred when extracting and applying colormaps. However, image segmentation requires many computational operations, so non-photorealistic rendering of large frames takes a long time. To speed up the time-consuming segmentation procedure, we use GPGPU for parallel computing on the GPU. As a result, we significantly improve the execution speed of the algorithm.