• Title/Summary/Keyword: parallel computer processing

Introduction and Improvement of Genetic Programming for Intelligent Fuzzy Robots

  • Murai, Yasuyuki;Matsumura, Koki;Tatsumi, Hisayuki;Tsuji, Hiroyuki;Tokumasu, Shinji
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.388-391 / 2003
  • We have been pursuing research on obstacle avoidance based on fuzzy control. We previously proposed a method that uses genetic programming (GP) to automatically generate membership functions, which play an important role in improving the accuracy of fuzzy control. In this paper, we make two improvements to that method in order to achieve better intelligence in fuzzy robots. First, the mutation rate is changed dynamically according to a coupled chaotic system. Second, population partitioning using demes is introduced through parallel processing. The effectiveness of these improvements is demonstrated through several computer simulations.

A Study on Parallel Performance Optimization Method for Acceleration of High Resolution SAR Image Processing (고해상도 SAR 영상처리 고속화를 위한 병렬 성능 최적화 기법 연구)

  • Lee, Kyu Beom;Kim, Gyu Bin;An, Sol Bo Reum;Cho, Jin Yeon;Lim, Byoung-Gyun;Kim, Dong-Hyun;Kim, Jeong Ho
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.46 no.6 / pp.503-512 / 2018
  • Synthetic aperture radar (SAR) is a technology that acquires images by processing signals obtained from radar, and demand for high-resolution SAR images is increasing. In this paper, we study SAR image processing algorithms that achieve optimal performance on multi-core computer architectures, with the goal of high-speed processing of high-resolution SAR image data. The performance deterioration caused by the large volume of input/output data for high-resolution images is reduced by maximizing memory utilization, and the parallelization ratio of the code is increased by using the dynamic scheduling and nested parallelism of OpenMP. As a result, not only is the total computation time reduced, but the upper bound of parallel performance is also raised, and the actual parallel performance on a multi-core system with 10 cores is improved by more than 8 times. The results of this study are expected to be useful in developing high-resolution SAR image processing software for multi-core systems with large memory.
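
The abstract above ties the parallelization ratio of the code to the upper bound of parallel performance. One common way to illustrate that relationship is Amdahl's law. The sketch below is only an illustration under that assumption; the abstract does not say which model the authors used, and the function names and sample fractions are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup predicted by Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def speedup_upper_bound(parallel_fraction: float) -> float:
    """Limit of the Amdahl speedup as the number of cores grows without bound."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if parallel_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    # Hypothetical parallel fractions, not values taken from the paper.
    for p in (0.80, 0.90, 0.97):
        print(p, amdahl_speedup(p, 10), speedup_upper_bound(p))
```

Under this model, raising the parallel fraction raises both the speedup on a fixed number of cores and the asymptotic bound, which is consistent with the direction of the improvement the abstract reports.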

An Implementation and Verification of Performance Monitor for Parallel Signal Processing System (병렬신호처리시스템을 위한 성능 모니터의 구현 및 검증)

  • Lee Won-Joo;Kim Hyo-Nam
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.313-322 / 2005
  • In this paper, we implement and verify a performance monitor for a parallel signal processing system, using a DSP Starter Kit (DSK) whose base processor is the TMS320C6711 chip. The key idea of this performance monitor is to use Real Time Data Exchange (RTDX) for real-time data transfer, together with DSP/BIOS functions, to measure performance metrics such as DSP workload, memory usage, and bridge traffic. In the simulation, FFT, 2D FFT, matrix multiplication, and FIR filtering, which are widely used DSP algorithms, were employed. Using the performance monitor and Code Composer Studio from Texas Instruments (TI), results were recorded for different frequencies, data sizes, and buffer sizes for a single wave file. The accuracy of our performance monitor was verified by comparing the recorded results.

Implementation of Pedestrian Detection and Tracking with GPU at Night-time (GPU를 이용한 야간 보행자 검출과 추적 시스템 구현)

  • Choi, Beom-Joon;Yoon, Byung-Woo;Song, Jong-Kwan;Park, Jangsik
    • Journal of Broadcast Engineering / v.20 no.3 / pp.421-429 / 2015
  • This paper presents an approach to pedestrian detection and tracking in infrared imagery. We used CUDA (Compute Unified Device Architecture), a parallel computing platform and programming model, to improve the speed of video-based pedestrian detection and tracking. The detection phase is performed by the AdaBoost algorithm based on Haar-like features. The AdaBoost classifier is trained with datasets generated from infrared images. After the pedestrian is detected with the AdaBoost classifier, a particle filter tracking strategy based on HSV histogram features, applied adaptively at the same time, is used for tracking. The proposed approach is implemented on an NVIDIA Jetson TK1 developer board, a full-featured device well suited to software development in a Linux environment. We present the results of parallel processing with the NVIDIA GPU in the CUDA development environment for detection and tracking of pedestrians, and compare the detection and tracking processing time for night-time images on both GPU and CPU. The results show that pedestrian detection and tracking on the GPU is approximately 6 times faster than on the CPU.

Multi-Threaded Parallel H.264/AVC Decoder for Multi-Core Systems (멀티코어 시스템을 위한 멀티스레드 H.264/AVC 병렬 디코더)

  • Kim, Won-Jin;Cho, Keol;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.11 / pp.43-53 / 2010
  • The wide deployment of high-resolution video services has led to active study of high-speed video processing. In particular, the widespread use of multi-core systems has accelerated research on high-resolution video processing based on the parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme for a multi-core platform. Parallel H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because the software may have complicated dependencies. To overcome these issues, we propose a novel approach called Multi-Threaded Parallelization (MTP). In MTP, to reduce synchronization overhead, a separate thread is allocated to each stage in the pipeline. In addition, an efficient memory reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized the FFmpeg H.264/AVC decoder with the proposed technique using OpenMP and carried out experiments on an Intel quad-core platform. The proposed design performs 53% better than the FFmpeg H.264/AVC decoder before parallelization. It also reduces memory usage by 65% and 81% for a high-definition (HD) and a full high-definition (FHD) video, respectively, compared with a popular existing method called 2Dwave.

Automatic Electronic Cleansing in Computed Tomography Colonography Images using Domain Knowledge

  • Manjunath, KN;Siddalingaswamy, PC;Prabhu, GK
    • Asian Pacific Journal of Cancer Prevention / v.16 no.18 / pp.8351-8358 / 2016
  • Electronic cleansing is an image post-processing technique in which the tagged colonic content is subtracted from the colon using CTC images. It produces post-processing artifacts such as: 1) soft tissue degradation; 2) incomplete cleansing; 3) misclassification of polyps due to pseudo-enhanced voxels; and 4) pseudo soft tissue structures. The objective of the study was to subtract the tagged colonic content without losing the soft tissue structures. This paper proposes a novel adaptive method that solves the first three problems using a multi-step algorithm. It uses a new edge-model-based method that involves colon segmentation, a priori information on the Hounsfield units (HU) of different colonic contents at specific tube voltages, subtraction of the tagging materials, restoration of the soft tissue structures based on selective HU, removal of the air-contrast boundary, and a filter that cleans minute particles caused by improperly tagged endoluminal fluids, which appear as noise. The main finding of the study was that submerged soft tissue structures were fully preserved and the pseudo-enhanced intensities were corrected without any artifact. The method was implemented with multithreading for parallel processing on a high-performance computer. The technique was applied to a fecal-tagged dataset (30 patients) in which the tagging agent was not completely removed from the colon. The results were then qualitatively validated by radiologists for any image processing artifacts.

An Improved Convex Hull Algorithm Considering Sort in Plane Point Set (평면 점집합에서 정렬을 고려한 개선된 컨벡스 헐 알고리즘)

  • Park, Byeong-Ju;Lee, Jae-Heung
    • Journal of IKEEE / v.17 no.1 / pp.29-35 / 2013
  • In this paper, we propose an improved convex hull algorithm for a planar point set that takes sorting into account. The algorithm has low computational complexity because the amount of data to process is reduced by using the characteristics of extreme points. It also obtains a complete convex set in a single pass by using a convex vertex discrimination criterion. The algorithm initially requires the point set to be sorted. However, sorting cannot be done quickly because of its heavy operations. This problem was solved by replacing value and index. We measure the execution time of the algorithms on randomly generated sets of points. The experimental results show that the proposed algorithm is about 2 times faster than the existing algorithm.
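
The abstract above describes a convex hull computation that starts from a sorted point set. As a point of reference only, the sketch below shows Andrew's monotone chain, a well-known sort-based convex hull method. It is not the algorithm proposed in the paper, whose extreme-point reduction and vertex discrimination criterion are not specified in the abstract. The names `cross` and `convex_hull` are illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))  # sort by x, then y; duplicates removed
    if len(pts) <= 2:
        return pts

    lower: List[Point] = []
    for p in pts:
        # Pop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


if __name__ == "__main__":
    sample = [(0, 0), (1, 1), (2, 0), (2, 2), (0, 2), (1, 0.5)]
    print(convex_hull(sample))
```

In this baseline the sort dominates the running time, which is the step the abstract identifies as the bottleneck.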

Task Balancing Scheme of MPI Gridding for Large-scale LiDAR Data Interpolation (대용량 LiDAR 데이터 보간을 위한 MPI 격자처리 과정의 작업량 발란싱 기법)

  • Kim, Seon-Young;Lee, Hee-Zin;Park, Seung-Kyu;Oh, Sang-Yoon
    • Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.1-10 / 2014
  • In this paper, we propose an MPI gridding algorithm for LiDAR data that minimizes communication between cores. LiDAR data collected from aircraft is 3D spatial information used in various applications. Because LiDAR data often has a higher resolution than actually required, or includes non-surface information, the raw LiDAR data must be filtered. To use the filtered data, interpolation using a data structure for searching adjacent locations is performed to reconstruct the data. Because the processing time of LiDAR data is directly proportional to its size, there have been many studies on high-performance parallel processing systems using MPI. However, previously proposed parallel methods can suffer performance degradation from imbalanced data sizes among cores or from communication overhead for resolving boundary-condition inconsistencies. We conduct empirical experiments to verify the effectiveness of the proposed algorithm. The results show that the total execution time of the proposed method is up to 4.2 times shorter than that of the conventional method on heterogeneous clusters.

A Parallel Approach for Accurate and High Performance Gridding of 3D Point Data (3D 점 데이터 그리딩을 위한 고성능 병렬처리 기법)

  • Lee, Changseop;Rizki, Permata Nur Miftahur;Lee, Heezin;Oh, Sangyoon
    • KIPS Transactions on Computer and Communication Systems / v.3 no.8 / pp.251-260 / 2014
  • 3D point data is used in various industry domains because it represents the surface information of an object with high accuracy. It is used extensively in geography for terrain scanning and analysis. Generally, 3D point data must be converted by gridding, which produces a regularly spaced array of z values from irregularly spaced xyz data. However, interpolating the grid coordinates requires long processing time and high resource cost. Kriging interpolation in gridding has attracted attention because it is more accurate than other methods. However, it has not been used frequently because its processing is complex and slow. In this paper, we present a parallel gridding algorithm that includes Kriging, together with an application of a grid data structure that fits the algorithm to the MapReduce paradigm. Experiments were conducted on 1.6 and 4.3 billion points from airborne LiDAR files using our proposed MapReduce structure, and the results show that the total execution time is reduced by more than a factor of three compared with the conventional sequential program on three heterogeneous clusters.
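
To make the gridding step in the abstract above concrete, the sketch below turns irregularly spaced xyz points into a regular grid of z values by averaging the points that fall in each cell. This is a deliberately simple stand-in and not the Kriging interpolation or the MapReduce structure used in the paper. The function name `grid_mean` and the `cell_size` parameter are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def grid_mean(points: Iterable[Tuple[float, float, float]],
              cell_size: float) -> Dict[Cell, float]:
    """Average z per grid cell for irregularly spaced (x, y, z) points."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")

    # "Map"-like step: assign each point to the cell that contains it.
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [sum of z, point count]
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        acc = sums[cell]
        acc[0] += z
        acc[1] += 1

    # "Reduce"-like step: combine the values collected for each cell.
    return {cell: total / count for cell, (total, count) in sums.items()}


if __name__ == "__main__":
    sample = [(0.2, 0.3, 10.0), (0.8, 0.1, 20.0), (1.5, 0.5, 7.0)]
    print(grid_mean(sample, cell_size=1.0))
```

Keying each point by its cell is what makes the work easy to partition, since cells can be processed independently. A Kriging-based version would replace the per-cell average with a weighted estimate from neighbouring points.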

Performance Analysis of a NOW According to the Number of Processes and Execution Time (프로세스의 수와 실행시간에 따른 NOW의 성능 분석)

  • 조수현;김영학
    • The Journal of the Korea Contents Association / v.2 no.3 / pp.135-145 / 2002
  • Recently, instead of high-cost supercomputers, NOW (network of workstations) systems consisting of low-cost PCs and workstations connected over a network have come into wide use. In a NOW, parallel processing performance depends on the computation power of each computer and on communication time. Many methods have been proposed to increase parallel processing performance. However, previous work has focused on balancing the workload according to the computation power of each computer. If a computer in a NOW runs multiple work processes, we can expect a decrease in the communication time needed for message passing. Therefore, in this paper, we analyze the factors that improve performance from the viewpoint of work processes, and experimentally evaluate the effect on total performance as the number of work processes increases. We also propose a new broadcasting method that is used in the experiments. LAM/MPI is used for the experimental evaluation.
