• Title/Summary/Keyword: and Parallel Processing

Performance Analysis of a Parallel Digital Signal Processing System Based on FFT (FFT에 기반한 병렬 디지털 신호처리시스템의 성능분석)

  • 박준석;전창호;박성주;이동호;오원천;한기택
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.1
    • /
    • pp.3-9
    • /
    • 1999
  • This paper analyzes the performance of a parallel digital signal processing system in terms of the CPU cycles required for a 1024-point FFT computation. The number of cycles is estimated using three different approaches: FFT algorithm-based, assembly-level source code-based, and probability-based. The analysis indicates that on a bus-based system the best FFT performance is achieved with a single board. This is because in applications such as FFT, where processors exchange data frequently, the number of communication cycles increases with the number of boards, so inter-board communication degrades overall system performance. The analysis also shows that a linear increase in performance can be obtained if multiple buses are employed.
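
For a sense of scale for an algorithm-based estimate, the sketch below counts the butterfly operations in an N-point FFT. It assumes a radix-2 algorithm, which the abstract does not state, and `cycles_per_butterfly` is a hypothetical parameter rather than a figure from the paper.

```python
def radix2_butterflies(n: int) -> int:
    """Number of butterfly operations in a radix-2 FFT of n points."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1      # log2(n) for a power of two
    return stages * (n // 2)         # n/2 butterflies in every stage


def estimated_cycles(n: int, cycles_per_butterfly: int) -> int:
    """Rough compute-only cycle count; ignores communication entirely."""
    return radix2_butterflies(n) * cycles_per_butterfly


print(radix2_butterflies(1024))  # 10 stages * 512 butterflies = 5120
```

A model like this covers computation only; the inter-board communication cycles discussed in the abstract would have to be added on top.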

A Comparison Algorithm of Rectangularly Partitioned Regions (직사각형으로 분할된 영역 비교 알고리즘)

  • Jung, Hae-Jae
    • Convergence Security Journal
    • /
    • v.6 no.2
    • /
    • pp.53-60
    • /
    • 2006
  • In applications such as CAD and image processing, a variety of geometric objects are manipulated. A polygon whose edges are all parallel to the x- or y-axis is decomposed into simple rectangles for efficient handling. However, depending on the partitioning algorithm, the same region can be decomposed into completely different sets of rectangles that differ in number, size, and shape. An algorithm is therefore needed that compares two sets of rectangles extracted from two scenes, such as CAD drawings or images, to determine whether they represent the same region. This paper proposes an efficient algorithm for comparing two sets of rectangles. The proposed algorithm is not only simpler than the algorithm based on the sweeping method, but also reduces the number of overlapped rectangles from the $O(n^2)$ of the algorithm based on a balanced binary tree to $O(n \log n)$.
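
To make the comparison problem concrete, here is a brute-force baseline that checks whether two rectangle sets cover the same region. It is not the paper's algorithm: it uses coordinate compression and is far slower than $O(n \log n)$. The tuple layout `(x1, y1, x2, y2)` and the function names are assumptions made for this sketch.

```python
from typing import Dict, Iterable, List, Set, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2) with x1 < x2 and y1 < y2


def covered_cells(rects: List[Rect], xs: List[int], ys: List[int]) -> Set[Tuple[int, int]]:
    """Cells of the compressed grid that are covered by at least one rectangle."""
    x_index: Dict[int, int] = {x: i for i, x in enumerate(xs)}
    y_index: Dict[int, int] = {y: j for j, y in enumerate(ys)}
    cells: Set[Tuple[int, int]] = set()
    for x1, y1, x2, y2 in rects:
        for i in range(x_index[x1], x_index[x2]):
            for j in range(y_index[y1], y_index[y2]):
                cells.add((i, j))
    return cells


def same_region(a: Iterable[Rect], b: Iterable[Rect]) -> bool:
    """True if both rectangle sets cover exactly the same area."""
    a, b = list(a), list(b)
    # Every rectangle edge from either set becomes a grid line.
    xs = sorted({x for r in a + b for x in (r[0], r[2])})
    ys = sorted({y for r in a + b for y in (r[1], r[3])})
    return covered_cells(a, xs, ys) == covered_cells(b, xs, ys)


# The same L-shaped region, split horizontally and then vertically.
horizontal = [(0, 0, 2, 1), (0, 1, 1, 2)]
vertical = [(0, 0, 1, 2), (1, 0, 2, 1)]
print(same_region(horizontal, vertical))      # True
print(same_region(horizontal, [(0, 0, 2, 2)]))  # False
```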

An Application of Genetic Algorithms to the Distribution System Loss Minimization Re-configuration Problem (배전손실 최소화 문제에 있어서 유전알고리즘의 수속특성에 관한 연구)

  • Choi, Dai-Seub;Lee, Sang-Il;Oh, Geum-Kon;Kim, Chang-Suk;Choi, Chang-Joo
    • Proceedings of the KIEE Conference
    • /
    • 2001.07a
    • /
    • pp.6-9
    • /
    • 2001
  • This paper presents a new method that applies a genetic algorithm (GA) to determine which sectionalizing switches to operate in order to solve the distribution system loss minimization re-configuration problem. This problem is in essence a 0-1 planning problem, which means that for typical system scales the number of combinations to be searched becomes extremely large. To deal with this, a new approach that applies a GA is presented. Briefly, GAs are a type of random number search method, but they incorporate a multi-point search feature. Further, every point is not separately and respectively renewed; therefore, if parallel processing is applied, a fast solution algorithm can be expected.
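
As a rough illustration of a multi-point search over 0-1 strings, the following is a minimal generic GA sketch, not the method of the paper. The `loss` callback, the parameter defaults, and the operators (tournament selection, one-point crossover, bit-flip mutation, elitism) are all assumptions chosen for brevity; a real study would supply a power-flow based loss for the switch states.

```python
import random
from typing import Callable, List, Sequence

Bits = List[int]


def genetic_search(loss: Callable[[Sequence[int]], float], n_bits: int,
                   pop_size: int = 20, generations: int = 50,
                   mutation_rate: float = 0.05, seed: int = 0) -> Bits:
    """Minimise `loss` over bit strings of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = min(pop, key=loss)
    for _ in range(generations):
        # Each evaluation is independent of the others, so this step is the
        # natural place to distribute work across processors.
        scores = [loss(ind) for ind in pop]

        def pick() -> Bits:
            # Binary tournament: the better of two random individuals.
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[i] if scores[i] <= scores[j] else pop[j]

        next_pop = [best[:]]  # elitism: never lose the best string so far
        while len(next_pop) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_bits)           # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < mutation_rate else b
                     for b in child]                 # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
        best = min(pop, key=loss)
    return best


# Toy loss (hypothetical): Hamming distance to a fixed switch pattern.
target = [1, 0, 1, 1, 0, 0, 1, 0]
toy_loss = lambda bits: sum(b != t for b, t in zip(bits, target))
print(toy_loss(genetic_search(toy_loss, len(target))))
```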

Program Slicing in the Presence of Complicated Data Structure (복잡한 자료 구조를 지니는 프로그램 슬라이싱)

  • Ryu, Ho-Yeon;Park, Joong-Yang;Park, Jae-Heung
    • The KIPS Transactions:PartD
    • /
    • v.10D no.6
    • /
    • pp.999-1010
    • /
    • 2003
  • Program slicing is a method of extracting the statements of a program that influence the value of a variable at a particular point in the program. Program slicing is used in many applications, such as program debugging, program testing, program integration, parallel program execution, software metrics, reverse engineering, and software maintenance. This paper studies how to create exact slices using the Object Reference State Graph, which yields more precise static analysis information about objects in programs with complicated data structures.

Structure of Low-Power MOS Current-Mode Logic Circuit with Sleep-Transistor (슬립 트랜지스터를 이용한 저 전력 MOS 전류모드 논리회로 구조)

  • Kim, Jeong-Beom
    • The KIPS Transactions:PartA
    • /
    • v.15A no.2
    • /
    • pp.69-74
    • /
    • 2008
  • This paper proposes a low-power MOS current-mode logic circuit structure that uses a sleep transistor to reduce the leakage current. A high-threshold-voltage transistor is used as the sleep transistor to minimize the leakage current. A $16 \times 16$-bit parallel multiplier is designed using the proposed circuit structure. Compared with the conventional MOS current-mode logic circuit, the proposed circuit reduces the power consumption in sleep mode by 1/50. The circuit is designed in a Samsung $0.35\,\mu\text{m}$ CMOS process. Its validity and effectiveness are verified through HSPICE simulation.

Alarm Diagnosis Monitoring System of RCP using Self Dynamic Neural Networks (자기 동적 신경망을 이용한 RCP의 경보 진단 시스템)

  • Ryoo, Dong-Wan;Kim, Dong-Hoon;Lee, Cheol-Kwon;Seong, Seung-Hwan;Seo, Bo-Hyeok
    • Proceedings of the KIEE Conference
    • /
    • 2000.07d
    • /
    • pp.2488-2491
    • /
    • 2000
  • A neural network is capable of nonlinear function mapping and parallel processing, and has therefore been developed for diagnosis systems of nuclear power plants. In general, a neural network performs a static mapping, whereas a Dynamic Neural Network (DNN) performs a dynamic mapping. When a fault occurs in a system, the state of the system changes through a transient state. Because the previous state signal is taken into account as information, a DNN is better suited for diagnosis systems than a static neural network. However, a DNN has many weights, so a real-time implementation of a diagnosis system needs a fast network architecture. This paper presents an algorithm for an RCP monitoring alarm diagnosis system using a Self Dynamic Neural Network (SDNN). An SDNN has considerably fewer weights than a general DNN, since there are no interlinks within the hidden layer. The effectiveness of the alarm diagnosis system using the proposed algorithm is demonstrated by applying it to RCP monitoring in a nuclear power plant.

Extending Caffe for Machine Learning of Large Neural Networks Distributed on GPUs (대규모 신경회로망 분산 GPU 기계 학습을 위한 Caffe 확장)

  • Oh, Jong-soo;Lee, Dongho
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.7 no.4
    • /
    • pp.99-102
    • /
    • 2018
  • Caffe is neural network training software that is widely used in academic research. GPU memory capacity is one of the most important considerations when designing neural network architectures. For example, many object detection systems must use less than 12 GB of memory to fit on a single GPU. In this paper, we extend Caffe to allow the use of more than 12 GB of GPU memory. To verify the effectiveness of the extended software, we ran training experiments to determine the learning efficiency of object detection neural network software on a PC with three GPUs.

Optimizing 360 Video Parallel Processing for Asymmetric Core in Mobile VR (모바일 VR 을 위한 비대칭 코어에 최적화된 360 비디오 병렬처리)

  • Roh, Hyun-Joon;Ryu, Yeongil;Ryu, Eun-Seok
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2018.06a
    • /
    • pp.96-99
    • /
    • 2018
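  • Recently, not only ultra-high-definition video but also 360 video content has become widespread. Anyone can easily access 360 video content through smartphones, which are already widely available, but smartphone performance is inevitably limited. This paper therefore introduces two optimization methods that are better suited to 360 video parallel processing in mobile VR. To this end, it exploits the characteristics of asymmetric multicore processors, which are widely used in mobile devices because they reduce power consumption. Both methods reduce decoding time by making the workload assigned to each core proportional to that core's performance ratio. The first method partitions the video into tiles in proportion to the performance ratio of each core. To apply this technique, a model that analyzes computational complexity by video size is used. In experiments, the proposed technique improved decoding time by about 25% on average. The second method estimates the complexity of each tile of the partitioned video from its amount of PUs and assigns tiles to cores in proportion to each core's performance ratio. To apply this technique, the correlation between the amount of PUs and computational complexity is obtained by regression analysis and then used. In experiments, the proposed technique improved decoding time by about 9 to 16%.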
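
The idea of sizing tiles in proportion to core performance can be sketched as follows. This is only an illustration under simple assumptions: the function name, the one-dimensional split of the frame width, and the example performance ratios are hypothetical, and the paper's complexity model is not reproduced here.

```python
from fractions import Fraction
from typing import List, Sequence


def proportional_split(total: int, perf: Sequence[float]) -> List[int]:
    """Split `total` units (e.g. pixel columns) in proportion to `perf`.

    Uses the largest-remainder method so the parts always sum to `total`.
    """
    if not perf or any(p <= 0 for p in perf):
        raise ValueError("perf must contain positive performance ratios")
    weights = [Fraction(p) for p in perf]      # exact arithmetic, no rounding drift
    weight_sum = sum(weights)
    exact = [total * w / weight_sum for w in weights]
    sizes = [e.numerator // e.denominator for e in exact]   # floor of each share
    leftover = total - sum(sizes)
    # Give the remaining units to the shares with the largest fractional part.
    order = sorted(range(len(sizes)), key=lambda i: exact[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


# Two fast cores assumed twice as capable as two slow cores, 3840-pixel-wide frame.
print(proportional_split(3840, [2, 2, 1, 1]))  # [1280, 1280, 640, 640]
```

Real tile boundaries would additionally have to respect the codec's block alignment, which this sketch ignores.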

R-lambda Model based Rate Control for GOP Parallel Coding in A Real-Time HEVC Software Encoder (HEVC 실시간 소프트웨어 인코더에서 GOP 병렬 부호화를 지원하는 R-lambda 모델 기반의 율 제어 방법)

  • Kim, Dae-Eun;Chang, Yongjun;Kim, Munchurl;Lim, Woong;Kim, Hui Yong;Seok, Jin Wook
    • Journal of Broadcast Engineering
    • /
    • v.22 no.2
    • /
    • pp.193-206
    • /
    • 2017
  • In this paper, we propose a rate control method based on the $R$-$\lambda$ model that supports a parallel encoding structure at the GOP level or IDR period level for real-time encoding of 4K UHD input video. For this, a slice-level bit allocation method is proposed for parallel encoding instead of sequential encoding. When a rate control algorithm is applied with GOP-level or IDR-period-level parallelism, the information on how many bits have been consumed cannot be shared among frames belonging to the same frame level, except at the lowest frame level of the hierarchical B structure. It is therefore impossible to manage the bit budget with the existing bit allocation method. To solve this problem, we improve on the conventional bit allocation procedure, which allocates target bits sequentially according to the encoding order. That is, the proposed bit allocation strategy first assigns target bits to GOPs and then distributes the assigned target bits from the lowest depth level to the highest depth level of the HEVC hierarchical B structure within each GOP. In addition, we propose a processing method that improves subjective image quality by allocating bits according to the coding complexity of the frames. Experimental results show that the proposed bit allocation method works well for frame-level parallel HEVC software encoders, and confirm that the performance of our rate controller can be improved with a more elaborate bit allocation strategy that uses the preprocessing results.

Investigation and Processing of Seismic Reflection Data Collected from a Water-Land Area Using a Land Nodal Airgun System (수륙 경계지역에서 얻어진 육상 노달 에어건 탄성파탐사 자료의 고찰 및 자료처리)

  • Lee, Donghoon;Jang, Seonghyung;Kang, Nyeonkeon;Kim, Hyun-do;Kim, Kwansoo;Kim, Ji-Soo
    • The Journal of Engineering Geology
    • /
    • v.31 no.4
    • /
    • pp.603-620
    • /
    • 2021
  • A land nodal seismic system was employed to acquire seismic reflection data using stand-alone cable-free receivers in a land-river area. Acquiring reliable data using this technology is very cost effective, as it avoids topographic problems in the deployment and collection of receivers. The land nodal airgun system deployed at the mouth of the Hyungsan River (in Pohang, Gyeongsangbuk Province) used airgun sources in the river and receivers on the riverbank, with subparallel source and receiver lines spaced approximately 120 m apart. Seismic data collected on the riverbank are characterized by a low signal-to-noise (S/N) ratio and inconsistent reflection events. Most of the events appear as hyperbolas in the field records, including direct waves, guided waves, air waves, and Scholte surface waves, in contrast to the straight lines seen in data collected conventionally, where source and receiver lines are coincident. The processing strategy included enhancing the signal behind the low-frequency, large-amplitude noise with a cascaded application of bandpass and f-k filters for the attenuation of air waves. Static time delays caused by the cross-offset distance between sources and receivers are corrected, with a focus on mapping the shallow reflections obscured by guided wave and air wave noise. A new time-distance equation and curve for direct and air waves are suggested for the correction of the static time delay caused by the cross-offset between source and receiver. Investigation of the minimum cross-offset gathers shows well-aligned shallow reflections around 200 ms after time-shift correction. This time-delay static correction based on the direct wave is found to be essential to improving the data from parallel source and receiver lines.
Data acquisition and processing strategies developed in this study for land nodal airgun seismic systems will be readily applicable to seismic data from land-sea areas when high-resolution signal data becomes available in the future for investigation of shallow gas reservoirs, faults, and engineering designs for the development of coastal areas.
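
The hyperbolic shape of the direct and air waves follows from the acquisition geometry. As a simplified illustration, and not the equation proposed in the paper, assume straight ray paths, a constant propagation velocity $v$, an inline source-receiver offset $x$, and a cross-offset $d$ between the parallel source and receiver lines. All symbols are introduced for this sketch only.

```latex
t(x) = \frac{\sqrt{x^{2} + d^{2}}}{v},
\qquad
\Delta t(x) = t(x) - \frac{|x|}{v} = \frac{\sqrt{x^{2} + d^{2}} - |x|}{v}
```

Under these assumptions the arrival time at $x = 0$ is $d/v$ rather than zero, and $\Delta t(x)$ is the static delay relative to coincident lines, which is largest at zero inline offset and shrinks as $|x|$ grows.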