• Title/Abstract/Keywords: Parallel program

Search results: 586 items (processing time: 0.025 seconds)

Design of 32 bit Parallel Processor Core for High Energy Efficiency using Instruction-Levels Dynamic Voltage Scaling Technique

  • Yang, Yil-Suk;Roh, Tae-Moon;Yeo, Soon-Il;Kwon, Woo-H.;Kim, Jong-Dae
    • JSTS:Journal of Semiconductor Technology and Science / Vol. 9, No. 1 / pp.1-7 / 2009
  • This paper describes the design of a highly energy-efficient 32-bit parallel processor core that uses instruction-level data gating and dynamic voltage scaling (DVS). The instruction-level data gating technique controls the activation and switching activity of the function units. The instruction-level DVS technique controls the power of the function units without a DC-DC converter or a voltage scheduler controlled by the operating system. Its architecture is therefore simpler than that of conventional DVS, which relies on a DC-DC converter and an OS-controlled voltage scheduler, and it is easy to implement in hardware. Even so, the energy efficiency of the proposed dual-supply instruction-level DVS technique is similar to that of the more complicated conventional DVS. We ran circuit simulations of a test program using Spectra. The reduced power supply was set to 0.667 times the nominal power supply. With instruction-level data gating and DVS, the energy efficiency of the proposed 32-bit parallel processor core improves by about 88.4% compared with the same core without these techniques. The resulting core can serve as a coprocessor for processing massive data at high speed.
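
A rough worked estimate of what the 0.667 supply ratio could mean for switching energy. This assumes the ratio refers to supply voltage and assumes the standard CMOS dynamic switching-energy model; neither is stated in the abstract. The symbols $\alpha$ (activity factor), $C$ (switched capacitance), and $V_{DD}$ (nominal supply voltage) are illustrative.

$$
\frac{E_{\text{low}}}{E_{\text{nom}}} = \frac{\alpha C \,(0.667\,V_{DD})^2}{\alpha C \,V_{DD}^2} = 0.667^2 \approx 0.445
$$

Under these assumptions, a function unit running on the reduced supply would use roughly 44.5% of the nominal switching energy per operation. This estimate does not reproduce the abstract's 88.4% figure, which also reflects data gating and comes from circuit simulation.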

Evaluation of DES Key Search Stability Using Parallel Computing

  • 윤준원;최장원;박찬열;공기식
    • Journal of Digital Contents Society / Vol. 14, No. 1 / pp.65-72 / 2013
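  • Parallel computing techniques that use many computing resources simultaneously have been proposed to handle large-scale workloads in fields such as meteorology, biology, astronomy, and cryptography. By dividing work among multiple processors that compute at the same time, parallel computing can shorten a program's execution time and also enlarge the size of the problems that can be solved. In this paper, we apply parallel processing to the analysis of a real cryptographic algorithm and analyze its efficiency. The key length, which is the practical security factor of a cryptographic algorithm, depends on the amount of computation required for an exhaustive search. We therefore discuss the detailed procedure for performing an exhaustive key search of the DES algorithm in a parallel processing environment and run a simulation on cluster equipment. As a result, the security strength of a cryptographic algorithm can be measured by empirically predicting how the amount of computation changes with the number of computers.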
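
The division of an exhaustive key search among workers can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes the 56-bit effective DES key length and a simple contiguous split, and the function names and the per-worker key rate are hypothetical.

```python
# Hypothetical sketch: split the DES keyspace into contiguous ranges,
# one per worker, and estimate the worst-case search time.
DES_KEY_BITS = 56  # effective DES key length


def partition_keyspace(num_workers, key_bits=DES_KEY_BITS):
    """Return half-open (start, end) key ranges that cover the whole keyspace."""
    total = 1 << key_bits
    base, extra = divmod(total, num_workers)
    ranges = []
    start = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional key each.
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def worst_case_seconds(num_workers, keys_per_second, key_bits=DES_KEY_BITS):
    """Worst case: every worker scans its entire range at the assumed rate."""
    ranges = partition_keyspace(num_workers, key_bits)
    largest = max(end - start for start, end in ranges)
    return largest / keys_per_second
```

Calling `worst_case_seconds` with increasing `num_workers` shows the near-linear drop in worst-case search time that such a split would give if workers run independently at equal speed.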

A Study on the Development and Application of a Design Program for Centrifugal Turbo Fan

  • 김장권;오석형
    • Journal of Power System Engineering / Vol. 20, No. 6 / pp.71-79 / 2016
  • This paper introduces a design method for centrifugal turbo fans and the process of developing a design program based on it. The applicability of the developed program was confirmed against experimental performance data. We propose new velocity coefficients and consider various losses, such as impeller inlet loss, vane passage flow loss, casing pressure loss, recirculation loss power, and disk friction loss power. In particular, the inlet and outlet widths of the impeller were newly determined by reflecting the experimental results. As a result, the fan design program gives good performance results regardless of impeller type and is expected to be a very useful design tool.

Effect of Program Promoting Intention to Exercise Performance Based Theory of Planned Behavior in the Elderly

  • 김진순;현혜진
    • Journal of Korean Biological Nursing Science / Vol. 17, No. 1 / pp.1-10 / 2015
  • Purpose: This study aimed to determine the effect of a program promoting the intention to perform exercise, based on the theory of planned behavior, in elderly people with degenerative joint diseases (DJDs) living in rural areas. Methods: There were two groups, with 32 people in the experimental group and 24 in the control group, all above the age of 60. The program was applied to the experimental group for 12 weeks. Results: Compared to the control group, the experimental group showed a significant increase in attitude towards exercise, subjective norm, perceived behavior control, exercise intention, and exercise performance. Also, pain as a physical function, joint stiffness, ADLs, body flexibility, parallel, perceived health state as a psychological function, and life satisfaction improved significantly. Conclusion: We expect the program to be useful in nursing practice for elderly people with DJDs who need to manage their lifestyle.

Dynamic Load Balancing Algorithm using Execution Time Prediction on Cluster Systems

  • Yoon, Wan-Oh;Jung, Jin-Ha;Park, Sang-Bang
    • Proceedings of the IEEK Conference / IEEK 2002 ITC-CSCC -1 / pp.176-179 / 2002
  • In recent years, an increasing amount of computer network research has focused on cluster systems as a way to achieve higher performance at lower cost. Load imbalance is the major factor that reduces the performance of a cluster system running a parallel program in SPMD (Single Program Multiple Data) form. Load imbalance is also a problem for MPPs (Massively Parallel Processors) and distributed systems. A cluster system is a loosely coupled distributed system and therefore has higher communication overhead than an MPP. Dynamic load balancing can solve the load imbalance problem of a cluster system and reduce its communication cost. The cluster systems considered in this paper consist of P heterogeneous nodes connected by a switch-based network. The master node predicts the average task execution time of each slave node from information reported by that node. It then redistributes the remaining tasks to each node, taking into account the predicted execution time and the communication overhead of task migration. The proposed dynamic load balancing uses execution time prediction to optimize task redistribution. Performance factors such as the number of nodes, the number of tasks, and communication cost are considered to improve cluster performance. Simulation results verify the effectiveness of the proposed dynamic load balancing algorithm.
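
The redistribution step described above can be sketched as a greedy assignment. This is an illustrative stand-in, not the paper's algorithm: the function name, the dictionary of predicted per-task times, and the fixed per-task `migration_cost` are all assumptions.

```python
import heapq


def redistribute(remaining_tasks, avg_task_time, migration_cost=0.0):
    """Greedily assign remaining tasks to the node predicted to finish first.

    avg_task_time: dict mapping node name -> predicted seconds per task
                   (hypothetical input, e.g. reported by each slave node).
    migration_cost: assumed fixed communication cost per migrated task.
    Returns a dict mapping node name -> number of tasks assigned.
    """
    counts = {node: 0 for node in avg_task_time}
    # Heap entries: (predicted finish time after one more task, node).
    heap = [(t + migration_cost, node) for node, t in avg_task_time.items()]
    heapq.heapify(heap)
    for _ in range(remaining_tasks):
        finish, node = heapq.heappop(heap)
        counts[node] += 1
        next_finish = finish + avg_task_time[node] + migration_cost
        heapq.heappush(heap, (next_finish, node))
    return counts
```

For example, with one node predicted at 1.0 s per task and another at 2.0 s, the faster node receives about twice as many tasks, so both are predicted to finish at roughly the same time.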

Refined Fixed Granularity Algorithm on Networks of Workstations

  • 구본근
    • The KIPS Transactions: Part A / Vol. 8A, No. 2 / pp.117-124 / 2001
  • In a NOW (Network of Workstations), load sharing plays a very important role in improving performance. The known load-sharing strategies are fixed-granularity, variable-granularity, and adaptive-granularity. The variable-granularity algorithm is sensitive to various parameters, whereas the Send algorithm, which implements the fixed-granularity strategy, is robust to task granularity, and the performance difference between the Send and variable-granularity algorithms is not substantial. In the Send algorithm, however, computing time and communication time do not overlap, so long network latency affects the execution time of the parallel program. In this paper, we propose the preSend algorithm, in which the master node sends data to the slave nodes in advance, without waiting for partial results from the slaves. Because the master has already sent the next data, the slave nodes can process it without idle time. The preSend algorithm thus overlaps computing time with communication time, which reduces the influence of long network latency and the execution time of the parallel program on the NOW. To compare the execution times of the two algorithms, we use a $320 \times 320$ matrix multiplication. The results show that the preSend algorithm has a shorter execution time than the Send algorithm.
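
The benefit of overlapping communication with computation can be illustrated with a simplified timing model for a single slave. This is a sketch under stated assumptions, not the paper's analysis: every chunk has the same transfer time `comm` and compute time `compute`, result return is ignored, and both function names are hypothetical.

```python
def send_time(n_chunks, comm, compute):
    """No overlap: each chunk is transferred, then computed, in sequence."""
    return n_chunks * (comm + compute)


def presend_time(n_chunks, comm, compute):
    """Overlap: the next chunk is transferred while the current one is computed.

    Only the first transfer and the last computation are exposed; each
    step in between costs whichever of the two is slower.
    """
    if n_chunks == 0:
        return 0.0
    return comm + (n_chunks - 1) * max(comm, compute) + compute
```

In this model the overlapped schedule is never slower than the sequential one, and the saving grows with the number of chunks and with the smaller of the two per-chunk times.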

Optimal Amplify-and-Forward Scheme for Parallel Relay Networks with Correlated Relay Noise

  • Liu, Binyue;Yang, Ye
    • ETRI Journal / Vol. 36, No. 4 / pp.599-608 / 2014
  • This paper studies a parallel relay network where the relays employ an amplify-and-forward (AF) relaying scheme and are subjected to individual power constraints. We consider correlated effective relay noise arising from practical scenarios when the relays are exposed to common interferers. Assuming that the noise covariance and the full channel state information are available, we investigate the problem of finding the optimal AF scheme in terms of maximum end-to-end transmission rate. It is shown that the maximization problem can be equivalently transformed to a convex semi-definite program, which can be efficiently solved. Then an upper bound on the maximum achievable AF rate of this network is provided to further evaluate the performance of the optimal AF scheme. It is proved that the upper bound can be asymptotically achieved in two special regimes when the transmit power of the source node or the relays is sufficiently large. Finally, both theoretical and numerical results are given to show that, on average, noise correlation is beneficial to the transmission rate - whether the relays know the noise covariance matrix or not.

Dynamics of moored arctic spar interacting with drifting level ice using discrete element method

  • Jang, HaKun;Kim, MooHyun
    • Ocean Systems Engineering / Vol. 11, No. 4 / pp.313-330 / 2021
  • In this study, the dynamic interaction between an Arctic Spar and drifting level ice is examined in the time domain using a newly developed ice-hull-mooring coupled dynamics program. The in-house program CHARM3D, a hull-riser-mooring coupled dynamic simulator, is extended by coupling it with the open-source discrete element method (DEM) simulator LIGGGHTS. In the LIGGGHTS module, the parallel-bonding method is implemented to model the level ice as an assembly of multiple bonded spherical particles. As a case study, a spread-moored Arctic Spar platform whose hull surface near the waterline has an inverted conical shape is chosen. A four-point numerical bending test is used to determine the breaking-related DEM parameter (the critical bonding strength). A series of numerical simulations is performed systematically under various ice conditions, including ice drift velocity, flexural strength, and thickness. The effects of these parameters on the ice force, platform motions, and mooring tensions are then discussed. The simulations reveal various features of the dynamic interaction between the drifting ice and the moored platform, including a novel synchronous resonance at low ice speed. The newly developed simulator is promising and can be used repeatedly for future design and analysis involving ice-floater-mooring coupled dynamics.

Building a Dynamic Analyzer for CUDA based System.

  • SALAH T. ALSHAMMARI
    • International Journal of Computer Science & Network Security / Vol. 23, No. 8 / pp.77-84 / 2023
  • The use of GPUs in general-purpose computing is on the rise because of their increasing programmability and growing performance requirements. Tools such as NVIDIA's CUDA let programmers code algorithms in a C-like language for execution on graphics processing units (GPUs). Unfortunately, many performance and correctness bugs occur in parallel programs, and CUDA tool support for such programs has not yet been fully realized. A dynamic analyzer that finds performance and correctness bugs in CUDA programs helps with sophisticated workloads, especially under modern computing requirements. It targets race-condition bugs, which affect program correctness, and shared-memory bank conflicts, which affect overall performance. The technique instruments programs so that the memory locations accessed by different threads are tracked and the code is checked for bugs. The instrumented source code is run directly in CUDA's device emulation mode and reports all errors to the user. This degree of automation helps programmers resolve subtle bugs in highly complex programs or in programs that cannot be analyzed manually.
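
One of the checks mentioned above, detecting shared-memory bank conflicts from recorded addresses, can be sketched offline. This is not the paper's analyzer: the function name is hypothetical, and the bank count (32) and bank width (4 bytes) are assumptions that vary by GPU generation.

```python
NUM_BANKS = 32   # assumption: 32 shared-memory banks
WORD_BYTES = 4   # assumption: each bank is 4 bytes wide


def bank_conflict_degree(byte_addresses):
    """Return the largest number of distinct words that map to one bank.

    byte_addresses: shared-memory byte addresses touched by the threads of
    one warp in a single access. A result of 1 means no conflict. Threads
    reading the same word are counted once, since that case is assumed to
    be served as a broadcast.
    """
    words_per_bank = {}
    for addr in byte_addresses:
        word = addr // WORD_BYTES
        words_per_bank.setdefault(word % NUM_BANKS, set()).add(word)
    return max((len(words) for words in words_per_bank.values()), default=0)
```

Under these assumptions, a warp accessing consecutive 4-byte words yields degree 1, a stride of two words yields degree 2, and a stride of 32 words places every access in the same bank.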