• Title/Summary/Keyword: threading


Design of an ALU for SMT Microprocessors (SMT 마이크로프로세서에 적합한 ALU의 설계)

  • 김상철;홍인표;이용석
    • Proceedings of the IEEK Conference
    • /
    • 2003.07d
    • /
    • pp.1383-1386
    • /
    • 2003
  • In this paper, an ALU for Simultaneous Multi-Threading (SMT) microprocessors is designed. The SMT architecture notably improves performance and utilization compared with conventional superscalar architectures by executing instructions from multiple threads at the same time. The ALU adopts a data bypassing method to process multiple threads, and it can flush the instructions of a thread that generates an exception such as a branch misprediction or an interrupt. With data bypassing and exception handling, the performance of SMT microprocessors can be improved.


Back-Office Process Agents and Reference Construction Framework for Internet Shopping Malls (인터넷 쇼핑몰 운영을 위한 후방 프로세스 에이전트와 참조 구축 프레임웍)

  • 박광호
    • Journal of Intelligence and Information Systems
    • /
    • v.5 no.1
    • /
    • pp.167-186
    • /
    • 1999
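  • Internet retailing fundamentally aims to generate a large volume of transactions. This paper defines internal process agents for operating an Internet shopping mall, a representative form of Internet retailing, and presents a reference construction framework for them. The back-office processes of an Internet shopping mall are analyzed, and on that basis the types and characteristics of various operational-level process agents are defined. The organization and operating principles of a process agent team composed of multiple agents are also presented. The agents are implemented using multithreading techniques. Research on operational-level process agents, which handle simple data processing, is expected to develop into research on strategic-level process agents with more complex intelligence.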


Performance Evaluation of a Simultaneous MultiThreading (동시 다중 쓰레딩을 이용하는 마이크로 프로세서 성능평가)

  • 이정훈;오영은;박형우;김진석
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2003.04a
    • /
    • pp.1-3
    • /
    • 2003
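  • Much research has been conducted on SMT technology, which can execute independent threads simultaneously in a single processor cycle, as a way to improve processor efficiency [1, 2, 3, 4]. Because many of these studies measured SMT performance at the simulation level, its performance needs to be measured in a real environment. In this paper, we measure real-world performance by running various benchmarks directly on a processor that implements SMT, and show experimentally, through comparison with conventional SMP, how much performance SMT can actually deliver.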


Stress Analysis in Bar Welding Process for Endless Rolling Application (연연속 압연 공정에서 판 접합 공정의 응력해석)

  • 정제숙;김호영;이종섭
    • Proceedings of the Korean Society for Technology of Plasticity Conference
    • /
    • 1999.08a
    • /
    • pp.327-335
    • /
    • 1999
  • A batch process in which transfer bars are rolled in single-bar units involves many problems in terms of quality, yield and threading stability. The endless rolling process is an effective solution to such problems. In this study, an analysis model is proposed to calculate the distribution of normal stress in the endless rolling process. The model was examined by comparison with experimental results. A device using a spring was developed to improve the welding quality.


A Processor Architecture for 802.11 Wireless LAN Environment (802.11 Wireless LAN 환경에 적합한 프로세서 구조)

  • 전성재;홍인표;이용주;이용석;정진우
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.550-552
    • /
    • 2004
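  • With the recent popularity of mobile products such as mobile phones, PDAs, and notebook computers, consumer interest in mobile devices is growing, and small personal portable products are growing more conspicuously than large network equipment. In line with this trend, interest in wireless LANs is also increasing. In this paper, we study a network processor architecture suited to the 802.11 wireless LAN environment, based on an existing ARM processor. The results show that in a wireless LAN environment, where transmission and reception frequently occur at the same time, a processor architecture capable of multi-threading (SMT) achieves a larger performance improvement than a superscalar architecture.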


GPU-Based Acceleration of Quantum-Inspired Evolutionary Algorithm (GPU를 이용한 Quantum-Inspired Evolutionary Algorithm 가속)

  • Ryoo, Ji-Hyun;Park, Han-Min;Choi, Ki-Young
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.49 no.8
    • /
    • pp.1-9
    • /
    • 2012
  • The Quantum-Inspired Evolutionary Algorithm (QEA) contains sufficient data-level parallelism to be naturally accelerated on GPUs. For an efficient reduction of execution time, however, careful task mapping is needed to properly reflect the characteristics of the CPU and GPU. Furthermore, when deciding which part of the application should run on the GPU, we need to consider the data transfer between the CPU and GPU memory spaces as well as the data-level parallelism. In addition, the use of zero-copy host memory, a proper choice of the execution configuration, and a thread organization that considers memory coalescing are important to further reduce the execution time. With all these techniques, we could run QEA 3.69 times faster on average than on the multi-threaded CPU for the 0-1 knapsack problem with 30,000 items.

Development of the software for high speed data transfer of the high-speed, large capacity data archive system for the storage of the correlation data from Korea-Japan Joint VLBI Correlator (KJJVC)

  • Park, Sun-Youp;Kang, Yong-Woo;Roh, Duk-Gyoo;Oh, Se-Jin;Yeom, Jae-Hwan;Sohn, Bong-Won;Yukitoshi, Kanya;Byun, Do-Young
    • Bulletin of the Korean Space Science Society
    • /
    • 2008.10a
    • /
    • pp.37.2-37.2
    • /
    • 2008
  • The Korea-Japan Joint VLBI Correlator (KJJVC), to be used for the Korean VLBI Network (KVN) at the Korea Astronomy & Space Science Institute (KASI), is a high-speed calculator that outputs correlation results at a maximum rate of 1.4 GB/sec. To receive and record this data at that rate without loss, the design of the software running on the data archive system that receives and records the correlator output is very important. A simple single-threaded program that alternately receives data from the network and records it can become a bottleneck when processing high-speed data, may lose data, and cannot take advantage of hardware that supports multiple cores or hyper-threading, or of operating systems that support such hardware. In this talk we summarize the design of the data transfer software for the KJJVC and the high-speed, large-capacity data archive system, built with general socket programming and multi-threading techniques, and the pre-BMT (Bench Marking Test) results obtained by testing the storage product providers' proposals with this software.
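
The contrast drawn in this abstract, between a single thread that receives and records by turns and a multi-threaded design, can be illustrated with a generic producer/consumer sketch. This is not the KJJVC software: the function names (`receive`, `record`, `transfer`), the bounded-queue depth, and the in-memory chunks standing in for a network socket are all assumptions made for illustration.

```python
import io
import queue
import threading

SENTINEL = None  # end-of-stream marker (an illustrative convention)


def receive(chunks, buf):
    """Producer: stands in for the thread that reads from the network."""
    for chunk in chunks:
        buf.put(chunk)  # blocks only while the bounded buffer is full
    buf.put(SENTINEL)


def record(buf, sink):
    """Consumer: stands in for the thread that writes to storage."""
    while True:
        chunk = buf.get()
        if chunk is SENTINEL:
            break
        sink.write(chunk)


def transfer(chunks, sink, depth=8):
    """Run receiving and recording concurrently, joined by a bounded queue."""
    buf = queue.Queue(maxsize=depth)
    rx = threading.Thread(target=receive, args=(chunks, buf))
    wr = threading.Thread(target=record, args=(buf, sink))
    rx.start()
    wr.start()
    rx.join()
    wr.join()


sink = io.BytesIO()
transfer([b"abc", b"def", b"ghi"], sink)
print(sink.getvalue())
```

Because the queue sits between the two threads, a slow write does not stall the receive loop until the buffer fills, which is the decoupling the abstract attributes to the multi-threaded approach.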


Computational Analysis of the 3-D structure of Human GPR87 Protein: Implications for Structure-Based Drug Design

  • Rani, Mukta;Nischal, Anuradha;Sahoo, Ganesh Chandra;Khattri, Sanjay
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.14 no.12
    • /
    • pp.7473-7482
    • /
    • 2013
  • The G-protein coupled receptor 87 (GPR87) is a recently discovered orphan GPCR, which means that the search for its endogenous ligands remains a novel challenge. GPR87 has been shown to be overexpressed in squamous cell carcinomas (SCCs) or adenocarcinomas in lungs and bladder. The 3D structure of GPR87 was modeled here using two templates (2VT4 and 2ZIY) by a threading method. Functional assignment of GPR87 by SVM predicted various novel functions in addition to transporter activity. The 3D structure was further validated by comparison with structural features of the templates through Verify-3D, ProSA and ERRAT for determining correct stereochemical parameters. The resulting model was evaluated by Ramachandran plot, and good 3D structure compatibility was evidenced by the DOPE score. Molecular dynamics simulation and solvation of the protein were studied through explicit spherical boundaries with a harmonic restraint membrane water system. A DRY motif (Asp-Arg-Tyr sequence) was found at the end of transmembrane helix 3, where the GPCR binds and signal activation is thus transduced. In a search for better inhibitors of GPR87, in silico modification of some substrate ligands was carried out to form polar interactions with Arg115 and Lys296. Thus, this study provides early insights into the structure of a major drug target for SCCs.

Implementation of a Scoreboard Array and a Port Arbiter for In-order SMT Processors (순차적 SMT Processor를 위한 Scoreboard Array와 포트 중재 모듈의 구현)

  • Heo, Chang-Yong;Hong, In-Pyo;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.6
    • /
    • pp.59-70
    • /
    • 2004
  • SMT (Simultaneous Multi Threading) architecture uses TLP (Thread Level Parallelism) to increase processor throughput, so that issue slots can be filled with instructions from multiple independent threads. Having multiple ready threads reduces the probability that a functional unit is left idle, which increases processor efficiency. To exploit these advantages, the issue unit of an SMT processor must control the flow of instructions from different threads without creating conflicts among those instructions, which makes the SMT issue logic extremely complex. The SMT architecture modeled in this paper therefore uses an in-order issue and completion scheme, so it can use a simple issue mechanism with a scoreboard array instead of register renaming or a reorder buffer. However, an SMT scoreboarding mechanism is still more complex and costlier than that of a conventional single-threaded processor. This paper proposes an optimal implementation of a scoreboarding mechanism for an ARM-based SMT architecture.
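
As a rough illustration of why scoreboarding can replace register renaming under in-order issue, the sketch below keeps one busy bit per (thread, register) pair and blocks an instruction whose source or destination register still has a write in flight. The abstract does not describe the paper's actual scoreboard array or port arbiter, so the class name, method names, and sizes here are hypothetical.

```python
class Scoreboard:
    """Hypothetical simplified model: one busy bit per (thread, register)."""

    def __init__(self, n_threads, n_regs):
        self.busy = [[False] * n_regs for _ in range(n_threads)]

    def can_issue(self, tid, srcs, dst):
        """An instruction may issue only if none of its registers is pending."""
        row = self.busy[tid]
        return not row[dst] and not any(row[s] for s in srcs)

    def issue(self, tid, dst):
        """Mark the destination register as having a write in flight."""
        self.busy[tid][dst] = True

    def complete(self, tid, dst):
        """Clear the busy bit when the result is written back."""
        self.busy[tid][dst] = False


sb = Scoreboard(n_threads=2, n_regs=16)
sb.issue(0, dst=3)                          # thread 0: write to r3 in flight
blocked = sb.can_issue(0, srcs=[3], dst=4)  # same thread reads r3: must wait
other = sb.can_issue(1, srcs=[3], dst=4)    # thread 1 has its own r3: may issue
sb.complete(0, dst=3)
ready = sb.can_issue(0, srcs=[3], dst=4)    # hazard cleared
print(blocked, other, ready)
```

Keeping a separate row per thread is what lets an instruction from another thread fill the issue slot while the first thread waits.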

Directional adjacency-score function for protein fold recognition

  • Heo, Mu-Young;Cheon, Moo-Kyung;Kim, Suhk-Mann;Chung, Kwang-Hoon;Chang, Ik-Soo
    • Interdisciplinary Bio Central
    • /
    • v.1 no.2
    • /
    • pp.8.1-8.6
    • /
    • 2009
  • Introduction: It is a challenge to design a protein score function which stabilizes the native structures of many proteins simultaneously. The coarse-grained description of proteins used to construct pairwise-contact score functions usually ignores the backbone directionality of protein structures. We propose a new two-body score function which stabilizes all native states of 1,006 proteins simultaneously. This two-body score function differs from the usual pairwise-contact functions in that it considers two adjacent amino acids at the two ends of each peptide bond, with the backbone directionality from the N-terminal to the C-terminal. The score is a corresponding propensity for a directional alignment of two adjacent amino acids with their local environments. Results and Discussion: We show that a directional adjacency-score function was constructed using 1,006 training proteins with sequence homology of less than 30%, which include all representatives of different protein classes. After parameterizing the local environments of amino acids into 9 categories depending on three secondary structures and three kinds of hydrophobicity of amino acids, the 32,400 adjacency-scores of amino acids could be determined by perceptron learning and protein threading. These could simultaneously stabilize all native folds of the 1,006 training proteins. When these parameters were tested on 382 new, distinct proteins with sequence homology of less than 90%, the native folds of 371 (97.1%) proteins were recognized. We also showed using these parameters that the retro sequences of the SH3 domain, the B domain of Staphylococcal protein A, and the B1 domain of Streptococcal protein G could not be stabilized to fold, which agrees with the experimental evidence.