Search | Korea Science

A Parallel Speech Recognition Model on Distributed Memory Multiprocessors (분산 메모리 다중프로세서 환경에서의 병렬 음성인식 모델)

정상화;김형순;박민욱;황병한
- The Journal of the Acoustical Society of Korea / v.18 no.5 / pp.44-51 / 1999
This paper presents a massively parallel computational model for the efficient integration of speech and natural language understanding. The phoneme model is based on a continuous Hidden Markov Model with context-dependent phonemes, and the language model is based on a knowledge base approach. To construct the knowledge base, we adopt a hierarchically structured semantic network and a memory-based parsing technique that employs parallel marker-passing as an inference mechanism. Our parallel speech recognition algorithm is implemented on a multi-Transputer system using distributed-memory MIMD multiprocessors. Experimental results show that the parallel speech recognition system achieves higher recognition accuracy than a word network-based speech recognition system. The recognition accuracy is further improved by applying code-phoneme statistics. In addition, speedup experiments demonstrate the possibility of constructing a real-time parallel speech recognition system.

Performance Evaluation and Analysis of Symmetric Multiprocessor using Multi-Program Benchmarks (Multi-Program 벤치마크를 이용한 대칭구조 Multiprocessor의 성능평가와 분석)

Jeong Tai-Kyeong
- Journal of the Korea Institute of Information and Communication Engineering / v.10 no.4 / pp.645-651 / 2006
This paper discusses computer system performance evaluation and analysis by employing a simulator which is able to execute a symmetric multiprocessor in a machine simulation environment. We also perform a multiprocessor system analysis using SPLASH-2, a suite of multi-program benchmarks for multiprocessors, to study the behavior of the symmetric multiprocessor OS kernel, IRIX5.3. To validate the scalability of the symmetric multiprocessor system, we describe the structure and evaluation methods for symmetric multiprocessors as well as a functionality-based software simulator, SimOS. We examine cache miss counts and stall times on the symmetric multiprocessor between the local instruction and local data, using multi-program benchmarks such as the RADIX sorting algorithm and Cholesky factorization.

Enhancing Fixed Priority Scheduling Algorithms for Real-Time Tasks on Multiprocessors (다중처리기 상의 실시간 태스크를 위한 고정 우선순위 스케줄링 알고리즘의 성능 향상)

Park Minkyu;Han Sangchul;Kim HeeHeon;Cho Seongje;Cho Yookun
- Journal of KIISE:Computing Practices and Letters / v.11 no.1 / pp.62-68 / 2005
This paper presents a scheme to enhance fixed priority scheduling algorithms on multiprocessors. The scheme gives the highest priority to jobs with zero laxity and schedules them prior to other jobs. A fixed priority algorithm employing this scheme strictly dominates the original one: it can schedule all task sets schedulable by the original fixed priority algorithm, as well as some task sets that the original algorithm cannot schedule. Simulation results show that the proposed scheme improves fixed priority algorithms in terms of the number of schedulable task sets and the schedulable utilization bound.
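
The zero-laxity rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Job` fields, the `laxity` and `pick_jobs` names, the discrete time model, and the convention that a smaller priority value means higher priority are all assumptions made for illustration. Laxity is taken here as the time to the deadline minus the remaining execution time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int   # fixed priority; smaller value = higher priority (assumed)
    remaining: int  # remaining execution time, in time units
    deadline: int   # absolute deadline, in time units


def laxity(job: Job, now: int) -> int:
    """Slack left before the job must run continuously to meet its deadline."""
    return job.deadline - now - job.remaining


def pick_jobs(ready: list[Job], now: int, m: int) -> list[Job]:
    """Choose up to m jobs to run at time `now` on m processors.

    Jobs whose laxity has reached zero (or below) are placed first;
    all other jobs follow in fixed-priority order.
    """
    ordered = sorted(ready, key=lambda j: (laxity(j, now) > 0, j.priority))
    return ordered[:m]


jobs = [
    Job("A", priority=1, remaining=1, deadline=10),
    Job("B", priority=2, remaining=1, deadline=10),
    Job("C", priority=3, remaining=4, deadline=4),  # laxity 0 at time 0
]
selected = [j.name for j in pick_jobs(jobs, now=0, m=2)]
print(selected)
```

In this hypothetical example, plain fixed-priority scheduling on two processors would run A and B first and leave C unable to meet its deadline, whereas the zero-laxity rule promotes C ahead of B.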

A Diagnosis Algorithm for Hypercube Multiprocessors using Adaptive Cube Partition Method (적응적 큐브 분할을 이용한 하이퍼큐브 진단 알고리즘)

Choi, Moon-Ok;Rhee, Chung-Sei
- Journal of KIISE:Computer Systems and Theory / v.27 no.4 / pp.431-439 / 2000
In this paper, we propose a system-level diagnosis algorithm for hypercube multiprocessors using an adaptive cube partition method. Feng[1] proposed a diagnosis algorithm for hypercube multiprocessors that performs better than previous work[2, 3]. However, cube partitions in Feng's algorithm are performed without syndrome analysis, which incurs unnecessary overhead during partitioning. We propose an adaptive cube partition method that produces better partitions through syndrome analysis and reduces diagnosis cost. Simulation results are given for comparison, and they show that our algorithm performs better than Feng's method.

Size Reduction and Performance Analysis of the Bit-map Table Used in the Bus-based Shared Memory System (버스기반의 공유메모리 시스템에서 사용된 비트맵 테이블의 크기 축소와 성능 분석)

Woo, Jong-Jung;Lee, Ka-Young
- The Transactions of the Korea Information Processing Society / v.5 no.1 / pp.24-32 / 1998
Bus contention limits the performance of bus-based shared-memory multiprocessors. In addition, in a split bus transaction environment, some memory requests may stand by unnecessarily in the memory access buffer, which further degrades system performance. This unnecessary stand-by can be eliminated by maintaining a bitmap table that contains a status bit for each memory block. However, this mechanism requires a large SRAM for the status information, because the table is fully mapped to all memory blocks. To solve this problem, we propose a bitmap cache that exploits partial mapping and locality of reference. The simulation results show that the proposed system can greatly reduce the SRAM capacity needed for the status information with little deterioration in performance.

Low-power heterogeneous uncore architecture for future 3D chip-multiprocessors

Dorostkar, Aniseh;Asad, Arghavan;Fathy, Mahmood;Jahed-Motlagh, Mohammad Reza;Mohammadi, Farah
- ETRI Journal / v.40 no.6 / pp.759-773 / 2018
Uncore components such as on-chip memory systems and on-chip interconnects consume a large amount of energy in emerging embedded applications. Few studies have focused on next-generation analytical models for future chip-multiprocessors (CMPs) that simultaneously consider the impacts of the power consumption of core and uncore components. In this paper, we propose a convex-optimization approach to design heterogeneous uncore architectures for embedded CMPs. Our convex approach optimizes the number and placement of memory banks with different technologies on the memory layer. In parallel with hybrid memory architecting, optimizing the number and placement of through silicon vias as a viable solution in building three-dimensional (3D) CMPs is another important target of the proposed approach. Experimental results show that the proposed method outperforms 3D CMP designs with hybrid and traditional memory architectures in terms of both energy delay products (EDPs) and performance parameters. The proposed method improves the EDPs by an average of about 43% compared with SRAM design. In addition, it improves the throughput by about 7% compared with dynamic RAM (DRAM) design.
https://doi.org/10.4218/etrij.2017-0095

Performance Improvement of Single Chip Multiprocessor using Concurrent Branch Execution (분기 동시 수행을 이용한 단일 칩 멀티프로세서의 성능 개선)

Lee, Seung-Ryul;Kim, Jun-Shik;Choi, Jae-Hyeok;Choi, Sang-Bang
- Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.2 / pp.61-71 / 2007
Instruction-level parallelism, which has been used to improve processor performance, is reaching its limit. Changes in control flow caused by branch mispredictions are one of the obstacles that restrict instruction-level parallelism. Single-chip multiprocessors have been developed to exploit thread-level parallelism. However, a single-chip multiprocessor cannot deliver its maximum performance when executing programs that were written without multithreading in mind. To overcome these two sources of performance degradation, this paper proposes a concurrent branch execution method that applies multi-path execution to a single-chip multiprocessor. Both paths of a conditional branch are executed, using an idle core. This improves processor efficiency by preventing branch instructions from breaking the control flow and by reducing idle time. We analyze the effects of the proposed concurrent branch execution through simulation. The results show that concurrent branch execution reduces idle time by about 20% and improves branch prediction accuracy by up to 10%. Our scheme improves overall performance by up to 39% compared with a normal single-chip multiprocessor and by up to 27% compared with a superscalar processor.

Parallel Speech Recognition on Distributed Memory Multiprocessors (분산 메모리 다중 프로세서 상에서의 병렬 음성인식)

윤지현;홍성태;정상화;김형순
- Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.747-749 / 1998
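This paper proposes an effective parallel computational model for the integrated processing of speech and natural language. The phoneme model uses context-dependent phonemes based on a continuous HMM, and the language model uses a knowledge-based approach. A memory-based parsing technique is also used to process multiple hypotheses over a hierarchically structured knowledge base. The parallel speech recognition algorithm is implemented on a multi-Transputer system with a distributed-memory MIMD architecture. The experiments provide solutions to speech-specific problems that arise during speech recognition and show, through parallelization of the speech recognition system, the feasibility of real-time speech recognition.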

Parallel Integration for Real-Time Simulation (실시간 시뮬레이션을 위한 병렬적분)

Lee, W.S.;Samson, J.
- Transactions of the Korean Society of Automotive Engineers / v.2 no.1 / pp.106-115 / 1994
A parallel integration approach is proposed for real-time simulation of controlled mechanical systems. The proposed approach, which employs the dual-rate integration method in a parallel computing environment, is developed to deal effectively with the stiffness and high-frequency characteristics of controlled mechanical systems. Numerical experiments are performed to demonstrate the effectiveness of the approach on two shared-memory multiprocessors, the Alliant FX/8 and the Alliant FX/80.

The architecture and performance evaluation of large programmable controller using the multiprocessors (다중 프로세서를 이용한 대형 Programmable Controller 구조 및 성능 해석)

박홍성;김종일;권욱현
- 제어로봇시스템학회:학술대회논문집 / 1986.10a / pp.169-174 / 1986
This thesis investigates the scanning time, one of the most important performance indices of a Programmable Controller (PC). The multiprocessor architectures of the large PC considered in this thesis are classified as architecture 1 and architecture 2 according to their bus control methods. A queuing model of each architecture is developed. From the analysis, it is observed that architecture 2 is the best architecture for the large PC when the number of processors is less than 3, and architecture 1 is the best when the number of processors is greater than 2.
