• Title/Summary/Keyword: and Parallel Processing

12-bit SAR A/D Converter with 6MSB sharing (상위 6비트를 공유하는 12 비트 SAR A/D 변환기)

  • Lee, Ho-Yong;Yoon, Kwang-Sub
    • Journal of IKEEE / v.22 no.4 / pp.1012-1018 / 2018
  • This paper presents a CMOS SAR (Successive Approximation Register) A/D converter with a 1.8V supply voltage, designed for IoT sensor processing. The proposed 12-bit SAR A/D converter operates two A/D converters in parallel to improve the sampling rate. The first converter (A/D converter1) determines all 12 bits, while the second (A/D converter2) reuses the upper six bits determined by the first to minimize power consumption and switching energy. Because A/D converter2 does not determine the upper 6 bits, the control circuits and SAR logic are not needed and the area is minimized. Switching energy grows with capacitor size and with the voltage change in the C-DAC, so skipping the upper 6 bits in the second converter also reduces switching energy. The proposed structure also reduces the effect of process variation in the C-DAC, because the split capacitor in the C-DAC equals the unit capacitor. The converter was designed in a 0.18um CMOS process; with a 1.8V supply, the measured conversion speed is 10MS/s and the Effective Number of Bits (ENOB) is 10.2 bits. The core block area is $600{\times}900um^2$, the total power consumption is $79.58{\mu}W$, and the FOM (Figure of Merit) is 6.716fJ/step.
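
The figure of merit quoted above can be sanity-checked against the other reported numbers. The sketch below assumes the commonly used Walden definition, FOM = P / (2^ENOB × f_s); the abstract does not say which definition the authors used, and the function name `walden_fom` is illustrative only.

```python
def walden_fom(power_w, enob_bits, sample_rate_hz):
    """Energy per conversion step in joules (Walden definition, assumed here)."""
    return power_w / (2 ** enob_bits * sample_rate_hz)

# Values taken from the abstract: 79.58 uW, ENOB 10.2 bits, 10 MS/s.
fom = walden_fom(79.58e-6, 10.2, 10e6)
print(f"{fom * 1e15:.2f} fJ/step")
```

Under that assumption the result is about 6.8 fJ per conversion step, close to the reported 6.716 fJ/step; the small gap may come from rounding of the ENOB value.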

Implementation of High-radix Modular Exponentiator for RSA using CRT (CRT를 이용한 하이래딕스 RSA 모듈로 멱승 처리기의 구현)

  • 이석용;김성두;정용진
    • Journal of the Korea Institute of Information Security & Cryptology / v.10 no.4 / pp.81-93 / 2000
  • To improve the performance of modular exponentiation, the primary arithmetic operation in the RSA cryptosystem, we present a new RSA hardware architecture based on high-radix modular multiplication and the CRT (Chinese Remainder Theorem). By implementing the modular multiplier with radix-16 arithmetic, we reduced the number of PEs (Processing Elements) to a quarter of that required by the binary arithmetic scheme. This in turn reduces both the number of clock cycles and the delay of the pipelining flip-flops to a quarter. Because the receiver knows p and q, the factors of N, the CRT can be applied to the decryption process. To use the CRT, we built two s/2-bit multipliers that operate in parallel during decryption, which makes decryption 4 times faster than without the CRT. In the encryption phase, the two s/2-bit multipliers can be connected to form an s-bit linear multiplier for s-bit arithmetic. We limited the encryption exponent to at most 17 bits to maintain high speed. We implemented a linear array modular multiplier by horizontally projecting the DG of the Montgomery algorithm. The proposed hardware performs encryption at 15Mbps and decryption at 1.22Mbps when estimated with the Samsung 0.5um CMOS Standard Cell Library, which is the fastest among results published to date.
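
The CRT decryption idea described above can be illustrated in software. The following is a minimal sketch of textbook RSA-CRT decryption, not the authors' hardware design; the function name `rsa_crt_decrypt` and the toy key (p = 61, q = 53, e = 17, d = 2753) are assumptions chosen for illustration. It needs Python 3.8 or later for the modular inverse form of `pow`.

```python
def rsa_crt_decrypt(c, p, q, d):
    """Decrypt c using two half-size exponentiations (textbook RSA-CRT)."""
    dp = d % (p - 1)          # reduced exponent for the mod-p branch
    dq = d % (q - 1)          # reduced exponent for the mod-q branch
    q_inv = pow(q, -1, p)     # q^{-1} mod p (Python 3.8+)
    m1 = pow(c, dp, p)        # the two branches are independent, so
    m2 = pow(c, dq, q)        # hardware can evaluate them in parallel
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q         # Garner recombination

# Toy key, far too small for real use.
p, q, e, d = 61, 53, 17, 2753
n = p * q
message = 65
cipher = pow(message, e, n)
recovered = rsa_crt_decrypt(cipher, p, q, d)
print(recovered)
```

Each branch works with half-length operands and a half-length exponent, which is why the two s/2-bit multipliers in the abstract can run side by side during decryption.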

Design and Performance Analysis of a DS/CDMA Multiuser Detection Algorithm in a Mixed Structure Form (혼합구조 형태의 DS/CDMA 다중사용자 검파 알고리즘 설계 및 성능 분석)

  • Lim, Jong-Min
    • Journal of the Institute of Electronics Engineers of Korea TE / v.39 no.3 / pp.51-58 / 2002
  • The conventional code division multiple access (CDMA) detector suffers severe degradation in communication quality as the number of users increases, owing to multiple access interference (MAI). This problem restricts user capacity. Various multiuser detection algorithms have been proposed to overcome the MAI problem. Existing detectors generally fall into one of two categories: linear multiuser detectors and subtractive interference cancellation detectors. In linear multiuser detection, a linear transform is applied to the soft outputs of the conventional detector. In subtractive interference cancellation, estimates of the interference are generated and subtracted from the received signal. The subtractive interference cancellation family has attracted great interest because linear multiuser detection has the disadvantage of requiring matrix inversion. Successive interference cancellation (SIC) and parallel interference cancellation (PIC) are the two most popular subtractive structures. The SIC structure has very low hardware complexity but suffers from increased processing delay, while the PIC structure performs well but requires more hardware. In this paper we propose a mixed SIC/PIC structure that achieves good performance with low hardware complexity. We analyze the performance of the proposed scheme and demonstrate the superior characteristics of the mixed structure through extensive computer simulations.
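
The two subtractive structures named above can be sketched for a toy synchronous, noise-free CDMA link. This is only an illustration of plain SIC and a single PIC stage, not the mixed structure the paper proposes; the spreading codes, amplitudes, and function names are hypothetical, and the PIC stage assumes the user amplitudes are known.

```python
def correlate(signal, code):
    """Matched-filter output: normalized inner product with a spreading code."""
    return sum(x * c for x, c in zip(signal, code)) / len(code)

def sic_detect(received, codes):
    """SIC: detect the strongest remaining user, subtract it, repeat (serial)."""
    residual = list(received)
    remaining = set(range(len(codes)))
    bits = [0] * len(codes)
    while remaining:
        k = max(remaining, key=lambda u: abs(correlate(residual, codes[u])))
        y = correlate(residual, codes[k])      # estimate of amplitude * bit
        bits[k] = 1 if y >= 0 else -1
        residual = [x - y * c for x, c in zip(residual, codes[k])]
        remaining.remove(k)
    return bits

def pic_stage(received, codes, amplitudes, prev_bits):
    """One PIC stage: every user cancels all other users at the same time."""
    new_bits = []
    for k, code in enumerate(codes):
        cleaned = list(received)
        for j, other in enumerate(codes):
            if j != k:
                cleaned = [x - amplitudes[j] * prev_bits[j] * c
                           for x, c in zip(cleaned, other)]
        new_bits.append(1 if correlate(cleaned, code) >= 0 else -1)
    return new_bits

# Hypothetical non-orthogonal codes and unequal received amplitudes.
codes = [[1, 1, 1, 1], [1, 1, 1, -1], [1, -1, 1, 1]]
amplitudes = [4.0, 2.0, 1.0]
sent = [1, -1, -1]
received = [sum(a * b * s[i] for a, b, s in zip(amplitudes, sent, codes))
            for i in range(4)]

conventional = [1 if correlate(received, s) >= 0 else -1 for s in codes]
sic_bits = sic_detect(received, codes)
pic_bits = pic_stage(received, codes, amplitudes, conventional)
print(conventional, sic_bits, pic_bits)
```

In this toy case the conventional matched-filter detector gets the two weaker users wrong, while both cancellation schemes recover the transmitted bits. The SIC loop is inherently serial, which is the source of its processing delay; the PIC stage repeats the cancellation for every user, which is the source of its hardware cost.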

Radix-4 Trellis Parallel Architecture and Trace Back Viterbi Decoder with Backward State Transition Control (Radix-4 트렐리스 병렬구조 및 역방향 상태천이의 제어에 의한 역추적 비터비 디코더)

  • 정차근
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.5 / pp.397-409 / 2003
  • This paper describes the implementation of a Viterbi decoder that combines a radix-4 trellis parallel architecture with trace back decoding under backward state transition control, and presents the results of applying it to a high-speed wireless LAN. Compared with an M-step trellis architecture, the radix-4 parallelized Viterbi decoder improves throughput with a simple structure while incurring little processing delay and little overhead circuitry. Based on these features, this paper presents a novel Viterbi decoder for the radix-4 parallelized trellis structure, composed of branch metric computation, an ACS architecture, and trace back decoding by sequential control of backward state transitions. With the proposed architecture, decoding of variable code rates obtained by puncturing the base code can easily be implemented in a single unified Viterbi decoder. Moreover, the proposed decoder requires no additional circuits or peripheral control logic. The trace back decoding scheme with backward state transition control performs sequential decoding in step with the ACS cycle clock, without additional circuitry for survivor memory control. To evaluate its usefulness, the proposed method is applied to the channel CODEC of the IEEE 802.11a high-speed wireless LAN, and HDL coding simulation results are presented.

Design of an Efficient Bit-Parallel Multiplier using Trinomials (삼항 다항식을 이용한 효율적인 비트-병렬 구조의 곱셈기)

  • 정석원;이선옥;김창한
    • Journal of the Korea Institute of Information Security & Cryptology / v.13 no.5 / pp.179-187 / 2003
  • Efficient implementation of finite field operations has recently received a lot of attention. Among the GF($2^m$) arithmetic operations, multiplication is the most basic and the critical operation that determines hardware speed. We propose a hardware architecture that uses the Mastrovito method to reduce processing time. Existing Mastrovito multipliers using the special generating trinomial $p(x)=x^m+x^n+1$ require $m^2-1$ XOR gates and $m^2$ AND gates. The proposed multiplier needs $m^2$ AND gates and $m^2+(n^2-3n)/2$ XOR gates, a count that depends on the intermediate term $x^n$. The time complexity of existing multipliers is $T_A+((m-2)/(m-n)+1+\log_2(m))T_X$, and that of the proposed method is $T_X+(1+\log_2(m-1)+n/2)T_X$. The proposed architecture is efficient for the extension degrees m recommended in the standards SEC2 and ANSI X9.63. On average, XOR space complexity increases by 1.18%, while time complexity is reduced by 9.036%.
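
For readers unfamiliar with trinomial-based field arithmetic, the operation being accelerated can be sketched in software. The code below is a plain shift-and-XOR multiplication followed by reduction modulo $x^m+x^n+1$; it is not the Mastrovito matrix formulation or the proposed hardware, and the function name `gf2m_mul_trinomial` is illustrative. Field elements are represented as integers whose bits are polynomial coefficients.

```python
def gf2m_mul_trinomial(a, b, m, n):
    """Multiply a and b in GF(2^m) defined by x^m + x^n + 1, with 0 < n < m."""
    # Carry-less polynomial product: additions over GF(2) are XORs.
    prod = 0
    for i in range(m):
        if (b >> i) & 1:
            prod ^= a << i
    # Reduce from the highest degree down, using
    # x^k = x^(k-m+n) + x^(k-m) for every k >= m.
    for k in range(2 * m - 2, m - 1, -1):
        if (prod >> k) & 1:
            prod ^= (1 << k) | (1 << (k - m + n)) | (1 << (k - m))
    return prod

# GF(2^3) with x^3 + x + 1: x * x^2 = x^3 = x + 1.
print(gf2m_mul_trinomial(0b010, 0b100, 3, 1))
```

Because the reduction polynomial has only three terms, each high-order bit folds back into just two lower positions, which is what keeps the XOR count of trinomial-based multipliers low.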

High Voltage Electron Microscopy of Structural Patterns of Plastid Crystalline Bodies in Sedum rotundifolium (HVEM에 의한 둥근잎꿩의 비름 (Sedum rotundifolium L.) 색소체의 결정체 구조)

  • Kim, In-Sun
    • Applied Microscopy / v.36 no.2 / pp.73-82 / 2006
  • Major contributions have been made to cellular ultrastructure studies through the use of high voltage electron microscopy (HVEM) and tomography. Applications of HVEM, accompanied by appropriate image processing, have greatly improved the analysis of three-dimensional cellular structures. In the present study, structural patterns of the crystalline bodies that are distinguished in mesophyll plastids of CAM-performing Sedum rotundifolium L. were investigated using HVEM and tomography. Tilting and diffraction pattern analysis were performed during the investigation. Tilting was performed at $\pm 60^{\circ}$ with $2^{\circ}$ increments while examining serial sections ranging from 0.125 to $1{\mu}m$ in thickness. The young plastids exhibited crystalline inclusion bodies with a peculiar structural pattern. They were irregular in shape and variable in size, and their structural attributes affected the plastid morphology. Each body consisted of a large number of tubular elements, often reaching several thousand. The tubular elements typically aggregated to form a cluster. The elements showed either a parallel or a lattice arrangement, depending on the sectioning angle. The distance between the elements was approximately 20nm, as demonstrated by the diffraction analysis. HVEM examination of the serial sections revealed occasional fusion or branching of elements within the inclusion bodies. Finally, a three-dimensional reconstruction of the plastid crystalline bodies was attempted using two different image processing methods.

Evaluation of bonding state of tunnel shotcrete using impact-echo method - numerical analysis (충격 반향 기법을 이용한 숏크리트 배면 접착 상태 평가에 관한 수치해석적 연구)

  • Song, Ki-Il;Cho, Gye-Chun;Chang, Seok-Bue
    • Journal of Korean Tunnelling and Underground Space Association / v.10 no.2 / pp.105-118 / 2008
  • Shotcrete is one of the main support materials in tunnelling. Its bonding state on excavated rock surfaces controls the safety of the tunnel: de-bonding of shotcrete from an excavated surface decreases tunnel safety. The bonding state of shotcrete is affected by blasting during excavation at the tunnel face as well as at the bench cut. Generally, the bonding state of shotcrete can be classified as void, de-bonded, or fully bonded. In this study, the state of the back surface of shotcrete is investigated using impact-echo (IE) techniques. Numerical simulation of the IE technique is performed with ABAQUS. Signals obtained from the IE simulations were analyzed in the time, frequency, and time-frequency domains. Using an integrated active signal processing technique coupled with Short-Time Fourier Transform (STFT) analysis, the bonding state of the shotcrete can be evaluated accurately. As the bonding state worsens, the amplitude of the first peak after the maximum amplitude in the time domain waveform and the maximum energy of the autospectral density both increase. The resonance frequency becomes detectable and calculable, and the contour in the time-frequency domain has a long tail parallel to the time axis. Signal characteristics with respect to ground condition were obtained for the fully bonded condition. As the ground condition worsens, the tail parallel to the time axis lengthens and the contour is located in the low frequency range below 10 kHz.

Performance Analysis of Noncoherent OOK UWB Transceiver for LR-WPAN (저속 WPAN용 비동기 OOK 방식 UWB 송수신기 성능 분석)

  • Ki Myoungoh;Choi Sungsoo;Oh Hui-Myoung;Kim Kwan-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.11A / pp.1027-1034 / 2005
  • IEEE802.15.4a, which was initiated to realize a PHY layer that includes high-precision ranging/positioning and low-data-rate communication functions, requires a simple, low-power transceiver architecture. To satisfy these requirements, a simple noncoherent on-off keying (OOK) UWB transceiver is proposed, with parallel energy window banks (PEWB) that provide a high-precision signal processing interface. The flexibility of the proposed system in multipath fading channel environments is obtained with the pulse and bit repetition method. To analyze the bit error rate (BER) performance of the proposed system, a noise model of the receiver is derived using a commonly used random variable distribution, the chi-square distribution. A BER of $10^{-5}$ under the line-of-sight (LOS) residential channel is achieved with an integration time of 32 ns and a signal to noise ratio (SNR) of 15.3 dB. For the non-line-of-sight (NLOS) outdoor channel, an integration time of 72 ns and an SNR of 16.2 dB are needed. The ratio of integrated energy to total received energy (IRR) for the best BER performance is about $86\%$.
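
The principle behind noncoherent OOK reception can be shown with a minimal energy detector. The sketch below integrates the squared samples over one window per bit and compares the result with a threshold; it uses a single window rather than the parallel energy window banks of the paper, and the function name, sample values, and threshold are all hypothetical.

```python
def ook_energy_detect(samples, samples_per_bit, threshold):
    """Decide each bit from the energy collected in its integration window."""
    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        window = samples[start:start + samples_per_bit]
        energy = sum(x * x for x in window)   # squaring discards phase and sign
        bits.append(1 if energy > threshold else 0)
    return bits

# Three bit periods of four samples each: pulse, near silence, pulse.
received = [0.9, -1.1, 1.0, -0.8,
            0.1, -0.05, 0.0, 0.1,
            -1.0, 0.9, -1.2, 1.1]
decoded = ook_energy_detect(received, samples_per_bit=4, threshold=1.0)
print(decoded)  # [1, 0, 1]
```

Because only the energy is used, the receiver needs no carrier or pulse-phase recovery. The window length plays the role of the integration time discussed in the abstract: a longer window collects more multipath energy but also more noise.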

An Online Scaling Method for Improving the Availability of a Database Cluster (데이터베이스 클러스터의 가용성 향상을 위한 온라인 확장 기법)

  • Lee, Chung-Ho;Jang, Yong-Il;Bae, Hae-Yeong
    • The KIPS Transactions:PartD / v.10D no.6 / pp.935-948 / 2003
  • An online scaling method adds new nodes to a shared-nothing database cluster and reorganizes tables while the system is running. The objective is to share the workload among many nodes and increase the capacity of the cluster system. The existing online scaling method, however, has two problems. One is the degradation of response time and transaction throughput caused by the additional overhead of data transfer and replica consistency. The other is an inefficient recovery mechanism in which the entire scaling transaction is aborted by a fault. These problems reduce the availability of the shared-nothing database cluster. To avoid the additional overhead throughout the scaling period, our scaling method consists of two phases: a parallel data transfer phase and a combination phase. The parallel data transfer phase reduces the size of each data transfer by dividing the data according to the number of replicas. The combination phase combines the transferred data using the resources of spare nodes. Our method also reduces the possibility of failure throughout the scaling period and improves the availability of the database cluster.

The Performance Analysis of GPU-based Cloth simulation according to the Change of Work Group Configuration (워크 그룹 구성 변화에 따른 GPU 기반 천 시뮬레이션의 성능 분석)

  • Choi, Young-Hwan;Hong, Min;Lee, Seung-Hyun;Choi, Yoo-Joo
    • Journal of Internet Computing and Services / v.18 no.3 / pp.29-36 / 2017
  • Today, 3D dynamic simulation is closely tied to many industries. In the past, physically-based 3D simulation was used mainly in car crash or construction-related fields, but it now also plays an important role in movies and games. Representing a 3D object realistically requires many mathematical computations, and it is difficult to process such a large amount of calculation in real time on a CPU. Recently, with advanced graphics hardware and improved architectures, the GPU can be used for general-purpose computation as well as graphics computation. GPU-based approaches have been applied in various research fields. In this paper, we analyze how the performance of two GPU-based cloth simulation algorithms varies with the execution properties of GPU shaders, in order to optimize the performance of GPU-based cloth simulation. Cloth simulation is implemented with the spring-centric algorithm and the node-centric algorithm, using GPU parallel computing with the compute shader of GLSL 4.3. We compare the performance of these algorithms as the size and dimension of the work group change. Each test is repeated 10 times over 5,000 frames, and the experimental results are reported as average FPS. The experimental results show that the node-centric algorithm runs faster than the spring-centric algorithm.