• Title/Summary/Keyword: Vector Architecture

Detection of structural damage via free vibration responses by extended Kalman filter with Tikhonov regularization scheme

  • Zhang, Chun;Huang, Jie-Zhong;Song, Gu-Quan;Dai, Lin;Li, Huo-Kun
    • Structural Monitoring and Maintenance / v.3 no.2 / pp.115-127 / 2016
  • Assessing the location and extent of structural damage from vibration measurements is a challenging problem. In this paper, an improved extended Kalman filter (EKF) with Tikhonov regularization is proposed to identify structural damage. The state vector of the EKF consists of the initial values of the modal coordinates and the damage parameters of the structural elements; as a result, the recursive formulas of the EKF are simplified, and the modal truncation technique can be used to reduce the dimension of the state vector. Tikhonov regularization is then introduced into the EKF to restrain the effect of measurement noise and improve the solution of the ill-posed inverse problem. Numerical simulations of a seven-story shear-beam structure and a simply supported beam show that the proposed method is robust and can accurately identify single or multiple damages when the initial structural state is unknown.
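
As generic background (not the paper's EKF-specific formulation), Tikhonov regularization stabilizes an ill-posed least-squares problem by penalizing the size of the solution. The symbols $A$, $b$, $x$, and $\lambda$ are illustrative and do not come from the abstract:

$$
\hat{x}_\lambda = \arg\min_x \left( \lVert A x - b \rVert_2^2 + \lambda \lVert x \rVert_2^2 \right) = \left( A^{\mathsf T} A + \lambda I \right)^{-1} A^{\mathsf T} b, \qquad \lambda > 0 .
$$

A larger $\lambda$ damps the amplification of measurement noise at the cost of some bias in the estimate.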

A Multithreaded Architecture for the Efficient Execution of Vector Computations (벡타 연산을 효율적으로 수행하기 위한 다중 스레드 구조)

  • Yun, Seong-Dae;Jeong, Gi-Dong
    • The Transactions of the Korea Information Processing Society / v.2 no.6 / pp.974-984 / 1995
  • This paper presents the design of MULVEC (MULtithreaded architecture for VEctor Computations), a high-performance building block for massively parallel processing systems. MULVEC is derived from a synthesis of the dataflow model and the existing superscalar RISC microprocessor. Using status fields, MULVEC reduces the number of synchronizations when vector computations are repeated within the same thread segment, and it also reduces context switching, network traffic, and similar overheads. Benchmark programs were simulated on a SPARCstation 20 (a superscalar RISC microprocessor), and the performance of MULVEC (program execution time and processor utilization) and of *T (program execution time) was analyzed for different numbers of nodes. We observed that programs execute about 1-2 times faster on MULVEC than on *T, depending on the number of nodes and the number of loop repetitions.

Hierarchical Architecture of Multilayer Perceptrons for Performance Improvement (다층퍼셉트론의 계층적 구조를 통한 성능향상)

  • Oh, Sang-Hoon
    • The Journal of the Korea Contents Association / v.10 no.6 / pp.166-174 / 2010
  • Based on the theoretical result that multilayer feedforward neural networks with enough hidden nodes are universal approximators, three-layer MLPs (multilayer perceptrons) consisting of input, hidden, and output layers are commonly used for many application problems. However, this conventional three-layer architecture generalizes poorly in some applications that are complex and have varied features in the input vector. To improve performance, this paper proposes a hierarchical MLP architecture, intended especially for cases where each part of the input carries its own specific information. That is, one input vector is divided into sub-vectors, and each sub-vector is presented to a separate MLP. These lower-level MLPs are connected to a higher-level MLP, which makes the final decision. The proposed method is verified through simulation of a protein disorder prediction problem.
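
The hierarchical arrangement described in this abstract can be sketched as a forward pass: split the input vector, feed each sub-vector to its own lower-level MLP, and pass the concatenated outputs to a higher-level MLP. This is only an illustrative sketch. The layer sizes, `tanh` activation, random weights, and all function names are assumptions, not details from the paper, and no training is shown.

```python
import math
import random

def make_mlp(n_in, n_hidden, n_out, rng):
    """Random weights for a three-layer MLP; sizes are illustrative assumptions."""
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    return w1, w2

def layer(weights, x):
    # Each row holds one unit's weights; the last entry of the row is the bias.
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in weights]

def forward(mlp, x):
    w1, w2 = mlp
    return layer(w2, layer(w1, x))

def hierarchical_forward(lower_mlps, upper_mlp, x, split_sizes):
    # Divide the input vector into sub-vectors, one per lower-level MLP.
    outputs, start = [], 0
    for mlp, size in zip(lower_mlps, split_sizes):
        outputs.extend(forward(mlp, x[start:start + size]))
        start += size
    # The higher-level MLP makes the final decision from the lower-level outputs.
    return forward(upper_mlp, outputs)

rng = random.Random(0)
split_sizes = [3, 2, 4]  # hypothetical partition of a 9-element input vector
lower = [make_mlp(n, 4, 2, rng) for n in split_sizes]
upper = make_mlp(2 * len(split_sizes), 5, 1, rng)
x = [rng.uniform(-1, 1) for _ in range(sum(split_sizes))]
y = hierarchical_forward(lower, upper, x, split_sizes)
```

Each lower-level MLP only sees its own sub-vector, so its hidden layer can specialize on that part of the input before the higher-level network combines the results.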

Design of Low Complexity and High Throughput Encoder for Structured LDPC Codes (구조적 LDPC 부호의 저복잡도 및 고속 부호화기 설계)

  • Jung, Yong-Min;Jung, Yun-Ho;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.10 / pp.61-69 / 2009
  • This paper presents the design of a low-complexity, high-throughput LDPC encoder structure. To address the high complexity of the LDPC encoder, a simplified matrix-vector multiplier is proposed in place of the conventional complex matrix-vector multiplier. The proposed encoder also adopts a partially parallel structure and performs column-wise operations in the matrix-vector multiplication to achieve high throughput. Implementation results show that the proposed architecture reduces the number of logic gates and memory elements by 37.4% and 56.7%, respectively, compared with the existing five-stage pipelined architecture. The proposed encoder also supports 800 Mbps throughput at a 40 MHz clock frequency, about three times that of the existing architecture.
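
To illustrate what a column-wise matrix-vector multiplication means, the sketch below computes y = Hx over GF(2) by accumulating one column of H per input bit, and compares it with the conventional row-wise form. It assumes a binary code; the small matrix `H` and the function names are hypothetical, and the paper's simplified multiplier and partially parallel hardware structure are not reproduced here.

```python
def matvec_gf2_columnwise(H, x):
    """Compute y = H x over GF(2), accumulating one column of H per input bit."""
    rows = len(H)
    y = [0] * rows
    for j, bit in enumerate(x):
        if bit:  # a zero input bit contributes nothing
            for i in range(rows):
                y[i] ^= H[i][j]  # addition over GF(2) is XOR
    return y

def matvec_gf2_rowwise(H, x):
    """Reference: conventional row-wise dot products modulo 2."""
    return [sum(h & b for h, b in zip(row, x)) % 2 for row in H]

# Hypothetical 3x4 binary matrix and input vector, for illustration only.
H = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]
x = [1, 0, 1, 1]
y = matvec_gf2_columnwise(H, x)
```

In the column-wise form each input bit can be consumed as soon as it arrives, which is one reason such an ordering can suit a streaming encoder.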

Vector mechanics-based simulation of large deformation behavior in RC shear walls using planar four-node elements

  • Zhang, Hongmei;Shan, Yufei;Duan, Yuanfeng;Yun, Chung Bang;Liu, Song
    • Structural Engineering and Mechanics / v.74 no.1 / pp.1-18 / 2020
  • For the large deformation of shear walls under vertical and horizontal loads, there are difficulties in obtaining accurate simulation results using the response analysis method, even with fine mesh elements. Furthermore, concrete material nonlinearity, stiffness degradation, concrete cracking and crushing, and steel bar damage may occur during the large deformation of reinforced concrete (RC) shear walls. Matrix operations that are involved in nonlinear analysis using the traditional finite-element method (FEM) may also result in flaws, and may thus lead to serious errors. To solve these problems, a planar four-node element was developed based on vector mechanics. Owing to particle-based formulation along the path element, the method does not require repeated constructions of a global stiffness matrix for the nonlinear behavior of the structure. The nonlinear concrete constitutive model and bilinear steel material model are integrated with the developed element, to ensure that large deformation and damage behavior can be addressed. For verification, simulation analyses were performed to obtain experimental results on an RC shear wall subjected to a monotonically increasing lateral load with a constant vertical load. To appropriately evaluate the parameters, investigations were conducted on the loading speed, meshing dimension, and the damping factor, because vector mechanics is based on the equation of motion. The static problem was then verified to obtain a stable solution by employing a balanced equation of motion. Using the parameters obtained, the simulated pushover response, including the bearing capacity, deformation ability, curvature development, and energy dissipation, were found to be in accordance with the experimental observation. This study demonstrated the potential of the developed planar element for simulating the entire process of large deformation and damage behavior in RC shear walls.

Robustness of Differentiable Neural Computer Using Limited Retention Vector-based Memory Deallocation in Language Model

  • Lee, Donghyun;Park, Hosung;Seo, Soonshin;Son, Hyunsoo;Kim, Gyujin;Kim, Ji-Hwan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.3 / pp.837-852 / 2021
  • Recurrent neural network (RNN) architectures have been used for language modeling (LM) tasks that require learning long-range word or character sequences. However, the RNN architecture still suffers from unstable gradients on long-range sequences. To address the issue of long-range sequences, an attention mechanism has been used, showing state-of-the-art (SOTA) performance in all LM tasks. A differentiable neural computer (DNC) is a deep learning architecture using an attention mechanism. The DNC architecture is a neural network augmented with a content-addressable external memory. However, in the write operation, some information unrelated to the input word remains in memory. Moreover, DNCs have been found to perform poorly with low numbers of weight parameters. Therefore, we propose a robust memory deallocation method using a limited retention vector. The limited retention vector determines whether the network increases or decreases its usage of information in external memory according to a threshold. We experimentally evaluate the robustness of a DNC implementing the proposed approach according to the size of the controller and external memory on the enwik8 LM task. When we decreased the number of weight parameters by 32.47%, the proposed DNC showed a low bits-per-character (BPC) degradation of 4.30%, demonstrating the effectiveness of our approach in language modeling tasks.

A design of CAVLC(Context-Adaptive Variable Length Coding) for H.264 (H.264 CAVLC(Context-Adaptive Variable Length Coding)설계)

  • Lee, Yong-Ju;Suh, Ki-Bum
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.10a / pp.108-111 / 2008
  • In this paper, we propose an advanced hardware architecture for a CAVLC entropy encoder engine for real-time Full HD video compression. Each macroblock contains 384 data coefficients (376 AC coefficients plus 8 DC coefficients), so in the worst case 384 coefficients must be processed per macroblock for real-time processing. We propose a novel architecture that combines parallel architecture and pipeline processing with a reduction of "0" in the AC/DC coefficient table. To verify the proposed architecture, we developed reference C code for CAVLC and verified the designed circuit with test vectors from the reference C code.

Design of Architecture of Programmable Stack-based Video Processor with VHDL (VHDL을 이용한 프로그램 가능한 스택 기반 영상 프로세서 구조 설계)

  • 박주현;김영민
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.4 / pp.31-43 / 1999
  • The main goal of this paper is to design a high-performance SVP (Stack-based Video Processor) for network applications. The SVP is a comprehensive scheme, 'better' in the sense that it is an optimal selection of previously proposed enhancements of a stack machine and a video processor. It can effectively process object-based video data using an S-RISC (Stack-based Reduced Instruction Set Computer) with a semi-general-purpose architecture that has a stack buffer for OOP (Object-Oriented Programming) programs, which call many small procedures at run time. It also includes a vector processor that improves MPEG coding speed. The vector processor in the SVP can execute advanced-mode motion compensation, half-pixel motion prediction, and the SA-DCT (Shape-Adaptive Discrete Cosine Transform) of MPEG-4. Absolutors and halfers in the vector processor make the architecture extensible to an encoder. We also designed a VLSI stack-oriented video processor using the proposed stack-oriented video decoding architecture. It was designed with a 0.5 $\mu$m 3LM standard-cell technology and has 110K logic gates and a 12 Kbit internal SRAM buffer. The operating frequency is 50 MHz. It executes video decoding algorithms for QCIF at 15 fps (frames per second), the maximum rate of VLBV (Very Low Bitrate Video) in MPEG-4.

Vectorization of an Explicit Finite Element Method on Memory-to-Memory Type Vector Computer (Memory-to-Memory방식 벡터컴퓨터에서의 외연적 유한요소법의 벡터화)

  • 이지호;이재석
    • Computational Structural Engineering / v.4 no.1 / pp.95-108 / 1991
  • An explicit finite element method can be executed more rapidly and effectively on a vector computer than on a scalar computer because its structure is well suited to vector processing. In this paper, an efficient vectorization method for an explicit finite element program on a memory-to-memory type vector computer is proposed. First, a general vectorization method that can be applied regardless of the vector architecture is investigated; then a method suited to the memory-to-memory type vector computer is proposed. To illustrate the usefulness of the proposed vectorization method, DYNA3D, an existing explicit finite element program, is ported to the HDS AS/XL V50, a memory-to-memory type vector computer. Performance results from actual tests show a vector/scalar speedup above 2.4.

Implementation of MDCT core in Digital-Audio with Micro-program type vector processor

  • Ku Dae Sung;Choi Hyun Yong;Ra Kyung Tae;Hwang Jung Yeun;Kim Jong Bin
    • Proceedings of the IEEK Conference / 2004.08c / pp.477-481 / 2004
  • High-quality CD and DAT audio requires a large amount of data. Multichannel audio has recently spread rapidly among users. MPEG (Moving Picture Experts Group) provides data compression technology for sound and image systems. The MPEG standard provides multichannel and 5.1 sound using the same audio algorithm as MPEG-1, and MPEG-2 audio is forward and backward compatible. The MDCT (Modified Discrete Cosine Transform) is a linear orthogonal lapped transform based on the idea of TDAC (Time Domain Aliasing Cancellation). In this paper, we propose a micro-program type vector processor architecture suited to the MDCT/IMDCT of MPEG-II AAC. It also reduces the number of coefficient operations by binding the overlapped area. Comparing the original algorithm with the optimized algorithm, cosine coefficients were reduced by 0.5%, multiply operations by 0.098%, and add operations by 80.58%. The algorithm was tested in C, and we then designed a micro-programmed hardware architecture that applies the optimized algorithm. The processor operates at 20 MHz and 5 V.
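
The TDAC property mentioned in this abstract can be checked numerically with a direct (unoptimized) MDCT. The sketch below is a textbook-style definition, not the paper's optimized algorithm or its micro-programmed hardware. The 1/N scaling of the inverse, the absence of a window function, and all names are assumptions chosen so that overlap-adding neighboring blocks cancels the time-domain aliasing.

```python
import math

def mdct(block):
    """Direct MDCT of a block of 2N samples, giving N coefficients."""
    n_half = len(block) // 2
    return [
        sum(
            block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]

def imdct(coeffs):
    """Inverse MDCT: N coefficients back to 2N time-aliased samples (1/N scaling)."""
    n_half = len(coeffs)
    return [
        sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half)
        ) / n_half
        for n in range(2 * n_half)
    ]

def overlap_add(signal, n_half):
    """Transform 50%-overlapping blocks and sum the inverse transforms."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        y = imdct(mdct(signal[start:start + 2 * n_half]))
        for i, v in enumerate(y):
            out[start + i] += v
    return out

N = 4  # coefficients per block; each block spans 2N samples
signal = [math.sin(0.3 * i) + 0.1 * i for i in range(4 * N)]
rebuilt = overlap_add(signal, N)
# Samples covered by two overlapping blocks (indices N .. 3N-1) are recovered;
# the first and last N samples are covered once and remain aliased.
max_err = max(abs(rebuilt[i] - signal[i]) for i in range(N, 3 * N))
```

A single block's inverse transform does not return the original samples; only the sum of two overlapping blocks does, which is what "time domain aliasing cancellation" refers to.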
