• Title/Summary/Keyword: Fast Computation

Intra 16$\times$16 Mode Decision Using Subset of Transform Coefficients in H.264/AVC (H.264/AVC에서 변환계수의 부분집합을 사용한 인트라 16$\times$16 예측 모드 선택 방법)

  • Lim, Sang-Hee;Lee, Seong-Won;Paik, Joon-Ki
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.6 / pp.54-62 / 2007
  • This paper significantly reduces the amount of computation for intra 16$\times$16 mode decision in H.264 by applying a fast algorithm that obtains the transformed prediction residual with fewer computations. We extend an existing intra 4$\times$4 mode decision method to propose a new algorithm for fast intra 16$\times$16 mode decision. In the intra 16$\times$16 mode decision, the proposed algorithm uses partial transform coefficients, consisting of one DC coefficient and three adjacent AC coefficients, obtained after the 4$\times$4 transform. Theoretical analysis and experimental results show that the proposed algorithm reduces computation in the intra 16$\times$16 mode decision process by up to 50% with unnoticeable degradation.
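
The idea of costing a residual block from only a few transform coefficients can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the H.264 4$\times$4 forward core transform, assumes that the "three adjacent AC coefficients" are the three positions next to the DC term, and uses a plain sum of absolute coefficients as the cost. The names `SUBSET`, `coeff`, `partial_cost`, and `full_cost` are hypothetical.

```python
# Rows of the H.264 4x4 forward core transform matrix Cf (Y = Cf * X * Cf^T).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

# Assumption: DC plus the three AC positions adjacent to it.
SUBSET = [(0, 0), (0, 1), (1, 0), (1, 1)]


def coeff(block, u, v):
    """Compute the single coefficient Y[u][v] of a 4x4 residual block."""
    return sum(
        CF[u][i] * block[i][j] * CF[v][j]
        for i in range(4)
        for j in range(4)
    )


def partial_cost(residual):
    """Cost from the 4 selected coefficients only."""
    return sum(abs(coeff(residual, u, v)) for u, v in SUBSET)


def full_cost(residual):
    """Cost from all 16 coefficients, for comparison."""
    return sum(abs(coeff(residual, u, v)) for u in range(4) for v in range(4))
```

Because the partial cost sums a subset of the non-negative terms in the full cost, it never exceeds the full cost. For a flat residual block all the energy sits in the DC term, so the two agree.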

Fast Fuzzy Inference Algorithm for Fuzzy System constructed with Triangular Membership Functions (삼각형 소속함수로 구성된 퍼지시스템의 고속 퍼지추론 알고리즘)

  • Yoo, Byung-Kook
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.1 / pp.7-13 / 2002
  • Most applications of fuzzy theory are based on fuzzy inference. However, fuzzy inference takes considerable computation time for a fuzzy system with many input variables or many fuzzy labels defined on each variable. The inference time depends on the number of arithmetic products in the computation. In particular, inference time is a primary constraint in fuzzy control applications that use a microprocessor or PC-based controller. This paper proposes a simple fast fuzzy inference algorithm (FFIA) that reduces the inference time, without loss of information, for fuzzy systems with triangular membership functions in the antecedent part of the fuzzy rules. The proposed algorithm is derived by partitioning the input state space and applying simple geometrical analysis. This scheme achieves the same effect as fuzzy rule reduction.
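
One way to see why partitioning the input space helps is sketched below. It is not the paper's FFIA. It assumes overlapping triangular membership functions whose peaks sit at sorted `centers` (so any input activates at most two labels), product inference, and crisp rule consequents combined by a weighted average. All function and parameter names are hypothetical.

```python
from bisect import bisect_right
from itertools import product


def active_labels(x, centers):
    """Return [(label_index, membership)] for the at most two triangles covering x."""
    if x <= centers[0]:
        return [(0, 1.0)]
    if x >= centers[-1]:
        return [(len(centers) - 1, 1.0)]
    k = bisect_right(centers, x) - 1  # centers[k] <= x < centers[k + 1]
    right = (x - centers[k]) / (centers[k + 1] - centers[k])
    return [(k, 1.0 - right), (k + 1, right)]


def fast_infer(inputs, centers_per_input, consequents):
    """Evaluate only the rules whose antecedent labels are all active.

    consequents maps a tuple of label indices to a crisp output value
    (an assumed rule-table layout).
    """
    per_input = [active_labels(x, c) for x, c in zip(inputs, centers_per_input)]
    num = den = 0.0
    for combo in product(*per_input):
        weight = 1.0
        for _, mu in combo:
            weight *= mu  # product inference
        rule = tuple(idx for idx, _ in combo)
        num += weight * consequents[rule]
        den += weight
    return num / den
```

With n inputs, at most 2^n rules are evaluated regardless of how many labels each variable has. All skipped rules have zero firing strength, so the result matches a full evaluation under the stated assumptions.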

An Efficient Motion Search Algorithm for a Media Processor (미디어 프로세서에 적합한 효율적인 움직임 탐색 알고리즘)

  • Noh Dae-Young;Kim Seang-Hoon;Sohn Chae-Bong;Oh Seoung-Jun;Ahn Chang-Beam
    • Journal of Broadcast Engineering / v.9 no.4 s.25 / pp.434-445 / 2004
  • Motion estimation is an essential module in video encoders based on international standards such as H.263 and MPEG. Many fast motion estimation algorithms have been proposed to reduce the computational complexity of the well-known full search (FS) algorithm. However, these fast algorithms do not work efficiently on DSP processors recently developed for video processing. To solve this problem, we propose an efficient motion estimation scheme optimized for DSP processors such as the Philips TM1300. To reduce computation time, the proposed scheme pre-estimates a motion vector predictor and chooses a small search range, exploiting the strong motion vector correlation between the current macroblock (MB) and its neighboring MBs. An MPEG-4 SP@L3 (Simple Profile at Level 3) encoding system is implemented on the Philips TM1300 to verify the effectiveness of the proposed method. On that processor, our method achieves better performance than conventional methods while keeping visual quality as good as that of FS.
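
A predictor-centered search of this kind can be sketched as follows. The sketch assumes a component-wise median of the neighboring MBs' motion vectors as the predictor and a SAD cost over a small square window. The abstract does not state the exact predictor or window, so these choices and all names are illustrative only.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced match."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def median_predictor(neighbor_mvs):
    """Component-wise median of the neighbors' motion vectors (assumed predictor)."""
    xs = sorted(mv[0] for mv in neighbor_mvs)
    ys = sorted(mv[1] for mv in neighbor_mvs)
    return xs[len(xs) // 2], ys[len(ys) // 2]


def predictive_search(cur, ref, bx, by, neighbor_mvs, n=4, radius=1):
    """Search only a small window centered on the predicted motion vector.

    Returns (cost, (dx, dy)), or None if every candidate falls outside ref.
    """
    px, py = median_predictor(neighbor_mvs)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(py - radius, py + radius + 1):
        for dx in range(px - radius, px + radius + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

With `radius=1` the search evaluates at most 9 candidates per block, compared with (2R+1)^2 candidates for a full search of range R.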

A Fast SAD Algorithm for Area-based Stereo Matching Methods (영역기반 스테레오 영상 정합을 위한 고속 SAD 알고리즘)

  • Lee, Woo-Young;Kim, Cheong Ghil
    • Journal of Satellite, Information and Communications / v.7 no.2 / pp.8-12 / 2012
  • Area-based stereo matching algorithms are widely used in image analysis for stereo vision. The SAD (Sum of Absolute Differences) algorithm is one of the best-known area-based stereo matching algorithms and is a data-intensive computing application. It therefore requires very high computational capability, and a software implementation runs very slowly. This paper proposes a fast SAD algorithm that uses SSE (Streaming SIMD Extensions) instructions, which are based on SIMD (Single Instruction Multiple Data) parallelism. A CPU supporting SSE instructions has 16 XMM registers of 128 bits each. To evaluate the proposed scheme, we compare the processing speed of SAD with and without SSE instructions. The proposed scheme achieves a fourfold performance improvement over the general SAD, which shows that a real-time software implementation of the SAD algorithm is feasible.
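
For reference, the scalar SAD matching that such a scheme accelerates can be sketched as below. This shows only the baseline computation, not the SSE implementation. It assumes rectified grayscale images stored as lists of rows and a winner-takes-all choice of disparity. The function names and the window parameter `half` are hypothetical.

```python
def sad_window(left, right, x, y, d, half):
    """SAD between the window around (x, y) in left and (x - d, y) in right."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def best_disparity(left, right, x, y, max_disp, half=1):
    """Return (disparity, cost) with the lowest SAD among 0..max_disp."""
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if x - half - d < 0:  # window would leave the right image
            break
        cost = sad_window(left, right, x, y, d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost
```

The inner loops perform the same subtract, absolute-value, and accumulate step on neighboring pixels, which is the pattern that SIMD instructions process several pixels at a time.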

Solver for the Navier-Stokes Equations by using Initial Guess Velocity (속도의 초기간 추정을 사용한 Navier-Stokes방정식 풀이 기법)

  • Kim, Young-Hee;Lee, Sung-Kee
    • Journal of KIISE:Computer Systems and Theory / v.32 no.9 / pp.445-456 / 2005
  • We propose a fast and accurate solver for the Navier-Stokes equations for physics-based fluid simulation. Our method uses the solution of the Stokes equations as an initial guess for the velocity in the nonlinear term of the Navier-Stokes equations. By choosing an initial velocity close to the exact solution of the given nonlinear differential equations, we obtain a remarkably accurate and stable fluid solver. Our solver is based on an implicit finite difference scheme, which makes it work well for large time steps. Because we employ the ADI method, our solver is also fast and has a uniform computation time. The experimental results show that our solver performs well for fluids with high Reynolds numbers, such as smoke and clouds.

Fast Intra Prediction Mode Decision Algorithm Using Directional Gradients For H.264 (방향성 기울기를 이용한 H.264를 위한 고속 화면내 예측 모드 결정 알고리즘)

  • Han, Hwa-Jeong;Jeon, Yeong-Il;Han, Chan-Hee;Lee, Si-Woong
    • The Journal of the Korea Contents Association / v.9 no.9 / pp.1-8 / 2009
  • The H.264/AVC video coding standard uses rate-distortion optimization (RDO) to determine the best coding mode for each macroblock (MB) and thereby improve coding efficiency. Although RDO selects the best coding mode, it imposes a heavy computational burden compared with previous standards. To reduce this complexity, this paper proposes a fast intra prediction mode decision algorithm using directional gradients. The proposed algorithm has a two-path structure. In the first path, the $16{\times}16$ intra prediction mode is determined using directional gradients. In the second path, 3 modes instead of 9 are chosen for RDO to decide the best mode for each $4{\times}4$ block. Finally, the two modes determined in the two-path decision process are compared to decide the final block mode. Experimental results show that the computation time of the proposed method is reduced to about 77% of that of the exhaustive mode decision method with negligible quality loss.

Design of Iterative Divider in GF($2^{163}$) Based on Improved Binary Extended GCD Algorithm (개선된 이진 확장 GCD 알고리듬 기반 GF($2^{163}$)상에서 Iterative 나눗셈기 설계)

  • Kang, Min-Sup;Jeon, Byong-Chan
    • The KIPS Transactions:PartC / v.17C no.2 / pp.145-152 / 2010
  • In this paper, we first propose a fast division algorithm in GF($2^{163}$) using the standard basis representation, and then map it onto a divider for GF($2^{163}$) with an iterative hardware structure. The proposed algorithm is based on the binary extended GCD algorithm, and the arithmetic operations for modular reduction are performed within a single "while" statement, unlike the conventional approach, which uses two "while" statements. We use the reduction polynomial $f(x)=x^{163}+x^7+x^6+x^3+1$, which is recommended in SEC2 (Standards for Efficient Cryptography) for the standard basis representation, where the degree is m = 163. We also implemented the proposed iterative architecture on an FPGA using Verilog HDL; it operates at a clock frequency of 85 MHz on a Xilinx Virtex-II XC2V8000 FPGA device. The implementation results show that the computation speed of the proposed scheme is significantly improved over the two existing approaches.
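
As background, division in GF($2^{163}$) by a binary extended-GCD style algorithm can be sketched in software as below. This is the conventional textbook form with two inner "while" loops, not the paper's single-loop variant or its hardware design. Field elements are represented as Python integers whose bits are polynomial coefficients, and the names `gf_mul` and `gf_div` are hypothetical. Both operands are assumed to be already reduced (degree below 163).

```python
M = 163
# Reduction polynomial f(x) = x^163 + x^7 + x^6 + x^3 + 1 from the abstract.
F = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf_mul(a, b, f=F, m=M):
    """Shift-and-add multiplication modulo f (used here to check gf_div)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> m:  # degree reached m: reduce by f
            a ^= f
    return result


def gf_div(b, a, f=F):
    """Return b / a in GF(2^m) for nonzero a; f must be irreducible."""
    if a == 0:
        raise ZeroDivisionError("division by zero in GF(2^m)")
    u, v, g1, g2 = a, f, b, 0
    # Invariants: g1 * a = u * b (mod f) and g2 * a = v * b (mod f).
    while u != 1 and v != 1:
        while u & 1 == 0:  # x divides u
            u >>= 1
            g1 = g1 >> 1 if g1 & 1 == 0 else (g1 ^ f) >> 1
        while v & 1 == 0:  # x divides v
            v >>= 1
            g2 = g2 >> 1 if g2 & 1 == 0 else (g2 ^ f) >> 1
        if u.bit_length() > v.bit_length():  # deg(u) > deg(v)
            u ^= v
            g1 ^= g2
        else:
            v ^= u
            g2 ^= g1
    return g1 if u == 1 else g2
```

A quotient `q = gf_div(b, a)` can be checked by confirming that `gf_mul(q, a) == b`.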

Fast Scalar Multiplication Algorithm on Elliptic Curve over Optimal Extension Fields (최적확장체 위에서 정의되는 타원곡선에서의 고속 상수배 알고리즘)

  • Chung Byungchun;Lee Soojin;Hong Seong-Min;Yoon Hyunsoo
    • Journal of the Korea Institute of Information Security & Cryptology / v.15 no.3 / pp.65-76 / 2005
  • Speeding up scalar multiplication of an elliptic curve point has been a prime approach to efficient implementation of elliptic curve schemes such as EC-DSA and EC-ElGamal. Koblitz introduced a base-$\phi$ expansion method using the Frobenius map. Kobayashi et al. extended the base-$\phi$ scalar multiplication method to suit Optimal Extension Fields (OEF) by introducing a table reference method. In this paper, we propose an efficient scalar multiplication algorithm on elliptic curves over OEF. The proposed base-$\phi$ scalar multiplication method applies an optimized batch technique after rearranging the computation sequence of the base-$\phi$ expansion, usually known as Horner's rule. The simulation results show that the new method accelerates scalar multiplication by about 20% to 40% over the method of Kobayashi et al. and is about three times as fast as some conventional scalar multiplication methods.

Fast Inverse Transform Considering Multiplications (곱셈 연산을 고려한 고속 역변환 방법)

  • Hyeonju Song;Yung-Lyul Lee
    • Journal of Broadcast Engineering / v.28 no.1 / pp.100-108 / 2023
  • In hybrid block-based video coding, transform coding converts spatial-domain residual signals into frequency-domain data and concentrates energy in a low-frequency band to achieve high compression efficiency in entropy coding. The state-of-the-art video coding standard, VVC (Versatile Video Coding), uses DCT-2 (Discrete Cosine Transform type 2), DST-7 (Discrete Sine Transform type 7), and DCT-8 (Discrete Cosine Transform type 8) for the primary transform. Considering that DCT-2, DST-7, and DCT-8 are all linear transformations, this paper proposes an inverse transform method that reduces the number of multiplications by exploiting this linearity. Compared to VTM-8.2, the proposed method reduces encoding and decoding time by an average of 26% and 15%, respectively, in the All Intra (AI) configuration and by 4% and 10% in the Random Access (RA) configuration, without increasing the bitrate.

Stereo Image-based 3D Modelling Algorithm through Efficient Extraction of Depth Feature (효율적인 깊이 특징 추출을 이용한 스테레오 영상 기반의 3차원 모델링 기법)

  • Ha, Young-Su;Lee, Heng-Suk;Han, Kyu-Phil
    • Journal of KIISE:Computer Systems and Theory / v.32 no.10 / pp.520-529 / 2005
  • This paper presents a feature-based 3D modeling algorithm. Because conventional methods use depth-based techniques, they need considerable time for the image matching that extracts depth information. Although feature-based methods have a lower computational load than depth-based ones, they must calculate the modeling error over all pixels within a triangle, which also increases the computation time. To acquire an efficient 3D model, the proposed algorithm therefore consists of three phases: initial 3D model generation, model evaluation, and model refinement. Intensity gradients and incremental Delaunay triangulation are used in the initial model generation. In this phase, a morphological edge operator is adopted for fast edge filtering, and the incremental Delaunay triangulation is modified to decrease the computation time by avoiding the error calculation over all pixels and by selecting a vertex near the centroid of the previous triangle. After model generation, sparse vertices are matched, and the faces are then evaluated in the evaluation stage by face size, approximation error, and disparity fluctuation. Thereafter, faces with a large error are selectively refined into smaller faces. Experimental results show that the proposed algorithm acquires an adaptive model with fewer modeling errors for both smooth and abrupt areas and remarkably reduces the model acquisition time.