• Title/Summary/Keyword: Computation Complexity


Age and Gender Classification with Small Scale CNN (소규모 합성곱 신경망을 사용한 연령 및 성별 분류)

  • Jamoliddin, Uraimov;Yoo, Jae Hung
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.1 / pp.99-104 / 2022
  • Artificial intelligence is becoming a crucial part of our lives thanks to its considerable benefits. Machines outperform humans in recognizing objects in images, particularly in classifying people into the correct age and gender groups. Accordingly, age and gender classification has been one of the hot topics among computer vision researchers in recent decades. Deep Convolutional Neural Network (CNN) models have achieved state-of-the-art performance. However, most CNN-based architectures are very complex, with several dozens of training parameters, so they require much computation time and resources. For this reason, we propose a new CNN-based classification algorithm with significantly fewer training parameters and less training time than existing methods. Despite its lower complexity, our model shows better age and gender classification accuracy on the UTKFace dataset.
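To make "fewer training parameters" concrete, the trainable-parameter count of a CNN can be tallied layer by layer. The sketch below uses the standard formulas for convolutional and fully connected layers. The layer sizes, and the function names `conv2d_params` and `dense_params`, are hypothetical and are not taken from the paper.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel for a square 2-D convolution."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_features + 1) * out_features


# Hypothetical small network (assumption, not the paper's architecture):
# two 3x3 convolutions on an RGB input, global average pooling down to
# 32 features, then a 2-way output layer (e.g. gender).
layers = [
    conv2d_params(3, 3, 16),
    conv2d_params(3, 16, 32),
    dense_params(32, 2),
]
total_params = sum(layers)
print(total_params)
```

Pooling and activation layers add no trainable parameters, so in this kind of tally the convolution widths and the size of the first dense layer dominate the total.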

Hardware Architecture for Entropy Filter Implementation (엔트로피 필터 구현에 대한 Hardware Architecture)

  • Sim, Hwi-Bo;Kang, Bong-Soon
    • Journal of IKEEE / v.26 no.2 / pp.226-231 / 2022
  • The concept of information entropy has been widely applied in various fields, and many image processing technologies based on it have recently been developed. As the importance of and demand for computer vision technologies grow in modern industry, image processing technologies must run in real time to be applied efficiently. Extracting the entropy value of an image is difficult to do in real time in software because of its computational complexity, and a hardware structure for an image entropy filter capable of real-time processing has never been proposed. In this paper, we propose for the first time a hardware structure for a histogram-based entropy filter that uses a barrel shifter and can operate in real time. The proposed hardware was designed in Verilog HDL and implemented on an FPGA, with Xilinx's xczu7ev-2ffvc1156 as the target device. Logic synthesis with the Xilinx Vivado program shows a maximum operating frequency of 750.751 MHz in a 4K UHD high-resolution environment; the design processes more than 30 images per second and satisfies the real-time processing standard.
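As a point of reference for what a histogram-based entropy filter computes, here is a minimal software sketch. It is not the paper's hardware design and does not model the barrel shifter. The window radius, the border handling (clipping), and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def window_entropy(pixels):
    """Shannon entropy in bits of a flat list of pixel values, via a histogram."""
    hist = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in hist.values())


def entropy_filter(image, radius=1):
    """Replace each pixel with the entropy of its neighborhood.

    The neighborhood is a (2*radius+1) x (2*radius+1) window, clipped at the
    image borders (an assumption; other border policies are possible).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = window_entropy(window)
    return out
```

The per-pixel histogram and the logarithms are what make this costly in software: every output pixel needs a fresh histogram over its window and one logarithm per occupied bin.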

Fast Inverse Transform Considering Multiplications (곱셈 연산을 고려한 고속 역변환 방법)

  • Hyeonju Song;Yung-Lyul Lee
    • Journal of Broadcast Engineering / v.28 no.1 / pp.100-108 / 2023
  • In hybrid block-based video coding, transform coding converts spatial-domain residual signals into frequency-domain data and concentrates energy in the low-frequency band to achieve high compression efficiency in entropy coding. The state-of-the-art video coding standard, VVC (Versatile Video Coding), uses DCT-2 (Discrete Cosine Transform type 2), DST-7 (Discrete Sine Transform type 7), and DCT-8 (Discrete Cosine Transform type 8) for the primary transform. In this paper, considering that DCT-2, DST-7, and DCT-8 are all linear transformations, we propose an inverse transform method that exploits this linearity to reduce the number of multiplications. Compared to VTM-8.2, the proposed method reduced encoding and decoding time by an average of 26% and 15%, respectively, in the AI (all-intra) configuration and by 4% and 10%, respectively, in the RA (random-access) configuration, with no increase in bitrate.
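One way linearity can cut multiplications is sketched below: because the inverse transform is linear, the output equals the sum of each coefficient times its basis vector, so zero coefficients can be skipped entirely. This is an illustration under that assumption and is not claimed to be the method of the paper. It uses a 1-D orthonormal DCT-2 only, and the function names are hypothetical.

```python
import math


def dct2_basis(n):
    """Orthonormal DCT-2 basis; row k is the k-th basis vector of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def inverse_full(coeffs, basis):
    """Dense inverse transform: always n*n multiplications."""
    n = len(coeffs)
    out = [sum(coeffs[k] * basis[k][i] for k in range(n)) for i in range(n)]
    return out, n * n


def inverse_sparse(coeffs, basis):
    """Sum of scaled basis vectors, skipping zero coefficients.

    By linearity this gives the same output as inverse_full, using
    n multiplications per non-zero coefficient.
    """
    n = len(coeffs)
    out = [0.0] * n
    mults = 0
    for k, c in enumerate(coeffs):
        if c == 0:
            continue
        for i in range(n):
            out[i] += c * basis[k][i]
        mults += n
    return out, mults
```

Quantized residual blocks are often sparse, which is why counting multiplications per non-zero coefficient rather than per block position can pay off.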

Design of XOR Gate Based on QCA Universal Gate Using Rotated Cell (회전된 셀을 이용한 QCA 유니버셜 게이트 기반의 XOR 게이트 설계)

  • Lee, Jin-Seong;Jeon, Jun-Cheol
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.3 / pp.301-310 / 2017
  • Quantum-dot cellular automata (QCA) is an alternative technology for implementing a variety of high-performance, low-power digital computation circuits at the nanoscale. In this paper, we propose a new universal gate in QCA and use it to build a novel XOR gate with reduced time and hardware complexity. The universal gate can be used to construct all other basic logic gates. It is designed from basic cells and one rotated cell, which is located at the center of a 3-input majority gate structure. The proposed XOR gate uses three universal gates, whereas an XOR gate built from 3-input majority gates requires more than five of them. The proposed XOR gate is superior to the conventional XOR gate in terms of total area and clock consumption because the number of gates is reduced.

Lightweight Key Escrow Scheme for Internet of Battlefield Things Environment (사물인터넷 환경을 위한 경량화 키 위탁 기법)

  • Tuan, Vu Quoc;Lee, Minwoo;Lim, Jaesung
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.12 / pp.1863-1871 / 2022
  • In the era of the Fourth Industrial Revolution, secure networking technology plays an essential role in defense weapon systems, and encryption is used for information security. According to Kerckhoffs's principle, the safety of a cryptographic technology rests on secure key management, not on the secrecy of the cryptographic algorithm. However, traditional centralized key management is problematic in battlefield environments because of the frequent movement of forces and the time-varying quality of tactical networks. In addition, the system resources of each node in the IoBT (Internet of Battlefield Things) environment are limited in size, capacity, and performance, so a lightweight key management system with less computation and complexity than conventional key management algorithms is needed. This paper proposes a novel lightweight key escrow scheme for the IoBT environment. The safety and performance of the proposed technique are verified through numerical analysis and simulations.

Analysis of streamflow prediction performance by various deep learning schemes

  • Le, Xuan-Hien;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.131-131 / 2021
  • Deep learning models, especially those based on long short-term memory (LSTM), have recently shown their superiority in handling time series data. This study aims to comprehensively evaluate the performance of supervised deep learning models in streamflow prediction. Six deep learning models were of interest in this study: standard LSTM, standard gated recurrent unit (GRU), stacked LSTM, bidirectional LSTM (BiLSTM), feed-forward neural network (FFNN), and convolutional neural network (CNN) models. The Red River system, one of the largest river basins in Vietnam, was adopted as a case study. The models were designed to forecast the flowrate one and two days ahead at the Son Tay hydrological station on the Red River, using series of observed flowrate data at seven hydrological stations on three major branches of the Red River system (the Thao, Da, and Lo Rivers) as the input data for training, validation, and testing. The comparison results indicate that the four LSTM-based models exhibit significantly better performance and greater stability than the FFNN and CNN models. Moreover, LSTM-based models can reach impressive predictions even in the presence of upstream reservoirs and dams. For the stacked LSTM and BiLSTM models, the added complexity is not accompanied by a performance improvement: their performance is no higher than that of the two standard models (LSTM and GRU). We therefore conclude that, for hydrological forecasting problems, simple architectures such as LSTM and GRU with one hidden layer are sufficient to produce highly reliable forecasts while minimizing computation time, given the sequential nature of the data.
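The forecasting setup described above (multi-station flowrate history in, one- or two-day-ahead flowrate at one station out) is typically prepared as sliding-window samples. The sketch below shows one such framing. The lookback length, the position of the target station in each row, and the function name `make_samples` are assumptions, not details from the abstract.

```python
def make_samples(flows, target_station, lookback, horizon):
    """Build supervised samples from daily multi-station flowrates.

    flows: list of daily rows; each row holds the flowrate at every station.
    Returns (inputs, targets) where inputs[i] is `lookback` consecutive daily
    rows and targets[i] is the target station's flowrate `horizon` days after
    the last day of that window.
    """
    inputs, targets = [], []
    for end in range(lookback, len(flows) - horizon + 1):
        inputs.append(flows[end - lookback:end])
        targets.append(flows[end - 1 + horizon][target_station])
    return inputs, targets


# Toy data (fabricated purely for illustration): 6 days x 7 stations.
flows = [[day * 10 + station for station in range(7)] for day in range(6)]
x1, y1 = make_samples(flows, target_station=0, lookback=3, horizon=1)
x2, y2 = make_samples(flows, target_station=0, lookback=3, horizon=2)
```

Each input sample has shape (lookback, number of stations), which is the sequence form recurrent models such as LSTM and GRU consume; a two-day horizon simply shifts the target one more day forward and yields one fewer sample.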


A Study on Speech Synthesizer Using Distributed System (분산형 시스템을 적용한 음성합성에 관한 연구)

  • Kim, Jin-Woo;Min, So-Yeon;Na, Deok-Su;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.29 no.3 / pp.209-215 / 2010
  • Portable terminals have recently received attention owing to wireless networks and high-capacity ROM, and as a result TTS (Text-to-Speech) systems are being embedded in them. High-quality synthesis is difficult on a portable terminal, yet users demand it. In this paper, we propose a Distributed TTS (DTTS) composed of a server and a terminal. Because it is based on corpus-based speech synthesis, the DTTS can achieve high-quality synthesis. The synthesis system on the server generates optimized speech concatenation information after a database search and transmits it to the terminal. The synthesis system on the terminal then produces high-quality speech with low computation using the concatenation information received from the server. The proposed method reduces complexity, lowers power consumption, and allows efficient maintenance.

On the elastic stability and free vibration responses of functionally graded porous beams resting on Winkler-Pasternak foundations via finite element computation

  • Zakaria Belabed;Abdelouahed Tounsi;Mohammed A. Al-Osta;Abdeldjebbar Tounsi;Hoang-Le Minh
    • Geomechanics and Engineering / v.36 no.2 / pp.183-204 / 2024
  • In the current investigation, a novel beam finite element model is formulated to analyze the buckling and free vibration responses of functionally graded porous (FGP) beams resting on Winkler-Pasternak elastic foundations. The novelty lies in a simplified finite element model with only three degrees of freedom per node that integrates both C0 and C1 continuity requirements, through Lagrange and Hermite interpolations, respectively, in isoparametric coordinates, while emphasizing the impact of z-coordinate-dependent porosity on the vibration and buckling responses. The proposed model has been validated and demonstrates high accuracy when compared with previously published solutions. A detailed parametric examination is performed, highlighting the influence of porosity distribution, foundation parameters, slenderness ratio, and boundary conditions. Unlike existing numerical techniques, the proposed element achieves a high rate of convergence with reduced computational complexity. Additionally, the model's adaptability to various mechanical problems and structural geometries is showcased through the numerical evaluation of elastic foundations, with results in strong agreement with the theoretical formulation. The findings show that porosity significantly affects the mechanical integrity of FGP beams on elastic foundations, and the advanced beam element offers a stable, efficient model for future research. This in-depth investigation enriches porous structure simulation, a field with limited current research that requires further exploration.

A Hierarchical Cluster Tree Based Fast Searching Algorithm for Raman Spectroscopic Identification (계층 클러스터 트리 기반 라만 스펙트럼 식별 고속 검색 알고리즘)

  • Kim, Sun-Keum;Ko, Dae-Young;Park, Jun-Kyu;Park, Aa-Ron;Baek, Sung-June
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.562-569 / 2019
  • Raman spectroscopy has been receiving increased attention as a standoff explosive detection technique. In addition, there is a growing need for a fast search method that can identify the Raman spectrum of a measured chemical substance by comparing it against known Raman spectra in a large database. By far the simplest and most widely used method is to calculate and compare the Euclidean distance between the given spectrum and the spectra in the database. However, this is a non-trivial problem because of the inherent high dimensionality of the data, and one of the most serious issues is the high computational complexity of searching for the closest spectra. To overcome this problem, we presented the MPS Sort with Sorted Variance+PDS method in our previous paper as a fast algorithm for finding the closest spectra. That algorithm uses two significant features of a vector, its mean and variance, to reject many unlikely spectra and save a great deal of computation time. In this paper, we present two new fast search methods. The PCA+PDS algorithm reduces the amount of computation by reducing the dimension of the data through a PCA transformation while yielding the same result as the distance calculation on the whole data. The Hierarchical Cluster Tree algorithm builds a binary hierarchical tree from the PCA-transformed spectra. It then starts searching from the clusters closest to the input spectrum and skips the distance calculation for many spectra that cannot be candidates, which saves a great deal of computation time. In the experiments, PCA+PDS shows about a 60.06% performance improvement over MPS Sort with Sorted Variance+PDS, and Hierarchical Cluster Tree shows about a 17.74% improvement over PCA+PDS. The results confirm the effectiveness of the proposed algorithms.
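The early-rejection idea behind the PDS variants can be sketched as follows, reading PDS as partial distance search (an assumption; the abstract does not expand the acronym). While accumulating the squared Euclidean distance to a candidate, the loop abandons that candidate as soon as the partial sum can no longer beat the best distance found so far. The function name and data are illustrative only, and the MPS sorting and PCA steps are not modeled.

```python
def pds_nearest(query, database):
    """Nearest spectrum by squared Euclidean distance with early termination.

    Returns (index, squared_distance) of the closest entry in `database`.
    The result matches an exhaustive search; only the work per candidate
    is reduced.
    """
    best_index, best_dist = -1, float("inf")
    for idx, spectrum in enumerate(database):
        partial = 0.0
        for q, s in zip(query, spectrum):
            partial += (q - s) ** 2
            if partial >= best_dist:
                break  # this candidate cannot beat the current best
        else:
            # Loop finished without breaking: a new best candidate.
            best_index, best_dist = idx, partial
    return best_index, best_dist


# Toy spectra (fabricated for illustration).
database = [[0, 0, 0], [1, 1, 1], [5, 5, 5]]
print(pds_nearest([1, 1, 2], database))
```

Early termination tends to help more when the dimensions that contribute most to the distance come first, which is one plausible reason to pair it with a PCA transform whose leading components carry the most variance.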

An Efficient Array Algorithm for VLSI Implementation of Vector-radix 2-D Fast Discrete Cosine Transform (Vector-radix 2차원 고속 DCT의 VLSI 구현을 위한 효율적인 어레이 알고리듬)

  • 신경욱;전흥우;강용섬
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.12 / pp.1970-1982 / 1993
  • This paper describes an efficient array algorithm for parallel computation of the vector-radix two-dimensional (2-D) fast discrete cosine transform (VR-FCT) and its VLSI implementation. By mapping the 2-D VR-FCT onto a 2-D array of processing elements (PEs), the butterfly structure of the VR-FCT can be efficiently implemented with high concurrency and local communication geometry. The proposed array algorithm features architectural modularity, regularity, and locality, so it is well suited to VLSI realization. Also, no transposition memory is required, whereas it is unavoidable in the conventional row-column decomposition approach. The algorithm has a time complexity of O(N + Nnzd·log2 N) for an (N×N) 2-D DCT, where Nnzd is the number of non-zero digits in the canonic-signed digit (CSD) code. By adopting CSD arithmetic in the circuit design, the number of additions is reduced by about 30% compared to 2's complement arithmetic. A computational accuracy analysis for finite-wordlength processing is presented. From simulation results, it is estimated that an (8×8) 2-D DCT (with Nnzd=4) can be computed in about 0.88 μs at a 50 MHz clock frequency, resulting in a throughput of about 72 megapixels per second.
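CSD recoding writes a constant with digits -1, 0, and +1 such that no two adjacent digits are non-zero. That usually lowers the number of non-zero digits, and with it the number of add or subtract steps in a shift-and-add multiplier. The sketch below recodes a non-negative integer and counts its non-zero digits. The function names are hypothetical, and the paper's actual coefficient wordlengths are not known from the abstract.

```python
def to_csd(value):
    """CSD digits of a non-negative integer, least significant digit first.

    Each digit is -1, 0, or +1, and no two adjacent digits are non-zero.
    """
    digits = []
    while value != 0:
        if value & 1:
            # +1 when value % 4 == 1, -1 when value % 4 == 3
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def nonzero_digits(digits):
    """Number of non-zero digits, i.e. add/subtract terms in shift-and-add."""
    return sum(1 for d in digits if d != 0)


# 7 is 111 in binary (three non-zero bits) but 8 - 1 in CSD (two terms).
print(to_csd(7), nonzero_digits(to_csd(7)))
```

The count returned by `nonzero_digits` is the quantity the abstract refers to as the number of non-zero digits in the CSD code.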
