• Title/Summary/Keyword: model quantization


A Tree Regularized Classifier-Exploiting Hierarchical Structure Information in Feature Vector for Human Action Recognition

  • Luo, Huiwu;Zhao, Fei;Chen, Shangfeng;Lu, Huanzhang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.3
    • /
    • pp.1614-1632
    • /
    • 2017
  • Bag of visual words is a popular model in human action recognition, but it usually suffers from loss of the spatial and temporal configuration of local features and from large quantization error in its feature coding procedure. To overcome these two deficiencies, we combine sparse coding with a spatio-temporal pyramid and use this method as the baseline. More importantly, and this is the focus of the paper, we find that the feature vector constructed by the baseline method has a hierarchical structure. To exploit this structure for better recognition accuracy, we propose a tree regularized classifier. The main contributions are as follows. First, we introduce a tree regularized classifier that encodes the hierarchical structure information in the feature vector for human action recognition. Second, we present an optimization algorithm to learn the parameters of the proposed classifier. Third, we evaluate the proposed classifier on the YouTube, Hollywood2, and UCF50 datasets; the experimental results show that it outperforms SVM and other popular classifiers and achieves promising results on all three datasets.

A Perceptual Audio Coder Based on Temporal-Spectral Structure (시간-주파수 구조에 근거한 지각적 오디오 부호화기)

  • 김기수;서호선;이준용;윤대희
    • Journal of Broadcast Engineering
    • /
    • v.1 no.1
    • /
    • pp.67-73
    • /
    • 1996
  • In general, high quality audio coding (HQAC) combines conventional data compression techniques with models of human perception. The primary auditory characteristic applied in HQAC is the masking effect in the spectral domain, so spectral techniques such as subband coding or transform coding are widely used[1][2]. However, no effort has yet been made to apply the temporal masking effect and temporal redundancy removal in HQAC. The audio data compression method proposed in this paper eliminates statistical and perceptual redundancies in both the temporal and spectral domains. The transformed audio signal is divided into packets, each consisting of 6 frames. A packet contains 1536 samples ($256{\times}6$), and the redundancies in a packet reside in both the temporal and spectral domains. Both kinds of redundancy are eliminated at the same time in each packet. The psychoacoustic model has been improved to give finer results by taking into account temporal masking as well as fine spectral masking. For quantization, each packet is divided into subblocks designed to be analogous to the nonlinear critical bands and to reflect the temporal auditory characteristics. Consequently, high quality of the reconstructed audio is preserved at low bit rates.


The Noise Evaluation for Regius 150 CR System (Regius 150 Computed Radiography 시스템의 Noise 평가에 관한 연구)

  • Kim, Jung-Min;Min, Jung-Whan;Jeong, Hea-Won;Im, Eun-Kyung;Yang, Han-Joon
    • Journal of radiological science and technology
    • /
    • v.29 no.4
    • /
    • pp.237-240
    • /
    • 2006
  • The noise of CR systems is made up of X-ray quantum mottle plus the imaging plate's structure noise, luminescence photon noise, electrometer noise, quantization noise, etc. In the Regius 150 system, SNR increased linearly from 8.2 to 30 as the radiation dose increased from 0.1 mR to 100 mR. This means that the Regius 150 system is sufficiently reliable, because its SNR is above 5, the threshold given by the Rose model. The NPS was calculated by a two-dimensional Fourier transform of the pixel-value fluctuations in the white image. In the spatial frequency range of $0.5\;lp/mm{\sim}2.5\;lp/mm$, the NPS was distributed over $10^{-4}{\sim}10^{-5}$ at an X-ray dose of 1 mR. This result is similar to those of the Kodak CR system reported by Carlu and the HR-CR system reported by Dobbins.


Assessing applicability of self-organizing map for regional rainfall frequency analysis in South Korea (Self-organizing map을 이용한 강우 지역빈도해석의 지역구분 및 적용성 검토)

  • Ahn, Hyunjun;Shin, Ju-Young;Jeong, Changsam;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association
    • /
    • v.51 no.5
    • /
    • pp.383-393
    • /
    • 2018
  • Regional frequency analysis is a method that uses not only the sample of the target station but also the samples of neighboring stations that are classified into the same hydrologically homogeneous region. Consequently, identifying homogeneous regions is a very important step in regional frequency analysis. In this study, homogeneous regions for the regional frequency analysis of precipitation were identified with the self-organizing map (SOM), which is a type of artificial neural network. Geographical information and an hourly rainfall data set were used to train the SOM. Quantization error and topographic error were computed to identify the optimal SOM map. As a result, the SOM model organized as a $7{\times}6$ array with 42 nodes was selected, and the selected stations were classified into 6 clusters for regional rainfall frequency analysis. According to the heterogeneity measure, all 6 clusters were identified as homogeneous regions, and they were more homogeneous than the regions obtained in the previous study.
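The two map-quality measures named in this abstract can be sketched as follows. This is a minimal illustration under common textbook definitions, not the authors' code: quantization error is taken as the mean distance from each sample to its best-matching unit (BMU), and topographic error as the fraction of samples whose best and second-best units are not grid neighbors. The function name `som_errors`, the 8-neighborhood rule on a rectangular grid, and the toy weights are all assumptions.

```python
import math


def som_errors(weights, samples):
    """Return (quantization_error, topographic_error) for a trained SOM.

    weights: dict mapping grid position (row, col) -> weight vector (tuple).
    samples: list of input vectors with the same dimension as the weights.
    """
    total_dist = 0.0
    topo_misses = 0
    for x in samples:
        # Rank all nodes by Euclidean distance to the sample.
        ranked = sorted(weights, key=lambda node: math.dist(weights[node], x))
        best, second = ranked[0], ranked[1]
        total_dist += math.dist(weights[best], x)
        # Assumption: two nodes are neighbors if they differ by at most
        # one step in each grid direction (8-neighborhood).
        if max(abs(best[0] - second[0]), abs(best[1] - second[1])) > 1:
            topo_misses += 1
    n = len(samples)
    return total_dist / n, topo_misses / n


# Toy 7x6 map (42 nodes) whose weights equal the grid coordinates.
grid = {(r, c): (float(r), float(c)) for r in range(7) for c in range(6)}
qe, te = som_errors(grid, [(0.0, 0.0), (3.0, 2.5)])
print(qe, te)
```

When comparing candidate map sizes, both values are usually wanted low: quantization error tends to fall as the map grows, while topographic error signals a map that has folded and no longer preserves neighborhood relations.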

Design of Wideband Speech Coder Using the MLT Residual Signal (MLT 여기신호를 이용한 광대역 음성 부호화기 설계)

  • Oh Yeon-Seon;Shin Jae-Hyun;Lee In-Sung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.5
    • /
    • pp.248-254
    • /
    • 2005
  • In this paper, the structure of a split-band wideband speech coder and its highband coder for improved tone quality are proposed. The lowband and highband produced by the band-split method are encoded independently using G.729E and an MLT (Modulated Lapped Transform) residual model. In the highband coder, which operates at a low bit rate of 4 kbps, the MLT residual signals are classified as voiced or unvoiced. For voiced signals, an MLT peak-picking method driven by the lowband pitch period is applied, because the transformed MLT residual signals are periodic and have periodic peaks. For unvoiced signals, the MLT coefficients with the linear prediction spectral response added are vector quantized. The performance of the proposed 15.8 kbps wideband speech coder was verified through subjective listening tests.

Application of Excitation Moment for Enhancing Fault Diagnosis Probability of Rotating Blade (회전 블레이드의 결함진단 확률제고를 위한 가진 모멘트 적용)

  • Kim, Jong Su;Choi, Chan Kyu;Yoo, Hong Hee
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.38 no.2
    • /
    • pp.205-210
    • /
    • 2014
  • Recently, pattern recognition methods have been widely used by researchers for fault diagnosis of mechanical systems. A pattern recognition method determines the soundness of a mechanical system by detecting variations in the system's vibration characteristics. Hidden Markov models (HMMs) and artificial neural networks (ANNs) have recently been used as pattern recognition methods in various fields. In this study, an HMM-ANN hybrid method for the fault diagnosis of a mechanical system is introduced, and a rotating wind turbine blade with a crack is selected for fault diagnosis. The existence, location, and depth of the crack are identified. To improve the diagnostic accuracy of the method in the presence of noise, a moment with a few specific frequencies is applied to the structure.

Bit-Rate Control Using Histogram Based Rate-Distortion Characteristics (히스토그램 기반의 비트율-왜곡 특성을 이용한 비트율 제어)

  • 홍성훈;유상조;박수열;김성대
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.9B
    • /
    • pp.1742-1754
    • /
    • 1999
  • In this paper, we propose a rate control scheme, using histogram based rate-distortion (R-D) estimation, which produces a consistent picture quality between consecutive frames. The histogram based R-D estimation used in our rate control scheme offers a closed-form mathematical model that enables us to predict the bits and the distortion generated from an encoded frame at a given quantization parameter (QP), and vice versa. The most attractive feature of the R-D estimation is the low complexity of computing the R-D data, because its major operation is just to obtain a histogram or weighted histogram of DCT coefficients from an input picture. Furthermore, it is accurate enough to be applied to practical video coding. Therefore, the proposed rate control scheme using this R-D estimation model is appropriate for applications requiring low delay and low complexity, and it controls the output bit rate and quality accurately. Our rate control scheme ensures that the video buffer does not underflow or overflow by satisfying the buffer constraint and, additionally, prevents the quality difference between consecutive frames from exceeding a certain level by adopting the distortion constraint. In addition, a consistent considering the maximum tolerance BER of the voice service. Also in Rician fading channel of K=6 and K=10, considering CLP=$10^{-3}$ as a criterion, it is observed that the performance improvement of about 3.5 dB and 1.5 dB is obtained, respectively, in terms of $E_b$/$N_o$ by employing the concatenated FEC code with pilot symbols.
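The idea of reading rate and distortion off a coefficient histogram can be illustrated with a small sketch. It is not the closed-form model of the paper, which the abstract does not spell out: here the distortion is the mean squared error of a plain uniform quantizer applied to each histogram bin, and the rate is approximated by the entropy of the quantized levels. The function name `rd_from_histogram`, the use of a quantizer step size in place of QP, and the sample coefficients are assumptions.

```python
import math
from collections import Counter


def rd_from_histogram(hist, step):
    """Estimate (bits_per_coefficient, mse) from a DCT coefficient histogram.

    hist: mapping coefficient value -> number of occurrences.
    step: uniform quantizer step size (a stand-in for the codec's QP).
    """
    total = sum(hist.values())
    levels = Counter()
    squared_error = 0.0
    for value, count in hist.items():
        # Uniform quantizer with round-half-away-from-zero.
        magnitude = int(math.floor(abs(value) / step + 0.5))
        level = magnitude if value >= 0 else -magnitude
        reconstructed = level * step
        squared_error += count * (value - reconstructed) ** 2
        levels[level] += count
    # Entropy of the quantized levels as a rough rate proxy.
    entropy = -sum((n / total) * math.log2(n / total) for n in levels.values())
    return entropy, squared_error / total


coefficients = [0, 0, 0, 1, -1, 2, 5, -7, 12, 0]
histogram = Counter(coefficients)
rate_fine, mse_fine = rd_from_histogram(histogram, 1)
rate_coarse, mse_coarse = rd_from_histogram(histogram, 4)
print(rate_fine, mse_fine, rate_coarse, mse_coarse)
```

Because the work per candidate step size depends only on the number of histogram bins rather than on the number of pixels, sweeping several step sizes to find one that meets a bit budget stays cheap, which matches the low-complexity argument made in the abstract.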


Video Watermarking Scheme with Adaptive Embedding in 3D-DCT domain (3D-DCT 계수를 적응적으로 이용한 비디오 워터마킹)

  • Park Hyun;Han Ji-Seok;Moon Young-Shik
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.15 no.3
    • /
    • pp.3-12
    • /
    • 2005
  • This paper introduces a 3D perceptual model based on JND (Just Noticeable Difference) and proposes a video watermarking scheme that takes a perceptual approach to adaptive embedding in the 3D-DCT domain. Videos are composed of consecutive frames, many of which are similar to their adjacent frames. If a watermark is embedded in a run of similar frames with little motion, it can be easily noticed by human eyes. Therefore, for transparency the watermark should be embedded where motion exists, and for robustness its magnitude needs to be adjusted properly. To achieve both transparency and robustness, a watermark based on the 3D perceptual model is used. That is, the sensitivities from the 3D-DCT quantization are derived based on the 3D perceptual model, and the sensitivities of the regions having more local motion than global motion are adjusted. Then the watermark is embedded into visually significant coefficients in proportion to the strength of motion in the 3D-DCT domain. Experimental results show that the proposed scheme improves the robustness to MPEG compression and temporal attacks by about $3{\sim}9\%$ compared to the existing 3D-DCT based method. In terms of PSNR, the proposed method is similar to the existing method, but JND guarantees the transparency of the watermark.

Object Detection Performance Analysis between On-GPU and On-Board Analysis for Military Domain Images

  • Du-Hwan Hur;Dae-Hyeon Park;Deok-Woong Kim;Jae-Yong Baek;Jun-Hyeong Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.8
    • /
    • pp.157-164
    • /
    • 2024
  • In this paper, we discuss the feasibility of deploying a deep learning-based detector on a resource-limited board. Although many studies evaluate detectors on machines with high-performance GPUs, evaluation on boards with limited computational resources is still insufficient. Therefore, in this work, we implement deep learning detectors and deploy them on a compact board by parsing and optimizing each detector. To understand the performance of deep learning-based detectors under limited resources, we monitor the performance of several detectors with different hardware resources. On the COCO detection dataset, we compare and analyze the evaluation results of the on-board detection model and the on-GPU detection model in terms of several metrics: mAP, power consumption, and execution speed (FPS). To demonstrate the effect of applying our detector in the military domain, we evaluate the detectors on our own dataset of thermal images that reflects flight battle scenarios. As a result, we examine the strengths of the deep learning-based on-board detector and show that deep learning-based vision models can contribute in flight battle scenarios.

Frame-Layer H.264 Rate Control for Scene-Change Video at Low Bit Rate (저 비트율 장면 전환 영상에 대한 향상된 H.264 프레임 단위 데이터율 제어 알고리즘)

  • Lee, Chang-Hyun;Jung, Yun-Ho;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.11
    • /
    • pp.127-136
    • /
    • 2007
  • An abrupt scene-change frame is one that is hardly correlated with the previous frames. In that case, because an intra-coded frame has less distortion than an inter-coded one, almost all macroblocks are encoded in intra mode. This breaks up the rate control flow and increases the number of bits used. Since the reference software for H.264 takes no special action for a scene-change frame, several studies have been conducted to solve the problem using the quadratic R-D model. However, since this model is more suitable for inter frames, the existing schemes are unsuitable for computing the QP of the scene-change intra frame. In this paper, an improved rate control scheme accounting for the characteristics of intra coding is proposed for scene-change frames. The proposed scheme was validated using 16 test sequences. The results showed that the proposed scheme performed better than the existing H.264 rate control schemes. The PSNR was improved by an average of 0.4-0.6 dB and a maximum of 1.1-1.6 dB. The PSNR fluctuation was also improved by an average of 18.6%.
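For orientation, the quadratic R-D model mentioned in this abstract is commonly written as T = c1·MAD/Qstep + c2·MAD/Qstep², where T is the texture bit budget and MAD the predicted mean absolute difference of the frame. The sketch below shows how a quantizer step could be solved from that form; it illustrates the existing model the paper argues against for scene-change intra frames, not the proposed scheme. The function names and the coefficient values are hypothetical, and the mapping from Qstep to an H.264 QP index is left out.

```python
import math


def predicted_bits(mad, qstep, c1, c2):
    """Texture bits predicted by the quadratic model for a given step size."""
    return c1 * mad / qstep + c2 * mad / qstep ** 2


def qstep_for_target(target_bits, mad, c1, c2):
    """Solve the quadratic model for the step size that meets target_bits.

    Assumes target_bits, mad and c1 are positive and c2 is non-negative.
    """
    if c2 == 0:
        # The model degenerates to the linear form T = c1 * MAD / Qstep.
        return c1 * mad / target_bits
    # Substitute x = 1 / Qstep: (c2 * MAD) x^2 + (c1 * MAD) x - T = 0.
    a = c2 * mad
    b = c1 * mad
    x = (-b + math.sqrt(b * b + 4.0 * a * target_bits)) / (2.0 * a)
    return 1.0 / x


# Hypothetical coefficients; a real encoder updates them by regression
# over previously coded frames.
budget = predicted_bits(mad=5.0, qstep=10.0, c1=100.0, c2=2000.0)
step = qstep_for_target(budget, mad=5.0, c1=100.0, c2=2000.0)
print(budget, step)
```

Since MAD and the coefficients are fitted from previously coded inter frames, a scene change makes both poor predictors, which is the weakness the abstract points to.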