• Title/Summary/Keyword: Frame Classification


Context-based coding of inter-frame DCT coefficients for video compression (비디오 압축을 위한 영상간 차분 DCT 계수의 문맥값 기반 부호화 방법)

  • Lee, Jin-Hak;Kim, Jae-Kyoon
    • Proceedings of the IEEK Conference / 2000.09a / pp.281-285 / 2000
  • This paper proposes context-based coding methods for variable length coding of inter-frame DCT coefficients. The proposed methods classify run-level symbols depending on the preceding coefficients. No extra overhead needs to be transmitted, since the classification uses only previously transmitted coefficients. Two entropy coding methods, arithmetic coding and Huffman coding, are used for the proposed context-based coding. For Huffman coding, complexity does not increase over the current standards, because the existing inter/intra VLC tables are used. Experimental results show that, relative to the current standards MPEG-4 and H.263, the proposed methods give about 19% bit gain and about 0.8 dB PSNR improvement for adaptive inter/intra VLC table selection, and about 37% bit gain and about 2.7 dB PSNR improvement for arithmetic coding. The proposed methods also obtain larger gains for small quantization parameters and for sequences with fast and complex motion, so they are more advantageous for high-quality video coding.


A Study on the Efficient Speech Recognition System using Database Grouping (어휘 그룹화를 이용한 음성인식시스템의 성능향상에 관한 연구)

  • 우상욱;권승호;한수양;이동규;이두수
    • Proceedings of the IEEK Conference / 2003.07e / pp.2455-2458 / 2003
  • This paper proposes classification by energy labeling. Energy parameters extracted from each phoneme of the input signal are labeled, and label groups are determined according to the detected energies of the input signal. DTW is then performed only within the selected label group, which makes DTW processing faster than in the previous algorithm. Because the method depends on accurate parameter detection in both the speech-duration detection step and the energy-parameter detection step, variable windows whose size is determined by the pitch period are used. Existing extraction algorithms do not find the exact frame energy, because the window size is fixed at 256 frames. For this reason, a new energy extraction method is proposed: the pitch period is detected first, and the window size is then chosen between 200 and 300 frames. The proposed method makes it possible to cancel the influence of the window.
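The group-restricted DTW search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `groups` dictionary layout, and the use of a one-dimensional energy contour as the feature are all assumptions, and the energy label is taken as given rather than computed.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(query, groups, label):
    """Run DTW only against the templates in the group chosen by `label`.

    `groups` maps a (hypothetical) energy label to {word: template}.
    """
    candidates = groups[label]
    return min(candidates, key=lambda word: dtw_distance(query, candidates[word]))
```

Restricting the search to one group reduces the number of DTW comparisons from the whole vocabulary to the size of that group, which is where the speed-up reported in the abstract would come from.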


An Efficient Motion Compensation Algorithm for Video Sequences with Brightness Variations (밝기 변화가 심한 비디오 시퀀스에 대한 효율적인 움직임 보상 알고리즘)

  • 김상현;박래홍
    • Journal of Broadcast Engineering / v.7 no.4 / pp.291-299 / 2002
  • This paper proposes an efficient motion compensation algorithm for video sequences with brightness variations. In the proposed algorithm, the brightness variation parameters are estimated and local motions are compensated. To detect frames with large brightness variations, we employ frame classification based on the cross entropy between the histograms of two successive frames, which can reduce the computational redundancy. Simulation results show that the proposed method yields a higher peak signal to noise ratio (PSNR) than the conventional methods, with a low computational load, when the video scene contains large brightness changes.
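A histogram-based frame classifier of the kind mentioned in this abstract might look like the sketch below. The bin count, the threshold, and the choice to compare the cross entropy against the entropy of the previous frame (which equals the KL divergence) are assumptions made for illustration; the abstract does not give the exact decision rule.

```python
import math


def histogram(frame, bins=16, max_value=256):
    """Normalized intensity histogram of a flat list of integer pixel values."""
    counts = [0] * bins
    for p in frame:
        counts[p * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def cross_entropy(p, q, eps=1e-9):
    """H(p, q) = -sum_i p_i log q_i; eps guards against log(0)."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))


def has_large_brightness_change(prev_frame, cur_frame, threshold=0.5):
    """Flag a frame pair whose histograms differ strongly (threshold is hypothetical)."""
    p = histogram(prev_frame)
    q = histogram(cur_frame)
    # Excess of H(p, q) over H(p, p): close to zero when the histograms match.
    excess = cross_entropy(p, q) - cross_entropy(p, p)
    return excess > threshold
```

Only frame pairs flagged this way would need the brightness-parameter estimation step, which is one plausible reading of how the classification reduces computation.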

An Efficient Video Coding Algorithm Applying Brightness Variation Compensation (밝기변화 보상을 적용한 효율적인 비디오 코딩 알고리즘)

  • Kim Sang-Hyun
    • Journal of the Institute of Convergence Signal Processing / v.5 no.4 / pp.287-293 / 2004
  • This paper proposes an efficient motion compensation algorithm for video sequences with brightness variations. In the proposed algorithm, the brightness variation parameters are estimated and local motions are compensated. To detect the frame with large brightness variations, we employ the frame classification based on the cross entropy between histograms of two successive frames, which can reduce the computational redundancy. Simulation results show that the proposed method yields a higher peak signal to noise ratio (PSNR) than that of the conventional methods, with a low computational load, when the video scene contains large brightness changes.


A Study on the Interior Orientation for Various Image Formation Sensors

  • Lee, Suk-Kun;Shin, Sung-Woong
    • Korean Journal of Geomatics / v.4 no.1 / pp.23-30 / 2004
  • This study aims to establish interior orientation for various types of sensors, including frame cameras, panoramic cameras, line cameras, and whisk-broom scanners. To do so, it proposes a classification of the components of interior orientation, whose elements differ according to the sensor. Sensor characteristics are then incorporated into mathematical models of interior orientation, and the resulting parameters are suggested as guidelines for recovering systematic distortions. Finally, the potential errors that result from treating the sensor model of a whisk-broom scanner as that of a push-broom scanner are discussed.


Object tracking algorithm through RGB-D sensor in indoor environment (실내 환경에서 RGB-D 센서를 통한 객체 추적 알고리즘 제안)

  • Park, Jung-Tak;Lee, Sol;Park, Byung-Seo;Seo, Young-Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.248-249 / 2022
  • In this paper, we propose a method for classifying and tracking objects based on information about multiple users obtained with RGB-D cameras. 3D and color information is acquired through the RGB-D camera, and information about each user is stored. We propose a user classification and location tracking algorithm that works over the entire image by calculating the similarity between users in the current frame and the previous frame, using the location and appearance information of each user obtained from the entire image.
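The frame-to-frame similarity matching described in this abstract could be sketched as below. The similarity formula, the weights, the dictionary layout (`position` as a 3D point, `color` as a mean RGB triple), and the greedy assignment are all assumptions for illustration; the abstract does not specify them.

```python
import math


def similarity(a, b, w_pos=1.0, w_color=1.0):
    """Similarity in (0, 1]: higher when 3D position and mean color are close."""
    pos = math.dist(a["position"], b["position"])
    col = math.dist(a["color"], b["color"])
    return 1.0 / (1.0 + w_pos * pos + w_color * col)


def match_users(previous, current):
    """Give each current detection the ID of the most similar unused previous user.

    `previous` maps user ID -> stored user info; `current` is a list of detections.
    Returns {detection index: user ID}.
    """
    assignments = {}
    unused = dict(previous)
    for idx, det in enumerate(current):
        if not unused:
            break
        best_id = max(unused, key=lambda uid: similarity(unused[uid], det))
        assignments[idx] = best_id
        del unused[best_id]
    return assignments
```

A greedy assignment is the simplest choice here; an optimal assignment method would be a natural replacement when users are close together.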


A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

  • 윤원중;이강규;박규식
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.6 / pp.271-278 / 2004
  • In this paper, we propose a content-based audio genre classification algorithm that automatically classifies the query audio into five genres (Classic, Hiphop, Jazz, Rock, Speech) using a digital signal processing approach. The 20-second query audio file is segmented into 23 ms frames with a non-overlapping Hamming window, and 54-dimensional feature vectors, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, are extracted from each query audio. For classification, k-NN, Gaussian, and GMM classifiers are used. To choose the optimum features from the 54-dimensional feature vectors, the SFS (Sequential Forward Selection) method is applied to select 10 optimum features, which are then used for genre classification. The experimental results verify the superior performance of the proposed method, which provides a success rate of nearly 90% for genre classification, an improvement of 10%∼20% over previous methods. For an actual user environment, the feature vector is extracted from a random interval of the query audio, and the method shows an overall 80% success rate, except in the extreme cases of the beginning and ending portions of the query audio file.
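Sequential Forward Selection, as named in this abstract, is a greedy search that can be sketched as follows. The `score` callback is a hypothetical stand-in for whatever criterion is used to rate a feature subset (for example, classifier accuracy on held-out data); the abstract does not state which criterion the authors used.

```python
def sequential_forward_selection(n_features, k, score):
    """Greedily pick k feature indices out of n_features.

    At each round, add the single remaining feature that maximizes
    score(selected + [feature]).
    """
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With the numbers from the abstract, the call would be `sequential_forward_selection(54, 10, score)`, reducing the 54-dimensional vector to 10 features.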

Detection and Classification for Low-altitude Micro Drone with MFCC and CNN (MFCC와 CNN을 이용한 저고도 초소형 무인기 탐지 및 분류에 대한 연구)

  • Shin, Kyeongsik;Yoo, Sinwoo;Oh, Hyukjun
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.3 / pp.364-370 / 2020
  • This paper addresses the detection and classification of micro-sized aircraft that fly at low altitude. A deep-learning based method using the sounds emitted by micro-sized aircraft is proposed to detect and identify them efficiently. We use MFCC as sound features and a CNN as the detector and classifier. We show that each micro-drone has its own distinguishable MFCC features and confirm that a CNN can be applied as a detector and classifier even though drone sound is a time-related sequence. Many papers use RNNs for time-related features, but we show that if the number of frames in the MFCC features is sufficient to contain the time-related information, those features can be classified with a CNN. With this approach, we achieve high detection and classification rates with low computational power, using a data set that consists of four different drone sounds. This paper thus presents a simple and effective method of detection and classification for micro-sized aircraft.

Fine-tuning Neural Network for Improving Video Classification Performance Using Vision Transformer (Vision Transformer를 활용한 비디오 분류 성능 향상을 위한 Fine-tuning 신경망)

  • Kwang-Yeob Lee;Ji-Won Lee;Tae-Ryong Park
    • Journal of IKEEE / v.27 no.3 / pp.313-318 / 2023
  • This paper proposes a neural network that applies fine-tuning to improve the performance of video classification based on the Vision Transformer. Recently, the need for real-time video image analysis based on deep learning has emerged. Because of the characteristics of the existing CNN models used in image classification, it is difficult to analyze the association between consecutive frames. We seek the optimal model by comparing and analyzing the Vision Transformer and Non-local neural network models, both of which use the attention mechanism. In addition, we propose an optimal fine-tuning neural network model by applying various fine-tuning methods as a form of transfer learning. In the experiment, the model was trained on the UCF101 dataset, and its performance was then verified by applying transfer learning to the UTA-RLDD dataset.

Development of a Vehicle Classification Algorithm Using an Inductive Loop Detector on a Freeway (단일 루프 검지기를 이용한 차종 분류 알고리즘 개발)

  • 이승환;조한선;최기주
    • Journal of Korean Society of Transportation / v.14 no.1 / pp.135-154 / 1996
  • This paper presents a heuristic algorithm for classifying vehicles using a single loop detector. The data used to develop the algorithm are the frequency variations of vehicles sensed by the circle-shaped loop detectors that are normally buried beneath the expressway. Developing the algorithm requires pre-processing of the data, which consists of two parts. One is the normalization of both occupancy time and frequency variation; the other is finding a suitable sample size for each vehicle category and calculating the average of the normalized frequencies along the occupancy time, which is stored for comparison. Detected values are then compared with the stored data to locate the best-fitting pattern. After the normalization process, we developed several frameworks for the comparison schemes. The scales tested were 10 and 15 frames in occupancy time (X-axis) and 10 and 15 frames in frequency variation (Y-axis). The combination of 10 frames on X and 15 frames on Y turned out to be the most efficient normalization scale, producing a 96 percent correct classification rate for six types of vehicle.
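The normalize-then-compare scheme in this abstract can be illustrated with the sketch below. It is an assumption-laden reading, not the paper's algorithm: the chunk averaging, the min-max scaling, the sum-of-absolute-differences distance, and the function names are all hypothetical. Only the 10-frame X scale and 15-frame Y scale come from the abstract.

```python
def normalize_signature(samples, x_frames=10, y_levels=15):
    """Resample a loop-detector signature to x_frames points on the time axis
    and rescale its values to the range 0..y_levels."""
    n = len(samples)
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signature
    out = []
    for i in range(x_frames):
        start = i * n // x_frames
        end = max(start + 1, (i + 1) * n // x_frames)
        chunk = samples[start:end]
        mean = sum(chunk) / len(chunk)
        out.append((mean - lo) / span * y_levels)
    return out


def classify(samples, templates):
    """Return the category whose stored normalized template is closest."""
    sig = normalize_signature(samples)

    def distance(category):
        return sum(abs(s - t) for s, t in zip(sig, templates[category]))

    return min(templates, key=distance)
```

Normalizing both axes makes the comparison insensitive to vehicle speed (occupancy time) and to the absolute size of the frequency shift, so only the shape of the signature is matched.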
