• Title/Summary/Keyword: Frame Classification

Search Result 260, Processing Time 0.022 seconds

Audio Event Classification Using Deep Neural Networks (깊은 신경망을 이용한 오디오 이벤트 분류)

  • Lim, Minkyu;Lee, Donghyun;Kim, Kwang-Ho;Kim, Ji-Hwan
    • Phonetics and Speech Sciences / v.7 no.4 / pp.27-33 / 2015
  • This paper proposes an audio event classification method using Deep Neural Networks (DNN). The proposed method applies a Feed Forward Neural Network (FFNN) to generate event probabilities of ten audio events (dog barks, engine idling, and so on) for each frame. For each frame, mel-scale filter bank features of its consecutive frames are used as the input vector of the FFNN. These event probabilities are accumulated per event, and the classification result is the event with the highest accumulated probability. For the same dataset, the best accuracy reported in previous studies was about 70%, obtained with a Support Vector Machine (SVM). The proposed method achieves a best accuracy of 79.23% on the UrbanSound8K dataset when 80 mel-scale filter bank features from each of 7 consecutive frames (560 in total) are used as the input vector for an FFNN with two hidden layers and 2,000 neurons per hidden layer. In this configuration, the rectified linear unit was used as the activation function.
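The two mechanical steps in this abstract, stacking 7 frames of 80 features into a 560-value input and accumulating per-frame event probabilities, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the centred context window, the edge padding by repetition, and the dummy probabilities are all assumptions, and the FFNN itself is left out.

```python
from typing import List, Sequence


def stack_context(frames: Sequence[Sequence[float]], context: int = 7) -> List[List[float]]:
    """Concatenate each frame with its neighbours (edges padded by repetition)."""
    half = context // 2
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        vec: List[float] = []
        for k in range(t - half, t + half + 1):
            # Clamp the index so the first and last frames are repeated at the edges.
            vec.extend(frames[min(max(k, 0), last)])
        stacked.append(vec)
    return stacked


def classify_clip(frame_probs: Sequence[Sequence[float]]) -> int:
    """Sum per-frame event probabilities and return the index of the largest total."""
    n_events = len(frame_probs[0])
    totals = [sum(p[e] for p in frame_probs) for e in range(n_events)]
    return max(range(n_events), key=totals.__getitem__)


frames = [[float(t)] * 80 for t in range(10)]  # 10 dummy frames, 80 features each
inputs = stack_context(frames)                 # each input has 7 * 80 = 560 values
probs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]   # hypothetical FFNN outputs for 2 events
print(len(inputs[0]), classify_clip(probs))    # prints: 560 1
```

Summing probabilities rather than taking a majority vote lets confident frames outweigh uncertain ones, which matches the "highest accumulated probability" rule described above.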

An Effective Classification Method of Video Contents Using a Neural-Network (신경망을 이용한 효율적인 비디오 컨텐츠 분류 방법)

  • 이후형;전승철;박성한
    • Proceedings of the IEEK Conference / 2001.06d / pp.109-112 / 2001
  • This paper proposes a method to classify video contents using features of digital video. The classified video types are news, drama, show, sports, and talk programs. Features such as the number of intra-coded macroblocks and the motion vectors of P-pictures in the MPEG domain are used. The frame difference in YCbCr is also employed as a measure for classification, as are the occurrences of cuts detected in a video. Finally, a 3-layer back-propagation neural network is used to classify the video contents.
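The abstract does not say how the YCbCr frame difference or the cut detection is computed. One common reading is sketched below: the difference is taken as the mean absolute difference over all pixels and channels, and a cut is declared when it exceeds a threshold. The function names, the threshold value, and the tiny 4-pixel frames are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # one frame: a list of (Y, Cb, Cr) pixels


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute difference over all pixels and the three YCbCr channels."""
    total = sum(abs(x - y) for pa, pb in zip(a, b) for x, y in zip(pa, pb))
    return total / (len(a) * 3)


def detect_cuts(frames: Sequence[Frame], threshold: float = 30.0) -> List[int]:
    """Return indices i where frame i differs sharply from frame i - 1."""
    return [
        i
        for i in range(1, len(frames))
        if frame_difference(frames[i - 1], frames[i]) > threshold
    ]


dark = [(16.0, 128.0, 128.0)] * 4    # tiny 4-pixel frames for illustration
bright = [(200.0, 128.0, 128.0)] * 4
video = [dark, dark, bright, bright, dark]
print(detect_cuts(video))            # prints: [2, 4]
```

The number of detected cuts per unit time could then serve as one input feature to the classifier, alongside the raw frame difference.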

Classification of TV Program Scenes Based on Audio Information

  • Lee, Kang-Kyu;Yoon, Won-Jung;Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.23 no.3E / pp.91-97 / 2004
  • In this paper, we propose a classification system for TV program scenes based on audio information. The system classifies a video scene into six categories: commercials, basketball games, football games, news reports, weather forecasts, and music videos. Two types of audio feature sets, timbral features and coefficient domain features, are extracted from each audio frame, resulting in a 58-dimensional feature vector. To reduce the computational complexity of the system, the 58-dimensional feature set is further reduced to 10 dimensions through the Sequential Forward Selection (SFS) method. This down-sized feature set is then used to train the system and classify the given TV program scenes using the k-NN and Gaussian pattern matching algorithms. The classification accuracy of 91.6% reported here shows the promise of video scene classification based on audio information. Finally, the system stability problem with respect to different query lengths is investigated.
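Sequential Forward Selection, used above to go from 58 to 10 features, is a greedy search: start with no features and repeatedly add the one that most improves a score. The sketch below shows that loop only. The `score` callback is an assumption standing in for whatever criterion the authors used (for example, validation accuracy of the classifier), and `toy_score` is purely illustrative.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    n_select: int,
    score: Callable[[Sequence[int]], float],
) -> List[int]:
    """Greedily add the feature that most improves score(selected) each round."""
    selected: List[int] = []
    remaining = list(range(n_features))
    while remaining and len(selected) < n_select:
        # Try each remaining feature together with those already chosen.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical scoring function standing in for classifier accuracy.
weights = [0.1, 0.9, 0.3, 0.7, 0.5]


def toy_score(subset: Sequence[int]) -> float:
    return sum(weights[f] for f in subset)


print(sequential_forward_selection(5, 3, toy_score))  # prints: [1, 3, 4]
```

Because each round evaluates every remaining feature once, selecting 10 of 58 features costs on the order of a few hundred score evaluations rather than a search over all subsets.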

Classification of Underwater Transient Signals Using Gaussian Mixture Model (정규혼합모델을 이용한 수중 천이신호 식별)

  • Oh, Sang-Hwan;Bae, Keun-Sung
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.9 / pp.1870-1877 / 2012
  • Transient signals generally have short duration and variable length, with time-varying and non-stationary characteristics. Thus, a frame-based pattern matching method is useful for classifying transient signals. In this paper, we propose a new method for classification of underwater transient signals using a Gaussian mixture model (GMM). We carried out classification experiments on various underwater transient signals while varying the type of noise, the signal-to-noise ratio, and the number of mixtures in the GMM. The experimental results verify that the proposed method works quite well for classification of underwater transient signals.
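Frame-based GMM classification handles variable-length signals naturally: each class has its own mixture, every frame is scored independently, and the frame log-likelihoods are summed. The sketch below illustrates this under strong simplifying assumptions that are not from the paper: one scalar feature per frame, hand-written mixture parameters instead of trained ones, and hypothetical class names.

```python
import math
from typing import Dict, Sequence, Tuple

Component = Tuple[float, float, float]  # (weight, mean, variance) of one 1-D Gaussian


def frame_log_likelihood(x: float, gmm: Sequence[Component]) -> float:
    """Log of the mixture density at a single scalar frame feature x."""
    density = sum(
        w * math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
        for w, mu, var in gmm
    )
    return math.log(density + 1e-300)  # small constant guards against log(0)


def classify_signal(frames: Sequence[float], models: Dict[str, Sequence[Component]]) -> str:
    """Pick the class whose GMM gives the highest total log-likelihood over all frames."""
    return max(
        models,
        key=lambda name: sum(frame_log_likelihood(x, models[name]) for x in frames),
    )


models = {
    "class_a": [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)],  # two-component mixture
    "class_b": [(1.0, 10.0, 4.0)],                  # single Gaussian
}
print(classify_signal([0.3, 1.8, 2.2], models))  # prints: class_a
print(classify_signal([9.0, 11.5], models))      # prints: class_b
```

The two test signals have different lengths, which is the property that makes this scheme suitable for transients of variable duration.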

An Explainable Deep Learning Algorithm based on Video Classification (비디오 분류에 기반 해석가능한 딥러닝 알고리즘)

  • Jin Zewei;Inwhee Joe
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.449-452 / 2023
  • The rapid development of the Internet has led to a significant increase in multimedia content on social networks, and how to better analyze and improve video classification models has become an important task. Deep learning models have typical "black box" characteristics, so they require explainability analysis. This article uses two classification models, ConvLSTM and VGG16+LSTM, and combines them with the LRP explanation method to generate visualized explanations. In the experiments, the classification accuracy is 75.94% for ConvLSTM and 92.50% for VGG16+LSTM. We conducted an explainability analysis of the VGG16+LSTM model using the LRP method and found that it tends to base its classification on frames in the latter half of the video and on the last frame.

Real Time Classification System for Lead Pin Images (실시간 Lead Pin 영상 분류 시스템)

  • 장용훈
    • Journal of the Korea Computer Industry Society / v.3 no.9 / pp.1177-1188 / 2002
  • To classify lead pin images in real time, an image acquisition system was built from a CCD, an image frame grabber (DT3153), and a PC (Pentium III). The proposed image processing pipeline consists of real-time monitoring, lead pin image acquisition, image noise removal, object area detection, point detection, and pattern classification. Raw images of lead pins were acquired with the system, and result images were obtained from them by the image processing algorithms. In the experiments, 97 of 100 acceptable products and 95 of 100 defective products were recognized correctly, giving a recognition rate of 96% over the 200 lead pins in total.
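The 96% figure follows directly from the two per-class counts given in the abstract. A short check, with variable names chosen here for illustration:

```python
correct_acceptable, total_acceptable = 97, 100
correct_defective, total_defective = 95, 100

# Overall recognition rate: all correct decisions over all inspected pins.
recognition_rate = (correct_acceptable + correct_defective) / (
    total_acceptable + total_defective
)
print(f"{recognition_rate:.0%}")  # prints: 96%
```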

Rainfall Recognition from Road Surveillance Videos Using TSN (TSN을 이용한 도로 감시 카메라 영상의 강우량 인식 방법)

  • Li, Zhun;Hyeon, Jonghwan;Choi, Ho-Jin
    • Journal of Korean Society for Atmospheric Environment / v.34 no.5 / pp.735-747 / 2018
  • Rainfall depth is important meteorological information. Generally, rainfall data with high spatial resolution, such as road-level rainfall data, are more beneficial. However, it is expensive to set up enough Automatic Weather Systems to obtain road-level rainfall data. In this paper, we propose to use deep learning to recognize rainfall depth from road surveillance videos. To achieve this goal, we collect a new video dataset and propose a procedure to calculate refined rainfall depth from the original meteorological data. We also propose to utilize the differential frame as well as the optical flow image for better recognition of rainfall depth. Under the Temporal Segment Networks framework, the experimental results show that the combination of the video frame and the differential frame is a superior solution for rainfall depth recognition. The final model achieves high performance in the single-location low-sensitivity classification task and reasonable accuracy in the higher-sensitivity classification task for both the single-location and the multi-location cases.

Lane Detection and Tracking Using Classification in Image Sequences

  • Lim, Sungsoo;Lee, Daeho;Park, Youngtae
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.12 / pp.4489-4501 / 2014
  • We propose a novel lane detection method based on classification in image sequences. Both structural and statistical features of the extracted bright shapes are fed to a neural network to find the correct lane marks. The features used in this paper are shown to have strong discriminating power for locating the correct traffic lanes. The traffic lanes detected in the current frame are also used to estimate the traffic lane if lane detection fails in the next frame. The proposed method is fast enough for real-time systems; the average processing time is less than 2 ms. In addition, the local illumination compensation scheme allows robust lane detection at nighttime. Therefore, this method can be widely used in intelligent transportation systems such as driver assistance, lane change assistance, lane departure warning, and autonomous vehicles.
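The fallback described above, reusing the lanes from the current frame when detection fails in the next one, can be sketched as a small tracking loop. The lane representation, the `detect` callback, and the choice to carry the last result forward across several consecutive failures are assumptions made for illustration; the abstract only describes the single-frame case.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Lane = Tuple[float, float]  # hypothetical lane model: (slope, intercept) of a line


def track_lanes(
    frames: Sequence[object],
    detect: Callable[[object], Optional[Lane]],
) -> List[Optional[Lane]]:
    """Run detection per frame; on failure, reuse the most recent successful lane."""
    last: Optional[Lane] = None
    result: List[Optional[Lane]] = []
    for frame in frames:
        lane = detect(frame)
        if lane is not None:
            last = lane  # detection succeeded, so update the estimate
        result.append(last)
    return result


# Stand-in detector: each "frame" is already a lane, or None for a failed detection.
observations = [(0.5, 10.0), None, (0.6, 12.0), None, None]
print(track_lanes(observations, lambda f: f))
```

A real system would likely limit how long a stale estimate is reused, since the lane geometry drifts as the vehicle moves.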

The Construction of Semantic Networks for Korean "Cooking Verb" Based on the Argument Information (논항 정보 기반 "요리 동사"의 어휘의미망 구축 방안)

  • Lee, Sukeui
    • Korean Linguistics / v.48 / pp.223-268 / 2010
  • The purpose of this paper is to build a semantic network of 'cooking class' verbs, based on KAIST's CoreNet. This process requires adjusting the concept classification, so the sub-categories of the [Cooking] and [Foodstuff] hierarchies of CoreNet were adjusted for the construction of the verb semantic network. To build the semantic network, each meaning of the Korean 'cooking verbs' has to be analyzed. This paper focuses on the Korean 'heating' verbs and 'non-heating' verbs. Case frame structure and argument information were added to describe the verb information. This paper uses Protégé 3.3 as the tool for building the "cooking verb" semantic network. Each verb and noun was inserted into its class and connected by the property relation markers 'HasThemeAs' and 'IsMaterialOf'.

De Morgan Frames (드 모르간 틀)

  • 이승온
    • Journal for History of Mathematics / v.17 no.2 / pp.73-84 / 2004
  • Stone introduced extremally disconnected spaces as the image of complete Boolean algebras under his famous duality between Bool and ZComp. They turn out to be projective objects in various categories of Hausdorff spaces, and the completely regular ones are exactly those X with Dedekind complete C(X, ℝ). In the pointfree setting, extremally disconnected frames (= De Morgan frames) are those satisfying the De Morgan condition. In this paper, we investigate a historical aspect of De Morgan frames together with that of De Morgan.
