• Title/Summary/Keyword: Acoustic scene classification

Acoustic scene classification using recurrence quantification analysis (재발량 분석을 이용한 음향 상황 인지)

  • Park, Sangwook;Choi, Woohyun;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.35 no.1 / pp.42-48 / 2016
  • Because a variety of sounds occur in the same place and similar sounds occur in different places, the performance of acoustic scene classification is not guaranteed when training data are insufficient. A Bag of Words (BOW) based histogram feature is regarded as one way to overcome this problem. However, since the histogram feature is built from the distribution of features, the ordering of the feature sequence is ignored. Temporal information such as periodicity and stationarity is also important for acoustic scene classification. In this paper, temporal features describing periodicity and stationarity are extracted using recurrence quantification analysis. In the experiments, the proposed method performs better than the baseline methods.
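
The recurrence quantification idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the threshold `eps`, and the choice of recurrence rate and determinism as measures are all assumptions made for illustration. The abstract does not state which measures or parameters the paper uses.

```python
from typing import List, Sequence


def recurrence_matrix(frames: Sequence[Sequence[float]], eps: float) -> List[List[int]]:
    """R[i][j] is 1 when frames i and j are within Euclidean distance eps (assumed metric)."""
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n = len(frames)
    return [[1 if dist(frames[i], frames[j]) <= eps else 0 for j in range(n)]
            for i in range(n)]


def recurrence_rate(R: List[List[int]]) -> float:
    """Fraction of recurrent points, excluding the main diagonal."""
    n = len(R)
    if n < 2:
        return 0.0
    off_diagonal = sum(R[i][j] for i in range(n) for j in range(n) if i != j)
    return off_diagonal / (n * (n - 1))


def determinism(R: List[List[int]], min_len: int = 2) -> float:
    """Fraction of recurrent points that lie on diagonal lines of length >= min_len.

    Only the upper diagonals are scanned because R is symmetric.
    """
    n = len(R)
    total = 0
    on_lines = 0
    for k in range(1, n):
        run = 0
        for i in range(n - k):
            if R[i][i + k]:
                run += 1
                total += 1
            else:
                if run >= min_len:
                    on_lines += run
                run = 0
        if run >= min_len:
            on_lines += run
    return on_lines / total if total else 0.0


# Toy feature sequence with period 2 (hypothetical one-dimensional frames).
frames = [[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]]
R = recurrence_matrix(frames, eps=0.1)
rr = recurrence_rate(R)
det = determinism(R)
```

For this strictly periodic toy sequence every recurrent point falls on a diagonal line, so determinism is at its maximum. A histogram of the same frames would look identical for any reordering, which is the limitation the abstract attributes to BOW features.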

Light weight architecture for acoustic scene classification (음향 장면 분류를 위한 경량화 모형 연구)

  • Lim, Soyoung;Kwak, Il-Youp
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.979-993 / 2021
  • Acoustic scene classification (ASC) categorizes an audio file based on the environment in which it was recorded. The task has long been studied within Detection and Classification of Acoustic Scenes and Events (DCASE). In this study, we considered a problem that ASC faces in real-world applications: the model must have low complexity. We compared several models that apply lightweight techniques. First, a base CNN model was proposed using log mel-spectrogram, delta, and delta-delta features. Second, depthwise separable convolution and linear bottleneck inverted residual blocks were applied to the convolutional layers, and quantization was applied to the models to develop a low-complexity model. The low-complexity model performed similarly to or slightly worse than the base model, but the model size was reduced significantly, from 503 KB to 42.76 KB.
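
To see why depthwise separable convolution reduces model size, the weight counts of the two layer types can be compared directly. The sketch below is plain arithmetic, not the paper's model. The kernel size and channel counts are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
dsc = depthwise_separable_params(3, 64, 128)
reduction = std / dsc
```

For this hypothetical layer the separable form needs roughly one eighth of the weights. Quantization, which the abstract also mentions, reduces the storage per weight and is a separate step not shown here.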

Listenable Explanation for Heatmap in Acoustic Scene Classification (음향 장면 분류에서 히트맵 청취 분석)

  • Suh, Sangwon;Park, Sooyoung;Jeong, Youngho;Lee, Taejin
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.727-731 / 2020
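  • Analyzing the causes behind an artificial neural network's predictions is necessary in order to trust the model. To this end, the computer vision field has proposed model interpretation methods that visualize, in the form of saliency maps or heatmaps, what content a model based its prediction on. In the audio domain, however, visual interpretation on a spectrogram is not intuitive, and it is difficult to understand which sounds the decision was actually based on. This study therefore proposes a listening analysis system for heatmaps and uses it to run heatmap listening analysis experiments on an acoustic scene classification model, in order to check whether human-understandable explanations can be provided for the predictions of artificial neural networks.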

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.23 no.6 / pp.855-865 / 2018
  • Sound event detection is a research area that models human auditory cognition by recognizing events in an environment containing multiple acoustic events and determining the onset and offset time of each event. DCASE, a research group on acoustic scene classification and sound event detection, holds challenges to encourage researchers to participate and to stimulate sound event detection research. However, the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, a representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, sound events that can occur indoors and outdoors are collected on a larger scale and annotated to construct a dataset. Furthermore, to improve sound event detection performance, we developed a dual CNN structured sound event detection system by adding a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted comparative experiments against the baseline systems of both DCASE 2016 and DCASE 2017.
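
Determining the onset and offset time of each event, as described in the abstract above, is commonly done by thresholding frame-wise probabilities. The sketch below shows that generic post-processing step only. It is not the paper's dual CNN system, and the function name, threshold, and hop size are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def frames_to_events(probs: Sequence[float],
                     threshold: float = 0.5,
                     hop_seconds: float = 0.02) -> List[Tuple[float, float]]:
    """Convert frame-wise probabilities for one event class into (onset, offset) times.

    A frame is active when its probability is >= threshold. Each run of
    consecutive active frames becomes one event, with times in seconds.
    """
    events: List[Tuple[float, float]] = []
    start: Optional[int] = None
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i                                   # event begins at this frame
        elif not active and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:                               # event runs to the end of the clip
        events.append((start * hop_seconds, len(probs) * hop_seconds))
    return events


# Hypothetical per-frame probabilities with a 0.5 s hop between frames.
events = frames_to_events([0.1, 0.9, 0.8, 0.2, 0.7], threshold=0.5, hop_seconds=0.5)
```

In a multi-class setting the same function would be applied to each class's probability track separately, so that overlapping events get independent onset and offset times.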

Analysis to the Essential Factors of Humor Emerging in Chinese Cartoon Around Year of 2000 (2000년을 전후로 하여 중국 애니메이션에 나타난 유머요인 분석)

  • Dong, Peng;Oh, Jin-Hee
    • Cartoon and Animation Studies / s.36 / pp.189-215 / 2014
  • Since the launching of in 1963, a large number of outstanding cartoons were produced in China up to 1980. During this period, they earned an international reputation for their rich expressiveness and distinctive stories rooted in Chinese culture. In recent years, dozens of cartoons have been produced every year with government support. However, quality and influence have declined even as output has increased. Most animation made in China outwardly followed examples of international box-office successes. These cartoons neither gained international recognition nor displayed any traditional Chinese character, although their domestic box office is considered fairly successful. To achieve both artistic and commercial success, the key factors behind successful cases should be analyzed and researched rather than simply guessed at. This thesis proposes the humor factor as a key element of a successful cartoon. Before the discussion, a general definition of the humor factor is given through Henri Bergson's concept of comedy, on which the analysis of the key humor factors is based. A classification system is derived and introduced as a tool for analyzing humor factors. According to Henri Bergson, humor is determined by circumstance, language, and character factors. Taking the specific nature of the cartoon medium into consideration, this research divides humor factors into visual, scene, and acoustic factors. What is specific to the medium is that, in addition to visual and language factors, multiple acoustic elements are also introduced into the presentation. This classification system should be readily applicable to the analysis of humor factors in Chinese cartoons. In this study, Chinese animation masterpieces from around the year 2000 were analyzed by selecting and , and . This discussion of the key factors of humor is likely to benefit the future development of Chinese cartoons.