• Title/Summary/Keyword: Multiple Audio Features

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.23 no.3 / pp.351-360 / 2018
  • Human emotion recognition is a research topic that receives continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions in a wild environment through multiple neural networks based on multi-modal signals consisting of image, landmark, and audio data. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that use the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on the model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to the specific emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised learning and semi-supervised learning networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
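
The abstract does not say how the emotion adaptive fusion is computed. As a rough illustration only, the sketch below assumes a per-class weighted late fusion of the three networks' class probabilities, so that a modality can count more for the emotions it handles well. The label subset, the probabilities, and the weights are all hypothetical and are not taken from the paper.

```python
# Hypothetical label subset; the challenge data uses more emotion classes.
EMOTIONS = ["angry", "happy", "sad"]


def fuse(probs, weights):
    """Per-class weighted late fusion (an assumed scheme, not the paper's).

    probs and weights map a modality name to a list with one value per class.
    Returns the fused scores, normalized so that they sum to 1.
    """
    fused = [0.0] * len(EMOTIONS)
    for modality, p in probs.items():
        w = weights[modality]
        for c in range(len(EMOTIONS)):
            fused[c] += w[c] * p[c]
    total = sum(fused)
    return [v / total for v in fused]


# Made-up per-network class probabilities for one video clip.
probs = {
    "image": [0.5, 0.3, 0.2],
    "landmark": [0.4, 0.4, 0.2],
    "audio": [0.2, 0.1, 0.7],
}
# Made-up weights: audio is trusted more for "sad", less for the other classes.
weights = {
    "image": [1.0, 1.0, 0.5],
    "landmark": [1.0, 1.0, 0.5],
    "audio": [0.5, 0.5, 2.0],
}

fused = fuse(probs, weights)
predicted = EMOTIONS[max(range(len(EMOTIONS)), key=lambda c: fused[c])]
print(predicted)
```

With these made-up numbers the image and landmark networks lean toward "angry", but the audio network's higher weight on "sad" decides the fused prediction.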

A Threshold Adaptation based Voice Query Transcription Scheme for Music Retrieval (음악검색을 위한 가변임계치 기반의 음성 질의 변환 기법)

  • Han, Byeong-Jun;Rho, Seung-Min;Hwang, Een-Jun
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.2 / pp.445-451 / 2010
  • This paper presents a threshold adaptation based voice query transcription scheme for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) the Energetic Feature eXtractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE) for defining multiple small but coherent windows with local threshold values as an offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate acquisition of the fundamental frequency (F0) of each frame. To evaluate the performance of the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments. In the experiments, we used the QBSH corpus [1], adapted in the MIREX 2006 contest data set. Experimental results show that the proposed scheme can improve transcription performance.
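
To make the CAMDF step concrete, here is a minimal sketch of F0 estimation with a circular average magnitude difference function, written from the general definition of the technique rather than from the AMT2 implementation. The function names, the search range, and the synthetic test tone are assumptions for illustration.

```python
import math


def camdf(frame, lag):
    """Circular AMDF at one lag: sum over n of |x[(n + lag) mod N] - x[n]|."""
    n = len(frame)
    return sum(abs(frame[(i + lag) % n] - frame[i]) for i in range(n))


def estimate_f0(frame, sample_rate, f0_min, f0_max):
    """Estimate F0 in Hz as sample_rate / lag at the deepest CAMDF valley.

    Only lags whose frequency lies between f0_min and f0_max are searched.
    """
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(len(frame) // 2, int(sample_rate / f0_min))
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: camdf(frame, lag))
    return sample_rate / best_lag


# Synthetic test frame: a 200 Hz sine sampled at 8 kHz (period of 40 samples).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]

# The range 120-400 Hz covers lags 20-66, so only one full period fits in it.
f0 = estimate_f0(frame, sample_rate, f0_min=120.0, f0_max=400.0)
print(f0)
```

The search range is kept narrow on purpose. CAMDF is also near zero at integer multiples of the true period, so a wider range would need an extra rule, such as preferring the smallest lag among near-equal valleys, to avoid octave errors.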

Design and Implementation of the Endoscope Image Store System in the Orthopedics (정형외과 관절경 영상 저장 시스템의 설계 및 구현)

  • 심갑식;정태영
    • Journal of the Korea Society of Computer and Information / v.7 no.4 / pp.8-15 / 2002
  • This paper proposes the design and implementation of a database system for storing medical images. The system collects medical images when doctors operate on and diagnose patients using an endoscope in orthopedics, then stores the medical image data in a database. The system therefore avoids duplicated medical data and retrieves and updates the medical data effectively. The medical image data can be shared among multiple users and application programs. The system consists of five components: the input module that acquires the medical image from the endoscope; the module that stores the medical image; the database design and implementation that stores the patient's disease history and the medical image data; the user-friendly interface design and implementation; and a simple data retrieval engine. The features of the system are as follows. The image catcher program, which uses DirectShow, is portable to any image catcher board. Because the image catcher algorithm is implemented as a public module, throughput can be increased during the development of video and audio contents on the internet.

Impact of a Breast Health Awareness Activity on the Knowledge Level of the Participants and its Association with Socio-Demographic Features

  • Khokher, Samina;Qureshi, Muhammad Usman;Fatima, Warda;Mahmood, Saqib;Saleem, Afaf
    • Asian Pacific Journal of Cancer Prevention / v.16 no.14 / pp.5817-5822 / 2015
  • Developing countries have higher mortality rates for breast cancer. One reason is presentation at advanced stages due to low levels of public awareness. Health authorities in developing countries arrange activities to increase women's knowledge, but their effectiveness has not been evaluated in detail. A multiple-choice questionnaire with questions about socio-demographic profile and breast cancer knowledge was designed in the local language, Urdu, to evaluate the knowledge of the participants before and after an audio-visual educational activity in Lahore, Pakistan. Scores of 0-2, 3-5 and 6-8 were ranked as poor, fair and good, respectively. Among 146 participants these scores were achieved by 1%, 55% and 45% before the activity and 0%, 16% and 84% after the activity. Overall, 66% of participants increased their knowledge score. Younger age, higher education, reliance on television as a source of information and being a housewife were associated with better impact of the awareness activity. For the six knowledge-related questions, 3%, 5%, 11%, 23%, 33% and 44% more participants gave correct answers after the activity. However, 6% and 7% fewer participants answered correctly for the 2 questions related to the cause of and the best prevention for breast cancer. The study indicated that awareness activities are effective in increasing the knowledge of women and that better impact is associated with higher education and younger age. The component analysis showed that questions and related presentations using medical terms have a negative impact and should therefore not be used. Analysis of the activity therefore leads to identification of deficiencies that can be remedied in future.

Real-time Interactive Control of Magnetic Resonance Imaging System Using High-speed Digital Signal Processors (고속 DSP를 이용한 실시간 자기공명영상시스템 제어)

  • 안창범;김휴정;이흥규
    • Journal of the Institute of Electronics Engineers of Korea SC / v.40 no.5 / pp.341-349 / 2003
  • A real-time interactive controller (spectrometer) for a magnetic resonance imaging (MRI) system has been developed using high-speed digital signal processors (DSPs). The controller generates radio frequency (RF) waveforms and audio frequency gradient waveforms and controls multiple receivers for data acquisition. By employing DSPs with high computational power (e.g., TMS320C6701), real-time generation of complicated gradient waveforms and interactive control of selection planes become possible, which are important features in real-time imaging of moving organs, e.g., cardiac imaging. The spectrometer was successfully implemented on a 1.5 Tesla whole-body MRI system for clinical application. The performance of the spectrometer is verified by various experiments, including high-speed imaging such as fast spin echo (FSE) and echo planar imaging (EPI). These high-speed imaging techniques reduce measurement time; however, they usually intensify artifacts if there is any systematic phase error or jitter in the synchronization between the transmitter, receiver, and gradients.