Search | Korea Science

A Study on the Development of Real Time Speech Detection in PC (PC를 이용한 실시간 음성검출 알고리즘에 관한 연구)

Chung, Hoon;Chung, Kwon;Chung, Ik-joo
- Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.129-132 / 1994
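This paper introduces a real-time speech detection algorithm that was developed and studied while building "voice access", a speech recognition software for Windows. The algorithm uses frame energy and frame zero-crossing rate, computed over 200-sample frames, together with the duration of the speech as its detection parameters. The threshold for each parameter is set from the mean of the signal, the standard deviation of the noise, the median standard deviation, and the phonetic characteristics of Korean. Because the thresholds are adjusted to the surrounding environment, detection remains robust when the ambient noise changes. Detection also runs in real time, which makes the method practical. The input is a signal sampled at 8 kHz with 16-bit resolution through an ordinary sound card. Analysis is performed on 200-sample frames with a 100-sample overlap. All analysis for speech detection was implemented in real time on a 486D or higher PC without the help of a dedicated DSP.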
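The framing and thresholding described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the frame length (200) and hop (100) come from the abstract, while the decision rule, the parameters `noise_frames`, `k`, `min_frames` and `alpha`, and all function names are assumptions made for the example.

```python
import statistics

FRAME_LEN = 200  # samples per analysis frame, as stated in the abstract
HOP_LEN = 100    # frames advance by 100 samples, i.e. a 100-sample overlap


def split_frames(signal, frame_len=FRAME_LEN, hop=HOP_LEN):
    """Split a sequence of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_speech(signal, noise_frames=10, k=3.0, min_frames=5, alpha=0.05):
    """Return (first_frame, last_frame) pairs for each detected speech run.

    Hypothetical decision rule, not taken from the paper: the first
    `noise_frames` frames are assumed to be background noise; a frame is
    active when its energy or its zero-crossing rate exceeds
    mean + k * std of that noise; runs shorter than `min_frames` are dropped.
    """
    frames = split_frames(signal)
    if len(frames) <= noise_frames:
        return []
    energies = [frame_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]

    e_mean = statistics.mean(energies[:noise_frames])
    e_std = statistics.pstdev(energies[:noise_frames])
    z_thr = (statistics.mean(zcrs[:noise_frames])
             + k * statistics.pstdev(zcrs[:noise_frames]))

    segments, start = [], None
    for i, (e, z) in enumerate(zip(energies, zcrs)):
        active = e > e_mean + k * e_std or z > z_thr
        if active:
            if start is None:
                start = i
            continue
        # Inactive frame: let the noise estimate track the environment.
        e_mean += alpha * (e - e_mean)
        if start is not None:
            if i - start >= min_frames:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(frames) - start >= min_frames:
        segments.append((start, len(frames) - 1))
    return segments
```

At the 8 kHz sampling rate given in the abstract, a 200-sample frame spans 25 ms and the 100-sample hop gives one decision every 12.5 ms. The exponential update of `e_mean` on inactive frames stands in for the adaptive thresholds the abstract mentions; the paper's actual update rule is not given here.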

An Analysis of Face Recognition Methods for Recognition of Game Player's Facial Expression (게임 사용자 얼굴표정 인식을 위한 얼굴인식 기법 분석)

Yoo, Chae-Gon
- Journal of Korea Game Society / v.3 no.2 / pp.19-23 / 2003
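As computer technology advances, a variety of cutting-edge technologies are being applied to games as well. For example, video cards with powerful 3D acceleration, 5.1-channel sound, force-feedback input devices, steering wheels, infrared sensors, and voice detectors are used as game input and output interfaces. Beyond these typical methods, research on optical methods and on ways of playing handheld game consoles is also active. Recently, technology that recognizes a person's motion and accepts it as game input has been commercialized for video game consoles as well. With this direction of development in mind, this paper examines approaches to implementing an interface based on recognition of human facial expressions, which could serve as a next-generation game interface. Games that take facial expressions as input can bring psychological changes into the game, and can also be a useful way for young children or people with disabilities to play. Because automatic face recognition and analysis from images can be applied to many application areas, much research has been carried out. In face recognition, the recognition rate and the purpose of recognition vary with factors such as the form of the image (video or still image), its resolution, and the degree of illumination. Recognizing a game player's expression requires an accurate face recognition method, so we survey relatively recent research trends toward that end.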

Digital Video Record System for Classification of Car Accident Sounds in the Parking Lot. (주차장 차량사고 음향분류 DVR시스템)

Yoon, Jae-Min
- Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.429-432 / 2010
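Many kinds of incidents and accidents occur in parking lots, but existing DVR (CCTV) systems support only simple video recording, which limits how effectively those events can be analyzed. If the type of sound occurring in a scene could be identified from the video and audio signals captured by the DVR's camera and microphone, and the video segments in which particular sounds occur could be stored and searched, parking lot managers could respond to incidents effectively. This study proposes an effective feature vector for classifying vehicle-related sounds that occur in parking lots (collision, speeding, horn, glass breakage, and screams), designs a neural-network vehicle sound classifier using the proposed feature vector, and evaluates its performance, thereby proposing a method for classifying vehicle sounds effectively. In addition, the neural-network classifier was integrated with a DVR system to analyze the sound signal from the microphone in real time and record the video segments in which particular sounds occur, yielding a system that searches for and displays the corresponding accident video by sound keyword.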

Feature Comparison of Emotion Recognition Models using Face Images (얼굴사진 기반 감정인식 모델의 특성 분석)

Kim, MinGeyung;Yang, Jiyoon;Choi, Yoo-Joo
- Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.615-617 / 2022
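As a preliminary study toward building an ensemble network that combines a deep network for emotion recognition from face images with a deep network for emotion recognition from speech sound, this paper classifies existing deep neural network models that recognize emotion from face images according to how they process input data, and analyzes the characteristics of each approach. In addition, emotion recognition networks based on the appearance features of face images are built in several architectures and their performance is compared, so that the best-performing network can be selected as a component of the future ensemble network.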
https://doi.org/10.3745/PKIPS.y2022m11a.615

Implementation of ARM based Embedded System for Muscular Sense into both Color and Sound Conversion (근감각-색·음 변환을 위한 ARM 기반 임베디드시스템의 구현)

Kim, Sung-Ill
- The Journal of the Korea Contents Association / v.16 no.8 / pp.427-434 / 2016
This paper focuses on real-time hardware processing, implementing an ARM Cortex-M4 based embedded system that runs an algorithm for converting muscular sense into both visual and auditory elements. Among the human senses, muscular sense perceives rotations of the body, changes of direction, and amounts of motion. As the input method for muscular sense, an AHRS (Attitude Heading Reference System) was used to acquire roll, pitch and yaw values in real time. These three input values were converted into three elements of the HSI color model: intensity, hue and saturation, respectively. The final color signals were obtained by converting HSI into the RGB color model. In addition, the three input values were converted into three elements of sound (octave, scale and velocity), which were synthesized into an output sound using MIDI (Musical Instrument Digital Interface). Analysis of the output color and sound signals showed that the muscular-sense input signals were correctly converted into both color and sound in real time by the proposed conversion method.
https://doi.org/10.5392/JKCA.2016.16.08.427
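The conversion chain in this abstract (roll, pitch, yaw to HSI, then HSI to RGB, plus a MIDI mapping) might look roughly like the sketch below. The HSI-to-RGB step uses the standard sector formula; everything else is an assumption for illustration: the angle ranges, the linear normalization, the reading of "scale" as a semitone within the octave, and the function names are not taken from the paper.

```python
import math


def normalize(value, low, high):
    """Linearly map value from [low, high] to [0, 1], clamping at the ends."""
    return min(1.0, max(0.0, (value - low) / (high - low)))


def hsi_to_rgb(h_deg, s, i):
    """Convert HSI (hue in degrees, s and i in [0, 1]) to 8-bit RGB."""
    sector, h = divmod(h_deg % 360.0, 120.0)   # which third of the hue circle
    hr = math.radians(h)
    low = i * (1.0 - s)                        # weakest channel in this sector
    main = i * (1.0 + s * math.cos(hr) / math.cos(math.radians(60.0) - hr))
    rest = 3.0 * i - (low + main)              # remaining channel
    if sector == 0:      # red-green sector
        r, g, b = main, rest, low
    elif sector == 1:    # green-blue sector
        r, g, b = low, main, rest
    else:                # blue-red sector
        r, g, b = rest, low, main
    # Channels can overshoot 1.0 for bright saturated colors, so clamp.
    return tuple(round(255 * min(1.0, max(0.0, c))) for c in (r, g, b))


def motion_to_rgb(roll, pitch, yaw):
    """Map roll -> intensity, pitch -> hue, yaw -> saturation (order per abstract).

    Assumed ranges (hypothetical): roll and yaw in [-180, 180] degrees,
    pitch in [-90, 90] degrees.
    """
    intensity = normalize(roll, -180.0, 180.0)
    hue = normalize(pitch, -90.0, 90.0) * 360.0
    saturation = normalize(yaw, -180.0, 180.0)
    return hsi_to_rgb(hue, saturation, intensity)


def motion_to_midi(roll, pitch, yaw):
    """Map roll -> octave, pitch -> semitone, yaw -> velocity (hypothetical scaling)."""
    octave = round(normalize(roll, -180.0, 180.0) * 8)       # 0..8
    semitone = round(normalize(pitch, -90.0, 90.0) * 11)     # 0..11
    velocity = round(normalize(yaw, -180.0, 180.0) * 127)    # 0..127
    note = 12 * (octave + 1) + semitone                      # C4 = 60 convention
    return note, velocity
```

For instance, `motion_to_midi(0, -90, 180)` gives note 60 (middle C under the C4 = 60 convention) at full velocity. On the actual Cortex-M4 target this logic would presumably be written in C with fixed sensor ranges; Python is used here only to show the arithmetic.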

Development of Interactive Video Using Real-time Optical Flow and Masking (옵티컬 플로우와 마스킹에 의한 실시간 인터렉티브 비디오 개발)

Kim, Tae-Hee
- The Journal of the Korea Contents Association / v.11 no.6 / pp.98-105 / 2011
Recent advances in computer technology support real-time image processing and special effects on personal computers. This paper presents and analyzes a real-time interactive video system. The motivation is to realize an artistic concept: transforming the visual variations over time in a video of sea waves into sound, so that audience members experience overlapping themselves onto nature. In practice, a video of sea waves taken on a beach is processed with an optical flow algorithm to extract the visual variations between video frames. The result is then masked by the silhouette of the audience and projected in a gallery space. Intensity information is extracted from the resulting video and translated into piano sounds accordingly. The work generates an interactive space that realizes the intended concept.
https://doi.org/10.5392/JKCA.2011.11.6.098
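The masking and sonification steps in this abstract can be illustrated with a small sketch. It assumes the optical flow field has already been computed elsewhere and is given as a grid of per-pixel `(dx, dy)` vectors; the linear mapping from intensity to pitch, the `max_intensity` scale, and all function names are hypothetical choices for the example, not details from the paper.

```python
import math


def flow_magnitude(flow):
    """Per-pixel magnitude of a flow field given as rows of (dx, dy) pairs."""
    return [[math.hypot(dx, dy) for dx, dy in row] for row in flow]


def masked_intensity(magnitude, mask):
    """Mean magnitude over pixels where the silhouette mask is truthy."""
    total, count = 0.0, 0
    for mag_row, mask_row in zip(magnitude, mask):
        for value, inside in zip(mag_row, mask_row):
            if inside:
                total += value
                count += 1
    return total / count if count else 0.0


def intensity_to_note(intensity, max_intensity=10.0, low_note=21, high_note=108):
    """Map intensity linearly onto the 88-key piano range (MIDI 21 to 108)."""
    frac = min(1.0, max(0.0, intensity / max_intensity))
    return low_note + round(frac * (high_note - low_note))
```

A per-frame loop would call `flow_magnitude`, then `masked_intensity` with the current silhouette, then `intensity_to_note` to pick the note to play. Whether the original work maps intensity to pitch, loudness, or note density is not stated in the abstract.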

Design and Implementation of Authoring Tools for Multimedia Production (멀티미디어 제작을 위한 저작도구의 설계 및 구현)

Yoo Su-Mi;Baik Sung-Wook;Bang Kee-Chun
- Journal of Digital Contents Society / v.4 no.1 / pp.45-55 / 2003
Owing to the rapid development of information and communication technology in high-performance computing environments, multimedia production techniques have been applied to a variety of fields, from general banner advertisements that include text, images and animations to internet broadcasting that deals with video and sound. This paper presents an authoring tool whose main functions are to set up event objects (image, animation, sound, button, area) and to set up action functions, so that non-experts can easily produce multimedia including images, sounds, animations and so on. The authoring tool, implemented in Java, can be applied to CD-ROM title production as well as web-site construction. We expect that using this authoring tool in multimedia production will reduce both cost and time, owing to its convenience and powerful functions. We plan to integrate intelligent multimedia presentation techniques with the presented tool for autonomous multimedia authoring.

A Study on Interactive Sound Installation and User Intention Analysis - Focusing on an Installation: Color note (인터렉티브 사운드 설치와 사용자 의도 분석에 관한 연구 - 작품 Color note 를 중심으로)

Han, Yoon-Jung;Han, Byeong-Jun
- Proceedings of the HCI Society of Korea Conference / 2008.02b / pp.268-273 / 2008
This work defines user intention according to its range and proposes an interactive sound installation that reflects and varies these features. User intention is decomposed into several concepts: elemental intentions, partial intentions, and a universal intention. Each concept is defined by an inclusion/affiliation relationship with the others. To represent elemental intention, we implemented a musical interface, Color note, which presents colors and notes according to the responses of participants. We also propose Harmonic Defragmentation (HD), which arranges the partial intentions according to harmonic rules. Finally, the universal intention is inferred as the comprehensive direction of the elemental intentions, using the Karhunen-Loève (K-L) transform for the inference. To verify the validity of the proposed interface and techniques, we installed the work and surveyed various users to evaluate HD and the statistical techniques. We also commissioned another survey to measure satisfaction, which was used for expressing the universal intention.

Brain Correlates of Emotion for XR Auditory Content (XR 음향 콘텐츠 활용을 위한 감성-뇌연결성 분석 연구)

Park, Sangin;Kim, Jonghwa;Park, Soon Yong;Mun, Sungchul
- Journal of Broadcast Engineering / v.27 no.5 / pp.738-750 / 2022
In this study, we reviewed and discussed whether short auditory stimuli can evoke emotion-related neurological responses. The findings imply that if personalized sound tracks are provided to XR users based on machine learning or probability network models, user experiences in XR environments can be enhanced. We also investigated whether the arousal-relaxed factor evoked by short sounds produces distinct patterns in functional connectivity characterized from background EEG signals. We found that coherence in the right hemisphere increases in the sound-evoked arousal state, and vice versa in the relaxed state. Our findings can be applied in developing XR sound bio-feedback systems that provide preferred sounds to users for highly immersive XR experiences.
https://doi.org/10.5909/JBE.2022.27.5.738

Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
This paper proposes an automatic indexing algorithm for golf video using audio information. In the proposed algorithm, the input stream is demultiplexed into video and audio streams. Using an AdaBoost-cascade classifier, the continuous audio stream is classified into the announcer's speech recorded in the studio, music accompanying players' names on the TV screen, audience reaction to the play, the reporter's speech with field background, and field noise such as wind or waves. Golf swing sounds, including drive, iron, and putting shots, are detected by impulse onset detection and modulation spectrum verification. The detected swings and applause are used to index action or highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its small computational requirement, which makes it easier to apply the technology to embedded consumer electronic devices for fast browsing.
https://doi.org/10.7776/ASK.2009.28.5.441
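Impulse onset detection of the kind mentioned in this abstract is often approximated by looking for a sudden jump in short-time energy. The sketch below shows only that generic idea; it is not the paper's detector, it omits the modulation spectrum verification step entirely, and the frame length, the `ratio` threshold, and the function name are assumptions.

```python
def detect_onsets(signal, frame_len=256, ratio=4.0, floor=1e-9):
    """Return sample indices where frame energy jumps sharply.

    A frame starts an onset when its mean energy exceeds `ratio` times the
    energy of the previous (non-overlapping) frame. `floor` avoids treating
    any sound after digital silence as an infinite jump.
    """
    energies = [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    onsets = []
    for k in range(1, len(energies)):
        if energies[k] > ratio * max(energies[k - 1], floor):
            onsets.append(k * frame_len)
    return onsets
```

In a pipeline like the one described, each candidate onset would then be passed to a second stage that confirms or rejects it as a swing sound.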

Search Result 176, Processing Time 0.024 seconds
