• Title/Abstract/Keywords: speaker tracking

Search results: 23 (processing time: 0.031 s)

Speaker Tracking Using Eigendecomposition and an Index Tree of Reference Models

  • Moattar, Mohammad Hossein;Homayounpour, Mohammad Mehdi
    • ETRI Journal / Vol. 33, No. 5 / pp. 741-751 / 2011
  • This paper focuses on online speaker tracking for telephone conversations and broadcast news. Since online operation imposes limitations on the tracking strategy, such as insufficient data, a reliable approach is needed to compensate for this shortage. In this framework, a set of reference speaker models is used as side information to facilitate online tracking. To improve the indexing accuracy, adaptation approaches in the eigenvoice decomposition space are proposed. We believe that the eigenvoice adaptation techniques help to embed the speaker space in the models and hence enrich the generality of the selected speaker models. In addition, an index structure over the reference models is proposed to speed up the search in the model space. The proposed framework is evaluated on the 2002 Rich Transcription Broadcast News and Conversational Telephone Speech corpora as well as on a synthetic dataset. The indexing errors of the proposed framework on telephone conversations, broadcast news, and the synthetic dataset are 8.77%, 9.36%, and 12.4%, respectively. Using the index tree structure, the run time of the proposed framework is improved by 22%.

Segmentation and Tracking Algorithm for Moving Speaker in the Video Conference Image

  • 최우영;김한메
    • 전기전자학회논문지 / Vol. 6, No. 1 / pp. 54-64 / 2002
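  • This paper proposes an algorithm that segments the speaker in video conference image data and tracks the speaker's motion. To allow real-time processing, the algorithm is simplified into two sequential stages: speaker segmentation followed by motion tracking. In the segmentation stage, the speaker is segmented using motion information obtained by a differencing method together with image brightness information. A reference mask image is then generated from the segmented speaker. In the motion tracking stage, a block matching algorithm is used that tracks motion quickly by excluding blocks that are unnecessary for motion tracking. In simulations, applying the proposed algorithm to several test sequences correctly segmented the moving speaker and tracked the speaker's motion.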

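The block matching step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes grayscale frames stored as lists of rows, an exhaustive search with a sum-of-absolute-differences cost, and a binary speaker mask used to skip blocks. The function names `sad`, `match_block`, and `track_masked_blocks` and the parameters `size` and `search` are hypothetical.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block of `curr` and a shifted block of `prev`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(curr[top + y][left + x] - prev[top + dy + y][left + dx + x])
    return total


def match_block(prev, curr, top, left, size, search):
    """Return the (dy, dx) shift into `prev` that best matches the block of `curr` at (top, left)."""
    height, width = len(prev), len(prev[0])
    best_shift, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate positions that fall outside the previous frame.
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue
            cost = sad(prev, curr, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift


def track_masked_blocks(prev, curr, mask, size, search):
    """Match only the blocks that overlap the speaker mask (assumed exclusion rule)."""
    shifts = {}
    for top in range(0, len(curr) - size + 1, size):
        for left in range(0, len(curr[0]) - size + 1, size):
            covered = any(mask[top + y][left + x]
                          for y in range(size) for x in range(size))
            if covered:
                shifts[(top, left)] = match_block(prev, curr, top, left, size, search)
    return shifts
```

Skipping blocks outside the mask is what keeps the search cheap here: the cost of the exhaustive search is paid only for blocks that contain part of the segmented speaker.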

Speaker Tracking System for Autonomous Mobile Robot

  • 이창훈;김용호
    • 대한전기학회: Conference Proceedings / 대한전기학회 2002 Joint Fall Conference Proceedings, Information and Control Division / pp. 142-145 / 2002
  • This paper describes an omnidirectional speaker tracking system for a mobile robot interface in a real environment. Its purpose is to robustly detect a sound source over 360 degrees and to recognize voice commands at long distances (60-300 cm). We consider spatial features, namely the relation between position and interaural time differences, and realize the speaker tracking system using a fuzzy inference process based on inference rules generated from these spatial features.

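The relation between source position and interaural time difference mentioned in this abstract can be illustrated with a simple two-microphone model. This sketch is an assumption-laden stand-in, not the paper's fuzzy inference system: it uses a far-field geometric formula, azimuth = arcsin(c · ITD / d), and a brute-force cross-correlation to estimate the delay. The names `estimate_delay_samples` and `itd_to_azimuth`, the speed of sound of 343 m/s, and the microphone spacing are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay_samples(left, right, max_lag):
    """Return the lag (in samples) at which `right` best aligns with `left`.

    A positive result means the right channel lags the left channel.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def itd_to_azimuth(itd_seconds, mic_spacing_m):
    """Convert a time difference to an azimuth in degrees under a far-field model."""
    ratio = SPEED_OF_SOUND * itd_seconds / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))
```

A typical use would estimate the lag on a short frame, divide it by the sampling rate to obtain the time difference, and pass that to `itd_to_azimuth`. A single microphone pair cannot distinguish front from back, which is one reason a 360-degree system needs additional spatial cues.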

A Speaker Change Detection Experiment that Uses a Statistical Method

  • 이경록;김진영
    • 음성과학 / Vol. 8, No. 4 / pp. 59-72 / 2001
  • In this paper, we experimented with speaker change detection using a statistical method for an NOD (News On Demand) service. Because a speaker change in news data signals a change of content, analyzing speaker changes helps identify the content of each segment of the speech. Speaker change detection acts as a preprocessor that divides the input speech by speaker, which is an important preprocessing phase for speaker tracking. We detected speaker changes using GLR (generalized likelihood ratio) distance-based segmentation and BIC (Bayesian information criterion)-based segmentation, both metric-based methods. In the experiment, the speech was first divided into speaker units using the GLR distance-based method, and the speaker change points were then verified using BIC-based segmentation. In the experimental results, at an MDR (Missed Detection Rate) of about 15%, the FAR (False Alarm Rate) was 63.29 in a high-noise environment and 54.28 in a low-noise environment.

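The BIC test referred to in this abstract can be sketched for a simplified case. This is an illustrative assumption, not the authors' setup: each frame is reduced to a single scalar feature modeled by a univariate Gaussian, whereas practical systems typically use multivariate cepstral features. With one dimension, each Gaussian has two free parameters, so the penalty is λ · log N. A positive ΔBIC favors the hypothesis of two speakers. The names `delta_bic` and `detect_change` and the parameters `lam` and `margin` are hypothetical.

```python
import math


def variance(values):
    """Maximum-likelihood variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def delta_bic(window, split, lam=1.0):
    """BIC gain from modeling `window` as two Gaussians split at index `split`.

    Positive values favor a speaker change at `split`.
    """
    n = len(window)
    left, right = window[:split], window[split:]
    params = 2  # mean and variance of a univariate Gaussian
    penalty = 0.5 * lam * params * math.log(n)
    return (0.5 * n * math.log(variance(window))
            - 0.5 * len(left) * math.log(variance(left))
            - 0.5 * len(right) * math.log(variance(right))
            - penalty)


def detect_change(window, margin=10, lam=1.0):
    """Return the split index with the largest positive delta BIC, or None."""
    best_split, best_score = None, 0.0
    # Keep at least `margin` samples on each side so both variances are stable.
    for split in range(margin, len(window) - margin):
        score = delta_bic(window, split, lam)
        if score > best_score:
            best_split, best_score = split, score
    return best_split
```

Raising `lam` makes the detector more conservative, which trades false alarms against missed detections, the same trade-off the abstract reports through FAR and MDR.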

Development of a flood prevention system scenario using IoT Directional speaker Seamless-tracking technology

  • Lee, Yongsuk;Im, Sua;Shin, Jongkyun
    • 한국재난정보학회 논문집 / Vol. 13, No. 1 / pp. 106-117 / 2017
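  • This study presents scenarios for efficiently demonstrating the life-guardian system being developed for the proactive prevention of, and effective response to, social disasters. Based on the accident types and the technologies under development that are required for a life-guardian system built on convergence technologies such as directional speakers, the scenarios are designed so that accident prevention and response can be carried out quickly.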

Invisible Messenger: A System to Whisper in a Person's Ear Remotely by integrating Visual Tracking and Speaker Array

  • Mizoguchi, Hiroshi;Kanamori, Tomohiko;Okabe, Kosuke;Hiraoka, Kazuyuki;Tanaka, Masaru;Shigehara, Takaomi;Mishima, Taketoshi
    • 대한전자공학회: Conference Proceedings / 대한전자공학회 2002 ITC-CSCC -3 / pp. 1897-1900 / 2002
  • This paper proposes a novel computer-human interface named Invisible Messenger. It integrates face detection and tracking with speaker array signal processing. Using a speaker array, it is possible to form an acoustic focus at an arbitrary location measured by the face tracking. Thus the proposed system can whisper in a person's ear as if an invisible virtual messenger were standing beside the person. Going beyond speculative discussion, the authors have implemented a working prototype system based on the proposed idea, and this paper also describes the prototype. To confirm the effectiveness of the proposed idea, the authors conduct experiments using the implemented system. Experimental results demonstrate the effectiveness of the proposed idea.

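One common way to form an acoustic focus with a speaker array is delay-and-sum focusing, in which each speaker is delayed so that all wavefronts reach the target point at the same instant. The abstract does not state which signal processing the prototype uses, so the following is only a sketch of that general technique. The name `focusing_delays`, the coordinates in meters, and the speed of sound of 343 m/s are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def focusing_delays(speaker_positions, focus_point):
    """Return per-speaker delays (seconds) that align arrivals at `focus_point`.

    The farthest speaker gets zero delay; nearer speakers wait so that
    every wavefront arrives at the focus at the same time.
    """
    distances = [math.dist(position, focus_point) for position in speaker_positions]
    farthest = max(distances)
    return [(farthest - distance) / SPEED_OF_SOUND for distance in distances]
```

In a system like the one described, `focus_point` would come from the face tracker, and the delays would be recomputed whenever the tracked position changes.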

Microsoft-Kinect Sensor utilizing People Tracking System

  • 반태학;이상원;김재민;정회경
    • 한국정보통신학회: Conference Proceedings / 한국정보통신학회 2015 Spring Conference / pp. 611-613 / 2015
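  • Multimedia lecture rooms are evolving to support not only automatic lecture recording but also automatic camera tracking during recording. Existing tracking systems are inconvenient because they track either by attaching a separate sensor to the body or by installing sensors at the front of the room, and they suffer from problems such as errors that cause tracking to fail when several people appear at the front at the same time. This paper describes an unmanned speaker tracking solution that uses a Microsoft-Kinect sensor to analyze the position and actions of the speaker (lecturer) and links this analysis with a PTZ camera and a lecture recording system, enabling effective content production when recording classroom lectures.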


Development of a Cost-Effective Tele-Robot System Delivering Speaker's Affirmative and Negative Intentions

  • 진용규;유수정;조혜경
    • 로봇학회논문지 / Vol. 10, No. 3 / pp. 171-177 / 2015
  • A telerobot offers a more engaging and enjoyable interaction with people at a distance by communicating via audio, video, expressive gestures, body pose, and proxemics. To provide these potential benefits at a reasonable cost, this paper presents a telepresence robot system for video communication that can deliver the speaker's head motion through its display stanchion. Head gestures such as nodding and head-shaking can give crucial information during conversation. We can also infer the speaker's eye gaze, which is known as one of the key non-verbal signals for interaction, from his or her head pose. To develop an efficient head tracking method, a 3D cylinder-like head model is employed, and the Harris corner detector is combined with Lucas-Kanade optical flow, which is known to be suitable for extracting 3D motion information of the model. In particular, a skin color-based face detection algorithm is proposed to achieve robust performance under varying head directions while maintaining a reasonable computational cost. The performance of the proposed head tracking algorithm is verified through experiments using BU's standard data sets. The design of the robot platform is also described, along with the design of supporting systems such as the video transmission and robot control interfaces.

Optimizations of Air-trap Locations in the Speaker Encloser of Mobile Phone by Injection Molding Simulations

  • 박기윤;박종천
    • 한국기계가공학회지 / Vol. 10, No. 5 / pp. 85-90 / 2011
  • In this paper, a design procedure based on computer-aided molding simulation is presented to optimize the air-trap locations in the speaker enclosure of a mobile phone. The molding flow simulation reveals that the race-tracking phenomenon is the dominant feature in the current mold design. To obtain an optimal filling pattern, local modifications of the wall thickness, such as a flow leader attachment, are considered as the primary control factor, and the gate position and the filling time are the secondary control factors. Using a one-at-a-time approach, the last location to be filled in the mold cavity could be successfully moved to the extremities of the part, allowing natural ventilation of entrapped air through the mold parting plane.

Design of High-efficiency Power Amplifier System for High-directional Speaker

  • 김진영;김인동;문원규
    • 전기학회논문지 / Vol. 66, No. 8 / pp. 1215-1221 / 2017
  • Parametric array transducers are used for highly directional speakers in air. Piezoelectric micromachined ultrasonic transducers for parametric array transducers need DC-biased driving voltage signals in order to obtain highly directional, high-quality sound. Existing power amplifiers, such as class A amplifiers, have low efficiency and require large heatsinks. To overcome these disadvantages of conventional amplifiers, this paper proposes a new power amplifier system. The proposed system ensures high linearity of the output characteristic by using a push-pull class B amplifier. Furthermore, it achieves high efficiency because it contains a DC-DC converter-type power supply that performs energy recovery and envelope tracking. The paper also presents the detailed circuit topology, and its characteristics are verified by detailed experimental results.