• Title/Summary/Keyword: 음성통신 (voice communication)


A study of Artificial Intelligence (AI) Speaker's Development Process in Terms of Social Constructivism: Focused on the Products and Periodic Co-revolution Process (인공지능(AI) 스피커에 대한 사회구성 차원의 발달과정 연구: 제품과 시기별 공진화 과정을 중심으로)

  • Cha, Hyeon-ju;Kweon, Sang-hee
    • Journal of Internet Computing and Services
    • /
    • v.22 no.1
    • /
    • pp.109-135
    • /
    • 2021
  • This study classified the development process of artificial intelligence (AI) speakers by analyzing news texts about AI speakers in traditional news reports, and identified the characteristics of each product by period. The theoretical background of the analysis consists of news frames and topic frames. The analysis methods were topic modeling using the LDA method and semantic network analysis, and the research method was content analysis. First, 2,710 news articles related to AI speakers from 2014 to 2019 were collected; second, topic frames were analyzed using the NodeXL algorithm. The first result is that the trend of topic frames by AI speaker provider type differed according to the characteristics of the four operator types (communication service provider, online platform, OS provider, and IT device manufacturer). Specifically, online platform operators (Google, Naver, Amazon, Kakao) appeared in a frame that uses AI speakers as 'search or input devices'. In contrast, telecommunications operators (SKT, KT) showed prominent frames for IPTV, the parent company's flagship business, and for an 'auxiliary device' of the telecommunications business. Furthermore, the frame of 'personalization of products and voice service' was notable for OS operators (MS, Apple), and the frame for the IT device manufacturer (Samsung) was 'Internet of Things (IoT) Integrated Intelligence System'. The second result is that the trend of topic frames by AI speaker development period (by year) centered on AI technology in the first phase (2014-2016), related to the social interaction between AI technology and users in the second phase (2017-2018), and shifted from AI technology-centered to user-centered in the third phase (2019). The QAP analysis showed that news frames by business operator and development period in AI speaker development are socially constructed by determinants of media discourse. The implication of this study is that the evolution of AI speakers follows the characteristics of the parent company and a process of co-evolution driven by interactions with users, by business operator and development period. The results are important indicators for predicting the future prospects of AI speakers and suggesting directions accordingly.

Real data-based active sonar signal synthesis method (실데이터 기반 능동 소나 신호 합성 방법론)

  • Yunsu Kim;Juho Kim;Jongwon Seok;Jungpyo Hong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.1
    • /
    • pp.9-18
    • /
    • 2024
  • The importance of active sonar systems is growing as underwater targets become quieter and ambient noise rises with increasing maritime traffic. However, the low signal-to-noise ratio of the echo signal, caused by multipath propagation, various clutter, ambient noise, and reverberation, makes it difficult to identify underwater targets using active sonar. Attempts have been made to apply data-based methods such as machine learning or deep learning to improve the performance of underwater target recognition systems, but it is difficult to collect enough data for training due to the nature of sonar datasets. Methods based on mathematical modeling have mainly been used to compensate for insufficient active sonar data. However, such methods are limited in how accurately they can simulate complex underwater phenomena. Therefore, in this paper, we propose a sonar signal synthesis method based on a deep neural network. To apply the neural network model to sonar signal synthesis, the proposed method adapts the attention-based encoder and decoder, the main modules of the Tacotron model widely used in speech synthesis, to sonar signals. Training the proposed model on a dataset collected by placing a simulated target in an actual marine environment makes it possible to synthesize a signal closer to the actual signal. To verify the performance of the proposed method, a Perceptual Evaluation of Audio Quality (PEAQ) test was conducted, and the score difference from the actual signal was within -2.3 across a total of four different environments. These results show that the active sonar signal generated by the proposed method approximates the actual signal.

Front-End Processing for Speech Recognition in the Telephone Network (전화망에서의 음성인식을 위한 전처리 연구)

  • Jun, Won-Suk;Shin, Won-Ho;Yang, Tae-Young;Kim, Weon-Goo;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.4
    • /
    • pp.57-63
    • /
    • 1997
  • In this paper, we study efficient feature vector extraction and front-end processing methods to improve the performance of a speech recognition system, using the KT (Korea Telecommunication) database collected through various telephone channels. First, we compare the recognition performance of feature vectors known to be robust to noise and environmental variation, and verify the performance gain obtained from weighted cepstral distance measures. The experimental results show that the recognition rate increases with both PLP (Perceptual Linear Prediction) and MFCC (Mel Frequency Cepstral Coefficient) features compared with the LPC cepstrum used in the KT recognition system. Among cepstral distance measures, weighted functions such as RPS (Root Power Sums) and BPL (Band-Pass Lifter) improve recognition. Applying spectral subtraction decreases the recognition rate because of the distortion it introduces. However, RASTA (RelAtive SpecTrAl) processing, CMS (Cepstral Mean Subtraction), and SBR (Signal Bias Removal) enhance recognition performance. In particular, the CMS method is simple yet yields a large improvement in recognition. Finally, the performance of modified methods for real-time implementation of CMS is compared, and an improved method is suggested to prevent performance degradation.
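
The abstract above singles out CMS as simple but effective. As a rough illustration only (not the authors' implementation), the sketch below shows the textbook batch form of cepstral mean subtraction: the per-coefficient mean over an utterance is subtracted from every frame, which removes a constant channel offset in the cepstral domain. The function name and the toy data are assumptions made for this example.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean over the utterance from every frame."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy data (hypothetical): zero-mean "clean" cepstra plus a constant channel offset.
clean = [[1.0, 2.0], [3.0, 0.0], [-4.0, -2.0]]
offset = [0.5, -1.0]
observed = [[c + o for c, o in zip(frame, offset)] for frame in clean]

# After CMS the constant offset is gone and the clean cepstra are recovered.
normalized = cepstral_mean_subtraction(observed)
```

This batch form needs the whole utterance before it can normalize any frame, which is presumably why the abstract goes on to compare modified CMS variants for real-time use.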


A Benchmark of AI Application based on Open Source for Data Mining Environmental Variables in Smart Farm (스마트 시설환경 환경변수 분석을 위한 Open source 기반 인공지능 활용법 분석)

  • Min, Jae-Ki;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference
    • /
    • 2017.04a
    • /
    • pp.159-159
    • /
    • 2017
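  • A smart facility environment is a facility-based production environment that brings information and communication technology and data analysis into many kinds of agricultural sites, typically horticulture and livestock. To use the vast growth and environmental data produced by smart facility environments, whose hardware has recently proliferated, correctly and appropriately, analysis techniques different from those of general industrial sites are required. Mechanically applying big data processing techniques from software engineering to agricultural big data may have limits, because the various environmental variables inside and outside a facility are very hard to model predictively owing to the intractability, irreversibility, non-specificity, and irregular patterns of time-series data. In this study, Tensorflow (www.tensorflow.org), artificial neural network research software that has recently attracted rapidly growing interest, and OpenNN (www.openn.net), a representative open source package, were applied to analyzing the correlations among environmental variables in a smart facility environment. In terms of operating environment, Tensorflow is available on Linux (Ubuntu 16.04.4), Mac OS X (El Capitan 10.11), and Windows (x86 compatible), whereas OpenNN does not provide binaries for specific operating environments and instead provides its full source code, so it can be used after compiling a binary in the target environment. In terms of development language, Python is the base language of Tensorflow, and development takes place inside a Python (v2.7 or v3.N) virtual environment. A point to note is that, because of this constraint on the development environment, high-speed computation, one of the main advantages of Tensorflow, is available only in some operating environments. The hardware acceleration provided by the GPU (Graphics Processing Unit) is available on the Linux operating system. Because Tensorflow runs in a virtual development environment, real-time information processing is limited, which must be taken into account. Meanwhile, Tensorflow API r1.0, released recently (2017.03), newly supports the Go language in addition to Python, C++, and Java, greatly widening the range of use for developers. OpenNN is provided in C++ by default and can be used in any development environment that supports a C++ compiler. Its distinguishing feature is that it partly overcomes the lack of hardware acceleration by interworking with a clustering platform. Using the two packages, a large-scale linear model was applied experimentally (split by hour, day, and week) to temperature, humidity, illuminance, and CO2 data acquired inside a strawberry greenhouse in Eumseong-gun, Chungbuk, from February to May 2016, and predictive modeling of the environmental variables of adjacent segments was performed. In training under identical conditions, Tensorflow was far superior in development time and training execution speed. To achieve comparable performance with OpenNN, parallel clustering technology would have to be used. Research is needed to find alternatives to artificial neural network modeling techniques that are limited to offline batch processing and to high-performance hardware computing devices that cannot be deployed in the field.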


An Efficient IPTV Distribution Network by Packet Transport System (Packet Transport System에 의한 효율적인 IPTV 분배망 구축 방안)

  • Jang, Jin-Hee;Park, Seung-Kwon;Roh, Jin-Young;Noh, Francis Tai
    • Journal of Broadcast Engineering
    • /
    • v.12 no.2
    • /
    • pp.80-92
    • /
    • 2007
  • IPTV services, a representative convergence service of broadcasting and telecommunications, require guaranteed QoS, efficient multicasting, and high bandwidth on the network. Because the typical TDM-based metro transport network was designed to transport fixed voice traffic with stable recovery methods, it suffers from bottlenecks and wasted bandwidth when accommodating bursty data traffic, and it treats all data equally because it cannot distinguish advanced high-end services from best-effort low-end services. To fully resolve these problems of increasing bursty traffic and QoS, the transport network must first be redesigned. This paper presents a method for transforming a TDM-based metro transport network into a packet-based transport network, describes the advantages and effectiveness of the packet-based transport network, and identifies the technical factors and characteristics of the packet transport system. The results show that the Packet Transport System, a transmission network for packet delivery, retains the characteristics of legacy TDM while adding QoS, multicast, and high bandwidth, and can therefore maintain effective bandwidth and stable packet transmission performance. Additionally, if a fault occurs on an optical link, the system can guarantee differentiated QoS for each service class by using an algorithm that checks for the presence of traffic, and it includes a protection mechanism.

Implementation of RTOS Simulator With Execution Time Estimation (실행시간 추정 가능한 RTOS 시뮬레이터의 구현)

  • 김방현;류성준;김종현;남영광;이광용
    • Proceedings of the Korea Society for Simulation Conference
    • /
    • 2002.05a
    • /
    • pp.125-129
    • /
    • 2002
  • An RTOS simulator, one of the tools provided by a real-time operating system (RTOS) development environment, offers a target simulation environment in which applications can be developed and debugged on the host even when the target hardware is not connected to it. It thereby helps developers build applications quickly and lets them develop applications even before hardware development is complete. For this reason, most commercial RTOS development environments currently provide an RTOS simulator. However, most current commercial RTOS simulators are implemented so that only the functional aspects of the RTOS run on the host, so they cannot estimate the actual time taken when the RTOS or an RTOS application runs on the real target. Given that a real-time system must produce its results within a fixed time, this is the most serious shortcoming of an RTOS simulator, and an RTOS simulator that offers execution time estimation and is also practical is needed. To solve this problem, this study researched and developed a machine instruction-based RTOS simulator that can estimate the execution time of the RTOS and RTOS applications on the real target and that can be commercialized. Furthermore, the accuracy of execution time estimation was improved by also considering the effects of the pipeline and cache, which are major factors in execution time. The RTOS used in this study is Q+, developed in 2000 by the Electronics and Telecommunications Research Institute (ETRI), and the target hardware on which Q+ runs is the EBSA-285 board, equipped with the ARM-family StrongARM SA-110 microprocessor and the 21285 main controller.


Mutual-Backup Architecture of SIP-Servers in Wireless Backbone based Networks (무선 백본 기반 통신망을 위한 상호 보완 SIP 서버 배치 구조)

  • Kim, Ki-Hun;Lee, Sung-Hyung;Kim, Jae-Hyun
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.1
    • /
    • pp.32-39
    • /
    • 2015
  • Voice communications over wireless backbone based networks are evolving into packet-switched VoIP systems. In those networks, a call processing scheme is required to manage subscribers and the connections between them. A VoIP service for such systems requires reliable subscriber management and connection establishment, but conventional call processing schemes based on a centralized server lack reliability. Thus, a mutual-backup architecture of SIP servers is required to ensure efficient subscriber management and reliable VoIP call processing, and the synchronization and call processing schemes must change along with the architecture. In this paper, a mutual-backup architecture of SIP servers is proposed for wireless backbone based networks. A message format for synchronization and information exchange between SIP servers is also proposed. The paper further proposes an FSM scheme that detects multiple servers at a time for fast call processing in unreliable networks. The performance analysis shows that the mutual-backup server architecture achieves higher call processing success rates than the conventional centralized server architecture. In addition, the FSM scheme yields shorter call processing times than conventional SIP, and the time does not increase as the number of SIP servers in the network grows.

Design and implementation of a 3-axis Motion Sensor based SWAT Hand-signal Motion-recognition System (3축 모션 센서 기반 SWAT 수신호 모션 인식 시스템 설계 및 구현)

  • Yun, June;Pyun, Kihyun
    • Journal of Internet Computing and Services
    • /
    • v.15 no.4
    • /
    • pp.33-42
    • /
    • 2014
  • Hand signals are an effective means of communication in situations where voice cannot be used, especially for soldiers. Vision-based approaches that use cameras as input devices are widely suggested in the literature. However, these approaches are not suitable for soldiers, whose view is obstructed in many cases. In addition, existing special-glove approaches use finger information only. Thus, they still fall short for soldiers' hand-signal recognition, which involves not only finger motions but also additional information such as the rotation of the hand. In this paper, we have designed and implemented a new recognition system for six military hand-signal motions, i.e., 'ready', 'move', 'quick move', 'crawl', 'stop', and 'lying-down'. For this purpose, we have proposed a finger-recognition method and motion-recognition methods. The finger-recognition method discriminates how much each finger is bent, i.e., 'completely flattened', 'slightly flattened', 'slightly bent', and 'completely bent'. The motion-recognition algorithms are based on characterizing each hand-signal motion in terms of the three axes. In repeated experiments, our system achieved a correct recognition rate of 91.2%.

Implementation of RTP/RTCP for Teleconferencing System and Analysis of Quality-of-Service using Audio Data Transmission (영상회의 시스템을 위한 RTP/RTCP 구현 및 오디오 데이터 전송을 위용한 QoS 분석)

  • Kang, Min-Gyu;Hwang, Seung-Koo;Kim, Dong-Kyoo
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.12
    • /
    • pp.3047-3062
    • /
    • 1998
  • This paper describes the design and implementation of the Realtime Transport Protocol (RTP) / Realtime Control Protocol (RTCP) (RFC 1889, 1890), which are used in the teleconferencing systems proposed by ITU-T to transmit audio/video data to any destination and to feed back Quality of Service (QoS) information about the received media data to the sender. These protocols are implemented with a multi-thread technique and run on top of UDP/IP multicast, the underlying protocol, through the socket interface. The upper layer is implemented such that it can be accessed by the H.245 conference control protocol. RTP packetizes the digitized audio/video data from the encoder into a fixed format and multicasts it to the participants. RTCP monitors RTP packets, extracts QoS values such as round-trip delay, jitter, and packet loss from them to form RTCP packets, and sends these non-periodically to the sender site. In this paper, we also describe the measurement and analysis of the QoS factors observed while running the teleconferencing system over the Internet. The results of this experiment indicate that RTT and jitter values are acceptable even when the network load is high. However, the packet loss rate is high in the daytime, and most loss periods have length one or two.
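
For readers unfamiliar with how RTCP derives a jitter value, the sketch below follows the interarrival jitter estimator defined in RFC 1889: for consecutive packets, D is the change in transit time, and the running estimate is updated as J = J + (|D| - J)/16. It is an illustration of the RFC formula only, not the code described in the paper; the function name and sample timestamps are assumptions, and both timestamp lists are taken to be in the same units.

```python
from typing import Sequence


def interarrival_jitter(send_ts: Sequence[float], recv_ts: Sequence[float]) -> float:
    """Running interarrival jitter estimate in the style of RFC 1889.

    send_ts holds RTP timestamps, recv_ts holds arrival times expressed
    in the same timestamp units (an assumption of this sketch).
    """
    jitter = 0.0
    for i in range(1, len(send_ts)):
        # Difference in relative transit time between packets i-1 and i.
        d = (recv_ts[i] - recv_ts[i - 1]) - (send_ts[i] - send_ts[i - 1])
        # First-order smoothing with gain 1/16, as specified in the RFC.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Hypothetical samples: packets sent every 160 units.
steady = interarrival_jitter([0, 160, 320], [10, 170, 330])   # constant delay
delayed = interarrival_jitter([0, 160, 320], [10, 170, 346])  # third packet 16 units late
```

With a constant network delay the estimate stays at zero, and a single packet arriving 16 units late moves it by only 1 unit, which shows how the 1/16 gain damps isolated spikes.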


Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

  • 김기태;문광식;김회린;이영직;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.2
    • /
    • pp.27-34
    • /
    • 2001
  • Utterance verification is used in variable vocabulary word recognition to reject words that are not in the vocabulary or that are not correctly recognized. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using the PBW (Phonetically Balanced Words) DB (445 words), we create no-training anti-phoneme models that include many PLUs (Phoneme Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihood between the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, which makes it more robust for OOV rejection. The word-based confidence measure, which is built from the phoneme-based confidence measure, has been shown to improve detection of near-misses in speech recognition and to discriminate better between in-vocabulary words and OOVs. Using our proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).
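
The abstract does not give the exact formula for its confidence measures, so the following is only one plausible reading, offered as a sketch. It assumes that the phoneme-level score is the log-likelihood ratio between the phoneme model and the anti-phoneme model divided by the magnitude of the null-hypothesis log-likelihood, and that the word-level score is the mean of the phoneme scores compared against a threshold. All function names, the averaging rule, and the threshold values are hypothetical.

```python
from typing import Sequence, Tuple


def phoneme_confidence(loglik_model: float, loglik_anti: float) -> float:
    """Log-likelihood ratio normalized by the null-hypothesis score (assumed form)."""
    if loglik_model == 0.0:
        raise ValueError("null-hypothesis log-likelihood must be non-zero")
    return (loglik_model - loglik_anti) / abs(loglik_model)


def word_confidence(pairs: Sequence[Tuple[float, float]]) -> float:
    """Average the phoneme-level scores over the word (assumed combination rule)."""
    scores = [phoneme_confidence(m, a) for m, a in pairs]
    return sum(scores) / len(scores)


def accept_word(pairs: Sequence[Tuple[float, float]], threshold: float = 0.0) -> bool:
    """Accept the hypothesis as in-vocabulary if its confidence reaches the threshold."""
    return word_confidence(pairs) >= threshold


# Hypothetical (phoneme model, anti-phoneme model) log-likelihood pairs.
in_vocab = [(-10.0, -15.0), (-20.0, -22.0)]  # phoneme models fit better
oov_like = [(-10.0, -8.0)]                   # anti-model fits better

in_vocab_ok = accept_word(in_vocab, threshold=0.1)
oov_ok = accept_word(oov_like, threshold=0.0)
```

Dividing by the null-hypothesis score makes the ratio less sensitive to the absolute scale of the likelihoods, which is one way to interpret the robustness claim in the abstract.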
