• Title/Summary/Keyword: Vocabulary tree

Search Result 27, Processing Time 0.022 seconds

Corpus Based Unrestricted vocabulary Mandarin TTS (코퍼스 기반 무제한 단어 중국어 TTS)

  • Yu Zheng;Ha Ju-Hong;Kim Byeongchang;Lee Gary Geunbae
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.175-179
    • /
    • 2003
  • In order to produce a high quality (intelligibility and naturalness) synthesized speech, it is very important to get an accurate grapheme-to-phoneme conversion and prosody model. In this paper, we analyzed Chinese texts using a segmentation, POS tagging and unknown word recognition. We present a grapheme-to-phoneme conversion using a dictionary-based and rule-based method. We constructed a prosody model using a probabilistic method and a decision tree-based error correction method. According to the result from the above analysis, we can successfully select and concatenate exact synthesis unit of syllables from the Chinese Synthesis DB.

  • PDF

Automatic Photo Annotation using an Image Vocabulary Tree (이미지 어휘 트리를 이용한 사진의 자동 주석)

  • Kim, Jeong-Jung;Lee, Ju-Jang
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06b
    • /
    • pp.378-380
    • /
    • 2012
  • 디지털 카메라 및 카메라가 부착된 스마트폰의 보급으로 인해 개인 및 사업장에서 관리해야할 사진의 양이 증가 하였다. 본 논문에서는 이미지 어휘 트리를 이용하여 다량의 사진에 주석을 자동으로 추천 하는 알고리즘을 제안하여 대량의 사진 관리를 효과적으로 하도록 한다. 제안한 방법에서는 어휘트리 생성 시 학습 데이터 모음에 포함된 사진들의 주석에 대한 정보도 함께 갖도록 한다. 그래서 입력으로 들어온 사진과 가장 가까운 사진을 어휘 트리를 통해 찾고 찾은 사진이 가지고 있는 주석을 최종 출력으로 낸다. 제안한 알고리즘은 자동으로 다량의 사진의 주석을 빠르게 추천 할 수 있다.

POSTTS : Corpus Based Korean TTS based on Natural Language Analysis (POSTTS : 자연어 분석을 통한 코퍼스 기반 한국어 TTS)

  • Ha Ju-Hong;Zheng Yu;Kim Byeongchang;Lee Geunbae Lee
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.87-90
    • /
    • 2003
  • In order to produce high quality synthesized speech, it is very important to get an accurate grapheme-to-phoneme conversion and prosody model from texts using natural language processing. Robust preprocessing for non-Korean characters should also be required. In this paper, we analyzed Korean texts using a morphological analyzer, part-of-speech tagger and syntactic chunker. We present a new grapheme-to-phoneme conversion method, i.e. a dictionary-based and rule-based hybrid method, for unlimited vocabulary Korean TTS. We constructed a prosody model using a probabilistic method and decision tree-based method.

  • PDF

Evaluations of AI-based malicious PowerShell detection with feature optimizations

  • Song, Jihyeon;Kim, Jungtae;Choi, Sunoh;Kim, Jonghyun;Kim, Ikkyun
    • ETRI Journal
    • /
    • v.43 no.3
    • /
    • pp.549-560
    • /
    • 2021
  • Cyberattacks are often difficult to identify with traditional signature-based detection, because attackers continually find ways to bypass the detection methods. Therefore, researchers have introduced artificial intelligence (AI) technology for cybersecurity analysis to detect malicious PowerShell scripts. In this paper, we propose a feature optimization technique for AI-based approaches to enhance the accuracy of malicious PowerShell script detection. We statically analyze the PowerShell script and preprocess it with a method based on the tokens and abstract syntax tree (AST) for feature selection. Here, tokens and AST represent the vocabulary and structure of the PowerShell script, respectively. Performance evaluations with optimized features yield detection rates of 98% in both machine learning (ML) and deep learning (DL) experiments. Among them, the ML model with the 3-gram of selected five tokens and the DL model with experiments based on the AST 3-gram deliver the best performance.

A Study on Improving Speech Recognition Rate (H/W, S/W) of Speech Impairment by Neurological Injury (신경학적 손상에 의한 언어장애인 음성 인식률 개선(H/W, S/W)에 관한 연구)

  • Lee, Hyung-keun;Kim, Soon-hub;Yang, Ki-Woong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.11
    • /
    • pp.1397-1406
    • /
    • 2019
  • In everyday mobile phone calls between the disabled and non-disabled people due to neurological impairment, the communication accuracy is often hindered by combining the accuracy of pronunciation due to the neurological impairment and the pronunciation features of the disabled. In order to improve this problem, the limiting method is MEMS (micro electro mechanical systems), which includes an induction line that artificially corrects difficult vocalization according to the oral characteristics of the language impaired by improving the word of out of vocabulary. mechanical System) Microphone device improvement. S/W improvement is decision tree with invert function, and improved matrix-vector rnn method is proposed considering continuous word characteristics. Considering the characteristics of H/W and S/W, a similar dictionary was created, contributing to the improvement of speech intelligibility for smooth communication.

An effective automated ontology construction based on the agriculture domain

  • Deepa, Rajendran;Vigneshwari, Srinivasan
    • ETRI Journal
    • /
    • v.44 no.4
    • /
    • pp.573-587
    • /
    • 2022
  • The agricultural sector is completely different from other sectors since it completely relies on various natural and climatic factors. Climate changes have many effects, including lack of annual rainfall and pests, heat waves, changes in sea level, and global ozone/atmospheric CO2 fluctuation, on land and agriculture in similar ways. Climate change also affects the environment. Based on these factors, farmers chose their crops to increase productivity in their fields. Many existing agricultural ontologies are either domain-specific or have been created with minimal vocabulary and no proper evaluation framework has been implemented. A new agricultural ontology focused on subdomains is designed to assist farmers using Jaccard relative extractor (JRE) and Naïve Bayes algorithm. The JRE is used to find the similarity between two sentences and words in the agricultural documents and the relationship between two terms is identified via the Naïve Bayes algorithm. In the proposed method, the preprocessing of data is carried out through natural language processing techniques and the tags whose dimensions are reduced are subjected to rule-based formal concept analysis and mapping. The subdomain ontologies of weather, pest, and soil are built separately, and the overall agricultural ontology are built around them. The gold standard for the lexical layer is used to evaluate the proposed technique, and its performance is analyzed by comparing it with different state-of-the-art systems. Precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic curve area, and precision-recall curve area are the performance metrics used to analyze the performance. The proposed methodology gives a precision score of 94.40% when compared with the decision tree(83.94%) and K-nearest neighbor algorithm(86.89%) for agricultural ontology construction.

Development of a Korean Speech Recognition Platform (ECHOS) (한국어 음성인식 플랫폼 (ECHOS) 개발)

  • Kwon Oh-Wook;Kwon Sukbong;Jang Gyucheol;Yun Sungrack;Kim Yong-Rae;Jang Kwang-Dong;Kim Hoi-Rin;Yoo Changdong;Kim Bong-Wan;Lee Yong-Ju
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.8
    • /
    • pp.498-504
    • /
    • 2005
  • We introduce a Korean speech recognition platform (ECHOS) developed for education and research Purposes. ECHOS lowers the entry barrier to speech recognition research and can be used as a reference engine by providing elementary speech recognition modules. It has an easy simple object-oriented architecture, implemented in the C++ language with the standard template library. The input of the ECHOS is digital speech data sampled at 8 or 16 kHz. Its output is the 1-best recognition result. N-best recognition results, and a word graph. The recognition engine is composed of MFCC/PLP feature extraction, HMM-based acoustic modeling, n-gram language modeling, finite state network (FSN)- and lexical tree-based search algorithms. It can handle various tasks from isolated word recognition to large vocabulary continuous speech recognition. We compare the performance of ECHOS and hidden Markov model toolkit (HTK) for validation. In an FSN-based task. ECHOS shows similar word accuracy while the recognition time is doubled because of object-oriented implementation. For a 8000-word continuous speech recognition task, using the lexical tree search algorithm different from the algorithm used in HTK, it increases the word error rate by $40\%$ relatively but reduces the recognition time to half.