• Title/Summary/Keyword: pseudo-word

Search Result 37, Processing Time 0.02 seconds

A Study on Pseudo N-gram Language Models for Speech Recognition (음성인식을 위한 의사(疑似) N-gram 언어모델에 관한 연구)

  • 오세진;황철준;김범국;정호열;정현열
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.2 no.3
    • /
    • pp.16-23
    • /
    • 2001
  • In this paper, we propose the pseudo n-gram language models for speech recognition with middle size vocabulary compared to large vocabulary speech recognition using the statistical n-gram language models. The proposed method is that it is very simple method, which has the standard structure of ARPA and set the word probability arbitrary. The first, the 1-gram sets the word occurrence probability 1 (log likelihood is 0.0). The second, the 2-gram also sets the word occurrence probability 1, which can only connect the word start symbol and WORD, WORD and the word end symbol . Finally, the 3-gram also sets the ward occurrence probability 1, which can only connect the word start symbol , WORD and the word end symbol . To verify the effectiveness of the proposed method, the word recognition experiments are carried out. The preliminary experimental results (off-line) show that the word accuracy has average 97.7% for 452 words uttered by 3 male speakers. The on-line word recognition results show that the word accuracy has average 92.5% for 20 words uttered by 20 male speakers about stock name of 1,500 words. Through experiments, we have verified the effectiveness of the pseudo n-gram language modes for speech recognition.

  • PDF

A Scalable Word-based RSA Cryptoprocessor with PCI Interface Using Pseudo Carry Look-ahead Adder (가상 캐리 예측 덧셈기와 PCI 인터페이스를 갖는 분할형 워드 기반 RSA 암호 칩의 설계)

  • Gwon, Taek-Won;Choe, Jun-Rim
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.8
    • /
    • pp.34-41
    • /
    • 2002
  • This paper describes a scalable implementation method of a word-based RSA cryptoprocessor using pseudo carry look-ahead adder The basic organization of the modular multiplier consists of two layers of carry-save adders (CSA) and a reduced carry generation and Propagation scheme called the pseudo carry look-ahead adder for the high-speed final addition. The proposed modular multiplier does not need complicated shift and alignment blocks to generate the next word at each clock cycle. Therefore, the proposed architecture reduces the hardware resources and speeds up the modular computation. We implemented a single-chip 1024-bit RSA cryptoprocessor based on the word-based modular multiplier with 256 datapaths in 0.5${\mu}{\textrm}{m}$ SOG technology after verifying the proposed architectures using FPGA with PCI bus.

The Effect of Syllable Frequency, Syllable Type and Final Consonant on Hangeul Word and Pseudo-word Lexical Decision: An Analysis of the Korean Lexicon Project Database (한글 두 글자 단어와 비단어의 어휘판단에 글자 빈도, 글자 유형, 받침이 미치는 영향: KLP 자료의 분석)

  • Myong Seok Shin;ChangHo Park
    • Korean Journal of Cognitive Science
    • /
    • v.34 no.4
    • /
    • pp.277-297
    • /
    • 2023
  • This study attempted to find out how lexical decision of two-syllable words or pseudo-words is affected by syllabic information, such as syllable frequency, syllable (i.e. vowel) type, and presence of final consonant (i.e. batchim), through the analysis of the Korean Lexicon Project Database (KLP-DB). Hierarchical regression of RT data showed that lexical decision of words was influenced by the frequency of the first syllable, the syllable type of the first and second syllables, batchim for the first and second syllables, and also by the interaction of the two syllable types and the interaction of syllable frequency and batchim of the second syllable. For pseudo-words lexical decision was influenced by the frequency of the first and second syllables, syllable type of the first syllable, and batchim for the first and second syllables, and also by the interaction of the two syllable frequencies, the interaction of the two syllable types, and the interaction of syllable frequency and batchim of the first syllable. Word frequency had a strong effect on lexical decision of words, while syllabic information had a stable effect on the lexical decision of pseudo-words. These results indicate that syllabic information should be seriously considered in constructing word and pseudo-word lists and interpreting lexical decision time. Understanding the effect of syllabic information will also contribute to the understanding of word recognition process.

Word Embeddings-Based Pseudo Relevance Feedback Using Deep Averaging Networks for Arabic Document Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah;Atwan, Jaffar
    • Journal of Information Science Theory and Practice
    • /
    • v.9 no.2
    • /
    • pp.1-17
    • /
    • 2021
  • Pseudo relevance feedback (PRF) is a powerful query expansion (QE) technique that prepares queries using the top k pseudorelevant documents and choosing expansion elements. Traditional PRF frameworks have robustly handled vocabulary mismatch corresponding to user queries and pertinent documents; nevertheless, expansion elements are chosen, disregarding similarity to the original query's elements. Word embedding (WE) schemes comprise techniques of significant interest concerning QE, that falls within the information retrieval domain. Deep averaging networks (DANs) defines a framework relying on average word presence passed through multiple linear layers. The complete query is understandably represented using the average vector comprising the query terms. The vector may be employed for determining expansion elements pertinent to the entire query. In this study, we suggest a DANs-based technique that augments PRF frameworks by integrating WE similarities to facilitate Arabic information retrieval. The technique is based on the fundamental that the top pseudo-relevant document set is assessed to determine candidate element distribution and select expansion terms appropriately, considering their similarity to the average vector representing the initial query elements. The Word2Vec model is selected for executing the experiments on a standard Arabic TREC 2001/2002 set. The majority of the evaluations indicate that the PRF implementation in the present study offers a significant performance improvement compared to that of the baseline PRF frameworks.

Query-based Document Summarization using Pseudo Relevance Feedback based on Semantic Features and WordNet (의미특징과 워드넷 기반의 의사 연관 피드백을 사용한 질의기반 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.7
    • /
    • pp.1517-1524
    • /
    • 2011
  • In this paper, a new document summarization method, which uses the semantic features and the pseudo relevance feedback (PRF) by using WordNet, is introduced to extract meaningful sentences relevant to a user query. The proposed method can improve the quality of document summaries because the inherent semantic of the documents are well reflected by the semantic feature from NMF. In addition, it uses the PRF by the semantic features and WordNet to reduce the semantic gap between the high level user's requirement and the low level vector representation. The experimental results demonstrate that the proposed method achieves better performance that the other methods.

Query Expansion Based on Word Graphs Using Pseudo Non-Relevant Documents and Term Proximity (잠정적 부적합 문서와 어휘 근접도를 반영한 어휘 그래프 기반 질의 확장)

  • Jo, Seung-Hyeon;Lee, Kyung-Soon
    • The KIPS Transactions:PartB
    • /
    • v.19B no.3
    • /
    • pp.189-194
    • /
    • 2012
  • In this paper, we propose a query expansion method based on word graphs using pseudo-relevant and pseudo non-relevant documents to achieve performance improvement in information retrieval. The initially retrieved documents are classified into a core cluster when a document includes core query terms extracted by query term combinations and the degree of query term proximity. Otherwise, documents are classified into a non-core cluster. The documents that belong to a core query cluster can be seen as pseudo-relevant documents, and the documents that belong to a non-core cluster can be seen as pseudo non-relevant documents. Each cluster is represented as a graph which has nodes and edges. Each node represents a term and each edge represents proximity between the term and a query term. The term weight is calculated by subtracting the term weight in the non-core cluster graph from the term weight in the core cluster graph. It means that a term with a high weight in a non-core cluster graph should not be considered as an expanded term. Expansion terms are selected according to the term weights. Experimental results on TREC WT10g test collection show that the proposed method achieves 9.4% improvement over the language model in mean average precision.

ON TRANSLATION LENGTHS OF PSEUDO-ANOSOV MAPS ON THE CURVE GRAPH

  • Hyungryul Baik;Changsub Kim
    • Bulletin of the Korean Mathematical Society
    • /
    • v.61 no.3
    • /
    • pp.585-595
    • /
    • 2024
  • We show that a pseudo-Anosov map constructed as a product of the large power of Dehn twists of two filling curves always has a geodesic axis on the curve graph of the surface. We also obtain estimates of the stable translation length of a pseudo-Anosov map, when two filling curves are replaced by multicurves. Three main applications of our theorem are the following: (a) determining which word realizes the minimal translation length on the curve graph within a specific class of words, (b) giving a new class of pseudo-Anosov maps optimizing the ratio of stable translation lengths on the curve graph to that on Teichmüller space, (c) giving a partial answer of how much power is needed for Dehn twists to generate right-angled Artin subgroup of the mapping class group.

Performance of speech recognition unit considering morphological pronunciation variation (형태소 발음변이를 고려한 음성인식 단위의 성능)

  • Bang, Jeong-Uk;Kim, Sang-Hun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences
    • /
    • v.10 no.4
    • /
    • pp.111-119
    • /
    • 2018
  • This paper proposes a method to improve speech recognition performance by extracting various pronunciations of the pseudo-morpheme unit from an eojeol unit corpus and generating a new recognition unit considering pronunciation variations. In the proposed method, we first align the pronunciation of the eojeol units and the pseudo-morpheme units, and then expand the pronunciation dictionary by extracting the new pronunciations of the pseudo-morpheme units at the pronunciation of the eojeol units. Then, we propose a new recognition unit that relies on pronunciation by tagging the obtained phoneme symbols according to the pseudo-morpheme units. The proposed units and their extended pronunciations are incorporated into the lexicon and language model of the speech recognizer. Experiments for performance evaluation are performed using the Korean speech recognizer with a trigram language model obtained by a 100 million pseudo-morpheme corpus and an acoustic model trained by a multi-genre broadcast speech data of 445 hours. The proposed method is shown to reduce the word error rate relatively by 13.8% in the news-genre evaluation data and by 4.5% in the total evaluation data.

Dynamic Testing for Word - Oriented Memories (워드지향 메모리에 대한 동적 테스팅)

  • Young Sung H.
    • Journal of the Korea Computer Industry Society
    • /
    • v.6 no.2
    • /
    • pp.295-304
    • /
    • 2005
  • This paper presents the problem of exhaustive test generation for detection of coupling faults between cells in word-oriented memories. According to this fault model, contents of any w-bit memory word in a memory with n words, or ability tochange this contents, is influenced by the contents of any other s-1 words in the memory. A near optimal iterative method for construction of test patterns is proposed The systematic structure of the proposed test results in simple BIST implementations.

  • PDF

A Study on Solving Word Problems through the Articulation of Analogical Mapping (유추 사상의 명료화를 통한 문장제 해결에 관한 연구)

  • Kim, Ji Eun;Shin, Jaehong
    • Communications of Mathematical Education
    • /
    • v.27 no.4
    • /
    • pp.429-448
    • /
    • 2013
  • The aim of this study was to examine how analogical mapping articulation activity played a role in solving process in word problems. We analyzed the problem solving strategies and processes that the participating thirty-three 8th grade students employed when solving the problems through analogical mapping articulation activities, and also the characteristics of the thinking processes from the aspects of similarity. As a result, this study indicates that analogical mapping articulation activity could be helpful when the students solved similar word problems, although some of them gained correct answers through pseudo-analytic thinking. Not to have them use pseudo-analytic thinking, it might be necessary to help them recognize superficial similarity and difference among the problems and construct structural similarity to know the principle of solution associated with the problematic situations.