• Title/Summary/Keyword: weight of word

Rapid Speaker Adaptation Based on Eigenvoice Using Weight Distribution Characteristics (가중치 분포 특성을 이용한 Eigenvoice 기반 고속화자적응)

  • Park, Jong-Se;Kim, Hyung-Soon;Song, Hwa-Jeon
    • The Journal of the Acoustical Society of Korea / v.22 no.5 / pp.403-407 / 2003
  • Recently, the eigenvoice approach has been widely used for rapid speaker adaptation. However, even with this approach, the performance improvement obtained from a very small amount of adaptation data is small compared with that obtained from a somewhat larger amount, because the eigenvoice weights are difficult to estimate reliably. In this paper, we propose a rapid speaker adaptation method based on eigenvoice that uses the weight distribution characteristics to improve performance with small amounts of adaptation data. In experiments on a vocabulary-independent word recognition task (using the PBW 452 database), the weight threshold method alleviates the relatively low performance obtained with very small amounts of adaptation data. When a single adaptation word is used, the word error rate is reduced by about 9-18% with the weight threshold method.

Word Sense Disambiguation using Meaning Groups (의미그룹을 이용한 단어 중의성 해소)

  • Kim, Eun-Jin;Lee, Soo-Won
    • Journal of KIISE:Computing Practices and Letters / v.16 no.6 / pp.747-751 / 2010
  • This paper proposes a method that increases the accuracy of word sense tagging by automatically creating sense-tagged data from machine-readable dictionaries. The concept of a meaning group is applied here: the meaning group for each meaning of a target word consists of the neighboring words of the target word. To enhance tagging accuracy, the notion of concentration is used as the weight of each word in a meaning group. Tagging results on SENSEVAL-2 data show that the accuracy of the proposed method is better than that of existing ones.

Gathering Common-word and Document Reclassification to improve Accuracy of Document Clustering (문서 군집화의 정확률 향상을 위한 범용어 수집과 문서 재분류 알고리즘)

  • Shin, Joon-Choul;Ock, Cheol-Young;Lee, Eung-Bong
    • The KIPS Transactions:PartB / v.19B no.1 / pp.53-62 / 2012
  • Clustering technology is used to deal efficiently with large numbers of retrieved documents in information retrieval systems. However, clustering accuracy meets the requirements of only some domains. This paper proposes two methods to increase clustering accuracy. First, we define a common-word as a word that is frequently used but has low weight during clustering, and we propose a method that automatically gathers common-words and calculates their weights from the retrieved documents. In the experiments, the clustering error rate using common-words is reduced to 34% compared with clustering using stop-words. Second, after generating initial clusters from the retrieved documents using average-link clustering, we propose an algorithm that reevaluates the similarity between each document and the clusters and reclassifies the document into the more similar cluster. In experiments using Naver JiSikIn categories, the accuracy of the reclassified clusters is increased to 1.81% compared with the initial clusters without reclassification.

The Effect of Online Word-of-mouth on Fashion Involvement and Internet Purchase Behavior (온라인 패션 구전에 따른 패션제품 관여와 인터넷 구매행동)

  • Song, So-Jin;Hwang, Jin-Sook
    • Journal of the Korean Society of Clothing and Textiles / v.31 no.3 s.162 / pp.410-419 / 2007
  • The purposes of this study were to segment consumers by online word of mouth and to find the differences among the segmented groups with regard to fashion involvement, internet perceived risk, and internet purchase behavior. The subjects of this study were female consumers who were members of online cafes in Korea. The data were collected during October 2004. The respondents returned the questionnaires through the internet, and 480 questionnaires were finally used in the data analysis. The statistical analyses used for the study were factor analysis, cluster analysis, t-test, and $\chi^2$ test. The results showed that word-of-mouth communication on the internet (e-WOM) is composed of two factors, word-of-mouth transmission and word-of-mouth acceptance. These two factors were subjected to cluster analysis, which classified respondents into two word-of-mouth communication groups: a WOM group and a non-WOM group. The t-test showed that the two groups were significantly different with regard to fashion involvement, internet perceived risk, and internet purchase behavior. For example, the WOM group was more uncertain of its clothing choices, put more weight on the internal factors of clothing selection, and purchased internet fashion products frequently. Internet fashion businesses need to implement proper marketing strategies based on the results of the study.

A Stochastic Word-Spacing System Based on Word Category-Pattern (어절 내의 형태소 범주 패턴에 기반한 통계적 자동 띄어쓰기 시스템)

  • Kang, Mi-Young;Jung, Sung-Won;Kwon, Hyuk-Chul
    • Journal of KIISE:Software and Applications / v.33 no.11 / pp.965-978 / 2006
  • This paper implements an automatic Korean word-spacing system based on word recognition using morpheme unigrams and the pattern that the categories of those morpheme unigrams share within a candidate word. Although previous Korean word-spacing models have the advantages of easy construction and time efficiency, problems remain, such as data sparseness and critical memory size, which arise from the morpho-typological characteristics of Korean. To cope with both problems, our implementation uses the stochastic information of morpheme unigrams and their category patterns instead of word unigrams. A word's probability in a sentence is obtained from the morpheme probabilities and the weight for each morpheme's category within the category pattern of the candidate word. The category weights are trained to minimize the mean error between the observed probabilities of words and those estimated from the probabilities of the words' individual morphemes, weighted according to the power of each category in the given word's category pattern.

The Sentence Similarity Measure Using Deep-Learning and Char2Vec (딥러닝과 Char2Vec을 이용한 문장 유사도 판별)

  • Lim, Geun-Young;Cho, Young-Bok
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.10 / pp.1300-1306 / 2018
  • The purpose of this study is to examine the feasibility of Char2Vec as an alternative to Word2Vec, the best-known word embedding model, for the sentence similarity measurement problem using deep learning. In the experiments, we used the Siamese Ma-LSTM recurrent neural network architecture to measure the similarity of two random sentences. The Siamese Ma-LSTM model was implemented with TensorFlow. We trained each model for 200 epochs in a GPU environment, which took about 20 hours. We then compared the training results of the Word2Vec-based model with those of the Char2Vec-based model. The Char2Vec-based model, initialized with random weights, recorded 75.1% accuracy on the validation dataset, while the Word2Vec-based model, pretrained with 3 million words and phrases, recorded 71.6%. Thus, Char2Vec is a suitable alternative to Word2Vec for addressing the problem of high system memory requirements.

SIMULTANEOUS RANDOM ERROR CORRECTION AND BURST ERROR DETECTION IN LEE WEIGHT CODES

  • Jain, Sapna
    • Honam Mathematical Journal / v.30 no.1 / pp.33-45 / 2008
  • Lee weight is more appropriate than Hamming weight for some practical situations because it takes into account the magnitude of each digit of the word. In this paper, we obtain a sufficient condition on the number of parity check digits for codes that correct random errors and simultaneously detect burst errors under the Lee weight.
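
The difference between the two weights mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition of Lee weight over the integers modulo q (each digit a contributes min(a, q - a)); the function names and the sample word are assumptions chosen for illustration and are not taken from the paper.

```python
def lee_weight(word, q):
    """Lee weight of a word over Z_q: each digit a contributes min(a, q - a)."""
    return sum(min(a % q, q - (a % q)) for a in word)


def hamming_weight(word, q):
    """Hamming weight: the number of nonzero digits, regardless of magnitude."""
    return sum(1 for a in word if a % q != 0)


# Hypothetical sample word over Z_5.
word = [0, 1, 3, 4]
print(lee_weight(word, 5))      # 0 + 1 + 2 + 1 = 4
print(hamming_weight(word, 5))  # three nonzero digits = 3
```

The digit 3 contributes 2 to the Lee weight (its distance from 0 around the cycle of length 5) but only 1 to the Hamming weight, which is the sense in which Lee weight reflects digit magnitude.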

Dynamic Interaction of Performance Information and Word-of-Mouth in Film Industry (영화공급사슬 내 성과정보와 입소문 효과의 동적상호작용에 대한 연구)

  • Lee, Wonhee
    • Korean Management Science Review / v.32 no.2 / pp.125-143 / 2015
  • Researchers studying the film industry have seldom addressed the dynamic interaction between marketing information and word of mouth, mainly because of the limitations of traditional research methodologies. This study explores integration and competition among important variables influencing audiences' movie selection, using agent-based modeling that includes a competitive environment. The decision process of a moviegoer is composed of transition probabilities based on a multinomial logit model, considering marketing and box-office information, critique, and word of mouth from other moviegoers. After validating the fit of market share among released movies, this study conducted a set of simulation experiments considering several variables such as market size, change of weight between variables, and movie performance under competition. Propositions derived from the simulation results are also suggested for future research.
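
As a rough illustration of a multinomial logit choice step like the one this abstract describes, the sketch below turns weighted movie attributes into choice probabilities. The attribute names, weights, and values are hypothetical; the abstract does not give the paper's actual utility specification, so this shows only the generic logit form P(i) = exp(u_i) / sum_j exp(u_j).

```python
import math


def choice_probabilities(movies, weights):
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j),
    where u_i is a weighted sum of movie i's attributes."""
    utilities = {
        name: sum(weights[k] * attrs[k] for k in weights)
        for name, attrs in movies.items()
    }
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities.values())
    exp_u = {name: math.exp(u - u_max) for name, u in utilities.items()}
    total = sum(exp_u.values())
    return {name: e / total for name, e in exp_u.items()}


# Hypothetical weights and attribute values, for illustration only.
weights = {"marketing": 0.5, "box_office": 0.8, "critique": 0.3, "word_of_mouth": 1.0}
movies = {
    "A": {"marketing": 1.0, "box_office": 0.6, "critique": 0.4, "word_of_mouth": 0.9},
    "B": {"marketing": 0.7, "box_office": 0.3, "critique": 0.8, "word_of_mouth": 0.2},
}
probs = choice_probabilities(movies, weights)
```

Changing an entry in `weights` corresponds to the "change of weight between variables" that the abstract lists among its simulation variables.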

A Method on Associated Document Recommendation with Word Correlation Weights (단어 연관성 가중치를 적용한 연관 문서 추천 방법)

  • Kim, Seonmi;Na, InSeop;Shin, Juhyun
    • Journal of Korea Multimedia Society / v.22 no.2 / pp.250-259 / 2019
  • Big data processing technology and artificial intelligence (AI) are attracting increasing attention, and natural language processing is an important research area of AI. In this paper, we use Korean news articles to extract the topic distributions of documents and the word distribution vectors of topics through LDA-based topic modeling. We then use Word2vec to vectorize words and generate a weight matrix to derive a relevance score that considers the semantic relationships between words. We propose a way to recommend documents in descending order of score.

An Automatic Classification System of Official Documents in Middle Schools Using Term Weighting of Titles (제목의 단어 가중치를 이용한 중등학교 공문서 자동분류시스템)

  • Kang, Hyun-Hee;Jin, Min
    • Journal of The Korean Association of Information Education / v.7 no.2 / pp.219-226 / 2003
  • Classifying official documents in schools and educational institutions takes a lot of time. To reduce this overhead, we propose an automatic document classification method that uses the words in document titles. First, meaningful words are extracted from the titles of existing documents, and Inverse Document Frequency (IDF) weights of the words are calculated for each category. We then build a word weight dictionary. Using this dictionary, each document is automatically classified into the category for which the sum of the weights of its title words is the highest. We also evaluate the performance of the proposed method using a real dataset from a middle school.
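
The classification procedure described in this abstract can be sketched roughly as follows. The abstract does not give the exact IDF formula, so this sketch assumes that each category is treated as one "document" and that a word's weight is log(C / cf), where C is the number of categories and cf is the number of categories whose titles contain the word. The function names and sample titles are hypothetical.

```python
import math
from collections import defaultdict


def build_weight_dictionary(titles_by_category):
    """Return {category: {word: weight}}.
    Assumption: weight = log(C / cf), with C categories and cf the number of
    categories whose titles contain the word (not necessarily the paper's formula)."""
    words_in = {
        cat: {w for title in titles for w in title.lower().split()}
        for cat, titles in titles_by_category.items()
    }
    category_freq = defaultdict(int)
    for words in words_in.values():
        for w in words:
            category_freq[w] += 1
    n_categories = len(titles_by_category)
    return {
        cat: {w: math.log(n_categories / category_freq[w]) for w in words}
        for cat, words in words_in.items()
    }


def classify(title, weight_dictionary):
    """Pick the category with the highest sum of title-word weights."""
    words = title.lower().split()
    scores = {
        cat: sum(weights.get(w, 0.0) for w in words)
        for cat, weights in weight_dictionary.items()
    }
    return max(scores, key=scores.get)


# Hypothetical training titles, for illustration only.
titles_by_category = {
    "exams": ["midterm exam schedule", "final exam notice"],
    "facilities": ["gym repair notice", "classroom repair schedule"],
}
dictionary = build_weight_dictionary(titles_by_category)
print(classify("exam schedule change", dictionary))
print(classify("gym repair", dictionary))
```

Under this assumed weighting, words that occur in every category (here "schedule" and "notice") receive weight 0, so only category-specific words influence the decision.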