• Title/Summary/Keyword: Word Network

Word Network Analysis based on Mutual Information for Ontology of Korean Rural Planning (한국농촌계획 온톨로지 구축을 위한 상호정보 기반 단어연결망 분석)

  • Lee, Jemyung
    • Journal of Korean Society of Rural Planning
    • /
    • v.23 no.3
    • /
    • pp.37-51
    • /
    • 2017
  • There has been growing interest in ontology, especially in the recent knowledge-based industry, and defining a field-customized semantic word network is essential for building one. In this paper, a word network for ontology is established from 785 publications of the Korean Society of Rural Planning (KSRP), dating from 1995 to 2017. Semantic relationships between words in the publications were quantitatively measured with the 'normalized pointwise mutual information' of information theory. Appearance and co-appearance frequencies of nouns and adjectives in phrases were analyzed under the assumption that a 'noun phrase' represents a single 'concept'. For verification, the word network of KSRP was compared with that of WordNet(TM), a world-wide thesaurus network. The results show that the KSRP word network established in this paper provides words' semantic relationships based on the common concepts of the Korean rural planning research field. It is expected that the established word network will give the field of Korean rural planning more opportunity to prepare for the fourth industrial revolution.
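
A minimal sketch of how normalized pointwise mutual information (NPMI) between co-appearing words can be computed from frequency counts; the corpus, the per-phrase co-occurrence unit, and the variable names are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter
from itertools import combinations

def npmi_network(noun_phrases, min_count=5):
    """Build a word network weighted by normalized PMI.

    noun_phrases: iterable of token lists, each treated as one 'concept' unit.
    Returns a dict mapping (word_a, word_b) -> NPMI in [-1, 1].
    """
    word_freq = Counter()
    pair_freq = Counter()
    n_phrases = 0

    for phrase in noun_phrases:
        n_phrases += 1
        tokens = sorted(set(phrase))           # co-appearance within one phrase
        word_freq.update(tokens)
        pair_freq.update(combinations(tokens, 2))

    edges = {}
    for (a, b), c_ab in pair_freq.items():
        if c_ab < min_count:
            continue
        p_a = word_freq[a] / n_phrases
        p_b = word_freq[b] / n_phrases
        p_ab = c_ab / n_phrases
        pmi = math.log(p_ab / (p_a * p_b))
        edges[(a, b)] = pmi / -math.log(p_ab)  # normalize PMI to [-1, 1]
    return edges
```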

Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model (Deep Neural Network 언어모델을 위한 Continuous Word Vector 기반의 입력 차원 감소)

  • Kim, Kwang-Ho;Lee, Donghyun;Lim, Minkyu;Kim, Ji-Hwan
    • Phonetics and Speech Sciences
    • /
    • v.7 no.4
    • /
    • pp.3-8
    • /
    • 2015
  • In this paper, we investigate an input dimension reduction method using continuous word vectors in a deep neural network language model. In the proposed method, continuous word vectors were generated with Google's Word2Vec from a large training corpus to satisfy the distributional hypothesis. 1-of-|V| coded discrete word vectors were replaced with their corresponding continuous word vectors. In our implementation, the input dimension was successfully reduced from 20,000 to 600 when a tri-gram language model was used with a vocabulary of 20,000 words. The total training time was reduced from 30 days to 14 days on the Wall Street Journal training corpus (corpus length: 37M words).
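
A small sketch of the substitution step using gensim's Word2Vec: each 20,000-dimensional 1-of-|V| input is replaced by a low-dimensional continuous vector. The toy corpus and the choice of 600 dimensions per word are assumptions made only to mirror the numbers quoted above.

```python
from gensim.models import Word2Vec

# Toy corpus standing in for the Wall Street Journal training text (assumption).
sentences = [["the", "market", "fell"], ["the", "market", "rose", "sharply"]]

# Each word gets a 600-dimensional continuous vector in place of a
# 20,000-dimensional 1-of-|V| discrete vector.
w2v = Word2Vec(sentences, vector_size=600, window=5, min_count=1, sg=1)

vec = w2v.wv["market"]      # continuous input fed to the NNLM
print(vec.shape)            # (600,)
```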

A Study on Word Sense Disambiguation Using Bidirectional Recurrent Neural Network for Korean Language

  • Min, Jihong;Jeon, Joon-Woo;Song, Kwang-Ho;Kim, Yoo-Sung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.4
    • /
    • pp.41-49
    • /
    • 2017
  • Word sense disambiguation (WSD), which determines the exact meaning of a homonym that can carry different meanings in a single form, is very important for understanding the semantic meaning of a text document. Many recent WSD studies have widely used the NNLM (Neural Network Language Model), in which a neural network is used to represent a document as vectors and to analyze its semantics. Among previous WSD approaches using an NNLM, the RNN (Recurrent Neural Network) model performs better than other models because it can reflect the order of word occurrences in addition to the word appearance information in a document. However, since the RNN model uses only the forward order of word occurrences in a document, it cannot reflect the characteristic of natural language that later words can affect the meanings of preceding words. In this paper, we propose a WSD scheme using a bidirectional RNN that reflects not only the forward order but also the backward order of word occurrences in a document. In the experiments, the accuracy of the proposed model is higher than that of the previous RNN-based method. Hence, it is confirmed that bidirectional word-occurrence order information is useful for WSD in Korean.
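
A minimal PyTorch sketch of the bidirectional-RNN idea: forward and backward hidden states at the target word's position are concatenated and fed to a sense classifier. The use of LSTM cells, the layer sizes, and all names are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class BiRNNWSD(nn.Module):
    """Classify the sense of one target token using left and right context."""
    def __init__(self, vocab_size, n_senses, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_senses)    # forward + backward states

    def forward(self, token_ids, target_pos):
        h, _ = self.rnn(self.emb(token_ids))          # (batch, seq, 2*hidden)
        target_h = h[torch.arange(h.size(0)), target_pos]
        return self.out(target_h)                     # sense logits

# Example: one 5-token sentence, disambiguating the homonym at index 2.
model = BiRNNWSD(vocab_size=1000, n_senses=4)
logits = model(torch.randint(0, 1000, (1, 5)), torch.tensor([2]))
print(logits.shape)   # torch.Size([1, 4])
```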

Development of a Concept Network Useful for Specialized Search Engines (전문검색엔진을 위한 개념망의 개발)

  • 주정은;구상회
    • Journal of Information Technology Applications and Management
    • /
    • v.10 no.2
    • /
    • pp.33-41
    • /
    • 2003
  • It is not easy to find desired information on the world wide web. In this research, we introduce the notion of a concept network that is useful for finding information when it is used in search engines specialized in domains such as medicine, law, or engineering. The concept network we propose is a network in which nodes represent significant concepts in the domain and links represent relationships between the concepts. The concept network constructor may be used as a preprocessor for specialized search engines. When the user enters a target word, our system generates and displays a concept network in which the nodes are concepts closely related to the target word. By reviewing the network, the user may confirm that the target word properly reflects his intention; otherwise, he may replace the target word with better ones discovered in the network. In this research, we propose a detailed method for constructing a concept network, implement a prototype system that constructs concept networks, and illustrate its usefulness by demonstrating a practical case.
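
A short networkx sketch of the concept-network idea: concepts become nodes, co-occurrence strength becomes edge weight, and a query returns the neighbourhood of the target word for the user to review. The toy data, the threshold, and the function names are assumptions for illustration.

```python
import networkx as nx

# Toy domain co-occurrence data standing in for a real medical corpus (assumption).
cooccurrence = {
    ("hypertension", "blood pressure"): 42,
    ("hypertension", "stroke"): 17,
    ("blood pressure", "diastolic"): 25,
    ("stroke", "thrombosis"): 9,
}

G = nx.Graph()
for (a, b), weight in cooccurrence.items():
    if weight >= 5:                      # keep only significant relationships
        G.add_edge(a, b, weight=weight)

def concept_neighbourhood(graph, target, radius=1):
    """Return the sub-network of concepts closely related to the target word."""
    return nx.ego_graph(graph, target, radius=radius)

sub = concept_neighbourhood(G, "hypertension")
print(sorted(sub.nodes()))   # concepts the user can inspect or substitute for the query
```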

Sub-word Based Offline Handwritten Farsi Word Recognition Using Recurrent Neural Network

  • Ghadikolaie, Mohammad Fazel Younessy;Kabir, Ehsanolah;Razzazi, Farbod
    • ETRI Journal
    • /
    • v.38 no.4
    • /
    • pp.703-713
    • /
    • 2016
  • In this paper, we present a segmentation-based method for offline Farsi handwritten word recognition. Although most segmentation-based systems suffer from segmentation errors within the first stages of recognition, using the inherent features of the Farsi writing script, we have segmented the words into sub-words. Instead of using a single complex classifier with many (N) output classes, we have created N simple recurrent neural network classifiers, each having only true/false outputs with the ability to recognize sub-words. Through the extraction of the number of sub-words in each word, and labeling the position of each sub-word (beginning/middle/end), many of the sub-word classifiers can be pruned, and a few remaining sub-word classifiers can be evaluated during the sub-word recognition stage. The candidate sub-words are then joined together and the closest word from the lexicon is chosen. The proposed method was evaluated using the Iranshahr database, which consists of 17,000 samples of Iranian handwritten city names. The results show the high recognition accuracy of the proposed method.
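
The pruning-and-matching logic described above can be sketched as follows. The binary sub-word recognisers are stubbed out behind simple callables, and the lexicon, position labels, and matching by string similarity are illustrative assumptions rather than the paper's exact pipeline.

```python
import difflib

# One binary (true/false) recogniser per known sub-word. Here they are stubs;
# in the paper each is a trained recurrent neural network (stubs are assumptions).
subword_classifiers = {
    "teh": lambda image: image == "img_a",
    "ran": lambda image: image == "img_b",
}

lexicon = ["tehran", "mashhad", "tabriz"]   # toy city-name lexicon (assumption)

def recognise_word(subword_images, position_labels):
    """Recognize each sub-word with the binary classifiers, join the candidates,
    and return the closest lexicon entry. Position-based pruning of classifiers
    is only hinted at here via position_labels."""
    candidates = []
    for image, _position in zip(subword_images, position_labels):
        hits = [sw for sw, clf in subword_classifiers.items() if clf(image)]
        candidates.append(hits[0] if hits else "?")
    joined = "".join(candidates)
    match = difflib.get_close_matches(joined, lexicon, n=1, cutoff=0.0)
    return match[0] if match else None

print(recognise_word(["img_a", "img_b"], ["beginning", "end"]))   # -> tehran
```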

A study on the vowel extraction from the word using the neural network (신경망을 이용한 단어에서 모음추출에 관한 연구)

  • 이택준;김윤중
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 2003.11a
    • /
    • pp.721-727
    • /
    • 2003
  • This study designed and implemented a system that extracts vowels from a word. The system consists of a voice feature extraction module and a neural network module. The voice feature extraction module uses an LPC (Linear Prediction Coefficient) model to extract voice features from a word. The neural network module consists of a learning module and a voice recognition module. The learning module sets up the learning patterns and trains a neural network. Using the information of the trained neural network, the voice recognition module extracts vowels from a word. A neural network was trained on selected vowels (a, eo, o, e, i) to test the performance of the implemented vowel extraction system. Through this experiment, it was confirmed that the speech recognition module could extract vowels from four test words.
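
A small sketch of the pipeline the abstract describes, assuming librosa for LPC coefficients and scikit-learn for the neural network; the frame size, LPC order, and the stand-in training signals are illustrative, not the study's data.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

VOWELS = ["a", "eo", "o", "e", "i"]                  # target vowels from the study

def lpc_features(signal, order=12, frame_len=400, hop=160):
    """Split a word signal into frames and take LPC coefficients as the voice feature."""
    frames = [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len, hop)]
    return np.array([librosa.lpc(f, order=order)[1:] for f in frames])

# Stand-in training data: random signals labelled with vowels (real recordings assumed).
rng = np.random.default_rng(0)
X_train = np.vstack([lpc_features(rng.standard_normal(16000)) for _ in VOWELS])
y_train = np.repeat(VOWELS, len(X_train) // len(VOWELS))

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300)
clf.fit(X_train, y_train)

# Vowel extraction: predict a vowel label for each frame of an unseen word signal.
print(clf.predict(lpc_features(rng.standard_normal(8000)))[:5])
```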

Categorization of Korean News Articles Based on Convolutional Neural Network Using Doc2Vec and Word2Vec (Doc2Vec과 Word2Vec을 활용한 Convolutional Neural Network 기반 한국어 신문 기사 분류)

  • Kim, Dowoo;Koo, Myoung-Wan
    • Journal of KIISE
    • /
    • v.44 no.7
    • /
    • pp.742-747
    • /
    • 2017
  • In this paper, we propose a novel approach that improves the performance of a Convolutional Neural Network (CNN) document classifier by combining word2vec word embeddings with doc2vec-style document vectors. The Word Piece Model (WPM) is empirically shown to outperform other tokenization methods, such as phrase-unit tokenization and a part-of-speech tagger, with substantial experimental evidence (classification rate: 79.5%). We then conducted an experiment classifying Korean news articles into ten categories by feeding word and document vectors generated with WPM to the baseline and the proposed model. In the results, the proposed model showed a higher classification rate (89.88%) than its counterpart (86.89%), a 22.80% relative reduction in classification error. This demonstrates that applying doc2vec to the document classification task is more effective, because doc2vec generates similar document vector representations for documents belonging to the same category.
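
A compact tf.keras sketch of feeding word-vector sequences to a convolutional classifier, in the spirit of the baseline described above; the vocabulary size, vector dimensions, and ten-category output are assumptions, and the embedding is learned here rather than loaded from word2vec/doc2vec as in the paper.

```python
import tensorflow as tf

MAX_LEN, EMB_DIM, N_CLASSES = 200, 100, 10   # sequence length, vector size, news categories

# CNN over a sequence of word vectors; pre-trained word2vec or WPM-based vectors
# could be loaded into the Embedding layer instead of learning it from scratch.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(input_dim=30000, output_dim=EMB_DIM),
    tf.keras.layers.Conv1D(128, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(X_ids, y_labels, epochs=5)   # X_ids: (n_docs, MAX_LEN) token-id matrix (assumed data)
```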

A Comparative Study of Word Embedding Models for Arabic Text Processing

  • Assiri, Fatmah;Alghamdi, Nuha
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.8
    • /
    • pp.399-403
    • /
    • 2022
  • Natural texts are analyzed to obtain their intended meaning so that they can be classified depending on the problem under study. One way to represent words is to generate vectors of real values that encode meaning; this is called word embedding. Similarities between word representations are measured to identify the text class. Word embeddings can be created with the word2vec technique; more recently, fastText was introduced to provide better results when used with classifiers. In this paper, we study the performance of well-known classifiers when using both techniques for word embedding on an Arabic dataset. We applied them to real data collected from Wikipedia and found that word2vec and fastText yielded similar accuracy with all of the classifiers used.
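
A small gensim/scikit-learn sketch of the comparison described: train both embedding models on the same tokenized corpus, average word vectors per document, and score the same classifier on each representation. The toy documents, labels, and classifier choice are assumptions.

```python
import numpy as np
from gensim.models import Word2Vec, FastText
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy tokenized documents and labels standing in for the Wikipedia data (assumption).
docs = [["economy", "market", "growth"], ["match", "goal", "team"],
        ["bank", "inflation", "price"], ["player", "score", "league"]] * 10
labels = [0, 1, 0, 1] * 10

def doc_matrix(model, documents, dim=50):
    """Average a model's word vectors over each document."""
    return np.array([
        np.mean([model.wv[w] for w in d if w in model.wv] or [np.zeros(dim)], axis=0)
        for d in documents
    ])

for name, model_cls in [("word2vec", Word2Vec), ("fastText", FastText)]:
    emb = model_cls(docs, vector_size=50, window=3, min_count=1, epochs=20)
    X = doc_matrix(emb, docs)
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=4).mean()
    print(f"{name}: accuracy = {acc:.2f}")
```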

Hierarchical Structure in Semantic Networks of Japanese Word Associations

  • Miyake, Maki;Joyce, Terry;Jung, Jae-Young;Akama, Hiroyuki
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2007.11a
    • /
    • pp.321-329
    • /
    • 2007
  • This paper reports on the application of network analysis approaches to investigate the characteristics of graph representations of Japanese word associations. Two semantic networks are constructed from two separate Japanese word association databases. The basic statistical features of the networks indicate that they have scale-free and small-world properties and that they exhibit hierarchical organization. A graph clustering method is also applied to the networks with the objective of generating hierarchical structures within the semantic networks. The method is shown to be an efficient tool for analyzing large-scale structures within corpora. As a use of the network clustering results, we briefly introduce two web-based applications: the first is a search system that highlights various possible relations between words according to association type, while the second presents the hierarchical architecture of a semantic network. The systems realize dynamic representations of network structures based on the relationships between words and concepts.
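
A short networkx sketch of the kinds of measurements mentioned: degree distribution (scale-free check), clustering coefficient and path length (small-world check), and a modularity-based clustering of a word-association graph. The toy edges are assumptions, not the Japanese association data.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy word-association edges standing in for a real association database (assumption).
edges = [("dog", "cat"), ("dog", "bone"), ("cat", "mouse"), ("mouse", "cheese"),
         ("cheese", "milk"), ("milk", "cow"), ("cow", "grass"), ("dog", "bark")]
G = nx.Graph(edges)

# Small-world indicators: high clustering with a short average path length.
print("average clustering:", nx.average_clustering(G))
print("average path length:", nx.average_shortest_path_length(G))

# Scale-free indicator: heavy-tailed degree distribution.
print("degree sequence:", sorted((d for _, d in G.degree()), reverse=True))

# Hierarchical/community structure via modularity-based graph clustering.
for i, community in enumerate(greedy_modularity_communities(G)):
    print(f"cluster {i}:", sorted(community))
```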

Factors Influencing Users' Word-of-Mouth Intention Regarding Mobile Apps : An Empirical Study

  • Chen, Yao;Shang, Yu-Fei
    • The Journal of Industrial Distribution & Business
    • /
    • v.9 no.1
    • /
    • pp.51-65
    • /
    • 2018
  • Purpose - This paper aims to identify the factors that influence users' word-of-mouth intention (WOMI) regarding mobile apps, focusing on the impacts of the technology acceptance model (TAM) and social network theory. Research design, data and methodology - Based on TAM, this study integrates social network theory into the research model. The 317 sets of data collected in a survey were tested against the model using SmartPLS. Results - Our findings suggest the following: 1) Personal innovativeness positively influences perceived usefulness (PU), perceived ease of use (PEU) and perceived enjoyment (PE); 2) PEU affects PU and PE; 3) Both PU and satisfaction are directly correlated with WOMI. Although PEU and PE have no direct impact on WOMI, they may indirectly affect WOMI via satisfaction, as PU, PEU and PE all positively influence satisfaction; 4) Network density and network centrality both play a mediating role in the relation between PEU and WOMI, and referral reward programs have a positive moderating effect on the relation between PU and WOMI. Conclusions - The findings of this study illustrate the traits of apps that can promote users' WOMI, as well as the characteristics of people who are more likely to participate in the word-of-mouth process. The findings provide a theoretical basis for app developers to make word-of-mouth a marketing strategy.