• Title/Summary/Keyword: Word Association

Performance Analysis of Opinion Mining using Word2vec (Word2vec을 이용한 오피니언 마이닝 성과분석 연구)

  • Eo, Kyun Sun;Lee, Kun Chang
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2018.05a
    • /
    • pp.7-8
    • /
    • 2018
  • This study analyzes Word2vec-based machine learning classifiers for opinion mining, using BOW (Bag-of-Words) as the benchmark method. With Word2vec and BOW as feature extraction methods, we applied the Laptop and Restaurant datasets to LR, DT, SVM, and RF classifiers. The results showed that Word2vec feature extraction yields better performance.
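
The two feature extraction styles compared in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the 2-dimensional vectors in `toy_embeddings` are hypothetical stand-ins for trained Word2vec embeddings, and averaging word vectors is only one common way (assumed here) to turn them into a document feature.

```python
from collections import Counter


def bow_features(tokens, vocabulary):
    """Bag-of-Words: one count per vocabulary word; word order is ignored."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def mean_embedding_features(tokens, embeddings, dim):
    """Average the vectors of the tokens that have an embedding."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy vectors, NOT real Word2vec output.
toy_embeddings = {"great": [1.0, 0.0], "good": [0.8, 0.2], "slow": [0.0, 1.0]}
vocabulary = ["good", "great", "slow"]
review = ["great", "screen", "good", "battery"]

bow_vec = bow_features(review, vocabulary)
emb_vec = mean_embedding_features(review, toy_embeddings, dim=2)
print(bow_vec, emb_vec)
```

Either vector could then be passed to a classifier such as logistic regression; the BOW vector grows with the vocabulary, while the embedding vector keeps a fixed dimension.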

Public word development based on Ajax (Ajax기반 공유 word 개발)

  • Sin, Yeong-Sik;Ko, Sung-Taek
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2006.11a
    • /
    • pp.7-12
    • /
    • 2006
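  • There was a time when people bought PCs in order to do word processing, and word processing is still one of the indispensable functions of a PC. Recently, along with the Web 2.0 trend, applications that used to run as ordinary desktop applications have been changing and evolving into web applications that run on the Web. In this paper, we therefore built 'Ajax-based shared word', a shared word processor based on Ajax (Asynchronous JavaScript And XML). It is a web application in which users edit documents on the web in WYSIWYG (what you see is what you get) fashion, share documents with one another, and see other users' changes to a document in real time.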

Feature Extraction of Web Document using Association Word Mining (연관 단어 마이닝을 사용한 웹문서의 특징 추출)

  • 고수정;최준혁;이정현
    • Journal of KIISE:Databases
    • /
    • v.30 no.4
    • /
    • pp.351-361
    • /
    • 2003
  • Previous approaches that extract document features through word association suffer from three problems: profiles must be updated periodically, noun phrases must be handled, and probabilities must be calculated for indices. We propose a more effective feature extraction method based on association word mining. Using the Apriori algorithm, this method represents a document's features not as single words but as association-word vectors. The association words that Apriori extracts from a document depend on the confidence, the support, and the number of words composing each association word, and this paper proposes an effective way to determine these three values. Because the method does not use a profile, no profile needs updating, and it generates noun phrases automatically from Apriori's confidence and support without calculating index probabilities. We apply the proposed method to document classification with a Naive Bayes classifier and compare it with the information gain and TF·IDF methods. We also compare it with document classification methods that use index association and word association based on a probability model.
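
The roles of support, confidence, and the number of composing words can be sketched with a small Apriori implementation. This is a minimal illustration under assumptions, not the paper's system: the toy documents, the thresholds, and the function names `apriori` and `confidence` are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support, max_size):
    """Return {frozenset(words): support} for every frequent word set."""
    sets = [set(t) for t in transactions]
    n = len(sets)

    def support(items):
        # Fraction of documents that contain every word in `items`.
        return sum(1 for s in sets if items <= s) / n

    candidates = {frozenset([w]) for s in sets for w in s}
    frequent = {}
    size = 1
    while candidates and size <= max_size:
        level = {}
        for c in candidates:
            sup = support(c)
            if sup >= min_support:
                level[c] = sup
        frequent.update(level)
        size += 1
        # Join frequent sets into candidates one word larger, then prune
        # any candidate that has an infrequent subset.
        joined = {a | b for a, b in combinations(list(level), 2)
                  if len(a | b) == size}
        candidates = {c for c in joined
                      if all(frozenset(sub) in level
                             for sub in combinations(c, size - 1))}
    return frequent


def confidence(frequent, antecedent, consequent):
    """confidence(A -> B) = support(A and B) / support(A)."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Hypothetical toy documents, each reduced to its nouns.
docs = [
    ["neural", "network", "learning"],
    ["neural", "network", "training"],
    ["neural", "network"],
    ["network", "protocol"],
]
frequent = apriori(docs, min_support=0.5, max_size=2)
pair = frozenset({"neural", "network"})
conf_nn = confidence(frequent, frozenset({"neural"}), frozenset({"network"}))
print(frequent[pair], conf_nn)
```

Here `max_size` plays the part of the number of words composing an association word; raising `min_support` shrinks the set of association words kept as features.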

Word Sense Disambiguation using Korean Word Space Model (한국어 단어 공간 모델을 이용한 단어 의미 중의성 해소)

  • Park, Yong-Min;Lee, Jae-Sung
    • The Journal of the Korea Contents Association
    • /
    • v.12 no.6
    • /
    • pp.41-47
    • /
    • 2012
  • Various Korean word sense disambiguation methods have been proposed that use small sense-tagged corpora and dictionary definitions to calculate entropy, conditional probability, mutual information, and similar measures. This paper proposes a method using a Korean Word Space model, which builds word vectors from a large sense-tagged corpus and disambiguates word senses by calculating the similarity between word vectors. An experiment with the Sejong morph sense-tagged corpus showed 94% precision for 200 sentences (583 word types), which is far superior to other known methods.
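
The general idea of disambiguating by vector similarity can be sketched as follows. This is a simplified illustration, not the paper's model: it assumes raw co-occurrence counts as vectors and cosine as the similarity measure, and the sense labels and English context words are hypothetical stand-ins for a sense-tagged Korean corpus.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_sense_vectors(tagged_examples):
    """Sum the context words seen with each sense into one vector per sense."""
    vectors = {}
    for sense, context in tagged_examples:
        vectors.setdefault(sense, Counter()).update(context)
    return vectors


def disambiguate(context, sense_vectors):
    """Pick the sense whose vector is most similar to the given context."""
    ctx = Counter(context)
    return max(sense_vectors, key=lambda s: cosine(ctx, sense_vectors[s]))


# Hypothetical sense-tagged examples for an ambiguous word
# (two senses, labeled "bae_pear" and "bae_ship" for illustration).
tagged = [
    ("bae_pear", ["eat", "sweet", "fruit"]),
    ("bae_pear", ["fruit", "market"]),
    ("bae_ship", ["sail", "sea", "port"]),
    ("bae_ship", ["sea", "captain"]),
]
sense_vectors = build_sense_vectors(tagged)
chosen = disambiguate(["sea", "storm"], sense_vectors)
print(chosen)
```

A larger tagged corpus gives denser sense vectors, which is the motivation the abstract gives for moving beyond small corpora and dictionary definitions.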

A Short Test of English Silent Word Reading for English Language Learners

  • Kalindi, Sylvia C.;McBride, Catherine;Chan, Shingfong;Chung, Kien Hoa Kevin;Lee, Chia-Ying;Maurer, Urs;Tong, Xiuhong
    • Child Studies in Asia-Pacific Contexts
    • /
    • v.5 no.2
    • /
    • pp.95-105
    • /
    • 2015
  • We developed a test of English silent word reading, following work by Mather, Hammill, Allen and Roberts (2004) and Bell, McCallum, Krik, Fuller, and McCane-Bowling (2007), in order to tap Hong Kong Chinese children's reading of English as a foreign language. We created one subtest of individual word reading and another of word reading contextualized within sentences; together, these tests require no more than 10 minutes for administration. In Study 1, we administered the entire test to 552 second grade Hong Kong Chinese children between the ages of 70 and 121 months, from five different primary schools. The subtests of English silent word reading and contextual reading were positively correlated (.78). In Study 2, 77 Hong Kong Chinese second graders were tested on our newly developed English silent word reading test, together with non-verbal IQ, an English word reading test, and a Chinese character recognition test (both read aloud). With age and non-verbal IQ statistically controlled, there was a significant correlation between English silent word reading and the more standard English word reading, read aloud (.78); English silent word reading and Chinese character recognition were also positively correlated (.49). This newly created test is a quick and reliable measure, suitable for both educators and researchers to use to identify poor readers who learn English as a foreign or second language.

Twitter Hashtags Clustering with Word Embedding (Word Embedding기반 Twitter 해시 태그 클러스터링)

  • Nguyen, Tien Anh;Yang, Hyung-Jeong
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2019.05a
    • /
    • pp.179-180
    • /
    • 2019
  • Clustering is considered a promising way to cope with the lack of human-labeled data and the massive volume of social media data in many machine learning tasks. Many researchers have proposed disaster event detection systems that identify special local events, such as missing people or public transport damage, by clustering similar tweets and hashtags together. In this paper, we extend the tweet hashtag feature definition by applying word embedding. The experimental results show that word embedding achieves better performance than the reference method.
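
Grouping hashtags by embedding similarity can be sketched as below. The abstract does not name a clustering algorithm, so this sketch assumes a simple greedy threshold scheme; the 2-dimensional vectors are hypothetical and not the output of any trained embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_hashtags(vectors, threshold):
    """Greedy clustering: join the first cluster whose first member
    is similar enough, otherwise start a new cluster."""
    clusters = []
    for tag, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(tag)
                break
        else:
            clusters.append([tag])
    return clusters


# Hypothetical toy embeddings for four hashtags.
hashtag_vectors = {
    "#flood": [0.9, 0.1],
    "#flooding": [0.8, 0.2],
    "#missing": [0.1, 0.9],
    "#missingperson": [0.2, 0.8],
}
clusters = cluster_hashtags(hashtag_vectors, threshold=0.9)
print(clusters)
```

The result depends on the insertion order of the hashtags and on the threshold, which is one reason a real system would more likely use an established algorithm such as k-means or hierarchical clustering.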

Effects of Association and Imagery on Word Recognition (단어재인에 미치는 연상과 심상성의 영향)

  • Kim, Min-Jung;Lee, Seung-Bok;Jung, Bum-Suk
    • Korean Journal of Cognitive Science
    • /
    • v.20 no.3
    • /
    • pp.243-274
    • /
    • 2009
  • Association, word frequency, and imagery have been considered the main factors that affect word recognition. The present study aimed to examine the imagery effect and its interaction with the association effect while controlling the frequency effect. To explain the imagery effect, we compared two theories (dual-coding theory and the context availability model). A lexical decision task using a priming paradigm was administered. The duration of the prime words was set to 20ms, 50ms, and 450ms in experiments 1, 2, and 3, respectively. The association and imagery of the prime words were manipulated as the main factors in each of the three experiments. In experiment 1, the prime duration (20ms) was not expected to activate the semantic context enough to affect word recognition. As a result, only the imagery effect was statistically significant. In experiment 2, the prime duration was 50ms, which we expected to activate the semantic context without perceptual awareness. The result showed both the association and imagery effects. The interaction between the two effects was also significant. In experiment 3, to activate the semantic context with perceptual awareness, the prime words were presented for 450ms. Only the association effect was statistically significant in this condition. The results of the three experiments suggest that imagery exerts its influence at the early stages of word recognition, while the association effect appears later than the imagery effect. These results imply that the two theories are not contrary to each other. Dual-coding theory concerns the imagery effect, which affects the early stage of word recognition, and the context availability model concerns the semantic context effect, which affects a later stage of word recognition. To explain the word recognition process more completely, an integrated model needs to be developed that considers not only the three main effects but also the stages that extend along the time course of the process.

A Study on Word-of-Mouth Communication of Hairshop Customers (헤어 샵 이용 소비자의 구전 커뮤니케이션에 관한 연구)

  • 황연순
    • Journal of the Korean Home Economics Association
    • /
    • v.41 no.11
    • /
    • pp.189-200
    • /
    • 2003
  • The purpose of this study was to investigate whether the positive and negative word-of-mouth information that hair shop customers receive influences the visiting intention of potential consumers. Data were collected from 354 female university or college students. The results were as follows. First, the positive word-of-mouth information that consumers had experienced in using hair shops concerned employee attitude/technique, consideration of the customer's situation, kindness, time saving/additional service, facilities, reasonable price, and gift service/benefits of location. Second, the negative word-of-mouth information concerned inconsistent service, service focused on non-customers, unreasonable price/insufficient technique/inadequate compensation system, and irrelevance of face-to-face management. Third, consumers had visiting intention on receiving positive word-of-mouth information about consideration of the customer's situation, reasonable price, and gift service/benefits of location, and had no visiting intention on receiving negative information about unreasonable price/insufficient technique/inadequate compensation system.

Vocabulary Analysis of Listening and Reading Texts in 2020 EBS-linked Textbooks and CSAT (2020년 EBS 연계교재와 대학수학능력시험의 듣기 및 읽기 어휘 분석)

  • Kang, Dongho
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.10
    • /
    • pp.679-687
    • /
    • 2020
  • The present study investigates the lexical coverage of the BNC (British National Corpus) word lists and the Ministry of Education's 2015 Basic Vocabulary in the 2020 EBS-linked textbooks and the CSAT. AntWordProfiler was used to measure lexical coverage and frequency. The findings showed that students can understand 95% of the tokens with a vocabulary of 3,000 and 4,000 BNC word-families in the 2020 EBS-linked listening and reading books, respectively. Coverage of 98% requires 4,000 word-families in the EBS-linked listening book and 8,000 word-families in the EBS-linked reading textbook. In the 2020 CSAT, 95% of the tokens can be understood with 2,000 and 4,000 word-families in the listening and reading tests, respectively, while 98% requires 4,000 and 7,000 word-families. In summary, students need to understand more words in the 2020 EBS-linked textbooks than in the 2020 CSAT tests, confirming Kim's (2016) findings.
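
Lexical coverage, the share of running tokens in a text that fall within a known word list, can be computed as in the sketch below. It is a simplified illustration and not AntWordProfiler's procedure: it matches exact lowercase word forms rather than word-families, and the sentence and frequency bands are hypothetical.

```python
def lexical_coverage(tokens, known_words):
    """Fraction of running tokens that appear in the known word set."""
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if t.lower() in known_words)
    return covered / len(tokens)


def bands_needed(tokens, frequency_bands, target):
    """Number of frequency bands (most frequent first) needed to reach
    the target coverage, or None if the bands never reach it."""
    known = set()
    for count, band in enumerate(frequency_bands, start=1):
        known |= band
        if lexical_coverage(tokens, known) >= target:
            return count
    return None


# Hypothetical text and frequency bands (real BNC bands hold
# 1,000 word-families each).
tokens = "the cat sat on the mat near the harbor".split()
bands = [{"the", "on", "cat"}, {"sat", "mat", "near"}, {"harbor"}]

coverage_first_band = lexical_coverage(tokens, bands[0])
print(coverage_first_band, bands_needed(tokens, bands, 0.85))
```

With real data, each band would be one 1,000 word-family list, so a return value of 4 would correspond to the "4,000 word-families" figures reported in the abstract.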

An Analysis of the Word-Final Cluster of the Syllable Structure (음절구조의 어말 자음군에 관한 분석)

  • Oh, Kwan-Young
    • English Language & Literature Teaching
    • /
    • v.10 no.2
    • /
    • pp.67-87
    • /
    • 2004
  • The purpose of this paper is to show how the coda of a syllable and word-final clusters are represented in English syllable structure. Previous theories of the syllable assume that there is only one segment in the coda position. Theories that license only one segment in the coda make it difficult to syllabify a word-final cluster appropriately when it contains more than two segments. I considered three approaches: the previous syllable structure (Selkirk, 1982; Borowsky 1989), sonority sequencing (Giegerich, 1992; Roca, 1999), and feature analysis (Goldsmith, 1990). However, none of these methods gives a satisfactory explanation of word-final clusters. Finally, I suggest a modified syllable representation as an alternative, placing two different appendixes under the Phonological Word, which forms a constituent above the syllable node. This makes it possible to explain the formerly problematic word-final clusters, including morphological information such as an inflectional suffix, in the structure.
