• Title/Summary/Keyword: Word form

Search Results: 382

Concept Hierarchy Creation Using Hypernym Relationship (상위어 관계를 이용한 개념 계층의 생성)

  • Shin, Myung-Keun
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.115-125 / 2006
  • A concept hierarchy represents knowledge in a multi-level form, which is very useful for categorizing, storing, and retrieving data. Traditionally, concept hierarchies have been built manually by domain experts, but manual construction causes problems such as enormous development and maintenance costs and human errors such as inconsistencies. This paper proposes the automatic creation of concept hierarchies using a predefined hypernym relation. To create a hierarchy automatically, we first eliminate the ambiguity of the senses of the data values, then construct the hierarchy by grouping and leveling the remaining senses. We use WordNet glosses for polysemous words to eliminate ambiguity, and WordNet hypernym relations to create the multi-level hierarchy structure.

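The grouping-and-leveling step described in this abstract can be sketched in a few lines. The toy hypernym map below is an illustrative stand-in for WordNet's hypernym relation; a real implementation would query WordNet itself (e.g., via nltk.corpus.wordnet):

```python
# Sketch: build a concept hierarchy by walking hypernym chains upward
# and merging them into one nested tree. The map is a toy stand-in
# for WordNet's hypernym relation.
HYPERNYMS = {
    "poodle": "dog",
    "beagle": "dog",
    "dog": "canine",
    "canine": "mammal",
    "siamese": "cat",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}

def hypernym_chain(word):
    """Walk upward from a word to the root concept."""
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain

def build_hierarchy(values):
    """Group data values under shared hypernyms, one tree level per step."""
    tree = {}
    for value in values:
        chain = list(reversed(hypernym_chain(value)))  # root concept first
        node = tree
        for concept in chain:
            node = node.setdefault(concept, {})
    return tree

hierarchy = build_hierarchy(["poodle", "beagle", "siamese"])
```

Values sharing a hypernym ("poodle", "beagle") end up grouped under the same node, and each hypernym step becomes one level of the hierarchy.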

Study on Influence and Diffusion of Word-of-Mouth in Online Fashion Community Network (온라인 패션커뮤니티 네트워크에서의 구전 영향력과 확산력에 관한 연구)

  • Song, Kieun;Lee, Duk Hee
    • Journal of the Korean Society of Costume / v.65 no.6 / pp.25-35 / 2015
  • The purpose of this study is to investigate the characteristics of members and communities that exert significant influence in an online fashion community through word-of-mouth activities. To identify the influence and diffusion of word-of-mouth, the study selected one online fashion community, sorted the posts and comments concerning fashion information, and put them into matrix form for social network analysis. The results are as follows: First, the fashion community network studied has many active members who relay information very quickly; the average time for information diffusion is very short, taking only one or two days in most cases. Second, word-of-mouth influence is driven by key information produced by only a few members. Influential members account for less than 20% of the total membership and show a high level of degree centrality. The diffusion of word-of-mouth is led by even fewer members, who show a high level of betweenness centrality. Third, the main component shares similar information, with about 70% of all members linked to it, maximizing information influence and diffusion. Fourth, nodes with high degree centrality and high betweenness centrality share similar interests, producing a concentration effect on particular information. In particular, members with high betweenness centrality show interests similar to those of members with high degree centrality, and they help related information spread by actively commenting on posts. These results emphasize the need to create and manage networks that convey fashion information efficiently by identifying key members with high information influence and diffusion, thereby enhancing the outcome of online word-of-mouth.
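The degree-centrality measure behind the first two findings can be sketched as follows. The edge list (who interacted with whom) is a hypothetical toy network; betweenness centrality, also used in the study, would in practice come from a network library such as networkx:

```python
# Sketch: normalized degree centrality for a word-of-mouth network.
# Degree centrality = number of neighbors / (n - 1), so a member
# connected to everyone else scores 1.0.
from collections import defaultdict

EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

def degree_centrality(edges):
    """Compute normalized degree centrality from an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}

centrality = degree_centrality(EDGES)
# Member "A" touches 3 of the 4 other members, so its centrality is 0.75.
```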

Expansion of Topic Modeling with Word2Vec and Case Analysis (Word2Vec를 이용한 토픽모델링의 확장 및 분석사례)

  • Yoon, Sang Hun;Kim, Keun Hyung
    • The Journal of Information Systems / v.30 no.1 / pp.45-64 / 2021
  • Purpose: The traditional topic modeling technique makes it difficult to distinguish the semantics of topics because key words assigned to one topic are often also assigned to other topics. This problem becomes severe when the number of online reviews is small. In this paper, an extended topic modeling technique suitable for analyzing a small number of online reviews is proposed. Design/methodology/approach: The extended model combines the traditional topic modeling technique with the Word2Vec technique. It not only allocates main words to the extracted topics but also generates discriminatory words between topics. In particular, the Word2Vec technique is applied to extract semantically related words for each discriminatory word. In the extended model, the main words and the discriminatory words, together with their semantically similar words, are used in the semantic classification and naming of the extracted topics, so that classification and naming can be performed more clearly. As a case study, online reviews of Udo on the Tripadvisor web site were analyzed with both the traditional topic modeling technique and the proposed extended model, and the two were compared in the process of semantically classifying and naming the extracted topics. Findings: Since the extended model utilizes additional information on top of the existing topic modeling output, it is confirmed to be more effective than existing topic modeling for semantic separation between topics and for assigning topic names.
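The Word2Vec expansion step, retrieving semantically similar words for each discriminatory word, amounts to a cosine-similarity ranking over word vectors. The toy 3-dimensional vectors below are illustrative stand-ins for trained Word2Vec embeddings:

```python
# Sketch: rank vocabulary words by cosine similarity to a given word,
# as gensim's Word2Vec.most_similar() would do over real embeddings.
import math

VECTORS = {
    "beach": [0.9, 0.1, 0.0],
    "coast": [0.8, 0.2, 0.1],
    "hotel": [0.1, 0.9, 0.2],
    "ferry": [0.2, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, topn=1):
    """Return the topn most similar vocabulary words to `word`."""
    scores = [(other, cosine(VECTORS[word], vec))
              for other, vec in VECTORS.items() if other != word]
    return sorted(scores, key=lambda s: -s[1])[:topn]
```

For a discriminatory word like "beach", the nearest neighbor "coast" would be attached to the same topic to sharpen its naming.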

Effects of the Mathematical Modeling Learning on the Word Problem Solving (수학적 모델링 학습이 문장제 해결에 미치는 효과)

  • Shin, Hyun-Yong;Jeong, In-Su
    • Education of Primary School Mathematics / v.15 no.2 / pp.107-134 / 2012
  • The purpose of this study is to investigate the effectiveness of two methods of teaching word problems, one based on mathematical modeling learning (ML) and the other on traditional learning (TL). Additionally, the influence of mathematical modeling learning on word problem solving behavior, on the ability to apply real-world experience in word problem solving, and on beliefs about word problem solving is examined. The results of this study were as follows: First, as to word problem solving behavior, there was a significant difference between the two groups, meaning that ML was effective for word problem solving behavior. Second, in the pre-test all of the students in both the ML and TL groups had a strong tendency to exclude real-world knowledge and sense-making when solving word problems, but a significant difference between the two groups appeared in the post-test. Third, ML was effective in improving traditional beliefs about word problems. Fourth, ML exerted more influence on mathematically strong and average students and had a positive effect on mathematically weak students; high- and average-level students tended to benefit from ML more than their low-level peers. This difference was caused by less involvement from low-level students in group assignments and whole-class discussions. While using the mathematical modeling method, elementary students were able to build various models of problem situations and to justify and elaborate those models through discussion and mutual comparison. This shows that elementary students can participate in mathematical modeling activities via word problems, a result of the use of more authentic tasks, small-group activities and whole-class discussions, the exclusion of the teacher's direct intervention, and efforts to improve classroom culture.
The conclusions drawn from these results are as follows: First, ML can be an effective method for guiding word problem solving behavior away from the direct translation approach (DTA), based on numbers and key words without understanding the problem situation, and toward the meaning-based approach (MBA), which builds rich models of problem situations. Second, ML contributes to attitudes that consider real-world situations when solving word problems. Mathematical modeling activities for word problems can help elementary students understand the relation between word problems and the real world, and will also help them develop the ability to look at the real world mathematically. Third, ML contributes to the development of positive beliefs about mathematics and word problem solving, which teaching focused solely on mathematical operations cannot develop. ML for word problems gives elementary students the opportunity to understand the real world mathematically and increases their modeling abilities; furthermore, it is a very useful way of reforming the current problems of word problem teaching and learning. Therefore, word problems in school mathematics should be replaced by more authentic ones, and modeling activities should be introduced early in elementary school education, which would help change perceptions about word problem teaching.

Preprocessing Technique for Malicious Comments Detection Considering the Form of Comments Used in the Online Community (온라인 커뮤니티에서 사용되는 댓글의 형태를 고려한 악플 탐지를 위한 전처리 기법)

  • Kim Hae Soo;Kim Mi Hui
    • KIPS Transactions on Computer and Communication Systems / v.12 no.3 / pp.103-110 / 2023
  • With the spread of the Internet, anonymous communities emerged alongside communities for communication between people, and many users abuse that anonymity to harm others, for example by posting aggressive posts and comments. In the past, administrators directly checked posts and comments and then deleted or blocked them, but as the number of community users grew, the volume exceeded what administrators could monitor. Initially, word filtering techniques were used to block malicious writing, rejecting any post or comment containing a specific word, but users evaded the filter by, for example, substituting similar words. To solve this problem, deep learning has been used to monitor user posts in real time. Recently, however, communities use words that can only be understood within the community or from a human perspective rather than as ordinary Korean words, and the variety of character types and forms makes it difficult for an artificial intelligence model to learn them all. Therefore, this paper proposes a preprocessing technique in which each character of a sentence is rendered as an image and a CNN model, trained on images of the consonants, vowels, and spacing of Korean characters, converts characters that can only be understood from a human perspective into the characters it predicts. Experiments confirmed that the proposed preprocessing technique increased the performance of LSTM, BiLSTM, and CNN-BiLSTM models by 3.2%, 3.3%, and 4.88%, respectively.
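A first step in such preprocessing, splitting each Hangul syllable into the consonant and vowel parts that would then be rendered as images, can be sketched with standard Unicode arithmetic. The function below is a generic sketch of Hangul decomposition, not the paper's implementation:

```python
# Sketch: decompose a precomposed Hangul syllable (U+AC00..U+D7A3)
# into its (initial consonant, vowel, final consonant) jamo using the
# standard Unicode formula: code = 588*cho + 28*jung + jong.
CHO = [chr(c) for c in range(0x1100, 0x1113)]            # 19 initial consonants
JUNG = [chr(c) for c in range(0x1161, 0x1176)]           # 21 vowels
JONG = [""] + [chr(c) for c in range(0x11A8, 0x11C3)]    # 27 finals + "no final"

def decompose(syllable):
    """Split one Hangul syllable into (initial, vowel, final) jamo."""
    code = ord(syllable) - 0xAC00
    if not 0 <= code < 11172:
        return (syllable,)  # not a Hangul syllable; pass through unchanged
    cho, rest = divmod(code, 588)   # 588 = 21 vowels * 28 finals
    jung, jong = divmod(rest, 28)
    return (CHO[cho], JUNG[jung], JONG[jong])
```

For example, the syllable 한 decomposes into the jamo ㅎ, ㅏ, and final ㄴ, each of which could then be rendered as a separate image for the CNN.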

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one way to handle big data in text mining. For dimensionality reduction, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data requires more computation and can cause both high computational cost and overfitting, so a dimension reduction process is necessary to improve model performance. Diverse methods have been proposed, from merely lessening noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. Moreover, the representation and selection of text features affect the performance of classifiers for sentence classification, one of the tasks of Natural Language Processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods use various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition, word embeddings, low-dimensional vector space representations of words that capture semantic and syntactic information, are also used. To improve performance, recent studies have suggested modifying the word dictionary according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations: once a feature selection algorithm marks certain words as unimportant, words similar to them should likewise have no impact on sentence classification. This study proposes two ways to achieve more accurate classification: selective word elimination under specific rules, and construction of word embeddings based on Word2Vec.
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words with comparatively low information gain values from the raw text and build word embeddings. Second, we additionally remove words that are similar to the words with low information gain values and build word embeddings. The filtered text and word embeddings are then fed to deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle products on Amazon.com, IMDB, and Yelp as datasets, classifying each with the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes exceeded 70% were classified as helpful reviews. Since Yelp only shows the number of helpful votes, we extracted, by random sampling, 100,000 reviews that received more than five helpful votes from among 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters, was applied to each dataset. To evaluate the proposed methods, we compared their performance against Word2Vec and GloVe embeddings that used all the words, and showed that one of the proposed methods outperforms the all-words embeddings: removing unimportant words improves performance, although removing too many words lowers it. Future research should consider diverse preprocessing approaches and in-depth analysis of word co-occurrence for measuring similarity between words. Also, we applied the proposed method only with Word2Vec; other embedding methods such as GloVe, fastText, and ELMo could be combined with the proposed elimination methods to explore the possible combinations of embedding and elimination methods.
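The information-gain score used for word elimination can be sketched as follows: a word's score is how much knowing its presence reduces the entropy of the class labels. The toy documents and labels below are illustrative, not from the paper's datasets:

```python
# Sketch: information gain of a word over binary class labels.
# IG(word) = H(labels) - [p(word) * H(labels | word present)
#                         + p(no word) * H(labels | word absent)]
import math

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    counts = {label: labels.count(label) for label in set(labels)}
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def information_gain(docs, labels, word):
    """Entropy reduction from splitting the corpus on `word` presence."""
    with_word = [l for d, l in zip(docs, labels) if word in d]
    without = [l for d, l in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (len(with_word) / n * entropy(with_word) +
                   len(without) / n * entropy(without))
    return entropy(labels) - conditional

docs = [{"great", "fast"}, {"great", "slow"}, {"bad", "slow"}, {"bad", "fast"}]
labels = ["pos", "pos", "neg", "neg"]
```

Here "great" perfectly separates positive from negative reviews (gain of 1 bit), while "fast" carries no class information (gain of 0) and would be a candidate for elimination.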

The Psychological Reality of Intensification (경음화의 심리적 실체)

  • Lee Mi Jae
    • Proceedings of the KSPS conference / 1996.10a / pp.43-52 / 1996
  • This paper deals with the nature and function of intensification in Korean in a wider scope than has previously received proper attention, covering intensification in initial as well as medial position. Previously unexamined areas of initial intensification are considered, such as the sound split of polysemous words, e.g. (s'eda), (kyongk'i), by means of intensification, the North Korean application of intensification to (wonsu), and the intensification of borrowed English words. The recent phenomenon of 'gwua' intensification was tested on two groups, young students and people over 65 years old, by means of sociolinguistic analysis. The result shows that this intensification is a form of students' violent power and a mark of extreme solidarity among activist students. Thirty-three university students (16 male, 17 female) were asked to write the meanings (feelings, contexts of use, etc.) of words in their normal and intensified forms. The results show that intensification attaches a meaning of 'emphasis,' pushing the emotion to an extreme pole: small to the smallest, exact to perfect exactness, bad to the worst feeling. Four words are splitting to express a different meaning in the intensified form. In conclusion, the nature of the so-called saisiot(t), i.e. intensification, is a voiceless tensed pause, and its functions are the polarization of the original meaning of the word, the sound split of polysemy, and the attachment of social values through intensification.


Aspects of Chinese Korean learners' production of Korean aspiration at different prosodic boundaries (운율 층위에 따른 중국인학습자들의 한국어 유기음화 적용 양상)

  • Yune, Youngsook
    • Phonetics and Speech Sciences / v.9 no.4 / pp.9-17 / 2017
  • The aim of this study is to examine whether Chinese Korean learners (CKL) can correctly produce aspiration in 'lenis obstruent /k/, /t/, /p/, /ʧ/ + /h/' sequences at the lexical and post-lexical levels. For this purpose, 4 Korean native speakers (KNS), 10 advanced CKL, and 10 intermediate CKL participated in a production test. The material analyzed consisted of 10 Korean sentences in which aspiration can apply at different prosodic boundaries (syllable, word, accentual phrase). The results showed that for both KNS and CKL, the rate of application of aspiration differed according to prosodic boundary: aspiration was applied more frequently at the lexical level than at the post-lexical level, and more frequently at word boundaries than at accentual phrase boundaries. For CKL, pronunciation errors were either non-application of aspiration or omission of the coda obstruent. In cases of non-application, CKL produced the target syllable in its underlying form without transforming it into the surface form. In cases of coda obstruent omission, most errors were caused by the inherent complexity of the phonological process.

Video retrieval system based on closed caption (폐쇄자막을 기반한 자막기반 동영상 검색 시스템)

  • 김효진;황인정;이은주;이응혁;민홍기
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2000.12a / pp.57-60 / 2000
  • Although video data is used in many fields, it is very difficult to reuse and search because of its unstructured and complicated form. In this study, we present a video retrieval system based on closed captions synchronized with the video, using the SMIL and SAMI languages, which describe multimedia data in a structured, systematic form. The system works as follows: first, the user inputs a keyword; then a time stamp is extracted from the caption-file string that contains the keyword; finally, the screen shows the corresponding video frame.

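The keyword-to-timestamp lookup at the core of this system can be sketched over a SAMI-style caption snippet. The caption text and tag layout below are illustrative, not the paper's actual data:

```python
# Sketch: find the start times of SAMI caption lines containing a
# keyword, so the player can seek to the matching video frame.
import re

CAPTIONS = """\
<SYNC Start=1000><P>the quick brown fox
<SYNC Start=4000><P>jumps over the lazy dog
<SYNC Start=9000><P>and runs away
"""

def find_times(keyword, sami_text):
    """Return start times (in ms) of caption lines containing `keyword`."""
    times = []
    for match in re.finditer(r"<SYNC Start=(\d+)><P>(.*)", sami_text):
        start, line = int(match.group(1)), match.group(2)
        if keyword in line:
            times.append(start)
    return times
```

A query for "fox" would return the millisecond offset 1000, which the player uses to jump to the corresponding frame.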

Profiling and Co-word Analysis of Teaching Korean as a Foreign Language Domain (프로파일링 분석과 동시출현단어 분석을 이용한 한국어교육학의 정체성 분석)

  • Kang, Beomil;Park, Ji-Hong
    • Journal of the Korean Society for information Management / v.30 no.4 / pp.195-213 / 2013
  • This study aims at establishing the identity of teaching Korean as a Foreign Language (KFL) domain by using journal profiling and co-word analysis in comparison with the relevant and adjacent domains. Firstly, by extracting and comparing topic terms, we calculate the similarity of academic journals of the three domains, KFL, teaching Korean as a Native Language (KNL), and Korean Linguistics (KL). The result shows that the journals of KFL form a distinct cluster from the others. The profiling analysis and co-word analysis are then conducted to visualize the relationship among all the three domains in order to uncover the characteristics of KFL. The findings show that KFL is more similar to KNL than to KL. Finally, the comparison of knowledge structures of these three domains based on the co-word analysis demonstrates the uniqueness of KFL as an independent domain in relation with the other relevant domains.
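The co-word matrix behind such an analysis simply counts how often topic terms appear together in the same document. The term sets below are illustrative, not the study's actual journal data:

```python
# Sketch: pairwise co-occurrence counts of topic terms across documents,
# the matrix underlying co-word analysis and its visualizations.
from collections import Counter
from itertools import combinations

papers = [
    {"grammar", "teaching", "learner"},
    {"grammar", "learner", "assessment"},
    {"phonology", "grammar"},
]

def coword_matrix(docs):
    """Count how many documents each sorted term pair co-occurs in."""
    counts = Counter()
    for doc in docs:
        for pair in combinations(sorted(doc), 2):
            counts[pair] += 1
    return counts

matrix = coword_matrix(papers)
```

High-count pairs form the dense clusters that distinguish one domain's knowledge structure from another's.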