• Title/Summary/Keyword: word class

Probabilistic Segmentation and Tagging of Unknown Words (확률 기반 미등록 단어 분리 및 태깅)

  • Kim, Bogyum;Lee, Jae Sung
    • Journal of KIISE, v.43 no.4, pp.430-436, 2016
  • Handling unknown words such as proper nouns and newly coined words is important if a morphological analyzer is to process documents from varied domains. This study proposes a segmentation and tagging method for unknown Korean words within a 3-step probabilistic morphological analysis. To guess an unknown word, the method exploits the rich set of suffixes that attach to open-class words such as general nouns and proper nouns. Suffix patterns are learned from a morpheme-tagged corpus, and their probabilities are used to segment and tag unknown open-class words in the probabilistic morphological analysis model. Experiments showed that unknown-word processing improved greatly on documents containing many unregistered words.
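
The idea of learning suffix patterns and using their probabilities to split an unknown word can be illustrated with a minimal sketch. This is not the authors' model: the relative-frequency estimate, the function names `learn_suffix_patterns` and `segment_unknown`, and the toy training triples are all assumptions made for illustration; only the tags NNG (general noun) and NNP (proper noun) follow common Korean tagging practice.

```python
from collections import Counter

def learn_suffix_patterns(examples):
    """Estimate P(suffix, tag) by relative frequency.

    examples: iterable of (stem, tag, suffix) triples, as could be read
    off a morpheme-tagged corpus (hypothetical input format).
    """
    counts = Counter((suffix, tag) for _stem, tag, suffix in examples)
    total = sum(counts.values())
    return {key: c / total for key, c in counts.items()}

def segment_unknown(token, probs):
    """Try every stem/suffix split of an unknown token.

    Returns the best (stem, tag, suffix, probability), or None when no
    split ends in a known suffix.
    """
    best = None
    for i in range(1, len(token)):
        stem, suffix = token[:i], token[i:]
        for (known_suffix, tag), p in probs.items():
            if known_suffix == suffix and (best is None or p > best[3]):
                best = (stem, tag, suffix, p)
    return best

# Toy data, invented for this sketch.
toy_corpus = [
    ("학교", "NNG", "에서"),
    ("서울", "NNP", "에서"),
    ("서울", "NNP", "에서"),
    ("친구", "NNG", "를"),
]
probs = learn_suffix_patterns(toy_corpus)
print(segment_unknown("부산에서", probs))
```

With this toy corpus the unseen token is split into a stem and the known suffix, and the stem receives the tag most often seen before that suffix. A real analyzer would combine such scores with the rest of its probabilistic model.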

A Comparative Study of Word Embedding Models for Arabic Text Processing

  • Assiri, Fatmah;Alghamdi, Nuha
    • International Journal of Computer Science & Network Security, v.22 no.8, pp.399-403, 2022
  • Natural-language texts are analyzed to recover their intended meaning so that they can be classified according to the problem under study. One way to represent words is to generate real-valued vectors that encode their meaning; this is called word embedding. Similarities between word representations are then measured to identify the class of a text. Word embeddings can be created with the word2vec technique; more recently, fastText was introduced and is reported to give better results when used with classifiers. This paper studies the performance of well-known classifiers with both embedding techniques on an Arabic dataset. Applied to real data collected from Wikipedia, word2vec and fastText achieved similar accuracy with every classifier used.
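
How similarity between word vectors can drive text classification is sketched below. The sketch assumes the embeddings are already available as a plain dictionary (from word2vec, fastText, or any other source) and uses a nearest-centroid rule as a stand-in; it is not one of the classifiers evaluated in the paper, and the vectors and class names are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def doc_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def nearest_class(tokens, embeddings, class_centroids):
    """Pick the class whose centroid is most similar to the document vector."""
    vec = doc_vector(tokens, embeddings)
    if vec is None:
        return None
    return max(class_centroids, key=lambda c: cosine(vec, class_centroids[c]))

# Toy 2-dimensional embeddings, invented for this sketch.
embeddings = {"goal": [1.0, 0.0], "match": [0.8, 0.2], "vote": [0.0, 1.0]}
centroids = {"sports": [1.0, 0.0], "politics": [0.0, 1.0]}
print(nearest_class(["goal", "match"], embeddings, centroids))
```

Swapping the dictionary for vectors trained by either technique leaves the rest of the pipeline unchanged, which is what makes a side-by-side comparison of embedding methods straightforward.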

An Algorithm for Text Image Watermarking based on Word Classification (단어 분류에 기반한 텍스트 영상 워터마킹 알고리즘)

  • Kim Young-Won;Oh Il-Seok
    • Journal of KIISE:Software and Applications, v.32 no.8, pp.742-751, 2005
  • This paper proposes a novel text image watermarking algorithm based on word classification. Words are classified into K classes using simple features. Several adjacent words are grouped into a segment, and the segments are in turn classified using the word class information. The same amount of information is inserted into each segment class. The signal is encoded by modifying statistics of the inter-word spaces of the segment classes. Subjective comparisons with conventional word-shift algorithms are presented under several criteria.
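
The two classification steps in the abstract (words into K classes, then segments of adjacent words into segment classes) can be sketched as follows. The abstract does not name the features or the grouping rule, so everything here is an assumption: word width in pixels as the single feature, fixed-length segments, and a segment class taken as the sum of its word classes modulo the number of segment classes. The space-modification step itself is not shown.

```python
def classify_words(widths, k=2):
    """Assign each word a class in 0..k-1 by bucketing its pixel width.

    Width is a hypothetical stand-in for the paper's "simple features".
    """
    lo, hi = min(widths), max(widths)
    span = (hi - lo) or 1  # avoid division by zero when all widths match
    return [min(k - 1, (w - lo) * k // span) for w in widths]

def classify_segments(word_classes, seg_len=3, n_seg_classes=2):
    """Group adjacent words into segments and classify each segment.

    Assumed rule: segment class = (sum of its word classes) mod n_seg_classes.
    """
    segments = [
        word_classes[i:i + seg_len]
        for i in range(0, len(word_classes), seg_len)
    ]
    return [sum(seg) % n_seg_classes for seg in segments]

# Toy word widths in pixels, invented for this sketch.
widths = [10, 30, 20, 10, 10, 30]
word_classes = classify_words(widths, k=2)
print(word_classes)
print(classify_segments(word_classes))
```

Because both steps depend only on the words themselves, a decoder can recompute the same segment classes from the marked image before reading the spacing statistics.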

A Study on Promoting Early Reading Ability through an Explicit High-frequency Sight Word Instruction

  • Huh, Keun
    • English Language & Literature Teaching, v.17 no.1, pp.17-35, 2011
  • The purpose of this study was to explore the effect of explicit word instruction on EFL beginning readers and their perception of the learning experience. Data were obtained from 16 fourth graders who took an English class as a development activity. The data include the results of pre- and post-tests of high-frequency sight word recognition, oral reading ability, students' survey responses, and teacher observation. Descriptive statistics were obtained for the pre- and post-test results. The findings from the student survey and teacher observation are also presented and interpreted to better understand the results of the project and the students' perception of the learning experience. The results of this study are as follows. The students' word recognition ability improved dramatically after the project. The students were satisfied with the overall learning experience, perceiving it as helpful and fun. They reported that the explicit word instruction helped their word recognition and reading ability. The results also indicated that the students' confidence in their reading ability was heightened. Several suggestions are made for teachers and researchers on word instruction for young EFL learners who are beginning readers.


Analysis of Word Problems in the Domain of 'Numbers and Operations' of Textbooks from the Perspective of 'Nominalization' (명사화의 관점에서 수와 연산 영역의 교과서 문장제 분석)

  • Chang, Hyewon;Kang, Yunji
    • Education of Primary School Mathematics, v.25 no.4, pp.395-410, 2022
  • Nominalization is a type of grammatical metaphor: the representation of verbal meaning through noun-equivalent phrases. In mathematical word problems, texts that use nominalization have the advantage of clarifying the object to be noted in the mathematization stage, and the disadvantage of complicating sentence structure, which makes the sentences harder to understand and hinders the experience of the full steps of mathematical modelling. The purpose of this study is to analyze word problems in textbooks from the perspective of nominalization, a linguistic element, and to derive implications for the difficulties students face while solving word problems. To this end, the types of nominalization in 341 word problems from the 'Numbers and Operations' content domain of elementary math textbooks under the 2015 revised national curriculum were analyzed in four aspects: grade-band group, main class and unit assessment, specialized class, and word problems requiring a mathematical expression. Based on the analysis results, didactical implications for the linguistic expression of mathematical word problems were derived.

The Relationships among Sport Enjoyment, Leisure Satisfaction and Continuous Participation Intention in Ski Class (스키수업 참여 대학생들의 스포츠 재미와 여가만족 및 지속적 참여의도와의 관계)

  • NAM, Sang-Back;KWON, Jae-Yoon;KIM, Yong-Joon
    • Journal of Fisheries and Marine Sciences Education, v.29 no.3, pp.655-664, 2017
  • The purpose of this study was to verify the relationships among sport enjoyment, leisure satisfaction, and continuous participation intention in ski class. The survey used a purposive sampling method. Of the 280 questionnaires distributed, 271 were retained as the final valid sample after removing 9 with insufficient answers. Hypotheses were tested on the collected data using SPSS 20.0 and AMOS 18.0. The results were as follows. First, sport enjoyment factors had a significant effect on psychological satisfaction and environmental satisfaction. Second, sport enjoyment factors had an effect on re-participation intention and word-of-mouth intention. Third, leisure satisfaction had an effect on re-participation intention and word-of-mouth intention. In addition, sport enjoyment affected continuous participation intention with leisure satisfaction as a partial mediator.

A Comparative Study of the Trisyllabic Words with same form-morpheme and same meaning in Modern Chinese and the Trisyllabic Korean Words Written in Chinese Characters with same form-morpheme and same meaning (현대 중국어의 삼음사(三音詞)와 현용 한국 삼음절(三音節) 한자어(漢字語)의 동형(同形) 동소어(同素語) 비교 연구)

  • CHOE, GEUM DAN
    • Cross-Cultural Studies, v.25, pp.743-773, 2011
  • This study compares 4,791 trisyllabic modern Chinese words from "a dictionary for trisyllabic modern Chinese word" with the corresponding Korean words written in Chinese characters, drawn from the 170,000 entries collected in "new age new Korean dictionary". As a result, a total of 407 corresponding pairs were found, of the following three types: 1) Chinese : Korean 3(2) : 3-syllable Chinese-character words with completely the same form-morpheme and the same meaning, use, and class (376 pairs, 92.38% of 407); 2) Chinese : Korean 3 : 3-syllable Chinese-character words with completely the same form-morpheme and partly the same meaning, use, and class (18 pairs, 4.42% of 407); 3) Chinese : Korean 3 : 3-syllable Chinese-character words with completely the same form-morpheme and different meaning, use, and class (13 pairs, 3.19% of 407).

Word class information in perception of prosodic prominence by Korean learners of English

  • Im, Suyeon
    • Phonetics and Speech Sciences, v.11 no.4, pp.1-8, 2019
  • This study investigates how prosodic prominence is perceived in relation to word class information (parts-of-speech) by Korean learners of English compared with native English speakers in public speech. Two groups, Korean learners of English and native English speakers, were asked to mark the words they perceived as prominent while listening to a speech. Parts-of-speech and three acoustic cues (i.e., max F0, mean phone duration, and mean intensity) were analyzed for each word in the speech. The results showed that content words tended to be higher in pitch and longer in duration than function words. Both groups of listeners rated content words as prominent more frequently than function words. This tendency, however, was significantly greater for Korean learners of English than for native English speakers. Among the parts-of-speech of the content words, Korean learners of English were more likely than native English speakers to judge nouns and verbs as prominent. This study presents evidence that Korean learners of English consider most, if not all, content words as landing locations of prosodic prominence, in line with the previous study on the production of prominence.

WordNet-Based Category Utility Approach for Author Name Disambiguation (저자명 모호성 해결을 위한 개념망 기반 카테고리 유틸리티)

  • Kim, Je-Min;Park, Young-Tack
    • The KIPS Transactions:PartB, v.16B no.3, pp.225-232, 2009
  • Author name disambiguation is essential for improving the performance of document indexing, retrieval, and web search. It resolves the conflict that arises when multiple authors share the same name label. This paper introduces a novel approach that exploits ontologies and a WordNet-based category utility for author name disambiguation. The method uses author knowledge in the form of a populated ontology with various types of properties: titles, abstracts, and co-authors of papers, and authors' affiliations. The author ontology was constructed semi-automatically for the artificial intelligence and semantic web areas using the OWL API and heuristics. Author name disambiguation determines the correct author from the candidate authors in the populated author ontology. Candidate authors are evaluated using the proposed WordNet-based category utility to resolve the ambiguity. Category utility is a tradeoff between intra-class similarity and inter-class dissimilarity of author instances, where author instances are described in terms of attribute-value pairs. The WordNet-based category utility is proposed to exploit concept information in WordNet for the semantic analysis needed for disambiguation. Experiments using the WordNet-based category utility increase the number of disambiguations by about 10% compared with plain category utility, and increase overall accuracy by around 98%.
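
The tradeoff between intra-class similarity and inter-class dissimilarity can be made concrete with the standard category utility measure over attribute-value pairs: for each cluster, the sum of squared conditional probabilities of its attribute values minus the same sum over all instances, weighted by cluster size and averaged over the number of clusters. The sketch below implements only this plain measure; the paper's WordNet-based extension is not reproduced, and the function names and the toy author records are assumptions.

```python
from collections import Counter

def _sq_prob_sum(instances):
    """Sum over attributes and values of P(attribute = value)^2."""
    n = len(instances)
    total = 0.0
    for attr in instances[0]:
        counts = Counter(inst[attr] for inst in instances)
        total += sum((c / n) ** 2 for c in counts.values())
    return total

def category_utility(clusters):
    """Plain category utility of a partition.

    clusters: list of non-empty clusters, each a list of dicts that all
    share the same attribute names.
    """
    everything = [inst for cluster in clusters for inst in cluster]
    n = len(everything)
    baseline = _sq_prob_sum(everything)
    gain = sum(
        len(cluster) / n * (_sq_prob_sum(cluster) - baseline)
        for cluster in clusters
    )
    return gain / len(clusters)

# Toy author instances, invented for this sketch.
clean = [
    [{"venue": "AAAI"}, {"venue": "AAAI"}],
    [{"venue": "ISWC"}, {"venue": "ISWC"}],
]
mixed = [
    [{"venue": "AAAI"}, {"venue": "ISWC"}],
    [{"venue": "AAAI"}, {"venue": "ISWC"}],
]
print(category_utility(clean), category_utility(mixed))
```

A partition that keeps like instances together scores higher than one that mixes them, which is the property a disambiguation step can use to rank candidate assignments.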

Increasing Output Nodes for Performance Improvement of Multilayer Perceptrons (다층퍼셉트론의 성능향상을 위한 출력노드 수 증가)

  • Oh, Sang-Hoon
    • Proceedings of the Korea Contents Association Conference, 2006.11a, pp.13-15, 2006
  • When a multilayer perceptron model is used for pattern classification problems, one output node is usually allocated to each class. In this paper, we increase the number of output nodes for each class and investigate the performance of multilayer perceptrons through simulations of isolated-word recognition problems.
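
One way to picture several output nodes per class is through the target encoding and the decision rule. The abstract does not say how the extra nodes are trained or combined, so the sketch below rests on assumptions: each class owns a contiguous group of output nodes, all nodes in the true class's group are trained toward 1, and the predicted class is the group with the largest summed output. The function names are hypothetical.

```python
def make_target(label, n_classes, nodes_per_class):
    """Target vector with every node of the true class's group set to 1."""
    target = [0.0] * (n_classes * nodes_per_class)
    start = label * nodes_per_class
    for i in range(start, start + nodes_per_class):
        target[i] = 1.0
    return target

def decide(outputs, n_classes, nodes_per_class):
    """Predict the class whose group of output nodes has the largest sum."""
    m = nodes_per_class
    scores = [sum(outputs[c * m:(c + 1) * m]) for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)

# Three classes, two output nodes per class; outputs are invented values.
print(make_target(1, n_classes=3, nodes_per_class=2))
print(decide([0.1, 0.2, 0.9, 0.3, 0.4, 0.5], n_classes=3, nodes_per_class=2))
```

With `nodes_per_class=1` both functions reduce to the usual one-node-per-class encoding and argmax decision.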
