• Title/Summary/Keyword: semantic classification

Improving a Korean Spell/Grammar Checker for the Web-Based Language Learning System (웹기반 언어 학습시스템을 위한 한국어 철자/문법 검사기의 성능 향상)

  • 남현숙;김광영;권혁철
    • Korean Journal of Cognitive Science / v.12 no.3 / pp.1-18 / 2001
  • The goal of this paper is the pedagogical application of a Korean Spell/Grammar Checker to a web-based language learning system for Korean writing. To make instruction in our learning system, 'Urimal Baeumteo', as efficient as possible, we have to improve our Korean Spell/Grammar Checker. Today an NLP system's performance depends on its semantic processing capability. In our Korean Spell/Grammar Checker, the tasks accomplished at the semantic level are the detection and correction of misused derived and compound nouns in the Korean spell-checking device, and the detection and correction of syntactic and semantic errors in the Korean grammar-checking device. We describe a common approach to partial parsing using collocation rules based on dependency grammar. To provide more detailed semantic rules, we classified nouns according to their concepts and subcategorized verbs according to their syntactic and semantic features. Improving the Korean Spell/Grammar Checker makes our learning system active and intelligent in a web-based environment. We acknowledge the flaws in our system: classifying nouns by their meanings and concepts is a time-consuming task, and the analytic unit of this study is principally limited to the phrases in a sentence, so the accurate parsing of embedded sentences remains a difficult problem. For a web-based language learning system, it is also critically important to consider the interface design and the structure of its contents.

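The abstract above describes semantic rules built from noun concept classes and verb subcategorization. A minimal sketch of that idea follows, assuming a toy English lexicon; `NOUN_CLASS`, `VERB_OBJECT_CLASSES`, and `check_collocation` are hypothetical names for illustration and are not taken from the paper or its checker.

```python
# Hypothetical toy lexicon: each noun is mapped to one concept class.
NOUN_CLASS = {
    "water": "liquid",
    "juice": "liquid",
    "bread": "food",
    "book": "artifact",
}

# Hypothetical subcategorization table: the object classes each verb accepts.
VERB_OBJECT_CLASSES = {
    "drink": {"liquid"},
    "eat": {"food"},
    "read": {"artifact"},
}


def check_collocation(verb, object_noun):
    """Return None when the verb-object pair is acceptable or no rule applies,
    otherwise return a short warning message."""
    allowed = VERB_OBJECT_CLASSES.get(verb)
    noun_class = NOUN_CLASS.get(object_noun)
    if allowed is None or noun_class is None:
        return None  # unknown verb or noun: no rule to apply
    if noun_class in allowed:
        return None
    return f"'{object_noun}' ({noun_class}) is an unusual object for '{verb}'"


print(check_collocation("drink", "water"))  # acceptable pair
print(check_collocation("drink", "book"))   # flagged pair
```

A real checker would apply such rules to dependency pairs produced by a partial parser; here the pairs are passed in directly.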

Building Korean Science Textbook Corpus (K-STeC) for research of Scientific Language in Education (교육용 과학언어 연구를 위한 범용 자료로서 과학교과서 말뭉치 K-STeC(Korean Science Textbook Corpus) 구축)

  • Yun, Eunjeong;Kim, Jinho;Nam, Kilim;Song, Hyunju;Ok, Cheolyoung;Choi, Jun;Park, Yunebae
    • Journal of The Korean Association For Science Education / v.38 no.4 / pp.575-585 / 2018
  • In this study, the texts of science textbooks from the past 20 years were collected in order to systematically carry out research on scientific language and scientific terms, which have received little attention in science education. We collected all the science textbooks from elementary school to high school in the 6th curriculum, the 7th curriculum, and the 2009 revised curriculum, and constructed a corpus comprising 132 textbooks in total. A raw corpus, a morphologically annotated corpus, and a semantically annotated corpus of science terms were constructed in sequence. The final science textbook corpus was named K-STeC (Korean Science Textbook Corpus). K-STeC is a semantically annotated corpus with semantic classification and classification of scientific terms, together with meta information: bibliographic information such as curriculum, subject, grade, and publisher; location information such as chapter, section, lesson, page, and sentence; and structure information such as main text, inquiry activities, reference materials, and titles. Over the three-year study period, a new research method was created by integrating the know-how of three fields (linguistic informatics, computer science, and science education), and a large number of experts took part in the labor-intensive work. This paper introduces the new research methodology and outcomes by reviewing the whole research process and methods, and discusses the potential for future development of scientific language research and how to use the results.

Conceptual Structure Analysis of Metamorphic Rock by Earth Science Teachers Using Semantic Network Analysis (언어네트워크분석을 활용한 지구과학교사들의 변성암에 대한 개념 구조 분석)

  • Duk Ho Chung;Chul Min Lee
    • Journal of the Korean Earth Science Society / v.43 no.6 / pp.762-776 / 2022
  • The purpose of this study was to determine the conceptual structure used by earth science teachers to classify metamorphic rocks, as well as the criteria applied in the process of classifying them. To this end, the researchers used the think-aloud method to collect verbal data uttered during the classification of metamorphic rocks by 21 earth science teachers in middle and high schools in Jeollabuk-do, Republic of Korea. The collected verbal data were analyzed using the semantic network analysis method, and the following results were obtained. First, in the process of classifying metamorphic rocks, earth science teachers classified them based on characteristics that can generally be observed in rocks, such as color, compositional minerals, and particle size, and on the foliation that appears in metamorphic rocks. Second, earth science teachers recognize the classification criteria for metamorphic rocks and focus on the type of metamorphism, such as contact metamorphism or regional metamorphism. However, there were cases where rocks were mistakenly classified through incorrect identification. Therefore, it is necessary to provide sufficient observational information about, and experience of, metamorphic rocks to enable earth science teachers to recognize and relate to the scientific process of identifying metamorphic rocks through the phenomena observed.

Image Classification Using Bag of Visual Words and Visual Saliency Model (이미지 단어집과 관심영역 자동추출을 사용한 이미지 분류)

  • Jang, Hyunwoong;Cho, Soosun
    • KIPS Transactions on Software and Data Engineering / v.3 no.12 / pp.547-552 / 2014
  • As social multimedia sites such as Flickr and Facebook become popular, the amount of image information has been increasing very fast, so there have been many studies on accurate social image retrieval. Some of them addressed web image classification using semantic relations between image tags and BoVW (Bag of Visual Words). In this paper, we propose a method that detects the salient region in images using the GBVS (Graph Based Visual Saliency) model, which can eliminate less important regions such as the background. First, we construct the BoVW based on the SIFT algorithm from a database of preliminarily retrieved images with semantically related tags. Second, we detect the salient region in test images using the GBVS model. The image classification results showed higher accuracy than the previous research. Therefore we expect that our method can classify a variety of images more accurately.
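
The two steps in the abstract above (build a bag of visual words, then restrict attention to the salient region) can be sketched as follows. This is only an illustration under stated assumptions: the 2-D "descriptors", the two-word codebook, and the saliency map are toy values standing in for SIFT descriptors, a learned codebook, and GBVS output, and the function names are hypothetical.

```python
from math import dist


def bovw_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest visual word and return the
    normalized word-frequency histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda i: dist(d, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def keep_salient(keypoints, descriptors, saliency, threshold=0.5):
    """Keep only descriptors whose keypoint (row, col) lies in a region whose
    saliency value is at least `threshold` (an assumed cutoff)."""
    return [d for (r, c), d in zip(keypoints, descriptors)
            if saliency[r][c] >= threshold]


# Toy inputs (assumptions, not data from the paper).
codebook = [(0.0, 0.0), (1.0, 1.0)]
keypoints = [(0, 0), (0, 1), (1, 1)]
descriptors = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.8)]
saliency = [[0.1, 0.9],
            [0.2, 0.8]]

salient_descriptors = keep_salient(keypoints, descriptors, saliency)
hist = bovw_histogram(salient_descriptors, codebook)
print(hist)  # histogram over the two visual words, background keypoint removed
```

The resulting histogram would then be the feature vector handed to a classifier; the choice of classifier is not stated in the abstract and is left out here.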

An Experimental Study on Feature Selection Using Wikipedia for Text Categorization (위키피디아를 이용한 분류자질 선정에 관한 연구)

  • Kim, Yong-Hwan;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.29 no.2 / pp.155-171 / 2012
  • In text categorization, core terms of an input document are hardly selected as classification features if they do not occur in a training document set. Besides, synonymous terms with the same concept are usually treated as different features. This study aims to improve text categorization performance by integrating synonyms into a single feature and by replacing input terms not in the training document set with the most similar term occurring in training documents using Wikipedia. For the selection of classification features, experiments were performed in various settings composed of three different conditions: the use of category information of non-training terms, the part of Wikipedia used for measuring term-term similarity, and the type of similarity measures. The categorization performance of a kNN classifier was improved by 0.35~1.85% in $F_1$ value in all the experimental settings when non-learning terms were replaced by the learning term with the highest similarity above the threshold value. Although the improvement ratio is not as high as expected, several semantic as well as structural devices of Wikipedia could be used for selecting more effective classification features.
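
The replacement step described above (an input term missing from the training set is swapped for the most similar training term when the similarity exceeds a threshold) can be sketched as below. The similarity table is a hand-written stand-in for the Wikipedia-based measures in the study, the threshold value is arbitrary, and leaving unmatched terms unchanged is an assumption, since the abstract does not say what happens to them.

```python
def replace_unseen_terms(terms, training_vocab, similarity, threshold=0.5):
    """Replace each term that is absent from `training_vocab` with the most
    similar training term, if that similarity reaches `threshold`."""
    vocab = sorted(training_vocab)  # fixed order keeps tie-breaking deterministic
    result = []
    for term in terms:
        if term in training_vocab or not vocab:
            result.append(term)
            continue
        best = max(vocab, key=lambda t: similarity(term, t))
        if similarity(term, best) >= threshold:
            result.append(best)
        else:
            result.append(term)  # assumption: keep the term when nothing is close enough
    return result


# Hypothetical similarity scores; a real system would derive them from Wikipedia.
TOY_SIMILARITY = {
    ("automobile", "car"): 0.9,
    ("automobile", "engine"): 0.6,
    ("zebra", "car"): 0.1,
}


def toy_similarity(a, b):
    return TOY_SIMILARITY.get((a, b), 0.0)


print(replace_unseen_terms(["car", "automobile", "zebra"],
                           {"car", "engine"}, toy_similarity))
```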

Construction of the Digital Archive System from the Records of Westerners Who Stayed in Korea during the Enlightenment Period of Chosun (개화기 조선 체류 서양인 기록물의 디지털 아카이브 시스템 구축)

  • Chung, Heesun;Kim, Heesoon;Song, Hyun-Sook;Lee, Myeong-Hee
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.27 no.4 / pp.229-249 / 2016
  • This study was conducted to create a digital archive for local cultural contents compiled from the records of westerners who stayed in Korea during the Enlightenment Period of Chosun. The compiled information was gathered from 22 records, and 10 main subjects, 40 sub-subjects, and 239 mini-subjects were derived through the subject classification scheme. Item analysis was conducted using 38 metadata elements, and input data types were classified and stored as a database in Excel. Finally, a web-based digital archiving system was developed for searching and providing information through various access points. Suggestions for future research were to expand the archive contents through continuous excavation of westerners' records; to build an integrated information system of Korean digital archives incorporating individual archive systems; to standardize classification schemes and develop a multidimensional classification system that considers facet structure in cultural heritage areas; to keep contents consistent through standardization of the metadata format; and to build an ontology using semantic search functions and data mining functions.

A Study on the Pattern Distribution of Yin-Yang Ren [음양인] (Used on Questionnaire) (음양인 유형분류에 관한 연구 (설문지를 중심으로))

  • 이상범;최경미;박영배
    • The Journal of Korean Medicine / v.25 no.1 / pp.1-20 / 2004
  • Objectives: Based on an analysis of Yin-Yang [음양] characteristics and symptoms, each person is classified as Yin or Yang, and the validity of the result is statistically analyzed. Methods: From Feb. to May 2003, data were collected through a questionnaire given to 690 patients. The questionnaire was composed of 34 items about personality, habit, sweat, response to coldness, thirst, bowel, urine, physical shape, and (for women only) menstruation. The SD (Semantic Differential) technique was used for each item, so that each item is measured as a contrast between two opposite symptoms. Reliability analysis was used to select items and categories. The Yin-Yang index was developed based on the means of the items in each category. The validity of the Yin-Yang index was investigated using classification and clustering analysis. SPSS V10.0.7 PC was used for the statistical analysis. Results: The obtained results are summarized as follows: 1) We constructed the Yin-Yang index based on the middle point of the sum of categorical means, and then classified each person into Yin or Yang. 2) To investigate the validity of the distribution of personal Yin-Yang degree, the crosstabulation of results from clustering and classification was used. The hit ratio for classification was much higher than the Maximum Chance Criterion ($C_{max}$), and concurrence in the crosstabulation was successful. Therefore we can infer that the distribution of Yin-Yang was valid. Conclusions: Based on Yin-Yang characteristics and symptoms, we analyzed each person's degree of Yin-Yang and confirmed the validity of its distribution. Therefore this index can be used further for Bian-Zheng [변증] and classification of the constitution.

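The validity check in the abstract above compares a classification hit ratio with the Maximum Chance Criterion, which is the share of the largest group (the accuracy obtained by always predicting that group). A small sketch of that comparison follows; the labels are invented toy values, not the study's data, and the function names are hypothetical.

```python
from collections import Counter


def maximum_chance_criterion(labels):
    """C_max: proportion of cases in the largest group."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def hit_ratio(actual, predicted):
    """Proportion of cases whose predicted group matches the actual group."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


# Toy labels (assumptions for illustration only).
actual = ["yin", "yin", "yin", "yang", "yang"]
predicted = ["yin", "yin", "yang", "yang", "yang"]

c_max = maximum_chance_criterion(actual)
ratio = hit_ratio(actual, predicted)
print(c_max, ratio, ratio > c_max)
```

A hit ratio clearly above `c_max` indicates that the classification does better than always guessing the majority group.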

Construction of Hierarchical Classification of User Tags using WordNet-based Formal Concept Analysis (WordNet기반의 형식개념분석기법을 이용한 사용자태그 분류체계의 구축)

  • Hwang, Suk-Hyung
    • Journal of the Korea Society of Computer and Information / v.18 no.10 / pp.149-161 / 2013
  • In this paper, we propose a novel approach to constructing classification hierarchies for the user tags of folksonomies, using a WordNet-based Formal Concept Analysis tool called TagLighter, which was developed in this research. To give evidence of the usefulness of this approach in practice, we describe some experiments on user tag data from the Bibsonomy.org site. The classification hierarchies of user tags constructed by our approach allow us to gain a better understanding of, and further insight into, tagged data during information retrieval and data analysis on folksonomy-based systems. We expect that the proposed approach can be used in the fields of web data mining for folksonomy-based web services, social networking systems, and semantic web applications.
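
Formal Concept Analysis, the technique named above, derives a hierarchy from pairs of (object set, shared attribute set) that are closed under each other. The sketch below enumerates those pairs by brute force for a tiny context; it is not TagLighter's algorithm, and the tag-to-attribute table is hand-written as a stand-in for WordNet hypernyms.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context
    given as {object: set_of_attributes}. Brute force: exponential in the
    number of objects, so suitable for toy inputs only."""
    objects = sorted(context)
    all_attrs = set().union(*context.values())

    def common_attrs(objs):
        attrs = set(all_attrs)
        for o in objs:
            attrs &= context[o]
        return attrs

    def objects_with(attrs):
        return {o for o in objects if attrs <= context[o]}

    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attrs(subset)    # attributes shared by the subset
            extent = objects_with(intent)    # all objects having those attributes
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical tags with hand-assigned broader categories.
toy_context = {
    "python": {"language", "entity"},
    "java": {"language", "entity"},
    "jazz": {"music", "entity"},
}

concepts = formal_concepts(toy_context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by extent inclusion gives the classification hierarchy: here the concept with intent `{"entity"}` sits above the `language` and `music` concepts.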

Building Domain Ontology through Concept and Relation Classification (개념 및 관계 분류를 통한 분야 온톨로지 구축)

  • Huang, Jin-Xia;Shin, Ji-Ae;Choi, Key-Sun
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.562-571 / 2008
  • For the purpose of building a domain ontology, this paper proposes a methodology that builds a core ontology first and then enriches it with the concepts and relations in a domain thesaurus. First, the top-level concept taxonomy of the core ontology is built using a domain dictionary and a general domain thesaurus. Then, the concepts of the domain thesaurus are classified into top-level concepts in the core ontology, and the broader term (BT) - narrower term (NT) and related term (RT) relations are classified into semantic relations defined for the core ontology. To classify concepts, a two-step approach is adopted, in which a frequency-based approach is complemented by a similarity-based approach. To classify relations, two techniques are applied: (i) when training data are insufficient, a rule-based module distinguishes isa relations from non-isa ones, and a pattern-based approach classifies the non-isa relations into non-taxonomic semantic relations; (ii) when training data are sufficient, a maximum-entropy model is adopted for feature-based classification, with a k-NN approach used to filter noise from the training data. A series of experiments shows that the performance of the proposed systems is quite promising and comparable to the judgments of human experts.

Semantic Topic Selection Method of Document for Classification (문서분류를 위한 의미적 주제선정방법)

  • Ko, Kwang-Sup;Kim, Pan-Koo;Lee, Chang-Hoon;Hwang, Myung-Gwon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.1 / pp.163-172 / 2007
  • The web, as a global network, includes text documents, video, sound, and so on, and connects distributed information through links. As the web has developed, it has accumulated abundant information, most of it in text-based documents. Most users use the web to retrieve the information they want, so much research has addressed text document retrieval using methods such as probability, statistics, vector similarity, and Bayesian approaches. This research, however, could not consider both the subject and the semantics of documents. As a result, users have to search again by hand. Korean documents are especially hard to find because research on Korean document classification is insufficient. To overcome these problems, we propose a Korean document classification method for semantic retrieval. The method first extracts the TF value and RV value of the concepts included in a document, and maps them into U-WIN, a Korean vocabulary dictionary, to select the topic of the document. The method can classify documents semantically, and experiments showed its efficiency.
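
The topic-selection idea above can be illustrated with term frequency alone. In this sketch the word-to-concept table is a hypothetical stand-in for U-WIN, the tokens are English toy data, and the RV value is left out because the abstract does not define it.

```python
from collections import Counter


def select_topic(tokens, concept_of):
    """Map tokens to concepts, count concept frequency (TF), and return the
    most frequent concept, or None when no token maps to a concept."""
    concept_tf = Counter(concept_of[t] for t in tokens if t in concept_of)
    if not concept_tf:
        return None
    return concept_tf.most_common(1)[0][0]


# Hypothetical word-to-concept dictionary (not U-WIN data).
toy_concepts = {"dog": "animal", "cat": "animal", "bank": "finance"}

topic = select_topic(["dog", "cat", "bank", "dog", "river"], toy_concepts)
print(topic)
```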