• Title/Summary/Keyword: Automatic Rule Construction (규칙 자동 구축)


The effects of Korean logical ending connective affix on text comprehension and recall (연결어미가 글 이해와 기억에 미치는 효과)

  • Nam, Ki-Chun; Kim, Hyun-Jeong; Park, Chang-Su; Whang, Yu-Mi; Kim, Young-Tae; Sim, Hyun-Sup
    • Annual Conference on Human and Language Technology / 2004.10d / pp.251-258 / 2004
  • This study investigated how Korean connective endings affect text comprehension and recall, and how their effect relates to reading ability. Connective endings expressing causal and additive relations were examined. If connective endings help establish local coherence between two adjacent sentences, their presence should speed up sentence comprehension and aid recall of the text. Furthermore, if reading ability depends in part on the ability to use connective endings appropriately, an interaction between the presence of a connective ending and reading ability should be observed. Experiment 1 examined the effect of causal connective endings on sentence reading time and sentence recall. Causal connective endings facilitated reading of the following sentence, and this facilitation was consistent across reading-ability levels. Their presence also aided sentence recall, again regardless of reading ability. Experiment 2 examined the effect of additive connective endings on sentence reading time and recall, and found a pattern similar to that of the causal endings. Together, the results of Experiments 1 and 2 suggest that causal and additive connective endings positively influence the formation of coherence between adjacent sentences, and that this effect on reading is constant regardless of reading ability.


Dropout Prediction Modeling and Investigating the Feasibility of Early Detection in e-Learning Courses (일반대학에서 교양 e-러닝 강좌의 중도탈락 예측모형 개발과 조기 판별 가능성 탐색)

  • You, Ji Won
    • The Journal of Korean Association of Computer Education / v.17 no.1 / pp.1-12 / 2014
  • Since students' behaviors during e-learning are automatically stored in the LMS (Learning Management System), LMS log data convey valuable information about students' engagement. The purpose of this study is to develop a prediction model of e-learning course dropout by utilizing LMS log data. Log data of 578 college students who registered for e-learning courses at a traditional university were used for logistic regression analysis. The results showed that attendance and study time were significant predictors of dropout, and the model distinguished dropouts from completers of e-learning courses with 96% accuracy. Furthermore, the feasibility of early detection of dropouts by utilizing the model was discussed.
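
A rough sketch of the kind of model this abstract describes is given below: a scikit-learn logistic regression fitted to toy LMS-style log features. The feature names, values, and split are illustrative assumptions, not the study's data.

    # Hedged sketch: logistic regression dropout prediction from LMS log data.
    # Toy features [attendance_rate, weekly_study_minutes] stand in for the
    # attendance and study-time predictors the study found significant.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X = np.array([[0.95, 120], [0.90, 90], [0.40, 15], [0.30, 10],
                  [0.85, 100], [0.20, 5], [0.75, 60], [0.10, 0]])
    y = np.array([0, 0, 1, 1, 0, 1, 0, 1])  # 1 = dropout, 0 = completer

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)
    model = LogisticRegression().fit(X_tr, y_tr)
    print(accuracy_score(y_te, model.predict(X_te)))  # classification accuracy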


Intra-Sentence Segmentation using Maximum Entropy Model for Efficient Parsing of English Sentences (효율적인 영어 구문 분석을 위한 최대 엔트로피 모델에 의한 문장 분할)

  • Kim Sung-Dong
    • Journal of KIISE: Software and Applications / v.32 no.5 / pp.385-395 / 2005
  • Long-sentence analysis has been a critical problem in machine translation because of its high complexity. Intra-sentence segmentation methods have been proposed to reduce parsing complexity. This paper presents an intra-sentence segmentation method based on a maximum entropy probability model that increases the coverage and accuracy of segmentation. We construct rules for choosing candidate segmentation positions by a learning method that uses the lexical context of words tagged as segmentation positions, and we generate a model that assigns a probability value to each candidate segmentation position. The lexical contexts are extracted from a corpus tagged with segmentation positions and are incorporated into the probability model. We construct training data using sentences from the Wall Street Journal and evaluate intra-sentence segmentation on sentences from four different domains. The experiments show about 88% accuracy and about 98% coverage of segmentation. The proposed method also improves parsing efficiency by a factor of 4.8 in speed and 3.6 in space.
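
Since a maximum entropy classifier over discrete context features is equivalent to multinomial logistic regression, the segmentation-position model can be sketched as below; the window features and toy data are assumptions, not the paper's actual feature set.

    # Hedged sketch: maxent classification of candidate segmentation positions
    # from lexical context (scikit-learn's LogisticRegression is a maxent model).
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    def context_features(words, i):
        # Lexical context window around candidate position i (an assumption)
        return {"prev": words[i - 1] if i > 0 else "<s>",
                "curr": words[i],
                "next": words[i + 1] if i + 1 < len(words) else "</s>"}

    # Toy training data: (sentence, indices tagged as segmentation positions)
    train = [("the market fell because investors sold heavily".split(), {3})]
    X, y = [], []
    for words, segs in train:
        for i in range(len(words)):
            X.append(context_features(words, i))
            y.append(1 if i in segs else 0)

    vec = DictVectorizer()
    clf = LogisticRegression().fit(vec.fit_transform(X), y)

    # Probability of each position being a segmentation position
    sent = "prices rose because demand grew".split()
    probs = clf.predict_proba(vec.transform(
        [context_features(sent, i) for i in range(len(sent))]))[:, 1]
    print(probs)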

Direct Load Control Using Active Database (능동 데이터베이스를 이용한 직접부하제어)

  • Choi, Sang-Yule; Kim, Hak-Man
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.20 no.5 / pp.107-115 / 2006
  • The existing DLC system has two functional defects. First, it must be controlled by operators whenever a customer's load exceeds the predefined target load, so uncontrolled load may propagate if an operator makes a mistake. Second, the DLC algorithms currently in use focus on simple ON/OFF load control without relieving the inconvenience of participating customers, which is a major obstacle to attracting customers to demand response programs. This paper presents a direct load control system using an active database. With an active database, the DLC system can control customer loads effectively without operator intervention, and by using a dynamic programming algorithm based on an order of priority for DLC, it can maximize participating customers' satisfaction.
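
The event-condition-action (ECA) rules that an active database evaluates can be sketched in Python as below. The load model and target value are illustrative, and a greedy priority-ordered shed stands in for the paper's dynamic-programming formulation.

    # Hedged sketch of an ECA-style direct load control rule: a measurement
    # update (event) showing the total above the target (condition) triggers
    # shedding loads in priority order until the target is met (action).
    from dataclasses import dataclass

    @dataclass
    class Load:
        name: str
        kw: float
        priority: int  # lower value = shed first (least customer inconvenience)

    TARGET_KW = 500.0  # assumed objective load

    def on_load_update(total_kw, loads):
        shed = []
        if total_kw > TARGET_KW:                                  # condition
            for load in sorted(loads, key=lambda l: l.priority):  # action
                shed.append(load.name)
                total_kw -= load.kw
                if total_kw <= TARGET_KW:
                    break
        return shed

    loads = [Load("HVAC-2", 80, 1), Load("lighting-B", 40, 2), Load("pump-1", 60, 3)]
    print(on_load_update(590.0, loads))  # -> ['HVAC-2', 'lighting-B']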

Integrated Semantic Querying on Distributed Bioinformatics Databases Based on GO (분산 생물정보 DB 에 대한 GO 기반의 통합 시맨틱 질의 기법)

  • Park Hyoung-Woo; Jung Jun-Won; Kim Hyoung-Joo
    • Journal of KIISE: Computing Practices and Letters / v.12 no.4 / pp.219-228 / 2006
  • Many biomedical research groups have been trying to share their outputs to increase the efficiency of research. As part of these efforts, a common ontology named Gene Ontology (GO), which comprises a controlled vocabulary for the functions of genes, was built. However, data from many research groups are distributed, and most systems do not support integrated semantic queries over them. Furthermore, the semantics of the associations between concepts from external classification systems and GO are still not clarified, which makes integrated semantic queries infeasible. In this paper we present an ontology matching and integration system, called AutoGOA, which first resolves the semantics of the associations between concepts semi-automatically and then constructs an integrated ontology containing concepts from GO and external classification systems. We also describe a web-based application, named GOGuide II, which allows the user to browse, query, and visualize the integrated data.
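
What an integrated semantic query over such a merged ontology might look like is sketched below with rdflib; the file name, example namespace, and annotation property are assumptions, and AutoGOA's matching pipeline itself is not reproduced.

    # Hedged sketch: querying a pre-merged GO + external-classification ontology.
    # "integrated_ontology.owl" and the ex: annotation property are placeholders.
    from rdflib import Graph, Namespace

    g = Graph()
    g.parse("integrated_ontology.owl")  # output of the ontology integration step

    ns = {"ex": Namespace("http://example.org/"),
          "obo": Namespace("http://purl.obolibrary.org/obo/"),
          "rdfs": Namespace("http://www.w3.org/2000/01/rdf-schema#")}

    # Gene products annotated with any term under GO:0008150 (biological_process),
    # including terms merged in from external classification systems
    q = """
    SELECT ?gene ?term WHERE {
        ?gene ex:annotatedWith ?term .
        ?term rdfs:subClassOf* obo:GO_0008150 .
    }
    """
    for row in g.query(q, initNs=ns):
        print(row.gene, row.term)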

A method for automatic EPC code conversion based on ontology methodology (온톨로지 기반 EPC 코드 자동 변환 방법)

  • Noh, Young-Sik; Byun, Yung-Cheol
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.3 / pp.452-460 / 2008
  • An ALE-compliant RFID middleware system receives EPC code data from reader devices and internally converts the data into URN format. After filtering and grouping, the system sends the resulting URN codes to applications and/or users. However, not only are the types of EPC codes very diverse, but new kinds of EPC codes may emerge in the future, so RFID middleware requires a method to process all kinds of EPC codes effectively. In this paper, a method to process the various kinds of EPC codes acquired from RFID reader devices in ALE-compliant RFID middleware is proposed. In particular, we propose an approach that uses ontology technology to process not only existing EPC codes but also codes newly defined in the future. That is, we build an ontology of conversion rules for each EPC data type to effectively convert EPC data into URN format. With this approach, the RFID middleware can easily be extended to process a new EPC code simply by adding a conversion-rule ontology for that code.
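
The role of a conversion-rule ontology can be approximated by a rule table keyed on the EPC header, as in the Python sketch below; the bit layout is a simplified assumption (the real EPC Tag Data Standard also splits company prefix and item reference by a partition table), and adding a new code type corresponds to adding a rule.

    # Hedged sketch: rule-driven EPC binary-to-URN conversion. Each rule maps a
    # header bit pattern to a URN scheme and field layout; layouts are simplified.
    CONVERSION_RULES = {
        "00110000": ("sgtin-96", [("filter", 3), ("partition", 3),
                                  ("company_item", 44), ("serial", 38)]),
    }

    def epc_to_urn(epc_hex):
        bits = bin(int(epc_hex, 16))[2:].zfill(len(epc_hex) * 4)
        scheme, layout = CONVERSION_RULES[bits[:8]]  # rule selected by header
        values, pos = [], 8
        for _name, width in layout:
            values.append(int(bits[pos:pos + width], 2))
            pos += width
        return f"urn:epc:tag:{scheme}:" + ".".join(map(str, values))

    print(epc_to_urn("30" + "0" * 22))  # -> urn:epc:tag:sgtin-96:0.0.0.0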

The Unsupervised Learning-based Language Modeling of Word Comprehension in Korean

  • Kim, Euhee
    • Journal of the Korea Society of Computer and Information / v.24 no.11 / pp.41-49 / 2019
  • We build an unsupervised machine learning-based language model that estimates the amount of information needed to process words consisting of subword-level morphemes and syllables. We then investigate whether the reading times of words, reflecting their morphemic and syllabic structures, are predicted by an information-theoretic measure such as surprisal. Specifically, the proposed Morfessor-based unsupervised machine learning model is first trained on a large dataset of sentences from the Sejong Corpus and is then applied to estimate the information-theoretic measure for each word in the test data of Korean words. The reading times of the words in the test data are taken from the Korean Lexicon Project (KLP) database. A comparison between the information-theoretic measures of the words and the corresponding reading times using a linear mixed-effects model reveals a reliable correlation between surprisal and reading time. We conclude that surprisal is positively related to processing effort (i.e. reading time), confirming the surprisal hypothesis.
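
A minimal sketch of the surprisal estimate, assuming the Morfessor 2.0 Python API and a plain-text training file, is shown below; the file name is a placeholder and the paper's exact Sejong Corpus preprocessing is not reproduced.

    # Hedged sketch: word surprisal from a Morfessor baseline model. The cost
    # returned by viterbi_segment is a negative log-probability in nats, which
    # is converted to bits; "sejong_sentences.txt" is a placeholder file.
    import math
    import morfessor

    io = morfessor.MorfessorIO()
    model = morfessor.BaselineModel()
    model.load_data(io.read_corpus_file("sejong_sentences.txt"))
    model.train_batch()  # unsupervised segmentation training

    def surprisal_bits(word):
        segments, cost = model.viterbi_segment(word)
        return cost / math.log(2)  # nats -> bits

    # The surprisal hypothesis: higher surprisal, longer reading time
    print(surprisal_bits("먹었습니다"))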

Re-defining Named Entity Type for Personal Information De-identification and A Generation method of Training Data (개인정보 비식별화를 위한 개체명 유형 재정의와 학습데이터 생성 방법)

  • Choi, Jae-hoon; Cho, Sang-hyun; Kim, Min-ho; Kwon, Hyuk-chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.206-208 / 2022
  • As the big data industry has recently developed significantly, interest in privacy violations caused by personal information leakage has increased. There have been attempts to automate de-identification through named entity recognition in natural language processing. In this paper, named entity recognition data are constructed semi-automatically by identifying sentences containing de-identification information in Korean Wikipedia. Compared with using general named entity recognition data, this can reduce the cost of learning about information that is not subject to de-identification. It also has the advantage of minimizing the additional rule-based and statistical systems needed to classify de-identification information in the output. The named entity recognition data proposed in this paper are classified into twelve categories, including de-identification information such as medical records and family relationships. In an experiment using the generated dataset, KoELECTRA achieved a performance of 0.87796 and RoBERTa 0.88.
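
The semi-automatic generation step can be pictured as matching known de-identification targets against sentences and emitting BIO-tagged rows, roughly as below; the entity lexicon, tag set, and single-token matching are simplifying assumptions.

    # Hedged sketch: turning sentences that contain known de-identification
    # targets into BIO-tagged NER training rows. Multi-token entity spans are
    # not handled here; the lexicon and tags are illustrative.
    def bio_tag(tokens, entities):
        """entities maps surface form -> type, e.g. {"김민호": "PERSON"}."""
        return [(tok, "B-" + entities[tok]) if tok in entities else (tok, "O")
                for tok in tokens]

    sent = "김민호 는 부산대학교 에서 연구 한다".split()
    print(bio_tag(sent, {"김민호": "PERSON", "부산대학교": "ORG"}))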


An Automatic Setting Method of Data Constraints for Cleansing Data Errors between Business Services (비즈니스 서비스간의 오류 정제를 위한 데이터 제약조건 자동 설정 기법)

  • Lee, Jung-Won
    • Journal of the Korea Society of Computer and Information / v.14 no.3 / pp.161-171 / 2009
  • In this paper, we propose an automatic method for setting the data constraints of a data cleansing service, which manages the quality of data exchanged between composite services based on SOA (Service-Oriented Architecture) and minimizes human intervention during the process. Because it is impossible to deal with all kinds of real-world data, we focus on business data (e.g. customer orders, order processing) that are frequently used in services such as CRM (Customer Relationship Management) and ERP (Enterprise Resource Planning). We first generate an extended-element vector by extending the semantics of the data exchanged between composite services, and then build a rule-based system that sets data constraints automatically using the decision tree learning algorithm. We applied this rule-based system to the data cleansing service and achieved an automation rate of over 41% by learning from the data of multiple registered services in the business field.
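
How decision tree learning can turn labeled element vectors into constraint rules is sketched below with scikit-learn; the features describing exchanged business-data elements are assumptions standing in for the paper's extended-element vectors.

    # Hedged sketch: a decision tree learned over toy extended-element vectors;
    # the learned decision paths play the role of automatically set constraints.
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy features: [field_length, is_numeric, has_currency_symbol]
    X = [[5, 1, 0], [10, 0, 0], [8, 1, 1], [3, 1, 0], [12, 0, 0], [7, 1, 1]]
    y = ["quantity", "name", "price", "quantity", "name", "price"]

    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
    # Each root-to-leaf path reads off as a data constraint for cleansing
    print(export_text(tree, feature_names=["length", "numeric", "currency"]))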

A Study on Improving the Data Quality Validation of Underground Facilities(Structure-type) (지하시설물(구조물형) 데이터 품질검증방법 개선방안 연구)

  • Bae, Sang-Keun; Kim, Sang-Min; Yoo, Eun-Jin; Im, Keo-Bae
    • Journal of Cadastre & Land InformatiX / v.51 no.2 / pp.5-20 / 2021
  • Prompted by the sinkholes that occurred nationwide in 2014, the Underground Spatial Integrated Map, which integrates underground information from 15 areas, has been continuously maintained since 2015. However, as disasters and accidents in underground spaces, such as hot-water pipe ruptures, cable tunnel fires, and ground subsidence, continue to occur, demand for improving the quality of underground information keeps increasing. This paper therefore proposes a plan to improve the quality of Underground Spatial Integrated Map data. In particular, among the 15 types of underground information managed through the map, quality validation improvements are proposed for underground facility (structure-type) data, which account for the highest proportion of new construction. To improve the current inspection methods, which rely primarily on visual inspection, we elaborate and subdivide the current quality inspection standards. Specifically, we present an approach for software-based automated inspection of databases, including graphics and attribute information, by adding three quality inspection items (quality inspection methods, rules and flow diagrams, and solvable error types) to the current four items (quality elements, sub-elements, detailed sub-elements, and quality inspection standards).
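
The flavor of such software-based inspection rules is sketched below; the field names, thresholds, and error types are assumptions standing in for the paper's quality inspection standards.

    # Hedged sketch: rule-based inspection of a structure-type underground
    # facility record, covering attribute and simple geometry checks.
    REQUIRED_FIELDS = {"facility_id", "facility_type", "depth_m", "geometry"}

    def inspect(record):
        errors = []
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            errors.append(f"missing attributes: {sorted(missing)}")
        depth = record.get("depth_m")
        if depth is not None and not 0 < depth < 100:  # assumed plausible range
            errors.append(f"implausible depth: {depth} m")
        if len(record.get("geometry") or []) < 2:
            errors.append("degenerate geometry (fewer than 2 vertices)")
        return errors

    print(inspect({"facility_id": "UF-001", "facility_type": "tunnel",
                   "depth_m": 250, "geometry": [(0, 0)]}))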