• Title/Summary/Keyword: sentence complexity (문장의 복잡도)

Search Result 133, Processing Time 0.031 seconds

Query-Based Text Summarization Using Cosine Similarity and NMF (NMF 와 코사인유사도를 이용한 질의 기반 문서요약)

  • Park Sun;Lee Ju-Hong;Ahn Chan-Min;Park Tae-Su;Song Jae-Won;Kim Deok-Hwan
    • Annual Conference of KIPS / 2006.05a / pp.473-476 / 2006
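  • With the growth of the Internet, the volume of information is increasing explosively over time. Faced with this mass of information, retrieval systems return too many results, so users spend too much time finding what they want; this is the information overload problem. Query-based document summarization is becoming increasingly important as a way to address information overload by reducing the time users need to find the information they want. This paper proposes a new query-based document summarization method that uses non-negative matrix factorization (NMF) and cosine similarity. The proposed method requires no prior training on queries and documents. In addition, without the complex step of converting a document into a graph, it reflects the document's inherent structure through the semantic features and semantic variables obtained by NMF, which can improve summarization accuracy. Finally, it summarizes sentences easily with a simple procedure.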
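The abstract above names two ingredients, NMF and cosine similarity, without giving the scoring formula. The following is a minimal sketch under stated assumptions, not the authors' method: it factors a term-by-sentence matrix with the standard multiplicative-update rule, treats the columns of `W` as semantic features and the rows of `H` as semantic variables, and scores each sentence by weighting its semantic variables with the cosine similarity between the query and each feature. The function names, the scoring rule, and the rank `r` are all hypothetical choices made for illustration.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def frob_error(A, W, H):
    """Squared Frobenius norm of A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb))

def nmf(A, r, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative m x n matrix A into W (m x r) and H (r x n)
    using multiplicative updates for the Frobenius objective."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(r)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(m)]
    return W, H

def rank_sentences(A, query, r=2):
    """Rank sentence indices by an assumed relevance score.

    A is a term-by-sentence count matrix; query is a term-count vector.
    The scoring rule below is an illustrative assumption, not the paper's.
    """
    W, H = nmf(A, r)
    features = transpose(W)  # one row per semantic feature, over terms
    sims = [cosine(query, f) for f in features]
    n = len(A[0])
    scores = [sum(sims[k] * H[k][j] for k in range(r)) for j in range(n)]
    order = sorted(range(n), key=lambda j: scores[j], reverse=True)
    return order, scores
```

Calling `rank_sentences(A, query)` returns sentence indices from most to least relevant; a summary would take the first few. No training step is involved, which matches the abstract's claim that no prior learning between query and document is needed.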

CRFs versus Bi-LSTM/CRFs: Automatic Word Spacing Perspective (CRFs와 Bi-LSTM/CRFs의 비교 분석: 자동 띄어쓰기 관점에서)

  • Yoon, Ho;Kim, Chang-Hyun;Cheon, Min-Ah;Park, Ho-min;Namgoong, Young;Choi, Minseok;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2018.10a / pp.189-192 / 2018
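  • Automatic word spacing is the task of using a computer to insert spaces into sentences that have no spacing. In natural language processing it is performed before morphological analysis, and it is a very important preprocessing step because spacing errors affect morphological analysis, syntactic parsing, and later stages, increasing the ambiguity of their results. This paper performs automatic word spacing with CRFs (Conditional Random Fields), a machine learning method, and with bidirectional LSTM/CRFs (Bidirectional Long Short-Term Memory/CRFs), a deep learning method, and then compares and analyzes the performance of the two models. The CRFs model performed slightly better than the bidirectional LSTM/CRFs model. We therefore consider that, in environments such as small devices, applying a model such as CRFs is much more effective for making the model lightweight and improving time complexity.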

Controlled Korean Phrase-Structure Standard Spec. for the Automatic Information Trading Mediator System (정보거래 자동 중개 시스템을 위한 한국어 문형 표준안)

  • Chung, Eui-Sok;Kim, Ki-Tae;Lim, Soo-Jong;Cha, Gun-Hae;Park, Jae-Deuk;Yoon, Bo-Hyun;Kang, Hyun-Kyu
    • Annual Conference on Human and Language Technology / 2000.10d / pp.138-145 / 2000
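  • This paper describes a standard specification of Korean sentence patterns for an automatic information trading mediator system. Such a system automatically connects suppliers and consumers of knowledge and information assets on the Internet, so it requires reliable, high-quality information retrieval that can accurately match the consumer's intent, expressed as text, with the content of the supplier's knowledge and information. However, the complexity and irregularity of natural language make it impossible to guarantee high-quality information retrieval, which depends on accurate language processing. The aim of this paper is therefore to overcome the limits of applying language processing technology by standardizing the way Korean sentences are expressed. It also describes a method for guiding ordinary users' expressions toward the standard sentence patterns. The specification consists of the standard sentence patterns, a method for guiding users toward them, and a lexicon.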

Analysis of problems of current science textbooks perceived by teachers and students in view of learner-centered classroom (학습자 중심 수업 운영의 관점에서 초중등 교사와 학생이 본 현행 과학 교과서의 문제점 분석)

  • Yun, Eunjeong;Kwon, Sung Gi;Park, Yunebae
    • Journal of Science Education / v.39 no.3 / pp.404-417 / 2015
  • Active student participation in class is important for raising the effectiveness of education. In this study, we treated science textbooks as a major factor influencing participation in science class, and aimed to identify the problems of current science textbooks as a tool for promoting student participation and to find ways to improve them. We developed a questionnaire asking about the requirements for, and problems of, science textbooks for learner-centered instruction, and 99 science teachers and 821 students answered it. Students responded that current science textbooks lacked explanation, contained many difficult words and complex sentences, and were uninteresting. Teachers responded that current science textbooks contained too much material, were knowledge-centered, and lacked links to real life and a storyline. In conclusion, science textbooks that revitalize student participation should strengthen links with real life, increase student activities, use words and sentences at a level appropriate for students, strengthen the storyline, and provide sufficient opportunities for students to check their own understanding.

Characteristics of Narrative Writing in Normal Aging: Story Grammar and Syntactic Structure (노년층의 글쓰기 특성 -이야기문법과 구문구조)

  • Kim, Hyeon Ah;Won, Sae Rom;Lee, Bo Eun;Yoon, Ji Hye
    • 재활복지 / v.21 no.1 / pp.193-212 / 2017
  • The elderly often produce irrelevant speech and go off-topic more easily than the young; they also tend to generate fewer syntactic structures and make errors in grammatical morphemes. In particular, the elderly might have more difficulty with writing, since it requires more complex cognitive processes than storytelling. The participants in this study were 32 young people and 32 older people. They were asked to write a short story based on a Korean fairy tale ('Heungbu Nolbu'). The data were analyzed in terms of narrative composition and syntactic structure. The study revealed the following. First, in terms of composition, the elderly group produced significantly fewer story grammar elements and episodes. In addition, the elderly produced more off-topic statements. Second, in terms of syntax, although there was no significant difference between the two groups in the number of complex sentences produced, the elderly group generated more inadequate cohesive devices and used fewer relative and adverbial clauses. These findings suggest that the elderly tend to produce more off-topic statements and show decreased coherence because they use fewer relative and adverbial clauses. However, this study also found that the elderly were able to write more complex and longer sentences using visual feedback.

Maritime Safety Tribunal Ruling Analysis using SentenceBERT (SentenceBERT 모델을 활용한 해양안전심판 재결서 분석 방법에 대한 연구)

  • Bori Yoon;SeKil Park;Hyerim Bae;Sunghyun Sim
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.7 / pp.843-856 / 2023
  • The global surge in maritime traffic has resulted in an increased number of ship collisions, leading to significant economic, environmental, physical, and human damage. The causes of these maritime accidents are multifaceted, often arising from a combination of crew judgment errors, negligence, complexity of navigation routes, weather conditions, and technical deficiencies in the vessels. Given the intricate nuances and contextual information inherent in each incident, a methodology capable of deeply understanding the semantics and context of sentences is imperative. Accordingly, this study utilized the SentenceBERT model to analyze maritime safety tribunal decisions over the last 20 years in the Busan Sea area, which encapsulated data on ship collision incidents. The analysis revealed important keywords potentially responsible for these incidents. Cluster analysis based on the frequency of specific keyword appearances was conducted and visualized. This information can serve as foundational data for the preemptive identification of accident causes and the development of strategies for collision prevention and response.

A Research on Optimal Water Allocation Methodology for Water Management in River Basin (유역 물관리를 위한 최적 물배분 방식에 대한 연구)

  • Lee, Jin-Hee;Lee, Dong-Ryul;Yi, Choong-Sung;Moon, Jang-Won
    • Journal of Wetlands Research / v.10 no.2 / pp.155-164 / 2008
  • As populations expand and economies develop, competition for limited available water resources is increasing among many water users. This has brought greater attention to water allocation under legal and institutional constraints. This paper develops an optimal water allocation methodology for basin-wide water resources allocation, which ensures that scarce water resources are allocated among competing water users. The methodology needs to be based on an optimization technique because of the extended scale of a river basin. The recommended model is developed to accomplish economic efficiency, equity, and sustainability objectives. A case study compares various existing allocation models based on water right systems with the recommended model. The results show the applicability of the model to a complex hydrologic system with legal and institutional constraints.

Extraction of Features in key frames of News Video for Content-based Retrieval (내용 기반 검색을 위한 뉴스 비디오 키 프레임의 특징 정보 추출)

  • Jung, Yung-Eun;Lee, Dong-Seop;Jeon, Keun-Hwan;Lee, Yang-Weon
    • The Transactions of the Korea Information Processing Society / v.5 no.9 / pp.2294-2301 / 1998
  • The aim of this paper is to extract features from each news scene, for example the symbol icon that distinguishes each broadcasting company, and the icons and captions that carry the characteristic and important information of each scene. We propose a method for extracting captions, an important problem in news video, that consists of three steps. First, we convert the input image from a video frame to a YIQ color vector. Next, we divide the input image into distinct regions using its equalized color histogram. Finally, we extract the caption using an edge histogram based on vertical and horizontal lines. We also propose a method that extracts the news icon from selected key frames using histogram differences between frames and divides the video into scenes by the extracted icon. We use edge histogram comparison instead of complex methods based on color histograms, wavelets, or moving objects, which shortens computation through a simpler algorithm, and we obtain good feature extraction results.
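To make the first and last steps of the caption pipeline above more concrete, here is a rough sketch, assuming an image given as rows of `(r, g, b)` floats in `[0, 1]`. It converts each pixel to YIQ with the standard library's `colorsys.rgb_to_yiq`, keeps the luminance channel, and builds a per-row edge histogram by counting large luminance jumps between horizontally adjacent pixels. The region-splitting step is omitted, and the threshold values and function names are hypothetical; the paper's exact edge definition is not given in the abstract.

```python
import colorsys

def luminance(image):
    """Map an RGB image (rows of (r, g, b) floats in [0, 1]) to its Y channel in YIQ."""
    return [[colorsys.rgb_to_yiq(r, g, b)[0] for (r, g, b) in row] for row in image]

def row_edge_histogram(y, threshold=0.5):
    """For each row, count adjacent pixel pairs whose luminance differs by more than threshold."""
    return [sum(1 for a, b in zip(row, row[1:]) if abs(a - b) > threshold)
            for row in y]

def candidate_caption_rows(hist, min_edges):
    """Rows dense in edges are treated as caption candidates (an assumed heuristic)."""
    return [i for i, count in enumerate(hist) if count >= min_edges]
```

Text rows tend to contain many sharp light-to-dark transitions, so they stand out in the histogram while flat background rows stay near zero. The same counting applied to columns would give the complementary histogram.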

TripleDiff: an Incremental Update Algorithm on RDF Documents in Triple Stores (TripleDiff: 트리플 저장소에서 RDF 문서에 대한 점진적 갱신 알고리즘)

  • Lee, Tae-Whi;Kim, Ki-Sung;Yoo, Sang-Won;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.33 no.5 / pp.476-485 / 2006
  • The Resource Description Framework (RDF), which emerged with the semantic web, is becoming established as a standard for representing information about resources on the World Wide Web. Hence, a lot of research on storing and querying RDF documents has been done, and several RDF storage systems, such as Sesame and Jena, have been developed. But research on updating RDF documents is still insufficient. When an RDF document is changed, the data in the RDF triple store also needs to be updated. However, current RDF triple stores do not support incremental update, so updating can be performed only by deleting the old version and then storing the new document. This updating method is very inefficient because RDF documents are steadily updated. Furthermore, the problem becomes worse when several RDF documents are stored in the same database. In this paper, we propose an incremental update algorithm on RDF documents in triple stores. We use a text matching technique on two versions of an RDF document and compensate for the text matching result to find the right target triples to be updated. Through experiments with real-life RDF datasets, we show that our approach updates RDF documents efficiently.
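The contrast between delete-and-reload and incremental update can be illustrated with a small sketch. It is not the TripleDiff algorithm: the paper applies text matching to the two document versions and then corrects the result, whereas this example swaps in a plain set difference over already-parsed triples. It also ignores blank nodes, whose identifiers are local to a document and cannot be compared this simply. The names `triple_delta` and `incremental_update` are hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

def triple_delta(old: Set[Triple], new: Set[Triple]):
    """Return the triples to delete and the triples to insert."""
    return old - new, new - old

def incremental_update(store: Set[Triple], old: Set[Triple], new: Set[Triple]) -> int:
    """Apply only the changed triples to the store, in place.

    Returns the number of store operations performed, for comparison with
    a full reload, which would cost len(old) deletes plus len(new) inserts.
    """
    to_delete, to_insert = triple_delta(old, new)
    store.difference_update(to_delete)
    store.update(to_insert)
    return len(to_delete) + len(to_insert)
```

When one property value changes in a three-triple document, the incremental path touches two triples, while a full reload touches six. Triples belonging to other documents in the same store are left alone, provided they do not coincide with a deleted triple.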

A Model of English Part-Of-Speech Determination for English-Korean Machine Translation (영한 기계번역에서의 영어 품사결정 모델)

  • Kim, Sung-Dong;Park, Sung-Hoon
    • Journal of Intelligence and Information Systems / v.15 no.3 / pp.53-65 / 2009
  • Part-of-speech determination is necessary for resolving part-of-speech ambiguity in English-Korean machine translation. This ambiguity causes high parsing complexity and makes accurate translation difficult. To solve the problem, part-of-speech ambiguity must be resolved after lexical analysis and before parsing. This paper proposes the CatAmRes model, which resolves part-of-speech ambiguity, and compares its performance with that of other part-of-speech tagging methods. The CatAmRes model determines the part of speech using the probability distribution from Bayesian network training and statistical information, both based on the Penn Treebank corpus. The proposed model consists of Calculator and POSDeterminer. Calculator calculates the degree of appropriateness of each part of speech, and POSDeterminer determines the part of speech of the word based on the calculated values. In the experiment, we measure performance using sentences from the WSJ, Brown, and IBM corpora.
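The two-component split described above, a scorer followed by a selector, might look roughly like the sketch below. Everything specific in it is an assumption: the probability tables are made-up toy values, not figures from the paper or the Penn Treebank, and the scoring rule (lexical probability times a previous-tag transition probability) stands in for the Bayesian network, whose structure the abstract does not describe.

```python
# Toy, made-up probabilities for illustration only.
LEXICAL = {
    "to": {"TO": 1.0},
    "book": {"NN": 0.8, "VB": 0.2},
}
TRANSITION = {
    ("<s>", "NN"): 0.1, ("<s>", "VB"): 0.3,
    ("TO", "NN"): 0.05, ("TO", "VB"): 0.9,
}
DEFAULT_TRANSITION = 0.01  # assumed fallback for unseen tag pairs

def calculator(word, prev_tag):
    """Score each candidate tag of a word (assumed rule: lexical x transition)."""
    return {tag: p * TRANSITION.get((prev_tag, tag), DEFAULT_TRANSITION)
            for tag, p in LEXICAL[word].items()}

def pos_determiner(scores):
    """Pick the tag with the highest appropriateness score."""
    return max(scores, key=scores.get)

def tag_sentence(words):
    """Tag words left to right, feeding each decision into the next score."""
    tags, prev = [], "<s>"
    for word in words:
        prev = pos_determiner(calculator(word, prev))
        tags.append(prev)
    return tags
```

With these toy tables, "book" is tagged as a noun at the start of a sentence but as a verb after "to", which is the kind of ambiguity that has to be resolved before parsing.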
