• Title/Summary/Keyword: Korean paper summary

Sentence-Frame based English-to-Korean Machine Translation (문틀기반 영한 자동번역 시스템)

  • Choi, Sung-Kwon;Seo, Kwang-Jun;Kim, Young-Kil;Seo, Young-Ae;Roh, Yoon-Hyung;Lee, Hyun-Keun
    • Annual Conference on Human and Language Technology / 2000.10d / pp.323-328 / 2000
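  • Fifteen years have passed since development of English-to-Korean machine translation systems began in Korea in 1985. Despite those 15 years of technology development, the translation quality of English-to-Korean machine translation systems still does not exceed 40%. The reasons for this low quality can be summarized as follows. (1) Structural ambiguity caused by misidentifying the right boundary when parsing an input sentence: for example, in a coordinate clause, ambiguity over whether the right conjunct clause belongs to the coordinate clause. (2) Incorrect translation of the whole sentence caused by using partial translation patterns, such as phrases or clauses, rather than translation patterns that take the whole sentence as the translation unit. (3) Degradation of translation quality caused by conflicts between newly built translation knowledge and the existing large-scale translation knowledge as the knowledge base keeps growing. To overcome these serious problems, this paper introduces a new English-to-Korean machine translation methodology based on sentence frames. The methodology is currently in use in an English-to-Korean machine translation system for CNN news broadcast captions. It is fundamentally a data-driven methodology. Sentence-frame based machine translation translates at a lower level than rule-based machine translation and at a higher level than example-based machine translation. The methodology should be applicable not only to English-to-Korean translation but also to translation between other language pairs.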

Judgment about the Usefulness of Automatically Extracted Temporal Information from News Articles for Event Detection and Tracking (사건 탐지 및 추적을 위해 신문기사에서 자동 추출된 시간정보의 유용성 판단)

  • Kim Pyung;Myaeng Sung-Hyon
    • Journal of KIISE:Software and Applications / v.33 no.6 / pp.564-573 / 2006
  • Temporal information plays an important role in natural language processing (NLP) applications such as information extraction, discourse analysis, automatic summarization, and question answering. In the topic detection and tracking (TDT) area, the temporal information most often used is the publication date of a message, which is readily available but limited in its usefulness. We developed a relatively simple NLP method for extracting temporal information from Korean news articles, with the goal of improving performance on TDT tasks. To extract temporal information, we make use of finite state automata and a lexicon containing time-revealing vocabulary. Extracted information is converted into a canonicalized representation of a time point or a time duration. We first evaluated the extraction and canonicalization methods for accuracy and then investigated the extent to which the extracted temporal information can help TDT tasks. The experimental results show that time information extracted from text indeed helps improve both precision and recall significantly.
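
The extract-then-canonicalize pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper works on Korean text, whereas this sketch uses English for readability; the lexicon entries, the patterns, and the function name `extract_temporal` are all hypothetical; and Python regular expressions stand in for the finite state automata.

```python
import re
from datetime import date, timedelta

# Hypothetical lexicon: time-revealing words mapped to day offsets
# from the article's publication date.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}
UNIT_DAYS = {"day": 1, "week": 7}

# Regular expressions stand in for the finite state automata.
ABSOLUTE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
RELATIVE = re.compile(r"\b(" + "|".join(RELATIVE_DAYS) + r")\b", re.IGNORECASE)
DURATION = re.compile(r"\bfor (\d+) (day|week)s?\b", re.IGNORECASE)


def extract_temporal(text, published):
    """Return canonical time expressions in the order they appear in text.

    Each item is ("point", ISO date string) or ("duration", number of days).
    """
    found = []  # (character offset, canonical form)
    for m in ABSOLUTE.finditer(text):
        try:
            day = date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
        except ValueError:
            continue  # skip impossible dates such as 2006-13-45
        found.append((m.start(), ("point", day.isoformat())))
    for m in RELATIVE.finditer(text):
        # Resolve relative words against the publication date.
        day = published + timedelta(days=RELATIVE_DAYS[m.group(1).lower()])
        found.append((m.start(), ("point", day.isoformat())))
    for m in DURATION.finditer(text):
        days = int(m.group(1)) * UNIT_DAYS[m.group(2).lower()]
        found.append((m.start(), ("duration", days)))
    found.sort(key=lambda item: item[0])
    return [canonical for _, canonical in found]


text = "Talks began yesterday and will continue for 2 weeks until 2006-06-30."
print(extract_temporal(text, published=date(2006, 6, 15)))
```

With a publication date of 2006-06-15, the example sentence yields the time point 2006-06-14, a duration of 14 days, and the time point 2006-06-30. Resolving relative words against the publication date is one plausible way to combine in-text time expressions with the publication date that TDT systems already use.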

Detection of Incivility based on Attention-embedding and multi-channel CNN (어텐션임베딩과 다채널 CNN 기반 반시민성 검출 알고리즘)

  • Park, Youn-Jung;Lee, Se-Young;Keum, Hee-Jo
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.12 / pp.1880-1889 / 2022
  • Online portal platforms provide news together with user comments, but the anonymity of commenting encourages incivility, and uncivil comments are regarded as a social problem. While many incivility detection studies exist for other languages, in-depth research has not been conducted in Korea because no Korean-language dataset labeled with detailed incivility criteria has been built. In this study, comments were annotated for incivility across a total of 13 items, and the uncivil words were summarized. Furthermore, an attention algorithm was applied to each comment and its summary to extract embedding vectors. A 2-D CNN then followed to detect incivility in the given data. The results show that the proposed algorithm is useful for detecting incivility such as name-calling and offensive tone. This study is expected to contribute to a healthy online comment culture by detecting uncivil comments that hinder democratic discourse.

Exploratory Study on the Specification of Content Knowledge Formation - Based on Analysis of University Writing Textbooks - (글쓰기 내용지식 구성의 세분화에 관한 탐색적 연구 - 대학 글쓰기교재 분석을 중심으로 -)

  • Lee, Ran
    • The Journal of the Korea Contents Association / v.22 no.7 / pp.486-497 / 2022
  • This study aims to subdivide and present the units and standards of knowledge integration by which students form integrated knowledge from content knowledge in college writing classes. To this end, it analyzed three representative writing textbooks used in colleges and examined how they present the formation of integrated knowledge, using qualitative text analysis. The analysis procedure and presentation followed Creswell's spiral analysis model, a method that cyclically repeats the steps from data collection and analysis to presentation. The examination identifies three dimensions of units in forming content knowledge (the whole text, the paragraph, and the sentence) and suggests that all three should be addressed for more systematic education. The study then proposes standards and contents of knowledge integration for each process. For the process of knowledge selection, the suitability and the contradiction between the text materials and the author's thesis were proposed as the standards and contents. For the process of organization and integration, corresponsive integration, contradictive integration, background integration, and synthetic integration were suggested. Finally, for the process of expression and citation, procedural knowledge such as correct expression, spelling, and source attribution was presented. Furthermore, in terms of expression, the study showed that paraphrasing, which is frequently practiced in writing textbooks, needs to be exercised in three dimensions: summarization, connection, and interpretation (or transformation). These results, however, call for further study of the subdivided processes to improve their suitability for university-level writing textbooks, and for a more refined syllabus on systematic knowledge integration. The study proposes these as tasks for future work.