• Title/Abstract/Keywords: Punctuation Marks


자동 구두점 삽입을 이용한 Rich Transcription 생성 (Rich Transcription Generation Using Automatic Insertion of Punctuation Marks)

  • 김지환
    • 대한음성학회지:말소리 / No. 61 / pp. 87-100 / 2007
  • A punctuation generation system that combines prosodic information with acoustic and language model information is presented. Experiments were first conducted on reference text transcriptions. In these experiments, prosodic information was shown to be more useful than language model information. When these information sources were combined, an F-measure of up to 0.7830 was obtained for adding punctuation to a reference transcription. This method of punctuation generation can also be applied to the 1-best output of a speech recogniser. The 1-best output is first time-aligned. Based on the time alignment information, prosodic features are generated. As in the approach used for reference transcriptions, the best sequence of punctuation marks for this 1-best output is found using the prosodic feature model and a language model trained on texts that contain punctuation marks.
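
The abstract above reports an F-measure but does not define how it was scored. As a rough illustration, assuming the usual balanced F-measure over predicted punctuation slots, the score could be computed as in the sketch below. The function name `punctuation_f_measure` and the (word index, mark) representation are assumptions made for this example, not details taken from the paper.

```python
def punctuation_f_measure(reference, hypothesis):
    """Balanced F-measure over (word_index, mark) pairs.

    Assumption: a predicted mark counts as correct only if the same mark
    appears at the same word position in the reference.
    """
    ref = set(reference)
    hyp = set(hypothesis)
    correct = len(ref & hyp)
    precision = correct / len(hyp) if hyp else 0.0
    recall = correct / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: the mark that follows word i is stored as (i, mark).
reference = [(3, ","), (7, "."), (12, "?")]
hypothesis = [(3, ","), (7, "."), (10, ".")]
score = punctuation_f_measure(reference, hypothesis)
```

Here two of the three predicted marks match the reference, so precision and recall are both 2/3.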


한글 문서 인식 시스템 개발 연구 (Development of a Korean Text Recognition System)

  • 고견;이일병
    • 인지과학 / Vol. 1, No. 1 / pp. 77-102 / 1989
  • This paper reports on the development of a system that recognizes Korean characters, numbers, and punctuation marks by a syntactic approach after extracting each character or punctuation mark from a page of text. First, using the projection profile method (Masuda et al. 1985, Pavlidis 1981), we segment a page into regions in column- or row-major order and then extract lines of characters from them. Considering the height, width, and connectivity of each character block, we then extract syllables from the extracted lines. Following the research of Lee and others, we classify syllables into six types of formal pattern (남궁재찬 1982, 이주근 et al. 1981) and punctuation marks and numbers into two kinds of formal pattern, and we discriminate the surface structure of the extracted syllables. Using the Index-Removal algorithm, we subdivide them into 44 kinds of basic Korean subpatterns and special characters (numbers, punctuation marks) and recognize them by a syntactic method (이주근 et al. 1981).
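
The projection profile step mentioned in the abstract above can be illustrated with a small sketch. It assumes a binarized page stored as a list of rows, with 1 for ink and 0 for background. The function names and the zero threshold are choices made for this example and are not taken from the paper.

```python
def projection_profile(image, axis):
    """Sum ink pixels along rows (axis="rows") or columns (axis="cols")."""
    if axis == "rows":
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def segment(profile, threshold=0):
    """Return (start, end) index ranges where the profile exceeds threshold.

    Each range is half-open: it covers indices start .. end - 1.
    """
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Hypothetical 5x4 page: two text lines separated by one blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]
text_lines = segment(projection_profile(page, "rows"))
columns = segment(projection_profile(page, "cols"))
```

Blank rows or columns in the profile mark the gaps between lines or character blocks. A real page would likely need a noise threshold above zero.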

여각본 "동인지문오칠" 잔본(권7~권9)에 대하여 (On the Donginjimun-ouchil, the Remnant Book (Kweon 7~9) of an Incunabulum Published in the Koryo Period)

  • 신승운
    • 한국문헌정보학회지 / Vol. 20 / pp. 473-491 / 1991
  • The conclusions of this article can be summarized as follows: 1. Donginjimun-ouchil (동인지문오칠), published at the close of Koryo, is not only the oldest anthology of its kind but also the only one that survives today. 2. Donginjimun-ouchil consists of 9 Kweons. We can establish this by comparison with Samhansiguegam (삼한시구감), because the latter appears to summarize Donginjimun-ouchil. 3. Donginjimun-ouchil differs from other books and, in particular, has the special feature of including profiles of the characters. 4. From the punctuation marks added to the profiles and from the criticism marks, we can learn the rules of punctuation, and these marks also offer considerable help to the study of poetics.


형태소 분석을 통한 한국어 문장 유형 자동 분류 (Automated Classification of Sentential Types in Korean with Morphological Analysis)

  • 정진우;박종철
    • 한국언어정보학회지:언어와정보 / Vol. 13, No. 2 / pp. 59-97 / 2009
  • The type of a given sentence indicates the speaker's attitude towards the listener and is usually determined by its final endings and punctuation marks. However, some final endings are used in several types of sentences, which means that we cannot identify the sentential type by considering only the final endings and punctuation marks. In this paper, we propose methods of finding other linguistic clues for identifying the sentential type through morphological analysis. We also propose to use these methods to implement a system that automatically classifies sentences in Korean according to their sentential types.


외국인을 위한 한글맞춤법 시안 연구 (A Study on Hangeul Orthography Guidelines for Foreigners)

  • 한재영
    • 한국어교육 / Vol. 28, No. 4 / pp. 273-296 / 2017
  • This study reviews the Hangeul orthography guidelines in the Korean language regulations. A thorough revision of the guidelines is needed: more than 80 years have passed since the unified plan of Korean orthography, on which the current orthography is based, was established in 1933, and approximately 30 years have passed since the current guidelines were issued and promulgated in 1989. The review reflects the requirements of the field of teaching Korean as a foreign language and of modern Korean users. Hangeul orthography consists of six clauses, along with an appendix regarding punctuation marks: 1) general rules, 2) consonants and vowels, 3) matters related to sounds, 4) matters related to forms, 5) spacing between words, and 6) miscellaneous. This paper examines the individual clauses and their specific usages from the perspective of Korean as a foreign language. Based on the review, it suggests the following tasks for establishing a draft of Hangeul orthography for foreigners. A. Among the individual clauses, those that embody aspects of vocabulary education should be addressed in a Korean dictionary and deleted from the Hangeul orthography guidelines. B. The clauses of the Hangeul orthography guidelines should be revised and replaced where necessary. C. The usage examples of the individual clauses should be replaced with more appropriate examples aligned with everyday conversation. D. In order to establish 'Hangeul orthography for foreigners', linguists should continuously review several chapters and the appendix of Hangeul orthography, such as the components on forms, spacing between words, miscellaneous matters, and punctuation marks. The purpose of this review is to pursue simplicity in the Hangeul orthography guidelines and practicality through more realistic examples. The review contributes to facilitating Korean language usage not only for non-native learners but also for native users.

UX/UI 아이콘 디자인 요소로서 특수 문자 체계 연구 (A Study on the Special Characters as UX/UI Icon Design Elements)

  • 송재연
    • 디지털융복합연구 / Vol. 19, No. 5 / pp. 397-405 / 2021
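  • The purpose of this study is to organize the system of special characters used as icon elements in UX/UI design and thereby provide a basis for improving the unclear rules governing their use. As a research method, the theoretical background of UX/UI design and special characters was reviewed, and the relationship between special characters and UX/UI design, along with the tasks that need to be addressed, was identified. Then, as a case study, the system of special characters used in companies' UX/UI icon design guidelines was organized to derive the results. The analysis showed that the types of special characters used in UX/UI were graphic characters, mathematical symbols, punctuation marks, and brackets. The special characters used frequently and in common across the analyzed cases were ▶, ♥, ★, ○, ⊙, +, ×, and ⋯, and their usage system was organized and presented as reference material for standardization. It is hoped that the material in this study will help raise interest in research on systematizing special characters and help establish a framework for future standards.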

운율 경계 정보를 이용한 HMM 기반의 한국어 음성합성 시스템 (An HMM-based Korean TTS synthesis system using phrase information)

  • 주영선;정치상;강홍구
    • 한국방송∙미디어공학회:학술대회논문집 / 한국방송공학회 2011 Summer Conference / pp. 89-91 / 2011
  • In this paper, phrase boundaries in a sentence are predicted, and the resulting phrase break information is applied to an HMM-based Korean text-to-speech synthesis system. Synthesis with phrase break information increases the naturalness of the synthetic speech and the understandability of sentences. To predict these phrase boundaries, context-dependent information is used, such as the forward/backward POS (part-of-speech) of an eojeol, the position of the eojeol in the sentence, the length of the eojeol, and the presence or absence of punctuation marks. The experimental results show that the naturalness of synthetic speech increases with phrase break information.
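
The context features listed in the abstract above could be collected per eojeol roughly as follows. This is only a sketch: the function name, the feature keys, the boundary symbols, and the POS tags in the example are hypothetical, and the abstract does not say how the features are encoded or which predictor consumes them.

```python
PUNCTUATION = ",.?!"


def phrase_break_features(eojeols, pos_tags, index):
    """Collect context features for the eojeol at `index`.

    Assumptions: `eojeols` are non-empty whitespace-separated tokens and
    `pos_tags` holds one (hypothetical) POS tag per eojeol.
    """
    token = eojeols[index]
    last = len(eojeols) - 1
    return {
        # POS of the neighbouring eojeols, with sentence-edge placeholders
        "prev_pos": pos_tags[index - 1] if index > 0 else "<s>",
        "next_pos": pos_tags[index + 1] if index < last else "</s>",
        # Relative position in the sentence, from 0.0 to 1.0
        "position": index / max(last, 1),
        # Length in characters, ignoring trailing punctuation
        "length": len(token.rstrip(PUNCTUATION)),
        # Whether the eojeol ends with a punctuation mark
        "has_punct": token[-1] in PUNCTUATION,
    }


eojeols = ["나는", "어제,", "학교에", "갔다."]
pos_tags = ["NP", "MAG", "NNG", "VV"]  # hypothetical tags for illustration
features = phrase_break_features(eojeols, pos_tags, 1)
```

A classifier trained on such feature dictionaries would then decide whether a phrase break follows each eojeol.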


텍스트의 의미 정보에 기반을 둔 음성컨트롤 태그에 관한 연구 (A Study of Speech Control Tags Based on Semantic Information of a Text)

  • 장문수;정경채;강선미
    • 음성과학 / Vol. 13, No. 4 / pp. 187-200 / 2006
  • Speech synthesis technology is widely used, and its application area is broadening to automatic response services, learning systems for people with disabilities, and so on. However, the sound quality of speech synthesizers has not yet reached a level that satisfies users. To produce synthesized speech, existing synthesizers generate rhythms only from interval information such as spaces and commas, or from a few punctuation marks such as question marks and exclamation marks, so it is not easy to generate the natural rhythms of human speech even when the synthesizer is based on a large speech database. One way to address this problem is to select rhythms after processing the language using higher-level information. This paper proposes a method for generating tags that control rhythms by analyzing the meaning of a sentence together with speech situation information. We use Systemic Functional Grammar (SFG) [4], which analyzes the meaning of a sentence with speech situation information, considering the sentence preceding the given one, the situation of the conversation, the relationships among the people in the conversation, etc. In this study, we generate Semantic Speech Control Tags (SSCT) from the results of SFG's meaning analysis and voice waveform analysis.


Difference, not Differentiation: The Thingness of Language in Sun Yung Shin's Skirt Full of Black

  • Shin, Haerin
    • 영어영문학 / Vol. 64, No. 3 / pp. 329-345 / 2018
  • Sun Yung Shin's poetry collection Skirt Full of Black (2007) brings the author's personal history as a Korean female adoptee to bear upon poetic language in daring formal experiments, instantiating the liminal state of being shuttled across borders to land in an in-between state of marginalization. Other Korean American poets have also drawn on the experience of transnational adoption and racialization to explore the literary potential of English to materialize haunting memories or the untranslatable yet persistent echoes of a lost home that gestures across linguistic boundaries, as seen in the case of Lee Herrick or Jennifer Kwon Dobbs. Shin, however, dismantles the referential foundation of English as a language she was transplanted into through formal transgressions such as frazzled syntax, atypical typography, decontextualized punctuation marks, and phonetic and visual play. The power to signify and thereby differentiate one entity or meaning from another dissipates in the cacophonic feast of signs in Skirt Full of Black; the word fragments of identificatory markers that turn racialized, gendered, and culturally contained subjects into exotic things lose the power to define them as such, and instead become alterities by departing from the conventional meaning-making dynamics of language. Expanding on the avant-garde legacy of Korean American poets Theresa Hak Kyung Cha and Myung Mi Kim to delve further into the liminal space between Korean and American, referential and representational, or spoken and written words, Shin carves out a space for discreteness that does not subscribe to the hierarchical ontology of differential value assignment.

웹 문서를 위한 개선된 문장경계인식 방법 (Improved Sentence Boundary Detection Method for Web Documents)

  • 이충희;장명길;서영훈
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 37, No. 6 / pp. 455-463 / 2010
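  • This paper proposes an improved sentence boundary detection technique, based on statistical language information and post-processing rules, for application to web documents of various forms. To handle web documents in which punctuation is frequently omitted and spacing errors are common, the proposed method was trained on all sentence-final endings that can serve as sentence boundaries. To maximize detection performance, the optimal features and training data were selected through a range of experiments, and errors of the statistical model, which depends on its training data, were corrected with rules. To measure performance by document type, the evaluation used two sets: evaluation set 1 (newspaper articles and blog posts), consisting mainly of written-style text in which punctuation marks chiefly serve as sentence boundaries, and evaluation set 2 (posts from website bulletin boards), consisting mainly of web documents with frequent punctuation omission and spacing errors. F-measure was used as the evaluation metric. A baseline model trained, as in previous studies, with only punctuation marks as sentence boundary candidates achieved 96.5% on evaluation set 1 but only 56.7% on evaluation set 2. The proposed method improves the baseline's features and engine to reflect the characteristics of web documents; the final model achieved 96.3% on evaluation set 2, an improvement of 39.6 percentage points.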
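
The gap between the two evaluation sets in the abstract above can be illustrated with a toy comparison: a splitter that only looks at punctuation finds no boundaries in text where punctuation is omitted, while one that also treats sentence-final endings as candidates does. This is a rule-based sketch, not the paper's statistical model, and the short `FINAL_ENDINGS` list is a hypothetical sample rather than the set of endings the authors trained on.

```python
import re

PUNCTUATION = ".?!"
# Hypothetical sample of sentence-final endings, for illustration only.
FINAL_ENDINGS = ("습니다", "어요", "아요", "네요")


def split_on_punctuation(text):
    """Baseline: split only after sentence-ending punctuation marks."""
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [part for part in parts if part]


def split_on_endings(text):
    """Also close a sentence after a token that ends with a final ending."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        bare = token.rstrip(PUNCTUATION)
        if token[-1] in PUNCTUATION or bare.endswith(FINAL_ENDINGS):
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


# Bulletin-board style text with all punctuation omitted.
post = "오늘 날씨가 좋네요 산책을 갔습니다 내일도 가고 싶어요"
baseline = split_on_punctuation(post)
improved = split_on_endings(post)
```

Matching endings by suffix alone would over-split ordinary words that happen to end the same way, which is one reason a trained model with rule-based correction, as the abstract describes, is needed in practice.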