• Title/Summary/Keyword: Punctuation Marks


Rich Transcription Generation Using Automatic Insertion of Punctuation Marks (자동 구두점 삽입을 이용한 Rich Transcription 생성)

  • Kim, Ji-Hwan
    • MALSORI
    • /
    • no.61
    • /
    • pp.87-100
    • /
    • 2007
  • A punctuation generation system which combines prosodic information with acoustic and language model information is presented. Experiments were conducted first on reference text transcriptions. In these experiments, prosodic information was shown to be more useful than language model information. When these information sources were combined, an F-measure of up to 0.7830 was obtained for adding punctuation to a reference transcription. This method of punctuation generation can also be applied to the 1-best output of a speech recogniser. The 1-best output is first time-aligned. Based on the time alignment information, prosodic features are generated. As in the approach applied to punctuation generation for reference transcriptions, the best sequence of punctuation marks for this 1-best output is found using the prosodic feature model and a language model trained on texts which contain punctuation marks.
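
The combination the abstract describes can be illustrated with a small sketch: at each word boundary, score every candidate punctuation mark by a weighted sum of a prosodic score and a language-model score, then pick the best. The function name, scores, and weight below are toy assumptions for illustration, not the paper's actual models.

```python
# Hypothetical sketch: choose the punctuation mark at one word boundary
# by combining a prosodic score with a language-model score.

def best_punctuation(prosodic_scores, lm_scores, weight=0.6):
    """Return the candidate mark with the highest combined score.

    Both arguments map a candidate mark ('' means no punctuation)
    to a log-probability-like score.
    """
    candidates = prosodic_scores.keys() & lm_scores.keys()
    return max(
        candidates,
        key=lambda mark: weight * prosodic_scores[mark]
        + (1 - weight) * lm_scores[mark],
    )

# One word boundary with made-up scores: the comma wins here.
prosodic = {'': -2.0, ',': -0.5, '.': -1.5}
lm = {'': -1.0, ',': -1.2, '.': -2.5}
print(best_punctuation(prosodic, lm))  # prints ,
```

In the paper's setting the weight between the two knowledge sources would itself be tuned on held-out data; here it is fixed for simplicity.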


Development of a Korean Text Recognition System (한글 문서 인식 시스템 개발 연구)

  • 고견;이일병
    • Korean Journal of Cognitive Science
    • /
    • v.1 no.1
    • /
    • pp.77-102
    • /
    • 1989
  • This paper reports on the development of a recognition system for Korean characters, numbers, and punctuation marks using a syntactic approach, after extracting each character or punctuation mark from a page of text. First, using the projection profile method (Masuda et al. 1985, Pavlidis 1981), we segment a page into column-major or row-major regions and then extract lines of characters from them. Considering the height, width, and connectivity of character blocks, we proceed to extract syllables from the extracted lines. Following the research of Lee and others, we classify syllables into six types of formal patterns (남궁재찬 1982, 이주근 et al. 1981), and punctuation marks and numbers into two kinds of formal patterns, and discriminate the surface structure of the extracted syllables. Using an index-removal algorithm, we subdivide them into 44 kinds of basic Korean subpatterns and special characters (numbers, punctuation marks) and recognize them by a syntactic method (이주근 et al. 1981).

On the Donginjimun-ouchil, the Remnant Book (Kweon 7~9) of an Incunabulum Published in the Koryo Period (여각본 "동인지문오칠" 잔본(권7~권9)에 대하여)

  • Shin Seung-Woon
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.20
    • /
    • pp.473-491
    • /
    • 1991
  • The conclusions of this article are summarized as follows: 1. Donginjimun-ouchil (동인지문오칠), published at the close of the Koryo period, is not only the oldest anthology but also the only one of its kind that survives at present. 2. Donginjimun-ouchil consists of 9 Kweons. We can establish this fact by comparison with Samhansiguegam (삼한시구감), because the latter appears to summarize Donginjimun-ouchil. 3. Donginjimun-ouchil is different from other books and notably has the special feature of including profiles of the characters. 4. From the punctuation marks added to the profiles and the criticism marks, we can deduce the rules of punctuation, and at the same time this offers much assistance to the study of poetics.


Automated Classification of Sentential Types in Korean with Morphological Analysis (형태소 분석을 통한 한국어 문장 유형 자동 분류)

  • Chung, Jin-Woo;Park, Jong-C.
    • Language and Information
    • /
    • v.13 no.2
    • /
    • pp.59-97
    • /
    • 2009
  • The type of a given sentence indicates the speaker's attitude towards the listener and is usually determined by its final endings and punctuation marks. However, some final endings are used in several types of sentences, which means that we cannot identify the sentential type by considering only the final endings and punctuation marks. In this paper, we propose methods of finding other linguistic clues for identifying the sentential type with a morphological analysis. We also propose to use these methods to implement a system that automatically classifies sentences in Korean according to their sentential types.
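
The abstract's point, that final endings and punctuation alone cannot always determine the sentential type, can be sketched as a toy rule-based classifier. The ending lists and the wh-word heuristic below are hypothetical illustrations, not the authors' actual morphological clues.

```python
# Toy sentential-type classifier: ambiguous final endings (e.g. "-어요")
# need extra clues beyond punctuation, such as the presence of a wh-word.

INTERROGATIVE_ENDINGS = {"-니", "-까"}
IMPERATIVE_ENDINGS = {"-어라", "-세요"}

def classify_sentence(final_ending, punctuation, has_wh_word=False):
    # Punctuation is checked first, but it is frequently absent or
    # unreliable, which is why the morphological clues matter.
    if punctuation == "?" or has_wh_word:
        return "interrogative"
    if final_ending in IMPERATIVE_ENDINGS:
        return "imperative"
    if final_ending in INTERROGATIVE_ENDINGS:
        return "interrogative"
    return "declarative"
```

For example, `classify_sentence("-어요", ".")` falls through to "declarative", while the same ending with a wh-word is classified as a question.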


A Study on Hangeul Orthography Guidelines for Foreigners (외국인을 위한 한글맞춤법 시안 연구)

  • Han, Jae young
    • Journal of Korean language education
    • /
    • v.28 no.4
    • /
    • pp.273-296
    • /
    • 2017
  • This study focuses on a review of the Hangeul orthography guidelines in the Korean language regulations. It is indispensable to revise the guidelines thoroughly because it has been more than 80 years since the unified plan of Korean orthography, on which the current orthography is based, was established in 1933. Also, it has been approximately 30 years since 1989, when the current guidelines were issued and promulgated. The viewpoint of this review reflects the requirements of the field of Korean as a foreign language education and of modern Korean users. Hangeul orthography consists of six clauses, along with an appendix regarding punctuation marks: 1) general rules, 2) consonants and vowels, 3) related to sounds, 4) about forms, 5) spacing between words, and 6) miscellaneous. This paper examined individual clauses and specific usages of the clauses, in terms of Korean as a foreign language. Based on the review, this paper suggests the following tasks in order to establish a draft of Hangeul orthography for foreigners. A. Among the individual clauses, some clauses that embody vocabulary education aspects should be addressed in a Korean dictionary and deleted from the Hangeul orthography guidelines. B. The clauses of the Hangeul orthography guidelines should be edited for revision and substitution where necessary. C. The usage of individual clauses should be replaced with more appropriate examples aligned with everyday conversation. D. In order to establish 'Hangeul orthography for foreigners', linguists should continuously review several chapters and the appendix of Hangeul orthography, such as the components about forms, spacing between words, miscellaneous, and punctuation marks. The purpose of this review is to pursue the simplicity of the Hangeul orthography guidelines and practicality in reflecting more realistic examples. This review contributes to facilitating Korean language usage not only for non-native learners but also for native users.

A Study on the Special Characters as UX/UI Icon Design Elements (UX/UI 아이콘 디자인 요소로서 특수 문자 체계 연구)

  • Song, Jae-yeon
    • Journal of Digital Convergence
    • /
    • v.19 no.5
    • /
    • pp.397-405
    • /
    • 2021
  • The purpose of this study is to organize the system of special characters as UX/UI icon design elements, thus laying the groundwork for improving their currently unclear usage regulations. This study examines the theoretical background of UX/UI design and special characters and identifies the relations between them and the associated design tasks. In addition, a case study summarizes the systems of special characters used in companies' UX/UI icon design guidelines to produce the study results. As a result of the analysis, the special character types used in UX/UI were graphic characters, mathematical symbols, punctuation marks, and parentheses. The special characters commonly used in the analyzed cases, iOS, Android, and Windows, are ▶, ♥, ★, ○, ⊙, +, ×, ⋯. This study therefore organizes these common characters as a step toward standardization. Hopefully, this study contributes to increasing interest in the study of special characters in the UX/UI design field and helps establish a framework for future industrial standards.

An HMM-based Korean TTS synthesis system using phrase information (운율 경계 정보를 이용한 HMM 기반의 한국어 음성합성 시스템)

  • Joo, Young-Seon;Jung, Chi-Sang;Kang, Hong-Goo
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2011.07a
    • /
    • pp.89-91
    • /
    • 2011
  • In this paper, phrase boundaries in a sentence are predicted, and phrase break information is applied to an HMM-based Korean Text-to-Speech synthesis system. Synthesis with phrase break information increases the naturalness of the synthetic speech and the intelligibility of sentences. To predict these phrase boundaries, context-dependent information such as the forward/backward POS (part-of-speech) tags of an eojeol, the position of the eojeol in the sentence, the length of the eojeol, and the presence or absence of punctuation marks is used. The experimental results show that the naturalness of synthetic speech with phrase break information increases.
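
The context features the abstract lists can be sketched as a per-eojeol feature-extraction function. The feature names, the sentence-boundary sentinels, and the punctuation set below are illustrative assumptions, not the paper's actual feature design.

```python
# Sketch of the phrase-break features named in the abstract: surrounding
# POS tags, position of the eojeol in the sentence, its length, and the
# presence or absence of punctuation.

def phrase_break_features(eojeols, pos_tags, index):
    """Build a feature dict for the eojeol at `index`."""
    n = len(eojeols)
    return {
        "pos_prev": pos_tags[index - 1] if index > 0 else "<s>",
        "pos_curr": pos_tags[index],
        "pos_next": pos_tags[index + 1] if index + 1 < n else "</s>",
        "position": index / n,              # relative position in the sentence
        "length": len(eojeols[index]),      # eojeol length in characters
        "has_punct": eojeols[index][-1] in ",.?!",
    }
```

Such feature dicts would then be fed to a phrase-break predictor; the predictor itself is outside this sketch.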


A Study of Speech Control Tags Based on Semantic Information of a Text (텍스트의 의미 정보에 기반을 둔 음성컨트롤 태그에 관한 연구)

  • Chang, Moon-Soo;Chung, Kyeong-Chae;Kang, Sun-Mee
    • Speech Sciences
    • /
    • v.13 no.4
    • /
    • pp.187-200
    • /
    • 2006
  • Speech synthesis technology is widely used, and its application area is broadening to automatic response services, learning systems for handicapped persons, etc. However, the sound quality of speech synthesizers has not yet reached a satisfactory level for users. To produce synthesized speech, existing synthesizers generate rhythms only from interval information such as spaces and commas or from a few punctuation marks such as question marks and exclamation marks, so it is not easy to generate the natural rhythms of human speech even with a large speech database. To make up for this problem, one approach is to select rhythms after language processing based on higher-level information. This paper proposes a method for generating tags for controlling rhythms by analyzing the meaning of a sentence together with speech situation information. We use Systemic Functional Grammar (SFG) [4], which analyzes the meaning of a sentence with speech situation information, considering the preceding sentence, the situation of the conversation, the relationships among the people in the conversation, etc. In this study, we generate Semantic Speech Control Tags (SSCT) from the results of the SFG meaning analysis and voice wave analysis.


Difference, not Differentiation: The Thingness of Language in Sun Yung Shin's Skirt Full of Black

  • Shin, Haerin
    • Journal of English Language & Literature
    • /
    • v.64 no.3
    • /
    • pp.329-345
    • /
    • 2018
  • Sun Yung Shin's poetry collection Skirt Full of Black (2007) brings the author's personal history as a Korean female adoptee to bear upon poetic language in daring formal experiments, instantiating the liminal state of being shuttled across borders to land in an in-between state of marginalization. Other Korean American poets have also drawn on the experience of transnational adoption and racialization to explore the literary potential of English to materialize haunting memories or the untranslatable yet persistent echoes of a lost home that gestures across linguistic boundaries, as seen in the case of Lee Herrick or Jennifer Kwon Dobbs. Shin, however, dismantles the referential foundation of English as a language she was transplanted into through formal transgressions such as frazzled syntax, atypical typography, decontextualized punctuation marks, and phonetic and visual play. The power to signify and thereby differentiate one entity or meaning from another dissipates in the cacophonic feast of signs in Skirt Full of Black; the word fragments of identificatory markers that turn racialized, gendered, and culturally contained subjects into exotic things lose the power to define them as such, and instead become alterities by departing from the conventional meaning-making dynamics of language. Expanding on the avant-garde legacy of Korean American poets Theresa Hak Kyung Cha and Myung Mi Kim to delve further into the liminal space between Korean and American, referential and representational, or spoken and written words, Shin carves out a space for discreteness that does not subscribe to the hierarchical ontology of differential value assignment.

Improved Sentence Boundary Detection Method for Web Documents (웹 문서를 위한 개선된 문장경계인식 방법)

  • Lee, Chung-Hee;Jang, Myung-Gil;Seo, Young-Hoon
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.6
    • /
    • pp.455-463
    • /
    • 2010
  • In this paper, we present an approach to sentence boundary detection for web documents that builds on statistical methods and uses rule-based correction. The proposed system uses a classification model learned offline from a training set of human-labeled web documents. Web documents contain many word-spacing errors and frequently lack the punctuation marks that indicate sentence boundaries. As sentence boundary candidates, the proposed method considers every ending Eomi as well as punctuation marks. We optimize engine performance by selecting the best features, the best training data, and the best classification algorithm. For evaluation, we made two test sets: Set1, consisting of articles and blog documents, and Set2, consisting of web community documents. We use F-measure to compare results across a variety of tasks. Detecting only periods as sentence boundaries, our baseline engine scored 96.5% on Set1 and 56.7% on Set2. We improved the baseline engine by adapting the features and the boundary search algorithm. For the final evaluation, we compared the adapted engine with the baseline engine on Set2. As a result, the adapted engine obtained a 39.6% improvement over the baseline engine. This demonstrates the effectiveness of the proposed method in sentence boundary detection.
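
The candidate-generation step the abstract describes can be sketched as follows: every ending Eomi, as well as every punctuation mark, is treated as a potential sentence boundary, to be accepted or rejected by a downstream classifier. The tiny ending list below is a placeholder, not the paper's learned model.

```python
# Sketch of boundary-candidate generation for web text: punctuation marks
# always qualify, and a sentence-final ending (Eomi) followed by a space
# or end-of-text qualifies even when punctuation is missing.

ENDING_EOMIS = ("다", "요", "죠")  # illustrative sentence-final endings

def boundary_candidates(text):
    """Yield character offsets that could end a sentence."""
    for i, ch in enumerate(text):
        if ch in ".?!":
            yield i
        elif ch in ENDING_EOMIS and (i + 1 == len(text) or text[i + 1] == " "):
            yield i

print(list(boundary_candidates("좋아요 그래서 간다")))  # prints [2, 9]
```

In the paper's full pipeline, each candidate offset would then be scored by the trained classification model; this sketch only enumerates the candidates.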