• Title/Abstract/Keywords: punctuation

Identification of Maximal-Length Noun Phrases Based on Expanded Chunks and Classified Punctuations in Chinese

  • 백설매;이금희;김동일;이종혁
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 36, No. 4 / pp. 320-328 / 2009
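  • Noun phrases are generally classified into base noun phrases and maximal-length noun phrases. Accurate identification of maximal-length noun phrases plays an important role in grasping the overall syntactic structure of a sentence and in finding the correct governing predicate. This paper proposes a two-phase method for identifying maximal-length noun phrases that uses as features an expanded notion of chunks and punctuation information subdivided into five classes. The proposed method achieves an average $F_1$-measure of 89.66%, which is 2.65% higher than the baseline model.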

A Study of Speech Control Tags Based on Semantic Information of a Text

  • 장문수;정경채;강선미
    • 음성과학 / Vol. 13, No. 4 / pp. 187-200 / 2006
  • Speech synthesis technology is widely used, and its application area is broadening to automatic response services, learning systems for people with disabilities, etc. However, the sound quality of speech synthesizers has not yet reached a level that satisfies users. To produce synthesized speech, existing synthesizers generate rhythms only from interval information such as spaces and commas, or from a few punctuation marks such as question marks and exclamation marks, so it is not easy to generate natural human rhythms even with a large speech database. To make up for this problem, rhythms can be selected after processing the language using higher-level information. This paper proposes a method for generating tags for controlling rhythms by analyzing the meaning of a sentence together with speech situation information. We use Systemic Functional Grammar (SFG) [4], which analyzes the meaning of a sentence with speech situation information, considering the sentence preceding the given one, the situation of the conversation, the relationships among the people in the conversation, etc. In this study, we generate Semantic Speech Control Tags (SSCT) from the results of SFG's meaning analysis and voice waveform analysis.

Difference, not Differentiation: The Thingness of Language in Sun Yung Shin's Skirt Full of Black

  • Shin, Haerin
    • 영어영문학 / Vol. 64, No. 3 / pp. 329-345 / 2018
  • Sun Yung Shin's poetry collection Skirt Full of Black (2007) brings the author's personal history as a Korean female adoptee to bear upon poetic language in daring formal experiments, instantiating the liminal state of being shuttled across borders to land in an in-between state of marginalization. Other Korean American poets have also drawn on the experience of transnational adoption and racialization to explore the literary potential of English to materialize haunting memories or the untranslatable yet persistent echoes of a lost home that gestures across linguistic boundaries, as seen in the case of Lee Herrick or Jennifer Kwon Dobbs. Shin, however, dismantles the referential foundation of English as a language she was transplanted into through formal transgressions such as frazzled syntax, atypical typography, decontextualized punctuation marks, and phonetic and visual play. The power to signify and thereby differentiate one entity or meaning from another dissipates in the cacophonic feast of signs in Skirt Full of Black; the word fragments of identificatory markers that turn racialized, gendered, and culturally contained subjects into exotic things lose the power to define them as such, and instead become alterities by departing from the conventional meaning-making dynamics of language. Expanding on the avant-garde legacy of Korean American poets Theresa Hak Kyung Cha and Myung Mi Kim to delve further into the liminal space between Korean and American, referential and representational, or spoken and written words, Shin carves out a space for discreteness that does not subscribe to the hierarchical ontology of differential value assignment.

Development of the physical description area on filmstrips and slides in the British and American cataloging rules

  • 이창수
    • 한국도서관정보학회지 / Vol. 10 / pp. 229-265 / 1983
  • Many changes have been made to the cataloging rules for filmstrips and slides, from Cox's rules of 1963 to AACR 2 of 1978. The purpose of this study is to analyze eight major British and American cataloging rules for filmstrips and slides, to identify from the results what major changes were made chronologically, and to clarify the major differences among them in the form of description of the Physical Description Area. The findings of the study can be summarized as follows: 1. To distinguish clearly between elements of the Physical Description Area, the use of punctuation has become more concrete. In AACR 2, various punctuation marks are used according to each element's character. 2. The rules for describing the number of physical units of filmstrips have become more and more specific. 3. The descriptive form of the specific material designation is closely related to the existence or nonexistence of rules on the general material designation in the body of the entry. Therefore, the AECT and CLA rules, which have rules on the general material designation, do not use the specific material designation in the Physical Description Area. On the other hand, ISBD(NBM), the LA rules, and AACR 2, which make the general material designation optional, prescribe the use of the specific material designation. 4. As for descriptions of physical status other than the unit and size of filmstrips and slides, the first LC card and Cox's rules had the color designation, and the CLA rules had the sound designation. In the LA rules, AACR 1 (Chapter 12 Revised), and AACR 2, detailed description of the physical status, including the indication of color, sound, kind of frame, time, etc., has become more and more important in the Physical Description Area. 5. All the rules adopt the millimetre as the unit for the size of filmstrips. For slides, most rules use inches instead, but the LA rules and ISBD(NBM) use centimetres, and AACR 2 accepts either inches or centimetres. 6. Most rules, including Cox's rules, give information on accompanying materials. In AACR 2 this information has been added as the last element of the Physical Description Area and is recognized as very important.

A Comparative Analysis of Classical Data in KCR 4 and CCR 2

  • 한미경
    • 한국문헌정보학회지 / Vol. 47, No. 3 / pp. 275-293 / 2013
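  • To support understanding of the descriptive cataloging rules needed for using classical materials, and to support a catalog (bibliographic description) network for East Asian classical materials, this study compares the cataloging rules for classical materials in the Korean Cataloging Rules, 4th edition (KCR 4) and the Chinese Cataloging Rules, 2nd edition (CCR 2). First, to compare the general rules for description, the study examined the object of description, the organization of the description areas, the order of elements and punctuation, and the sources of information. The results show that KCR 4 specifies in detail the sources of information for the statement of responsibility and the edition area, whereas CCR 2 specifies in detail the sources of information for the publication area and the series area. Second, to compare the detailed rules, the study examined the title and statement of responsibility area, the edition area, the publication area, the physical description area, and the note area. The results show that KCR 4 gives detailed rules for describing the type of edition and for the publication area, while in the note area the synopsis (제요) of CCR 2 is distinctive.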

Aesthetic Study of Film Sound Inherent in Hitchcock's Psycho

  • 박병규
    • 한국콘텐츠학회논문지 / Vol. 14, No. 6 / pp. 26-33 / 2014
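  • This paper examines the signification of sound in Hitchcock's film Psycho, dividing it into voice, background sound, and music, and treats all components of sound from a film-aesthetic perspective. Voice, through voice-over, renders mental images audible, and a voice without an owner, in seeking embodiment, can take on the indiscernibility of life and death. The paper shows that, in addition to the visual techniques noted by Metz, background sound can also mark punctuation, that is, narrative boundaries, within the macro-level context; as an example it cites the sound of water that closes off the shower scene while offsetting the scream lingering in the mind. In the music, desire and repression are symbolized and produce clashing dissonance, and at times two coexisting chords represent the Norman-mother duality. Music also becomes mummified in the form of silence within arrested time and fades away. Thus, the common cinematic signification of the sounds used in Psycho can be said to be the reproduction of images.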

A Historical Study on the American-British Cataloging Rules

  • 심의순;손문철
    • 한국도서관정보학회지 / Vol. 11 / pp. 143-173 / 1984
  • This study reviews the historical development of book cataloging systems, with emphasis on those of England and the U.S. The findings can be summarized as follows: (1) In 1844, Sir Panizzi devised at the British Museum what seems to be the first system of its kind in history for listing holdings systematically. It is believed to be a complete system consisting of 91 articles. (2) A comparatively systematic system was developed in America by Jewett in 1852. Composed of only 39 articles, the system is considered an innovative one worked out with due regard to the infrastructure of a library. (3) In 1876, a classic system based on lexicographical order was set up by Cutter. Rated as the best system designed by an individual, it has since exercised widespread influence on cataloging. (4) American and British library scientists collaborated in publishing several editions of numerous volumes on the principles of classification, but they are not believed to have fully succeeded in establishing a consistent and comprehensive system. Their efforts were significant rather as the first international collaboration and as a foundation upon which today's international system has been developed. (5) The ALA Rule, published concurrently by ALA and LC in 1949, had two parts in its classification, the list of authors and that of titles. Its scientific classification has completed the cataloging of books in its developmental stage. (6) The 1967 American-British Rules integrated the cataloging systems previously published under separate covers by authors and titles. This system, together with the 1961 Paris System, has greatly contributed to the standardization of bibliographical description throughout the English-speaking countries. The International Standard Bibliographic Description has enabled librarians in different countries to exchange their bibliographical sources easily, helped to overcome the language barrier in listing, and contributed to the efficient reading of bibliographical records by machines. (7) The second edition of the Anglo-American Cataloging Rules, promulgated in 1978 under the influence of the International Standard Bibliographic Description, revised all the previous rules so as to retain their strong points. The adoption of a punctuation system suited to computerized data processing and the standardization of description are expected to improve cataloging not only in the English-speaking countries but also in Universal Bibliographic Control.

Language-Independent Word Acquisition Method Using a State-Transition Model

  • Xu, Bin;Yamagishi, Naohide;Suzuki, Makoto;Goto, Masayuki
    • Industrial Engineering and Management Systems / Vol. 15, No. 3 / pp. 224-230 / 2016
  • The use of new words, numerous spoken languages, and abbreviations on the Internet is extensive. As such, automatically acquiring words for the purpose of analyzing Internet content is very difficult. In a previous study, we proposed a method for Japanese word segmentation using character N-grams. The previously proposed method is based on a simple state-transition model that is established under the assumption that the input document is described based on four states (denoted as A, B, C, and D) specified beforehand: state A represents words (nouns, verbs, etc.); state B represents statement separators (punctuation marks, conjunctions, etc.); state C represents postpositions (namely, words that follow nouns); and state D represents prepositions (namely, words that precede nouns). According to this state-transition model, based on the states applied to each pseudo-word, we search the document from beginning to end for an accessible pattern. In other words, the process of this transition detects some words during the search. In the present paper, we perform experiments based on the proposed word acquisition algorithm using Japanese and Chinese newspaper articles. These articles were obtained from Japan's Kyoto University and the Chinese People's Daily. The proposed method does not depend on the language structure. If text documents are expressed in Unicode, the proposed method can, using the same algorithm, obtain words in Japanese and Chinese, which do not contain spaces between words. Hence, we demonstrate that the proposed method is language independent.
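
The abstract above does not give the details of the four-state transition model, so the following is only a minimal sketch of the character N-gram step it builds on, not the authors' algorithm. The function names `char_ngrams` and `ngram_counts` and the toy input are hypothetical. The sketch shows why N-grams over Unicode characters need no spaces between words, which is the property the abstract relies on for Japanese and Chinese.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of `text`, in order."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(documents: list[str], n: int) -> Counter:
    """Count character n-grams over a collection of documents."""
    counts: Counter = Counter()
    for doc in documents:
        counts.update(char_ngrams(doc, n))
    return counts


if __name__ == "__main__":
    # Toy unsegmented input (hypothetical, not taken from the paper's corpora).
    docs = ["東京都に住む", "東京に行く"]
    print(ngram_counts(docs, 2).most_common(3))
```

Frequent N-grams such as these would serve only as candidate pseudo-words; how the paper assigns states A to D to them is not specified in the abstract and is not modeled here.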

The use of an online grammar checker in English writing learning

  • 임희주
    • 디지털융복합연구 / Vol. 19, No. 1 / pp. 51-58 / 2021
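  • This paper examines online grammar checkers and offers suggestions on points to keep in mind when using them in English writing. The study was conducted at D University in Chungcheong Province in the second semester of 2019, with a total of 35 first-year university students participating. For data collection, pre- and post-grammar tests, questionnaires, and learning journals were collected and analyzed. The results are as follows. First, the English grammar test results showed that the online grammar checker was effective in English writing classes. Second, rather than accepting the feedback from the online grammar checker unconditionally, students judged for themselves whether or not to accept it. Third, the feedback provided by the online grammar checker appeared in the following order: (indefinite) articles, prepositions, punctuation, verb number agreement, and noun number agreement. Based on these results, implications and limitations are discussed.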

An Exploratory Analysis of Online Discussion of Library and Information Science Professionals in India using Text Mining

  • Garg, Mohit;Kanjilal, Uma
    • Journal of Information Science Theory and Practice / Vol. 10, No. 3 / pp. 40-56 / 2022
  • This paper aims to implement a topic modeling technique for extracting the topics of online discussions among library professionals in India. Topic modeling is the established text mining technique popularly used for modeling text data from Twitter, Facebook, Yelp, and other social media platforms. The present study modeled the online discussions of Library and Information Science (LIS) professionals posted on Lis Links. The text data of these posts was extracted using a program written in R using the package "rvest." The data was pre-processed to remove blank posts, posts having text in non-English fonts, punctuation, URLs, emails, etc. Topic modeling with the Latent Dirichlet Allocation algorithm was applied to the pre-processed corpus to identify each topic associated with the posts. The frequency analysis of the occurrence of words in the text corpus was calculated. The results found that the most frequent words included: library, information, university, librarian, book, professional, science, research, paper, question, answer, and management. This shows that the LIS professionals actively discussed exams, research, and library operations on the forum of Lis Links. The study categorized the online discussions on Lis Links into ten topics, i.e. "LIS Recruitment," "LIS Issues," "Other Discussion," "LIS Education," "LIS Research," "LIS Exams," "General Information related to Library," "LIS Admission," "Library and Professional Activities," and "Information Communication Technology (ICT)." It was found that the majority of the posts belonged to "LIS Exam," followed by "Other Discussions" and "General Information related to the Library."
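
The pre-processing and word-frequency steps described in the abstract above can be sketched as follows. The study itself used R with the "rvest" package; this Python version is an illustration only. The regular expressions, the lowercasing step, the ASCII check used as a crude stand-in for "non-English fonts", and the names `preprocess` and `word_frequencies` are all assumptions. Topic modeling with Latent Dirichlet Allocation is not included.

```python
import re
import string
from collections import Counter

# Assumed patterns; the abstract does not say how URLs and e-mails were matched.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def preprocess(post: str) -> list[str]:
    """Lowercase a post, strip URLs, e-mail addresses and punctuation, then tokenize."""
    text = post.lower()  # lowercasing is an assumption, not stated in the abstract
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return text.split()


def word_frequencies(posts: list[str]) -> Counter:
    """Count word occurrences, skipping blank posts and posts with non-ASCII text."""
    counts: Counter = Counter()
    for post in posts:
        if not post.strip():
            continue  # remove blank posts
        if not post.isascii():
            continue  # crude, assumed filter for posts in non-English scripts
        counts.update(preprocess(post))
    return counts


if __name__ == "__main__":
    # Hypothetical posts, not data from the study.
    sample = [
        "Library exam question? See http://example.org/a now.",
        "",
        "Mail me at user@example.com about the library!",
    ]
    print(word_frequencies(sample).most_common(5))
```

URLs and e-mail addresses are removed before punctuation so that their fragments do not survive as tokens.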