• Title/Abstract/Keywords: word-final


Automatic Single Document Text Summarization Using Key Concepts in Documents

  • Sarkar, Kamal
    • Journal of Information Processing Systems
    • /
    • Vol. 9, No. 4
    • /
    • pp.602-620
    • /
    • 2013
  • Many previous studies on extractive text summarization treat a subset of the words in a document as keywords and rank sentences by their similarity to the list of extracted keywords. The use of key concepts, however, has received less attention in the summarization literature. The proposed work creates a summary from the key concepts identified in a document. We view the single-word or multi-word keyphrases of a document as the important concepts that the document elaborates on. Our work is based on the hypothesis that an extract elaborates the important concepts to some permissible extent, which is controlled by the given summary length restriction. In other words, our method chooses a subset of sentences from a document that maximizes the coverage of important concepts in the final summary. To allow diverse information in the summary, for each important concept we select the one sentence that is the best possible elaboration of that concept. Accordingly, the most important concept contributes to the summary first, then the second most important concept, and so on. To evaluate the effectiveness of the proposed method, we compared it to several state-of-the-art summarization systems; the results show that it outperforms the systems to which it was compared.
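
The selection procedure described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function name `summarize`, the assumption that concepts arrive already ranked, and the scoring rule (concept occurrences normalised by sentence length) are all hypothetical, since the abstract does not specify how "best possible elaboration" is measured.

```python
from typing import List


def summarize(sentences: List[str], ranked_concepts: List[str],
              max_sentences: int) -> List[str]:
    """Pick one sentence per key concept, visiting concepts from most to
    least important, until the summary length limit is reached."""
    chosen: List[int] = []
    for concept in ranked_concepts:
        if len(chosen) >= max_sentences:
            break
        best_idx, best_score = -1, 0.0
        for idx, sentence in enumerate(sentences):
            if idx in chosen:
                continue  # each sentence is used at most once
            lowered = sentence.lower()
            # Assumed score: occurrences of the concept, normalised by the
            # sentence length in words (not taken from the paper).
            score = lowered.count(concept.lower()) / max(len(lowered.split()), 1)
            if score > best_score:
                best_idx, best_score = idx, score
        if best_idx >= 0:
            chosen.append(best_idx)
    # Present the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


doc = [
    "Solar panels convert sunlight into electricity.",
    "The weather was nice.",
    "Battery storage keeps electricity available at night.",
    "Solar panels and battery storage are often installed together.",
]
summary = summarize(doc, ["solar panels", "battery storage"], max_sentences=2)
```

Visiting concepts in rank order and taking one sentence per concept is what gives the summary its diversity: a sentence that mentions the top concept many times cannot crowd out the sentences that elaborate lower-ranked concepts.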

Comparison of English and Korean speakers for the nasalization of English stops

  • Yun, Ilsung
    • 말소리와 음성과학
    • /
    • Vol. 7, No. 3
    • /
    • pp.3-11
    • /
    • 2015
  • This study compared English and Korean speakers with regard to the nasalization of the English stops /b, d, g, p, t, k/ before a nasal within and across a word boundary. Nine English and thirty Korean speakers participated in the experiment. We used 37 speech items with different grammatical structures. Overall, the English informants rarely nasalized the stops, while the Korean informants generally nasalized them strongly, although the degree varied widely from no nasalization to almost complete nasalization. In general, voiced stops were more likely to be nasalized than voiceless stops. Also, the alveolar stops /d, t/ tended to be nasalized the most, the bilabial stops /b, p/ the second most, and the velar stops /g, k/ the least. In addition, the closer the grammatical relationship between neighboring words, the more likely stop nasalization was to occur. In contrast, Korean syllabification - the addition of the vowel /i/ to the final stops - worked against stop nasalization. The different stress (accent) or rhythm effects of the two languages are also assumed to contribute to the significantly different nasalization between English and Korean speakers. The spectrum of stop nasalization obtained from this study can be used as an index to measure how close a given Korean speaker's stop nasalization is to that of English speakers.

Designing a large recording script for open-domain English speech synthesis

  • Kim, Sunhee;Kim, Hojeong;Lee, Yooseop;Kim, Boryoung;Won, Yongkook;Kim, Bongwan
    • 말소리와 음성과학
    • /
    • Vol. 13, No. 3
    • /
    • pp.65-70
    • /
    • 2021
  • This paper proposes a method for designing a large recording script for open-domain English speech synthesis. For read-aloud style text, 12 domains and 294 sub-domains were designed using text from five different news media publications. For conversational style text, 4 domains and 36 sub-domains were designed using movie subtitles. The final script consists of 43,013 sentences (27,085 read-aloud style sentences and 15,928 conversational style sentences), comprising 549,683 tokens and 38,356 types. The completed script is analyzed using four criteria: word coverage (type coverage and token coverage), high-frequency vocabulary coverage, phonetic coverage (diphone coverage and triphone coverage), and readability. The type coverage of our script reaches 36.86% despite its low token coverage of 2.97%. The high-frequency vocabulary coverage of the script is 73.82%, and the diphone coverage and triphone coverage of the whole script are 86.70% and 38.92%, respectively. The average readability of the sentences overall is 9.03. The results of the analysis show that the proposed method is effective in producing a large recording script for English speech synthesis, demonstrating good coverage in terms of unique words, high-frequency vocabulary, phonetic units, and readability.
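
To make the token/type distinction in the abstract above concrete, the sketch below counts tokens and types in a small script and computes a type coverage figure. The abstract does not define its coverage measures, so the definition used here (the share of a reference vocabulary's word types that occur in the script) is an assumption, as are the names `tokenize`, `script_stats`, and `reference_vocab` and the simple regex tokenizer.

```python
import re
from typing import Dict, Iterable, List, Set


def tokenize(sentence: str) -> List[str]:
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def script_stats(sentences: Iterable[str],
                 reference_vocab: Set[str]) -> Dict[str, float]:
    """Count tokens and types, and measure how much of a reference
    vocabulary the script covers (assumed definition of type coverage)."""
    tokens = [tok for s in sentences for tok in tokenize(s)]
    types = set(tokens)  # distinct word forms
    covered = types & reference_vocab
    coverage = len(covered) / len(reference_vocab) if reference_vocab else 0.0
    return {"tokens": len(tokens), "types": len(types), "type_coverage": coverage}


stats = script_stats(
    ["The cat sat.", "The dog sat down."],
    reference_vocab={"the", "cat", "dog", "bird"},  # hypothetical reference list
)
```

Here the script has 7 tokens but only 5 types, because "the" and "sat" each occur twice; the same distinction underlies the 549,683 tokens and 38,356 types reported for the full script.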

The Effect of eWOM on Purchase Intention for Korean-brand Cars in Russia: The Mediating Role of Brand Image and Perceived Quality

  • Evgeniy, Yu;Lee, Kangmun;Roh, Taewoo
    • Journal of Korea Trade
    • /
    • Vol. 23, No. 5
    • /
    • pp.102-117
    • /
    • 2019
  • Purpose - This paper seeks to identify the impact of electronic word of mouth (eWOM) on purchase intention (PI) for Korean-brand cars among Russian consumers, taking into account the credibility, quality, and quantity of eWOM as well as the mediation effects of brand image (BI) and perceived quality (PQ). Although a considerable number of studies discuss the impact of eWOM determinants on PI, few have focused on the Russian market. Design/methodology - This paper aims to fill this gap in research on eWOM and PI; to do so, 211 Russian respondents were randomly selected. Descriptive analysis, factor analysis, and reliability analysis were conducted using SPSS version 22.0, while structural equation modeling was conducted using AMOS version 24.0. Findings - The results show that, in Russian consumers' perception, eWOM credibility, quality, and quantity for Korean-brand cars have a substantial impact on PI. The mediation effects of brand image and perceived quality were also supported by the analysis. The final part of the paper presents theoretical and managerial implications, along with limitations and suggestions for further research. Originality/value - This study explores the degree of impact of eWOM, and the mediating roles of BI and PQ, on Russian customers' intentions to buy Korean-brand cars.

20세기 초 베를린 한인 음원의 음운과 형태 (A Research on the Spoken Language in Korean Voices from Berlin: Focusing on Phonological and Morphological Features)

  • 차재은;홍종선
    • 한국어학
    • /
    • Vol. 72
    • /
    • pp.257-282
    • /
    • 2016
  • The aim of this paper is to examine the phonological and morphological features of the Korean Voices from Berlin. The Korean Voices from Berlin were recorded in Berlin in 1917 by five Korean prisoners of war from World War I. Some of them came from North Hamgyeong Province and the others from Pyeongan Province, so the data reflect a northern regional dialect of Korean. The data comprise three kinds of material: counting numbers, reciting scriptures, and singing folk songs. The results are as follows. 1) The consonant system of the Korean voices is similar to that of standard Korean. The 19 consonants are classified according to 5 manners of articulation and 5 places of articulation. 2) The liquid /l/ has three allophones: [ɾ] appears in onset position, [l] in word-medial coda position or when preceded by [l], and [ɹ] in word-final coda position. 3) The vowel system of the Korean voices is similar to that of early 20th-century Korean. It has 8 monophthongs, /a, ʌ, o, u, ɯ, i, e, ɛ/. 4) The numerals 1 to 10 in the Korean voices are similar to Middle Korean numerals. 5) The genitive particle '/ɯi/ 의' is pronounced [i], [ɯ], or [ɛ]; [ɯ] in particular appears in Sino-Korean words. 6) /l/-deletion in conjugation is similar to Middle Korean: /l/ is always deleted when followed by a [+cor] consonant.

맛집 블로그의 신뢰성이 외식소비자의 지각혜택, 지각위험, 그리고 온라인구전에 미치는 영향 (A Study of Relationship of Gourmet Blog's Reliability with the Perceived benefits, Perceived Risk and Online Word of Mouth of Eating out Consumer)

  • 송흥규
    • 한국조리학회지
    • /
    • Vol. 20, No. 6
    • /
    • pp.275-291
    • /
    • 2014
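  • This study empirically analyzes how the reliability of gourmet blogs affects online word of mouth, in order to offer implications for food-service businesses engaged in blog marketing. To this end, the perceived benefits and perceived risk of eating-out consumers, both addressed in prior research, were adopted as mediating variables between blog reliability and online word of mouth, and the constructs for each variable were derived from its theoretical background in order to analyze the relationships. The results are as follows. First, gourmet blog reliability had a positive (+) effect on perceived benefits and a negative (-) effect on perceived risk. Second, gourmet blog reliability also had a positive (+) effect on online word of mouth. Third, among the variables affecting online word of mouth, blog reliability and consumers' perceived benefits had significant positive (+) effects, whereas perceived risk had no effect. In conclusion, higher blog reliability is associated with higher perceived benefits and lower perceived risk, and consumers' perceived benefits arising from blog reliability were confirmed as the key variable for online word of mouth.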

데이터베이스 의미론을 이용한 한국어 구현 시론: 수사-분류사 구조를 중심으로 (A pilot implementation of Korean in Database Semantics: focusing on numeral-classifier construction)

  • 최재웅
    • 인지과학
    • /
    • Vol. 18, No. 4
    • /
    • pp.457-483
    • /
    • 2007
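  • Database Semantics (DBS) aims to provide a comprehensive theoretical framework and analysis of how humans communicate, and to implement it as a concrete computer program. Its two main features are that it adopts a left-associative approach as its sentence-processing algorithm and a 'Word Bank' as the database that represents the semantic content of sentences. This study attempts an analysis and implementation of basic phenomena of Korean within DBS. It first uses simple Korean examples to show how the stages of hearing, inferring, and speaking can proceed. It then shows how the numeral-classifier construction, one of the characteristic phenomena of Korean, is analyzed, thereby indicating that DBS, which is being developed on the basis of English and German, may also be applicable to Korean, a language with quite different properties. The study also reviews problems raised in earlier work about applying the left-associative algorithm to Korean and considers directions for alternatives.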

Teaching English Stress Using a Drum: Based on Phonetic Experiments

  • Yi, Do-Kyong
    • 영어어문교육
    • /
    • Vol. 15, No. 2
    • /
    • pp.261-280
    • /
    • 2009
  • This study focuses on the pedagogical implications of stress in English pronunciation teaching, since stress is one of the most important characteristic factors in English pronunciation (Bolinger, 1976; Brown, 1994; Celce-Murcia, Brinton & Goodwin, 1996; Kreidler, 1989). For the pre-test, the author investigated stress production in terms of duration, pitch, and intensity by a group of native speakers of English and a group of low-proficiency South Kyungsang Korean college students. For both the pre- and post-tests, the same stimuli were used, consisting of a one-syllable word, two two-syllable words, three three-syllable words, and three four-syllable words, in various sentence positions: isolation, initial, medial, and final. The software programs ALVIN and Praat were used to record and analyze the data. Since Celce-Murcia et al. (1996), Klatt (1975), and Ladefoged (2001) treat the duration of the stressed syllable as more significant than the other factors, pitch and intensity, from the listener's point of view, the author developed a special method of teaching English stress that uses a traditional Korean drum to emphasize duration. In addition, the results from the native speakers' production showed that their main strategy for realizing stress was lengthening stressed syllables. After six weeks of stress instruction using the drum, the production of the native speakers and of the SK Korean participants in the pre- and post-tests was compared. The post-test results indicated that the participants showed great improvement not only in duration but also in pitch after the stress instruction. The pitch improvement was unexpected but is well explained by the statement that long vowels receive accent in loan word adaptation in North Kyungsang Korean. The results also showed that the Korean participants' pitch values became more even in their duration values for each syllable as the structure of the word or the sentence became more complex, due to their dependency upon their L1.

특징적 단어 및 이모티콘 집합을 활용한 모바일 기기 내 성별 예측 프레임워크 (On-Device Gender Prediction Framework Based on the Development of Discriminative Word and Emoticon Sets)

  • 김소이;최예림;김윤정;박규연;박종헌
    • 정보과학회 컴퓨팅의 실제 논문지
    • /
    • Vol. 21, No. 11
    • /
    • pp.733-738
    • /
    • 2015
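  • Users' demographic information helps advance personalized services such as recommender systems, and mobile usage data can be used to predict it. Text data in particular is known to be effective for gender prediction, but the use of mobile text data is limited by privacy issues. This study proposes an on-device prediction methodology that uses mobile text data while minimizing privacy issues and still predicting the user's gender effectively. First, web documents that reflect gender-specific characteristics are collected to build a discriminative word set and a discriminative emoticon set for each gender. The word sets and the emoticon sets are each compared with the user's mobile data on the device to predict gender separately, and the two predictions are ensembled to produce the final gender prediction. A gender prediction experiment using participants' mobile text data confirmed the strong performance of the proposed methodology.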
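
The two-predictor pipeline described in the abstract above might look roughly like the following. This is a sketch under stated assumptions rather than the authors' framework: the abstract does not say how set matches are scored or how the two predictions are ensembled, so the match-share scoring, the weighted average, the `word_weight` parameter, and the placeholder vocabulary are all hypothetical.

```python
from typing import Dict, List, Set


def set_scores(tokens: List[str],
               sets_by_gender: Dict[str, Set[str]]) -> Dict[str, float]:
    """Share of matched tokens that fall in each gender's discriminative set."""
    counts = {g: sum(1 for t in tokens if t in s)
              for g, s in sets_by_gender.items()}
    total = sum(counts.values())
    if total == 0:
        # No evidence at all: fall back to a uniform score.
        return {g: 1.0 / len(counts) for g in counts}
    return {g: c / total for g, c in counts.items()}


def predict_gender(tokens: List[str],
                   word_sets: Dict[str, Set[str]],
                   emoticon_sets: Dict[str, Set[str]],
                   word_weight: float = 0.5) -> str:
    """Ensemble the word-based and emoticon-based scores (assumed: weighted mean)."""
    word = set_scores(tokens, word_sets)
    emo = set_scores(tokens, emoticon_sets)
    combined = {g: word_weight * word[g] + (1.0 - word_weight) * emo[g]
                for g in word}
    return max(combined, key=combined.get)


# Placeholder sets; in the described framework they would be built from web documents.
word_sets = {"female": {"alpha", "beta"}, "male": {"gamma", "delta"}}
emoticon_sets = {"female": {"^^"}, "male": {";;"}}

label = predict_gender(["alpha", "beta", "gamma", "^^"], word_sets, emoticon_sets)
```

Because only the compact word and emoticon sets need to be shipped to the device, the user's messages never have to leave it, which is the privacy property the abstract emphasizes.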