• Title/Summary/Keyword: words frequency


Analysis of Lexical Effect on Spoken Word Recognition Test (한국어 단음절 낱말 인식에 미치는 어휘적 특성의 영향)

  • Yoon, Mi-Sun;Yi, Bong-Won
    • MALSORI / no.54 / pp.15-26 / 2005
  • The aim of this paper was to analyze lexical effects on the spoken word recognition of Korean monosyllabic words. The lexical factors examined were frequency, density, and lexical familiarity of words. The analysis showed that frequency was a significant predictor of the spoken word recognition score for monosyllabic words, while the other factors were not significant. This result suggests that word frequency should be considered in speech perception tests.


Analysis of Dental Hygienist Job Recognition Using Text Mining

  • Kim, Bo-Ra;Ahn, Eunsuk;Hwang, Soo-Jeong;Jeong, Soon-Jeong;Kim, Sun-Mi;Han, Ji-Hyoung
    • Journal of dental hygiene science / v.21 no.1 / pp.70-78 / 2021
  • Background: The aim of this study was to analyze the public demand for information about the job of dental hygienists by mining text data collected from the online Q&A section of an Internet portal site. Methods: Text data were collected from inquiries posted on the Naver Q&A section from January 2003 to July 2020 using "dental hygienist job recognition," "role recognition," "medical assistance," and "scaling" as search keywords. Text mining techniques were used to identify significant Korean words and their frequency of occurrence. In addition, the association between words was analyzed. Results: A total of 10,753 Korean words related to the job of dental hygienists were extracted from the text data. "Chi-lyo (treatment)," "chigwa (dental clinic)," "ske-illing (scaling)," "itmom (gum)," and "chia (tooth)" were the five most frequently used words. The words were classified into the following job areas of the dental hygienist: periodontal disease treatment and prevention, medical assistance, patient care and consultation, and others. Among these areas, the number of words related to medical assistance was the largest, with sixty-six association rules found between the words and with "chi-lyo," "chigwa," and "ske-illing" as core words. Conclusion: The public demand for information about the job of dental hygienists was mainly related to "chi-lyo," "chigwa," and "ske-illing" as core words, demonstrating that scaling is recognized by the public as the job of a dental hygienist. However, the high demand for information related to treatment and medical assistance in the context of dental hygienists indicates that the public sees the job of dental hygienists as more focused on medical assistance than on preventive dental care, which is provided with job autonomy.
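
The abstract mentions association rules between words without stating how they were scored. As a rough illustration only, the sketch below computes the two standard measures for a single rule, support and confidence, over sets of words. The function name `rule_metrics` and the idea of treating each posted question as one "transaction" are assumptions, not details taken from the paper.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    transactions: list of sets of words (assumed here: one set per question).
    support    = share of transactions containing both words
    confidence = share of antecedent transactions that also contain the consequent
    """
    n = len(transactions)
    both = sum(1 for t in transactions if antecedent in t and consequent in t)
    ante = sum(1 for t in transactions if antecedent in t)
    support = both / n if n else 0.0
    confidence = both / ante if ante else 0.0
    return support, confidence
```

A rule is usually kept only when both values exceed chosen thresholds; the thresholds used in the study are not given in the abstract.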

Phonological phrase boundary and word frequency that influence the phonological word recognition (음운구 경계와 단어빈도가 한국어 음운단어 재인에 미치는 영향)

  • Kim, Jeahong;Shin, Hasun;Kim, Yeseul;Yun, Gwangyeol;Kim, Daseul;Shin, Jiyoung;Nam, Kichun
    • Phonetics and Speech Sciences / v.11 no.2 / pp.45-56 / 2019
  • This study investigated the interaction between phonological phrase boundary and word frequency in Korean speech processing. A word monitoring task was performed to examine the interference caused by the frequency effect of the target word depending on whether a phonological phrase is formed within the target word. Frequency of the target word (high vs. low) and phonological phrase boundary (within the target word vs. between target words) were applied as between-subject and within-subject conditions, respectively. Our results showed a significant main effect of phonological phrase boundary and a significant interaction. In the post-hoc analysis, high-frequency target words were detected significantly faster than low-frequency target words only in the within phonological phrase boundary condition. No frequency effect appeared in the between phonological phrase boundary condition. The results indicate that phonological phrase boundary and word frequency play an important role in Korean speech processing. In particular, based on the interaction of the two experimental factors, we discuss the possibility that word frequency is processed at a very early stage of sensory information processing.

An Innovative Approach of Bangla Text Summarization by Introducing Pronoun Replacement and Improved Sentence Ranking

  • Haque, Md. Majharul;Pervin, Suraiya;Begum, Zerina
    • Journal of Information Processing Systems / v.13 no.4 / pp.752-777 / 2017
  • This paper proposes an automatic method to summarize Bangla news documents. In the proposed approach, pronoun replacement is performed for the first time to minimize dangling pronouns in the summary. After pronoun replacement, sentences are ranked using term frequency, sentence frequency, numerical figures, and title words. If two sentences have at least 60% cosine similarity, the frequency of the larger sentence is increased and the smaller sentence is removed to eliminate redundancy. Moreover, the first sentence is always included in the summary if it contains any title word. In Bangla text, numerical figures can be written in both words and digits in a variety of forms. All these forms are identified to assess the importance of sentences. We used a rule-based system in this approach together with a hidden Markov model and a Markov chain model. To derive the rules, we analyzed 3,000 Bangla news documents and studied several Bangla grammar books. A series of experiments was performed on 200 Bangla news documents and 600 summaries (three summaries for each document). The evaluation results demonstrate the effectiveness of the proposed technique over the four latest methods.
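
The redundancy step described in this abstract (a 60% cosine-similarity threshold, with the shorter sentence dropped and the longer one boosted) could be sketched roughly as follows. This is not the authors' implementation: whitespace tokenization, raw term counts as vectors, word count as the measure of sentence size, and the `bonus` increment are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized bag-of-words vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def drop_redundant(scored, threshold=0.6, bonus=1.0):
    """scored: list of (sentence, score) pairs. Returns the pruned list in order.

    For each pair at or above the threshold, the shorter sentence (by word
    count) is removed and the longer one's score is raised by `bonus`,
    which is a hypothetical increment chosen for this sketch.
    """
    sentences = [s for s, _ in scored]
    scores = [sc for _, sc in scored]
    removed = set()
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if i in removed or j in removed:
                continue
            if cosine_similarity(sentences[i], sentences[j]) >= threshold:
                len_i = len(sentences[i].split())
                len_j = len(sentences[j].split())
                keep, drop = (i, j) if len_i >= len_j else (j, i)
                scores[keep] += bonus
                removed.add(drop)
    return [(sentences[k], scores[k]) for k in range(len(sentences)) if k not in removed]
```

Bangla text would need a proper tokenizer and stemmer before this comparison; the whitespace split here only keeps the sketch self-contained.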

An Acoustic Study of English Sentence Stress and Rhythm Produced by Korean Speakers

  • Kim, Ok-Young
    • Speech Sciences / v.14 no.1 / pp.121-135 / 2007
  • The purpose of this paper is to examine how Korean speakers realize English stress and rhythm at the sentence level, and investigate what different acoustic characteristics of English sentence stress and rhythm Korean speakers have, compared with those of American English speakers. Stressed words in the sentence were analyzed in terms of duration, fundamental frequency, and intensity of the stressed vowel in the word with neutral stress and with emphatic stress, respectively. According to the results, when the words had emphatic stress, both Koreans' and Americans' F0 and intensity of the stressed vowel were higher than those with neutral stress. Korean speakers of English realized the sentence stress with shorter vowel duration and higher F0 than American English speakers when the words had emphatic stress. The analysis of the timing of the sentence with increased unstressed syllables showed that both Americans and Koreans produced the sentence with longer duration as the number of unstressed syllables increased. However, the duration of unstressed syllables between stressed syllables by Koreans was longer than that by Americans. Americans seemed to produce unstressed syllables between stressed syllables faster than Koreans for regular intervals of stressed syllables. This analysis implies that if there are more unstressed syllables between stressed syllables, Koreans might produce unstressed syllables and the whole sentence with longer duration.


Vocabulary Coverage Improvement for Embedded Continuous Speech Recognition Using Knowledgebase (지식베이스를 이용한 임베디드용 연속음성인식의 어휘 적용률 개선)

  • Kim, Kwang-Ho;Lim, Min-Kyu;Kim, Ji-Hwan
    • MALSORI / v.68 / pp.115-126 / 2008
  • In this paper, we propose a vocabulary coverage improvement method for embedded continuous speech recognition (CSR) using a knowledgebase. A vocabulary in CSR is normally derived from a word frequency list, so the vocabulary coverage depends on the corpus. In previous research, we presented an improved way of generating vocabularies using a part-of-speech (POS) tagged corpus. We analyzed all words paired with 101 of the 152 POS tags and decided on a set of words that have to be included in vocabularies of any size. However, for the other 51 POS tags (e.g. nouns, verbs), the inclusion of words paired with such POS tags is still based on word frequency counted on a corpus. In this paper, we propose a corpus-independent word inclusion method for noun-, verb-, and named entity (NE)-related POS tags using a knowledgebase. For noun-related POS tags, we generate synonym groups and analyze their relative importance using Google search. Then, we categorize verbs by lemma and analyze the relative importance of each lemma from pre-analyzed statistics for verbs. We determine the inclusion order of NEs through Google search. The proposed method shows better coverage on the test short message service (SMS) text corpus.
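
To make the corpus dependence concrete, the sketch below shows the conventional baseline the abstract refers to: a vocabulary taken from the top of a word frequency list, and coverage measured as the share of test tokens found in that vocabulary. It does not reproduce the proposed knowledgebase method. The function names and the token-level definition of coverage are assumptions.

```python
from collections import Counter


def build_vocabulary(train_tokens, size):
    """Return the `size` most frequent words in the training tokens."""
    return {word for word, _ in Counter(train_tokens).most_common(size)}


def coverage(vocabulary, test_tokens):
    """Fraction of test tokens that appear in the vocabulary (0.0 if empty)."""
    if not test_tokens:
        return 0.0
    hits = sum(1 for tok in test_tokens if tok in vocabulary)
    return hits / len(test_tokens)
```

Because `build_vocabulary` only sees the training corpus, a test corpus from a different domain (such as SMS text) can score low coverage even with a large vocabulary, which is the limitation the paper addresses.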


Detection of Porno Sites on the Web using Fuzzy Inference (퍼지추론을 적용한 웹 음란문서 검출)

  • 김병만;최상필;노순억;김종완
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.5 / pp.419-425 / 2001
  • A method to detect pornographic documents on the Internet is presented in this paper. The proposed method applies a fuzzy inference mechanism to conventional information retrieval techniques. First, several example pornographic sites are provided by users, and candidate words representing pornographic documents are extracted from these documents. In this process, lexical analysis and stemming are performed. Then, several values, namely the term frequency (TF), the document frequency (DF), and the heuristic information (HI), are computed for each candidate word. Finally, fuzzy inference is performed with these three values to weight the candidate words. The weights of the candidate words are used to determine whether a given site is sexual or not. In experiments on a small test collection, the proposed method was shown to be useful for detecting sexual sites automatically.
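
Two of the three values named in this abstract, TF and DF, have standard definitions and can be computed as in the sketch below. The heuristic information (HI) and the fuzzy rules are not specified in the abstract, so they are left out. The function name and the choice to count TF over the whole collection of example documents are assumptions.

```python
from collections import Counter


def term_and_document_frequency(documents):
    """documents: list of token lists. Returns (tf, df) as Counters.

    tf[w] = total occurrences of w across all documents
    df[w] = number of documents that contain w at least once
    """
    tf, df = Counter(), Counter()
    for tokens in documents:
        tf.update(tokens)        # count every occurrence
        df.update(set(tokens))   # count each word once per document
    return tf, df
```

The tokens passed in would already be stemmed, following the lexical analysis and stemming step the abstract describes.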


Phonological processes of consonants from orthographic to pronounced words in the Buckeye Corpus

  • Yang, Byunggon
    • Phonetics and Speech Sciences / v.11 no.4 / pp.55-62 / 2019
  • This paper investigates the phonological processes of consonants in pronounced words in the Buckeye Corpus and compares the frequency distribution of these processes to provide a clearer understanding of conversational English for linguists and teachers. Both orthographic and pronounced words were extracted from the transcribed label scripts of the Buckeye Corpus. Next, the phonological processes of consonants in the orthographic and pronounced labels were tabulated separately by onsets and codas, and a frequency distribution by consonant process types was examined. The results showed that the majority of the onset clusters were pronounced as the same sounds in the Buckeye Corpus. The participants in the corpus were presumed to speak semiformally. In addition, the onsets have fewer deletions than the codas, which might be related to the information weight of the syllable components. Moreover, there is a significant association and strong positive correlation between the phonological processes of the onsets and codas in men and women. This paper concludes that an analysis of phonological processes in spontaneous speech corpora can contribute to a practical understanding of spoken English. Further studies comparing the current phonological process data with those of other languages would be desirable to establish universal patterns in phonological processes.

The Hierarchy of Images in the Gathered Skirts According to the Constructing Factors (개더스커트의 구성요인에 따른 이미지 계층구조)

  • Lee, Myung-Hee
    • Fashion & Textile Research Journal / v.9 no.5 / pp.472-477 / 2007
  • This study was intended to identify the constructing factors and the hierarchy of images of gathered skirts, which is expected to be helpful in shape classification. The gathered skirts were made under different gathering conditions: three gather ratios (1.5T, 2.0T, 2.5T) and three fabrics (cotton, mixed wool, polyester). Forty-five undergraduate and graduate women students responded to the nine gathered skirts between December 2004 and February 2005. A total of 184 words expressing the gathered skirts were collected through the investigation and analysis of questionnaires. Thirty-two words were arranged in standard form on the basis of frequency before factor analysis was conducted to identify the constructing factors of gathered skirt images. As a result of the factor analysis, two factors, H shape and A shape, were identified as constructing factors of gathered skirts. Cluster analysis was applied to explain the hierarchy of gathered skirt images. A dendrogram was used to observe the associations among the 32 words, and five sub-clusters were determined to interpret the result. These five clusters were successively combined according to their frequency, based on the factor scores. The two major divisions of image clusters were 'simple and neat image' and 'fairly good and feminine image'.

On Improving the Listening Ability of Middle School Students Using Verbotonal Method (Verbotonal 법을 이용한 중학생 영어 학습자의 듣기 능력 향상에 관한 연구)

  • Kim, Hyun-Gi;Kim, Ok-Jin;Kang, Sung-Kwan;Jeon, Byoung-Man
    • Speech Sciences / v.14 no.3 / pp.21-29 / 2007
  • The necessity of improving the English listening ability of Korean learners has been emphasized since the ultimate goal of English education in Korea shifted to CLT (Communicative Language Teaching). The Verbotonal Approach, an auditory-based strategy, has proved substantially effective in maximizing listening skill in a spoken foreign language. The purpose of this study is to find an efficient way of improving the listening ability of Korean middle school students by employing OFH (Optimal Frequency of Hearing), using the Tonality Word Sentence Test before and after training with the Listen II Verbotonal training unit based on VTS (Verbotonal System). The results of the listening tests showed that the listening ability of the subjects increased by 16.7% on words and by 5.5% on sentences after using Listen II, and that the improvement rate at the word level was much higher than that at the sentence level. From these results, we conclude that training listening skill with words in mid-tonality and low-tonality based on OFH might have a strong positive effect on improving the listening ability of Korean learners of English.
