• Title/Summary/Keyword: Korean texts

Characteristics of Intermediate/Advanced Korean Inter-Englishes: A Corpus-Linguistic Analysis. (우리나라 중.상급학습자 영어의 특징 : 말뭉치 언어학적 분석)

  • 안성호;이영미
    • Korean Journal of English Language and Linguistics / v.4 no.1 / pp.83-102 / 2004
  • The purpose of this paper is to identify some major characteristics of intermediate-advanced Korean learners' English by corpus-linguistically analyzing their essays in comparison with native speakers'. We construct a corpus of CBT TOEFL essays by Korean learners, NNS1 (94076 words in 402 texts), and its sub-corpus, NNS2 (14291 words in 45 texts), and then a corpus of model essays written or meticulously edited by native speakers, NS (14833 words in 35 texts). We compare NNS1 and NNS2 with NS, and with some other corpora, in terms of high-frequency words, and show that Korean learners' writings have more features of informal writing than of formal writing, which accords with the reports in Granger (1998) that EFL writings by European advanced learners are characterized by informality.

Knowledge Graph-based Korean New Words Detection Mechanism for Spam Filtering (스팸 필터링을 위한 지식 그래프 기반의 신조어 감지 매커니즘)

  • Kim, Ji-hye;Jeong, Ok-ran
    • Journal of Internet Computing and Services / v.21 no.1 / pp.79-85 / 2020
  • Today, spam texts on smartphones are blocked by simple string comparison between text messages and spam keywords, or by blocking spam phone numbers. As a result, spam texts are sent in gradually changing forms to prevent them from being blocked automatically. In particular, words that appear among the spam keywords are altered into abnormal forms using special characters, Chinese characters, and whitespace so that simple string matching does not detect them. Traditional spam filtering methods are limited in that they cannot block these spam texts well. Therefore, new technologies are needed to respond to changing spam text messages. In this paper, we propose a knowledge graph-based new words detection mechanism that can detect new words frequently used in spam texts and respond to changing spam texts. We also show experimental results on performance when the detected Korean new words are applied to the Naive Bayes algorithm.

Corpus-based analysis of the usage of Korean markers -(n)un and -i/ka in editorial texts

  • Kim, Kyoung-Young
    • Language and Information / v.19 no.2 / pp.19-36 / 2015
  • The aim of this paper is to investigate the usage of Korean markers -(n)un and -i/ka in editorial texts focusing on information structure. Noun phrases ending with the markers -(n)un and -i/ka were annotated semi-automatically using a corpus obtained from an online newspaper. Two important factors to determine the choice of markers were examined with the annotated data: referential givenness/newness and position in a sentence. Referential givenness and newness were adopted as indicators of information structure, topic and focus respectively. In addition to quantitative analysis, qualitative analysis was conducted on the selected data. The results suggest that both the marker -(n)un and -i/ka could carry a topic and a focus reading. Sentence position also played a crucial role in determining the marker, and the marker -i/ka was used more frequently in a later position of a sentence than the marker -(n)un.

A Study Comparing the Han Period Bamboo Slats of the Beijing University Collection with the Laoguanshan Collection (북경대학 소장 한대의간(漢代醫簡)과 노관산 의간(老官山醫簡)의 비교 연구)

  • Kim, Beomsu;Kim, Kiwang
    • Journal of Korean Medical classics / v.36 no.1 / pp.33-43 / 2023
  • Objectives : Overlapping contents between two recently discovered Han period bamboo slats, the so-called "Beidahanjian" and the "Liushibingfang" have been identified. This study aims to present new knowledge that could be inferred from the concordance of these two texts. Methods : The most recent original texts of the medical part of the Beidahanjian and medical texts excavated from the Laoguanshan in addition to the Liushibingfang were compared with each other to determine identical parts. The meaning of these concordances was explored. Results : Identical sentences in two verses in the Beidahanjian and the Laoguanshan were identified. Conclusions : The Beidahanjian is a credible Western Han period text, of which the medical bamboo slats are likely to comprise an independent text that is a combination of ancient folk prescriptions and those of doctors.

Comparison of Text Beginning Frame Detection Methods in News Video Sequences (뉴스 비디오 시퀀스에서 텍스트 시작 프레임 검출 방법의 비교)

  • Lee, Sanghee;Ahn, Jungil;Jo, Kanghyun
    • Journal of Broadcast Engineering / v.21 no.3 / pp.307-318 / 2016
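  • Overlay text in video frames provides information that supplements the audio and visual content. In news video in particular, this text gives a compact and direct description of the video content. It is therefore the most reliable clue for building a news video indexing system. To build an indexing system for television news programs, it is important to detect and recognize this text. This paper proposes identifying the overlay text beginning frame, which helps in detecting and recognizing overlay text in news video. Because not every frame in a video sequence contains overlay text, extracting overlay text from every frame is unnecessary and wastes time. Focusing only on the frames that contain overlay text can therefore improve the accuracy of overlay text detection. Comparative experiments on text beginning frame identification methods are conducted on news videos, and an appropriate processing method is proposed.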

Developing History of Theory on Seven Kinds of Prescriptions ('칠방'설('七方'說) 변화·발전 과정)

  • Jo, Hak-Jun
    • Journal of Korean Medical classics / v.26 no.4 / pp.1-21 / 2013
  • Objective : This study examines how the theory on seven kinds of prescriptions in the Yellow Emperor's Canon of Internal Medicine(黃帝內經) developed and how it was applied in prescription books and clinical texts. Method : I compared the treatment of this theory in prescription books and clinical texts, and then investigated how it changed and developed. Result : The first explanation of the theory was given by Wang Bing(王氷). Yu Wanso(劉完素) added several varieties and meanings to it, and Jang Jahwa(張子和) corrected what Yu Wanso had added. In addition, others such as Wang Hogo(王好古) and Yi Cheon(李梴) added new varieties and meanings of the odd prescription and the even prescription. Conclusion : The theory on seven kinds of prescriptions in the Yellow Emperor's Canon of Internal Medicine was constantly changed and developed in prescription books and clinical texts.

Correction for Misrecognition of Korean Texts in Signboard Images using Improved Levenshtein Metric

  • Lee, Myung-Hun;Kim, Soo-Hyung;Lee, Guee-Sang;Kim, Sun-Hee;Yang, Hyung-Jeong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.2 / pp.722-733 / 2012
  • Recently, applications that use images taken by mobile phone cameras have been studied actively. This study proposes a method for correcting misrecognition of Korean texts in signboard images using an improved Levenshtein metric. The proposed method calculates distances for five recognized candidates and finds the best-matching texts in a signboard text database. To verify the efficiency of the proposed method, a database dictionary is built from 1.3 million words of nationwide signboards by removing duplicated words. We compared the proposed method with the Levenshtein metric, which is one of the representative text string comparison algorithms. As a result, the proposed method based on the improved Levenshtein metric improves recognition rates by 31.5% on average compared to conventional methods.
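
The candidate-to-dictionary matching described in this abstract can be sketched roughly as follows. This is a minimal illustration using the standard Levenshtein distance, not the paper's improved metric, whose details the abstract does not give. The function names and the sample signboard words are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance: insert, delete, substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(candidates, dictionary):
    """Return (word, distance) for the dictionary word closest to any candidate."""
    best_word, best_dist = None, None
    for cand in candidates:
        for word in dictionary:
            d = levenshtein(cand, word)
            if best_dist is None or d < best_dist:
                best_word, best_dist = word, d
    return best_word, best_dist


# Hypothetical recognizer output (two candidates) and a toy dictionary.
candidates = ["서울식담", "서을식당"]
dictionary = ["서울식당", "부산식당"]
print(best_match(candidates, dictionary))
```

The distance here is computed per Hangul syllable, since each precomposed syllable is a single character in a Python string. A metric tuned for Korean might instead compare at the level of individual jamo, but that is an assumption and is not stated in the abstract.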

A Critical Review of Researches on the Wenyilun in South Korea -Focusing on the Selection and Analysis of Medical Theories- (국내 『온역론』 연구에 대한 비판적 검토 -의론 선정과 분석을 중심으로-)

  • Kim, Sanghyun
    • Journal of Korean Medical classics / v.34 no.2 / pp.75-84 / 2021
  • Objectives : To examine texts dealing with the Wenyilun in South Korea and to re-evaluate its medical theories that have been underrated in previous texts. Methods : The contents and organization of the Gejiaxueshuo were analyzed. In addition, a research paper on the overall contents of the Wenyilun was studied. Results : Common theories of the miscellaneous qi that were mentioned in the two documents, such as 'specificity', 'nine-part transition treatment theory', and 'one disease one formula', are either irrelevant or result from erroneous interpretation. While both texts evaluated the merits and harms of the Wenyilun, erroneously deduced contents were used as evidence for negative assessments in both. Conclusions : If the contents of the Wenyilun were evaluated with a focus on the critical points raised by clinicians with vast experience of epidemic disease patients, the text would be judged differently.

A thought on Joseon's Medical Science through a look at SoGanEum(疎肝飮) (소간음(疎肝飮)으로 살펴보는 조선의학에 대한 일고(一考))

  • Kim, DaeHyeong;An, SangU
    • The Journal of Korean Medical History / v.17 no.2 / pp.99-110 / 2004
  • This study makes SoGanEum(疎肝飮), which is included in YoYak(要略) [JangBuPyoBonHeoSil MaekYakChongBang (臟腑標本虛實脈藥摠方)], its object. It elucidates the origin of this prescription from Chinese medical texts, and examines the characteristics shown in the process of reception through Joseon's exemplary medical texts.

Use of Word Clustering to Improve Emotion Recognition from Short Text

  • Yuan, Shuai;Huang, Huan;Wu, Linjing
    • Journal of Computing Science and Engineering / v.10 no.4 / pp.103-110 / 2016
  • Emotion recognition is an important component of affective computing, and is significant in the implementation of natural and friendly human-computer interaction. An effective approach to recognizing emotion from text is based on a machine learning technique, which deals with emotion recognition as a classification problem. However, in emotion recognition, the texts involved are usually very short, leaving a very large, sparse feature space, which decreases the performance of emotion classification. This paper proposes to resolve the problem of feature sparseness, and largely improve the emotion recognition performance from short texts by doing the following: representing short texts with word cluster features, offering a novel word clustering algorithm, and using a new feature weighting scheme. Emotion classification experiments were performed with different features and weighting schemes on a publicly available dataset. The experimental results suggest that the word cluster features and the proposed weighting scheme can partly resolve problems with feature sparseness and emotion recognition performance.