• Title/Summary/Keyword: Chinese text

Search Result 315, Processing Time 0.031 seconds

A Watermarking for Text Document Images using Edge Direction Histograms (에지 방향 히스토그램을 이용한 텍스트 문서 영상의 워터마킹)

  • 김영원;오일석
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.2
    • /
    • pp.203-212
    • /
    • 2004
  • Watermarking is a method for protecting the copyright of multimedia content. Among the various media, text documents show very distinctive properties: block/line/word patterning and a clear separation between foreground and background areas. Algorithms specific to text documents that account for these properties are therefore required. This paper proposes a novel watermarking algorithm for grayscale text document images. The algorithm inserts the watermark signals through the edge direction histograms. A concept of sub-image consistency is developed, which states that the sub-images of a document have similar shapes in terms of their edge direction histograms. Using Korean, Chinese, and English document images, the concept is evaluated and shown to be valid over a wide range of document images. To insert watermark signals, the edge direction histogram is modified slightly. Experiments were performed on various document images, and the algorithm was evaluated in terms of imperceptibility and robustness.
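    The edge direction histogram named in this abstract can be illustrated with a minimal sketch. The abstract does not specify the gradient operator, the number of bins, or the similarity measure, so the central differences, the 8 bins, the magnitude weighting, and the L1 distance below are all assumptions made for illustration, not the authors' method.

```python
import math

def edge_direction_histogram(image, bins=8, threshold=1e-9):
    """Normalized histogram of gradient directions for a grayscale image.

    `image` is a list of equal-length rows of intensities. Gradients use
    central differences on interior pixels (an assumption; the abstract
    does not name an operator). Pixels whose gradient magnitude is at or
    below `threshold` are skipped as non-edges.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude <= threshold:
                continue
            # Map the direction to [0, 2*pi) and then to a bin index.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude  # weight each vote by edge strength
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def histogram_distance(h1, h2):
    """L1 distance between two histograms (0 means identical shapes)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Two sub-images with the same vertical edge have identical histograms.
sub_a = [[0, 0, 10, 10] for _ in range(4)]
sub_b = [[0, 0, 10, 10] for _ in range(4)]
print(histogram_distance(edge_direction_histogram(sub_a),
                         edge_direction_histogram(sub_b)))
```

    Under this reading, sub-image consistency would correspond to small distances between the histograms of different sub-images of the same document.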

The Significance and Limitation of the Publication of the Manual for Buddhist Rituals (釋門儀範) (『석문의범(釋門儀範)』 간행의 의의와 한계)

  • Lee, Sunyi
    • (The)Study of the Eastern Classic
    • /
    • no.72
    • /
    • pp.329-363
    • /
    • 2018
  • The Manual for Buddhist Rituals (1935) is a manual of Buddhist rituals that holds a pivotal position in the modernization of Korean Buddhist rituals. The text was established on the basis of the contents and systems of two earlier texts, The Manual for Practising Rituals (作法龜鑑, 1827) and the Compulsory Manual for Buddhists (佛子必覽, 1931). These three manuals include the examples for practising the manuals. The analysis of the examples for the practices of the three texts is as follows: The Manual for Practising Rituals tries to include the sounds of the music for the Buddhist rituals through the Four Sounds (四聲) and the Twin words (儷語); the marks of the sounds are excluded after the Compulsory Manual for Buddhists. The Manual for Buddhist Rituals replaced the rituals for repentance (三寶通請) with the rituals for revering (四聖禮), and it made it easier for people to participate in Buddhist rituals because the text is written in a bilingual format in Korean and Classical Chinese. Through these changes relative to the previous two texts, the text was popularized, but it ended up excluding the musical practices of Korean Buddhism, such as the music for Buddhist rituals (梵唄) and reciting the names of Buddhas (念佛).

Research on the Financial Data Fraud Detection of Chinese Listed Enterprises by Integrating Audit Opinions

  • Leiruo Zhou;Yunlong Duan;Wei Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.12
    • /
    • pp.3218-3241
    • /
    • 2023
  • Financial fraud undermines the sustainable development of financial markets. Financial statements can be regarded as the key source of information on the operating conditions of listed companies. Current research focuses more on mining numerical financial data than on text data. However, text data can reveal emotional information, which is an important basis for detecting financial fraud. In particular, the audit opinion on a financial statement is the fair opinion of a certified public accountant on the quality of an enterprise's financial reports. Therefore, this research uses the data features of the annual financial reports and audit opinion texts of 4,153 listed companies over the past six years, and puts forward a financial fraud detection model that integrates audit opinions. First, a financial data index database and an audit opinion text database were built. Second, the audit opinions were digitized with the deep learning BERT model. Finally, both the extracted numerical features of the audit opinions and the numerical financial indicators were used as the training data of the LightGBM model. It is worth noting that the imbalanced distribution of sample labels is also one of the focuses of financial fraud research. To address this problem, data enhancement and the Focal Loss function were used in data processing and model training, respectively. The experimental results show that, compared with the conventional financial fraud detection model, the performance of the proposed model is greatly improved, with Area Under the Curve (AUC) and Accuracy reaching 81.42% and 78.15%, respectively.
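    The Focal Loss mentioned in this abstract for handling imbalanced labels can be sketched as follows. This is the standard binary form, FL = -alpha_t * (1 - p_t)^gamma * log(p_t); the values gamma=2.0 and alpha=0.25 are commonly used defaults and are assumptions here, since the abstract does not report the settings or how the loss was attached to LightGBM.

```python
import math

def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss over a batch.

    y_true: iterable of 0/1 labels (1 = fraud, by assumption).
    p_pred: iterable of predicted probabilities for label 1.
    gamma down-weights well-classified samples; alpha weights the
    positive class. Both defaults are illustrative assumptions.
    """
    total = 0.0
    count = 0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)       # avoid log(0)
        p_t = p if y == 1 else 1.0 - p        # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
        count += 1
    return total / count

# A confidently correct prediction contributes far less than a wrong one.
easy = binary_focal_loss([1], [0.95])
hard = binary_focal_loss([1], [0.10])
print(easy < hard)
```

    With gamma set to 0 and alpha set to 0.5, the expression reduces to half of the ordinary cross-entropy, which makes the role of the modulating factor (1 - p_t)^gamma easy to see.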

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho;Lee, Taewon;Li, Jingjing
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.187-204
    • /
    • 2016
  • Document classification based on emotional polarity has become a popular emerging task owing to the explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search for reviews through a search engine such as Google or social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as data mining technology is widely accepted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques, such as the decision tree, neural networks, and support vector machines (SVMs). It is used to determine the attitude, position, and sensibility of people who write articles about various topics that are published on the Web. Regardless of their polarity, emotional reviews are very helpful materials for analyzing the opinions of customers. Sentiment analysis helps with quickly understanding what customers really want through automated text mining techniques. Sentiment analysis applies text mining techniques to text on the Web to extract the subjective information it contains. Sentiment analysis is utilized to determine the attitudes or positions of the person who wrote the article and presented their opinion about a particular topic. In this study, we developed a model that selects a hot topic from user posts at China's online stock forum by using the k-means algorithm and self-organizing map (SOM). In addition, we developed a detection model to predict a hot topic by using machine learning techniques such as logit, the decision tree, and SVM.
We employed sentiment analysis to develop our model for the selection and detection of hot topics from China's online stock forum. The sentiment analysis calculates a sentiment value for a document based on contrast and classification according to the polarity sentiment dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movement by analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to utilize sentiment analysis. One hundred forty-four topics were selected among the 21 categories at online forums about stock. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics by preprocessing the text from March 2013 to February 2015. The interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results with this data. We developed a decision tree model to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel function by a grid search to detect the hot topics. The detection of hot topics by using sentiment analysis provides the latest trends and hot topics in the stock forum for investors so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful for rapidly determining customers' signals or attitudes towards government policy and firms' products and services.
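    The dictionary-based scoring described above can be illustrated with a minimal sketch. The abstract does not give the scoring formula or the dictionary contents, so the normalized difference of positive and negative matches and the example words below are hypothetical.

```python
def sentiment_value(tokens, positive_words, negative_words):
    """Score a tokenized post in [-1, 1] using a polarity dictionary.

    Returns (positives - negatives) / (positives + negatives), or 0.0
    when no dictionary word occurs. The formula is an illustrative
    assumption, not the one used in the study.
    """
    pos = sum(1 for t in tokens if t in positive_words)
    neg = sum(1 for t in tokens if t in negative_words)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0

# Hypothetical dictionary entries and a hypothetical forum post.
positive_words = {"surge", "rally", "profit"}
negative_words = {"loss", "crash", "rumor"}
post = ["price", "surge", "rally", "despite", "loss"]
print(sentiment_value(post, positive_words, negative_words))
```

    Per-post scores of this kind could then be aggregated by topic before clustering or classification; how the study combined them with the interest index is not stated in the abstract.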

A Study on "Seung Aeh Ill Chan" ("승애일찬(升厓日纂)"에 관한 소고(小考))

  • Hwang, Sunwook;Yoon, Hyun-Ju;Chong, Chin-Kang
    • Journal for History of Mathematics
    • /
    • v.27 no.1
    • /
    • pp.31-45
    • /
    • 2014
  • The book Su Hak Ip Mun(數學入門, Introduction to Mathematics) is one of the five sections of the book Seung Aeh Ill Chan(升厓日簒), a handwritten manuscript in Chinese characters whose author and date of writing are unknown. The book Seung Aeh Ill Chan begins with the song of the division table, the so-called Guguiga(九歸歌). We first investigate its writing pattern and compare it with other old Korean mathematical books. Next, we investigate the typical expressions and calculation methods of the mathematical contents and the terminology used in Su Hak Ip Mun, and we also identify oddities in the writing pattern of mathematical expressions and the cultural context of several problems dealt with in the book. From this analysis and investigation, we estimate that Su Hak Ip Mun was written later than 1723, the year in which Su Ri Jeong On(數理精蘊) was first published. We also conjecture that Guguiga and Su Hak Ip Mun were made not for practical use or theoretical purposes but as texts for teaching students.

Applying Keyword Analysis to Predicting Agriculture Product Price Index: The Case of the Chinese Farming Market

  • Wang, Zhi-yuan;Kwon, Ohbyung;Liu, Fan
    • Asia Pacific Journal of Business Review
    • /
    • v.1 no.1
    • /
    • pp.1-22
    • /
    • 2016
  • The prediction of prices of agricultural products in the agriculture IT sector plays a significant role in the economic life of consumers and anyone engaged in agricultural business, and as these prices fluctuate more often than other prices do, their prediction holds a great deal of research promise. For this reason, the academic literature has provided studies on the factors influencing the prices of agricultural products and the price index. However, as these factors vary, they are difficult to predict, which makes quantitative data hard to acquire. China is one example of a country without a reliable prediction system for prices of agricultural products. Fortunately, disclosed heterogeneous data can be found on the Internet, which allows for the effective collection of factors related to the prediction of these product prices through the use of text mining. The data provided online are valuable in that they reflect the opinions of the general public in real time. Accordingly, this study aims to use heterogeneous data from the Internet and suggest a model predicting the prices of agricultural products before functional analyses. Toward this end, data analyses were conducted on the Chinese agricultural products market, one of the largest markets in the world.

A Study on Composition and Content of Uibujeolok [醫部全錄] (의부전록(醫部全錄)의 편집체제(編輯體制)와 주제분류(主題分類))

  • Back, Sang-Ryong;Ahn, Sang-Woo
    • Korean Journal of Oriental Medicine
    • /
    • v.8 no.1
    • /
    • pp.1-15
    • /
    • 2002
  • 「Uibujeolok」 is a part of 「gogeumdoseojibseong[古今圖書集成]」, edited in 1726 during the Chinese Ching[淸] Dynasty. It refers to 「Uibu[醫部]」, which is included in 「Bakmulhwipyeon Yesuljeon-Ha[博物彙編 藝術典下]」, and is composed of 520 volumes with 141 chapters. Although the name Uibujeolok is used in most printed editions, the proper name was 「Uibujibseong[醫部集成]」 according to the original text of the copper type printing. Each category consists of 13 parts, including Commentary, Diagnosis, and Organs and Bowels. Each chapter follows a structure identical to that of the categories. This type of composition in 「Uibujibseong」 is similar to that of 「Dong-uibogam[東醫寶鑑]」, which was published 130 years earlier, but different from those of the clinical books that had been published in China until then. Unusually, there is no chapter dealing with Chinese herbs and acupuncture.

Philological Analysis of Shanghan Prescriptions from the Chinese Unearthed Documents (중국 출토문헌에 보이는 상한방(傷寒方)의 문자학적 분석)

  • Lee, Kyung
    • Journal of Korean Medical classics
    • /
    • v.32 no.3
    • /
    • pp.17-29
    • /
    • 2019
  • Objectives : This paper analyzes the name 'Shanghan(傷寒)' and its related contents in unearthed documents. "Shanghanlun" as we know it is a version edited by Wangshuhe of the Jin period, as the original text written by Zhangzhongjing is unavailable. Recently in China, documents of the Xian Jin and Liang Han periods have been unearthed, allowing us to look at medical texts of earlier times that Zhang referenced. The aim of this paper is to look at the developmental process of the Shanghan theory based on these medical texts. Methods : The research documents include all unearthed documents that contain the name 'Shanghan'. There were a total of 4 written cases: 2 in "Wuweihandaiyijian", and one each in "Dunhuanghanjian" and "Juyanhanjian". The meanings of the extracted examples were analyzed with reference to the shape of the character, then compared and analyzed against existing medical texts such as "Shanghanlun", "Jibeiqianjinyaofang", and "Bencaogangmu". Conclusions : By examining the 4 examples, Bianzhenglunzi and clinical prescriptions, which are characteristics of Shanghanlun, could be found. There was an 'Eliminating Wind' formula used to eliminate Cold pathogen of the exterior, which showed remarkable resemblance to that found in "Jibeiqianjinyaofang". There are also formulas that 'Communicate to Disentangle' and 'Disentangle the Stomach', which are used in progressed stages of Shanghan disease, showing that Bianzhenglunzi had already been applied to Shanghan conditions.

Randomized Block Size (RBS) Model for Secure Data Storage in Distributed Server

  • Sinha, Keshav;Paul, Partha;Amritanjali, Amritanjali
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.12
    • /
    • pp.4508-4530
    • /
    • 2021
  • Today, distributed data storage services are widely used. However, the lack of proper means of security makes user data vulnerable. In this work, we propose a Randomized Block Size (RBS) model for secure data storage in distributed environments. The model works with multifold block sizes encrypted with the Chinese Remainder Theorem-based RSA (C-RSA) technique for end-to-end security of multimedia data. The proposed RBS model has a key generation phase (KGP) for constructing asymmetric keys, and a rand generation phase (RGP) for applying optimal asymmetric encryption padding (OAEP) to the original message. The experimental results obtained with text and image files show that the post-encryption file size is not much affected, and data is efficiently encrypted while being stored at the distributed storage server (DSS). Parameters such as ciphertext size, encryption time, and throughput have been considered for performance evaluation, whereas statistical analyses such as similarity measurement, correlation coefficient, histogram, and entropy analysis are used to check image pixel deviation. The number of pixels change rate (NPCR) and unified averaged changed intensity (UACI) were used to check the strength of the proposed encryption technique. The proposed model is robust, with high resilience against eavesdropping, insider attack, and chosen-plaintext attack.
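    The Chinese Remainder Theorem-based RSA named in this abstract can be illustrated with a textbook sketch. The abstract gives no parameters, so the tiny primes, the exponent e=17, and the function names below are assumptions for illustration only. The sketch omits OAEP and the randomized block sizes entirely and is not secure as written. It relies on `pow(x, -1, m)` for modular inverses, which requires Python 3.8 or later.

```python
def crt_rsa_keygen(p, q, e):
    """Build a toy RSA key with the extra CRT parameters.

    p and q must be distinct primes and e must be coprime to
    (p - 1) * (q - 1). Real keys use primes of 1024 bits or more.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)              # private exponent
    return {
        "n": n, "e": e, "d": d, "p": p, "q": q,
        "dp": d % (p - 1),           # d reduced for arithmetic mod p
        "dq": d % (q - 1),           # d reduced for arithmetic mod q
        "q_inv": pow(q, -1, p),      # inverse of q modulo p
    }

def encrypt(message, key):
    """Textbook RSA encryption of an integer 0 <= message < n."""
    return pow(message, key["e"], key["n"])

def crt_decrypt(ciphertext, key):
    """Decrypt with two small exponentiations instead of one large one."""
    m1 = pow(ciphertext, key["dp"], key["p"])
    m2 = pow(ciphertext, key["dq"], key["q"])
    h = (key["q_inv"] * (m1 - m2)) % key["p"]   # Garner recombination
    return m2 + h * key["q"]

key = crt_rsa_keygen(61, 53, 17)
ciphertext = encrypt(65, key)
print(crt_decrypt(ciphertext, key) == 65)
```

    The CRT form is attractive for decryption because the two exponentiations work modulo p and q, which are about half the bit length of n.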

Discussion on Classical Text-based Evidence in Guidelines for the Traditional Chinese Medical Treatment of COVID-19 (COVID-19의 중의(中醫) 진료방안에 반영된 문헌 근거에 대한 고찰)

  • Kim, Sanghyun
    • Journal of Korean Medical classics
    • /
    • v.35 no.4
    • /
    • pp.115-125
    • /
    • 2022
  • Objectives : This study reviews whether the traditional medical thought process reflected in the Traditional Chinese Medical Treatment Plan for COVID-19 is based on existing classical texts, and examines concerns over the quality of the evidence that the plan is based on. Methods : First, the terminology and basic formulas composing the compound formulas in the COVID-19 TCM Treatment Plan were collected. Next, their usage in existing classical texts was searched in the medical classics database. Results : Infectious diseases similar to COVID-19 were understood in the texts as external disease due to the Six Qi. The basic formulas used for treatment were those applied in Shanghan and Wenbing, and cases in which such formulas were applied to infectious diseases could be found in the classics. Conclusions : The level of evidence of the Treatment Plan suggested by various specialists could be evaluated as insufficient if we consider the literature. However, if application of such a plan could be supported institutionally, it could become a starting point for evidence generation.