• Title/Abstract/Keywords: Chinese text

Search results: 315 items (processing time 0.034 s)

Communication Attributes (屬性) in the Oracle Bone Script (甲骨文) of the Yin-Shang (殷商) Period (A Study of Communication Factor in the Chinese augury bone)

  • 이범수
    • 동양고전연구
    • /
    • No. 43
    • /
    • pp.305-328
    • /
    • 2011
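  • Oracle bone script is a communication text with communication attributes because it is a human communication phenomenon that possesses both the concept and all the constituent elements of communication, and can therefore be adequately explained as an object of communication research. The grounds are as follows: the sender is the person who decided what to inscribe on the bone; the message is the inscribed characters; the medium is the tortoise shell or animal bone; the receiver is the person who read the inscription; the effect is the cognition (認知) gained through reading and the behavior that follows from it; and the communication context is the spatiotemporal setting of this whole sequence. That oracle bone script, as a means of communication in China's Yin-Shang (殷商) period, is both a writing system and a record handed down to later ages supports its standing as a text for the history of communication. This is further supported by several points: oracle bone script has the attributes of patterns, symbols, and characters as tools of communication; in content, such as royal records, it carries messages about the social phenomena of the time as a whole; the role of the diviners (貞人) and tortoise keepers (龜人) who managed the bones resembles that of today's journalists; and the bones were sorted, classified, and systematically stored like documents. Because it reflects the consciousness and values of the society of its time, oracle bone script also serves as a text for the intellectual history of communication. The evidence includes the following: the script style and the strength of its strokes varied with the strength of religious values; political ideology can be analyzed from the royal names recorded in the inscriptions; and the ten Heavenly Stems and twelve Earthly Branches inscribed on the bones became linked with yin-yang and Five Phases theory and thus became constituent elements of Chinese philosophical principles. However, this study is a preliminary essay that merely raises a topic in order to broaden research perspectives on oracle bone script. Its shortcomings in analytical depth and methodological rigor are left for future research, in the belief that they can be overcome more productively through interdisciplinary research on oracle bone script and through the quantitative growth and qualitative development of the community of researchers.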

A Brief Study of the Organization of, and the Understanding of Disease Classification in, China's Major State-Published Medical Texts (A Study on the Way of Organizing Contents of State Sponsored Medical Text in Ancient China)

  • 차웅석;김남일;안상우;김동율
    • 한국의사학회지
    • /
    • Vol. 30, No. 2
    • /
    • pp.1-12
    • /
    • 2017
  • This paper focuses on the 'contents' of database-level medical texts sponsored by the Chinese government. The premise of the study is that the contents of state-sponsored medical texts show how the medical policy makers and practitioners of the time approached the body and its diseases and, by association, that these texts reveal the policies on state medical education and on the distribution of medical resources that accompanied the practitioners' approaches. This paper analyzes the contents of four representative state-sponsored medical texts: Chao's Treatise on the Origins and Symptoms of Various Diseases (巢氏諸病源候論, 610, Sui China); Great Peace and Sagely Benevolence Formulas (太平聖惠方, 996, Song China); Complete Record of Sagely Benevolence (聖濟總錄, 1117, Song China); and Formulas for Universal Relief (普濟方, 1406, Ming China).

Chinese-clinical-record Named Entity Recognition using IDCNN-BiLSTM-Highway Network

  • Tinglong Tang;Yunqiao Guo;Qixin Li;Mate Zhou;Wei Huang;Yirong Wu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 7
    • /
    • pp.1759-1772
    • /
    • 2023
  • Chinese named entity recognition (NER) is a challenging task that seeks to find, recognize, and classify various types of information elements in unstructured text. Because Chinese text has no natural boundaries like the spaces in English text, Chinese named entity identification is much more difficult. At present, most deep learning based NER models are built on a bidirectional long short-term memory network (BiLSTM), yet their performance still has room to improve. To further improve performance on Chinese NER tasks, we propose a new NER model, IDCNN-BiLSTM-Highway, which combines the BiLSTM, the iterated dilated convolutional neural network (IDCNN), and the highway network. In our model, the IDCNN is used to achieve multiscale context aggregation over a long sequence of words. The highway network is used to connect different layers of the network effectively, allowing information to pass through network layers smoothly without attenuation. Finally, the globally optimal tag sequence is obtained by introducing a conditional random field (CRF). The experimental results show that, compared with other popular deep learning based NER models, our model achieves superior performance on two Chinese NER data sets, Resume and Yidu-S4k, with F1-scores of 94.98 and 77.59, respectively.
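
The highway connection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard highway-layer formulation, y = T(x) * H(x) + (1 - T(x)) * x, with a tanh transform H and a sigmoid gate T. The function names, weights, and dimensions below are hypothetical and are not taken from the paper, which may configure its layers differently.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def highway(x, W_h, b_h, W_t, b_t):
    """One highway layer: y = T * H(x) + (1 - T) * x (elementwise).

    H is a tanh transform and T is a sigmoid gate. Input and output
    must have the same dimension so that x can be carried through.
    All parameters here are illustrative assumptions.
    """
    h = [math.tanh(a + b) for a, b in zip(matvec(W_h, x), b_h)]
    t = [1.0 / (1.0 + math.exp(-(a + b))) for a, b in zip(matvec(W_t, x), b_t)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


if __name__ == "__main__":
    x = [0.5, -1.0]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # A strongly negative gate bias closes the gate, so the input is
    # carried through almost unchanged.
    print(highway(x, zeros, [0.0, 0.0], zeros, [-50.0, -50.0]))
```

When the gate is nearly closed the layer behaves like an identity mapping, which is the sense in which information can pass through stacked layers without attenuation.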

CNN-based Skip-Gram Method for Improving Classification Accuracy of Chinese Text

  • Xu, Wenhua;Huang, Hao;Zhang, Jie;Gu, Hao;Yang, Jie;Gui, Guan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 13, No. 12
    • /
    • pp.6080-6096
    • /
    • 2019
  • Text classification is one of the fundamental techniques in natural language processing. Numerous applications are based on text classification, such as news subject classification, question answering system classification, and movie review classification. Traditional text classification methods extract features and then classify them. However, traditional methods are too complex to operate, and their accuracy is not sufficiently high. Recently, a convolutional neural network (CNN) based one-hot method has been proposed for text classification to solve this problem. In this paper, we propose an improved method that uses a CNN-based skip-gram approach for Chinese text classification, and we evaluate it on the Sogou news corpus. Experimental results indicate that the CNN with the skip-gram model performs more efficiently than the CNN-based one-hot method.
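
As a rough illustration of the skip-gram idea named in the abstract above, the sketch below generates (center, context) training pairs from a token sequence. It assumes the text has already been segmented into words, which Chinese text requires because it has no spaces. The function name, the window size, and the sample tokens are hypothetical, and the paper's actual preprocessing and training setup are not described in this listing.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for skip-gram training.

    Each token is paired with every other token that lies within
    `window` positions of it. The window size is an assumption.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    # Hypothetical pre-segmented sentence.
    print(skipgram_pairs(["我", "喜欢", "新闻"], window=1))
```

A skip-gram model is trained to predict each context word from its center word, and the resulting dense vectors can replace one-hot inputs to a downstream classifier.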

An Alignment based technique for Text Translation between Traditional Chinese and Simplified Chinese

  • Sue J. Ker;Lin, Chun-Hsien
    • 한국언어정보학회: Conference Proceedings
    • /
    • 한국언어정보학회, 2002: Language, Information, and Computation, Proceedings of The 16th Pacific Asia Conference
    • /
    • pp.147-156
    • /
    • 2002
  • Aligned parallel corpora have proved very useful in many natural language processing tasks, including statistical machine translation and word sense disambiguation. In this paper, we describe an alignment technique for extracting transfer mappings from a parallel corpus. While building our system and collecting data, we observed that three types of translation approaches can be used. We focus in particular on lexical translation between Traditional Chinese and Simplified Chinese text and on a method for extracting transfer mappings for machine translation.


An Improved Coverless Text Steganography Algorithm Based on Pretreatment and POS

  • Liu, Yuling;Wu, Jiao;Chen, Xianyi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 4
    • /
    • pp.1553-1567
    • /
    • 2021
  • Steganography is currently a hot research topic in the area of information security and privacy protection. However, most previous steganography methods are not effective against steganalysis and attacks because they usually work by modifying covers. In this paper, we propose an improved coverless text steganography algorithm based on pretreatment and Part of Speech (POS), in which Chinese character components are used as the locating marks, the POS is used to hide the number of keywords, and the retrieval of stego-texts is optimized by pretreatment. Experiments verify that, with appropriate lengths of locating marks and a large-scale text database, our algorithm performs well in terms of embedding capacity, embedding success rate, and extraction accuracy.

Research on Chinese Microblog Sentiment Classification Based on TextCNN-BiLSTM Model

  • Haiqin Tang;Ruirui Zhang
    • Journal of Information Processing Systems
    • /
    • Vol. 19, No. 6
    • /
    • pp.842-857
    • /
    • 2023
  • Currently, most sentiment classification models on microblogging platforms analyze sentence parts of speech and emoticons without comprehending users' emotional inclinations or grasping moral nuances. This study proposes a hybrid sentiment analysis model. Given the distinct nature of microblog comments, the model employs a combined stop-word list and word2vec for word vectorization. To mitigate local information loss, a TextCNN model without pooling layers is employed for local feature extraction, while BiLSTM is used for contextual feature extraction. Subsequently, microblog comment sentiments are categorized by a classification layer. Given the binary classification task at the output layer and the numerous hidden layers within BiLSTM, the Tanh activation function is adopted in this model. Experimental findings demonstrate that the enhanced TextCNN-BiLSTM model attains a precision of 94.75%. This represents a 1.21%, 1.25%, and 1.25% improvement in precision, recall, and F1 values, respectively, over the individual deep learning model TextCNN. Furthermore, it outperforms BiLSTM by 0.78%, 0.9%, and 0.9% in precision, recall, and F1 values, respectively.
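
The precision, recall, and F1 values reported in the abstract above are standard binary classification metrics. The sketch below shows how they are conventionally computed from true and predicted labels. The function name and the sample labels are hypothetical, and the paper's own evaluation code and data are not part of this listing.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class.

    Returns 0.0 for any metric whose denominator would be zero.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels: 1 = positive sentiment, 0 = negative sentiment.
    print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

F1 is the harmonic mean of precision and recall, so it rewards a model only when both are high.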

Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction

  • Zhou, Han;Guo, Xuchao;Liu, Chengqi;Tang, Zhan;Lu, Shuhan;Li, Lin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 11
    • /
    • pp.3991-4010
    • /
    • 2021
  • The Question Similarity Measurement of Chinese Crop Diseases and Insect Pests (QSM-CCD&IP) aims to judge the user's tendency to ask questions regarding input problems. The measurement is the basis of Agricultural Knowledge Question and Answering (Q&A) systems, information retrieval, and other tasks. However, the corpora and measurement methods available in this field have some deficiencies. In addition, error propagation may occur when general sentence embedding methods ignore word boundary features and local context information. These factors make the task challenging. To solve these problems and tackle the Question Similarity Measurement task, a corpus on Chinese crop diseases and insect pests (CCDIP), which contains 13 categories, was established. Then, taking the CCDIP as the research object, this study proposes a Chinese agricultural text similarity matching model, namely, the AgrCQS. This model is based on mixed information extraction. Specifically, the hybrid embedding layer enriches character information and improves the model's ability to recognize word boundaries. Multi-scale local information is extracted by a multi-core convolutional neural network based on multi-weight (MM-CNN). The self-attention mechanism enhances the model's ability to fuse global information. In this research, the performance of the AgrCQS is verified on the CCDIP, and three benchmark datasets, namely, AFQMC, LCQMC, and BQ, are also used. The accuracy rates are 93.92%, 74.42%, 86.35%, and 83.05%, respectively, which are higher than those of baseline systems that do not use any external knowledge. Additionally, the proposed modules can be extracted separately and applied to other models, thus providing a reference for related research.