• Title/Summary/Keyword: Semantic feature


Query-Based Text Summarization Using Cosine Similarity and NMF (NMF 와 코사인유사도를 이용한 질의 기반 문서요약)

  • Park Sun;Lee Ju-Hong;Ahn Chan-Min;Park Tae-Su;Song Jae-Won;Kim Deok-Hwan
    • Proceedings of the Korea Information Processing Society Conference / 2006.05a / pp.473-476 / 2006
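  • With the growth of the Internet, the volume of information is increasing explosively over time. Faced with this mass of information, information retrieval systems suffer from an information overload problem: they present users with too many search results, so users spend too much time finding the information they want. Query-based document summarization is becoming increasingly important as a way to address information overload by reducing the time users need to find the information they want. This paper proposes a new method for query-based document summarization using non-negative matrix factorization (NMF) and cosine similarity. The proposed method requires no prior training between queries and documents. In addition, without the complex processing of transforming a document into a graph, it can improve summarization accuracy by reflecting the inherent structure of the document through the semantic features and semantic variables obtained by NMF. Finally, it can summarize sentences easily with a simple procedure.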

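The abstract above does not give the scoring formula, so the following is only a minimal sketch of one plausible reading. It assumes a term-by-sentence count matrix A is factored as A ≈ W·H, that the columns of W play the role of semantic features and the rows of H the role of semantic variables, and that a sentence is scored by summing its semantic variables weighted by the cosine similarity between the query and each semantic feature. The function names (`nmf`, `rank_sentences`), the whitespace tokenizer, and the choice of Lee-Seung multiplicative updates are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random

def matmul(X, Y):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def nmf(A, rank, iters=200, seed=0, eps=1e-9):
    """Approximate A (m x n, non-negative) as W (m x rank) times H (rank x n)
    using multiplicative updates, which keep W and H non-negative."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n)]
             for k in range(rank)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(m)]
    return W, H

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)

def rank_sentences(sentences, query, rank=2):
    """Return sentence indices ordered from most to least query-relevant.
    The scoring rule is an assumption; see the paragraph above."""
    tokens = [s.lower().split() for s in sentences]
    vocab = sorted({w for t in tokens for w in t})
    # Rows are terms, columns are sentences.
    A = [[float(t.count(term)) for t in tokens] for term in vocab]
    q_tokens = query.lower().split()
    q = [float(q_tokens.count(term)) for term in vocab]
    W, H = nmf(A, rank)
    feature_sims = [cosine(q, col) for col in zip(*W)]
    scores = [sum(feature_sims[k] * H[k][j] for k in range(rank))
              for j in range(len(sentences))]
    return sorted(range(len(sentences)), key=lambda j: -scores[j])
```

Calling `rank_sentences(sentences, query)` returns a permutation of sentence indices; a summary would take the first few. No training data is needed, which matches the claim in the abstract, but the ranking depends on the random initialization and on the assumed scoring rule.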

A Document Ranking Method by Document Clustering Using Bayesian SOM and Bootstrap (베이지안 SOM과 붓스트랩을 이용한 문서 군집화에 의한 문서 순위조정)

  • Choe, Jun-Hyeok;Jeon, Seong-Hae;Lee, Jeong-Hyeon
    • The Transactions of the Korea Information Processing Society / v.7 no.7 / pp.2108-2115 / 2000
  • Although conventional Boolean retrieval systems based on the vector space model return results quickly, they cannot accurately reflect the user's retrieval intent, including semantic information. Consequently, the retrieval results often differ greatly from what users expected, which forces users to spend much time finding the desired documents among those retrieved. In this paper, we design a Bayesian SOM (Self-Organizing feature Map), which combines Bayesian statistical methods with the Kohonen network, a kind of unsupervised learning, and use it to classify documents in real time according to their semantic similarity to the user query. If it is difficult to observe statistical characteristics because there are fewer than 30 documents for clustering, the number of documents must be increased to at least 50. Also, to give a high rank to the documents that are semantically most similar to the user query among the generalized clusters, we compute the similarity using the Kohonen centroid of each document cluster and adjust the secondary rank according to that similarity.


A Semantic Orientation Prediction Method of Sentiment Features Based on the General and Domain-Dependent Characteristics (일반적, 영역 의존적 특성을 반영한 감정 자질의 의미지향성 추정 방법)

  • Hwang, Jaewon;Ko, Youngjoong
    • Annual Conference on Human and Language Technology / 2009.10a / pp.155-159 / 2009
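  • This paper proposes a technique that improves the performance of Korean document sentiment classification by reflecting both general and domain-dependent characteristics when estimating the semantic orientation of sentiment features, an important lexical resource for Korean document sentiment classification. The semantic orientation of a sentiment feature can be estimated using the snippets of each sentiment feature extracted through a search engine and the experimental corpus. The snippets extracted through the search engine reflect the general characteristics of sentiment features, while the experimental corpus reflects the characteristics that depend on the domain to be classified. The resulting semantic orientation values are used to estimate the sentiment intensity of each sentence, and the sentence sentiment intensity values are combined with the TF-IDF weighting scheme to assign weights to the sentiment features. Finally, during training, additional weight is given to positive sentiment features in positive documents and to negative sentiment features in negative documents. The proposed method is evaluated using a support vector machine (SVM), which performs very well in document classification. In the evaluation, it showed a 3.1% performance improvement over using content-word-based features of the kind used in general information retrieval.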


3D Spatial Interaction Method using Visual Dynamics and Meaning Production of Character

  • Lim, Sooyeon
    • International journal of advanced smart convergence / v.7 no.3 / pp.130-139 / 2018
  • This study analyzes the relationship between characters and human meaning production through research on character visualization artworks, and uses the results to develop a creative platform that visually expresses the formative and semantic dynamics of characters. The proposed 3D spatial interaction system based on character visualization transforms characters in real time through interaction with the user and deconstruction of the character structure. Transformations of characters that incorporate the viewers' intentions provide a dynamic visual representation and maximize the efficiency of meaning transfer by producing various related meanings. The dynamic deconstruction and reconstruction of characters provided by this system creates special shapes that viewers could not have imagined until now and further extends the range of interpretation of the characters' meaning. Therefore, the proposed system not only induces an active viewing attitude, but also gives viewers an opportunity to enjoy the artwork and demonstrate creativity as creators. The system induces new gestures from the viewer in real time by transforming characters in accordance with the viewer's gestures, and thereby exchanges emotions with viewers.

Document Clustering Method using Coherence of Cluster and Non-negative Matrix Factorization (비음수 행렬 분해와 군집의 응집도를 이용한 문서군집)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.12 / pp.2603-2608 / 2009
  • Document clustering is an important method for document analysis and is used in many different information retrieval applications. This paper proposes a new document clustering model that combines an NMF (non-negative matrix factorization)-based clustering method with refinement of the documents in each cluster using cluster coherence. The proposed method can improve the quality of document clustering for two reasons: documents are re-assigned to clusters using cluster coherence based on the similarity between documents, and the semantic feature matrix and semantic variable matrix used in document clustering better represent the inherent structure of the document set. The experimental results demonstrate that applying the proposed method to document clustering methods achieves better performance than those document clustering methods alone.
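
The abstract above does not define cluster coherence precisely, so the sketch below covers only the refinement step under a loudly stated assumption: cosine similarity between a document vector and a cluster centroid is used as a stand-in for coherence, and each document is re-assigned to the cluster whose centroid it is most similar to, until the labels stop changing. The initial labels would come from an NMF-based clustering step, which is not shown; `refine_clusters` and its parameters are hypothetical names.

```python
import math

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)

def centroid(vectors):
    # Component-wise mean of a non-empty list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def refine_clusters(docs, labels, n_clusters, max_rounds=10):
    """Re-assign each document to the cluster with the most similar centroid.
    Centroid cosine similarity is an assumed proxy for cluster coherence."""
    labels = list(labels)
    for _ in range(max_rounds):
        centroids = {}
        for c in range(n_clusters):
            members = [d for d, l in zip(docs, labels) if l == c]
            if members:  # clusters that became empty are skipped
                centroids[c] = centroid(members)
        new_labels = [max(centroids, key=lambda c: cosine(d, centroids[c]))
                      for d in docs]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Four 2-D document vectors; the last one starts in the wrong cluster.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
refined = refine_clusters(docs, [0, 0, 1, 0], n_clusters=2)
print(refined)  # [0, 0, 1, 1]
```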

DP-LinkNet: A convolutional network for historical document image binarization

  • Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1778-1797 / 2021
  • Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.

Application of YOLOv5 Neural Network Based on Improved Attention Mechanism in Recognition of Thangka Image Defects

  • Fan, Yao;Li, Yubo;Shi, Yingnan;Wang, Shuaishuai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.245-265 / 2022
  • In response to problems such as insufficient extraction information, low detection accuracy, and frequent misdetection in the field of Thangka image defects, this paper proposes a YOLOv5 prediction algorithm fused with the attention mechanism. Firstly, the Backbone network is used for feature extraction, and the attention mechanism is fused to represent different features, so that the network can fully extract the texture and semantic features of the defect area. The extracted features are then weighted and fused, so as to reduce the loss of information. Next, the weighted fused features are transferred to the Neck network, the semantic features and texture features of different layers are fused by FPN, and the defect target is located more accurately by PAN. In the detection network, the CIOU loss function is used to replace the GIOU loss function to locate the image defect area quickly and accurately, generate the bounding box, and predict the defect category. The results show that compared with the original network, YOLOv5-SE and YOLOv5-CBAM achieve an improvement of 8.95% and 12.87% in detection accuracy respectively. The improved networks can identify the location and category of defects more accurately, and greatly improve the accuracy of defect detection of Thangka images.
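
To make the loss-function swap mentioned in the abstract above concrete, here is a small sketch of the GIoU and CIoU losses for a single pair of axis-aligned boxes, following their commonly cited definitions. It is not taken from the paper's code: the box format `(x1, y1, x2, y2)`, the function names, and the `eps` stabilizer are assumptions for illustration.

```python
import math

def _iou_parts(a, b):
    """Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Returns IoU, union area, and the enclosing box width and height."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    cw = max(a[2], b[2]) - min(a[0], b[0])  # smallest enclosing box width
    ch = max(a[3], b[3]) - min(a[1], b[1])  # smallest enclosing box height
    return inter / union, union, cw, ch

def giou_loss(a, b):
    # GIoU = IoU - (enclosing area not covered by the union) / enclosing area
    iou, union, cw, ch = _iou_parts(a, b)
    enclose = cw * ch
    return 1.0 - (iou - (enclose - union) / enclose)

def ciou_loss(pred, target, eps=1e-9):
    # CIoU = IoU - center_distance^2 / enclosing_diagonal^2 - alpha * v,
    # where v measures the aspect-ratio mismatch between the two boxes.
    iou, _, cw, ch = _iou_parts(pred, target)
    rho2 = (((pred[0] + pred[2]) - (target[0] + target[2])) ** 2
            + ((pred[1] + pred[3]) - (target[1] + target[3])) ** 2) / 4.0
    c2 = cw ** 2 + ch ** 2
    v = (4.0 / math.pi ** 2) * (
        math.atan((target[2] - target[0]) / (target[3] - target[1]))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))
    ) ** 2
    alpha = v / ((1.0 - iou) + v + eps)  # eps avoids 0/0 for identical boxes
    return 1.0 - (iou - rho2 / c2 - alpha * v)
```

Both losses are zero for identical boxes and stay informative when the boxes do not overlap. CIoU additionally penalizes the distance between box centers and the aspect-ratio mismatch, which is the usual motivation given for preferring it in box regression.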

Pretext Task Analysis for Self-Supervised Learning Application of Medical Data (의료 데이터의 자기지도학습 적용을 위한 pretext task 분석)

  • Kong, Heesan;Park, Jaehun;Kim, Kwangsu
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.38-40 / 2021
  • The medical domain has a massive number of data records without response values. Self-supervised learning is a suitable method for medical data because it learns from a pretext task and its supervision, so the model can understand the semantic representation of the data without response values. However, since self-supervised learning performance depends on the representation learned through the pretext task, it is necessary to define an appropriate pretext task that takes the features of the data into account. In this paper, to actively exploit unlabeled medical data in artificial intelligence research, we experimentally find pretext tasks that are suitable for medical data and analyze the results. We use an X-ray image dataset, which is effectively utilizable in the medical domain.


NAMA: A Context-Aware Multi-Agent Based Web Service Approach to Proactive Need Identification for Personalized Reminder System (NAMA: 개인화된 상기 시스템 구축에서의 선응적인 욕구 파악을 위한 상황인지가 가능한 다중 에이전트 웹서비스 접근법)

  • Kwon, Oh-Byung;Kim, Min-Yong;Choi, Sung-Chul;Park, Gyu-Ro
    • Asia pacific journal of information systems / v.14 no.3 / pp.121-144 / 2004
  • Developing a personalized system that works on a user's behalf in the Internet-based marketplace is one of the challenging issues in intelligent e-business, especially mobile commerce. It has been highly recommended that such a mobile personalized system perceive the user's needs a priori by tracking the user's current context, such as location and activity, and then identify the current needs dynamically and proactively. Acquiring the user's context automatically and unobtrusively is an essential feature for the development of autonomous mobile commerce. However, personalization methodologies and feasible architectures for context-aware mobile commerce have so far been very rare. Hence, this paper proposes a context-aware mobile commerce development methodology that applies agent and semantic web technologies to a personalized reminder system, which is one kind of mobile commerce support system. We revisit associationism to understand a buyer's need identification process and adopt the process as 'purchase based on association' to implement a personalized reminder system. Based on this approach, we show how an agent-based semantic web service system can be used to realize a need-aware reminder system. NAMA (Need-Aware Multi-Agent), a prototype system, has been implemented to show the feasibility of the methodology and framework proposed in this paper in a mobile setting. NAMA embeds a Bluetooth-based location tracking module and identifies what a user is currently looking at through her/his mobile device, such as a PDA. Based on these capabilities, NAMA considers the context, the user profile with preferences, and information about currently available services to become aware of the user's current needs and then links her/him to a set of services, which are implemented as web services.

A Study on User Preference Sharing based on Semantic Web in Personalized Services (개인화서비스에서 시맨틱웹 기반의 사용자 선호정보 공유에 관한 연구)

  • Kim, Ju-Yeon;Kim, Jong-Woo;Kim, Chang-Soo
    • Journal of Korea Multimedia Society / v.10 no.10 / pp.1356-1366 / 2007
  • Many personalized services that provide users with adaptive information according to their requirements and preferences have been researched and developed. However, with existing approaches it is difficult to share a user's information among heterogeneous services because these approaches manage users' preferences within a single system. In this paper, we propose a user preference sharing model based on the Semantic Web to resolve this problem. Our model enables user preferences to be described and shared over service-specific ontologies, which are shaped by the features of each service. The model is analyzed and evaluated with an implementation of middleware that supports it. Our approach has the advantage of providing more efficient personalized services than existing approaches because it can describe users' preferences centered on each service and share this information among heterogeneous personalized services.
