• Title/Summary/Keyword: Feature Based Summarization

Document Summarization using Semantic Feature and Hadoop (하둡과 의미특징을 이용한 문서요약)

  • Kim, Chul-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.9 / pp.2155-2160 / 2014
  • In this paper, we propose a new document summarization method that uses semantic features extracted by Hadoop-based distributed parallel processing. The proposed method represents the inherent structure of documents well by using semantic features obtained through non-negative matrix factorization (NMF). In addition, it can summarize big-data documents by using Hadoop. The experimental results demonstrate that the proposed method can summarize big-data documents that a single computer cannot.

Query-Based Summarization using Semantic Feature Matrix and Semantic Variable Matrix (의미 특징 행렬과 의미 가변행렬을 이용한 질의 기반의 문서 요약)

  • Park, Sun
    • Journal of Advanced Navigation Technology / v.12 no.4 / pp.372-377 / 2008
  • This paper proposes a new query-based document summarization method using the semantic feature matrix and the semantic variable matrix. The proposed method does not need a training phase with training data comprising queries and query-specific documents. It summarizes documents accurately for a given query by using semantic features and semantic variables, which are better at identifying the sub-topics of a document, because NMF can naturally extract semantic features that represent the inherent structure of a document. The experimental results show that the proposed method achieves better performance than other methods.

Generic Document Summarization using Coherence of Sentence Cluster and Semantic Feature (문장군집의 응집도와 의미특징을 이용한 포괄적 문서요약)

  • Park, Sun;Lee, Yeonwoo;Shim, Chun Sik;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.12 / pp.2607-2613 / 2012
  • The results of generic summarization based on inherent knowledge are influenced by the composition of sentences in the document set. To resolve this problem, this paper proposes a new generic document summarization method that uses clustering of the semantic features of documents and the coherence of document clusters. The proposed method clusters sentences using semantic features derived from non-negative matrix factorization (NMF); it can identify document topic groups because the inherent structure of the documents is well represented by the sentence clusters. In addition, the method can improve the quality of summarization because important sentences are extracted using the coherence of sentence clusters and cluster refinement by re-clustering. The experimental results demonstrate that applying the proposed method to generic summarization achieves better performance than other generic document summarization methods.

Document Summarization using Weighting based on Cloud (클라우드 기반의 가중치에 의한 문서요약)

  • Park, Sun;Kim, Chul Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.305-306 / 2013
  • In this paper, we propose a document summarization method using cloud-based weighting. The proposed method can minimize the user intervention needed for relevance feedback. It can also improve the quality of document summaries because the inherent semantics of the sentence set are well reflected by term weighting derived from semantic features obtained with cloud-based non-negative matrix factorization.

Document Summarization using Weighting based on Cloud (클라우드 기반의 가중치에 의한 문서요약)

  • Park, Sun;Kim, Chul Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.968-969 / 2013
  • In this paper, we propose a document summarization method using cloud-based weighting. The proposed method can minimize the user intervention needed for relevance feedback. It can also improve the quality of document summaries because the inherent semantics of the sentence set are well reflected by term weighting derived from semantic features obtained with cloud-based non-negative matrix factorization.

User-based Document Summarization using Non-negative Matrix Factorization and Wikipedia (비음수행렬분해와 위키피디아를 이용한 사용자기반의 문서요약)

  • Park, Sun;Jeong, Min-A;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.2 / pp.53-60 / 2012
  • In this paper, we propose a new document summarization method that uses a query expanded with Wikipedia and semantic features representing the inherent structure of the document set. The proposed method expands the user's initial query using Wikipedia-based relevance feedback in order to reflect the user's requirements. It represents the inherent structure of documents well by using semantic features obtained through non-negative matrix factorization (NMF). In addition, it reduces the semantic gap between the user's requirements and the summarization result by extracting meaningful sentences with the expanded query and the semantic features. The experimental results demonstrate that the proposed method achieves better performance than other document summarization methods.

Topic-based Multi-document Summarization Using Non-negative Matrix Factorization and K-means (비음수 행렬 분해와 K-means를 이용한 주제기반의 다중문서요약)

  • Park, Sun;Lee, Ju-Hong
    • Journal of KIISE:Software and Applications / v.35 no.4 / pp.255-264 / 2008
  • This paper proposes a novel method using K-means and non-negative matrix factorization (NMF) for topic-based multi-document summarization. NMF decomposes a weighted term-by-sentence matrix into two sparse non-negative matrices: the semantic feature matrix and the semantic variable matrix. The obtained semantic features are intuitively comprehensible. Weighted similarity between the topic and the semantic features prevents meaningless sentences that are similar to the topic from being selected. K-means clustering removes noise from the sentences so that biased semantics of the documents are not reflected in the summaries. In addition, the coherence of document summaries can be enhanced by arranging the selected sentences in the order of their ranks. The experimental results show that the proposed method achieves better performance than other methods.
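
The factorization named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes raw term counts instead of the paper's weighting scheme, uses the standard multiplicative update rules for NMF, omits the K-means step, and ranks sentences with a simple hypothetical score. All function names (`term_sentence_matrix`, `nmf`, `rank_sentences`) are made up for illustration.

```python
import random
import re


def term_sentence_matrix(sentences):
    """Build a term-by-sentence count matrix A (rows: terms, columns: sentences)."""
    tokens = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokens for t in toks})
    A = [[float(toks.count(term)) for toks in tokens] for term in vocab]
    return vocab, A


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def nmf(A, r, iters=200, seed=0, eps=1e-9):
    """Factor A (m x n) into non-negative W (m x r) and H (r x n).

    W plays the role of the semantic feature matrix and H the semantic
    variable matrix. Uses multiplicative updates, which keep every entry
    non-negative as long as the starting values are positive.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num = matmul(Wt, A)              # r x n
        den = matmul(matmul(Wt, W), H)   # r x n
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n)]
             for k in range(r)]
        Ht = transpose(H)
        num = matmul(A, Ht)              # m x r
        den = matmul(W, matmul(H, Ht))   # m x r
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(r)]
             for i in range(m)]
    return W, H


def rank_sentences(H):
    """Rank sentence indices by a hypothetical relevance score.

    Each semantic feature is weighted by its share of the total mass in H,
    and a sentence's score is the weighted sum of its column in H.
    """
    totals = [sum(row) for row in H]
    grand = sum(totals) or 1.0
    weights = [t / grand for t in totals]
    n = len(H[0])
    scores = [sum(weights[k] * H[k][j] for k in range(len(H))) for j in range(n)]
    return sorted(range(n), key=lambda j: scores[j], reverse=True)


sentences = [
    "NMF factors a term by sentence matrix",
    "The factors are non-negative matrices",
    "K-means groups similar sentences",
    "Clusters of sentences share a topic",
]
vocab, A = term_sentence_matrix(sentences)
W, H = nmf(A, r=2)
order = rank_sentences(H)
print([sentences[j] for j in order[:2]])
```

Here each column of `W` is one semantic feature expressed over the vocabulary, and each column of `H` says how strongly a sentence draws on each feature. A topic-based variant would presumably replace the feature weights with a similarity between the topic and each column of `W`.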

Document Summarization using Pseudo Relevance Feedback and Term Weighting (의사연관피드백과 용어 가중치에 의한 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.3 / pp.533-540 / 2012
  • In this paper, we propose a document summarization method using pseudo relevance feedback and term weighting based on semantic features. The proposed method minimizes user intervention by using pseudo relevance feedback. It can also improve the quality of document summaries because the inherent semantics of the sentence set are well reflected by term weighting derived from semantic features. In addition, it uses the semantic features of term weighting and the expanded query to reduce the semantic gap between the user's requirements and the result of the proposed method. The experimental results demonstrate that the proposed method achieves better performance than other methods without term weighting.

Feature-Based Summarization Method for a Large Opinion Documents Collection (대용량 오피니언 문서에 대한 특성 기반 요약 기법)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.1 / pp.33-42 / 2016
  • Recently, the environment in which public opinions on various topics are expressed has expanded around SNSs and internet portals, and opinion documents are growing rapidly. Under these circumstances, automatic summarization techniques are essential for understanding the overall content of large opinion document collections. However, it is hard to summarize such documents efficiently with traditional text summarization technologies, since the documents include subjective expressions as well as features of target objects. The method proposed in this paper defines features of opinion documents and is designed to retrieve representative sentences expressing opinions on those features. In addition, we demonstrate the usefulness of the proposed method through experiments.

Query-based Document Summarization using Pseudo Relevance Feedback based on Semantic Features and WordNet (의미특징과 워드넷 기반의 의사 연관 피드백을 사용한 질의기반 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.7 / pp.1517-1524 / 2011
  • In this paper, a new document summarization method, which uses semantic features and pseudo relevance feedback (PRF) based on WordNet, is introduced to extract meaningful sentences relevant to a user query. The proposed method can improve the quality of document summaries because the inherent semantics of the documents are well reflected by the semantic features from NMF. In addition, it uses PRF based on the semantic features and WordNet to reduce the semantic gap between the user's high-level requirements and the low-level vector representation. The experimental results demonstrate that the proposed method achieves better performance than the other methods.