• Title/Summary/Keyword: Paper Summarization

Search Result 144, Processing Time 0.021 seconds

An Innovative Approach of Bangla Text Summarization by Introducing Pronoun Replacement and Improved Sentence Ranking

  • Haque, Md. Majharul;Pervin, Suraiya;Begum, Zerina
    • Journal of Information Processing Systems
    • /
    • v.13 no.4
    • /
    • pp.752-777
    • /
    • 2017
  • This paper proposes an automatic method to summarize Bangla news document. In the proposed approach, pronoun replacement is accomplished for the first time to minimize the dangling pronoun from summary. After replacing pronoun, sentences are ranked using term frequency, sentence frequency, numerical figures and title words. If two sentences have at least 60% cosine similarity, the frequency of the larger sentence is increased, and the smaller sentence is removed to eliminate redundancy. Moreover, the first sentence is included in summary always if it contains any title word. In Bangla text, numerical figures can be presented both in words and digits with a variety of forms. All these forms are identified to assess the importance of sentences. We have used the rule-based system in this approach with hidden Markov model and Markov chain model. To explore the rules, we have analyzed 3,000 Bangla news documents and studied some Bangla grammar books. A series of experiments are performed on 200 Bangla news documents and 600 summaries (3 summaries are for each document). The evaluation results demonstrate the effectiveness of the proposed technique over the four latest methods.

Information Technology Application for Oral Document Analysis (구술문서 자료분석을 위한 정보검색기술의 응용)

  • Park, Soon-Cheol;Hahm, Han-Hee
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.13 no.2
    • /
    • pp.47-55
    • /
    • 2008
  • The purpose of this paper is to develop an analytical methodology of or릴 documents by the application of. Information Technologies. This system consists of the key word search, contents summary, clustering, classification & topic tracing of the contents. The integrated model of the five levels of retrieval technologies can be exhaustively used in the analysis of oral documents, which were collected as oral history of five men and women in the area of North Jeolla. Of the five methods topic tracing is the most pioneering accomplishment both home and abroad. In final this research will shed light on the methodological and theoretical studies of oral history and culture.

  • PDF

The Optimal Process of Weapon Acquisition Management (I) -With Special Reference to the Cost/Effectiveness Model for the Selection of Weapon Acquisition System- (무기체계 획득관리의 최적화 (I) -무기체계 획득시스템의 선정을 위한 비용대효과분석모형을 중심으로-)

  • Lee Jin-Joo;Kwon Tae-Young;Joo Nam-Youn
    • Journal of the military operations research society of Korea
    • /
    • v.3 no.2
    • /
    • pp.49-77
    • /
    • 1977
  • Weapon systems are curcial instruments for the security of a nation and critical elements for the victory in a war. Since modern weapon systems tend to be capital-intensive with high precision and quality, they become more and more complex and diversified; their acquisition costs become huge; and their technological obsolescence becomes accelerated. Therefore, the systematic management of weapon acquisition process would be one of the most important defense tasks at the national level. To analyze such problems and find solutions, this paper has studied various aspects related to the efficient management of weapon system acquisition. After brief summarization of the general characteristics of weapon systems, their effectiveness, and developmental trend, the paper discusses the defense management policies and techniques for the weapon systems. Specifically, four alternative acquisition methods such as indigenous R & D, foreign purchase, co-production and joint-production are discussed and analyzed by systems approach. The systems analysis procedure to evaluate and select weapon acquisition method is as follows; 1) to analyze the merits and demerits of the alternative methods, 2) to screen unrealistic alternatives through the consideration of significant factors such as political, economic, military, technological, and social constraints, 3) to evaluate and select an optimal one among the remaining acquisition methods after the cost-effectivenss analysis. For the base of cost-effectivess analysis, cost analysis model as well as effectiveness analysis model of each acquisition method are developed.

  • PDF

A Keyphrase Extraction Model for Each Conference or Journal (학술대회 및 저널별 기술 핵심구 추출 모델)

  • Jeong, Hyun Ji;Jang, Gwangseon;Kim, Tae Hyun;Sin, Donggu
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.81-83
    • /
    • 2022
  • Understanding research trends is necessary to select research topics and explore related works. Most researchers search representative keywords of interesting domains or technologies to understand research trends. However some conferences in artificial intelligence or data mining fields recently publish hundreds to thousands of papers for each year. It makes difficult for researchers to understand research trend of interesting domains. In our paper, we propose an automatic technology keyphrase extraction method to support researcher to understand research trend for each conference or journal. Keyphrase extraction that extracts important terms or phrases from a text, is a fundamental technology for a natural language processing such as summarization or searching, etc. Previous keyphrase extraction technologies based on pretrained language model extract keyphrases from long texts so performances are degraded in short texts like titles of papers. In this paper, we propose a techonolgy keyphrase extraction model that is robust in short text and considers the importance of the word.

  • PDF

Automatic Text Categorization using the Importance of Sentences (문장 중요도를 이용한 자동 문서 범주화)

  • Ko, Young-Joong;Park, Jin-Woo;Seo, Jung-Yun
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.6
    • /
    • pp.417-424
    • /
    • 2002
  • Automatic text categorization is a problem of assigning predefined categories to free text documents. In order to classify text documents, we have to extract good features from them. In previous researches, a text document is commonly represented by the frequency of each feature. But there is a difference between important and unimportant sentences in a text document. It has an effect on the importance of features in a text document. In this paper, we measure the importance of sentences in a text document using text summarizing techniques. A text document is represented by features with different weights according to the importance of each sentence. To verify the new method, we constructed Korean news group data set and experiment our method using it. We found that our new method gale a significant improvement over a basis system for our data sets.

Real-Time Video Indexing and Non-Linear Video Browsing for DTV Receivers (디지털 텔레비전 수신환경에서의 실시간 비디오 인덱싱과 비선형적 비디오 브라우징)

  • 윤경로;전성배
    • Journal of Broadcast Engineering
    • /
    • v.7 no.2
    • /
    • pp.79-87
    • /
    • 2002
  • The fast advances in digital video processing and multimedia processing technology over the last decade enabled various non-linear video browsing techniques. Based on the machine-understanding of the video content, non-linear video brows ing interfaces such as key-frame based content summarization have been introduced. The key-frame based user interfaces, such as storyboard or table of content, however, are still very hard for conventional TV users to use, and are very hard to implement without the service providers providing additional information for the construction of the key-frame based interfaces. In this paper, non-linear video browsing techniques, which not only overcome previously described drawbacks but also are easy-to-use, and real-time video indexing technology to support the proposed browsing techniques are proposed. The structure-based skipping and skimming help users easily find interesting scene and understand the content in a very short time, using real-time video indexing technology.

Automatic Summary Method of Linguistic Educational Video Using Multiple Visual Features (다중 비주얼 특징을 이용한 어학 교육 비디오의 자동 요약 방법)

  • Han Hee-Jun;Kim Cheon-Seog;Choo Jin-Ho;Ro Yong-Man
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.10
    • /
    • pp.1452-1463
    • /
    • 2004
  • The requirement of automatic video summary is increasing as bi-directional broadcasting contents and various user requests and preferences for the bi -directional broadcast environment are increasing. Automatic video summary is needed for an efficient management and usage of many contents in service provider as well. In this paper, we propose a method to generate a content-based summary of linguistic educational videos automatically. First, shot-boundaries and keyframes are generated from linguistic educational video and then multiple(low-level) visual features are extracted. Next, the semantic parts (Explanation part, Dialog part, Text-based part) of the linguistic educational video are generated using extracted visual features. Lastly the XMI- document describing summary information is made based on HieraTchical Summary architecture oi MPEG-7 MDS (Multimedia I)escription Scheme). Experimental results show that our proposed algorithm provides reasonable performance for automatic summary of linguistic educational videos. We verified that the proposed method is useful ior video summary system to provide various services as well as management of educational contents.

  • PDF

Creation of Soccer Video Highlight Using The Structural Features of Caption (자막의 구조적 특징을 이용한 축구 비디오 하이라이트 생성)

  • Huh, Moon-Haeng;Shin, Seong-Yoon;Lee, Yang-Weon;Ryu, Keun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.10D no.4
    • /
    • pp.671-678
    • /
    • 2003
  • A digital video is usually very long temporally. requiring large storage capacity. Therefore, users want to watch pre-summarized video before they watch a large long video. Especially in the field of sports video, they want to watch a highlight video. Consequently, highlight video is used that the viewers decide whether it is valuable for them to watch the video or not. This paper proposes how to create soccer video highlight using the structural features of the caption such as temporal and spatial features. Caption frame intervals and caption key frames are extracted by using those structural features. And then, highlight video is created by using scene relocation, logical indexing and highlight creation rule. Finally. retrieval and browsing of highlight and video segment is performed by selection of item on browser.

Parallel Multithreaded Processing for Data Set Summarization on Multicore CPUs

  • Ordonez, Carlos;Navas, Mario;Garcia-Alvarado, Carlos
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.2
    • /
    • pp.111-120
    • /
    • 2011
  • Data mining algorithms should exploit new hardware technologies to accelerate computations. Such goal is difficult to achieve in database management system (DBMS) due to its complex internal subsystems and because data mining numeric computations of large data sets are difficult to optimize. This paper explores taking advantage of existing multithreaded capabilities of multicore CPUs as well as caching in RAM memory to efficiently compute summaries of a large data set, a fundamental data mining problem. We introduce parallel algorithms working on multiple threads, which overcome the row aggregation processing bottleneck of accessing secondary storage, while maintaining linear time complexity with respect to data set size. Our proposal is based on a combination of table scans and parallel multithreaded processing among multiple cores in the CPU. We introduce several database-style and hardware-level optimizations: caching row blocks of the input table, managing available RAM memory, interleaving I/O and CPU processing, as well as tuning the number of working threads. We experimentally benchmark our algorithms with large data sets on a DBMS running on a computer with a multicore CPU. We show that our algorithms outperform existing DBMS mechanisms in computing aggregations of multidimensional data summaries, especially as dimensionality grows. Furthermore, we show that local memory allocation (RAM block size) does not have a significant impact when the thread management algorithm distributes the workload among a fixed number of threads. Our proposal is unique in the sense that we do not modify or require access to the DBMS source code, but instead, we extend the DBMS with analytic functionality by developing User-Defined Functions.

Automatic Extraction and Usage of Terminology Dictionary Based on Definitional Sentences Patterns in Technical Documents (기술문서 정의문 패턴을 이용한 전문용어사전 자동추출 및 활용방안)

  • Han, Hui-Jeong;Kim, Tae-Young;Doo, Hyo-Chul;Oh, Hyo-Jung
    • Journal of the Korean Society for information Management
    • /
    • v.34 no.4
    • /
    • pp.81-99
    • /
    • 2017
  • Technical documents are important research outputs generated by knowledge and information society. In order to properly use the technical documents properly, it is necessary to utilize advanced information processing techniques, such as summarization and information extraction. In this paper, to extract core information, we automatically extracted the terminologies and their definition based on definitional sentences patterns and the structure of technical documents. Based on this, we proposed the system to build a specialized terminology dictionary. And further we suggested the personalized services so that users can utilize the terminology dictionary in various ways as an knowledge memory. The results of this study will allow users to find up-to-date information faster and easier. In addition, providing a personalized terminology dictionary to users can maximize the value, usability, and retrieval efficiency of the dictionary.