Search | Korea Science

Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings

Al-Sabahi, Kamal;Zuping, Zhang;Kang, Yang
- KSII Transactions on Internet and Information Systems (TIIS), v.13 no.1, pp.254-276, 2019
Since the amount of information on the internet is growing rapidly, it is not easy for users to find information relevant to their queries. To tackle this issue, researchers are paying much attention to document summarization. The key to any successful document summarizer is a good document representation. Traditional approaches based on word overlap mostly fail to produce that kind of representation. Word embeddings have shown good performance, allowing words to match on a semantic level. Naively concatenating word embeddings makes common words dominant, which in turn diminishes the representation quality. In this paper, we employ word embeddings to improve the weighting schemes for calculating the Latent Semantic Analysis input matrix. Two embedding-based weighting schemes are proposed and then combined to calculate the values of this matrix. They are modified versions of the augment weight and the entropy frequency that combine the strengths of traditional weighting schemes and word embeddings. The proposed approach is evaluated on three English datasets: DUC 2002, DUC 2004, and Multilingual 2015 Single-document Summarization. Experimental results on the three datasets show that the proposed model achieves competitive performance compared to the state of the art, leading to the conclusion that it provides a better document representation and, as a result, a better document summary.
https://doi.org/10.3837/tiis.2019.01.015
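As a rough illustration of the kind of weighting the abstract above builds on, the following sketch computes the classical (non-embedding) augmented term weight and log-entropy global weight, then multiplies them into a term-by-sentence matrix of the sort Latent Semantic Analysis takes as input. It does not reproduce the paper's embedding-based modifications, which the abstract does not specify; the function names and the toy sentences are assumptions made for this example only.

```python
import math
from collections import Counter


def augmented_tf(counts):
    """Classical augmented local weight: 0.5 + 0.5 * tf / max_tf."""
    if not counts:
        return {}
    max_tf = max(counts.values())
    return {term: 0.5 + 0.5 * tf / max_tf for term, tf in counts.items()}


def entropy_weights(sentence_counts):
    """Classical log-entropy global weight per term.

    g_i = 1 + sum_j p_ij * log(p_ij) / log(n), with p_ij = tf_ij / gf_i,
    where n is the number of sentences and gf_i the total count of term i.
    """
    n = len(sentence_counts)
    totals = Counter()
    for counts in sentence_counts:
        totals.update(counts)
    weights = {}
    for term, gf in totals.items():
        if n < 2:
            weights[term] = 1.0  # entropy is undefined for a single sentence
            continue
        h = sum((c[term] / gf) * math.log(c[term] / gf)
                for c in sentence_counts if c[term] > 0)
        weights[term] = 1.0 + h / math.log(n)
    return weights


def lsa_input_matrix(sentences):
    """Build a term-by-sentence matrix of local * global weights.

    `sentences` is a list of token lists (hypothetical pre-tokenized input).
    Returns the sorted vocabulary and the matrix as a list of rows.
    """
    sentence_counts = [Counter(tokens) for tokens in sentences]
    global_w = entropy_weights(sentence_counts)
    local_w = [augmented_tf(c) for c in sentence_counts]
    terms = sorted(global_w)
    matrix = [[local_w[j].get(t, 0.0) * global_w[t]
               for j in range(len(sentences))]
              for t in terms]
    return terms, matrix


if __name__ == "__main__":
    toy = [["cat", "sat"], ["cat", "ran"], ["dog", "ran", "ran"]]
    terms, matrix = lsa_input_matrix(toy)
    for term, row in zip(terms, matrix):
        print(term, [round(v, 3) for v in row])
```

Terms spread evenly over all sentences receive a global weight near zero, while terms concentrated in one sentence keep a weight of one, which is the behavior that keeps common words from dominating the matrix.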

Crawling Algorithm Design for Deep Web Document Collection (심층 웹 문서 수집을 위한 크롤링 알고리즘 설계)

Won, Dong-Hyun;Kang, Yun-Jeong;Park, Hyuk-Gyu
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2022.10a, pp.367-369, 2022
With the development of web technology, the web provides customized information that meets users' needs. Information is served according to an input form and the user's query, and a web service that provides information that is difficult to find with a search engine is called the deep web. The deep web contains more information than the surface web, but it is difficult to collect with general crawling, which gathers only the information available at the time of the visit. The deep web delivers server-side information to users by running scripting languages such as JavaScript in their browsers. In this paper, we propose an algorithm that can explore dynamically changing websites and collect information by analyzing their scripts. For the experiments, we analyzed the script of the bulletin board of the Korea Centers for Disease Control and Prevention.

Design and Implementation of GT4 based Database Access and Integration Service in Grid Environment (그리드 환경에서 글로버스 툴킷 4 기반 데이터베이스 접근 및 통합 서비스 설계 및 구현)

Hyuk-Ho Kim;Ha-Na Lee;Pil-Woo, Lee;Yang-Woo Kim
- Proceedings of the Korea Information Processing Society Conference, 2008.11a, pp.1103-1106, 2008
Data Grid is a kind of Grid computing that provides a cooperative environment through distributed data sharing and can manage massive data easily and efficiently. We designed and implemented a Globus Toolkit 4 (GT4) based database access and integration service (GDAIS). The service is implemented as a Grid service that runs on GT4, a Grid middleware. It provides automatic registration of databases in a virtual organization, a distributed query service, and a unified user interface. The system can also use the components provided by GT4. It can therefore improve the efficiency of distributing and managing databases, and makes it easy to access and integrate distributed heterogeneous data in Grid environments.
https://doi.org/10.3745/PKIPS.y2008m011a.1103

Analysis of user's query and design of system for implementation of highway traffic datawarehouse (교통정보 이력자료 통합데이터베이스 구축을 위한 사용자 요구사항 분석 및 시스템 설계)

Cheong, Su-Jeong;Yun, Hye-Jung;Song, Soo-Kyung;Lee, Yoon-Kyung;Lee, Min-Soo;Oh, Cheol;NamGung, Sung
- Proceedings of the Korea Information Processing Society Conference, 2007.05a, pp.88-90, 2007
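It is very important to define the functional requirements of a historical traffic data management system that will serve as a means of systematically analyzing collected and processed traffic data and system operation data. Defining and describing the user requirements for building the system in concrete terms raises the system's completeness and usability. In this paper, we propose 'data storage', 'data analysis', and 'data reporting' as the main functions of the historical traffic data management system. These user-required functions are visualized as Use Case diagrams using the Rational Rose tool, which can then give the developers building the integrated database of historical traffic information an easier understanding.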
https://doi.org/10.3745/PKIPS.y2007m05a.88

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
- Journal of Intelligence and Information Systems, v.17 no.3, pp.203-219, 2011
Previously, ordinary users simply wanted to watch video content without any specific requirements or purposes. Today, however, users watching a video try to learn and discover more about the things that appear in it. The demand for finding multimedia or browsing information about objects of interest is therefore spreading with the increasing use of multimedia such as video, which is available not only on internet-capable devices such as computers but also on smart TVs and smartphones. To meet users' requirements, labor-intensive annotation of objects in video content is inevitable. For this reason, many researchers have actively studied methods of annotating the objects that appear in video. In keyword-based annotation, information related to an object that appears in the video content is added directly, and annotation data including all related information about the object must be managed individually. Users have to enter all information related to the object themselves. Consequently, when a user browses for information related to the object, the user can find only the limited resources that exist in the annotated data. Placing annotations on objects also imposes a huge workload on the user. To reduce this workload and minimize the work involved in annotation, existing object-based annotation attempts automatic annotation using computer vision techniques such as object detection, recognition, and tracking. With such techniques, the wide variety of objects that appear in the video content must all be detected and recognized, and automated annotation still faces difficulties in doing so. To overcome these difficulties, we propose a system that consists of two modules.
The first module is the annotation module, which enables many annotators to collaboratively annotate the objects in the video content in order to access semantic data using Linked Data. Annotation data managed by the annotation server is represented using an ontology so that the information can easily be shared and extended. Since annotation data does not include all the relevant information about an object, objects that appear in the video content are simply linked to existing objects in Linked Data to obtain all the related information. In other words, only the URI and metadata such as position, time, and size are stored on the annotation server. When a user needs other related information about the object, that information is retrieved from Linked Data through the relevant URI. The second module enables viewers to browse interesting information about an object while watching video, using the annotation data collaboratively generated by many users. With this system, a query is generated automatically through simple user interaction, all the related information is retrieved from Linked Data, and the additional information about the object is offered to the user. In a future Semantic Web environment, the proposed system is expected to establish a better video content service environment by offering users relevant information about the objects that appear on the screen of any internet-capable device such as a PC, smart TV, or smartphone.
https://doi.org/10.13088/jiis.2011.17.3.203

A Method of Extending a Multiagent Framework with a Plan Generation Module (계획생성 모듈을 갖는 멀티에이전트 기반구조의 확장방법)

Lee, Gowang-Lo;Park, Sang-Kyu;Jang, Myong-Wuk;Min, Byung-Eui;Choi, Joong-Min
- The Transactions of the Korea Information Processing Society, v.4 no.9, pp.2280-2288, 1997
An agent is a software element that, by making use of knowledge and inference, performs tasks on behalf of the user. In general, an agent has the properties of autonomy, social ability, reactivity, and durability. Research on agents is increasingly aimed at multiagent systems, since a single agent is not sufficient to do everything, especially in the real world, where tasks require many diverse activities. However, multiagent frameworks still have some limitations in processing user queries, which are often ambiguous and goal-oriented. Also, a series of procedures or plans cannot be generated directly from a single query. To give more intelligence to the multiagent framework, we propose a method of extending the framework with a plan generation module. The open agent architecture (OAA), a multiagent framework that we developed, is integrated with UCPOP, an AI planner. A travel schedule management agent (TSMA) system is implemented to explore the effects of the method. The extended system lets the user specify only goal-oriented queries, and the plans and procedures to satisfy these goals are generated automatically. The system also provides a cooperative, knowledge-sharing environment that integrates several knowledge-based systems and planning systems that are distributed and used independently.

Design and Implementation of Automatic Linking Support System for Efficient Generating and Retrieving Integrated Documents Based on Web (웹 통합문서의 효율적 생성과 검색을 위한 자동링크지원 시스템의 설계 및 구축)

Lee, Won-Jung;Jung, Eun-Jae;Joo, Su-Chong;Lee, Seung-Yong
- The KIPS Transactions:PartA, v.10A no.2, pp.93-100, 2003
With the advent of distributed computing and Web service technologies, many users require services that can conveniently obtain and support well-assembled Web-based information. For this reason, we construct an Automatic Linking Support System that generates Web-based integrated information and supports information retrieval according to users' various requirements. The system is organized as a client/server system. The server environment consists of an automatic linking engine that provides lexical analysis, query processing, and integrated document generation functions, and databases made up of dictionaries, images, and URL contents. The client environment consists of a Web editor that generates integrated documents and a Web helper that retrieves them via the automatic linking engine and databases. To provide user-friendly client interfaces, the Web editor and helper programs can be executed directly by downloading them from the server, without installing them on the client beforehand. To reduce server overhead, parts of the server's executable modules are distributed to the clients and run there. For the implementation, we use JDK 1.3 with Swing for the user interfaces of the Web editor and helper, the RMI mechanism for interaction between clients and the server, and SQL Server 7.0 for database development. Finally, we show the procedures by which the Web editor and Web helper access the automatic document linking engine and databases, and the results that appear on their screens.
https://doi.org/10.3745/KIPSTA.2003.10A.2.093

Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals (내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법)

구경이;신창환;김유성
- Journal of KIISE:Databases, v.30 no.5, pp.507-520, 2003
In this paper, we propose an automatic method for constructing a theme melody index for a large music database, and an associative content-based music retrieval mechanism that mainly uses the constructed theme melody index to improve response time for users. First, the system automatically extracts the theme melody from a music file using a graphical clustering algorithm based on the similarities between motifs of the music. To place an extracted theme melody into the metric space of an M-tree, we chose the average length variation and the average pitch variation of the theme melody as the major features. Moreover, we added a pitch signature and a length signature, which summarize the pitch variation pattern and the length variation pattern of a theme melody, respectively, to increase the precision of retrieval results. We also propose an associative content-based music retrieval mechanism in which the k-nearest neighbor search and range search algorithms of the M-tree are used to select melodies similar to the user's query melody from the theme melody index. To improve user satisfaction, the proposed retrieval mechanism includes ranking and relevance feedback functions. We also implemented the proposed mechanisms as the essential components of a content-based music retrieval system to verify their usefulness.
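The two query types named in the abstract above, k-nearest neighbor search and range search over a two-feature metric space, can be sketched as follows. This is only an illustration under stated assumptions: a plain linear scan stands in for the M-tree, Euclidean distance is assumed as the metric, and the melody names and feature values (average length variation, average pitch variation) are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two (length variation, pitch variation) pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def knn_search(index, query, k):
    """Return the names of the k melodies closest to the query features.

    A linear scan is used here for clarity; an M-tree would prune
    subtrees using the triangle inequality instead of visiting every entry.
    """
    ranked = sorted(index.items(), key=lambda item: distance(item[1], query))
    return [name for name, _ in ranked[:k]]


def range_search(index, query, radius):
    """Return the names of all melodies within `radius` of the query features."""
    return sorted(name for name, features in index.items()
                  if distance(features, query) <= radius)


if __name__ == "__main__":
    # Hypothetical theme melody index: name -> (avg length variation, avg pitch variation)
    theme_index = {
        "melody_a": (1.0, 2.0),
        "melody_b": (1.5, 2.5),
        "melody_c": (4.0, 0.5),
    }
    query = (1.2, 2.1)
    print(knn_search(theme_index, query, k=2))
    print(range_search(theme_index, query, radius=1.0))
```

Both functions return candidate melodies only; the ranking and relevance feedback steps mentioned in the abstract would operate on these candidates afterwards.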

The e-Business Component Construction based on Distributed Component Specification (분산 컴포넌트 명세를 통한 e-비즈니스 컴포넌트 구축)

Kim, Haeng-Gon;Choe, Ha-Jeong;Han, Eun-Ju
- The KIPS Transactions:PartD, v.8D no.6, pp.705-714, 2001
Today's computing systems have expanded business trade and distributed business processes over the Internet. More and more systems are developed from components that offer reusability, independence, and portability. Component-based development (CBD) focuses on advanced concepts rather than passive manipulation of source code in class libraries. Primary component construction in CBD, however, leads to an additional cost for reconstructing new components with the CBD model. It is also difficult to serve component information rapidly and accurately, because no normalization model has been established and frequent user logins on the Web cause overload. Many difficult issues and aspects of component-based development have to be investigated to develop good component-based products. There is no established normalization model that guarantees a proper treatment of components. This paper elaborates on some of those aspects of web applications in order to adapt to user requirements accurately and rapidly. The distributed components in this paper are used at the smallest size on the network and offer a network-addressable interface based on the business domain. We also discuss the internal and external specifications for grasping the internal and external component relations of the user requirements to be analyzed. The specifications are stored on Servlets after the information is divided between session and entity EJBs (Enterprise JavaBeans), which are reusable units in the business domain. The reusable units are used in business components, which are obtained through queries. As a major contribution, we propose a system model for registering, auto-arranging, searching, testing, and downloading components, which covers component reusability and component customization.

Design of the Flexible Buffer Node Technique to Adjust the Insertion/Search Cost in Historical Index (과거 위치 색인에서 입력/검색 비용 조정을 위한 가변 버퍼 노드 기법 설계)

Jung, Young-Jin;Ahn, Bu-Young;Lee, Yang-Koo;Lee, Dong-Gyu;Ryu, Keun-Ho
- The KIPS Transactions:PartD, v.18D no.4, pp.225-236, 2011
Various applications of LBS (Location Based Services) are being developed to provide customized services depending on the user's location, driven by progress in wireless communication technology and the miniaturization of personal devices. To process large amounts of vehicle location data effectively, LBS requires techniques such as vehicle observation, data communication, data insertion and search, and user query processing. In this paper, we propose a historical location index, GIP-FB (Group Insertion tree with Flexible Buffer Node), and the flexible buffer node technique to adjust the cost of data insertion and search. The designed GIP+ based index employs the buffer node and projection storage to cut the cost of insertion and search. In addition, it adjusts the cost of insertion and search by changing the number of line segments in the buffer node according to a user-defined time interval. In the experiments, the buffer node size influences the performance of GIP-FB by changing the number of non-leaf nodes in the index. The proposed flexible buffer node is used to adjust the performance of the historical location index depending on the LBS application.
https://doi.org/10.3745/KIPSTD.2011.18D.4.225

Search Result 702, Processing Time 0.03 seconds
