• Title/Summary/Keyword: web Indexing

High-Speed Search Mechanism based on B-Tree Index Vector for Huge Web Log Mining and Web Attack Detection (대용량 웹 로그 마이닝 및 공격탐지를 위한 B-트리 인덱스 벡터 기반 고속 검색 기법)

  • Lee, Hyung-Woo;Kim, Tae-Su
    • Journal of Korea Multimedia Society / v.11 no.11 / pp.1601-1614 / 2008
  • The number of web service users has increased rapidly as existing services are converted into web-based internet applications. It is therefore necessary to use web log pre-processing techniques to detect attacks on diverse web service transactions, which also makes it possible to extract web mining information. However, existing mechanisms do not provide efficient pre-processing procedures for a huge volume of web log data. In this paper, we propose both a field-based parsing mechanism and a high-speed log indexing mechanism based on the suggested B-tree Index Vector structure for performance enhancement. In experiments, the proposed mechanism provides efficient web log pre-processing and search functions with session classification. It is therefore useful for enhancing web attack detection.

The Design and Implementation of RIA-Based DNA Sequence Analysis Tools (RIA 기반 DNA서열 분석도구의 설계 및 구현)

  • Kim, Myung-Gwan;Cho, Choong-Hyo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.2 / pp.29-36 / 2009
  • Owing to progress in the field of bioinformatics, analysis tools are used to analyze the enormous volume of DNA sequence data effectively. However, existing tools are inconvenient when searching for and applying data for analysis. This paper proposes a tool developed as an RIA (Rich Internet Application) that addresses these weaknesses. The RIA-based analysis tool indexes DNA sequence data and shows results in real time in a Web 2.0 environment that supplements the conventional form-based Web. The web application was developed in Flex 2 on a Windows workstation.

Query Rewriting and Indexing Schemes for Distributed Systems based on the Semantic Web (시맨틱 웹 기반의 분산 시스템을 위한 질의 변환 및 인덱싱 기법)

  • Chae, Kwang-Ju;Kim, Youn-Hee;Lim, Hae-Chull
    • Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.718-722 / 2008
  • Ontology plays an important role in the Semantic Web in describing the meaning of resources and reasoning about them. Ontologies gain richer expressive power through OWL, the next standard representation language recommended by W3C. As the Semantic Web becomes widely known, the amount of information resources on the Web is growing rapidly, and related information resources are placed in distributed systems on the Web. Therefore, to provide seamless services regardless of physical distance, efficient management of the distributed information resources is required. In particular, quickly finding the local repositories that hold data related to a user's query is important to system performance in a distributed environment. In this paper, we first propose an index structure for searching local repositories related to queries in the distributed Semantic Web. Second, we propose a query rewriting strategy that extends a given user query using the various expressions of OWL. Through the proposed index and query strategy, we can utilize the various expressions of OWL and find local repositories related to all query patterns on the Semantic Web.

A Study on Machine Learning Algorithm for Intelligent Information Retrieval in World Wide Web (WWW상의 지능형 정보검색을 위한 기계학습 알고리즘 구현에 관한 연구)

  • 김성희
    • Journal of the Korean Society for Information Management / v.17 no.2 / pp.189-205 / 2000
  • We investigate the appropriate design and implementation of an Inductive Learning Algorithm with a Neural Network in order to solve both inconsistent indexing and incomplete query problems on the web. Specifically, the proposed system, based on queries and documents in the field of mathematics, shows how inductive learning methods and neural networks can be applied to information retrieval. This study also examines the parameters of the neural network (the number of input and output nodes, the hidden layer size, learning parameters, and so on) that are significant in determining how well the neural network converges.

VotingRank: A Case Study of e-Commerce Recommender Application Using MapReduce

  • Ren, Jian-Ji;Lee, Jae-Kee
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.834-837 / 2009
  • There is a growing need for ad-hoc analysis of extremely large data sets, especially at e-Commerce companies that depend on recommender applications. As the number of e-Commerce web pages grows to tremendous proportions, vertical recommender services can help customers find what they need. Recommender applications are one of the reasons for e-Commerce success in today's world. Compared with a general e-Commerce recommender application, the processing scope of a vertical recommender application is greatly narrowed down. MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. The objective of this paper is to explore the MapReduce framework for e-Commerce recommender applications, covering both general and dedicated link analysis; as a result, response time decreased and the recommender application's accuracy improved.
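
The abstract above names web indexing as a typical MapReduce workload. As a rough illustration of that programming model only (not the paper's VotingRank method), the following single-process sketch builds an inverted index with explicit map, shuffle, and reduce steps. The function names and the sample pages are hypothetical, and a real framework would distribute these steps across machines.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per distinct term in a page."""
    for term in set(text.lower().split()):
        yield term, doc_id


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(term, doc_ids):
    """Reduce step: turn the grouped values into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(pages):
    """Run map, shuffle, and reduce in sequence over a dict of pages."""
    pairs = (
        pair
        for doc_id, text in pages.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(t, ids) for t, ids in shuffle(pairs).items())


# Hypothetical sample pages, used only for illustration.
pages = {
    "p1": "cheap laptop deals",
    "p2": "laptop reviews",
    "p3": "cheap phone deals",
}
index = build_inverted_index(pages)
```

Here `index` maps each term to the pages that contain it, so a lookup such as `index["laptop"]` returns the posting list for that term.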

A Study on Automatic Text Categorization of Web-Based Query Using Synonymy List (유사어 사전을 이용한 웹기반 질의문의 자동 범주화에 관한 연구)

  • Nam, Young-Joon;Kim, Gyu-Hwan
    • Journal of Information Management / v.35 no.4 / pp.81-105 / 2004
  • In this study, automatic text categorization of web-based queries was implemented. χ² methods based on the Support Vector Machine were used to test the efficiency of text categorization on queries. The test was carried out with a model using the Synonymy List; 713 synonyms were extracted manually from the tested documents. As a result of this test, the precision ratio and the recall ratio changed by -0.01% and 8.53%, respectively, depending on whether the synonyms were assigned. The test also shows that the F1 measure value increased by 4.58%. The standard deviation between the recall and precision ratios improved by 18.39%.

Video Browsing Using An Efficient Scene Change Detection in Telematics (텔레매틱스에서 효율적인 장면전환 검출기법을 이용한 비디오 브라우징)

  • Shin Seong-Yoon;Pyo Seong-Bae
    • Journal of the Korea Society of Computer and Information / v.11 no.4 s.42 / pp.147-154 / 2006
  • Effective and efficient representation of color features of multiple video frames is an important yet challenging task for visual information management systems. This paper proposes a Video Browsing Service (VBS) that provides both video content retrieval and video browsing through a real-time user interface on the Web. For scene segmentation and key frame extraction from a video sequence, we propose an efficient scene change detection method that combines the RGB color histogram with the χ² (chi-square) histogram. The resulting key frames are linked by both physical and logical indexing. The system includes video editing and retrieval functions like those of a VCR. Three elements, namely the date, the need, and the subject, are used for video browsing. The Video Browsing Service is implemented with MySQL, PHP, and JMF under Apache Web Server.

Effective Thumbnail Image by Image Indexing Methods (화상인덱싱방법에 의한 효과적 Thumbnail 화상)

  • 김지홍
    • Archives of design research / v.16 no.4 / pp.481-488 / 2003
  • A method for selecting the proper file format for thumbnail images is proposed. After experiments on image file formats such as JPEG and GIF and their effectiveness for the features contained in images, four features are obtained by feature extraction methods used in content-based image indexing: the details, the highly saturated color area, the number of clustered colors, and the amount of continuously varying hue. A way to select the proper file format from these four features is also described. In the thumbnail image generation experiments, 6 sample images are used, and subjective assessment experiments show that the resulting thumbnail images are consistent with the file formats chosen by human subjects, that is, favorable to human vision. This means the proposed method can be used for automatic and systematic generation of thumbnail images for large numbers of images on the Web.

A Comparative Analysis of Content-based Music Retrieval Systems (내용기반 음악검색 시스템의 비교 분석)

  • Ro, Jung-Soon
    • Journal of the Korean Society for Information Management / v.30 no.3 / pp.23-48 / 2013
  • This study compared and analyzed 15 CBMR (Content-based Music Retrieval) systems accessible on the web in terms of DB size and type, query type, access point, input and output type, and search functions. It also reviewed the features of music information and the techniques used in CBMR systems for transforming or transcribing music sources, extracting and segmenting melodies, extracting and indexing music features, and matching. The analysis found the following: the application of text information retrieval techniques (such as inverted indexing, N-gram indexing, Boolean search, truncation, keyword and phrase search, normalization, filtering, browsing, exact matching, similarity measures using edit distance, and sorting) to enhance CBMR; efforts to increase DB size and usability; and problems in extracting melodies, deleting stop notes in queries, and using solfege as pitch information.
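
Two of the text retrieval techniques listed in the abstract above, N-gram indexing and similarity by edit distance, can be sketched together. This is a minimal illustration under stated assumptions, not a description of any surveyed system: melodies are represented as plain strings of note letters, the sample melodies and all function names are hypothetical, and candidates are ranked by edit distance against the whole stored melody, which is a deliberately crude similarity measure.

```python
from collections import defaultdict


def ngrams(seq, n=3):
    """Return all contiguous substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def build_ngram_index(melodies, n=3):
    """Inverted index from each note n-gram to the melodies containing it."""
    index = defaultdict(set)
    for name, notes in melodies.items():
        for gram in ngrams(notes, n):
            index[gram].add(name)
    return index


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def search(query, melodies, index, n=3):
    """Collect candidates sharing any n-gram, then rank by edit distance."""
    candidates = set()
    for gram in ngrams(query, n):
        candidates |= index.get(gram, set())
    return sorted(candidates,
                  key=lambda name: (edit_distance(query, melodies[name]), name))


# Hypothetical note strings, used only for illustration.
melodies = {
    "ode": "EEFGGFEDCCDEEDD",
    "twinkle": "CCGGAAGFFEEDDC",
    "mary": "EDCDEEEDDDEGG",
}
index = build_ngram_index(melodies)
results = search("EEFGGFED", melodies, index)
```

The N-gram index narrows the search to melodies that share at least one three-note fragment with the query, so the more expensive edit distance is computed only for those candidates.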

An XML Tag Indexing Method Using on Lexical Similarity (XML 태그를 분류에 따른 가중치 결정)

  • Jeong, Hye-Jin;Kim, Yong-Sung
    • The KIPS Transactions:PartB / v.16B no.1 / pp.71-78 / 2009
  • For more effective index extraction and index weight determination, studies on index extraction use document structure as well as content. However, most studies concentrate on calculating the importance of context rather than that of XML tags. These conventional studies determine importance on the basis of common sense rather than verifying it through objective experiment. For automatic indexing using the tag information of XML documents, which have become the standard for web document management, this paper classifies the major tags that make up a paper according to their importance and calculates the weight of terms extracted from low-weight tags. Using the weights obtained, it proposes a method of calculating the final weight while updating the weight of terms extracted from high-weight tags. To determine weights more objectively, this paper tests which tags users consider important and reflects the result in the weight calculation by classifying tag importance accordingly. Finally, by comparing search performance against that obtained with index weights from an existing method of determining tag importance, it verifies the effectiveness of the index weights calculated by the proposed method.