• Title/Summary/Keyword: Search Speed

A Design and Implementation of RSS Data Collecting Engine based on Web 2.0 (웹 2.0 기반 RSS 데이터 수집 엔진의 설계 및 구현)

  • Kang, Pil-Gu;Kim, Jae-Hwan;Lee, Sang-Jun;Chae, Jin-Seok
    • Journal of Korea Multimedia Society / v.10 no.11 / pp.1496-1506 / 2007
  • The web service environment has changed a great deal due to advances in internet technology and the active participation of users. Established web services were static and passive, but recent web services are becoming dynamic and active. Web 2.0 reflects this change well, and its primary feature is the active participation of users. As the volume of generated information grows, it becomes essential to share information quickly and correctly. The Web 2.0 technologies that meet this need are web syndication and tagging. Web syndication produces feeds through which other sites or users can receive a web site's content, and tagging captures the core of a piece of information; many internet users share information rapidly through tag search. In this paper, we propose an efficient technique that improves Web 2.0 technologies such as web syndication and tagging by using a data collection engine. The data collection engine stores in a database the information used by a user's web site, and it accesses the user's web site to collect updated data. The experimental results show that our approach improves search speed by up to 3.14 times over the existing method and reduces the size of the data needed to build associated tags by up to 66%.

Tuple Pruning Using Bloom Filter for Packet Classification (패킷 분류를 위한 블룸 필터 이용 튜플 제거 알고리즘)

  • Kim, So-Yeon;Lim, Hye-Sook
    • Journal of KIISE:Information Networking / v.37 no.3 / pp.175-186 / 2010
  • Due to the emergence of new application programs and the fast growth in the number of Internet users, Internet routers are required to provide quality of service according to the class of each input packet, which is identified by wire-speed packet classification. For a pre-defined rule set, packet classification performs a multi-dimensional search over various header fields of an input packet and determines the highest-priority rule matching that packet. Efficient packet classification algorithms have been widely studied. The tuple pruning algorithm provides fast classification by performing a hash-based search only against the candidate tuples that may include matching rules. A Bloom filter is an efficient data structure composed of a bit vector that represents the membership information of each element in a given set. It is used as a pre-filter to determine whether a specific input is a member of the set. This paper proposes new tuple pruning algorithms using Bloom filters, which effectively remove unnecessary tuples that do not include matching rules. Using a database known to be similar to the actual rule sets used in Internet routers, simulation results show that the proposed tuple pruning algorithm classifies packets faster and consumes less memory than the previous tuple pruning algorithm.
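
The Bloom-filter pre-filtering idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the two-field rules, the bit-string prefixes, the filter size, and the names `BloomFilter`, `build_tuple_space`, and `classify` are all assumptions made for illustration. Rules are grouped into tuples by their prefix lengths, each tuple gets a hash table and a Bloom filter, and the hash table is probed only when the filter reports a possible match.

```python
import hashlib


class BloomFilter:
    """Bit-vector membership pre-filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def build_tuple_space(rules):
    """Group (src_prefix, dst_prefix, priority) rules by their prefix lengths.

    Hypothetical layout: one hash table and one Bloom filter per tuple.
    """
    tables, filters = {}, {}
    for src_prefix, dst_prefix, priority in rules:
        tup = (len(src_prefix), len(dst_prefix))
        tables.setdefault(tup, {})[(src_prefix, dst_prefix)] = priority
        filters.setdefault(tup, BloomFilter()).add((src_prefix, dst_prefix))
    return tables, filters


def classify(src_bits, dst_bits, tables, filters):
    """Return the best (lowest-numbered) priority among matching rules, or None."""
    best = None
    for (src_len, dst_len), table in tables.items():
        key = (src_bits[:src_len], dst_bits[:dst_len])
        if not filters[(src_len, dst_len)].might_contain(key):
            continue  # pruned: this tuple cannot hold a matching rule
        priority = table.get(key)  # a filter false positive simply returns None
        if priority is not None and (best is None or priority < best):
            best = priority
    return best
```

Because a Bloom filter never yields a false negative, pruning a tuple cannot drop a matching rule; a false positive only costs one wasted hash probe.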

The Efficient Method of Parallel Genetic Algorithm using MapReduce of Big Data (빅 데이터의 MapReduce를 이용한 효율적인 병렬 유전자 알고리즘 기법)

  • Hong, Sung-Sam;Han, Myung-Mook
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.5 / pp.385-391 / 2013
  • Big Data refers to data so large that it cannot be processed, collected, stored, searched, or analyzed by existing database management systems. A parallel genetic algorithm for Big Data can be realized easily by implementing a GA (Genetic Algorithm) with MapReduce on the Hadoop distributed system. Previous studies proposed transforming the GA to suit MapReduce; however, they did not show good performance because of frequent data input and output. In this paper, we propose MRPGA (MapReduce Parallel Genetic Algorithm), which uses improved Map and Reduce processes and the parallel processing characteristics of MapReduce. The optimal solution can be found by using the topology and migration of the parallel genetic algorithm together with a local search algorithm. The convergence speed of the proposed method is 1.5 times faster than that of the existing MapReduce SGA, and the optimal solution can be found quickly depending on the number of sub-generation iterations. In addition, MRPGA can improve the processing and analysis performance of Big Data technology.

Algorithms for Indexing and Integrating MPEG-7 Visual Descriptors (MPEG-7 시각 정보 기술자의 인덱싱 및 결합 알고리즘)

  • Song, Chi-Ill;Nang, Jong-Ho
    • Journal of KIISE:Software and Applications / v.34 no.1 / pp.1-10 / 2007
  • This paper proposes a new indexing mechanism for MPEG-7 visual descriptors, especially the Dominant Color and Contour Shape descriptors, that guarantees an efficient similarity search for multimedia databases whose visual metadata are represented with MPEG-7. Since the similarity metric used in the Dominant Color descriptor is based on a Gaussian mixture model, the descriptor itself can be transformed into a color histogram in which the distribution of the color values follows the Gaussian distribution. The transformed Dominant Color descriptor (i.e., the color histogram) is then indexed by the proposed indexing mechanism. For indexing the Contour Shape descriptor, we use a two-pass algorithm. In the first pass, since the similarity of two shapes can be roughly measured with the global parameters of the Contour Shape descriptor, such as eccentricity and circularity, dissimilar image objects are excluded using these global parameters. In the second pass, the similarities between the query and the remaining image objects are measured with the peak parameters of the Contour Shape descriptor. This two-pass approach reduces the computational resources needed to measure the similarity of image objects with the Contour Shape descriptor. This paper also proposes two schemes for integrating visual descriptors for efficient retrieval from a multimedia database. The first uses the weight of a descriptor as a yardstick to determine the number of similar image objects selected with respect to that descriptor, and the second uses the weight as the degree of importance of the descriptor in the global similarity measurement. Experimental results show that the proposed indexing and integration schemes produce a remarkable speed-up compared to the exact similarity search, although there is some loss of accuracy because of the approximate computation in indexing. The proposed schemes could be used to build a multimedia database represented in MPEG-7 that guarantees efficient retrieval.
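
The two-pass idea for the Contour Shape descriptor can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the tolerance `global_tol`, and the placeholder `peak_distance` (a zero-padded sum of absolute differences) are hypothetical and do not reproduce the MPEG-7 matching procedure or the paper's index structure.

```python
def peak_distance(peaks_a, peaks_b):
    """Placeholder peak comparison (an assumption, not the MPEG-7 matching rule).

    Pads the shorter list with zeros and sums absolute differences.
    """
    n = max(len(peaks_a), len(peaks_b))
    a = list(peaks_a) + [0.0] * (n - len(peaks_a))
    b = list(peaks_b) + [0.0] * (n - len(peaks_b))
    return sum(abs(x - y) for x, y in zip(a, b))


def two_pass_search(query, database, global_tol=0.2, top_k=3):
    """Return ids of the top_k most similar objects using a two-pass filter."""
    # Pass 1: cheap rejection using the global parameters only.
    candidates = [
        obj for obj in database
        if abs(obj["eccentricity"] - query["eccentricity"]) <= global_tol
        and abs(obj["circularity"] - query["circularity"]) <= global_tol
    ]
    # Pass 2: the more expensive peak comparison runs only on the survivors.
    ranked = sorted(
        candidates,
        key=lambda obj: peak_distance(query["peaks"], obj["peaks"]),
    )
    return [obj["id"] for obj in ranked[:top_k]]
```

The saving comes from the first pass: objects whose global parameters differ too much never reach the peak comparison, at the cost of possibly discarding a true match when the tolerance is too tight.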

Design and Implementation of Telematics Contents Gateway Based on Interoperability (상호운영성 기반의 텔레매틱스 컨텐츠 게이트웨이 설계 및 구현)

  • Kim, Do-Hyun;Min, Kyoung-Wook;Jang, Byung-Tae;Li, Ki-Joune
    • The KIPS Transactions:PartD / v.14D no.2 / pp.249-264 / 2007
  • As the need for telematics contents services grows with people's frequent travel, it is necessary to provide various telematics contents by connecting and integrating the contents that are currently collected and provided by individual data providers. However, it is difficult to integrate or exchange current telematics contents, because the data providers use different telematics contents models. Therefore, we propose a 'telematics contents gateway (TCG)' system, which integrates different telematics contents so that they become interoperable. The TCG can solve several problems in current telematics contents providing systems. First, it has been impossible to search contents without prior information about the data providers, because current systems lack metadata. To address this problem, the TCG supports a search function based on web-service technology. Second, the TCG provides a common road network model for interoperability, into which different road network models can be integrated. Moreover, an integration algorithm for enhancing the correctness of integration is proposed. In addition, the TCG is designed with a multi-thread, multi-queue structure. The TCG, developed in C# on a Windows system, has been running, and we verified that no information was lost in the integration process. In addition, the speed of content integration and transfer satisfied the requirements of telematics service providers.

The Use of Reinforcement Learning and The Reference Page Selection Method to improve Web Spidering Performance (웹 탐색 성능 향상을 위한 강화학습 이용과 기준 페이지 선택 기법)

  • 이기철;이선애
    • Journal of the Korea Computer Industry Society / v.3 no.3 / pp.331-340 / 2002
  • The web is becoming so huge and intractable that, without an intelligent information extractor, users would become more and more helpless. Conventional web spidering techniques for general-purpose search engines may be too slow for specialized search engines, which concentrate only on specific areas or keywords. In this paper, a new model for improving web spidering capabilities is proposed and tested. Selecting adequate reference web pages from the initial web page set relevant to a given specific area (or keywords) can be very important for reducing spidering time. Our reference web page selection method, DOPS, selects web pages dynamically and orthogonally, and it can also decide the appropriate number of reference pages using a newly defined measure. Even for a very specific area, this method performed almost at the level of experts. Considering that experts cannot work on a huge initial page set and still have difficulty deciding the optimal number of reference web pages, this method seems very promising. We also applied reinforcement learning to the web environment, and DOPS-based reinforcement learning experiments show that our method works quite favorably in terms of both the number of hyperlinks and time.

Enabling Environment for Participation in Information Storage Media Export and Digital Evidence Search Process using IPA (정보저장매체 반출 및 디지털 증거탐색 과정에서의 참여권 보장 환경에 대한 중요도-이행도 분석)

  • Yang, Sang Hee;Lee, Choong C.;Yun, Haejung
    • The Journal of Society for e-Business Studies / v.23 no.3 / pp.129-143 / 2018
  • Recently, the use of digital media such as computers and smart devices has been increasing rapidly. The vast and diverse information covered by an investigating agency's warrant also includes information irrelevant to the crime. Therefore, the confiscation of such information has been criticized for infringing the basic rights, defense rights, and privacy of the person subject to seizure. Although investigating agencies guarantee the right to participate, there are no specific guidelines, so practice varies with context and environment. In this process, abuse of the participation right is detrimental to the speed and integrity of the investigation, and there is a side effect that digital evidence might be destroyed by remote initialization. In this study, we surveyed digital evidence analysts across the country using four domains and thirty measurement items on the enabling environment for participation in information storage media export and the digital evidence search process. The difference between the level of importance and the level of performance was analyzed with an IPA matrix along the process, location, people, and technology dimensions. Seven items fall in the "concentrate here" area: one process-related, three location-related, and three people-related. This study provides a basis for establishing proper policies and strategies for ensuring the participation right while minimizing the side effects.

A Study on the content analysis of Remote classes according to COVID-19 : Focusing on Kindergarten [Play ON] (코로나 19에 따른 원격수업 콘텐츠 분석연구 : 유치원 [놀이ON]을 중심으로)

  • Nam, ki-won;Choi, jung-hee
    • The Journal of the Convergence on Culture Technology / v.8 no.2 / pp.69-75 / 2022
  • The purpose of this study is to suggest directions for developing and using remote class content for young children in future educational environments, based on an analysis of existing remote class content for early childhood. The analysis covers 148 kindergarten [Play ON] content items, and the results for "ease of use," "interest," "educational value," "conformity of content," and "technicality" are as follows. Except for "conformity of content," the content scores varied widely within each sub-area. For "conformity of content," almost all items received high scores because the content was produced by incumbent teachers. From these results we draw the following conclusions. First, content should be produced at an appropriate pace, taking into account how children are drawn into and understand play during the search and progress process, and it should allow play suited to different challenges and levels, using various strategies under themes that interest all children. In addition, topic selection, video editing, and voice support are needed so that children can participate in the process of discovery and search, supporting their imagination, curiosity, and creative experiences.

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers either browse products belonging to a particular product line or brand with the intention of purchasing, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have been adopted thanks to the development of big data technology, and attempts are being made to optimize users' shopping experience. Even so, it is very unlikely that online consumers who visit a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. Therefore, to understand the behavior of online consumers, it is important to analyze the various types of visits, not only visits that end in a purchase. In this study, we performed session-based clustering analysis on the clickstream data of an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, resulting in a total of over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for clustering analysis. Considering the size of the data set, we performed the analysis using the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of the standard K-means algorithm. The optimal number of clusters was found to be four, and differences in session characteristics and purchase rates were identified for each cluster. An online consumer visits a website several times, learns about a product, and then decides whether to purchase. To analyze the purchasing process over several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. The visit sequence data consist of the series of visits leading up to one purchase, and the items of each sequence are the cluster labels derived above. We built separate sequence data for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each set of sequence data. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns are common to both sets of sequence data, other frequent patterns are derived from only one of them. Through comparative analysis of the extracted frequent patterns, we found that consumers who made purchases showed a visit pattern of repeatedly searching for a specific product before deciding to purchase it. The implication of this study is that it analyzes the search types of online consumers using large-scale clickstream data and analyzes their patterns to explain the purchasing process from a data-driven point of view. Most studies on the typology of online consumers have focused on the characteristics of each type and on the key factors that distinguish it. In this study, we classified the behavior of online consumers into types and further analyzed the order in which these types combine into a series of search patterns. In addition, online retailers will be able to try to improve their purchase conversion through marketing strategies and recommendations tailored to the various visit types, and will be able to evaluate the effect of each strategy through changes in consumers' visit patterns.
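
To make the Mini-Batch K-means step in this abstract concrete, here is a minimal pure-Python sketch of the general technique. It is an assumption-laden illustration, not the study's pipeline: the function names, parameters, and the per-center learning rate of 1/count are choices made for this example. Each iteration draws a small random batch, assigns its points to the nearest center, and nudges each center toward its assigned points.

```python
import random


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((x - m) ** 2 for x, m in zip(point, centers[i])),
    )


def mini_batch_kmeans(points, k, batch_size=32, iterations=100, seed=0,
                      initial_centers=None):
    """Cluster points (a list of equal-length tuples) into k centers."""
    rng = random.Random(seed)
    if initial_centers is None:
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in initial_centers]
    counts = [0] * len(centers)  # points seen so far by each center
    for _ in range(iterations):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, with the centers held fixed.
        assignments = [nearest(p, centers) for p in batch]
        for p, c in zip(batch, assignments):
            counts[c] += 1
            eta = 1.0 / counts[c]  # step size shrinks as a center sees more data
            centers[c] = [(1 - eta) * m + eta * x for m, x in zip(centers[c], p)]
    return centers
```

Only `batch_size` points are touched per iteration, which is why the approach scales to hundreds of thousands of sessions. In practice a library implementation such as scikit-learn's `MiniBatchKMeans` would normally be used instead of hand-written code.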

Development of Simulation Technology Based on 3D Indoor Map for Analyzing Pedestrian Convenience (보행 편의성 분석을 위한 3차원 실내지도 기반의 시뮬레이션 기술 개발)

  • KIM, Byung-Ju;KANG, Byoung-Ju;YOU, So-Young;KWON, Jay-Hyoun
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.3 / pp.67-79 / 2017
  • Increasing dependence on the metro system for transportation has led to passenger convenience becoming as important as transportation capacity. In this study, a pedestrian simulator has been developed that can quantitatively assess the pedestrian environment in terms of attributes such as speed and distance. The simulator consists of modules for 3D indoor map authoring and algorithmic pedestrian modeling. The 3D indoor map authoring module covers 3D spatial modeling, network generation, and evaluation of the obtained results. The pedestrian modeling algorithm performs path search, allocation of users, and evaluation of the level of service (LOS). The primary objective of these functions is to apply and analyze various scenarios repeatedly, such as before and after an improvement of the pedestrian environment, and to integrate the spatial information database with the dynamic information database. Furthermore, to demonstrate the practical applicability of the proposed simulator, a test bed was constructed for a currently operational metro station, and a quantitative index of the proposed improvement effect was calculated by analyzing the walking speed of pedestrians before and after the improvement of the passage. The possibility of extending the database for further analysis is also discussed.