• Title/Summary/Keyword: query performance

Study on MPI-based parallel sequence similarity search in the LINUX cluster (클러스터 환경에서의 MPI 기반 병렬 서열 유사성 검색에 관한 연구)

  • Hong, Chang-Bum;Cha, Jeoung-Ho;Lee, Sung-Hoon;Shin, Seung-Woo;Park, Keun-Joon;Park, Keun-Young
    • Journal of the Korea Society of Computer and Information / v.11 no.6 s.44 / pp.69-78 / 2006
  • In the field of bioinformatics, searching for similar sequences in biological databases plays an important role in predicting functional or structural information. Biological sequence data have increased dramatically since the Human Genome Project. Because search speed for similar sequences is a critical factor in predicting function or structure, SMP (Symmetric Multi-Processor) computers or clusters are used to improve search performance. In this paper, we propose the nBLAST algorithm, which runs in a cluster environment, as a method to reduce the search time of BLAST (Basic Local Alignment Search Tool), the tool commonly used for sequence similarity search. Because nBLAST uses MPI (Message Passing Interface), a parallel library, to distribute queries to each node and execute them in parallel without modifying the existing BLAST source code, BLAST can be parallelized easily, without complicated configuration procedures. In addition, experiments running nBLAST on a 28-node Linux cluster confirmed that performance improves as the number of nodes increases.
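
Illustrative sketch (not the paper's nBLAST code): the abstract's central idea is to distribute queries across cluster nodes with MPI while leaving the BLAST binary untouched. The mpi4py fragment below shows that pattern; the file names and the legacy blastall command line are assumptions.

```python
# Sketch: distribute query sequences to cluster nodes with MPI and run an
# unmodified BLAST binary on each node (in the spirit of nBLAST).
# Assumptions: mpi4py and NCBI BLAST are installed; queries.fasta holds the
# queries; the exact blast command line is illustrative only.
import subprocess
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    # Read all query records and split them into one chunk per MPI rank.
    records = open("queries.fasta").read().split(">")[1:]
    chunks = [records[i::size] for i in range(size)]
else:
    chunks = None

my_queries = comm.scatter(chunks, root=0)  # each node receives its share

# Each node writes its own sub-query file and runs the unmodified BLAST binary.
with open(f"queries_{rank}.fasta", "w") as f:
    f.writelines(">" + rec for rec in my_queries)
subprocess.run(["blastall", "-p", "blastn", "-d", "nt",
                "-i", f"queries_{rank}.fasta", "-o", f"result_{rank}.txt"],
               check=False)

# Rank 0 gathers the per-node result file names and could concatenate them.
results = comm.gather(f"result_{rank}.txt", root=0)
if rank == 0:
    print("partial results:", results)
```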

Design and Implementation of Web GIS Server Using Node.js (Node.js를 활용한 웹GIS 서버의 설계와 구현)

  • Jun, Sang Hwan;Doh, Kyoung Tae
    • Spatial Information Research / v.21 no.3 / pp.45-53 / 2013
  • Web GIS, based on the latest web technology, has evolved to provide efficient and accurate spatial information to users, and Web GIS servers have steadily improved their performance in responding to user requests and serving spatial information. This research designs and implements a Web GIS server, named NodeMap, using the emerging technology Node.js, an event-driven, non-blocking I/O framework for writing server-side JavaScript. NodeMap is a Web GIS server that supports the OGC implementation specifications. It is designed to process GIS data through a DBMS that supports spatial indexing and standard spatial query functions, uses the node-canvas module, which provides an HTML5 canvas, to render spatial information onto tile maps, and uses the Express module, a framework based on the Connect module. A benchmarking demonstration of NodeMap confirmed the feasibility of Node.js as a technology for developing future Web GIS servers. Having completed the quality test of NodeMap, this study shows the compatibility and potential of Node.js as a Web GIS server development technology and the promising future of Internet GIS services.
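
Illustrative sketch (not from the NodeMap source): a tile-map GIS server such as the one described must convert a requested tile coordinate into a geographic bounding box before querying the spatial DBMS and rendering the tile. NodeMap itself is written in Node.js; the Python fragment below shows only the standard Web Mercator (slippy map) tile arithmetic, with the z/x/y tile scheme assumed.

```python
# Sketch: convert a z/x/y tile request into the lat/lon bounding box that a
# tile-rendering Web GIS server would use for its spatial query.
# Standard "slippy map" tiling math; not taken from the NodeMap source.
import math

def tile_bounds(z: int, x: int, y: int):
    """Return (west, south, east, north) in degrees for tile (z, x, y)."""
    n = 2 ** z
    def lon(col):   # column -> longitude
        return col / n * 360.0 - 180.0
    def lat(row):   # row -> latitude (inverse Web Mercator)
        t = math.pi * (1 - 2 * row / n)
        return math.degrees(math.atan(math.sinh(t)))
    return lon(x), lat(y + 1), lon(x + 1), lat(y)

if __name__ == "__main__":
    # A zoom-10 tile roughly covering the Seoul area (illustrative values).
    print(tile_bounds(10, 873, 396))
```
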

Implementation of a Spam Message Filtering System using Sentence Similarity Measurements (문장유사도 측정 기법을 통한 스팸 필터링 시스템 구현)

  • Ou, SooBin;Lee, Jongwoo
    • KIISE Transactions on Computing Practices / v.23 no.1 / pp.57-64 / 2017
  • Short message service (SMS) is one of the most important communication methods for mobile phone users. However, illegal advertising spam messages exploit SMS because messages can be sent without friend registration. Spam message filtering systems that use machine learning have recently been developed, but they have disadvantages such as requiring heavy computation. In this paper, we implemented a spam message filtering system using the set-based POI search algorithm and sentence similarity measurement, without any server. The algorithm judges whether an input message is spam using only its letter composition and no server-side computing, so spam messages can be filtered even when the text has been intentionally modified. We also added a preprocessing option designed for spam filtering. Experimental results show that our spam message filtering system performs better than the original set-based POI search algorithm. We evaluated the proposed system through extensive simulation; according to the results, it can filter text messages with high accuracy, including messages that the three major telecom companies cannot filter.
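
Illustrative sketch (not the paper's set-based POI search algorithm): a set-based sentence-similarity test over character n-grams, which is likewise robust to small, intentional modifications of spam text. The threshold and the sample messages are assumptions.

```python
# Sketch: a set-based sentence-similarity check in the spirit of the abstract.
# A simplified character-bigram Jaccard measure stands in for the paper's
# algorithm; the threshold and the known-spam list are assumptions.
def ngrams(text: str, n: int = 2) -> set:
    text = "".join(text.split())           # ignore spacing tricks
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def similarity(a: str, b: str) -> float:
    ga, gb = ngrams(a), ngrams(b)
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

KNOWN_SPAM = ["Limited offer!! Click http://spam.example to claim your prize"]

def is_spam(message: str, threshold: float = 0.6) -> bool:
    return any(similarity(message, s) >= threshold for s in KNOWN_SPAM)

# An intentionally modified copy of the spam message is still caught.
print(is_spam("L1mited offer !! Click http://spam.example to claim y0ur prize"))
```
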

Cache Replacement Strategies considering Location and Region Properties of Data in Mobile Database Systems (이동 데이타베이스 시스템에서 데이타의 위치와 영역 특성을 고려한 캐쉬 교체 기법)

  • Kim, Ho-Sook;Yong, Hwan-Seung
    • Journal of KIISE: Databases / v.27 no.1 / pp.53-63 / 2000
  • The mobile computing service market is growing rapidly due to low-cost wireless network technology and high-performance mobile computing devices. In recent years, several methods have been proposed to deal with the restrictions of the mobile computing environment, such as limited bandwidth, frequent disconnection, and short-lived batteries. Among these methods, caching has been studied extensively: among the data transmitted from a mobile support station, the data likely to be accessed in the near future are selected and stored in the local cache of a mobile host. Existing cache replacement methods are limited in efficiency because they do not consider user mobility or the spatial attributes of geographical data. In this paper, we show that the value and the semantics of data stored in the cache of a mobile host change according to the movement of the host, because data that are geographically near are better suited to answer a user's query in the mobile environment. We also define the region over which geographical data have effect, using the spatial attributes of the data. Finally, we propose two new cache replacement methods that efficiently support user mobility and the spatial attributes of data: one is based on the location of data and the other on the meaningful region of data. A comparative analysis with previous methods shows that the proposed methods improve the cache hit ratio. We also show that performance varies with data density and argue that different cache replacement methods are required for regions with different data densities.
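
Illustrative sketch (not from the paper): the simplest form of a location-aware replacement policy, evicting the cached item geographically farthest from the mobile host's current position. The coordinates, capacity, and plain Euclidean distance are assumptions.

```python
# Sketch: a distance-based cache replacement policy in the spirit of the
# paper's location-aware methods: when the cache is full, evict the item
# farthest from the mobile host's current position.
import math

class LocationCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = {}                      # key -> (x, y, value)

    def put(self, key, x, y, value, host_pos):
        if len(self.items) >= self.capacity and key not in self.items:
            # Victim = cached item farthest from the host's current location.
            victim = max(self.items,
                         key=lambda k: math.dist(host_pos, self.items[k][:2]))
            del self.items[victim]
        self.items[key] = (x, y, value)

cache = LocationCache(capacity=2)
cache.put("cafe", 1.0, 1.0, "menu", host_pos=(0.0, 0.0))
cache.put("bank", 5.0, 5.0, "hours", host_pos=(0.0, 0.0))
cache.put("bus", 0.5, 0.2, "timetable", host_pos=(0.0, 0.0))  # evicts "bank"
print(sorted(cache.items))
```
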

An Efficient Spatial Join Method Using DOT Index (DOT 색인을 이용한 효율적인 공간 조인 기법)

  • Back, Hyun;Yoon, Jee-Hee;Won, Jung-Im;Park, Sang-Hyun
    • Journal of KIISE: Databases / v.34 no.5 / pp.420-436 / 2007
  • The choice of an effective indexing method is crucial to the performance of the spatial join operator, which is heavily used in geographic information systems. The $R^*$-tree based method is one of the most representative indexing methods. In this paper, we propose an efficient spatial join technique based on the DOT (Double Transformation) index and compare it with the spatial join technique based on the $R^*$-tree index. The DOT index transforms the MBR of a spatial object into a single numeric value using a space filling curve and builds a $B^+$-tree from the set of transformed values; it can therefore be employed as a primary index for spatial objects. The proposed spatial join technique exploits the regularities in the traversal patterns of space filling curves to divide a query region into a set of maximal sub-regions within which the space filling curve runs without interruption. This division reduces the number of spatial transformations required to perform the join and thus improves join performance. Experiments with data sets of various distributions and sizes show that the proposed join technique is up to three times faster than the spatial join method based on the $R^*$-tree index.
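
Illustrative sketch (not from the paper): the DOT index reduces an MBR to a single value with a space filling curve so that a $B^+$-tree can index it. The fragment below uses a Z-order (Morton) interleaving of the MBR's scaled center as a stand-in; the paper's actual double transformation and choice of curve are not reproduced.

```python
# Sketch: mapping a 2-D spatial key to a single numeric value with a
# space-filling curve, the basic idea behind indexing MBRs in a B+-tree.
def morton(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into one Z-order value."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # even bit positions from x
        code |= ((y >> i) & 1) << (2 * i + 1)   # odd bit positions from y
    return code

def mbr_key(xmin, ymin, xmax, ymax, scale: int = 1 << 15) -> int:
    """Reduce an MBR to one value by encoding its (scaled) center point."""
    cx = int((xmin + xmax) / 2 * scale)
    cy = int((ymin + ymax) / 2 * scale)
    return morton(cx, cy)

# Nearby objects receive nearby keys, so a B+-tree on the key can serve as a
# primary spatial index, and a query region decomposes into key ranges.
print(mbr_key(0.10, 0.20, 0.12, 0.22), mbr_key(0.50, 0.80, 0.55, 0.85))
```
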

Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions: Part A / v.17A no.5 / pp.229-236 / 2010
  • With the increasing number of documents on the Internet and in enterprises, it has become crucial to support users' queries on those documents efficiently. In this situation, full-text search is generally adopted because it can answer uncontrolled ad-hoc queries by automatically indexing all the keywords found in the documents. The size of the index files used for full-text search grows with the number of indexed documents, so the disk cost of processing multi-keyword queries against such enlarged index files may become too large. To solve this problem, we propose an index file structure and a management scheme suitable for processing multi-keyword queries against a large volume of index files. We adopt the inverted-file structure, which is widely used for multi-keyword search, as the basic index structure and modify it into a hierarchical structure for the join and ranking operations performed during query processing. To save disk cost on top of that index structure, we dynamically keep in main memory the results of join operations between two keywords that are highly likely to appear together in users' queries. We also present performance comparisons using a disk cost model to show the advantage of the proposed scheme.
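
Illustrative sketch (not from the paper): keeping the equi-join (posting-list intersection) of a frequently co-queried keyword pair in main memory so that repeated multi-keyword queries avoid recomputing it from the on-disk inverted index. The tiny index contents and the memoization policy below are assumptions standing in for the paper's dynamic management scheme.

```python
# Sketch: cache the equi-join of two keywords' posting lists in memory and
# reuse it when answering multi-keyword queries.
INVERTED_INDEX = {                       # keyword -> sorted posting list
    "query":       [1, 3, 5, 8, 13],
    "performance": [2, 3, 5, 13, 21],
    "index":       [3, 8, 13],
}
JOIN_CACHE = {}                          # (kw1, kw2) -> cached intersection

def postings(keyword):
    return INVERTED_INDEX.get(keyword, [])

def join(kw1, kw2):
    """Equi-join of two posting lists, memoized in main memory."""
    key = tuple(sorted((kw1, kw2)))
    if key not in JOIN_CACHE:            # first time: compute and remember
        JOIN_CACHE[key] = sorted(set(postings(kw1)) & set(postings(kw2)))
    return JOIN_CACHE[key]

def multi_keyword_query(*keywords):
    """Answer a multi-keyword query, reusing the cached two-keyword join."""
    if len(keywords) == 1:
        return postings(keywords[0])
    result = set(join(keywords[0], keywords[1]))
    for kw in keywords[2:]:
        result &= set(postings(kw))
    return sorted(result)

print(multi_keyword_query("query", "performance"))           # [3, 5, 13]
print(multi_keyword_query("query", "performance", "index"))  # [3, 13]
```
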

A New Web Cluster Scheme for Load Balancing among Internet Servers (인터넷 환경에서 서버간 부하 분산을 위한 새로운 웹 클러스터 기법)

  • Kim, Seung-Young;Lee, Seung-Ho
    • The KIPS Transactions: Part C / v.9C no.1 / pp.115-122 / 2002
  • This paper presents a new web cluster scheme based on a dispatcher that does not depend on the servers' operating systems and can examine server status interactively. Two principal functions are proposed: self-controlled load distribution and transaction fail-safety. The self-controlled load distribution function periodically checks the response time and status of the servers and then decides where to direct traffic so that every query receives a rapid response. The transaction fail-safe function can immediately recover queries lost to server errors, including broken transactions. The proposed scheme was implemented in C on a Unix operating system and compared with existing web cluster products. Compared with a broadcast-based web cluster, the proposed scheme delivers higher performance as traffic increases. Compared with a round-robin DNS-based web cluster, it shows similar performance under normal traffic, but when one server crashes, the proposed cluster continues to process traffic reliably without losing queries. The proposed web cluster scheme therefore offers an alternative for building more reliable and better-utilized services under rapidly increasing traffic and the resulting server load.
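
Illustrative sketch (not from the paper): the self-controlled load distribution idea, with a dispatcher probing each server's response time and routing the next request to the fastest reachable server. The server list and the health-check URL are assumptions, and the transaction fail-safe function is not shown.

```python
# Sketch: a dispatcher that measures per-server response times and sends the
# next request to the currently fastest healthy server.
import time
import urllib.request

SERVERS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]  # hypothetical

def probe(server: str) -> float:
    """Return the response time in seconds, or infinity if the server fails."""
    start = time.monotonic()
    try:
        urllib.request.urlopen(server + "/health", timeout=1.0).read()
        return time.monotonic() - start
    except OSError:
        return float("inf")

def choose_server() -> str:
    times = {s: probe(s) for s in SERVERS}
    return min(times, key=times.get)      # fastest (and reachable) server

if __name__ == "__main__":
    print("dispatching next request to:", choose_server())
```
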

Design of the Flexible Buffer Node Technique to Adjust the Insertion/Search Cost in Historical Index (과거 위치 색인에서 입력/검색 비용 조정을 위한 가변 버퍼 노드 기법 설계)

  • Jung, Young-Jin;Ahn, Bu-Young;Lee, Yang-Koo;Lee, Dong-Gyu;Ryu, Keun-Ho
    • The KIPS Transactions: Part D / v.18D no.4 / pp.225-236 / 2011
  • Various LBS (Location Based Services) applications are being developed to provide services customized to the user's location, driven by progress in wireless communication technology and the miniaturization of personal devices. To process large volumes of vehicle location data effectively, LBS requires techniques for vehicle observation, data communication, data insertion and search, and user query processing. In this paper, we propose a historical location index, GIP-FB (Group Insertion tree with Flexible Buffer Node), and a flexible buffer node technique that adjusts the cost of data insertion and search. The GIP+-based index employs a buffer node and projection storage to cut the cost of insertion and search, and it adjusts that cost by changing the number of line segments held in the buffer node over a user-defined time interval. In the experiments, the buffer node size influences the performance of GIP-FB by changing the number of non-leaf nodes of the index. The proposed flexible buffer node can thus be used to tune the performance of the historical location index to the needs of individual LBS applications.
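
Illustrative sketch (not the GIP-FB structure): a toy buffer node that accumulates incoming trajectory line segments and bulk-flushes them once a tunable size is reached, showing the trade-off the flexible buffer node controls: a larger buffer lowers insertion cost (fewer flushes) but raises search cost (more unindexed segments to scan).

```python
# Sketch: the insertion/search trade-off controlled by a buffer node size.
class FlexibleBufferNode:
    def __init__(self, max_segments: int):
        self.max_segments = max_segments   # tunable buffer node size
        self.buffer = []                   # unindexed line segments
        self.indexed = []                  # stands in for the tree proper
        self.flushes = 0

    def insert(self, segment):
        self.buffer.append(segment)
        if len(self.buffer) >= self.max_segments:
            self.indexed.extend(self.buffer)   # one bulk insertion
            self.buffer.clear()
            self.flushes += 1

    def search(self, predicate):
        # A query must look at the indexed part and also scan the buffer.
        return [s for s in self.indexed + self.buffer if predicate(s)]

node = FlexibleBufferNode(max_segments=4)
for t in range(10):
    node.insert((t, t + 1))               # (start_time, end_time) segments
print("flushes:", node.flushes, "hits:", node.search(lambda s: s[0] >= 7))
```
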

Design and Implementation of a Similarity based Plant Disease Image Retrieval using Combined Descriptors and Inverse Proportion of Image Volumes (Descriptor 조합 및 동일 병명 이미지 수량 역비율 가중치를 적용한 유사도 기반 작물 질병 검색 기술 설계 및 구현)

  • Lim, Hye Jin;Jeong, Da Woon;Yoo, Seong Joon;Gu, Yeong Hyeon;Park, Jong Han
    • The Journal of Korean Institute of Next Generation Computing / v.14 no.6 / pp.30-43 / 2018
  • Many studies have retrieved images using characteristics such as color, shape, and texture, and research on crop disease images is also progressing. In this paper, to help identify diseases occurring in field-grown crops, we propose a similarity-based crop disease retrieval system that uses disease images of horticultural crops. The proposed system improves similarity retrieval performance over existing systems by using combined descriptors rather than a single descriptor, and it applies a weight-based calculation method to provide users with highly readable similarity search results. A total of 13 descriptors were used in combination. For each of six crops, we performed disease retrieval with combined descriptors and selected the combination with the highest average accuracy as that crop's descriptor set. Retrieval results were expressed as percentages using two calculation methods: one based on the ratio of disease names and one based on weights. The method based on the ratio of disease names has the problem that diseases with more images in the database tend to be ranked first; to address this, we used the weight-based calculation method, in which a disease's weight is inversely proportional to the number of images sharing that disease name. We applied test images of each disease to the two calculation methods, measured the classification performance of the retrieval results, and compared the average retrieval performance of the two methods for each crop. For red pepper and apple, the method based on the ratio of disease names performed about 11.89% better on average than the weight-based method. For chrysanthemum, strawberry, pear, and grape, the weight-based method performed about 20.34% better on average than the method based on the ratio of disease names. In addition, the UI/UX of the proposed system was designed for convenience based on feedback from actual users: each screen has a title and description at the top and is configured so that users can conveniently view disease information, and retrieval results display the images and names of similar diseases. The system is implemented to run in web browsers on both PC and mobile environments.
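
Illustrative sketch (not from the paper): the weight-based scoring idea, in which each retrieved image contributes to its disease's score in inverse proportion to the number of database images sharing that disease name, so overrepresented diseases do not dominate the ranking. The similarity values and image counts below are made-up examples.

```python
# Sketch: ratio-based scoring vs. scoring weighted inversely by the number of
# database images per disease name.
from collections import defaultdict

# (disease name, similarity to the query image) for the top retrieved images
retrieved = [("powdery mildew", 0.92), ("powdery mildew", 0.90),
             ("powdery mildew", 0.88), ("anthracnose", 0.91)]

# number of images stored per disease name in the database (assumed)
image_counts = {"powdery mildew": 300, "anthracnose": 40}

def ratio_scores(hits):
    """Naive method: share of retrieved images per disease name."""
    score = defaultdict(float)
    for name, _ in hits:
        score[name] += 1 / len(hits)
    return dict(score)

def weighted_scores(hits, counts):
    """Weight each hit by similarity / (number of images of that disease)."""
    score = defaultdict(float)
    for name, sim in hits:
        score[name] += sim / counts[name]
    total = sum(score.values())
    return {name: s / total for name, s in score.items()}  # normalize to 1

print(ratio_scores(retrieved))                   # count alone dominates
print(weighted_scores(retrieved, image_counts))  # anthracnose now ranks first
```
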

Knowledge graph-based knowledge map for efficient expression and inference of associated knowledge (연관지식의 효율적인 표현 및 추론이 가능한 지식그래프 기반 지식지도)

  • Yoo, Keedong
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.49-71 / 2021
  • Users who intend to utilize knowledge to solve given problems actively proceed by exploring, both across and in sequence, knowledge items related to each other by criteria such as content relevance. A knowledge map is a diagram or taxonomy that gives an overview of the knowledge currently managed in a knowledge base and supports users' knowledge exploration based on the relationships between knowledge items. A knowledge map must therefore be expressed in a networked form by linking related knowledge according to defined relationship types, and should be implemented with technologies or tools specialized in defining and inferring such relationships. To this end, this study suggests a methodology for developing a knowledge graph-based knowledge map using a graph DB, which is known to be well suited to expressing and inferring relationships between the entities stored in a knowledge base. The procedures of the proposed methodology are modeling the graph data; creating nodes, properties, and relationships; and composing knowledge networks by combining the identified links between knowledge items. Among the various graph DBs, Neo4j is used in this study for its credibility and applicability, demonstrated by a wide range of application cases. To examine the validity of the proposed methodology, a knowledge graph-based knowledge map is implemented with the graph DB, and a performance comparison is carried out by applying a previous study's data to check whether this study's knowledge map yields the same level of performance. The previous study built a process-based knowledge map using ontology technology, identifying links between related knowledge items based on the sequences of tasks that produce knowledge or are activated by it. Since a task is activated by knowledge as input and also produces knowledge as output, input and output knowledge are linked as a flow through the task; and since a business process is composed of affiliated tasks that fulfill its purpose, the knowledge network within a business process can be derived from the sequence of its tasks. Therefore, using Neo4j, processes, tasks, and knowledge, as well as the relationships among them, are defined as nodes and relationships so that knowledge links can be identified from task sequences. The resulting knowledge network, aggregated from the identified knowledge links, is a knowledge map with the functionality of a knowledge graph, and its performance is tested against the validation results of the previous study. The performance test examines two aspects, the correctness of knowledge links and the possibility of inferring new types of knowledge: the former is examined using seven questions, and the latter is checked by extracting two new types of knowledge. As a result, the knowledge map constructed through the proposed methodology shows the same level of performance as the previous one while handling knowledge definition and relationship inference more efficiently.
Furthermore, compared with the previous study's ontology-based approach, this study's graph DB-based approach offers additional benefits: it can intensively manage only the knowledge of interest, dynamically define knowledge and relationships reflecting various meanings from situations to purposes, agilely infer knowledge and relationships through Cypher-based queries, and easily create new relationships by aggregating existing ones. The artifacts of this study can be applied to implement user-friendly knowledge exploration that reflects the user's cognitive process toward associated knowledge, and can further underpin the development of an intelligent knowledge base that expands autonomously by discovering new knowledge and relationships through inference. Beyond these contributions, this study has an immediate effect on implementing the networked knowledge map needed by contemporary users searching for the right knowledge to use.
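
Illustrative sketch (not the paper's actual schema): building a small process/task/knowledge graph with the official Neo4j Python driver and inferring a knowledge-to-knowledge link from the task that connects them, in the spirit of the abstract. The labels, relationship types, and connection details are assumptions.

```python
# Sketch: define Knowledge and Task nodes in Neo4j, link them, and infer a
# direct knowledge-to-knowledge relationship with a Cypher query.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))  # assumed settings

SETUP = [
    "CREATE (:Knowledge {name: 'order form'})",
    "CREATE (:Knowledge {name: 'approval report'})",
    "CREATE (:Task {name: 'approve order'})",
    """MATCH (k:Knowledge {name: 'order form'}), (t:Task {name: 'approve order'})
       CREATE (k)-[:ACTIVATES]->(t)""",
    """MATCH (t:Task {name: 'approve order'}), (k:Knowledge {name: 'approval report'})
       CREATE (t)-[:PRODUCES]->(k)""",
]

# Infer knowledge links: input knowledge flows to output knowledge via a task.
INFER = """
MATCH (a:Knowledge)-[:ACTIVATES]->(:Task)-[:PRODUCES]->(b:Knowledge)
RETURN a.name AS source, b.name AS target
"""

with driver.session() as session:
    for statement in SETUP:
        session.run(statement)
    for record in session.run(INFER):
        print(record["source"], "->", record["target"])
driver.close()
```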