• Title/Summary/Keyword: Multi-Query

Distorted Image Database Retrieval Using Low Frequency Sub-band of Wavelet Transform (웨이블릿 변환의 저주파수 부대역을 이용한 왜곡 영상 데이터베이스 검색)

  • Park, Ha-Joong;Kim, Kyeong-Jin;Jung, Ho-Youl
    • IEMEK Journal of Embedded Systems and Applications / v.3 no.1 / pp.8-18 / 2008
  • In this paper, we propose an efficient wavelet-transform-based algorithm for still image database retrieval. In particular, it uses only the lowest-frequency sub-band of a multi-level wavelet transform, so the retrieval system needs less memory and less processing time. We extract textural features, statistical information such as the mean, variance, and histogram, from the low-frequency sub-band, and measure the distances between the query image and the database images in terms of these features. To obtain good retrieval performance, we first use the mean and variance of the wavelet coefficients to filter out most of the unlikely images; the remaining images are treated as candidates, which are then ranked using the histogram of the wavelet coefficients. To evaluate the algorithm, we built several distorted image databases from MIT VisTex texture images and PICS natural images. Simulations show that the method performs satisfactorily in terms of retrieval accuracy as well as memory requirement and computational complexity, so it is expected to provide a good retrieval solution for JPEG-2000, which is also based on the wavelet transform.

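A minimal sketch of the two-stage matching described above, assuming PyWavelets and NumPy: coarse filtering on the mean and variance of the lowest-frequency (LL) sub-band, then ranking the surviving candidates by the sub-band histogram. The wavelet, decomposition level, histogram normalization, distance measures, and the `keep` ratio are illustrative placeholders, not the authors' exact settings.

```python
import numpy as np
import pywt

def ll_features(image, wavelet="haar", level=3, bins=32):
    """Mean, variance, and normalized histogram of the lowest-frequency (LL) sub-band."""
    ll = pywt.wavedec2(image, wavelet, level=level)[0]  # approximation coefficients only
    norm = (ll - ll.min()) / (ll.max() - ll.min() + 1e-9)
    hist, _ = np.histogram(norm, bins=bins, range=(0.0, 1.0), density=True)
    return ll.mean(), ll.var(), hist

def retrieve(query_img, database, keep=0.2):
    """Two-stage retrieval: filter by (mean, variance), then rank candidates by histogram."""
    q_mean, q_var, q_hist = ll_features(query_img)
    # Stage 1: coarse distance on mean/variance filters out most of the unlikely images.
    coarse = sorted(database.items(),
                    key=lambda kv: abs(kv[1][0] - q_mean) + abs(kv[1][1] - q_var))
    candidates = coarse[:max(1, int(len(coarse) * keep))]
    # Stage 2: rank the surviving candidates by L1 distance between LL histograms.
    return sorted(candidates, key=lambda kv: np.abs(kv[1][2] - q_hist).sum())

# database maps an image id to its precomputed features, e.g.
# database = {name: ll_features(img) for name, img in images.items()}
# ranked = retrieve(query_image, database)
```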

Multi-dimensional Traveling salesman problem using Top-n Skyline query (Top-n 스카이라인 질의를 이용한 다차원 외판원 순회문제)

  • Jin, ChangGyun;Yang, Sevin;Kang, Eunjin;Kim, JiYun;Kim, Jongwan;Oh, Dukshin
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.371-374 / 2019
  • Location-based services, which use multi-attribute data on PDAs or mobile phones to provide users with the information they need, are employed in logistics/transportation information services and bus/subway route guidance services. If the data provided by such services were used in the Traveling Salesman Problem (TSP), which finds an optimal route, more accurate route services would be possible. However, because the number of comparisons in TSP algorithms grows exponentially with the number of data items, battery constraints make them impractical on ordinary mobile devices. To address this drawback, this paper proposes an optimal-route algorithm over n-dimensional attributes that uses a skyline query to reduce the set of optimal-route candidates. Experiments showed the usefulness of the proposed method in terms of accuracy and error rate, and demonstrated the efficiency of the multi-dimensional approach by comparing its computation time with that of the existing method.
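
A rough sketch of the pruning idea from the abstract above: a skyline (Pareto-dominance) query first removes dominated points, and the exact tour search then runs only over the surviving candidates. The dominance rule (smaller is better in every attribute), the brute-force tour search, and the distance function are illustrative assumptions, not the paper's algorithm.

```python
from itertools import permutations
import math

def dominates(a, b):
    """a dominates b if a is no worse in every attribute and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    """Keep only points not dominated by any other point (the candidate set)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]

def tour_length(order, dist):
    return sum(dist(order[i], order[(i + 1) % len(order)]) for i in range(len(order)))

def tsp_on_skyline(points, dist):
    """Brute-force TSP, but only over the skyline candidates to cut the comparisons."""
    candidates = skyline(points)
    best = min(permutations(candidates), key=lambda order: tour_length(order, dist))
    return best, tour_length(best, dist)

# Example: each point carries (distance, cost, waiting time); Euclidean distance on the
# first two attributes stands in for the travel distance between stops.
pts = [(2, 5, 1), (3, 1, 4), (1, 2, 2), (4, 4, 5), (2, 2, 3)]
euclid = lambda a, b: math.dist(a[:2], b[:2])
print(tsp_on_skyline(pts, euclid))
```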

Spanning Tree Aggregation Using Attribute of Service Boundary Line (서비스경계라인 속성을 이용한 스패닝 트리 집단화)

  • Kwon, So-Ra;Jeon, Chang-Ho
    • The KIPS Transactions:PartC / v.18C no.6 / pp.441-444 / 2011
  • In this study, we present a method for efficiently aggregating network state information; it is especially useful for aggregating links that carry both delay and bandwidth in an asymmetric network. The proposed method reduces the information distortion of logical links by measuring the similarity of the logical links, grouping them, and then integrating each group during a multi-level topology transformation that lowers the space complexity. It is applied to transform a full-mesh topology, whose logical links are Service Boundary Lines (SBLs), into a spanning tree topology. Simulation results show that both the accuracy of the aggregated information and the accuracy of query responses are higher than those of other known methods.
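
One way to picture the aggregation step, collapsing a full mesh of logical links that carry both delay and bandwidth into a spanning tree that answers state queries, is sketched below with networkx. The scalar delay/bandwidth weight, the example topology, and the omission of the SBL similarity-grouping step are simplifying assumptions for illustration only.

```python
import networkx as nx

# Full-mesh topology: each logical link carries (delay, bandwidth).
mesh = nx.Graph()
links = [("A", "B", 4, 100), ("A", "C", 2, 40), ("A", "D", 7, 80),
         ("B", "C", 3, 60), ("B", "D", 5, 90), ("C", "D", 6, 50)]
for u, v, delay, bw in links:
    # One scalar weight combining both metrics (low delay and high bandwidth preferred);
    # the paper groups similar SBLs before integrating them, which is omitted here.
    mesh.add_edge(u, v, delay=delay, bandwidth=bw, weight=delay / bw)

# Aggregate the mesh into a spanning tree advertised as the condensed state information.
tree = nx.minimum_spanning_tree(mesh, weight="weight")

# Answer a QoS query (additive delay, bottleneck bandwidth) on the aggregated topology.
path = nx.shortest_path(tree, "A", "D")
edges = list(zip(path, path[1:]))
print(path,
      sum(tree[u][v]["delay"] for u, v in edges),
      min(tree[u][v]["bandwidth"] for u, v in edges))
```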

A Multi-Stage Approach to Secure Digital Image Search over Public Cloud using Speeded-Up Robust Features (SURF) Algorithm

  • AL-Omari, Ahmad H.;Otair, Mohammed A.;Alzwahreh, Bayan N.
    • International Journal of Computer Science & Network Security / v.21 no.12 / pp.65-74 / 2021
  • Digital image processing and retrieval have become increasingly popular on the Internet and are receiving growing attention from various multimedia fields, which places additional privacy requirements on efficient image matching techniques. Several search methods have therefore been developed for cases where confidential images are matched between pairs of security agencies, but most of them are limited either in cost or in precision. This study proposes a secure and efficient method that preserves image privacy and confidentiality between two communicating parties. To retrieve an image, a feature vector is extracted from the query image, and its similarities with the feature vectors of the stored database images are computed to retrieve the matched images, based on an indexing scheme and a matching strategy. We use the Speeded-Up Robust Features (SURF) algorithm as a secure content-based image retrieval feature detector over a public cloud, together with the Honey Encryption algorithm. The encrypted image database allows accurate searching over encrypted documents without requiring decryption, and progress in this area helps protect the privacy of sensitive data stored in the cloud. Experimental results on a well-known image set show that the proposed methodology achieves a noticeable improvement in precision, recall, F-measure, and execution time.
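
The retrieval side of the pipeline can be sketched with OpenCV as below: SURF descriptors are extracted and candidate images are ranked by ratio-test matches. SURF lives in the opencv-contrib `xfeatures2d` module (it is patented and disabled in some builds); the Honey Encryption layer and the cloud indexing scheme from the paper are not reproduced, and the Hessian threshold and ratio are generic placeholders.

```python
import cv2

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # requires an opencv-contrib build
matcher = cv2.BFMatcher(cv2.NORM_L2)

def surf_descriptors(path):
    """Extract SURF descriptors from a grayscale image file."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = surf.detectAndCompute(img, None)
    return desc

def match_score(query_desc, db_desc, ratio=0.75):
    """Count matches that pass Lowe's ratio test between query and database descriptors."""
    matches = matcher.knnMatch(query_desc, db_desc, k=2)
    return sum(1 for pair in matches
               if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance)

# db = {name: surf_descriptors(name) for name in image_paths}
# q = surf_descriptors("query.jpg")
# ranked = sorted(db, key=lambda name: match_score(q, db[name]), reverse=True)
```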

A Study on Protecting Privacy of Machine Learning Models

  • Lee, Younghan;Han, Woorim;Cho, Yungi;Kim, Hyunjun;Paek, Yunheung
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.61-63 / 2021
  • Machine learning models have gained popularity in recent years as multinational companies have incorporated machine learning into their services, a model known as machine learning as a service (MLaaS). Such services are charged per query, which motivates adversaries to steal the trained victim model in order to reduce the cost of using the service. It is therefore important for companies that provide MLaaS to protect their intellectual property (IP) against adversaries. There has been an arms race between attacks and defenses in the context of the privacy of machine learning models, and in this paper we provide a comprehensive study of recent developments in protecting the privacy of machine learning models.

A Study on Utilizing Raspberry Pi and Multi-Sensors for Effective Time Management in Shared Spaces (공유 공간에서의 효과적인 사용 시간 관리를 위한 라즈베리 파이와 다중 센서 활용에 관한 연구)

  • Sung Jin Kim;Hyun Bin Jeong;Chae Ryeong Ahn;Hyeon Bin Yang;Da Hyeon Kim;Ju Heon Lee;Jai Soon Baek
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.661-664 / 2023
  • The goal of this project is to develop an automatic time-tracking system that uses a Raspberry Pi and sensor technology to improve the management of customer usage time in shared spaces. The system uses three sensors (a load cell, a vibration sensor, and a LiDAR sensor) to detect whether a user is present on a chair, to distinguish a person from an object, and to determine when the user gets up from the chair. In particular, when a customer sits down, the system automatically starts timing and measures the usage time in real time. The collected information is provided through a web-based user interface, making usage-time management more convenient.

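A minimal sketch of the time-tracking logic described above: a session timer starts when the load cell and LiDAR agree that a person is seated and stops when they leave. The sensor-reading functions `read_load_cell` and `read_lidar_distance` are hypothetical stand-ins for the actual drivers, and the thresholds are illustrative, not the project's values.

```python
import time

SEAT_WEIGHT_KG = 20.0       # below this the chair is treated as empty or holding an object
PERSON_DISTANCE_CM = 80.0   # LiDAR range within which a seated person is expected

def read_load_cell() -> float:
    raise NotImplementedError  # hypothetical wrapper around the load-cell driver

def read_lidar_distance() -> float:
    raise NotImplementedError  # hypothetical wrapper around the LiDAR driver

def occupied() -> bool:
    # Combining weight and distance helps distinguish a person from a bag left on the seat.
    return read_load_cell() > SEAT_WEIGHT_KG and read_lidar_distance() < PERSON_DISTANCE_CM

def track_usage(poll_seconds: float = 1.0):
    """Yield (start, end) timestamps for each continuous period the seat is occupied."""
    start = None
    while True:
        seated = occupied()
        if seated and start is None:
            start = time.time()              # customer sat down: start the clock
        elif not seated and start is not None:
            yield start, time.time()         # customer left: report the session
            start = None
        time.sleep(poll_seconds)
```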

Asynchronous Communication Technique for Heavy Data Output Performance Improvement on Multi Tier Online Service Environment (다중 Tier 온라인 서비스 상에서 대량 데이터 출력 성능 향상을 위한 비동기 통신 기법)

  • Sung-Lyong Kim;Jae-Oh Oh;Yoon-Ho Jo;Sang-Keun Lee
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.1195-1198 / 2008
  • This paper proposes a technique for delivering large volumes of online-service data to clients quickly and accurately across multiple tiers. Processing large amounts of data quickly in an online service with many tiers is difficult: minimizing inter-tier latency, splitting transactions appropriately for the available network bandwidth, and speeding up data conversion between heterogeneous systems are the main issues to resolve. Even when these problems are solved, however, a marked performance improvement does not readily appear, because data transfers driven by partial queries keep being repeated. By the nature of online services, large data sets are transferred in partial (split) units so that many users' transactions can be handled efficiently, and under this scheme the number of repetitions inevitably grows in proportion to the data size. We therefore focused on reducing the number of repetitions and ran performance tests on large-data processing in an online service; the results show that the fewer the repetitions, the better the sustained performance, with a greater effect than improving any other technical factor.
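
The core observation, that repeated synchronous partial-query round trips dominate the cost of delivering a large result set, can be illustrated with a small asyncio sketch in which the page fetches overlap instead of running one after another. `fetch_page`, the page size, and the simulated latency are hypothetical placeholders for the real tier-to-tier call, not the paper's implementation.

```python
import asyncio

PAGE_SIZE = 500

async def fetch_page(offset: int) -> list:
    """Hypothetical stand-in for one partial-query round trip to the data tier."""
    await asyncio.sleep(0.05)                     # simulated per-round-trip latency
    return list(range(offset, offset + PAGE_SIZE))

async def fetch_sequential(total: int) -> list:
    rows = []
    for offset in range(0, total, PAGE_SIZE):     # latency is paid once per page
        rows += await fetch_page(offset)
    return rows

async def fetch_concurrent(total: int) -> list:
    pages = await asyncio.gather(                 # round trips overlap instead of repeating
        *(fetch_page(offset) for offset in range(0, total, PAGE_SIZE)))
    return [row for page in pages for row in page]

# asyncio.run(fetch_concurrent(10_000)) completes in roughly one latency period,
# while fetch_sequential pays the latency once per page.
```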

A Dynamic Locality Sensitive Hashing Algorithm for Efficient Security Applications

  • Mohammad Y. Khanafseh;Ola M. Surakhi
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.79-88 / 2024
  • The information retrieval domain deals with the retrieval of unstructured data such as text documents, and document search is a main component of modern information retrieval systems. Locality Sensitive Hashing (LSH) is one of the most popular methods for searching documents in a high-dimensional space; its main benefit is a theoretical guarantee of query accuracy in a multi-dimensional space. LSH can be improved further by refining its parameter-selection steps. In this paper, a new Dynamic Locality Sensitive Hashing (DLSH) algorithm is proposed as an improved version of LSH. It employs hierarchical selection of the LSH parameters (number of bands, number of shingles, and number of permutation lists) based on the similarity achieved by the algorithm, in order to optimize search accuracy and increase its score. The technique was applied to several tampered file structures and its performance was evaluated. In some circumstances, the matching accuracy of DLSH exceeds 95% with the optimal values selected for the number of bands, the number of shingles, and the number of permutation lists. This makes the DLSH algorithm suitable for many critical applications that depend on accurate searching, such as forensics technology.
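
A compact sketch of the machinery the paper tunes: character shingling, MinHash signatures, and LSH banding, with an outer loop that tries progressively different (shingle size, bands, permutations) settings until the best match reaches a target similarity, standing in for DLSH's hierarchical parameter selection. The hash construction and the parameter ladder are illustrative assumptions, not the published algorithm.

```python
import random

def shingles(text, k):
    """Set of overlapping character k-shingles of a document."""
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def minhash(shingle_set, n_perm, seed=0):
    """MinHash signature; XOR masks stand in for independent hash permutations."""
    rng = random.Random(seed)
    masks = [rng.getrandbits(64) for _ in range(n_perm)]
    return [min(hash(s) ^ m for s in shingle_set) for m in masks]

def bands(signature, n_bands):
    """Split a signature into LSH bands; documents sharing any band become candidates."""
    rows = len(signature) // n_bands
    return {tuple(signature[b * rows:(b + 1) * rows]) for b in range(n_bands)}

def estimated_similarity(sig_a, sig_b):
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def dynamic_search(query, corpus, target=0.95,
                   ladder=((8, 4, 64), (6, 8, 128), (4, 12, 256))):
    """Walk a (shingle k, bands, permutations) ladder until the best match reaches target."""
    best_name, best_score = None, 0.0
    for k, n_bands, n_perm in ladder:
        q_sig = minhash(shingles(query, k), n_perm)
        q_bands = bands(q_sig, n_bands)
        best_name, best_score = None, 0.0
        for name, text in corpus.items():
            sig = minhash(shingles(text, k), n_perm)
            if q_bands & bands(sig, n_bands):          # candidate pair: shares a band
                score = estimated_similarity(q_sig, sig)
                if score > best_score:
                    best_name, best_score = name, score
        if best_score >= target:                       # accuracy reached: stop refining
            break
    return best_name, best_score
```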

Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.17A no.5 / pp.229-236 / 2010
  • With the increasing number of documents on the Internet and in enterprises, it has become crucial to support users' queries on those documents efficiently. Full-text search is generally adopted in this situation because it can answer uncontrolled ad-hoc queries by automatically indexing every keyword found in the documents. However, the index files built for full-text search grow with the number of indexed documents, so the disk cost of processing multi-keyword queries against these enlarged index files can become too large. To solve this problem, we propose an index file structure and a management scheme suited to processing multi-keyword queries over a large volume of index files. We adopt inverted files, which are widely used in multi-keyword search, as the basic index structure and extend them into a hierarchical structure for the join and ranking operations performed during query processing. To save disk cost on top of this index structure, we dynamically keep in main memory the results of join operations between pairs of keywords that are highly likely to appear together in users' queries. Performance comparisons using a disk cost model show the performance advantage of the proposed scheme.
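
A small sketch of the caching idea: a plain inverted index answers multi-keyword (AND) queries by intersecting postings lists, and the intersection of a keyword pair that keeps recurring in queries is kept in main memory so later queries can reuse the precomputed join. The promotion threshold and the flat (non-hierarchical) index layout are simplifications; no disk cost model or eviction policy is modeled.

```python
from collections import defaultdict

class InvertedIndex:
    def __init__(self, cache_after=3):
        self.postings = defaultdict(set)      # keyword -> set of doc ids
        self.pair_cache = {}                  # (kw_a, kw_b) -> cached equi-join result
        self.pair_hits = defaultdict(int)     # how often each keyword pair was queried
        self.cache_after = cache_after        # promote a pair after this many queries

    def add(self, doc_id, text):
        for word in text.lower().split():
            self.postings[word].add(doc_id)

    def _pair(self, a, b):
        """Intersection of two postings lists, served from memory for hot pairs."""
        key = tuple(sorted((a, b)))
        self.pair_hits[key] += 1
        if key in self.pair_cache:
            return self.pair_cache[key]
        joined = self.postings[key[0]] & self.postings[key[1]]
        if self.pair_hits[key] >= self.cache_after:   # hot pair: keep the join in memory
            self.pair_cache[key] = joined
        return joined

    def search(self, keywords):
        """AND query: resolve the first keyword pair via the cache, then intersect the rest."""
        if len(keywords) == 1:
            return self.postings[keywords[0]]
        result = self._pair(keywords[0], keywords[1])
        for kw in keywords[2:]:
            result = result & self.postings[kw]
        return result

# idx = InvertedIndex(); idx.add(1, "wavelet image retrieval"); idx.add(2, "image query index")
# idx.search(["image", "retrieval"])
```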

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple web sources, in order to expand a knowledge base. The methodology consists of the following steps: 1) for "subject-predicate" separated queries, collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news, and classify the suitable documents; 2) determine whether each sentence is suitable for information extraction and derive a confidence score; 3) based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker; the proposed system shows higher performance indices than the baseline model. The contribution of this study is a sequence tagging model based on a bi-directional LSTM-CRF that uses the predicate feature of the query, yielding a robust model that maintains high recall across the various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must account for the heterogeneous characteristics of source-specific document types, and the proposed methodology proved to extract information effectively from such documents compared with the baseline; previous research has the limitation that performance degrades when extracting information from document types that differ from the training data. In addition, by predicting the suitability of documents and sentences before the extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer, so that precision can be maintained even in a real web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, where there is no guarantee that a document contains the correct answer; when question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents with no correct answer. The policy of predicting document and sentence suitability is meaningful in that it helps maintain extraction performance in this setting. The limitations of this study and directions for future research are as follows. First, data preprocessing: the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and extraction results can be degraded when morphological analysis is not performed properly; a more advanced morphological analyzer is needed to improve extraction performance. Second, entity ambiguity: the information extraction system cannot distinguish between different entities that share the same name, so if several people with the same name appear in the news, the system may fail to extract information about the intended one; future research needs measures for disambiguating entities with the same name. Third, evaluation query data: we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker and built an evaluation data set of 800 documents (400 questions * 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether each document contains the correct answer. To ensure the external validity of the study, it is desirable to evaluate the system on more queries, although this is a costly manual activity; future research should evaluate the system on more queries and develop a Korean benchmark data set for information extraction over queries against multi-source web documents, to build an environment in which results can be evaluated more objectively.
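
The tagging backbone described above, a bidirectional LSTM over sentence tokens conditioned on the query's predicate feature, can be sketched in PyTorch as follows. The CRF decoding layer and the document/sentence suitability classifiers are omitted, and the vocabulary sizes, embedding dimensions, and BIO tag set are placeholders rather than the authors' configuration.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Simplified answer-span tagger: BIO tags per token, predicate embedded as extra input."""
    def __init__(self, vocab_size=10_000, n_predicates=200, emb=128, hidden=256, n_tags=3):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb)
        self.pred_emb = nn.Embedding(n_predicates, emb)   # predicate feature of the query
        self.lstm = nn.LSTM(emb * 2, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(hidden * 2, n_tags)          # B / I / O logits (CRF omitted)

    def forward(self, tokens, predicate):
        # tokens: (batch, seq_len) token ids, predicate: (batch,) predicate ids
        pred = self.pred_emb(predicate).unsqueeze(1).expand(-1, tokens.size(1), -1)
        x = torch.cat([self.tok_emb(tokens), pred], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)                                # (batch, seq_len, n_tags)

# Shape check only: a batch of 2 sentences of 12 tokens with one predicate id each.
model = BiLSTMTagger()
logits = model(torch.randint(0, 10_000, (2, 12)), torch.randint(0, 200, (2,)))
print(logits.shape)   # torch.Size([2, 12, 3])
```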