• Title/Summary/Keyword: Probabilistic search

Incorporating Deep Median Networks for Arabic Document Retrieval Using Word Embeddings-Based Query Expansion

  • Yasir Hadi Farhan;Mohanaad Shakir;Mustafa Abd Tareq;Boumedyen Shannaq
    • Journal of Information Science Theory and Practice, v.12 no.3, pp.36-48, 2024
  • The information retrieval (IR) process often encounters a challenge known as query-document vocabulary mismatch, where the terms in user queries do not match the terms in relevant documents, reducing search effectiveness. Automatic query expansion (AQE) techniques aim to mitigate this issue by augmenting user queries with related terms or synonyms. Word embedding, particularly Word2Vec, has gained prominence for AQE due to its ability to represent words as real-number vectors. However, AQE methods typically expand individual query terms, which can lead to query drift if the expansion terms are not carefully selected. To address this, researchers propose utilizing median vectors derived from deep median networks to capture similarity to the query as a whole. Integrating median vectors into candidate term generation and combining them with the BM25 probabilistic model and two IR strategies (EQE1 and V2Q) yields promising results, outperforming baseline methods in experimental settings.
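
The effect of query expansion on BM25 ranking can be illustrated with a minimal sketch. It assumes the common Okapi BM25 formulation with a smoothed IDF and default parameters `k1=1.2`, `b=0.75`; the function name `bm25_scores`, the toy documents, and the expansion term are hypothetical. The median-vector candidate generation from the paper is not reproduced here, so the expansion term is simply supplied by hand.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # a term absent from the document contributes nothing
            # Smoothed IDF (an assumed variant that is always positive).
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores

# Toy corpus (hypothetical): the query word "car" never appears literally.
docs = [
    "the automobile engine was repaired".split(),
    "the weather is sunny today".split(),
]
original = bm25_scores(["car"], docs)
# Hand-picked expansion term standing in for an embedding-based candidate.
expanded = bm25_scores(["car", "automobile"], docs)
```

With the unexpanded query, neither document matches; after expansion, only the first document receives a positive score, which is the vocabulary-mismatch effect the abstract describes.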

Using Skip Lists for Managing Replying Comments Posted on Internet Discussion Boards (스킵리스트를 이용한 인터넷 토론 게시판 댓글 관리)

  • Lee, Yun-Jung;Kim, Eun-Kyung;Cho, Hwan-Gue;Woo, Gyun
    • The Journal of the Korea Contents Association, v.10 no.8, pp.38-50, 2010
  • In recent years, the number of users who actively express their opinions about Internet articles has been growing as the use of online communities such as weblogs and Internet discussion boards increases. In fact, it is not difficult to find an article with hundreds of comments on popular Internet discussion boards. Most weblogs and Internet discussion boards present comments as a list and do not yet support even basic operations such as searching comments. In this paper, we analyzed large sets of comments on an Internet discussion board named AGORA. The results show that the distribution of comment writers follows a power law. We therefore propose a new search structure for comments using skip lists. The main idea of our approach is to reflect the power-law distribution of commenters in the data structure. Our empirical results show that the proposed method searches nodes more efficiently, with fewer comparison operations than log N, which is the theoretical time complexity of general index structures such as B-trees or typical skip lists.
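
For reference, the baseline that the abstract compares against can be sketched as a typical skip list that counts key comparisons during search. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, node levels are drawn by coin flips with `p=0.5`, and the paper's power-law-aware level assignment is not reproduced.

```python
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] is the next node on level i

class SkipList:
    def __init__(self, max_level=16, p=0.5, seed=0):
        self.max_level = max_level
        self.p = p
        self.rng = random.Random(seed)      # seeded for repeatable runs
        self.head = Node(None, max_level)   # sentinel node, holds no key
        self.level = 1                      # number of levels currently in use

    def _random_level(self):
        level = 1
        while level < self.max_level and self.rng.random() < self.p:
            level += 1
        return level

    def insert(self, key):
        # update[i] is the rightmost node on level i whose key is < key.
        update = [self.head] * self.max_level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self.level = max(self.level, level)
        new = Node(key, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, key):
        """Return (found, comparisons), where comparisons counts key comparisons."""
        comparisons = 0
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None:
                comparisons += 1
                if node.forward[i].key < key:
                    node = node.forward[i]
                else:
                    break
        candidate = node.forward[0]
        if candidate is None:
            return False, comparisons
        return candidate.key == key, comparisons + 1

sl = SkipList()
for k in range(1000):
    sl.insert(k)
found, cost = sl.search(512)
missing, _ = sl.search(1000)
```

The `comparisons` counter is the quantity the abstract measures against log N; a frequency-aware variant would change how node levels are chosen, not how the search walks the levels.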

Chaotic particle swarm optimization in optimal active control of shear buildings

  • Gharebaghi, Saeed Asil;Zangooeia, Ehsan
    • Structural Engineering and Mechanics, v.61 no.3, pp.347-357, 2017
  • Applications of active control are becoming more popular. Several control algorithms have been developed to determine the optimum control force. In this paper, a Chaotic Particle Swarm Optimization (CPSO) technique, based on the logistic map, is used to compute the optimum control force of an active tendon system. A chaotic exploration is used to search the solution space for the optimum control force. The response control of Multi-Degree of Freedom (MDOF) shear buildings equipped with active tendons is formulated as an optimization problem based on the Instantaneous Optimal Active Control algorithm. Three MDOF structures are simulated in this paper. Two of the three examples, which were previously controlled using a Lattice-type Probabilistic Neural Network (LPNN) and Block Pulse Functions (BPFs), are taken from prior works in order to compare the efficiency of the current method. In the present study, a maximum allowable value of the control force is added to the original problem. A twenty-story shear building is then considered and controlled as the third and more realistic example. In addition, the Central Processing Unit (CPU) time required by the CPSO control algorithm is investigated. Although the CPU times of the LPNN and BPF methods of prior works are not available, the results show that a full state measurement is necessary, especially when there are more than three control devices. The results also show that the CPSO algorithm performs well, especially in the presence of the cut-off limit on tendon force; it can therefore be widely used in the optimum active control of actual buildings.
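
The role of the logistic map in a chaotic PSO can be sketched as follows. The abstract does not specify how the chaotic sequence enters the algorithm, so this sketch assumes one common choice: the logistic map with `r = 4` replaces the uniform random numbers of a standard PSO. The function names, the sphere test objective, and all parameter values are hypothetical, and the structural control problem itself is not modeled.

```python
from itertools import islice

def logistic_stream(x0=0.7, r=4.0):
    """Yield successive iterates of the logistic map x <- r * x * (1 - x)."""
    x = x0
    while True:
        x = r * x * (1.0 - x)
        yield x

def sphere(position):
    """Toy objective (hypothetical stand-in for the control cost)."""
    return sum(v * v for v in position)

def chaotic_pso(objective, dim=2, n_particles=10, n_iters=50,
                bound=5.0, w=0.7, c1=1.5, c2=1.5):
    stream = logistic_stream()  # one chaotic stream supplies every coefficient
    # Map chaotic values from [0, 1] to initial positions in [-bound, bound].
    pos = [[(2.0 * next(stream) - 1.0) * bound for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_best = gbest_val
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = next(stream), next(stream)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp to the allowed range, analogous to a cut-off limit.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best

sample = list(islice(logistic_stream(), 1000))
best_pos, best_val, initial_best = chaotic_pso(sphere)
```

Because the stream is deterministic for a given `x0`, repeated runs give identical results, which makes this kind of search easy to reproduce.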

Design of RFID Air Protocol Filtering and Probabilistic Simulation of Identification Procedure (RFID 무선 프로토콜 필터링의 설계와 확률적 인식 과정 시뮬레이션)

  • Park, Hyun-Sung;Kim, Jong-Deok
    • The Journal of Korean Institute of Communications and Information Sciences, v.34 no.6B, pp.585-594, 2009
  • Efficient filtering is an important factor in RFID system performance. Because of the huge volume of tag data expected in future ubiquitous environments, system performance degrades significantly if RFID readers transmit tag data to upper-layer applications without filtering. In this paper, we provide an efficient filtering technique that operates on the RFID air protocol. Air protocol filtering between tags and a reader has some advantages over filtering in readers and middleware, because it reduces the volume of filtering work before readers and middleware start filtering. Exploiting this advantage, we introduce a geometrical algorithm for generating air protocol filters and verify their performance through simulation with analytical time models. Results for a dense RFID reader environment show that the air protocol filtering algorithms reduce the total filtering time by almost half compared with linear search.

Fire-Smoke Detection Based on Video using Dynamic Bayesian Networks (동적 베이지안 네트워크를 이용한 동영상 기반의 화재연기감지)

  • Lee, In-Gyu;Ko, Byung-Chul;Nam, Jae-Yeol
    • The Journal of Korean Institute of Communications and Information Sciences, v.34 no.4C, pp.388-396, 2009
  • This paper proposes a new fire-smoke detection method that uses features extracted from camera images and a pattern recognition technique. First, moving regions are detected by analyzing the frame difference between two consecutive images, and candidate smoke regions are generated by applying a smoke color model. A smoke region generally has a few characteristics such as similar color, simple texture, and upward motion. From these characteristics, we extract brightness, wavelet high frequency, and motion vector as features. Probability density functions of the three features are also generated using training data. Probabilistic models of the smoke region are then applied to the observation nodes of our proposed Dynamic Bayesian Network (DBN) to account for temporal continuity. The proposed algorithm was successfully applied to various fire-smoke tasks, covering not only forest smoke but also real-world smoke, and showed better detection performance than the previous method.

Reliability Based Design Optimization for the Pressure Recovery of Supersonic Double-Wedge Inlet (이중 쐐기형 초음속 흡입구의 압력회복률에 대한 신뢰성 기반 최적설계)

  • Lee, Chang-Hyuck;Ahn, Joong-Ki;Bae, Hyo-Gil;Kwon, Jang-Hyuk
    • Journal of the Korean Society for Aeronautical & Space Sciences, v.38 no.11, pp.1067-1074, 2010
  • In this study, Reliability Based Design Optimization (RBDO) was performed for a supersonic double-wedge inlet. Considering the uncertainty of the design within a given design space, the pressure recovery was transformed into a probabilistic constraint, while the inlet drag was treated as a deterministic objective function. To reduce the computational analysis cost and to search for a good design space, a Latin hypercube design of experiments and a Kriging model were incorporated before RBDO was performed. Monte Carlo simulation was performed to verify the accuracy of the Advanced First Order Reliability Method (AFORM). The AFORM result agreed very well with the Monte Carlo simulation result. The system reliability was guaranteed by considering the uncertainty of the design variables. RBDO was found to be useful when diverse uncertainties in system design are considered.
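
The idea of checking a first-order reliability estimate against Monte Carlo simulation can be sketched on a toy problem. This is not the paper's inlet model or its AFORM implementation: it assumes a linear limit state `g = capacity - demand` with independent normal variables, for which the first-order estimate of the failure probability, Φ(-β), is exact. All names and numbers are hypothetical.

```python
import math
import random

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def analytic_failure_probability(mu_r, sigma_r, mu_s, sigma_s):
    # Reliability index for g = R - S with independent normal R and S.
    beta = (mu_r - mu_s) / math.sqrt(sigma_r ** 2 + sigma_s ** 2)
    return normal_cdf(-beta)

def monte_carlo_failure_probability(mu_r, sigma_r, mu_s, sigma_s,
                                    n_samples=100_000, seed=0):
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    failures = 0
    for _ in range(n_samples):
        capacity = rng.gauss(mu_r, sigma_r)
        demand = rng.gauss(mu_s, sigma_s)
        if capacity - demand < 0.0:  # constraint violated
            failures += 1
    return failures / n_samples

# Hypothetical values chosen so that beta = (5 - 3) / sqrt(0.36 + 0.64) = 2.
p_exact = analytic_failure_probability(5.0, 0.6, 3.0, 0.8)
p_mc = monte_carlo_failure_probability(5.0, 0.6, 3.0, 0.8)
```

In an RBDO loop, a probabilistic constraint would require a probability of this kind to stay below a target; the Monte Carlo estimate serves only as an independent check on the cheaper analytic estimate.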

A Stochastic Transit Assignment Model for Intercity Rail Network (지역간 철도의 확률적 통행배정모형 구측 연구)

  • Kwon, Yong-Seok;Kim, Kyoung-Tae;Lim, Chong-Hoon
    • Journal of the Korean Society for Railway, v.12 no.4, pp.488-498, 2009
  • The characteristics of an intercity rail network differ from those of a public transit network in an urban area. In this paper, we propose a new transit assignment model that generalizes the deterministic assignment model by introducing a line selection probability on each route section. This model considers various characteristics of intercity rail and simplifies network expansion so that search algorithms developed for road assignment models can be applied. We demonstrate the applicability of the model by comparing it with existing models on virtual networks. Tests on a small-scale network show that this model is superior to existing models for predicting intercity rail demand.

Efficient Management of Statistical Information of Keywords on E-Catalogs (전자 카탈로그에 대한 효율적인 색인어 통계 정보 관리 방법)

  • Lee, Dong-Joo;Hwang, In-Beom;Lee, Sang-Goo
    • The Journal of Society for e-Business Studies, v.14 no.4, pp.1-17, 2009
  • E-Catalogs, which describe products or services, are among the most important data for electronic commerce. E-Catalogs are created, updated, and removed in order to keep the information in an e-Catalog database up to date. However, as the number of catalogs increases, information integrity is violated for several reasons, such as catalog duplication and abnormal classification. Catalog search, duplication checking, and automatic classification are important functions for utilizing e-Catalogs and keeping the integrity of the e-Catalog database. To implement these functions, probabilistic models that use statistics of index words extracted from e-Catalogs have been suggested, and the feasibility of these methods has been shown in several papers. However, even though these functions are used together in an e-Catalog management system, there has not been enough consideration of how to share the common data used by each function and how to manage the statistics of index words effectively. In this paper, we suggest a method to implement these three functions using simple SQL supported by relational database management systems. In addition, we use materialized views to reduce the load of implementing an application that manages the statistics of index words. This makes managing the statistics more efficient by letting the database management system optimize statistics updates. An empirical evaluation shows that our method is feasible for implementing the three functions and effective for managing the statistics of index words.

Enhancing the Satisfaction Value of User Group Using Meteorological Forecast Information: Focused on the Precipitation Forecast (기상예보 정보 사용자 그룹의 만족가치 제고 방안: 강수예보를 중심으로)

  • Kim, In-Gyum;Jung, Jihoon;Kim, Jeong-Yun;Shin, Jinho;Kim, Baek-Jo;Lee, Ki-Kwang
    • The Journal of the Korea Contents Association, v.13 no.11, pp.382-395, 2013
  • The providers of meteorological information want to know how satisfied forecast users are with their services. To provide better service, the meteorological communities of each nation administer surveys on the satisfaction of forecast users. However, most researchers have given these users simple questionnaires in which respondents choose one answer among different satisfaction levels. As a result, this kind of survey has low explanatory power and is difficult to use in developing a forecast service strategy. In this study, instead of the cost-loss concept, we applied a satisfaction-dissatisfaction concept to the $2 \times 2$ contingency table, which is a useful tool for evaluating the value of a forecast, and estimated the satisfaction value of 24-hour precipitation forecasts in Shanghai, China and Seoul, Korea. Moreover, not only the individual satisfaction value of the forecast but also the user group's satisfaction value was evaluated. The results indicate that enhancing forecast accuracy is effective for improving the satisfaction value of the deterministic forecast user group, whereas for probabilistic forecasts it is important to know the user group's level of dissatisfaction and the distribution of the probability thresholds of forecast users. These results can help meteorological communities search for solutions that provide better satisfaction value to forecast users.