• Title/Summary/Keyword: Web search engines

A Search Efficiency Improvement Method using Internal Contiguity in Query Terms (질의 내부 단어 인접도를 이용한 검색 효율 향상 기법)

  • Yoon, Soung-Woong;Chae, Jin-Ki;Lee, Sang-Hoon
    • Journal of KIISE:Databases / v.35 no.2 / pp.192-198 / 2008
  • It is difficult to find relevant information in the vast amount of Web data. Search engines summarize and store Web information and present ranked lists in response to user queries, with rankings shaped by relative importance and user adaptation. However, they are limited in their ability to place the information the user intended at the top. User intention is generally expressed within the query itself. In this paper, we propose a method that selectively ranks up user-intended search results by weighting the internal contiguity of query terms. In our experiments, this simple method alone finds user-intended results with 75.8% probability, and the proposed reranking outperforms the ordinary case by 13% to 20%.
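
The reranking idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the scoring below is an assumption: each pair of adjacent query terms that also appears adjacent in a document adds a fixed bonus to that document's base score. The names `contiguity_score`, `rerank`, and `weight` are hypothetical.

```python
def contiguity_score(query_terms, doc_tokens):
    """Count adjacent query-term pairs that also appear adjacent in the document."""
    doc_bigrams = set(zip(doc_tokens, doc_tokens[1:]))
    query_bigrams = zip(query_terms, query_terms[1:])
    return sum(1 for pair in query_bigrams if pair in doc_bigrams)


def rerank(query, results, weight=1.0):
    """Rerank (text, base_score) pairs by base score plus a contiguity bonus.

    The additive bonus is an assumed weighting, not the paper's formula.
    """
    q = query.lower().split()
    scored = []
    for text, base in results:
        bonus = weight * contiguity_score(q, text.lower().split())
        scored.append((text, base + bonus))
    # sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical search results with made-up base scores.
results = [
    ("engine for web search and more", 0.9),
    ("a web search engine survey", 0.8),
]
reranked = rerank("web search engine", results)
```

In this toy data the second document contains both adjacent query pairs ("web search" and "search engine"), so it moves ahead of the first document despite its lower base score.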

Semantic Conceptual Relational Similarity Based Web Document Clustering for Efficient Information Retrieval Using Semantic Ontology

  • Selvalakshmi, B;Subramaniam, M;Sathiyasekar, K
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.9 / pp.3102-3119 / 2021
  • In the rapidly growing web era, web publication centers on access to web resources. As the web grows, search engines face many challenges in indexing web pages and in producing results for user queries. Methods for clustering web documents discussed in the literature struggle to achieve high clustering accuracy. The proposed scheme, a Semantic Conceptual Relational Similarity (SCRS) based clustering algorithm, mitigates this problem by measuring similarity from the relationships of a document in two ways. The first is the number of semantic relations of a document class that the input document covers; the second is the number of conceptual relations the input document covers towards a document class. Given a data set Ds, the method estimates the SCRS measure of each document Di towards the available document classes. The class with the maximum SCRS is identified, and the document is indexed under that class. The SCRS measure is computed according to the semantic relevance of the input document to each document of a class. Similarly, a Query Relational Semantic Score (QRSS) is measured for the input query towards each class of documents. Based on the QRSS value, the document class is identified, and its documents are retrieved and ranked by QRSS to produce the final population. In both cases, the semantic measures are estimated from the concepts available in a semantic ontology. The proposed method produces efficient indexing results and also improves search efficiency.

Design and Implementation of Web Directory Engine Using Dynamic Category Hierarchy (동적분류에 의한 주제별 웹 검색엔진의 설계 및 구현)

  • Choi Bum-Ghi;Park Sun;Park Tae-Su;Song Jae-Won;Lee Ju-Hong
    • Journal of Internet Computing and Services / v.7 no.2 / pp.71-80 / 2006
  • Web search engines use two main methods: directory searching and keyword searching. Keyword searching has a high recall rate but tends to return too many results for users to find the pages they want. Directory searching has its own difficulty: users who do not know the exact category may select an improper one and fail to find the pages they want. That is, it shows high precision but low recall. We designed and implemented a new web search engine to resolve the problems of the directory search method. It regards a category as a fuzzy set of keywords and calculates the degree of inclusion between categories. The merit of this method is that it enhances the recall rate of directory searching by expanding subcategories on the basis of similarity.
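
As a rough illustration of treating categories as fuzzy sets of keywords, the sketch below computes a degree of inclusion between two categories. The abstract does not state which inclusion measure the authors use; the formula here (sum of minimum memberships divided by the total membership of the first set) is a common fuzzy subsethood measure and is an assumption, as are the category contents and the `threshold` parameter.

```python
def inclusion_degree(a, b):
    """Degree to which fuzzy set `a` is included in fuzzy set `b`.

    Each set maps keyword -> membership degree in [0, 1].
    Assumed measure: sum(min(a_k, b_k)) / sum(a_k).
    """
    total = sum(a.values())
    if total == 0:
        return 0.0
    overlap = sum(min(w, b.get(k, 0.0)) for k, w in a.items())
    return overlap / total


def expand_subcategories(target, categories, threshold=0.7):
    """Return names of categories that are largely included in `target`."""
    return sorted(
        name
        for name, keywords in categories.items()
        if inclusion_degree(keywords, target) >= threshold
    )


# Hypothetical categories for illustration only.
sports = {"football": 1.0, "league": 0.5, "score": 0.5}
categories = {
    "soccer": {"football": 1.0, "league": 0.5},
    "cooking": {"recipe": 1.0, "score": 0.25},
}
expanded = expand_subcategories(sports, categories)
```

Here "soccer" is fully included in "sports" and would be offered as an expanded subcategory, while "cooking" overlaps only slightly and is left out. Expanding a directory lookup this way is one plausible route to the higher recall the abstract describes.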

Personalized Bookmark Search Word Recommendation System based on Tag Keyword using Collaborative Filtering (협업 필터링을 활용한 태그 키워드 기반 개인화 북마크 검색 추천 시스템)

  • Byun, Yeongho;Hong, Kwangjin;Jung, Keechul
    • Journal of Korea Multimedia Society / v.19 no.11 / pp.1878-1890 / 2016
  • Web 2.0 is characterized by content produced through user participation and sharing. Content production has become more active since the appearance of social network services. The social bookmark, one kind of social network service, lets users store useful content and share bookmarked content with other users. Unlike Internet search engines such as Google and Naver, content stored in a social bookmark service is searched based on tag keyword information, so unnecessary information can be excluded. Social bookmarks give users access to selected content. However, quick access to the content users want is difficult because the content is produced through user participation and sharing. Our paper proposes a method that recommends search words for quick access to content. The method uses collaborative filtering and the Jaccard similarity coefficient. The performance of the proposed system is verified by experiments comparing it with 'Delicious' and 'Feeltering'.
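
The combination of collaborative filtering and the Jaccard similarity coefficient mentioned above can be sketched as follows. This is a simplified user-based scheme under assumed data, not the authors' system: users are compared by the overlap of their tag sets, and tags from similar users are suggested as search words. The function names and the sample users are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_tags(target_user, user_tags, top_n=3):
    """Recommend tags the target user lacks, weighted by user similarity."""
    own = user_tags[target_user]
    scores = {}
    for user, tags in user_tags.items():
        if user == target_user:
            continue
        sim = jaccard(own, tags)
        for tag in tags - own:
            scores[tag] = scores.get(tag, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked[:top_n] if score > 0]


# Hypothetical bookmark tags per user.
user_tags = {
    "alice": {"python", "search", "ranking"},
    "bob": {"python", "search", "indexing"},
    "carol": {"cooking", "travel"},
}
suggestions = recommend_tags("alice", user_tags)
```

With this data, only "bob" shares tags with "alice" (similarity 0.5), so his unseen tag "indexing" is the single suggestion; tags from the unrelated user score zero and are dropped.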

A Preliminary Examination on the Multimedia Information Needs and Web Searches of College Students in Korea

  • Chung, Eun-Kyung
    • Journal of the Korean Society for Library and Information Science / v.44 no.4 / pp.95-114 / 2010
  • Multimedia searching is an important activity on the Web, especially among the younger generation. This study examines college students' multimedia information needs and searching on the Internet. While there is a clear pattern among students with respect to their multimedia uses, searching sources, relevance criteria and searching barriers, some differences exist across multimedia types such as image, audio and video. For multimedia uses, information/data-focused uses are frequently found for image and video, while audio is mainly used for object-focused searches. As for searching sources, audio and video files present a similar pattern: high use of media-specific searching sources and low use of generic search engines. Browsing through related blogs and homepages is an important part of searching for media files, accounting for approximately 20% of total searches for each media type. The relevance criteria used by study participants when searching for image files were primarily concerned with topicality, while contextual and media quality criteria are also considered important for audio and video. Searching barriers for audio and video files fall into three broad categories: access and search quality, preview limitations, and collection limitations. Obstacles to searching for image files include access difficulties and the low quality of various collections.

An Evaluation of Twitter Ranking Using the Retweet Information (재전송 정보를 활용한 트위터 랭킹의 정확도 평가)

  • Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.17 no.2 / pp.73-85 / 2012
  • Recently, as Social Network Services (SNS) such as Twitter and Facebook have become more popular, much research on them has been actively conducted. However, since SNS were launched only recently, the related research is still in its infancy. In particular, search engines offered by web portals simply show postings in order of upload time. Searching postings on Twitter should differ from web search, which is based on traditional TF-IDF. In this paper, we present a new method for searching and ranking interesting postings on Twitter. The proposed method uses the frequency of retweets as a major factor in estimating the quality of postings. This can be an important criterion since users tend to retweet valuable postings. Experimental results show that the proposed method can be applied successfully in a Twitter search system.
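
The contrast between ordering by upload time and ordering by retweet frequency can be shown with a small sketch. The abstract names retweet frequency as "a major factor" but does not give the full ranking function, so using the retweet count as the sole sort key (with recency as a tie-breaker) is an assumption. The `Posting` class and sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Posting:
    text: str
    timestamp: int  # larger means more recent
    retweets: int


def rank_by_time(postings, query):
    """Baseline: matching postings ordered by upload time, newest first."""
    q = query.lower()
    matches = [p for p in postings if q in p.text.lower()]
    return sorted(matches, key=lambda p: -p.timestamp)


def rank_by_retweets(postings, query):
    """Matching postings ordered by retweet count, then by recency."""
    q = query.lower()
    matches = [p for p in postings if q in p.text.lower()]
    return sorted(matches, key=lambda p: (-p.retweets, -p.timestamp))


# Hypothetical postings for illustration only.
postings = [
    Posting("search engines are fun", 100, 2),
    Posting("new search engines paper", 90, 15),
    Posting("lunch time", 110, 50),
]
by_time = rank_by_time(postings, "search engines")
by_retweets = rank_by_retweets(postings, "search engines")
```

The two orderings differ on this data: the time-based baseline puts the newer posting first, while the retweet-based ranking promotes the older posting that was retweeted more often.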

Clustering Representative Annotations for Image Browsing (이미지 브라우징 처리를 위한 전형적인 의미 주석 결합 방법)

  • Zhou, Tie-Hua;Wang, Ling;Lee, Yang-Koo;Ryu, Keun-Ho
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.62-65 / 2010
  • Image annotations allow users to access a large image database with textual queries. Since the surrounding text of Web images is generally noisy, an efficient image annotation and retrieval system is highly desired, which requires effective image search techniques. Data mining techniques can be adopted to de-noise the search results and extract salient terms or phrases from them. Clustering algorithms make it possible to represent visual features of images with finite symbols. Annotation-based image search engines can obtain thousands of images for a given query, but their results also contain visual noise. In this paper, we present a new algorithm, Double-Circles, that allows a user to remove noisy results and characterize more precise representative annotations. We demonstrate our approach on images collected from Flickr image search. Experiments conducted on real Web images show the effectiveness and efficiency of the proposed model.

The Interactive Voice Services based on VoiceXML (VoiceXML 기반 음성인식시스템을 이용한 서비스 개발)

  • Kim Hak-Gyoon;Kim Eun-Hyang;Kim Jae-In;Koo Myoung-Wan
    • MALSORI / no.43 / pp.113-125 / 2002
  • As there is a need to search Web information via wired or wireless telephones, the VoiceXML Forum was established to develop and promote the Voice eXtensible Markup Language (VoiceXML). VoiceXML simplifies the creation of personalized interactive voice response services on the Web and allows voice and phone access to information on Web sites and in call center databases. It can also use Web-based technologies such as CGI (Common Gateway Interface) scripts. In this paper, we describe TeleGateway, a voice portal service platform we developed based on VoiceXML. It enables integration of voice services with data services using Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines. We also present various services offered on the voice portal.

PDFindexer: Distributed PDF Indexing system using MapReduce

  • Murtazaev, Aziz;Kihm, Jang-Su;Oh, Sangyoon
    • International Journal of Internet, Broadcasting and Communication / v.4 no.1 / pp.13-17 / 2012
  • Indexing converts a raw document collection into an easily searchable representation. Web searching with Google or Yahoo provides subsecond response times, which is made possible by efficient indexing of web pages across the entire Web. The indexing process becomes challenging as the scale grows. Parallel techniques such as the MapReduce framework can assist in efficient large-scale indexing. In this paper we propose PDFindexer, a system for indexing scientific papers in PDF using the MapReduce programming model. Unlike Web search engines, our target domain is scientific papers, which have a pre-defined structure such as title, abstract, sections, and references. Our proposed system parses scientific papers in PDF, recreates their structure, and performs efficient distributed indexing with the MapReduce framework on a cluster of nodes. We provide an overview of the system, its components, and the interactions among them. We discuss some issues related to the design of the system and the use of MapReduce in parsing and indexing large document collections.
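
To make the MapReduce indexing idea concrete, the sketch below builds a field-aware inverted index with explicit map, shuffle, and reduce steps. It runs in a single process and only simulates the programming model; it is not the PDFindexer implementation, does no PDF parsing, and assumes the papers have already been split into fields. All names and sample data are hypothetical.

```python
from collections import defaultdict


def map_phase(doc_id, fields):
    """Map: emit ((term, field), doc_id) for each distinct term in each field."""
    for field, text in fields.items():
        for term in set(text.lower().split()):
            yield (term, field), doc_id


def reduce_phase(key, doc_ids):
    """Reduce: merge all document ids for one (term, field) key into a posting list."""
    return key, sorted(set(doc_ids))


def build_index(papers):
    """Run map, then group by key (the shuffle step), then reduce."""
    grouped = defaultdict(list)
    for doc_id, fields in papers.items():
        for key, value in map_phase(doc_id, fields):
            grouped[key].append(value)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical papers, already parsed into structured fields.
papers = {
    "p1": {"title": "Distributed Indexing", "abstract": "indexing with mapreduce"},
    "p2": {"title": "Web Search", "abstract": "efficient indexing of pages"},
}
index = build_index(papers)
```

Keying the index on (term, field) keeps the paper structure visible at query time, so a lookup such as `index[("indexing", "title")]` can be restricted to titles. In a real cluster the map and reduce functions would run on separate nodes, with the framework handling the grouping step.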

A Semantic Search System based on Basic Ontology of Traditional Korean Medicine (한의 기초 온톨로지 기반 시맨틱 검색 시스템)

  • Kim, Sang-Kyun;Jang, Hyun-Chul;Kim, Jin-Hyun;Kim, Chul;Yea, Sang-Jun;Song, Mi-Young
    • Korean Journal of Oriental Medicine / v.17 no.2 / pp.57-62 / 2011
  • In this paper, we propose a semantic search system that uses a basic ontology for the field of Korean medicine. The basic ontology provides a formalization of the medicinal materials, formulas, and diseases of Korean medicine. Many semantic search systems have been proposed recently. However, they do not support semantic search and reasoning in the domain of Korean medicine because they lack a Korean medicine ontology. Our system provides the semantic search features of semantic keyword recommendation, associated information browsing, and ontology reasoning based on the basic ontology. In addition, it offers ontology search in table and graph form, synonym search, and external Open API support. General search engines usually provide search results for a simple keyword, while our system can also use the ontology to provide information associated with the search results, so that it can recommend more accurate results to users.