• Title/Summary/Keyword: Web Search Engine

A Keyword Search Model based on the Collected Information of Web Users (웹 사용자 누적 사용정보 기반의 키워드 검색 모델)

  • Yoon, Sung-Hee
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.7 no.4 / pp.777-782 / 2012
  • This paper proposes a technique that uses word senses and user feedback to improve web information retrieval performance over retrieval based on ambiguous user queries and indexes. Disambiguating queries by word sense can eliminate irrelevant pages from the search results. We build a word-sense knowledge base and categorize web pages according to the semantic categories of the nouns used as index terms. User feedback, which determines the query sense, together with users' information-seeking behavior on pages, can improve the precision of the retrieval system.

Odysseus/Parallel-OOSQL: A Parallel Search Engine using the Odysseus DBMS Tightly-Coupled with IR Capability (오디세우스/Parallel-OOSQL: 오디세우스 정보검색용 밀결합 DBMS를 사용한 병렬 정보 검색 엔진)

  • Ryu, Jae-Joon;Whang, Kyu-Young;Lee, Jae-Gil;Kwon, Hyuk-Yoon;Kim, Yi-Reun;Heo, Jun-Suk;Lee, Ki-Hoon
    • Journal of KIISE: Computing Practices and Letters / v.14 no.4 / pp.412-429 / 2008
  • As the number of electronic documents increases rapidly with the growth of the Internet, a parallel search engine capable of handling a large number of documents is becoming ever more important. To implement a parallel search engine, we need to partition the inverted index and search through the partitioned index in parallel. There are two methods of partitioning the inverted index: 1) document-identifier based partitioning and 2) keyword-identifier based partitioning. However, each method alone has drawbacks. The former is convenient for inserting documents and has high throughput, but performs poorly for top-k query processing. The latter performs well for top-k query processing, but is inconvenient for inserting documents and has low throughput. In this paper, we propose a hybrid partitioning method to compensate for the drawbacks of each method. We design and implement a parallel search engine that supports the hybrid partitioning method using the Odysseus DBMS tightly coupled with information retrieval capability. We first introduce the architecture of the parallel search engine, Odysseus/Parallel-OOSQL. We then show the effectiveness of the proposed system through systematic experiments. The experimental results show that the query processing time of the document-identifier based partitioning method is approximately inversely proportional to the number of blocks in the partition of the inverted index. The results also show that the keyword-identifier based partitioning method has good performance in top-k query processing. The proposed parallel search engine can be optimized for performance by customizing the methods of partitioning the inverted index according to the application environment. The Odysseus/Parallel-OOSQL parallel search engine is capable of indexing, storing, and querying 100 million web documents per node, or tens of billions of web documents for the entire system.
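
The two base partitioning schemes named in this abstract can be illustrated with a minimal sketch. It is not the Odysseus implementation, and it does not attempt the hybrid method, which the abstract does not detail. All function names, the round-robin assignment rules, and the four sample documents are assumptions made for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each keyword to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def partition_by_document(docs, num_nodes):
    """Document-identifier based: each node indexes only its own documents."""
    shards = [dict() for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        shards[doc_id % num_nodes][doc_id] = text  # assumed assignment rule
    return [build_inverted_index(shard) for shard in shards]

def partition_by_keyword(docs, num_nodes):
    """Keyword-identifier based: each node holds whole posting lists."""
    full_index = build_inverted_index(docs)
    shards = [dict() for _ in range(num_nodes)]
    for i, word in enumerate(sorted(full_index)):
        shards[i % num_nodes][word] = full_index[word]  # assumed assignment rule
    return shards

def search_document_partitioned(shards, word):
    """Every node is queried and the partial posting lists are merged."""
    hits = []
    for shard in shards:
        hits.extend(shard.get(word, []))
    return sorted(hits)

def search_keyword_partitioned(shards, word):
    """Only the node that owns the keyword holds its posting list."""
    for shard in shards:
        if word in shard:
            return shard[word]
    return []

# Hypothetical sample documents.
docs = {
    0: "parallel search engine",
    1: "inverted index partitioning",
    2: "parallel inverted index",
    3: "search engine index",
}
doc_shards = partition_by_document(docs, 2)
kw_shards = partition_by_keyword(docs, 2)
```

In this sketch, inserting a document under document-based partitioning touches one node, whereas under keyword-based partitioning it may touch every node that owns one of the document's keywords. A query, by contrast, fans out to all nodes in the first scheme and to a single node in the second.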

A Study on the Organizing Directory for Internet Directory Search Engines (인터넷 검색엔진의 디렉토리 구성에 관한 연구)

  • 신동민
    • Journal of the Korean Society for Information Management / v.18 no.2 / pp.143-164 / 2001
  • The purpose of this study is to suggest guidelines for organizing and maintaining the subject directories of search engines so that they deliver effective search results for web documents. The study reviews and analyzes several directory search engines to identify their problems, and reviews the literature on classification theory and previous studies. As a result, this study suggests guidelines for preparing a systematic subject directory scheme suited to Internet directory search engines covering general and/or special subject fields.

Dynamic Classification of Categories in Web Search Environment (웹 검색 환경에서 범주의 동적인 분류)

  • Choi Bum-Ghi;Lee Ju-Hong;Park Sun
    • Journal of KIISE: Software and Applications / v.33 no.7 / pp.646-654 / 2006
  • Directory searching and index searching are the two main methods used in web search engines. Most well-known Internet search engines offer both, so that users who are not satisfied with the results of one method can switch to the other. Index searching tends to return too many search results, while directory searching makes it difficult to select proper categories and frequently misleads users into wrong ones. In this paper, we propose a novel method in which a category hierarchy is dynamically constructed. To do this, a category is regarded as a fuzzy set that includes keywords. Similarly extensible subcategories of a category can be found using fuzzy relational products. The merit of this method is that it enhances the recall rate of directory search by expanding subcategories on the basis of similarity.
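
The idea of treating a category as a fuzzy set of keywords can be sketched as follows. The abstract does not say which fuzzy relational product the authors use, so this sketch assumes one common form: a mean-based subproduct with the Kleene-Dienes implication. The category names, membership values, and threshold are hypothetical.

```python
def implication(a, b):
    """Kleene-Dienes fuzzy implication a -> b (an assumed operator choice)."""
    return max(1.0 - a, b)

def inclusion_degree(cat_a, cat_b, keywords):
    """Mean degree to which membership in cat_a implies membership in cat_b."""
    total = sum(
        implication(cat_a.get(k, 0.0), cat_b.get(k, 0.0)) for k in keywords
    )
    return total / len(keywords)

def find_subcategories(categories, parent, threshold):
    """Return categories whose keyword set is largely included in the parent's."""
    keywords = sorted({k for members in categories.values() for k in members})
    return [
        name
        for name, members in categories.items()
        if name != parent
        and inclusion_degree(members, categories[parent], keywords) >= threshold
    ]

# Hypothetical categories: keyword -> membership degree in [0, 1].
categories = {
    "sports": {"ball": 0.9, "team": 0.8, "score": 0.7, "match": 0.8},
    "football": {"ball": 0.9, "team": 0.7, "match": 0.6},
    "finance": {"stock": 0.9, "bank": 0.8, "score": 0.2},
}

subcategories = find_subcategories(categories, "sports", threshold=0.85)
```

Because the mean runs over every keyword, keywords that a candidate category does not contain count as fully satisfied implications and raise its score, so the threshold has to be chosen with the vocabulary size in mind.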

Analysis and Design for the System of Korean Web Document Classification (웹문서분류체계의 분석 및 새로운 설계)

  • Nam Young-Joon
    • Journal of the Korean Society for Library and Information Science / v.32 no.3 / pp.207-230 / 1998
  • Because of the rapid increase in information available through web sites, users are often confused about which web sites they should visit for their information needs. If a web site search engine can classify web sites according to their subjects or topics, it can help users determine which web sites are worth accessing and thus acquire relevant information easily. In this study, I propose a new classification system with a two-level hierarchy and 57 items.

Improving Twitter Search Function Using Twitter API (트위터 API를 활용한 트위터 검색 기능 개선)

  • Nam, Yong-Wook;Kim, Yong-Hyuk
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.3 / pp.879-886 / 2018
  • The basic search engine on Twitter shows not only tweets that contain the search keywords, but also all tweets written by users whose nicknames contain the search keywords. Because tweets unrelated to the search keywords appear in the search results, this is inconvenient for the many users who want to search only for tweets that include the keywords. To resolve this inconvenience, this study improved the search function of Twitter by developing an algorithm that returns only tweets that contain the search keywords. The improved functionality is implemented as a web service using ASP.NET MVC5 and is available to many users. We used a powerful collection method in C# to retrieve the results as objects, which also made it possible to output them ordered by the number of 'retweets' or 'favorites'. We also added an exclusion filter that removes results whose count is less than a given number. By sorting search results by the number of retweets or favorites, users can quickly find opinions that are of interest to many users. It is expected that many users and data analysts will find the developed function convenient for searching on Twitter.
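
The filtering and sorting behavior described in this abstract can be sketched without calling the Twitter API at all. The paper's service is written in C# on ASP.NET MVC5. The Python below is only an illustration, and the `Tweet` class, the `search_tweets` function, its parameters, and the sample tweets are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    author: str
    text: str
    retweets: int
    favorites: int

def search_tweets(tweets, keyword, sort_by="retweets", min_count=0):
    """Keep tweets whose text contains the keyword, drop those below
    min_count, and sort by the chosen counter in descending order."""
    if sort_by not in ("retweets", "favorites"):
        raise ValueError("sort_by must be 'retweets' or 'favorites'")
    key = keyword.lower()
    # Match on the tweet text only, so a keyword in the author name is ignored.
    matched = [t for t in tweets if key in t.text.lower()]
    # Exclusion filter: remove results whose counter is below the threshold.
    matched = [t for t in matched if getattr(t, sort_by) >= min_count]
    return sorted(matched, key=lambda t: getattr(t, sort_by), reverse=True)

# Hypothetical sample data.
tweets = [
    Tweet("coffee_fan", "Nice weather today", 50, 10),
    Tweet("alice", "Best coffee in town", 5, 40),
    Tweet("bob", "Coffee prices are rising", 30, 20),
    Tweet("carol", "I need coffee", 1, 2),
]

by_retweets = search_tweets(tweets, "coffee", sort_by="retweets", min_count=5)
by_favorites = search_tweets(tweets, "coffee", sort_by="favorites")
```

The tweet by `coffee_fan` is dropped in both calls because the keyword appears only in the author name, which is the case the abstract sets out to exclude.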

Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information (웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발)

  • Choi, Youji;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.155-175 / 2017
  • As social data come into the spotlight, mainstream web search engines provide data indicating how many people searched for a specific keyword: web search traffic data. Web search traffic information is an aggregate of the people who searched for a specific keyword. In various areas, web search traffic can be used as a useful variable representing the attention of ordinary users to specific interests. Many studies use web search traffic data to nowcast or forecast social phenomena, such as epidemic prediction, consumer pattern analysis, product life cycles, and financial investment modeling. Web search traffic data have also begun to be applied to predicting inbound tourism. Proper demand prediction is needed because tourism is a high value-added industry that increases employment and foreign exchange earnings. Among tourists, the number of Chinese tourists, known as Youke, is growing continuously; Youke have been the largest inbound tourist group in Korean tourism for many years, and they also rank highest in tourism profit per person. Research into proper demand prediction approaches for Youke is therefore important in both the public and private sectors. Accurate tourism demand prediction is important for efficient decision making with limited resources. This study suggests an improved model that reflects the latest social issues by capturing the attention of groups of individuals. Traveling abroad is generally a high-involvement activity, so potential tourists are likely to search deeply for information about their trips. Web search traffic data present tourists' attention while they prepare for their journeys in an instantaneous and dynamic way. This study therefore attempted to select keywords that potential Chinese tourists are likely to search for on the Internet. Baidu, China's biggest web search engine with a share of over 80%, provides users with access to web search traffic data.
Qualitative interviews with potential tourists helped us understand their information search behavior before a trip and identify the keywords for this study. The selected keywords are categorized into three levels according to how directly they relate to "Korean Tourism". Classifying them into categories helps determine which keywords can explain Youke inbound demand, from the closest category to the farthest. Web search traffic data for each keyword were gathered by a web crawler developed to crawl web search data from Baidu Index. Using the automatically gathered variables, a linear model was designed by multiple regression analysis, which is suitable for operational decision and policy making because the relationships among variables are easy to explain. After the linear regression models were built, a model composed of traditional variables was compared with a model that adds web search traffic variables to the traditional model, in terms of significance and R-squared. After comparing the performance of the models, the final model was composed. The final regression model has improved explanatory power and offers advantages in real-time immediacy and convenience over the traditional model. Furthermore, this study demonstrates a system that is intuitively visualized for general use: the Youke Mining solution has several functions to support tourism decision making, including the embedded final regression model. The Youke Mining solution has an algorithm based on data science and a well-designed, simple interface. Finally, this research has three significant implications, in theoretical, practical, and policy aspects. Theoretically, the Youke Mining system and the model in this research are a first step in Youke inbound prediction using an interactive and instant variable: web search traffic information, which represents tourists' attention while they prepare for their trips. Baidu accounts for more than 80% of the web search engine market.
Practically, Baidu data can represent in real time the attention of potential tourists who are preparing their own tours. Finally, in terms of policy, the Chinese tourist demand prediction model based on web search traffic can be used in tourism decision making for efficient resource management and for optimizing opportunities for successful policy.
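
The model comparison described in this abstract, a baseline regression against one augmented with a search-traffic variable and judged by R-squared, can be sketched in plain Python. This is not the authors' model: the variable names are hypothetical and every number below is made up purely so the code runs. It reproduces no result from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ...; returns (coefficients, R^2)."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)  # normal equations
    pred = [sum(c * v for c, v in zip(coef, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot

# Made-up illustrative series (not data from the paper).
traditional = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
search_traffic = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
arrivals = [21.5, 16.8, 28.3, 20.9, 35.2, 49.1, 29.7, 44.4]

_, r2_baseline = fit_ols([[t] for t in traditional], arrivals)
_, r2_augmented = fit_ols(list(zip(traditional, search_traffic)), arrivals)
```

R-squared never decreases when a variable is added to a nested least-squares model, so a higher value alone does not show that the new variable helps. This is presumably why the abstract also compares the models by significance.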

An Implementation and Design Web-Based Instruction-Learning System Using Web Agent (웹 에이전트를 이용한 웹기반 교수-학습 시스템의 설계 및 개발)

  • Kim, Kap-Su;Lee, Keon-Min
    • Journal of The Korean Association of Information Education / v.5 no.1 / pp.69-78 / 2001
  • Computer-based learning is currently moving from the CAI environment to the WBI environment. Most web documents for WBI learning are collected with the aid of search engines. Instructors use those documents as learning materials after evaluating the usability of the retrieved web documents. However, this method has the following problems. First, the web documents selected by the instructor must be searched for repeatedly. Second, a separate instructional design step is needed in order to suggest the web documents to learners. Third, it is very difficult to analyze the relevance between the web documents and test results. In this work, we propose WAILS (Web Agent Instruction Learning System), which retrieves web documents for WBI learning and guides learners through the learning course. WAILS collects web documents for WBI learning with the aid of a web agent. Instructors can then evaluate them and suggest them to learners using the instruction-learning generating machine. Instructors retrieve web documents and carry out the instruction-learning design at the same time, which can facilitate WBI learning.

A Survey of Information Searches on Internet (인터넷에서 정보 탐색에 대한 연구 조사)

  • 강병주;백혜승;최기선
    • Proceedings of the Korean Society for Information Management Conference / 1997.08a / pp.37-53 / 1997
  • The huge size of the Internet does not allow ordinary information seekers to search for information with ease. It is now almost impossible to navigate the ocean of information without effective search tools. The web search engine has been the most effective technology for information retrieval on the WWW. Recently, however, the need for new search tools on the WWW and the Internet has increased drastically, and there is much ongoing research on related topics. In this survey, we categorize the new search tools into four types: monitoring systems, filtering systems, browsing assistant systems, and recommending systems. Example systems of each type are examined. We are especially interested in WWW information filtering. We study how to apply information filtering techniques to the WWW; the application is not as straightforward as for email or newswire filtering systems. As a result of this study, a simple WWW information filtering system is proposed.

A Study of Personalized Information Retrieval (개인화 정보 검색에 대한 연구)

  • Kim, Tae-Hwan;Jeon, Ho-Chul;Choi, Joong-Min
    • 한국HCI학회:학술대회논문집 / 2008.02a / pp.683-687 / 2008
  • Many researchers have implemented search algorithms for the World Wide Web. One of the best algorithms is Google's, which uses PageRank technology. The PageRank approach computes the number of inlinks of each document and presents documents in descending order of inlink count. However, it is difficult to find the results that an individual user needs, because this method finds documents that are valuable to the public rather than to a particular person. To solve this problem, this paper proposes a personalized search engine that combines public and personal value.
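
Combining a public score with a personal one can be sketched as a weighted blend. The abstract does not give the authors' formula, so the linear combination, the `weight` parameter, and all names and sample values below are assumptions. The public score follows the abstract's simplified description (inlink counts) rather than the full PageRank computation.

```python
def inlink_scores(links):
    """Public score: number of inlinks per page, scaled to [0, 1]."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    top = max(counts.values()) or 1  # avoid dividing by zero
    return {page: n / top for page, n in counts.items()}

def personalized_ranking(links, personal, weight=0.5):
    """Rank pages by (1 - weight) * public score + weight * personal score."""
    public = inlink_scores(links)
    combined = {
        page: (1 - weight) * public[page] + weight * personal.get(page, 0.0)
        for page in public
    }
    # Highest combined score first; ties are broken by page name.
    return sorted(combined, key=lambda page: (-combined[page], page))

# Hypothetical link graph (page -> pages it links to) and user interest scores.
links = {
    "news": ["wiki"],
    "blog": ["wiki", "news"],
    "wiki": ["news"],
    "recipes": ["wiki"],
}
personal = {"recipes": 1.0, "news": 0.2}

public_only = personalized_ranking(links, personal, weight=0.0)
personalized = personalized_ranking(links, personal, weight=0.7)
```

With `weight=0.0` the ranking reduces to the public ordering, and raising the weight lets a page that few others link to, but that this user cares about, move to the top.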
