• Title/Summary/Keyword: Search engines

Implementation of Search Engine to Minimize Traffic Using Blockchain-Based Web Usage History Management System

  • Yu, Sunghyun;Yeom, Cheolmin;Won, Yoojae
    • Journal of Information Processing Systems / v.17 no.5 / pp.989-1003 / 2021
  • With the recent increase in the types of services provided by Internet companies, collection of various types of data has become a necessity. Data collectors corresponding to web services profit by collecting users' data indiscriminately and providing it to the associated services. However, the data provider remains unaware of the manner in which the data are collected and used. Furthermore, the data collector of a web service consumes web resources by generating a large amount of web traffic. This traffic can damage servers by causing service outages. In this study, we propose a website search engine built on a system that manages user information using blockchains and constructs its database from the recorded information. The system is divided into three parts: a collection section that uses a proxy, a management section that uses blockchains, and a search engine that uses a built-in database. This structure allows data sovereigns to manage their data more transparently. Search engines that use blockchains do not use internet bots and instead use the data generated by user behavior. This avoids the traffic generated by internet bots and can thereby contribute to a better web ecosystem.

RCGA-Based Tuning of the PID Controller for Marine Gas Turbine Engines (RCGA에 기초한 선박 가스터빈 엔진용 PID 제어기의 동조)

  • So Myung-Ok;Jung Byung-Gun;Jin Gang-Gyoo;Jin Sun-Ho;Lee Yun-Hyung
    • Journal of Advanced Marine Engineering and Technology / v.29 no.1 / pp.116-123 / 2005
  • PID controllers have been widely accepted in many industrial systems due to their robust performance over a wide range of operating conditions and their functional simplicity. To implement a PID controller, its three parameters must be determined for the given plant. Conventional tuning methods are mainly based on experience and experiment and lack a systematic procedure. Recently, genetic algorithms have been used to overcome the drawbacks of conventional tuning methods. In this paper, a real-coded genetic algorithm is employed to search for the optimal parameters of the PID controller for speed control of marine gas turbine engines. Simulation results show the effectiveness of the proposed scheme.
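
The tuning idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the plant is a generic first-order lag standing in for the engine speed dynamics, the cost is the integral of absolute error for a unit step, and the genetic operators (tournament selection, arithmetic crossover, Gaussian mutation, elitism) are common textbook choices rather than the authors' exact RCGA.

```python
import random

# Assumed search ranges for (Kp, Ki, Kd); not taken from the paper.
BOUNDS = [(0.0, 10.0), (0.0, 10.0), (0.0, 0.2)]


def step_cost(gains, dt=0.01, steps=300, tau=0.5):
    """Integral of absolute error for a unit step on a hypothetical
    first-order plant tau*dy/dt = -y + u under PID control."""
    kp, ki, kd = gains
    y, integral, prev_err, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(steps):
        err = 1.0 - y
        integral += err * dt
        u = kp * err + ki * integral + kd * (err - prev_err) / dt
        u = max(-50.0, min(50.0, u))      # assumed actuator saturation
        y += dt * (u - y) / tau           # forward Euler step of the plant
        prev_err = err
        cost += abs(err) * dt
    return cost


def tune_pid(pop_size=20, generations=20, seed=0):
    """Real-coded GA: each chromosome is the real vector [Kp, Ki, Kd]."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []                          # best cost seen in each generation
    for _ in range(generations):
        elite = min(pop, key=step_cost)
        history.append(step_cost(elite))
        next_pop = [elite[:]]             # elitism: keep the best unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = min(rng.sample(pop, 2), key=step_cost)
            p2 = min(rng.sample(pop, 2), key=step_cost)
            a = rng.random()
            child = []
            for x, y, (lo, hi) in zip(p1, p2, BOUNDS):
                g = a * x + (1 - a) * y   # arithmetic crossover
                if rng.random() < 0.2:    # Gaussian mutation
                    g += rng.gauss(0.0, 0.05 * (hi - lo))
                child.append(min(hi, max(lo, g)))
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=step_cost)
    return best, step_cost(best), history


if __name__ == "__main__":
    gains, cost, _ = tune_pid()
    print("Kp, Ki, Kd =", [round(g, 3) for g in gains], "IAE =", round(cost, 4))
```

Because the best individual is copied into each new generation, the best cost never increases from one generation to the next; a real study would replace `step_cost` with a simulation of the actual engine model.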

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has gained attention, a lot of research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can produce results that are far from users' intentions. Even though much progress has been made over the last years in enhancing the performance of search engines to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, thereby avoiding expensive sense-tagging processes. It evaluates the effectiveness of the method with the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. For the experiment of this study, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities using cross-validation. Only nouns, the target subjects in word sense disambiguation, were selected. 
93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on the sense indices defined by that dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, term vectors were extracted from the input sentences. Given the extracted term vector and the sense vector model built during the pre-processing stage, the sense-tagged terms were determined by vector space model-based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from the examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of internet search engines and help capture the meaning of a sentence more accurately in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. The Naïve Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. 
Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
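
To make the Naïve Bayes step concrete, the following is a minimal sketch of sense classification from sense-tagged example sentences. It is not the authors' implementation: the English toy examples, the sense labels, the function names, and the add-one (Laplace) smoothing are all assumptions made for illustration, whereas the study itself works on Korean dictionary examples and the Sejong Corpus.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sense-tagged examples for the ambiguous noun "bank".
EXAMPLES = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "interest", "money"]),
    ("bank/river", ["river", "water", "fishing"]),
    ("bank/river", ["muddy", "river", "shore"]),
]


def train(examples):
    """Count how often each context word co-occurs with each sense."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def disambiguate(model, context):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        # Add-one smoothing (an assumption) avoids zero probabilities.
        denom = sum(word_counts[sense].values()) + len(vocab)
        score = math.log(n / total)
        for word in context:
            if word in vocab:             # ignore words never seen in training
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


if __name__ == "__main__":
    model = train(EXAMPLES)
    print(disambiguate(model, ["money", "loan"]))
    print(disambiguate(model, ["river", "water"]))
```

In this toy setting the training examples play the role that dictionary example sentences play in the abstract: they supply sense labels without a separate manual tagging pass.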

A Tensor Space Model based Semantic Search Technique (텐서공간모델 기반 시멘틱 검색 기법)

  • Hong, Kee-Joo;Kim, Han-Joon;Chang, Jae-Young;Chun, Jong-Hoon
    • The Journal of Society for e-Business Studies / v.21 no.4 / pp.1-14 / 2016
  • Semantic search is known as a series of activities and techniques that improve search accuracy by clearly understanding users' search intent without requiring great cognitive effort. Usually, semantic search engines require an ontology and semantic metadata to analyze user queries. However, building a particular ontology and semantic metadata for large amounts of data is a very time-consuming and costly task. This is why semantic search has seen little commercialization. To resolve this problem, we propose a novel semantic search method that takes advantage of our previous semantic tensor space model. Since the model represents each term as a 2nd-order 'document-by-concept' tensor (i.e., matrix) and each concept as a 2nd-order 'document-by-term' tensor, our proposed semantic search method does not require building an ontology. Through extensive experiments using the OHSUMED document collection and SCOPUS journal abstract data, we show that our proposed method outperforms the vector space model-based search method.

Mining Search Keywords for Improving the Accuracy of Entity Search (엔터티 검색의 정확성을 높이기 위한 검색 키워드 마이닝)

  • Lee, Sun Ku;On, Byung-Won;Jung, Soo-Mok
    • KIPS Transactions on Software and Data Engineering / v.5 no.9 / pp.451-464 / 2016
  • Nowadays, entity search services such as Google Product Search and Yahoo Pipes have been in the spotlight. Entity search engines are used to retrieve web pages relevant to a particular entity. However, if an entity (e.g., Chinatown movie) has various meanings (e.g., Chinatown movies, Chinatown restaurants, and Incheon Chinatown), the accuracy of the search results decreases significantly. To address this problem, we propose a novel method that quantifies the importance of search queries and then offers the best query for the entity search, based on the Frequent Pattern (FP)-Tree and considering the correlation between entity relevance and the frequency of web pages. According to the experimental results presented in this paper, the proposed method (59% average precision) improved accuracy by a factor of five compared with the traditional query terms (less than 10% average precision).

A Study on Removal Request of Exposed Personal Information (노출된 개인정보의 삭제 요청에 관한 연구)

  • Jung, Bo-Reum;Jang, Byeong-Wook;Kim, In-Seok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.6 / pp.37-42 / 2015
  • Although online search engine services provide a convenient means to search for information on the World Wide Web, they also pose a risk to privacy. Despite this risk, most users are aware neither that their personal information is exposed in search results nor of how to redress the issue by requesting removal of the information. According to the 2015 parliamentary inspection of government offices, many government agencies were criticized for mishandling personal information and for its leakage on online search engines such as Google. Considering that personal information leakage via online search engines has drawn attention at the government level, the privacy issues of online search engines need to be rectified. This paper, by examining current online search engines, studies the degree of personal information exposure in online search results and its underlying issues. Lastly, based on the research results, the paper provides policy directions for the removal of exposed personal information for search engine service providers and users, respectively.

Searching Patents Effectively in terms of Keyword Distributions (키워드 분포를 고려한 효과적 특허검색기법)

  • Lee, Wookey;Song, Justin Jongsu;Kang, Michael Mingu
    • Journal of Information Technology and Architecture / v.9 no.3 / pp.323-331 / 2012
  • With the advancement of knowledge and information fields, intellectual property, especially patents, has attracted more and more attention. An efficient way of searching patent information has become essential, but the prevailing patent search engines return too much noise in their results because they rely on Boolean models. This has cost professional experts too much time in investigating the results manually. In this paper, we reveal the differences between conventional document search and patent search and analyze the limitations of existing patent search. Furthermore, we propose an approach specialized for patent search, so that the relationships between the keywords within each patent document and their significance with respect to the search keywords can be identified. In turn, the keywords and their relationships are used to place the relevant patents in the upper ranks and the noise in the lower ranks. This approach is therefore proposed to significantly reduce the noise ratio in the search results. Finally, we demonstrate the superiority of the proposed methodology through a comparison using the Kipris dataset.

Does the general public have concerns with dental anesthetics?

  • Razon, Jonathan;Mascarenhas, Ana Karina
    • Journal of Dental Anesthesia and Pain Medicine / v.21 no.2 / pp.113-118 / 2021
  • Background: Over the last two decades, consumers and patients have increasingly turned to internet search engines, including Google, for information. Google Trends records searches done using the Google search engine. Google Trends is free and provides data on search terms and related queries. One recent study found a large public interest in "dental anesthesia". In this paper, we further explore this interest in "dental anesthesia" and assess whether any patterns emerge. Methods: In this study, Google Trends and the search term "dental pain" were used to record consumers' interest over a five-year period. Additionally, using the search term "dental anesthesia," a top ten related query list was generated. Queries are grouped into two sections, a "top" category and a "rising" category. We then added further search terms: wisdom tooth anesthesia, wisdom tooth general anesthesia, dental anesthetics, local anesthetic, dental numbing, anesthesia dentist, and dental pain. From the related queries generated from each search term, repeated themes were grouped together and ranked according to the total sum of their relative search frequency (RSF) values. Results: Over the five-year period, Google Trends data show that there was a 1.5% increase in searches for the term "dental pain". Results of the related queries for dental anesthesia show that there seems to be a large public interest in how long local anesthetics last (Total RSF = 231), even more so than in potential side effects or toxicities (Total RSF = 83). Conclusion: Based on these results, it is recommended that clinicians clearly advise their patients on how long local anesthetics last to better manage patient expectations.
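
The grouping-and-ranking step described in the Methods can be sketched as follows. The query strings, RSF values, and theme assignments below are hypothetical placeholders, not data from the study, and `rank_themes` is an illustrative helper name; the sketch only shows how per-query RSF values could be summed per theme and ranked.

```python
from collections import defaultdict


def rank_themes(related_queries, theme_of):
    """Sum relative search frequency (RSF) per theme and rank themes.

    related_queries: list of (query, rsf) pairs pooled over all search terms.
    theme_of: manual mapping from a query to its theme; unmapped queries
    are skipped.
    """
    totals = defaultdict(int)
    for query, rsf in related_queries:
        theme = theme_of.get(query)
        if theme is not None:
            totals[theme] += rsf
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical related queries and RSF values, for illustration only.
QUERIES = [
    ("how long does dental numbing last", 100),
    ("how long does local anesthetic last", 60),
    ("dental anesthesia side effects", 45),
    ("lidocaine toxicity", 20),
    ("dentist near me", 30),
]
THEMES = {
    "how long does dental numbing last": "duration",
    "how long does local anesthetic last": "duration",
    "dental anesthesia side effects": "side effects or toxicity",
    "lidocaine toxicity": "side effects or toxicity",
}

if __name__ == "__main__":
    for theme, total in rank_themes(QUERIES, THEMES):
        print(f"{theme}: total RSF = {total}")
```

The theme mapping is deliberately manual here, since the abstract describes grouping repeated themes rather than an automated classifier.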

The Relationship between Internet Search Volumes and Stock Price Changes: An Empirical Study on KOSDAQ Market (개별 기업에 대한 인터넷 검색량과 주가변동성의 관계: 국내 코스닥시장에서의 산업별 실증분석)

  • Jeon, Saemi;Chung, Yeojin;Lee, Dongyoup
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.81-96 / 2016
  • As the internet has become widespread and easy to access everywhere, it is common for people to search for information via online search engines such as Google and Naver in everyday life. Recent studies have used the online search volume of specific keywords as a measure of internet users' attention in order to predict disease outbreaks such as flu and cancer, unemployment rates, indices of a nation's economic condition, and so on. For stock traders, web search is also one of the major information resources for obtaining data about individual stock items. Therefore, the search volume of a stock item can reflect the amount of investors' attention on it. Investor attention has been regarded as a crucial factor influencing stock prices, but it has been measured by indirect proxies such as market capitalization, trading volume, and advertising expense. It has been shown theoretically and empirically that an increase in investors' attention on a stock item brings a temporary increase in the stock price and that the price reverts in the long run. Recent developments in the internet environment make it possible to measure investor attention directly by the internet search volume of an individual stock item, which has been used to show attention-induced price pressure. Previous studies focus mainly on the Dow Jones and NASDAQ markets in the United States. In this paper, we investigate the relationship between individual investors' attention, measured by internet search volumes, and the stock price changes of individual stock items in the KOSDAQ market in Korea, where trades by individual investors account for about 90% of the total. In addition, we examine the differences between industries in the influence of investors' attention on stock returns. The internet search volumes of stocks were gathered weekly from the "Naver Trend" service between January 2007 and June 2015. 
A regression model whose error term has an AR(1) covariance structure is used to analyze the data, since the weekly prices of a stock item are systematically correlated. Market capitalization, trading volume, the increment of trading volume, and the month in which each trade occurs are included in the model as control variables. The fitted model shows that an abnormal increase in the search volume of a stock item has a positive influence on the stock return and that the size of the influence varies across industries. Stock items in the IT software, construction, and distribution industries are shown to be more influenced by abnormally large internet search volume than the average across industries. On the other hand, stock items in the IT hardware, manufacturing, entertainment, finance, and communication industries are less influenced by abnormal search volume than the average. In order to verify the price pressure caused by investors' attention in KOSDAQ, the stock return of the current week is modelled using the abnormal search volume observed one to four weeks earlier. On average, an abnormally large increment in search volume increased the stock return in the current week and one week later, and decreased the stock return two and three weeks later. There is no significant relationship with the stock return after four weeks. This relationship differs among industries. An abnormal search volume brings a particularly severe price reversal for stocks in the IT software industry, which are often targets of irrational investments by individual investors. An abnormal search volume caused a less severe price reversal for stocks in the manufacturing and IT hardware industries than the average across industries. No price reversal was observed in the communication, finance, entertainment, and transportation industries, which are known to be influenced largely by macro-economic factors such as oil prices and currency exchange rates. 
The results of this study can be used to construct an intelligent trading system based on the big data gathered from web search engines, social network services, and internet communities. In particular, the difference in the price reversal effect between industries may provide useful information for building a portfolio and an investment strategy.
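
As a rough illustration of regression with AR(1) errors, the sketch below fits a single-regressor model with the Cochrane-Orcutt procedure on synthetic data. This is a stand-in chosen for brevity: the abstract does not state which estimator the authors used, their model includes several control variables, and the simulated series, parameter values, and function names here are all hypothetical.

```python
import random


def ols(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cochrane_orcutt(x, y, iterations=10):
    """Fit y_t = a + b*x_t + u_t with u_t = rho*u_{t-1} + e_t."""
    intercept, slope = ols(x, y)
    rho = 0.0
    for _ in range(iterations):
        resid = [b - intercept - slope * a for a, b in zip(x, y)]
        # Estimate rho by regressing residuals on their one-week lag.
        rho = (sum(r0 * r1 for r0, r1 in zip(resid, resid[1:]))
               / sum(r0 * r0 for r0 in resid[:-1]))
        # Quasi-difference both series to remove the AR(1) correlation.
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, slope = ols(xs, ys)
        intercept = a_star / (1 - rho)
    return intercept, slope, rho


def simulate(n=2000, slope=0.5, rho=0.6, seed=1):
    """Synthetic weekly data: x mimics abnormal search volume, y a return."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    u, y = 0.0, []
    for xt in x:
        u = rho * u + rng.gauss(0, 1)     # AR(1) error term
        y.append(0.1 + slope * xt + u)
    return x, y


if __name__ == "__main__":
    a, b, rho = cochrane_orcutt(*simulate())
    print(f"intercept={a:.3f} slope={b:.3f} rho={rho:.3f}")
```

On the synthetic series the estimated slope and autocorrelation should land near the values used to generate the data; with real weekly stock data, lagged regressors and the control variables named in the abstract would be added as further columns.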

A Retrieval Technique of Personal Information in a Web Environment (웹 환경에서의 개인정보 검색기법)

  • Seo, Young-Duk;Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.4 / pp.145-151 / 2015
  • Since we use the internet every day, internet privacy has become important. We need to find out what kinds of personal information are exposed on the internet and to eliminate the exposed information. However, it is not efficient to search for personal information using only fragmentary clues in web search engines because the ranking results do not reflect the degree of exposure of personal information. In this paper, we introduce a personal information retrieval system and propose a process for easily removing private data from the web. We also compare our proposed method with previous methods by evaluating search performance.