• Title/Summary/Keyword: Web data

A Document Collection Method for More Accurate Search Engine (정확도 높은 검색 엔진을 위한 문서 수집 방법)

  • Ha, Eun-Yong; Gwon, Hui-Yong; Hwang, Ho-Yeong
    • The KIPS Transactions: Part A / v.10A no.5 / pp.469-478 / 2003
  • Internet search engines use web robots that visit servers connected to the Internet, periodically or aperiodically, extract and classify the collected data according to their own methods, and build the databases on which web information search is based. These procedures are repeated very frequently across the Web, and many search engine sites operate them strategically in order to become popular portal sites that guide users to information on the web. A web search engine contacts many thousands of web servers, maintains its existing databases, and crawls newly connected servers for data. These jobs, however, are decided and conducted solely by the search engines: they run web robots to collect data without any knowledge of the state of each web server, issuing large numbers of requests and receiving responses, which is one cause of increased Internet traffic. If instead each web server notified web robots of a summary of its public documents, and each robot used that summary to collect only the corresponding documents, unnecessary Internet traffic would be eliminated, the accuracy of the data held by search engines would rise, and the processing overhead of web-related jobs on both web servers and search engines would fall. In this paper, a monitoring system on the web server is designed and implemented which tracks the state of the server's documents, summarizes the changes to modified documents, and sends the summary information to the web robots that want to collect documents from that server. An efficient web robot for the search engine is also designed and implemented, which uses the notified summary to fetch the corresponding documents from the web servers, extract index terms, and update its databases.
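
The summary-notification scheme described above is not specified in detail in this listing; conceptually it resembles a sitemap with change digests. Below is a minimal sketch under that assumption: a hypothetical /doc-summary.json manifest lists each public document's URL and content hash, and the robot requests only documents whose digest it has not already indexed.

```python
import hashlib
import json
import urllib.request

# Hypothetical manifest format: the server publishes a JSON summary of its
# public documents (URL plus content hash). Names here are illustrative,
# not taken from the paper.
def fetch_manifest(base_url):
    with urllib.request.urlopen(base_url + "/doc-summary.json") as resp:
        return json.loads(resp.read())

def crawl_changed(base_url, known_hashes):
    """Fetch only documents whose summary digest differs from what we indexed."""
    for entry in fetch_manifest(base_url):
        url, digest = entry["url"], entry["sha1"]
        if known_hashes.get(url) == digest:
            continue  # unchanged: skip the request entirely
        with urllib.request.urlopen(url) as resp:
            body = resp.read()
        known_hashes[url] = hashlib.sha1(body).hexdigest()
        yield url, body  # hand off to the indexer
```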

An Efficient Schema Extracting Technique Using DTD in XML Documents (DTD를 이용한 XML문서의 효율적인 스키마 추출 기법)

  • Ahn, Sung-Eun; Choi, Hwang-Kyu
    • Journal of Industrial Technology / v.21 no.A / pp.141-146 / 2001
  • XML is fast emerging as the dominant standard for representing and exchanging data on the Web. As the amount of data available on the Web has increased dramatically in recent years, that data has come to reside in forms ranging from semi-structured documents to highly structured relational databases. Since semi-structured data is increasingly represented in XML, XML extends what can be done with such data. In this paper, we propose a technique for extracting the schema of an XML document using its DTD.
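
The paper's actual extraction rules are not reproduced in the abstract; the following minimal sketch only illustrates the general idea of reading a schema out of a DTD, parsing <!ELEMENT> and <!ATTLIST> declarations into a dictionary of allowed children and attributes.

```python
import re

# Simple regexes for DTD element and attribute-list declarations.
ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\S+)\s+([^>]+)>")
ATTLIST_RE = re.compile(r"<!ATTLIST\s+(\S+)\s+([^>]+)>", re.DOTALL)

def extract_schema(dtd_text):
    schema = {}
    for name, content in ELEMENT_RE.findall(dtd_text):
        # content model, e.g. "(title, author+)" or "(#PCDATA)"
        children = re.findall(r"[A-Za-z_][\w.-]*", content)
        schema[name] = {"children": [c for c in children if c != "PCDATA"],
                        "attributes": []}
    for name, attrs in ATTLIST_RE.findall(dtd_text):
        if name in schema:
            # first token of each attribute line is the attribute name
            schema[name]["attributes"] = re.findall(r"^\s*(\S+)", attrs, re.M)
    return schema

dtd = """<!ELEMENT book (title, author+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ATTLIST book isbn CDATA #REQUIRED>"""
print(extract_schema(dtd))
```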

Web Log Analysis Using Support Vector Regression

  • Jun, Sung-Hae; Lim, Min-Taik; Jorn, Hong-Seok; Hwang, Jin-Soo; Park, Seong-Yong; Kim, Jee-Yun; Oh, Kyung-Whan
    • Communications for Statistical Applications and Methods / v.10 no.1 / pp.61-77 / 2003
  • Owing to the wide expansion of the Internet, people can freely obtain the information they want with less effort. Without adequate forms or rules to follow, however, it is becoming more and more difficult to find the necessary information; because of the seemingly chaotic state of the current web environment, it is sometimes called the "dizzy web," and users must wander from page to page to find what they need. We therefore need systems that properly recommend appropriate information to general users. The representative research field for such systems is the recommendation system (RS), and the collaborative recommendation system, one kind of RS, is known to perform better than the others. When performing web user modeling or other web mining tasks, continuous feedback data is very important and frequently used. In this paper, we propose a collaborative recommendation system that can handle continuous feedback data and use it to construct a web page prediction system. We use a user's sojourn time as the continuous feedback data and combine the traditional model-based algorithm framework with the support vector regression technique. In our experiments, we show the accuracy of our system and the computing time of page prediction compared with the Pearson's correlation algorithm.
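
A minimal sketch of the idea, not the paper's exact model: treat each user's sojourn times on pages as continuous implicit ratings, train an SVR on other users' times for the same pages, and predict how long the target user would dwell on an unvisited page. The data below is synthetic.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_users, n_pages = 50, 8
# Synthetic sojourn times (seconds) per user per page.
sojourn = rng.gamma(shape=2.0, scale=10.0, size=(n_users, n_pages))

target_user, unseen_page = 0, 7
train_users = np.arange(1, n_users)

# Features: other users' times on the pages the target user has seen;
# label: their time on the page we want to predict for the target user.
X = sojourn[train_users][:, :unseen_page]
y = sojourn[train_users][:, unseen_page]

model = SVR(kernel="rbf", C=10.0, epsilon=0.5).fit(X, y)
pred = model.predict(sojourn[target_user, :unseen_page].reshape(1, -1))
print(f"predicted sojourn time on page {unseen_page}: {pred[0]:.1f}s")
```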

Automatic Extraction Method of Control Point Based on Geospatial Web Service (지리공간 웹 서비스 기반의 기준점 자동추출 기법 연구)

  • Lee, Young Rim
    • Journal of Korean Society for Geospatial Information Science / v.22 no.2 / pp.17-24 / 2014
  • This paper proposes an automatic extraction method for control points based on a geospatial web service. The proposed method consists of three steps: 1) acquire reference data through the geospatial web service; 2) find candidate control points in the reference data and the target image with the SURF algorithm; 3) using the RANSAC algorithm, filter the correctly matching candidate points to obtain the final control points. By using the geospatial web service, the proposed method increases operational convenience and, because it follows the OGC standard, is more extensible. The method was tested on SPOT-1, SPOT-5, and IKONOS satellite images, with military standard data used as the reference data, and it yielded a uniform accuracy with an RMSE under 5 pixels. The experimental results showed that accuracy improves continuously with the resolution of the target image, and demonstrated the method's potential for military use.
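
A sketch of steps 2 and 3 under stated assumptions: SURF lives in the opencv-contrib xfeatures2d module (it is patented and absent from default OpenCV builds; ORB is a free drop-in alternative), the file names are placeholders, and step 1 (fetching reference data via an OGC web service) is omitted.

```python
import cv2
import numpy as np

ref = cv2.imread("reference.tif", cv2.IMREAD_GRAYSCALE)  # placeholder paths
tgt = cv2.imread("target.tif", cv2.IMREAD_GRAYSCALE)

# Step 2: detect candidate control points with SURF.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp1, des1 = surf.detectAndCompute(ref, None)
kp2, des2 = surf.detectAndCompute(tgt, None)

# Step 3: match candidates, then let RANSAC reject outliers while
# estimating the homography between reference and target.
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)

# Inlier matches survive as the final control points.
control_points = [matches[i] for i in range(len(matches)) if mask[i]]
print(f"{len(control_points)} control points kept of {len(matches)} candidates")
```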

An Adaptive Web Surfing System for Supporting Autonomous Navigation (자동항해를 지원하는 적응형 웹 서핑 시스템)

  • 국형준
    • Journal of KIISE: Software and Applications / v.31 no.4 / pp.439-446 / 2004
  • To design a user-adaptive web surfing system, we may divide the whole process into three phases: collecting user data, processing that data to construct and improve a user profile, and adapting to the user by applying the profile. We have designed three software agents, each of which works independently in one phase while all three collaborate to support adaptive web surfing: the IIA (Interactive Interface Agent), the UPA (User Profile Agent), and the ANA (Autonomous Navigation Agent). The IIA provides the user interface, which collects data and performs mechanical navigation support. The UPA processes the collected user data to build and update the user profile while the user is surfing. The ANA provides an autonomous navigation mode in which it automatically recommends web pages selected on the basis of the user profile. The proposed approach and design method, through extensions and refinements, may be used to build a practical adaptive web surfing system.
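
The abstract does not give the agents' internals; the following is only an illustrative sketch of the three-phase split, using an invented keyword-frequency profile for the UPA and ANA roles.

```python
from collections import Counter

class UserProfileAgent:
    """Phase 2: fold the keywords of each visited page into the profile."""
    def __init__(self):
        self.profile = Counter()

    def update(self, page_keywords):
        self.profile.update(page_keywords)

class AutonomousNavigationAgent:
    """Phase 3: recommend candidate pages scored against the profile."""
    def __init__(self, upa):
        self.upa = upa

    def recommend(self, candidate_links, k=3):
        def score(link):
            return sum(self.upa.profile[w] for w in link["keywords"])
        return sorted(candidate_links, key=score, reverse=True)[:k]

# Phase 1 (the interface agent) would feed visited-page keywords in:
upa = UserProfileAgent()
ana = AutonomousNavigationAgent(upa)
upa.update(["python", "web", "agent"])
upa.update(["web", "surfing"])
links = [{"url": "/a", "keywords": ["web", "agent"]},
         {"url": "/b", "keywords": ["cooking"]}]
print([l["url"] for l in ana.recommend(links, k=1)])  # -> ['/a']
```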

Pre-Processing of Query Logs in Web Usage Mining

  • Abdullah, Norhaiza Ya; Husin, Husna Sarirah; Ramadhani, Herny; Nadarajan, Shanmuga Vivekanada
    • Industrial Engineering and Management Systems / v.11 no.1 / pp.82-86 / 2012
  • For the past few years, query log data has been collected to discover users' behavior on a site. Many studies have used query logs to extract user preferences, personalize recommendations, improve the caching and pre-fetching of Web objects, build better adaptive user interfaces, and improve Web search in search engine applications. A query log contains data such as the client's IP address, the time and date of the request, the resource or page requested, the status of the request, the HTTP method used, and the type of browser and operating system. A query log can offer valuable insight into web site usage: a proper compilation and interpretation of it provides baseline statistics that indicate the usage levels of a website and can serve as a tool to assist decision making in management activities. In this paper we discuss the tasks performed on query logs in the pre-processing stage of web usage mining, using query logs from an online newspaper company. The logs undergo pre-processing, in which the clickstream data is cleaned and partitioned into a set of user interactions representing each user's activities during their visits to the site; the essential pre-processing tasks are data cleaning and user identification.
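
A minimal sketch of the two pre-processing tasks named above, data cleaning (dropping asset requests and failed responses) and user identification (grouping remaining requests by IP plus user agent). The combined log format is assumed; the paper's own logs may differ.

```python
import re
from collections import defaultdict

# Apache/Nginx "combined" log format, assumed for illustration.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"')
ASSETS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")

def preprocess(lines):
    users = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        rec = m.groupdict()
        # Cleaning: keep only successful page requests, not embedded assets.
        if rec["path"].lower().endswith(ASSETS) or rec["status"] != "200":
            continue
        # User identification: IP + user agent as a crude user key.
        users[(rec["ip"], rec["agent"])].append((rec["time"], rec["path"]))
    return users
```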

A Study on Querying Method for RDF Data in XML Database (RDF 데이터 관리를 위한 효율적인 질의 처리에 관한 연구)

  • Hwang NamGoong; Kim Yong
    • Journal of Korean Library and Information Science Society / v.37 no.3 / pp.415-431 / 2006
  • The semantic web has been proposed as the next-generation web technology. In the semantic web environment, resources are well defined and semantically related to each other, and RDF supports this basic mechanism; to establish and develop the semantic web, the basic technologies for handling RDF data must first be in place. In this research, we develop methods for storing and querying RDF data using an XML database system. By using an XML database, XML data and RDF data can be integrated and managed efficiently. We construct and evaluate a system applying the proposed storage and search method, and we compare its query processing performance with that of an existing system. The experimental results show that our system processes queries more efficiently.
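
A rough sketch of the storage idea, assuming RDF/XML kept directly in an XML store and queried with path expressions; the paper's actual mapping and query translation are not reproduced here.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc1">
    <dc:creator>Kim</dc:creator>
    <dc:title>Semantic Web Notes</dc:title>
  </rdf:Description>
</rdf:RDF>"""

NS = {"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
      "dc": "http://purl.org/dc/elements/1.1/"}

root = ET.fromstring(RDF_XML)
# A triple-style query "?s dc:creator 'Kim'" becomes a path query
# over rdf:Description elements in the stored XML.
for desc in root.findall("rdf:Description", NS):
    creator = desc.find("dc:creator", NS)
    if creator is not None and creator.text == "Kim":
        print(desc.get("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"))
```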

APPLICATION OF HIGH RESOLUTION SATELLITE IMAGERY ON X3D-BASED SEMANTIC WEB USING SMART GRAPHICS

  • Kim, Hak-Hoon; Lee, Kiwon
    • Proceedings of the KSRS Conference / v.2 / pp.586-589 / 2006
  • High resolution satellite imagery is regarded as one of the important data sets for engineering applications as well as conventional scientific applications. Despite this general view, however, only a few target applications use this information. In this study, the possibility of its wider future use in association with smart graphics is investigated. Smart graphics can be understood as intelligent graphics with an XML-based structure and knowledge related to the semantic web, a useful component of the data dissemination framework model in a multi-layered web-based application. In the first step of this study, high resolution imagery is transformed into a GML (Geography Markup Language)-based structure with an attribute schema and geo-references. In the second step, this information is linked with GIS data sets, and in the next step the fused data set is represented in X3D (eXtensible 3D), the ISO-based web 3D graphics standard, with styling attributes. The main advantages of this approach using GML and X3D are richer representations of the source data according to users' and clients' needs, and structured 3D visualization linked with other XML-based applications. As a demonstration of this scheme, a 3D urban modeling case with actual data sets is presented.
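
An illustrative sketch of the first step only: wrapping an image footprint as a GML geometry with a geo-reference. The element names follow GML conventions, but the feature schema here is invented for illustration and is not the paper's.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)

# Hypothetical feature wrapping an image footprint in WGS84 coordinates.
feature = ET.Element("ImageFootprint", {"srsName": "EPSG:4326"})
polygon = ET.SubElement(feature, f"{{{GML_NS}}}Polygon")
ring = ET.SubElement(ET.SubElement(polygon, f"{{{GML_NS}}}outerBoundaryIs"),
                     f"{{{GML_NS}}}LinearRing")
coords = ET.SubElement(ring, f"{{{GML_NS}}}coordinates")
coords.text = "127.0,37.5 127.1,37.5 127.1,37.6 127.0,37.6 127.0,37.5"
print(ET.tostring(feature, encoding="unicode"))
```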

A Comparative Study on Data Input Design of E-business Websites (E-business 웹사이트에서의 데이터 입력디자인에 관한 비교 연구)

  • 정홍인
    • Archives of Design Research / v.17 no.1 / pp.127-134 / 2004
  • The purpose of this study was to compare the data input interfaces used in e-business applications on the web and to find optimal input design characteristics. Basic data entry tools such as the pull-down menu, list, text input box, and radio button were examined by entering data into a simulated hotel room reservation web site. The experimental results indicated that the text input box was most efficient for experts or experienced operators when there were more than four menu items, while the pull-down menu was considered most satisfactory, simplest, and easiest to use for novices or inexperienced users. A simple list was determined to be best for the input of binary data with respect to user satisfaction, simplicity, and flexibility, but the radio button was evaluated best for ease of use. The design guidelines from this study can be applied to build usable interactive web sites and to increase economic efficiency.

Data Analysis Platform Construct of Fault Prediction and Diagnosis of RCP(Reactor Coolant Pump) (원자로 냉각재 펌프 고장예측진단을 위한 데이터 분석 플랫폼 구축)

  • Kim, Ju Sik; Jo, Sung Han; Jeoung, Rae Hyuck; Cho, Eun Ju; Na, Young Kyun; You, Ki Hyun
    • Journal of Information Technology Services / v.20 no.3 / pp.1-12 / 2021
  • The Reactor Coolant Pump (RCP) is a core part of a nuclear power plant, providing the forced circulation of reactor coolant for the removal of core heat. Properly monitoring RCP vibration is a key activity of successful predictive maintenance and can lead to fewer failures, optimized machine performance, and reduced repair and maintenance costs. Here, we developed a real-time RCP Vibration Analysis System (VAS), a web-based platform using a NoSQL database (MongoDB) to handle the vibration data of the RCP. In this paper, we explain how the digital signal processing that converts vibration data from the time domain to the frequency domain is implemented using the Fast Fourier Transform, how the NoSQL database structure is designed, and how the web service is implemented using the Java Spring framework, JavaScript, and Highcharts. We implemented various plots according to the standards of the American Society of Mechanical Engineers (ASME), which can be displayed in an HTML5 web browser. This data analysis platform offers an improved way to analyze vibration data in real time and is easy to use without specialist knowledge. Furthermore, to obtain better precision, we plan to apply machine learning techniques.
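
The platform itself is built on Java Spring, but the core time-to-frequency-domain step is an FFT; a compact sketch with a synthetic vibration signal (an assumed 29.5 Hz shaft tone plus noise, at an assumed 2048 Hz sampling rate) is shown below.

```python
import numpy as np

fs = 2048                      # sampling rate, Hz (assumed)
t = np.arange(fs) / fs         # one second of samples
signal = np.sin(2 * np.pi * 29.5 * t) + 0.3 * np.random.randn(fs)

# One-sided amplitude spectrum via the real FFT.
spectrum = np.abs(np.fft.rfft(signal)) * 2 / len(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

peak = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
print(f"dominant vibration component near {peak:.1f} Hz")
```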