• Title/Summary/Keyword: automatic collection (자동 수집)

Search Result 1,077, Processing Time 0.025 seconds

An Adaptive Web Surfing System for Supporting Autonomous Navigation (자동항해를 지원하는 적응형 웹 서핑 시스템)

  • 국형준
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.439-446 / 2004
  • To design a user-adaptive web surfing system, we may divide the whole process into three phases: collecting user data, processing the data to construct and improve the user profile, and adapting to the user by applying the user profile. We have designed three software agents, each of which works independently in its own phase while collaborating with the others to support adaptive web surfing: the IIA (Interactive Interface Agent), the UPA (User Profile Agent), and the ANA (Autonomous Navigation Agent). The IIA provides the user interface, which collects data and performs mechanical navigation support. The UPA processes the collected user data to build and update the user profile while the user is surfing the web. The ANA provides an autonomous navigation mode in which it automatically recommends web pages selected on the basis of the user profile. The proposed approach and design method, through extensions and refinements, may be used to build a practical adaptive web surfing system.

Recognition Model of the Vehicle Type using Clustering Methods (클러스터링 방법을 이용한 차종인식 모형)

  • Jo, Hyeong-Gi;Min, Jun-Yeong;Choe, Jong-Uk
    • The Transactions of the Korea Information Processing Society / v.3 no.2 / pp.369-380 / 1996
  • The Inductive Loop Detector (ILD) has been commonly used to collect traffic data such as occupancy time and non-occupancy time. From these data, the traffic volume and the type of passing vehicle are calculated. To provide reliable data for traffic control and planning, accuracy is required in type recognition, which can be used to determine the split of a traffic signal and to provide forecasting data on queue length for over-saturation control. In this research, a new recognition model is suggested for recognizing the type of vehicle from the data collected through ILD systems. Two clustering methods based on statistical algorithms and one neural network clustering method were employed to test the reliability and accuracy of the methods. In a series of experiments, it was found that the new model can greatly enhance the reliability and accuracy of the type recognition rate, well beyond conventional approaches. The model modifies the neural network clustering method and enhances the recognition accuracy by iteratively applying the algorithm until no unclustered data remain.


Crawling algorithm design and experiment for automatic deep web document collection (심층 웹 문서 자동 수집을 위한 크롤링 알고리즘 설계 및 실험)

  • Yun-Jeong, Kang;Min-Hye, Lee;Dong-Hyun, Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.27 no.1 / pp.1-7 / 2023
  • Deep web collection means entering a query in a search form and collecting the response results. The deep web is estimated to hold about 450 to 550 times more information than the statically constructed surface web. A static web page does not show changed information until the page is refreshed, whereas a dynamic web page updates the necessary information in real time without reloading the page; crawlers, however, have difficulty accessing the updated information. There is therefore a need for a way to automatically collect deep web information using a crawler. This paper proposes a method of treating scripts as general links and, for this purpose, proposes and tests an algorithm that can use client scripts like regular URLs. The proposed algorithm focuses on collecting web information through menu navigation and script execution instead of the usual method of entering data into search forms.
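
The idea of treating client scripts like regular URLs can be illustrated with a minimal sketch. This is not the authors' algorithm; it only shows one plausible way a crawler frontier might hold both ordinary links and script actions. The class name `LinkAndScriptCollector`, the function `collect_targets`, and the `("url" | "script", value)` tuple format are all hypothetical, and actually executing the collected scripts would require a browser engine, which is outside this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndScriptCollector(HTMLParser):
    """Collects regular URLs and client-script actions as crawl targets.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.targets = []  # list of (kind, value) tuples

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        href = attr.get("href") or ""
        if href.lower().startswith("javascript:"):
            # A script used as a link: keep the script body as the target.
            self.targets.append(("script", href[len("javascript:"):].strip()))
        elif tag == "a" and href and not href.startswith("#"):
            # An ordinary link: resolve it against the page URL.
            self.targets.append(("url", urljoin(self.base_url, href)))
        onclick = attr.get("onclick")
        if onclick:
            # Menu items often navigate through onclick handlers.
            self.targets.append(("script", onclick.strip()))


def collect_targets(html, base_url):
    """Return every URL and script action found in the page, in order."""
    parser = LinkAndScriptCollector(base_url)
    parser.feed(html)
    return parser.targets
```

A crawler built on this sketch would fetch `url` targets directly and hand `script` targets to a script-capable browser, then parse the resulting page in the same way.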

Design and Construction Strategy for Disaster and Safety Record Information Resources Archives Based on Automatic Acquisition (자동수집 기반 재난안전 기록정보자원 아카이브 설계 및 구축전략)

  • Han, Hui Jeong;Gang, Ju-Yeon;Kim, Yong;Oh, Hyo-Jung
    • Journal of Korean Society of Archives and Records Management / v.17 no.4 / pp.127-154 / 2017
  • Large-scale and complex disasters have recently occurred frequently all over the world, and they recur every year. Accordingly, the need for systematic management and use of past raw data and processed information has increased. For this purpose, this study proposes a construction strategy for disaster and safety record information resources archives based on automatic acquisition, which can be used as a hub for disaster and safety information. Based on local and foreign case studies, several factors to consider when building the archives are determined. Finally, this study proposes four steps for constructing the archives: 1) a complete enumeration survey of the disaster and safety record information resources, 2) a study of the feasibility of automatic acquisition, 3) selection of the resources for preservation, and 4) automatic acquisition of metadata. The construction of the archives proposed in this study will facilitate integrated management, sharing, and use of scattered information resources on disaster and safety.

Performance Analysis of a Korean Word Autocomplete System and New Evaluation Metrics (한국어 단어 자동완성 시스템의 성능 분석 및 새로운 평가 방법)

  • Lee, Songwook
    • Journal of Advanced Marine Engineering and Technology / v.39 no.6 / pp.656-661 / 2015
  • The goal of this paper is to analyze the performance of a word autocomplete system for mobile devices such as smartphones, tablets, and PCs. The proposed system automatically completes a partially typed string into a full word, reducing the time and effort required by a user to enter text on these devices. We collect a large amount of data from Twitter and develop both unigram and bigram dictionaries based on the frequency of words. Using these dictionaries, we analyze the performance of the word autocomplete system and devise a keystroke profit rate and recovery rate as new evaluation metrics that better describe the characteristics of the word autocomplete problem compared to previous measures such as the mean reciprocal rank or recall.
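
As a rough illustration of frequency-based completion and of the mean reciprocal rank (MRR) baseline mentioned above, consider the following sketch. It is not the paper's system: the function names, the whitespace tokenization, and the cutoff `k` are assumptions made here, and the paper's own keystroke profit rate and recovery rate are not reproduced because the abstract does not define them.

```python
from collections import Counter


def build_unigrams(sentences):
    """Count word frequencies over whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.split())
    return counts


def complete(prefix, unigrams, k=3):
    """Return up to k words starting with prefix, most frequent first."""
    candidates = [(w, c) for w, c in unigrams.items() if w.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for a stable order.
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in candidates[:k]]


def mean_reciprocal_rank(cases, unigrams, k=3):
    """cases: (prefix, intended_word) pairs; a miss contributes 0."""
    total = 0.0
    for prefix, target in cases:
        ranked = complete(prefix, unigrams, k)
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(cases)
```

MRR rewards placing the intended word near the top of the candidate list but says nothing about how many keystrokes the user saves, which is the kind of gap the abstract's new metrics are meant to address.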

Design and Implementation of Automatic Control System in Room using Sensor (센서를 이용한 자동 실내 온도 제어시스템 설계 및 구현)

  • Jeong, Gyu-Tae;Lee, Eun-Jin;Kim, Heung-Soo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.326-328 / 2015
  • A building's windows shape the indoor environment by admitting solar radiation. However, their poor thermal efficiency makes them a weak point for energy conservation, and in summer excessive solar radiation adds to cooling costs. In this paper, we develop an automatic window control system that uses indoor environmental information such as temperature, humidity, light intensity, and solar radiation. The system collects this information using a variety of sensors and uses it to drive a motor that controls the window.


Comparison of Performance Factors for Automatic Classification of Records Utilizing Metadata (메타데이터를 활용한 기록물 자동분류 성능 요소 비교)

  • Young Bum Gim;Woo Kwon Chang
    • Journal of the Korean Society for Information Management / v.40 no.3 / pp.99-118 / 2023
  • The objective of this study is to identify performance factors in the automatic classification of records by utilizing metadata that contains the contextual information of records. For this study, we collected 97,064 records of original textual information from Korean central administrative agencies in 2022. Various classification algorithms, data selection methods, and feature extraction techniques are applied and compared with the intent to discern the optimal performance-inducing technique. The study results demonstrated that among classification algorithms, Random Forest displayed higher performance, and among feature extraction techniques, the TF method proved to be the most effective. The minimum data quantity of unit tasks had a minimal influence on performance, and the addition of features positively affected performance, while their removal had a discernible negative impact.

Web-Based Speech Collection System with a Multi-Server Architecture (다중 서버 구조를 갖는 Web 기반 음성 수집 시스템)

  • 홍문기;강선미;장문수
    • Proceedings of the KSLP Conference / 2003.11a / pp.230-232 / 2003
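  • In speech-related research, collecting speech data is of great importance. Even when a developed recognizer or analyzer performs well, it is difficult to draw firm conclusions from experiments because the results depend on the quality and quantity of the speech data used. Speech is usually collected offline, and repeatedly collecting it from the specific collectors an experiment requires, over a fixed period and at a designated place, is difficult. In this study, we therefore used the Web so that speech data collectors can collect speech freely at various times and places. To mitigate the communication problems that often arise as the collected speech data grows in size, we built a multi-server system in which collected data is first stored on a regional server and then automatically transferred to the main server under suitable conditions. By designating a regional collection server, the system allows data collected in different experiments to be managed separately on the specific regional server the collector chooses. Speech can be collected anywhere with an Internet connection, free of time and location constraints, and a web-based ActiveX program provides consistent endpoint processing and noise processing. In addition, the collector's interface can be changed in administrator mode to suit a variety of applications, which broadens the range of potential users. (abridged)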


Performance Improvement Methods of a Spoken Chatting System Using SVM (SVM을 이용한 음성채팅시스템의 성능 향상 방법)

  • Ahn, HyeokJu;Lee, SungHee;Song, YeongKil;Kim, HarkSoo
    • KIPS Transactions on Software and Data Engineering / v.4 no.6 / pp.261-268 / 2015
  • In spoken chatting systems, users' spoken queries are converted to text queries using automatic speech recognition (ASR) engines. If the top-1 results of the ASR engines are incorrect, these errors are propagated to the spoken chatting systems. To improve the top-1 accuracy of ASR engines, we propose a post-processing model that rearranges the top-n outputs of ASR engines using a ranking support vector machine (RankSVM). In addition, a large number of chatting sentences are needed to train chatting systems. If new chatting sentences are not frequently added to the training data, the responses of the chatting systems will soon become outdated. To resolve this problem, we propose a data collection model that automatically selects chatting sentences from TV and movie scenarios using a support vector machine (SVM). In the experiments, the post-processing model showed precision 4.4% higher and recall 6.4% higher than the baseline model (without post-processing). The data collection model showed a precision of 98.95% and a recall of 57.14%.

Automatic Payload Signature Update System for Classification of Recent Network Applications (최신 네트워크 응용 분류를 위한 자동화 페이로드 시그니쳐 업데이트 시스템)

  • Shim, Kyu-Seok;Goo, Young-Hoon;Lee, Sung-Ho;Sija, Baraka D.;Kim, Myung-Sup
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.1 / pp.98-107 / 2017
  • These days, the increase in applications that make heavy use of network resources has revealed the limitations of current research on traffic classification for network management. Various studies have sought solutions to these limitations. A representative line of work is the automatic discovery of common patterns in traffic. However, because automatic signature generation as studied so far is a semi-automatic system, users must collect the traffic themselves. These limitations cause problems in the traffic collection step, leading to untrustworthy accuracy in the signature verification process because it does not contain any of the generated signature. In this paper, we propose an automated traffic collection, signature management, signature generation, and signature verification process to overcome the limitations of the automatic signature update system. When the proposed method was applied to a campus network, the actual traffic signatures maintained completeness with no false positives.