• Title/Summary/Keyword: Data collection framework

Design and Implementation of a Web Crawler System for Collection of Structured and Unstructured Data (정형 및 비정형 데이터 수집을 위한 웹 크롤러 시스템 설계 및 구현)

  • Bae, Seong Won; Lee, Hyun Dong; Cho, DaeSoo
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.199-209 / 2018
  • Services offered to consumers, such as low-price shopping, customized advertising, and product recommendation, are increasingly combined with big data. As big data grows in importance, so does the web crawler that collects data from the web. Existing web crawlers, however, have two problems. First, when a link does not expose its URL, the target page cannot be reached by URL. Second, they are inefficient because they fetch more data than the user wants. This paper therefore uses Casper.js, which can control the DOM in a headless browser, to generate DOM events and reach the URLs behind hidden links. We also propose an intelligent web crawler system that lets users define steps to fine-tune the collection of both structured and unstructured data, so that only the data they want is retrieved. Finally, we demonstrate the superiority of the proposed crawler system through a performance evaluation comparing it with an existing web crawler.
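
The second idea in this abstract, returning only the fields a user asks for, can be sketched with Python's standard library. This is not the authors' Casper.js system and it does not trigger DOM events. The `FieldExtractor` class, the `extract` helper, the sample HTML, and its class names are all hypothetical.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are kept off the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FieldExtractor(HTMLParser):
    """Collect text only from elements whose class is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.results = {name: [] for name in wanted}
        self._stack = []  # one entry per open tag: matched class name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in self.wanted), None)
        self._stack.append(match)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to the nearest enclosing wanted element, if any.
        for match in reversed(self._stack):
            if match is not None:
                self.results[match].append(text)
                break


# Hypothetical page fragment: the user wants name and price, not the ad.
SAMPLE = """
<div class="product">
  <span class="name">Widget</span>
  <span class="price">9.90</span>
  <p class="ad">Buy now!</p>
</div>
"""


def extract(html, wanted):
    parser = FieldExtractor(wanted)
    parser.feed(html)
    return parser.results


print(extract(SAMPLE, ["name", "price"]))
```

The sketch assumes reasonably well-formed HTML. A crawler of the kind described would first render the page in a headless browser and then apply a filter like this to the resulting DOM.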

Development of Evaluation Perspective and Criteria for the DataON Platform

  • Kim, Suntae
    • Journal of Information Science Theory and Practice / v.8 no.2 / pp.68-78 / 2020
  • This is a preliminary study toward the evaluation framework needed to evaluate the DataON platform. The first objective is to examine expert perceptions of the level of DataON platform construction. The second is to evaluate the importance, stability, and usability of DataON platform features relative to OpenAIRE features. The third is to derive weights for each evaluation perspective for future DataON platform evaluation. The fourth is to examine the preferences of experts in each evaluation perspective and to derive unbiased evaluation criteria. The study surveyed potential stakeholders of the DataON platform: 12 professionals with at least 10 years of experience in the field. The 57 overall functions and services scored 3.1 out of 5 for importance, with stability at -0.07 points and usability at -0.05 points. The 42 features and services scored 3.04 points for importance, with stability at -0.58 points and usability at -0.51 points. In particular, the stability and usability scores of the 42 functions and services provided as of 2018 were higher than those of the full set of functions, which is attributed to stability and user-friendliness improvements made after development. Among the evaluation perspectives, collection quality has the highest weight at 27%, followed by interface usability at 22% and service quality at 19%; system performance efficiency and user feedback solicitation are weighted equally at 16% each.
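
The five weights reported in this abstract sum to 100%, so they can be combined into one weighted score. The sketch below shows that arithmetic only. The `composite_score` function and the example scores are hypothetical and are not taken from the study.

```python
import math

# Weights as reported in the abstract, expressed as fractions of 1.0.
WEIGHTS = {
    "collection_quality": 0.27,
    "interface_usability": 0.22,
    "service_quality": 0.19,
    "system_performance_efficiency": 0.16,
    "user_feedback_solicitation": 0.16,
}

# Sanity check: the reported percentages add up to 100%.
assert math.isclose(sum(WEIGHTS.values()), 1.0)


def composite_score(scores):
    """Weighted sum of per-perspective scores (all on the same scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on a 5-point scale, for illustration only.
example = {
    "collection_quality": 4.0,
    "interface_usability": 3.0,
    "service_quality": 3.5,
    "system_performance_efficiency": 3.0,
    "user_feedback_solicitation": 2.5,
}
print(round(composite_score(example), 3))
```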

Markerless camera pose estimation framework utilizing construction material with standardized specification

  • Harim Kim; Heejae Ahn; Sebeen Yoon; Taehoon Kim; Thomas H.-K. Kang; Young K. Ju; Minju Kim; Hunhee Cho
    • Computers and Concrete / v.33 no.5 / pp.535-544 / 2024
  • In the rapidly advancing landscape of computer vision (CV) technology, there is a burgeoning interest in its integration with the construction industry. Camera calibration is the process of deriving the intrinsic and extrinsic parameters that govern how coordinates of the 3D real world are projected onto the 2D image plane: the intrinsic parameters are internal factors of the camera, and the extrinsic parameters are external factors such as the position and rotation of the camera. Camera pose estimation, or extrinsic calibration, which estimates the extrinsic parameters, is essential for CV applications at construction sites because it can be used for indoor navigation of construction robots and for field monitoring by restoring depth information. Traditionally, camera pose estimation has relied on target objects such as markers or patterns. These marker- or pattern-based methods are often time-consuming because a target object must be installed before estimation. As a solution to this challenge, this study introduces a novel framework that facilitates camera pose estimation using standardized materials commonly found on construction sites, such as concrete forms. The proposed framework obtains 3D real-world coordinates by referring to construction materials with known specifications, extracts the corresponding 2D image-plane coordinates through keypoint detection, and derives the camera's coordinates through the perspective-n-point (PnP) method, which derives the extrinsic parameters by matching 3D and 2D coordinate pairs. This framework presents a substantial advancement as it streamlines the extrinsic calibration process, thereby potentially enhancing the efficiency of CV technology application and data collection at construction sites. This approach holds promise for expediting and optimizing various construction-related tasks by automating and simplifying the calibration procedure.
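
For reference, the relation that PnP inverts is the standard pinhole projection. The symbols below are conventional notation and do not come from the abstract: $K$ holds the intrinsic parameters, $R$ and $t$ are the extrinsic rotation and translation, $(X, Y, Z)$ is a 3D point on the construction material, and $(u, v)$ is its detected keypoint in the image.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = K \,[\, R \mid t \,]
    \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix},
\qquad
C = -R^{\top} t
```

Given $K$ and several matched pairs of $(X, Y, Z)$ and $(u, v)$, PnP solves for $R$ and $t$. The camera position in world coordinates is then $C = -R^{\top} t$, and $s$ is the projective scale factor.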

Prototype-based Classifier with Feature Selection and Its Design with Particle Swarm Optimization: Analysis and Comparative Studies

  • Park, Byoung-Jun; Oh, Sung-Kwun
    • Journal of Electrical Engineering and Technology / v.7 no.2 / pp.245-254 / 2012
  • In this study, we introduce a prototype-based classifier with feature selection that dwells upon the usage of a biologically inspired optimization technique of Particle Swarm Optimization (PSO). The design comprises two main phases. In the first phase, PSO selects P % of patterns to be treated as prototypes of c classes. During the second phase, the PSO is instrumental in the formation of a core set of features that constitute a collection of the most meaningful and highly discriminative coordinates of the original feature space. The proposed scheme of feature selection is developed in the wrapper mode with the performance evaluated with the aid of the nearest prototype classifier. The study offers a complete algorithmic framework and demonstrates the effectiveness (quality of solution) and efficiency (computing cost) of the approach when applied to a collection of selected data sets. We also include a comparative study which involves the usage of genetic algorithms (GAs). Numerical experiments show that a suitable selection of prototypes and a substantial reduction of the feature space could be accomplished and the classifier formed in this manner becomes characterized by low classification error. In addition, the advantage of the PSO is quantified in detail by running a number of experiments using Machine Learning datasets.
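
In the wrapper mode this abstract describes, each candidate set of prototypes and features is scored by a nearest prototype classifier. The sketch below shows only that scoring step, and the PSO search itself is omitted. The function names, the toy prototypes, and the toy data are hypothetical.

```python
import math


def nearest_prototype(sample, prototypes, feature_mask):
    """Return the label of the closest prototype, using only kept features."""

    def distance(a, b):
        # Euclidean distance restricted to features where the mask is True.
        return math.sqrt(
            sum((x - y) ** 2 for x, y, keep in zip(a, b, feature_mask) if keep)
        )

    best = min(prototypes, key=lambda proto: distance(sample, proto[0]))
    return best[1]


def error_rate(data, prototypes, feature_mask):
    """Fraction of labelled samples the classifier gets wrong."""
    wrong = sum(
        1
        for features, label in data
        if nearest_prototype(features, prototypes, feature_mask) != label
    )
    return wrong / len(data)


# Toy example: the third feature is misleading noise.
PROTOTYPES = [((0.0, 0.0, 5.0), "a"), ((1.0, 1.0, -5.0), "b")]
DATA = [((0.1, 0.2, -5.0), "a"), ((0.9, 0.8, 5.0), "b")]

print(error_rate(DATA, PROTOTYPES, (True, True, True)))   # all features
print(error_rate(DATA, PROTOTYPES, (True, True, False)))  # noise dropped
```

An optimizer such as PSO or a GA would search over the prototype subset and the feature mask, using a value like `error_rate` as the fitness to minimize.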

Application of NANDA and HHCC to Classification of Nursing Diagnosis in a Hospital-Based Home Health Care (일개 종합병원중심 가정간호 간호진단분류를 위한 NANDA와 HHCC의 적용 비교)

  • Lee, Jin Kyung; Park, Hyeoun Ae
    • Korean Journal of Adult Nursing / v.12 no.4 / pp.507-516 / 2000
  • This study examines whether the North American Nursing Diagnosis Association (NANDA) taxonomy and the Home Health Care Classification (HHCC) are appropriate for classifying the nursing problems of home health care clients, and suggests a modified nursing diagnosis classification system. The records of 249 clients at a general hospital were reviewed, and nursing problems were diagnosed according to each classification system. The results are as follows. The clients' major medical diagnoses were pregnancy, childbirth and puerperium; malignant neoplasm; and benign neoplasm. Of the 463 nursing problems, all could be diagnosed according to HHCC, while 385 could be diagnosed according to NANDA. HHCC thus covered 78 more nursing problems than NANDA. The discrepancy may indicate a significant advantage for HHCC diagnosis, because the HHCC nomenclature was created empirically from hard data. However, it may also be due to limitations in the data collection method, so it is difficult to judge which classification system is more useful. Nevertheless, the nursing components of the HHCC are more concrete and clearer than the human response patterns of NANDA. The HHCC also facilitates computerized documentation of patient care, using a conceptual framework of 20 Care Components based on the nursing process: assessment, diagnosis, outcome identification, planning, implementation, and evaluation. Accordingly, HHCC is more useful than NANDA in practical application. Limitations of this study include the retrospective data collection method and the limited generalizability of the sample. Further research on varied samples using a prospective data collection method is recommended.

Jointly Image Topic and Emotion Detection using Multi-Modal Hierarchical Latent Dirichlet Allocation

  • Ding, Wanying; Zhu, Junhuan; Guo, Lifan; Hu, Xiaohua; Luo, Jiebo; Wang, Haohong
    • Journal of Multimedia Information System / v.1 no.1 / pp.55-67 / 2014
  • Image topic and emotion analysis is an important component of online image retrieval, which has become very popular in the fast-growing social media community. However, because of the gaps between images and texts, very little work in the literature detects an image's topics and emotions in a unified framework, although topics and emotions are two levels of semantics that often work together to describe an image comprehensively. This work proposes a unified model, the Joint Topic/Emotion Multi-Modal Hierarchical Latent Dirichlet Allocation (JTE-MMHLDA) model, which extends the earlier LDA, mmLDA, and JST models to capture topic and emotion information at the same time from heterogeneous data. Specifically, a two-level graphical model is built so that topics and emotions are shared across the whole document collection. Experimental results on a Flickr dataset indicate that the proposed model efficiently discovers images' topics and emotions. It significantly outperforms the text-only system by 4.4% and the vision-only system by 18.1% in topic detection, and outperforms the text-only system by 7.1% and the vision-only system by 39.7% in emotion detection.

Innovation in Telecom Services -Framework and Analysis Based on the Case of International Pre-paid Calling Cards in Japan

  • Kumiko, Miyazaki; Wiggers, Edmar
    • Journal of Technology Innovation / v.13 no.2 / pp.45-70 / 2005
  • Much work on innovation has focused on the manufacturing sector. In this paper, we propose a framework for analyzing innovation in services centred on capability and technology integration. We illustrate the theoretical points made by conducting a case study on an international telephone communications provider Brastel, which introduced significant innovations in international calling services, in the form of rechargeable pre-paid calling card, through effective application of standard IT. Brastel is situated against its main competitors, considering two dimensions of price and service breadth and convenience. A novel technique for measuring competitiveness based on price and service index is introduced. The following competitors were selected: NTT, KDDI, Japan Telecom, Fusion Communications, J-Call / World Link, G-Call, ASP Check, Primus, QuickPhone, and MCI. To create the service index, factors such as ease of use, convenience, number of languages in which the services are available, and additional features were taken into account. The company itself and the rechargeable card innovation were analyzed through in-depth interviews and data collection. It was shown that a competitive advantage was maintained through internal and external capabilities.

Spatial-temporal texture features for 3D human activity recognition using laser-based RGB-D videos

  • Ming, Yue; Wang, Guangchao; Hong, Xiaopeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.3 / pp.1595-1613 / 2017
  • The IR camera and laser-based IR projector provide an effective solution for real-time capture of moving targets in RGB-D videos. Unlike traditional RGB videos, the captured depth videos are not affected by illumination variation. In this paper, we propose a novel feature extraction framework to describe human activities based on this optical video capturing method, namely spatial-temporal texture features for 3D human activity recognition. The spatial-temporal texture feature with depth information is insensitive to illumination and occlusions, and efficient for fine-motion description. The framework of our proposed algorithm begins with video acquisition based on laser projection, followed by video preprocessing with visual background extraction to obtain spatial-temporal key images. The texture features encoded from the key images are then used to generate discriminative features for human activity information. Experimental results on different databases and in practical scenarios demonstrate the effectiveness of the proposed algorithm on large-scale data sets.

A Study on the Analysis of the Regional Agricultural Environment (지역단위 농업환경 분석을 위한 연구)

  • Heo, Jang
    • Korean Journal of Organic Agriculture / v.9 no.4 / pp.27-54 / 2001
  • This paper aims to provide a basic framework for making a regional plan for environment-friendly agriculture. Preparation of the regional plan is mandated by the Environment-friendly Agriculture Promotion Act of 1998. An input/output analysis framework is proposed that covers the flows of fertilizers, herbicides, pesticides, and livestock manure. Basically, the discharged amount of a polluting element is the difference between the amount entering the crop and livestock sectors and the amount absorbed or used by the crops and/or livestock. A few suggestions are offered for better regional environment-friendly agricultural plans. The most important is to establish a data collection system; the "Green Accounting System" is suggested. It is also crucial to create a standard guideline or manual that provides detailed procedures for local planners to follow in making the plan. More fundamentally, many experts in regional planning will be needed in the near future. A compound model that links, for instance, the forestry, livestock, and crop sectors needs to be devised. Finally, it is argued that a more elaborate model will work as an integrated environmental improvement plan that embraces the living environment as well as the agricultural environment.
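
The balance stated in this abstract (discharged amount = amount entering the sectors minus amount absorbed or used) can be written as a short calculation. The `discharged` function, the element names, and the figures below are hypothetical and are not taken from the paper.

```python
def discharged(inputs, uptake):
    """Per-element surplus: amount entering minus amount absorbed or used.

    Both arguments map an element name to an amount in the same unit.
    Elements with no recorded uptake are treated as fully discharged.
    """
    return {name: inputs[name] - uptake.get(name, 0.0) for name in inputs}


# Hypothetical regional figures (e.g. tonnes per year), for illustration only.
regional_inputs = {"nitrogen": 120.0, "phosphorus": 40.0}
regional_uptake = {"nitrogen": 85.0, "phosphorus": 25.0}

print(discharged(regional_inputs, regional_uptake))
```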

A Study on the Development of Artificial Intelligence Crop Environment Control Framework

  • Guangzhi Zhao
    • International Journal of Internet, Broadcasting and Communication / v.15 no.2 / pp.144-156 / 2023
  • Smart agriculture is a rapidly growing field that seeks to optimize crop yields and reduce risk through the use of advanced technology. A key challenge in this field is the need to create a comprehensive smart farm system that can effectively monitor and control the growth environment of crops, particularly when cultivating new varieties. This is where fuzzy theory comes in, enabling the collection and analysis of external environmental factors to generate a rule-based system that considers the specific needs of each crop variety. By doing so, the system can easily set the optimal growth environment, reducing trial and error and the user's risk burden. This is in contrast to existing systems where parameters need to be changed for each breed and various factors considered. Additionally, the type of house used affects the environmental control factors for crops, making it necessary to adapt the system accordingly. While developing such a framework requires a significant investment of labour and time, the benefits are numerous and can lead to increased productivity and profitability in the field of smart agriculture. We developed an AI platform for optimal control of facility houses by integrating data from mushroom crops and environmental factors, and analysing the correlation between optimal control conditions and yield. Our experiments demonstrated significant performance improvement compared to the existing system.
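
A fuzzy rule-based controller of the general kind this abstract mentions can be sketched as follows. This is not the authors' platform: the membership ranges, the two rules, the output levels, and the weighted-average defuzzification are all assumptions chosen for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a, rising to 1 at b, falling to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def ventilation_level(temp_c, humidity_pct):
    """Return a ventilation setting in [0, 1] from two hypothetical rules.

    Rule 1: IF temperature is hot AND humidity is high THEN ventilation = 1.0
    Rule 2: IF temperature is comfortable THEN ventilation = 0.2
    """
    hot = triangular(temp_c, 20, 30, 40)
    humid = triangular(humidity_pct, 70, 90, 110)
    comfortable = triangular(temp_c, 10, 18, 26)

    strength_1 = min(hot, humid)  # fuzzy AND taken as the minimum
    strength_2 = comfortable

    total = strength_1 + strength_2
    if total == 0:
        return 0.0  # no rule fires
    # Weighted average of the rule outputs (zero-order Sugeno style).
    return (strength_1 * 1.0 + strength_2 * 0.2) / total


print(ventilation_level(18, 50))
print(ventilation_level(30, 90))
```

Adapting such a controller to a new crop variety or house type would mean changing the membership ranges and the rules rather than the control code.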