Keyword: Data validation

Validation Data Augmentation for Improving the Grading Accuracy of Diabetic Macular Edema using Deep Learning

  • Lee, Tae Soo
    • Journal of Biomedical Engineering Research / v.40 no.2 / pp.48-54 / 2019
  • This paper proposes a validation data augmentation method for improving the grading accuracy of diabetic macular edema (DME) using deep learning. Data augmentation is conventionally applied when preparing the input data of a deep neural network (DNN), securing data diversity by transforming one image into several through random translation, rotation, scaling and reflection. Here, the technique is instead applied in the validation stage of the trained DNN, and the grading accuracy is improved by combining the classification results of the augmented images. To verify its effectiveness, the 1,200 retinal images of the Messidor dataset were divided into training and validation data at a ratio of 7:3. By applying random augmentation to the 359 validation images, an accuracy improvement of 1.61 ± 0.55% was achieved with six-fold augmentation (N=6). This simple method improved accuracy over the range N = 2 to 6, with a correlation coefficient of 0.5667, and is therefore expected to help improve the diagnostic accuracy of DME through the grading information provided by the proposed DNN.
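
A minimal sketch of this validation-time augmentation, assuming NumPy/SciPy and a trained Keras-style classifier `model` (all names are illustrative, not the paper's implementation; the scaling transform is omitted for brevity):

    import numpy as np
    from scipy.ndimage import rotate, shift

    def augment(img, rng):
        """Return one randomly transformed copy of a 2-D retinal image."""
        out = shift(img, rng.uniform(-5, 5, size=2))            # random translation
        out = rotate(out, rng.uniform(-15, 15), reshape=False)  # random rotation
        if rng.random() < 0.5:
            out = np.fliplr(out)                                # random reflection
        return out

    def predict_with_tta(model, img, n=6, seed=0):
        """Combine predictions over the image plus n-1 augmented copies."""
        rng = np.random.default_rng(seed)
        batch = np.stack([img] + [augment(img, rng) for _ in range(n - 1)])
        probs = model.predict(batch)        # shape (n, num_grades)
        return probs.mean(axis=0).argmax()  # combined DME grade

Averaging the per-class probabilities over the copies is one simple way to combine the classification results; majority voting over the individual predictions would be another.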

A Visual Approach for Data-Intensive Workflow Validation

  • Park, Minjae;Ahn, Hyun;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.17 no.5 / pp.43-49 / 2016
  • This paper presents a validation method for data-intensive graphical workflow models that uses a real-time tracing mode in a data-intensive workflow designer. To support both modeling and validation, the designer is divided into two modes: an editable mode and a tracing mode. In the editable mode, a data-intensive workflow can be designed by drag and drop; in the tracing mode, the model cannot be edited but can be viewed and traced. We focus on the tracing mode for workflow validation and describe how workflow tracing is used in the data-intensive workflow model designer. In particular, the tracing mode supports data-centered operations on the control logic and on the variables exchanged at workflow runtime, as illustrated in the sketch below.
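
A minimal sketch of the two-mode idea under stated assumptions (the class and attribute names are hypothetical, not the designer's actual API): the model is mutable in editable mode and frozen in tracing mode, where tracing checks the variables exchanged between activities:

    from dataclasses import dataclass, field

    @dataclass
    class Activity:
        name: str
        reads: list    # variables this activity consumes
        writes: list   # variables this activity produces

    @dataclass
    class WorkflowModel:
        activities: list = field(default_factory=list)
        editable: bool = True  # editable mode vs. tracing mode

        def add(self, activity):
            if not self.editable:
                raise RuntimeError("model is in tracing mode; editing is disabled")
            self.activities.append(activity)

        def trace(self):
            """Step through the model, flagging any activity that reads a
            variable no earlier activity has written."""
            self.editable = False
            defined = set()
            for act in self.activities:
                missing = [v for v in act.reads if v not in defined]
                if missing:
                    print(f"{act.name}: undefined variables {missing}")
                defined.update(act.writes)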

OVERVIEW OF KOMPSAT APPLICATION PRODUCT VALIDATION SITE AND THE RELATED ACTIVITIES

  • Lee, Kwang-Jae;Youn, Bo-Yeol;Kim, Duk-Jin;Kim, Youn-Soo
    • Proceedings of the KSRS Conference / pp.122-125 / 2007
  • In recent years, there has been increasing demand for improved accuracy and reliability of Earth Observation Satellite (EOS) data. Most data users in the field of remote sensing require an understanding of product accuracy and uncertainty, and EOS application products in particular should be validated before practical application in the field. In order to evaluate the availability and applicability of application products, it is necessary to establish a systematic validation system covering techniques, equipment, ground truth data, etc. The Product Validation Site (PVS) for the generation and validation of KOMPSAT application products was designed and established with various in-situ equipment and datasets. This paper presents the status of the PVS and summarizes some results from experimental studies conducted there.

Comparison of the Cluster Validation Techniques using Gene Expression Data

  • Jeong, Yun-Kyoung;Baek, Jang-Sun
    • Proceedings of the Korean Data and Information Science Society Conference / pp.63-76 / 2006
  • Several clustering algorithms for analyzing gene expression data, along with cluster validation techniques that assess the quality of their outcomes, have been suggested, but these validation techniques have seldom been evaluated themselves. In this paper we compare various cluster validity indices on small simulated datasets and on real gene expression data, and find that Dunn's index is the most effective and robust.
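
As a rough sketch of the kind of index compared here: Dunn's index is the smallest between-cluster distance divided by the largest within-cluster diameter, so higher values indicate better-separated, more compact clusters. A straightforward NumPy/SciPy version (not the authors' code):

    import numpy as np
    from scipy.spatial.distance import cdist

    def dunn_index(X, labels):
        """Min inter-cluster distance over max intra-cluster diameter."""
        clusters = [X[labels == k] for k in np.unique(labels)]
        # largest within-cluster pairwise distance (cluster diameter)
        diameters = [cdist(c, c).max() for c in clusters]
        # smallest between-cluster pairwise distance
        separations = [cdist(a, b).min()
                       for i, a in enumerate(clusters)
                       for b in clusters[i + 1:]]
        return min(separations) / max(diameters)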

Design of an Algorithm for the Validation of SCL in Digital Substations

  • Jang, B.T.;Alidu, A.;Kim, N.D.
    • KEPCO Journal on Electric Power and Energy / v.3 no.2 / pp.89-97 / 2017
  • The substation is a critical node in the power network, where power is transformed within the generation, transmission and distribution system. IEC 61850 is a global standard that enables efficient substation automation by defining interoperable communication and data modelling techniques. To achieve this level of interoperability and automation, IEC 61850 (Part 6) defines the System Configuration description Language (SCL), an XML-based file format for describing the abstract model of primary and secondary substation equipment, the communications systems, and the relationships between them. It enables the interoperable exchange of data during substation engineering by standardizing the description of applications at different stages of the engineering process. To achieve seamless interoperability, multi-vendor devices must adhere fully to IEC 61850. This paper proposes an efficient algorithm for verifying the interoperability of multi-vendor devices by checking the adherence of their SCL files to the specifications of the standard. The proposed SCL validation algorithm consists of schema validation and other functions, including information model validation using a UML data model, Vendor Defined Extension model validation, User Defined Rule validation and IED Engineering Table (IET) consistency validation. It also integrates the standard UCAIUG (Utility Communication Architecture International Users Group) Procedure validation for quality assurance testing. The algorithm is not only flexible and efficient in ensuring the interoperable functionality of tested devices but also convenient for system integrators and test engineers to use.
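
Of the stages listed, the schema-validation step is the most mechanical; a minimal sketch using lxml follows (the XSD file name is illustrative, and the UML information model, vendor extension, user rule and IET checks from the paper are not reproduced):

    from lxml import etree

    def validate_scl(scl_path, xsd_path="SCL.xsd"):
        """Validate an SCL file against the schema; return a list of errors."""
        schema = etree.XMLSchema(etree.parse(xsd_path))
        doc = etree.parse(scl_path)
        if schema.validate(doc):
            return []
        return [f"line {e.line}: {e.message}" for e in schema.error_log]

    # e.g.: for err in validate_scl("substation.scd"): print(err)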

Comparison of the Cluster Validation Methods for High-dimensional (Gene Expression) Data

  • Jeong, Yun-Kyoung;Baek, Jang-Sun
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.167-181 / 2007
  • Many clustering algorithms and cluster validation techniques for high-dimensional gene expression data have been suggested, but evaluations of these validation techniques have seldom been carried out. In this paper we compare various cluster validity indices on low-dimensional simulated data and on real gene expression data. Among the internal measures, Dunn's index is the most effective and robust, the Silhouette index ranks next, and the Davies-Bouldin index ranks last; among the external measures, the Jaccard index is much more effective than the Goodman-Kruskal index and the adjusted Rand index.
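
Several of the compared indices are available off the shelf; a brief scikit-learn sketch on synthetic data is shown below (Dunn's index, the Goodman-Kruskal index and the partition-level Jaccard index are not in scikit-learn; a Dunn's index sketch appears under the 2006 conference entry above):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import (adjusted_rand_score, davies_bouldin_score,
                                 silhouette_score)

    X, truth = make_blobs(n_samples=300, centers=4, random_state=0)
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

    print("Silhouette (internal, higher is better):  ", silhouette_score(X, labels))
    print("Davies-Bouldin (internal, lower is better):", davies_bouldin_score(X, labels))
    print("Adjusted Rand (external, higher is better):", adjusted_rand_score(truth, labels))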

Basic Principles of the Validation for Good Laboratory Practice Institutes

  • Cho, Kyu-Hyuk;Kim, Jin-Sung;Jeon, Man-Soo;Lee, Kyu-Hong;Chung, Moon-Koo;Song, Chang-Woo
    • Toxicological Research / v.25 no.1 / pp.1-8 / 2009
  • Validation specifies and coordinates all relevant activities that ensure compliance with good laboratory practice (GLP) according to suitable international standards. This covers past, present and future validation activities aimed at securing the integrity of non-clinical laboratory data. Recently, validation has become increasingly important not only in good manufacturing practice (GMP) institutions but also in GLP facilities. In accordance with GLP regulations, all equipment used to generate, measure or assess data should undergo validation to ensure that it is of appropriate design and capacity and that it will function consistently as intended. The implementation of validation processes is therefore considered an essential step for a global institution. This review describes the procedures and documentation required for GLP validation and introduces basic elements such as the validation master plan, risk assessment, gap analysis, design qualification, installation qualification, operational qualification, performance qualification, calibration, traceability and revalidation.

Multiclass LS-SVM ensemble for large data

  • Hwang, Hyungtae
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1557-1563 / 2015
  • Multiclass classification is typically performed with a voting scheme that combines binary classifications. In this paper we propose a multiclass classification method for large data that can be regarded as a revised one-vs-all method. Classification is performed using the hat matrix of a least squares support vector machine (LS-SVM) ensemble, obtained by aggregating individual LS-SVMs trained on subsets of the whole dataset. A cross-validation function is defined to select the optimal values of the hyperparameters that affect the performance of the proposed multiclass LS-SVM, and a generalized cross-validation function is derived to reduce its computational burden. Experimental results indicating the performance of the proposed method are presented.
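
A simplified sketch of the one-vs-all LS-SVM ensemble idea (the bias term of the standard LS-SVM formulation and the hat-matrix/GCV machinery are omitted; `gamma` and the RBF width `s` stand in for the hyperparameters the paper selects by cross-validation):

    import numpy as np

    def rbf(A, B, s=1.0):
        """RBF kernel matrix between rows of A and rows of B."""
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2 * s ** 2))

    def fit_ls_svm(X, y, gamma=1.0, s=1.0):
        """Solve (K + I/gamma) alpha = y for one binary (+/-1) problem."""
        alpha = np.linalg.solve(rbf(X, X, s) + np.eye(len(X)) / gamma, y)
        return lambda Xnew: rbf(Xnew, X, s) @ alpha

    def fit_ensemble(X, y, classes, n_subsets=5, **kw):
        """One-vs-all LS-SVMs on each data subset; aggregate by averaging."""
        subsets = np.array_split(np.random.permutation(len(X)), n_subsets)
        models = [[fit_ls_svm(X[idx], np.where(y[idx] == c, 1.0, -1.0), **kw)
                   for c in classes] for idx in subsets]
        def predict(Xnew):
            # average each class's decision values over the subset models,
            # then take the class with the largest aggregated value
            scores = np.mean([[m(Xnew) for m in row] for row in models], axis=0)
            return np.asarray(classes)[scores.argmax(axis=0)]
        return predict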

CROSS-VALIDATION OF LANDSLIDE SUSCEPTIBILITY MAPPING IN KOREA

  • Lee, Saro
    • Proceedings of the KSRS Conference / pp.291-293 / 2004
  • The aim of this study was to cross-validate a spatial probabilistic model of landslide likelihood ratios at Boun, Janghung and Yongin, in Korea, using a Geographic Information System (GIS). Landslide locations within the study areas were identified by interpreting aerial photographs, satellite images and field surveys. Maps of the topography, soil type, forest cover, lineaments and land cover were constructed from the spatial data sets. Fourteen factors that influence landslide occurrence were extracted from the database, and the likelihood ratio of each factor was computed. Landslide susceptibility maps were drawn for the three areas using likelihood ratios derived not only from the data for each area itself but also from each of the other two areas (nine maps in all), as a cross-check of the validity of the method. For validation and cross-validation, the results of the analyses were compared, in each study area, with actual landslide locations, and showed satisfactory agreement between the susceptibility maps and the existing landslide locations.
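
A minimal sketch of a frequency-ratio style computation consistent with this abstract (the array names and the cell-wise summation of per-factor ratios are assumptions for illustration, not taken from the paper):

    import numpy as np

    def likelihood_ratios(factor, slides):
        """Per-class ratio of landslide density to overall landslide density."""
        overall = max(slides.mean(), 1e-12)
        return {cls: slides[factor == cls].mean() / overall
                for cls in np.unique(factor)}

    def susceptibility(factors, slides):
        """Sum each factor's likelihood ratio cell by cell."""
        total = np.zeros(slides.shape)
        for f in factors:                      # e.g. slope, soil, forest rasters
            lut = likelihood_ratios(f, slides)
            total += np.vectorize(lut.get)(f)
        return total  # higher values = more landslide-susceptible

Cross-validation in the paper's sense would apply the ratios computed from one study area's landslide locations to the factor maps of another area, then compare the resulting map with that area's actual landslides.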

Introduction to the Validation Module Design for CMDPS Baseline Products

  • Kim, Shin-Young;Chung, Chu-Yong;Ou, Mi-Lim
    • Proceedings of the KSRS Conference / pp.146-148 / 2007
  • CMDPS (COMS Meteorological Data Processing System) is the operational system for extracting meteorological products from data observed by the COMS (Communication, Ocean and Meteorological Satellite) meteorological imager. The CMDPS baseline products consist of 16 parameters, including cloud information, water vapor products, surface information, environmental products and atmospheric motion vectors. CMDPS also includes a calibration monitoring function and a validation mechanism for the baseline products. The main objective of the CMDPS validation module is near-real-time monitoring of the accuracy and reliability of all CMDPS products. Its long-term validation statistics are also used to upgrade CMDPS, for example through algorithm parameter tuning and retrieval algorithm modification. This paper introduces the preliminary design of the CMDPS validation module.
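
As an illustration only, the kind of near-real-time match-up statistics such a module might compute for a baseline product against collocated reference observations (the names are hypothetical, not CMDPS internals):

    import numpy as np

    def matchup_stats(retrieved, reference):
        """Bias and RMSE over collocated, valid (non-NaN) pixels."""
        diff = (retrieved - reference).ravel()
        diff = diff[~np.isnan(diff)]
        return {"n": diff.size,
                "bias": diff.mean(),
                "rmse": np.sqrt((diff ** 2).mean())}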
