• Title/Summary/Keyword: Data Accuracy

Finding Unexpected Test Accuracy by Cross Validation in Machine Learning

  • Yoon, Hoijin
    • International Journal of Computer Science & Network Security / v.21 no.12spc / pp.549-555 / 2021
  • Machine Learning (ML) typically splits data into three parts: 60% for training, 20% for validation, and 20% for testing. The split is purely quantitative; the data in each set are not selected by a criterion, even though selection criteria are a very important concept for the adequacy of test data. ML measures a model's accuracy on the validation data and revises the model until the validation accuracy reaches a certain level. After validation, the completed model is tested with the test data, which the model has not yet seen. If the test data cover the model's attributes well, the test accuracy will be close to the validation accuracy. To check whether ML's test data work adequately, we design an experiment to see whether a model's test accuracy is always close to its validation accuracy, as expected. The experiment builds 100 different SVM models for each of six data sets published in the UCI ML repository. Among the test and validation accuracies of the 600 cases, we find some unexpected cases in which the test accuracy differs greatly from the validation accuracy. Consequently, it is not always true that ML's test data are adequate to assure a model's quality.
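
The 60/20/20 split and the comparison of validation and test accuracy described in this abstract can be illustrated with a minimal sketch. The function names, the 0.05 tolerance, and the sample accuracies are assumptions made for illustration; the abstract does not state how the authors decided that a case was "unexpected".

```python
import random


def split_60_20_20(items, seed):
    """Shuffle items and cut them into 60% train, 20% validation, 20% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def unexpected_cases(results, tolerance=0.05):
    """Return the (validation, test) accuracy pairs whose gap exceeds tolerance.

    The tolerance is a hypothetical threshold, not a value from the paper.
    """
    return [(v, t) for v, t in results if abs(v - t) > tolerance]


# Hypothetical accuracies for three trained models (not data from the paper).
results = [(0.95, 0.94), (0.96, 0.80), (0.90, 0.91)]
print(unexpected_cases(results))
```

Repeating the split with a different seed for each model is one way to obtain many differently trained models from the same data set, which is the kind of repetition the experiment relies on.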

A study on the Effective Use of Environmental Information System - focused on the accuracy of raw data - (환경정보체계의 효과적 이용에 관한 고찰 - 원자료의 정확성을 중심으로 -)

  • Lee, Kyoo-Seock
    • Journal of Environmental Impact Assessment / v.7 no.2 / pp.27-35 / 1998
  • In Korea, the initial installation of a GIS requires considerable cost, time, and human effort. If the accuracy of the GIS data does not meet a certain standard for use, the system may not work as expected, so the accuracy of the raw data needs to be investigated. However, there have been few studies on the accuracy of raw data in Korea. Therefore, the purpose of this study is to review the accuracy of the raw data (geologic map, 1:5,000 and 1:25,000 scale topographic maps, forest stand map, degree of green naturality (DGN) map, and detailed survey data of the DGN map) that are to be used in the Environmental Information System (EIS) in Korea. In this study, errors in the data were surveyed and the following conclusions were derived. (1) Some map data are missing, e.g., a wildlife habitat map. (2) Some data in the geologic map are misinterpreted depending on the location. (3) Some data in the topographic map are not updated properly after changes in topography, or the elevation and location differ depending on the scale. (4) Some data in the forest stand map are not edited properly, e.g., two attributes in one polygon. (5) The DGN classification system does not reflect the characteristics of Korean vegetation communities, so it needs to be refined and restructured.

Validation Data Augmentation for Improving the Grading Accuracy of Diabetic Macular Edema using Deep Learning (딥러닝을 이용한 당뇨성황반부종 등급 분류의 정확도 개선을 위한 검증 데이터 증강 기법)

  • Lee, Tae Soo
    • Journal of Biomedical Engineering Research / v.40 no.2 / pp.48-54 / 2019
  • This paper proposes a validation data augmentation method for improving the grading accuracy of diabetic macular edema (DME) using deep learning. Data augmentation is normally applied when preparing the input data of a deep neural network (DNN), to secure data diversity by transforming one image into several images through random translation, rotation, scaling, and reflection. In this paper, we apply the technique in the validation process of the trained DNN and improve the grading accuracy by combining the classification results of the augmented images. To verify its effectiveness, 1,200 retinal images of the Messidor dataset were divided into training and validation data at a ratio of 7:3. By applying random augmentation to the 359 validation images, an accuracy improvement of $1.61 \pm 0.55\%$ was achieved with six-fold augmentation (N=6). This simple method showed that accuracy can be improved for N in the range from 2 to 6, with a correlation coefficient of 0.5667. Therefore, it is expected to help improve the diagnostic accuracy of DME with the grading information provided by the proposed DNN.
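
Combining the classification results of N augmented copies of one image can be sketched as follows. The abstract does not say which combination rule the author used; averaging the per-class probabilities is an assumption here, and the function name and sample probabilities are hypothetical.

```python
def combine_augmented_predictions(prob_rows):
    """Average per-class probabilities over N augmented copies of one image.

    prob_rows holds one probability list per augmented copy.
    Returns (index of the winning class, list of mean probabilities).
    Averaging is an assumed rule; the paper may combine results differently.
    """
    n_copies = len(prob_rows)
    n_classes = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / n_copies for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical DNN outputs for three augmented copies of one retinal image.
augmented = [
    [0.5, 0.4, 0.1],  # this copy alone would be graded as class 0
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
]
grade, mean_probs = combine_augmented_predictions(augmented)
print(grade, mean_probs)
```

In this made-up example a single copy votes for class 0, while the combined result selects class 1, which shows how pooling several augmented views can change a grading decision.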

Effect of Input Data Video Interval and Input Data Image Similarity on Learning Accuracy in 3D-CNN

  • Kim, Heeil;Chung, Yeongjee
    • International Journal of Internet, Broadcasting and Communication / v.13 no.2 / pp.208-217 / 2021
  • 3D-CNN is a deep learning technique for learning time series data. However, such three-dimensional learning can generate many parameters, which requires high-performance hardware or significantly slows learning. We use a 3D-CNN to learn hand gestures, find the parameters that give the highest accuracy, and then analyze how the accuracy of the 3D-CNN varies with changes in the input data, without any structural change to the 3D-CNN. In this paper, we propose two methods for changing the input data. First, we choose the interval of the input data, which adjusts the ratio of the stop interval to the gesture interval. Second, we obtain the corresponding interframe mean value by measuring and normalizing the similarity of images through interclass 2D cross-correlation analysis. The experimental results demonstrate that changes in the input data affect the learning accuracy of the model without structural changes to the 3D-CNN.
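
A normalized similarity between frames and its interframe mean can be sketched as below. This assumes a zero-lag normalized cross-correlation between equally sized grayscale frames; the abstract does not give the exact formula, and the function names and tiny sample frames are hypothetical.

```python
import math


def normalized_cross_correlation(frame_a, frame_b):
    """Zero-lag normalized cross-correlation of two equally sized 2D frames.

    Returns a value in [-1, 1]; 0.0 is returned if either frame is constant.
    """
    flat_a = [p for row in frame_a for p in row]
    flat_b = [p for row in frame_b for p in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(flat_a, flat_b))
    var_a = sum((a - mean_a) ** 2 for a in flat_a)
    var_b = sum((b - mean_b) ** 2 for b in flat_b)
    den = math.sqrt(var_a * var_b)
    return num / den if den else 0.0


def mean_interframe_similarity(frames):
    """Mean similarity between each pair of consecutive frames."""
    scores = [
        normalized_cross_correlation(frames[i], frames[i + 1])
        for i in range(len(frames) - 1)
    ]
    return sum(scores) / len(scores)


# Hypothetical 2x2 grayscale frames.
frame = [[0, 1], [2, 3]]
inverted = [[3, 2], [1, 0]]
print(normalized_cross_correlation(frame, frame))
print(mean_interframe_similarity([frame, frame, inverted]))
```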

Accuracy Analysis of Road Surveying and Construction Inspection of Underpass Section using Mobile Mapping System

  • Park, Joon Kyu;Um, Dae Yong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.2 / pp.103-111 / 2021
  • The Mobile Mapping System (MMS) is used for High Definition (HD) map construction because it enables fast and accurate data construction, and it is receiving a lot of attention. However, research on the use of MMS in the construction field is insufficient. In this study, road surveying and inspection of construction structures were performed using MMS. Through data acquisition and processing with MMS, point cloud data for the study site were created, and their accuracy was evaluated by comparison with traditional surveying methods. The accuracy analysis showed maximum differences of 0.096 m, 0.091 m, and 0.093 m in the X, Y, and H directions, respectively, with RMSEs of 0.012 m, 0.015 m, and 0.006 m. These results satisfy the accuracy required for topographic surveying in the general survey work regulation, indicating that construction surveying using MMS is feasible. In addition, a 3D model was created from the design data for the underpass road, and the inspection was performed by comparing it with the MMS data. The inspection results allow construction deviations to be confirmed visually for the entire underground roadway. The traditional method takes 6 hours for the 4.5 km section of the target area, whereas MMS shortens the data acquisition time to 0.5 hours. Accurate 3D data are essential as basic data for future smart construction, and the fast, accurate data collection of MMS can increase the efficiency of construction sites.
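
The two accuracy figures reported per direction, the maximum difference and the RMSE, can be computed from checkpoint residuals as in this minimal sketch. The residual values are hypothetical and are not the paper's measurements; only the 6-hour and 0.5-hour figures come from the abstract.

```python
import math


def rmse(errors):
    """Root mean square error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def max_abs(errors):
    """Largest absolute residual."""
    return max(abs(e) for e in errors)


# Hypothetical X-direction residuals in metres (MMS minus conventional survey).
dx = [0.010, -0.012, 0.008, -0.015]
print(f"max = {max_abs(dx):.3f} m, RMSE = {rmse(dx):.3f} m")

# Acquisition times quoted in the abstract: 6 hours versus 0.5 hours.
speedup = 6.0 / 0.5
print(f"MMS acquisition is {speedup:.0f} times faster")
```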

Improving Data Accuracy Using Proactive Correlated Fuzzy System in Wireless Sensor Networks

  • Barakkath Nisha, U;Uma Maheswari, N;Venkatesh, R;Yasir Abdullah, R
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.9 / pp.3515-3538 / 2015
  • Data accuracy can be increased by detecting and removing the incorrect data generated in wireless sensor networks (WSNs). Increasing data accuracy can increase network lifetime in parallel. Network lifetime, or operational time, is the time during which a WSN is able to fulfill its tasks using microcontrollers with on-chip memory and radio transceivers; distributed sensor nodes send summaries of their data to their cluster heads, which gradually reduces energy consumption. This paper proposes a powerful algorithm using a proactive fuzzy system, a mixture of fuzzy logic and comparative correlation techniques that ensures high data accuracy by detecting incorrect data in distributed WSNs. The proposed system is implemented in two phases: the first phase creates the input space partitioning using robust fuzzy c-means clustering, and the second phase detects incorrect data and removes it completely. Experimental results show that the combined correlated fuzzy system (CCFS) detects faulty readings with greater accuracy (99.21%) than the existing one (98.33%), along with a low false alarm rate.
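
The first phase relies on fuzzy c-means clustering, in which each reading belongs to every cluster to some degree. The sketch below computes the textbook fuzzy c-means membership of one scalar reading for given cluster centers. It is not the paper's "robust" variant, and the function name, the fuzzifier m = 2, and the sample centers are assumptions.

```python
def fcm_memberships(value, centers, m=2.0):
    """Standard fuzzy c-means memberships of one scalar reading.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the distance
    from the reading to center i. The memberships sum to 1.
    """
    distances = [abs(value - c) for c in centers]
    # A reading that coincides with a center belongs fully to that cluster.
    if 0.0 in distances:
        hits = distances.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical cluster centers for sensor readings.
centers = [0.0, 2.0]
print(fcm_memberships(0.5, centers))
```

A reading whose memberships are spread evenly across clusters fits none of them well, which is one plausible signal that a second phase could inspect when looking for incorrect data.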

The Measurements of Data Accuracy and Error Detection in DEM using GRASS and Arc/Info (GRASS와 Arc/Info를 이용한 DEM 데이터의 정확도와 에러 측정)

  • Cho, Sung-Min
    • Journal of the Korean Association of Geographic Information Studies / v.1 no.1 / pp.3-7 / 1998
  • The issue of data accuracy brings a different perspective to GIS modeling and calls into question the usefulness of data models such as the DEM. Accuracy can be determined by randomly checking positional and attribute accuracy within a GIS data layer. With the increasing availability of DEMs and of software capable of processing them, it is worthwhile to call attention to data accuracy and error analysis, because GIS applications depend on a priori established spatial data. The purpose of this paper was to investigate methods for data accuracy measurement and error detection with two types of DEMs released by the U.S. Geological Survey: 1:24,000 and 1:250,000. Another emphasis was the development of a methodology for processing DEMs to create Arc/Info and GRASS layers. Data accuracy analysis was applied to a 250 sq. km area, and an error was detected in the 1:24,000-scale DEM. There were two possible reasons for this error: gross errors and blunders.

A Study on Accuracy Estimation of Service Model by Cross-validation and Pattern Matching

  • Cho, Seongsoo;Shrestha, Bhanu
    • International journal of advanced smart convergence / v.6 no.3 / pp.17-21 / 2017
  • In this paper, service execution accuracy was compared between an ontology-based rule inference method and a machine learning method, and the amount of data at which the service execution accuracy of the machine learning method becomes equal to that of rule inference was found. The rule inference method measures service execution accuracy using accumulated data and pattern matching on service results. The machine learning method measures service execution accuracy using cross-validation data. After creating a confusion matrix and measuring the accuracy of each service execution, the inference algorithm can be selected from the results.
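
Building a confusion matrix and deriving accuracy from it can be sketched as follows. The service labels and predictions are hypothetical; the abstract does not describe the services or the data format.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count each (actual, predicted) label pair."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Fraction of cases on the diagonal (actual label equals predicted label)."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    total = sum(matrix.values())
    return correct / total


# Hypothetical service executions: which service should have run vs. which did.
actual = ["on", "on", "off", "off", "on"]
predicted = ["on", "off", "off", "off", "on"]
matrix = confusion_matrix(actual, predicted)
print(dict(matrix))
print(accuracy(matrix))
```

Computing the same accuracy for each inference method on the same executions gives comparable numbers from which one method can be selected.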

Development of Personal-Credit Evaluation System Using Real-Time Neural Learning Mechanism

  • Park, Jong U.;Park, Hong Y.;Yoon Chung
    • The Journal of Information Technology and Database / v.2 no.2 / pp.71-85 / 1995
  • Many results reported by neural network researchers claim that the classification accuracy of neural networks is superior, or at least equal, to that of conventional methods. However, in a series of neural network classifications, it was found that classification accuracy depends strongly on the characteristics of the training data set. Although many research reports state that the classification accuracy of neural networks can differ depending on the composition and architecture of the networks, the training algorithm, and the test data set, very little research has addressed the problem of classification accuracy when the basic assumption of data monotonicity is violated. This research describes a project to develop an automated credit evaluation system. The finding was that arranging the training data so as to maintain the monotonicity of the data set is critical to successful neural training and to enhancing the classification accuracy of neural networks.

A study on the Effective Use of Environmental Information System in Korea - focused on the accuracy of raw data - (환경정보체계 구축의 효과적 이용 - 원자료의 정확성을 중심으로 -)

  • Lee, Kyoo-Seock
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 1998.11a / pp.34-36 / 1998
  • In Korea, the initial installation of a GIS requires considerable cost, time, and human effort. If the accuracy of the GIS data does not meet a certain standard for use, the system may not work as expected, so the accuracy of the raw data needs to be investigated. However, there have been few studies on the accuracy of raw data in Korea. Therefore, the purpose of this study is to review the accuracy of the raw data (geologic map, 1:5,000 and 1:25,000 scale topographic maps, forest stand map, degree of green naturality (DGN) map, and detailed survey data of the DGN map) needed to fulfill the expected use in Korea. In this study, errors in the data were surveyed and the following conclusions were derived. (1) Some data are lacking, e.g., a wildlife habitat map. (2) Some data in the geologic map are misinterpreted depending on the location. (3) Some data in the topographic map are not updated after changes in topography. (4) Some data in the forest stand map are not edited properly. (5) The DGN classification system does not reflect the characteristics of Korean vegetation communities, so it needs to be refined and restructured.
