• Title/Summary/Keyword: analysis data

Method of Processing the Outliers and Missing Values of Field Data to Improve RAM Analysis Accuracy (RAM 분석 정확도 향상을 위한 야전운용 데이터의 이상값과 결측값 처리 방안)

  • Kim, In Seok;Jung, Won
    • Journal of Applied Reliability / v.17 no.3 / pp.264-271 / 2017
  • Purpose: Field operation data contain missing values or outliers arising from various causes in the data collection process, so caution is required when using RAM analysis results based on such data. The purpose of this study is to present a method that minimizes RAM analysis error for field data and thereby improves accuracy. Methods: Statistical methods are presented for processing the outliers and missing values of field operation data, and the RAM analysis results before and after applying the methods are compared. Results: The estimated availability is 6.8 to 23.5% lower than before processing, indicating that the treatment of missing values and outliers greatly affects the RAM analysis result. Conclusion: A RAM analysis of the OO weapon system was performed, and suggestions for improving RAM analysis were presented through a comparison of the new and current methods. Data analysis without appropriate treatment of erroneous values may produce incorrect conclusions that lead to inappropriate decisions and actions.
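
As a rough illustration of the kind of preprocessing this abstract describes, the sketch below removes outliers with the 1.5 × IQR rule and fills missing repair times with the median before computing availability as MTBF / (MTBF + MTTR). The abstract does not say which statistical methods the authors applied, so the IQR rule, the median imputation, the function names, and the sample numbers are all assumptions made for illustration.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def impute_median(values):
    """Replace None (missing) entries with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def availability(uptimes, repair_times):
    """Availability = MTBF / (MTBF + MTTR)."""
    mtbf = statistics.mean(uptimes)
    mttr = statistics.mean(repair_times)
    return mtbf / (mtbf + mttr)

# Hypothetical field data: hours between failures and hours to repair.
uptimes = [120.0, 135.0, 110.0, 128.0, 2000.0, 117.0, 125.0, 131.0]
repairs = [4.0, None, 5.5, 3.8, 4.2, None, 6.0, 4.9]

# "Before": outlier kept, missing repair records simply skipped.
raw = availability(uptimes, [r for r in repairs if r is not None])
# "After": outlier removed, missing repair times imputed.
cleaned = availability(iqr_filter(uptimes), impute_median(repairs))
```

With these made-up numbers the single extreme uptime inflates MTBF, so the cleaned availability comes out lower than the raw one, which is the direction of change the abstract reports.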

A Study on the Classification of Variables Affecting Smartphone Addiction in Decision Tree Environment Using Python Program

  • Kim, Seung-Jae
    • International journal of advanced smart convergence / v.11 no.4 / pp.68-80 / 2022
  • Since the launch of AI, technology development to implement complete and sophisticated AI functions has continued. In efforts to develop technologies for complete automation, machine learning and deep learning techniques are mainly used. These techniques involve supervised learning, unsupervised learning, and reinforcement learning as internal technical elements, and in turn rely on big data analysis to lay the groundwork for decision-making. Established decisions are then improved through repeated revision of the decision-making standards. In other words, big data analysis, which enables data classification and recognition, is important enough to be called a key technical element of AI, and it therefore requires sophisticated analysis. In this study, among the various tools that can analyze big data, we use a Python program to find out which variables affect smartphone addiction in a decision tree environment. We check whether data classification by decision tree in Python shows the same performance as other tools and whether it can lend reliability to decision-making about the addictiveness of smartphone use. The results suggest that big data analysis can be performed without problems using any of the various statistical tools, such as Python and R.
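
To make the decision-tree idea concrete, here is a minimal sketch of the core step of such a classifier: choosing the threshold on one variable that minimizes the weighted Gini impurity of the two resulting groups. It is not the authors' program; the variable (daily usage hours), the labels, and the function names are hypothetical, and a full tree would repeat this step recursively over several variables.

```python
def gini(labels):
    """Gini impurity for binary (0/1) labels: 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    best = (None, gini(labels))  # baseline: no split at all
    for t in sorted(set(values))[:-1]:  # the largest value cannot split the data
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical survey data: daily smartphone hours and an addiction label.
hours = [1, 2, 2, 3, 5, 6, 7, 8]
addicted = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(hours, addicted)
```

A variable whose best split leaves a low impurity separates the classes well, which is one way a tree indicates which variables matter.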

Comparison of Methods for Reducing the Dimension of Compositional Data with Zero Values

  • Song, Taeg-Youn;Choi, Byung-Jin
    • Communications for Statistical Applications and Methods / v.19 no.4 / pp.559-569 / 2012
  • Compositional data consist of compositions that are non-negative vectors of proportions with the unit-sum constraint. In disciplines such as petrology and archaeometry, it is fundamental to statistically analyze this type of data. Aitchison (1983) introduced a log-contrast principal component analysis that involves logratio transformed data, as a dimension-reduction technique to understand and interpret the structure of compositional data. However, the analysis is not usable when zero values are present in the data. In this paper, we introduce 4 possible methods to reduce the dimension of compositional data with zero values. Two real data sets are analyzed using the methods and the obtained results are compared.
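
The zero problem mentioned in this abstract comes from the logarithm in the log-ratio transform, which is undefined at zero. The sketch below shows a centered log-ratio (clr) transform failing on a composition with a zero part, followed by one common workaround: multiplicative replacement of zeros by a small value. The abstract does not name its four methods, so this replacement may not be among them; the value of `delta` and the function names are assumptions.

```python
import math

def clr(x):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]  # math.log(0) raises ValueError
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def replace_zeros(x, delta=1e-3):
    """Multiplicative replacement for a unit-sum composition.

    Zeros become delta; nonzero parts are shrunk so the sum stays 1.
    """
    n_zero = sum(1 for v in x if v == 0)
    scale = 1 - n_zero * delta
    return [delta if v == 0 else v * scale for v in x]

composition = [0.6, 0.4, 0.0]  # hypothetical three-part composition

try:
    clr(composition)
    zero_ok = True
except ValueError:
    zero_ok = False  # the transform is not usable with a zero part

fixed = replace_zeros(composition)
scores = clr(fixed)  # clr coordinates sum to zero by construction
```

The replacement keeps the ratios among the nonzero parts unchanged, but the result still depends on the chosen `delta`.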

A Study on Building Energy Consumption Pattern Analysis Using Data Mining (데이터 마이닝을 이용한 건물 에너지 사용량 패턴 분석에 대한 연구)

  • Jung, Ki-Taek;Yoon, Sung-Min;Moon, Hyeun-Jun;Yeo, Wook-Hyun
    • KIEAE Journal / v.12 no.2 / pp.77-82 / 2012
  • Data mining discovers problems in large amounts of data and tries to find the cause and structure of those problems. For building energy consumption patterns, the amount of data is practically unlimited, and the patterns are subject to many direct and indirect effects, so the correlations among them need to be discussed. This work looks for the causes of energy consumption so that energy management can identify issues, and it applies data mining techniques to building energy analysis in order to predict energy consumption. The results are as follows: 1) Using data mining techniques, we classified complicated data into several patterns and gained meaningful information from them. 2) Using cluster analysis, we classified the building energy consumption data of residents and analyzed the characteristics of the patterns.

A Study on the Principal Component Analysis of Anthropometric Data (인체계측치(人體計測値)의 주성분분석(主成分分析)에 관한 연구(硏究))

  • Lee, Sang-Do;Jeong, Jung-Hui;Kim, Geuk-Bae
    • Journal of the Ergonomics Society of Korea / v.2 no.1 / pp.3-11 / 1983
  • Anthropometric data are the most basic material for all related studies. Such data therefore call for more varied analysis than consideration of variance alone. This study selected 13 body parts that adequately represent the overall characteristics of the human body, and anthropometric data were obtained by actually measuring male and female workers employed in production factories. Principal component analysis, a multivariate analysis method, was then applied to interpret the data.

The analysis of flight data of B747-400 aircraft with Missed Approach (B747-400 항공기의 Missed Approach 비행자료 분석)

  • Shin, D.W.;Park, J.H.;Eun, H.B.
    • Journal of the Korean Society for Aviation and Aeronautics / v.11 no.2 / pp.93-107 / 2003
  • This study was performed to secure the safety of civil aviation by establishing a systematic capability for analyzing Flight Data Recorder data. It covers reading out the UFDR (Universal Flight Data Recorder) to a personal computer, numerical analysis of the flight data, and the regulations on Missed Approach. For the analysis, flight data from a B747-400 aircraft that performed a Missed Approach at San Francisco (KSFO) were selected.

Proposed Data Literacy Competency Framework through Literature Analysis

  • Hyo-suk Kang;Suntae Kim
    • International Journal of Knowledge Content Development & Technology / v.14 no.3 / pp.115-140 / 2024
  • With the advent of the Fourth Industrial Revolution and the era of big data, the ability to handle data has become essential. This has heightened the importance and necessity of data literacy competencies. The purpose of this study is to propose a framework for data literacy competencies. To achieve this goal, data literacy frameworks from eight countries and twelve pieces of literature on data literacy competencies were analyzed and synthesized, resulting in five categories and twenty-three competencies. The five categories are: data understanding and ethics, data collection and management, data analysis and evaluation, data utilization, and data governance and systems. It is hoped that the data literacy competency framework proposed in this study will serve as a foundational resource for policies, curricula, and the enhancement of individual data literacy competencies.

A study on unstructured text mining algorithm through R programming based on data dictionary (Data Dictionary 기반의 R Programming을 통한 비정형 Text Mining Algorithm 연구)

  • Lee, Jong Hwa;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.20 no.2 / pp.113-124 / 2015
  • Unlike structured data, which are gathered and saved in a predefined structure, unstructured text data, mostly written in natural language, have found wider application recently owing to the emergence of Web 2.0. Text mining, which extracts meaningful information from text, is one of the most important big data analysis techniques, not only because the amount of text data has increased but also because human emotion is expressed directly in text. In this study, we used R, an open-source software for statistical analysis, and studied algorithm implementation to conduct analyses such as frequency analysis, cluster analysis, word clouds, and social network analysis. In particular, to focus on our research scope, we used a keyword extraction method based on a data dictionary. Applying the method to real cases, we found that R is very useful as statistical analysis software that runs on a variety of operating systems and interfaces with other languages.

MapReduce-Based Partitioner Big Data Analysis Scheme for Processing Rate of Log Analysis (로그 분석 처리율 향상을 위한 맵리듀스 기반 분할 빅데이터 분석 기법)

  • Lee, Hyeopgeon;Kim, Young-Woon;Park, Jiyong;Lee, Jin-Woo
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.5 / pp.593-600 / 2018
  • Owing to the advancement of the Internet and smart devices, access to various media such as social media has become easy, and a large amount of big data is being produced. In particular, companies that provide Internet services analyze this big data with MapReduce-based techniques to investigate customer preferences and patterns and to strengthen security. However, when the number of reducer objects generated in the reduce stage is fixed at one, the processing rate of MapReduce-based big data analysis decreases. Therefore, this paper proposes a MapReduce-based split big data analysis method to improve the log analysis processing rate. The proposed method separates the reducer partitioning stage from the analysis result combining stage and improves the processing rate by generating the number of reducer objects dynamically, which reduces the bottleneck.
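
The split between a partitioning stage and a combining stage can be sketched in a single process as follows. This is only a toy simulation in plain Python, not Hadoop code and not the authors' scheme: the log format, the rule for choosing the number of reducers (`choose_reducers`), and all function names are hypothetical.

```python
from collections import Counter, defaultdict
import zlib

def map_phase(lines):
    """Emit (status_code, 1) per log line; the status is assumed to be the last field."""
    for line in lines:
        yield line.split()[-1], 1

def choose_reducers(n_records, records_per_reducer=2):
    """Hypothetical dynamic rule: one reducer per `records_per_reducer` records."""
    return max(1, -(-n_records // records_per_reducer))  # ceiling division

def partition(pairs, n_reducers):
    """Route each key to one reducer bucket using a stable hash."""
    buckets = defaultdict(list)
    for key, value in pairs:
        # crc32 is stable across runs, unlike the built-in hash() for str
        buckets[zlib.crc32(key.encode()) % n_reducers].append((key, value))
    return buckets

def reduce_phase(bucket):
    """Sum the counts per key within one bucket."""
    counts = Counter()
    for key, value in bucket:
        counts[key] += value
    return counts

def combine(partials):
    """Merge the per-reducer results into one final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

logs = [
    "GET /index.html 200", "GET /missing 404", "GET /index.html 200",
    "POST /login 500", "GET /about 200", "GET /old 404",
]

pairs = list(map_phase(logs))
n = choose_reducers(len(pairs))
result = combine(reduce_phase(b) for b in partition(pairs, n).values())
single = combine([reduce_phase(pairs)])  # single-reducer baseline
```

Because all records with the same key land in the same bucket, the combined result matches the single-reducer baseline; in a real cluster the buckets would be reduced in parallel.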

Automatic Cross-calibration of Multispectral Imagery with Airborne Hyperspectral Imagery Using Spectral Mixture Analysis

  • Yeji, Kim;Jaewan, Choi;Anjin, Chang;Yongil, Kim
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.3 / pp.211-218 / 2015
  • The analysis of remote sensing data depends on sensor specifications that provide accurate and consistent measurements. However, it is not easy to establish confidence and consistency in data that are analyzed by different sensors using various radiometric scales. For this reason, the cross-calibration method is used to calibrate remote sensing data with reference image data. In this study, we used an airborne hyperspectral image in order to calibrate a multispectral image. We presented an automatic cross-calibration method to calibrate a multispectral image using hyperspectral data and spectral mixture analysis. The spectral characteristics of the multispectral image were adjusted by linear regression analysis. Optimal endmember sets between two images were estimated by spectral mixture analysis for the linear regression analysis, and bands of hyperspectral image were aggregated based on the spectral response function of the two images. The results were evaluated by comparing the Root Mean Square Error (RMSE), the Spectral Angle Mapper (SAM), and average percentage differences. The results of this study showed that the proposed method corrected the spectral information in the multispectral data by using hyperspectral data, and its performance was similar to the manual cross-calibration. The proposed method demonstrated the possibility of automatic cross-calibration based on spectral mixture analysis.
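
Two pieces of this abstract translate directly into short formulas: the linear regression that adjusts one band to the reference, and the RMSE and Spectral Angle Mapper (SAM) metrics used for evaluation. The sketch below fits a gain and offset by ordinary least squares and computes both metrics. The sample values are invented, the function names are hypothetical, and the endmember estimation and band aggregation steps of the paper are not reproduced.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y ≈ gain * x + offset."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    gain = sxy / sxx
    return gain, my - gain * mx

def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

def spectral_angle(a, b):
    """Spectral Angle Mapper: angle in radians between two spectra."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Hypothetical values for one band at five matching locations:
# multispectral readings and the reference derived from hyperspectral data.
multispectral = [0.10, 0.20, 0.30, 0.40, 0.50]
reference = [0.17, 0.29, 0.41, 0.53, 0.65]

gain, offset = fit_line(multispectral, reference)
calibrated = [gain * v + offset for v in multispectral]

error_before = rmse(multispectral, reference)
error_after = rmse(calibrated, reference)
```

SAM is insensitive to a uniform scaling of a spectrum, so it complements RMSE, which measures absolute differences.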