• Title/Summary/Keyword: Multi-Variate Data Analysis


Multi-Variate Tabular Data Processing and Visualization Scheme for Machine Learning based Analysis: A Case Study using Titanic Dataset (기계 학습 기반 분석을 위한 다변량 정형 데이터 처리 및 시각화 방법: Titanic 데이터셋 적용 사례 연구)

  • Juhyoung Sung;Kiwon Kwon;Kyoungwon Park;Byoungchul Song
    • Journal of Internet Computing and Services / v.25 no.4 / pp.121-130 / 2024
  • As information and communication technology (ICT) improves exponentially, the types and amount of available data are also increasing. Although data analysis, including statistics, is important for making use of this large amount of data, conventional approaches are inevitably limited in processing diverse and complex data. Meanwhile, with improvements in computational performance and growing demand for autonomous systems, there have been many attempts to apply machine learning (ML) to problems in various fields. In particular, processing the data for the model input and designing the model to solve the objective function are critical to model performance. Many studies have presented data processing methods suited to the type and properties of the data, and ML performance varies greatly depending on the method. Nevertheless, it is difficult to decide which data processing method to use for analysis because the types and characteristics of data have become more diverse. Specifically, multi-variate data processing is essential for solving non-linear problems with ML. In this paper, we present a multi-variate tabular data processing scheme for ML-aided data analysis using the Titanic dataset from Kaggle, which includes various kinds of data. We present methods such as input variable filtering based on statistical analysis and normalization according to the data properties. In addition, we analyze the data structure using visualization. Lastly, we design an ML model and train it by applying the proposed multi-variate data processing, and then analyze the passenger survival prediction performance of the trained model. We expect that the proposed multi-variate data processing and visualization can be extended to various environments for ML-based analysis.
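
The two preprocessing steps named in this abstract, statistical filtering of input variables and normalization, can be illustrated with a minimal sketch. This is not the authors' pipeline: the choice of Pearson correlation as the filter statistic, the 0.1 threshold, min-max scaling, and the toy column names and values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        # Correlation is undefined for a constant sequence; treat as 0.
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def filter_features(columns, target, threshold=0.1):
    """Keep columns whose |correlation| with the target meets the threshold."""
    return {
        name: vals
        for name, vals in columns.items()
        if abs(pearson(vals, target)) >= threshold
    }


# Hypothetical toy data (not taken from the Titanic dataset).
columns = {
    "fare": [7.25, 71.28, 8.05, 53.10],
    "constant_flag": [1, 1, 1, 1],
}
survived = [0, 1, 0, 1]

selected = filter_features(columns, survived)
normalized = {name: min_max_normalize(vals) for name, vals in selected.items()}
```

Here the constant column is dropped because it cannot correlate with the target, and the remaining column is rescaled before it would be passed to a model.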

Feature Selecting and Classifying Integrated Neural Network Algorithm for Multi-variate Classification (다변량 데이터의 분류 성능 향상을 위한 특질 추출 및 분류 기법을 통합한 신경망 알고리즘)

  • Yoon, Hyun-Soo;Baek, Jun-Geol
    • IE interfaces / v.24 no.2 / pp.97-104 / 2011
  • Multi-variate classification has been studied through two kinds of procedures: feature selection and classification. Feature selection techniques have been applied to select important features, while classification techniques have improved performance through the application of classifiers. In general, each technique has been studied independently; the interaction between the two procedures has not been widely explored, which leads to degraded performance. In this paper, classification performance is improved by integrating these two procedures. The proposed model takes advantage of KBANN (Knowledge-Based Artificial Neural Network), which uses prior knowledge as training information for an NN (Neural Network). Each NN learns the characteristics of the feature selection and classification techniques as training sets. The integrated NN can then be trained again to modify features appropriately and enhance classification performance. This innovative technique is called ALBNN (Algorithm Learning-Based Neural Network). The experimental results show improved performance in various classification problems.

A Propose of New Classification Indication about Work of Art through Numeric and Multivariate Data Analysis - Focused on the Specialist - (예술작품의 수치화와 다변량분석에 의한 새로운 분류 제안 - 전문가를 중심으로 -)

  • Suh, Myung-Ae;Ree, Sang-Bok
    • Journal of Korean Society for Quality Management / v.35 no.4 / pp.67-77 / 2007
  • This paper attempts a new way of interpreting works of art. Until now, interpretation has respected and centered on the intention of the artist who made the work. Critics have classified works by period and region and interpreted them according to the philosophical thought of the time, so interpretation has depended on the subjectivity of each artist and each appreciator. In this paper, we instead bring together the various criteria used to appreciate a work of art and present the closeness among them in a new way. In other words, we attempt to objectify what has been bound to individual subjectivity by means of statistical techniques. The introduction surveys culture and art and then explains the discussion about interpreting works of art, which is the main subject. We define six categories and explain each criterion and its assessment method. Finally, we propose a new interpretation based on the closeness obtained from multi-variate data analysis of the assessment results.

Relationships Between Multiple Intelligences and Affective Factors in Children's Learning (아동의 다중지능과 학습의 정의적 요인의 관계)

  • Jung, Hye Young;Lee, Kyeong Hwa
    • Korean Journal of Child Studies / v.28 no.5 / pp.253-267 / 2007
  • This study examined the relationships between multiple intelligences as cognitive factors and the affective factors of learning motivation and academic self-concept. Data were collected from 276 4th grade elementary school students and analyzed by correlation, multi-variate analysis, and step-wise multiple regression. The results were as follows. (1) Multiple intelligences, learning motivation, and academic self-concept were significantly correlated with one another, and multi-variate analysis showed that intra-personal intelligence explained 58.6% of the linear combination of learning motivation and academic self-concept. (2) Intra-personal intelligence explained 29% to 58% of learning motivation and its sub-factors of achievement motivation, internal locus of control, self-efficacy, and self-regulation. (3) Intra-personal intelligence, logical-mathematical intelligence, musical intelligence, and inter-personal intelligence were explanatory variables for academic self-concept and its sub-factors.


Rock TBM design model derived from the multi-variate regression analysis of TBM driving data (TBM 굴진자료의 다변량 회귀분석에 의한 암반대응형 TBM의 설계모델 도출)

  • Chang, Soo-Ho;Choi, Soon-Wook;Lee, Gyu-Phil;Bae, Gyu-Jin
    • Journal of Korean Tunnelling and Underground Space Association / v.13 no.6 / pp.531-555 / 2011
  • This study aims to derive statistical models for estimating the required specifications of a rock TBM and for designing a cutterhead suited to a given rock mass condition. From a series of multi-variate regression analyses of 871 TBM driving data and 51 linear rock cutting test results, optimum models are newly proposed that consider a variety of rock properties and mechanical cutting conditions. When the derived models were applied to two domestic shield tunnels, their predictions of cutter penetration depth, cutter acting forces and cutter spacing were very close to the actual TBM driving data, showing their high applicability.

Explanatory Analysis for South Korea's Political Website Linking - Statistical Aspects

  • Choi, Kyoung-Ho;Park, Han-Woo
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.899-911 / 2005
  • This paper conducts an explanatory analysis of the web sphere produced by members of the National Assembly in South Korea, using several statistical methods. First, descriptive metrics were employed. Next, traditional multi-variate methods, namely multidimensional scaling and correspondence analysis, were applied to the data. Finally, cross-sectional data were compared to examine change over time.


A Study on the Relationship of Air Pollution and Meteorological Factors : Focusing at Kwanghwamun in Seoul (대기오염농도와 기상인자의 관련성 연구: 서울 광화문지점을 중심으로)

  • 신찬기;한진석;김윤신
    • Journal of Korean Society for Atmospheric Environment / v.8 no.4 / pp.213-220 / 1992
  • Simple correlation analysis, factor analysis, and multi-variate analysis were performed to analyze the relationship between air pollution and meteorological factors, using data measured at Kwanghwamun in Seoul over one year (January 1990 $\sim$ December 1990). The simple correlation and factor analyses show that $SO_2$, TSP and CO concentrations are strongly negatively correlated with temperature and correlated with one another, indicating that they follow the pollutant emission trend driven by heating fuel usage. Ozone correlates well with solar radiation and relative humidity, consistent with a close relation to the $O_3$ generation reaction mechanism. The multi-variate correlation analysis shows that $SO_2$ and CO concentrations are adequately described by a correlation model with ambient temperature and wind speed, and $O_3$ concentrations by one with solar radiation and wind speed. $SO_2$ and CO levels are considered to be affected primarily by heating fuel usage as an emission source and by wind speed as a dispersion effect. When the temperature falls below zero, the $SO_2$ concentration is explained by a multiplicative model with wind speed as the only variable.
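
A single-variable multiplicative model of the kind mentioned at the end of this abstract is often written as a power law, $C = a \cdot U^{b}$, and fitted by linear least squares after taking logarithms. The sketch below assumes that form; the abstract does not state the exact model used. The function name, symbols, and the synthetic numbers are illustrative only and are not measurements from the study.

```python
import math


def fit_power_law(wind_speed, concentration):
    """Fit C = a * U**b by ordinary least squares on log(C) versus log(U)."""
    xs = [math.log(u) for u in wind_speed]
    ys = [math.log(c) for c in concentration]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx               # slope in log-log space is the exponent
    a = math.exp(my - b * mx)   # intercept in log-log space is log(a)
    return a, b


# Synthetic data generated from a = 2.0, b = -0.5 (not real measurements).
wind = [1.0, 2.0, 4.0, 8.0]
conc = [2.0 * u ** -0.5 for u in wind]

a, b = fit_power_law(wind, conc)
```

A negative fitted exponent would correspond to concentration falling as wind speed rises, which matches the dispersion effect the abstract describes.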


Multi-variate Empirical Mode Decomposition (MEMD) for ambient modal identification of RC road bridge

  • Mahato, Swarup;Hazra, Budhaditya;Chakraborty, Arunasis
    • Structural Monitoring and Maintenance / v.7 no.4 / pp.283-294 / 2020
  • In this paper, an adaptive MEMD-based modal identification technique for linear time-invariant systems is proposed that employs multiple vibration measurements. Traditional empirical mode decomposition (EMD) suffers from mode-mixing during the sifting operations used to identify intrinsic mode functions (IMFs). MEMD performs better in this context because it considers multi-channel data and projects them into an n-dimensional hypercube to evaluate the IMFs. Using this technique, the modal parameters of the structural system are identified. It is observed that MEMD has superior performance compared to its traditional counterpart. However, it still suffers from mild mode-mixing in higher modes where the energy content is low. To avoid this problem, an adaptive filtering scheme is proposed to decompose the interfering modes. The proposed modified scheme is then applied to vibrations of a reinforced concrete road bridge. Results presented in this study show that the proposed MEMD-based approach coupled with the filtering technique can effectively identify the parameters of the dominant modes present in the structural response with a significant level of accuracy.

The Correlation Analysis of Physical Characteristics on Human Sensibility Space (감성적 의미공간상의 물리특성간 상관분석)

  • 김정만;김병극
    • Journal of Korean Society of Industrial and Systems Engineering / v.22 no.52 / pp.241-246 / 1999
  • In this study, to make the evaluation of human sensibility concrete, the color types, illumination intensities, and lights that make up the working environment are specified, and image data are obtained by examining how human sensibility changes as these three conditions change. Using factor analysis and quantification theory, which are multi-variate analysis methods used in Sensibility Ergonomics, the factor structure is determined and the relations between the environmental conditions and the factors are identified, so that the structure of the image in human sensibility space under changing environmental conditions can be analyzed.


Dynamics Analysis of a Small Training Boat and Its Optimal Control

  • Nakatani, Toshihiko;End, Makoto;Yamamoto, Keiichiro;Kanda, Taishi
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.342-345 / 2005
  • This paper describes a dynamics analysis of a small training boat and a new type of ship autopilot that not only keeps the ship's course but also reduces its roll motion. First, statistical analysis with a multi-variate auto-regressive model is carried out using real data collected during a sea trial of the small training boat Sazanami after its navigational system was upgraded. The analysis shows that the roll motion is strongly influenced by the rudder motion, which suggests that roll can be reduced by controlling the rudder order properly. Based on this observation, a new type of autopilot that takes roll motion into account is designed using multi-variate modern control theory. Lastly, digital simulations driven by white noise are carried out to evaluate the proposed system, and a typical result is presented. In the simulations, the proposed autopilot performed well compared with the original data.
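
For reference, a multi-variate auto-regressive model of order $p$ is conventionally written as shown below. The symbols are assumptions chosen for illustration and are not notation from the paper: $x_t$ is the vector of measured signals at time step $t$ (for example roll angle and rudder angle), $A_k$ are coefficient matrices, and $\varepsilon_t$ is a white-noise vector with covariance $\Sigma$.

```latex
x_t = \sum_{k=1}^{p} A_k \, x_{t-k} + \varepsilon_t,
\qquad \varepsilon_t \sim \text{white noise with covariance } \Sigma
```

In such a model, the off-diagonal entries of each $A_k$ describe how one signal's past values feed into another signal, which is the kind of coupling the abstract reports between rudder motion and roll motion.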
