• Title/Summary/Keyword: method: data analysis

Street Fashion Information Analysis System Design Using Data Fusion

  • Park, Hee-Chang;Park, Hye-Won
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2005.10a
    • /
    • pp.35-45
    • /
    • 2005
  • Data fusion is a method of combining data. The purpose of this study is to design and implement a street fashion information analysis system using data fusion. Because it fuses image data and survey data on street fashion, the system can offer varied and practical information. Data fusion methods include exact matching, judgmental matching, probability matching, statistical matching, and data linking. In this study, we use the exact matching method. Because the system can analyze the image data and the survey data both individually and in fused form, it supports visual information analysis from the customer's viewpoint.
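
As a rough illustration of exact matching, the sketch below fuses two record sets on a shared key and keeps only the records whose key values agree exactly. The field names (`respondent_id`, `dominant_color`, and so on) and all values are hypothetical; the abstract does not describe the system's actual schema.

```python
# Hypothetical records: field names and values are illustrative only.
image_records = [
    {"respondent_id": 1, "dominant_color": "black", "item": "jacket"},
    {"respondent_id": 2, "dominant_color": "beige", "item": "coat"},
    {"respondent_id": 4, "dominant_color": "navy", "item": "skirt"},
]
survey_records = [
    {"respondent_id": 1, "age_group": "20s", "preferred_style": "casual"},
    {"respondent_id": 2, "age_group": "30s", "preferred_style": "formal"},
    {"respondent_id": 3, "age_group": "20s", "preferred_style": "sporty"},
]


def exact_match_fuse(left, right, key):
    """Fuse two record lists, keeping only records whose key matches exactly."""
    right_by_key = {row[key]: row for row in right}
    fused = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the two records; the shared key appears once.
            fused.append({**row, **match})
    return fused


fused = exact_match_fuse(image_records, survey_records, "respondent_id")
for row in fused:
    print(row)
```

Records without a counterpart (IDs 3 and 4 here) are dropped, which is the defining trade-off of exact matching compared with statistical or probability matching.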

Analyzing Operation Deviation in the Deasphalting Process Using Multivariate Statistics Analysis Method

  • Park, Joo-Hwang;Kim, Jong-Soo;Kim, Tai-Suk
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.7
    • /
    • pp.858-865
    • /
    • 2014
  • In a system such as an MES (Manufacturing Execution System), various sensors collect data in real time and store it as big data for process monitoring. If this big data is mined in a distributed computing system, the overall process can be improved. In this paper, we built a system to analyze the causes of operation deviation using big data collected from the deasphalting process at two different plants. The main causes of operation deviation were identified by applying multivariate statistical analysis to the data collected through the MES, and we present an example of this analysis. Regression analysis with the forward stepwise method yielded a regression equation that can explain a 52% increase in performance compared with the existing model. The suggested method can replace the manual analysis currently used in petrochemical processes, which risks being subjective depending on the tester, with an objective analysis based on numbers and statistics.

Long and Short Wave Radiation and Correlation Analysis Between Downtown and Suburban Area(II) - Study on Correlation Analysis Method of Radiation Data - (도심부와 교외지역의 장·단파 복사와 상관도 분석 (II) - 관측 자료의 상관도 분석기법에 관한 연구 -)

  • Choi, Dong-Ho;Lee, Bu-Yong;Oh, Ho-Yeop
    • Journal of the Korean Solar Energy Society
    • /
    • v.33 no.4
    • /
    • pp.101-110
    • /
    • 2013
  • The purpose of this study is to understand radiation phenomena and to compare two methods of correlation analysis. One analyzes same-time data and the other analyzes rank data. We confirmed that both methods of correlation analysis are effective and suitable. The main results are as follows. 1) The seasonal correlation coefficient of long- and short-wave radiation is higher in winter than in summer, because high humidity in summer readily produces local cloud cover. 2) The correlation coefficient differs greatly by analysis method, from 0.494 (same-time data) to 0.967 (rank data), for short-wave radiation by location during summer. These results are of significant value for solar radiation research and analysis, and they open a new approach to analysis methods for solar radiation research.
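
The abstract does not define "rank data", so the following is only one possible reading: same-time analysis correlates observations paired by time, while rank-data analysis sorts each series by magnitude before correlating. The function names and the radiation values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def same_time_correlation(site_a, site_b):
    """Correlate observations paired by observation time."""
    return pearson(site_a, site_b)


def rank_data_correlation(site_a, site_b):
    """Correlate the two series after sorting each one by magnitude.

    Assumed interpretation of "rank data"; not confirmed by the abstract.
    """
    return pearson(sorted(site_a), sorted(site_b))


# Made-up short-wave radiation values (W/m^2) for two sites at the same hours.
downtown = [120, 340, 560, 610, 300, 450, 700, 200]
suburb = [400, 310, 650, 280, 520, 150, 640, 380]

r_time = same_time_correlation(downtown, suburb)
r_rank = rank_data_correlation(downtown, suburb)
print(f"same-time: {r_time:.3f}, rank data: {r_rank:.3f}")
```

Under this reading the rank-data coefficient can never be lower than the same-time coefficient, because sorting both series maximizes the sum of paired products. That would be consistent with the large gap reported for summer, when local cloud makes the two sites differ at any given moment.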

Comparison of Sentiment Analysis from Large Twitter Datasets by Naïve Bayes and Natural Language Processing Methods

  • Back, Bong-Hyun;Ha, Il-Kyu
    • Journal of information and communication convergence engineering
    • /
    • v.17 no.4
    • /
    • pp.239-245
    • /
    • 2019
  • Recently, effort to obtain various information from the vast amount of social network services (SNS) big data generated in daily life has expanded. SNS big data comprise sentences classified as unstructured data, which complicates data processing. As the amount of processing increases, a rapid processing technique is required to extract valuable information from SNS big data. We herein propose a system that can extract human sentiment information from vast amounts of SNS unstructured big data using the naïve Bayes algorithm and natural language processing (NLP). Furthermore, we analyze the effectiveness of the proposed method through various experiments. Based on sentiment accuracy analysis, experimental results showed that the machine learning method using the naïve Bayes algorithm afforded a 63.5% accuracy, which was lower than that yielded by the NLP method. However, based on data processing speed analysis, the machine learning method by the naïve Bayes algorithm demonstrated a processing performance that was approximately 5.4 times higher than that by the NLP method.
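
For readers unfamiliar with the naïve Bayes approach mentioned above, here is a minimal multinomial naïve Bayes sketch with add-one smoothing. It is not the authors' system: the whitespace tokenizer, the four training sentences, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_texts):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in labeled_texts:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny made-up training set; a real system would train on many labeled tweets.
training = [
    ("i love this phone", "positive"),
    ("what a great day", "positive"),
    ("i hate this traffic", "negative"),
    ("what a terrible day", "negative"),
]
model = train_naive_bayes(training)
print(classify("i love this great phone", model))
print(classify("terrible traffic", model))
```

Classification only needs word counts and a few logarithms per token, which may help explain the processing-speed advantage the abstract reports for the naïve Bayes method.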

Comparing Methodology of Building Energy Analysis - Comparative Analysis from steady-state simulation to data-driven Analysis - (건물에너지 분석 방법론 비교 - Steady-state simulation에서부터 Data-driven 방법론의 비교 분석 -)

  • Cho, Sooyoun;Leigh, Seung-Bok
    • KIEAE Journal
    • /
    • v.17 no.5
    • /
    • pp.77-86
    • /
    • 2017
  • Purpose: Because of growing concern over fossil fuel use and increasing demand for greenhouse gas emission reduction since the 1990s, the building energy analysis field has produced various types of methods, which are being applied more often and more broadly than ever. Many research results have been proposed in the area of building energy simulation over more than 50 years around the world. However, in the last 20 years, only a few studies have examined, estimated, or compared trends in building energy analysis. This research investigates the trend of building energy analysis by focusing on the methodology and characteristics of each method. Method: Research papers addressing building energy analysis are classified into two types of method: engineering analysis and algorithm estimation. In particular, the EPG (Energy Performance Gap), which is a limitation of both the existing engineering method and the single-algorithm estimation method, results from comparing data at two different levels, namely real-time data and simulation data. Result: When one or more ensemble algorithms are used, more accurate estimates of energy consumption and performance are produced, thereby reducing the energy performance gap.

A Bayesian Approach for the Analysis of Times to Multiple Events : An Application on Healthcare Data (다사건 시계열 자료 분석을 위한 베이지안 기반의 통계적 접근의 응용)

  • Seok, Junhee;Kang, Yeong Seon
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.39 no.4
    • /
    • pp.51-69
    • /
    • 2014
  • Times to multiple events (TMEs) are a major data type in large-scale business and medical data. Despite their importance, TME data have not been well studied because censored observations make the analysis difficult. To address this difficulty, we have developed a Bayesian multivariate survival analysis method that can successfully estimate the joint probability density of survival times. In this work, we extend this method to the analysis of precedence, dependency, and causality among multiple events. We applied the method to the electronic health records of 2,111 patients in a children's hospital in the US, and the proposed analysis successfully shows the relation between the times to two types of hospital visits for different medical issues. The overall result implies that the multivariate survival analysis method is useful for large-scale big data in a variety of areas, including marketing, human resources, and e-commerce. Lastly, we suggest future research directions based on the multivariate survival analysis method.

Customer Classification Method for Household Appliances Industries with a Large Number of Incomplete Data (다수의 결측치가 존재하는 가전업 고객 데이터 활용을 위한 고객분류기법의 개발)

  • Chang, Young-Soon;Seo, Jong-Hyen
    • IE interfaces
    • /
    • v.19 no.1
    • /
    • pp.86-96
    • /
    • 2006
  • Customer data in some manufacturing industries are largely incomplete because customers purchase infrequently and the customer profile data gathered by sales representatives are limited. As a result, most sophisticated data analysis methods cannot be applied directly. This paper proposes a heuristic data analysis method to classify customers in the household appliance industry. The proposed PD (percent of difference) method can be used for discriminant analysis of incomplete customer data with simple mathematical calculations. The method consists of a variable distribution estimation step, PD measure and cluster score evaluation steps, a variable impact construction step, and a segment assignment step. A real example is also presented.

Evaluation Method of Quality of Service in Telecommunications Using Logit Model (로짓모형을 이용한 통신 서비스품질 평가방법)

  • Cho, Jae-Gyeun;Ahn, Hae-Sook
    • IE interfaces
    • /
    • v.15 no.2
    • /
    • pp.209-217
    • /
    • 2002
  • Quality of Service (QoS) in telecommunications can be evaluated by analyzing opinion data, which are obtained by surveying respondents and which quantify subjective satisfaction with QoS from the customers' viewpoint. To analyze opinion data, the MOS (mean opinion score) method and the Cumulative Probability Curve method are often used. Both are based on scoring and therefore have an intrinsic deficiency: the scores are assigned arbitrarily. In this paper, we propose an analysis method for opinion data using logit models, which can analyze ordinal categorical data without assigning arbitrary scores to customers' opinions, and we develop an analysis procedure that uses procedures provided by the SAS (Statistical Analysis System) statistical package. With the proposed method, we can estimate the relationship between customer satisfaction and network performance parameters and provide guidelines for network planning. In addition, the proposed method is compared with the Cumulative Probability Curve method with respect to prediction errors.
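
To make the idea of a logit model for ordinal opinions concrete, the sketch below evaluates a cumulative logit (proportional-odds) model, which is one common logit model for ordinal categories; the abstract does not say which variant the authors fit. The thresholds, the slope `beta`, and the predictor values are made-up numbers, not estimates from the paper, and the paper itself works in SAS rather than Python.

```python
import math


def cumulative_logit_probs(x, thresholds, beta):
    """Category probabilities under a cumulative logit model.

    P(Y <= j | x) = 1 / (1 + exp(-(thresholds[j] - beta * x)))
    `thresholds` must be increasing; there is one more category than thresholds.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(t - beta * x))) for t in thresholds]
    cumulative.append(1.0)  # P(Y <= last category) is always 1
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs


categories = ["bad", "poor", "fair", "good", "excellent"]
thresholds = [-2.0, -0.5, 1.0, 2.5]  # hypothetical cut points
beta = 1.2  # hypothetical effect of a network performance parameter

for x in (0.0, 2.0):
    probs = cumulative_logit_probs(x, thresholds, beta)
    summary = ", ".join(f"{c}={p:.2f}" for c, p in zip(categories, probs))
    print(f"x={x}: {summary}")
```

The model uses only the ordering of the opinion categories. No numeric score such as 1 to 5 is attached to them, which is the deficiency of score-based methods that the abstract points out.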

A Study on Numerical Method for Motion Analysis of Cylindrical Cam with Translate Follower (병진운동용 원통캠기구의 운동해석을 위한 수치해석법 연구)

  • 김상진;신중호;김대원;박세환
    • Proceedings of the Korean Society of Precision Engineering Conference
    • /
    • 2002.05a
    • /
    • pp.719-722
    • /
    • 2002
  • Cylindrical cam mechanisms are commonly used in automatic machinery, but the shape of a cylindrical cam is very difficult to design and manufacture. Motion analysis of a cylindrical cam can check the accuracy of the manufactured cam shape against the design data and allows the cam to be reproduced without the cam design data. The motion analysis consists of displacement analysis, velocity analysis, and acceleration analysis. This paper performs the motion analysis of a cylindrical cam with a translating follower using a relative velocity method and a central difference method. The displacement is calculated by the central difference method and the velocity by the relative velocity method. The relative velocity method is defined by the relative motion between the follower and the cam at the center of the follower roller. The central difference method is derived in three-dimensional space.
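
The abstract does not give the paper's three-dimensional formulation. As a generic reminder of what a central difference does, the sketch below differentiates sampled follower displacement with respect to cam angle in one dimension. The simple-harmonic displacement curve and the 10 mm stroke are hypothetical test data.

```python
import math


def central_difference(values, step):
    """Approximate the first derivative at the interior sample points.

    f'(t_i) ~ (f(t_{i+1}) - f(t_{i-1})) / (2 * step)
    """
    return [
        (values[i + 1] - values[i - 1]) / (2.0 * step)
        for i in range(1, len(values) - 1)
    ]


# Hypothetical follower displacement: simple harmonic rise with a 10 mm stroke.
stroke = 10.0
step = math.radians(1.0)  # cam angle increment of one degree
angles = [i * step for i in range(181)]
displacement = [0.5 * stroke * (1.0 - math.cos(a)) for a in angles]

# Rate of change of displacement, in mm per radian of cam angle.
rate = central_difference(displacement, step)

# Compare with the analytic derivative 0.5 * stroke * sin(angle).
max_error = max(
    abs(r - 0.5 * stroke * math.sin(a)) for r, a in zip(rate, angles[1:-1])
)
print(f"largest deviation from the analytic derivative: {max_error:.2e}")
```

The approximation error shrinks with the square of the step size, so halving the angle increment cuts it to roughly a quarter.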

A Study on Error of Frequence Rainfall Estimates Using Random Variate (무작위변량을 이용한 강우빈도분석시 내외삽오차에 관한 연구)

  • Chai, Han Kyu;Eam, Ki Ok
    • Journal of Industrial Technology
    • /
    • v.20 no.A
    • /
    • pp.159-167
    • /
    • 2000
  • In rainfall frequency analysis, the error term and the apparent probability distribution differ depending on the characteristics and record length of the data. Error analysis methods usually rely on the observed sample data; however, their shortcomings do not become apparent when the number of sample data is small, which motivates the use of random variates. Therefore, considering the randomness of daily rainfall data, this study randomized the sample data with the Monte Carlo method, formed data sets of different record lengths, and compared the results of frequency analysis on these split records with an assumed parent distribution. In conclusion, frequency analysis of rainfall in the Chuncheon region showed a small RMSE for the Gamma II distribution. When the RMSE of rainfall frequency estimates was evaluated using random variates, the RMSE increased gradually as the return period increased and decreased as the number of data increased.
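
The Monte Carlo procedure described above can be sketched as follows: draw random variates from an assumed parent distribution, fit a distribution to each synthetic record, and measure the RMSE of the estimated quantile against the parent quantile. This sketch assumes a Gumbel parent fitted by the method of moments, chosen only because its quantile has a closed form; the study itself reports results for the Gamma II distribution. The parameter values (location 100, scale 30) are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649


def gumbel_quantile(location, scale, return_period):
    """Depth exceeded on average once every `return_period` years."""
    non_exceedance = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(non_exceedance))


def sample_gumbel(rng, location, scale, size):
    """Draw `size` random variates from a Gumbel distribution by inversion."""
    draws = []
    while len(draws) < size:
        u = rng.random()
        if 0.0 < u < 1.0:  # guard against log(0)
            draws.append(location - scale * math.log(-math.log(u)))
    return draws


def fit_gumbel_moments(sample):
    """Method-of-moments estimates of the Gumbel location and scale."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    location = statistics.fmean(sample) - EULER_GAMMA * scale
    return location, scale


def quantile_rmse(rng, location, scale, record_length, return_period, trials=500):
    """RMSE of the estimated quantile against the parent-distribution quantile."""
    true_value = gumbel_quantile(location, scale, return_period)
    squared_errors = []
    for _ in range(trials):
        sample = sample_gumbel(rng, location, scale, record_length)
        est_location, est_scale = fit_gumbel_moments(sample)
        estimate = gumbel_quantile(est_location, est_scale, return_period)
        squared_errors.append((estimate - true_value) ** 2)
    return math.sqrt(statistics.fmean(squared_errors))


rng = random.Random(0)  # fixed seed so the experiment is repeatable
for record_length in (20, 50, 100):
    for return_period in (10, 50, 100):
        rmse = quantile_rmse(rng, 100.0, 30.0, record_length, return_period)
        print(f"n={record_length:3d}  T={return_period:3d}  RMSE={rmse:.2f}")
```

Even with a different parent distribution, the sketch shows the same qualitative pattern the abstract reports: RMSE grows with the return period and shrinks as the record gets longer.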
