• Title/Abstract/Keywords: analysis data

Search results: 85,083 items (processing time: 0.082 s)

Text mining-based Data Preprocessing and Accident Type Analysis for Construction Accident Analysis

  • 윤영근;이재윤;오태근
    • 한국안전학회지 / Vol. 37, No. 2 / pp. 18-27 / 2022
  • Construction accidents are difficult to prevent because several different types of activities occur simultaneously. The current method of accident analysis only reports the number of occurrences for one or two variables, and accidents have not decreased under safety measures that focus solely on individual variables. Even when accident data are analyzed to establish appropriate safety measures, it is difficult to derive significant results because of the large number of data variables, elements, and qualitative records. In this study, to simplify the analysis and approach this complex problem logically, data preprocessing techniques such as latent class cluster analysis (LCCA) and predictor importance were used to identify the most influential variables. Finally, the correlation was analyzed using an alluvial flow diagram consisting of seven variables and fourteen elements based on accident data. The alluvial diagram analysis using reduced variables and elements enabled accident trends to be grouped into four categories. The findings of this study demonstrate that complex and diverse construction accident data can yield relevant analysis results, assisting in the prevention of accidents.

Exploring COVID-19 in mainland China during the lockdown of Wuhan via functional data analysis

  • Li, Xing;Zhang, Panpan;Feng, Qunqiang
    • Communications for Statistical Applications and Methods / Vol. 29, No. 1 / pp. 103-125 / 2022
  • In this paper, we analyze time series data of the case and death counts of COVID-19, which broke out in China in December 2019. The study period is the lockdown of Wuhan. We use functional data analysis methods to analyze the collected time series data. The analysis is divided into three parts. First, functional principal component analysis is conducted to investigate the modes of variation. Second, we carry out functional canonical correlation analysis to explore the relationship between confirmed and death cases. Finally, we use a clustering method based on the Expectation-Maximization (EM) algorithm to cluster the counts of confirmed cases, where the number of clusters is determined via cross-validation. In addition, we compare the clustering results with publicly available migration data.

Review of Data-Driven Multivariate and Multiscale Methods

  • Park, Cheolsoo
    • IEIE Transactions on Smart Processing and Computing / Vol. 4, No. 2 / pp. 89-96 / 2015
  • In this paper, two time-frequency analysis algorithms, empirical mode decomposition and local mean decomposition, are reviewed, and their applications to nonlinear and nonstationary real-world data are discussed. In addition, their generic extensions to the complex domain are addressed for the analysis of multichannel data. Simulations of these algorithms on synthetic data illustrate the fundamental structure of the algorithms and how they are designed for the analysis of nonlinear and nonstationary data. Applications of the complex versions of the algorithms to the synthetic data also demonstrate their benefit for the accurate frequency decomposition of multichannel data.

Web-Based Data Processing and Model Linkage Techniques for Agricultural Water-Resource Analysis

  • 박지훈;강문성;송정헌;전상민;김계웅;류정훈
    • 한국농공학회논문집 / Vol. 57, No. 5 / pp. 101-111 / 2015
  • Establishing appropriate data in specific formats is essential for agricultural water cycle analysis, which involves complex interactions and uncertainties such as climate change, social and economic change, and watershed environmental change. The main objective of this study was to develop web-based Data processing and Model linkage Techniques for Agricultural Water-Resource analysis (AWR-DMT). The developed techniques consisted of database development, a data processing technique, and a model linkage technique. The study watershed was the upper Cheongmi stream and Geunsam-Ri. The database was constructed using MS SQL and contained data codes, watershed characteristics, reservoir information, weather station information, meteorological data, processed data, hydrological data, and paddy field information. The AWR-DMT was developed using Python. The processing technique generated probable rainfall data using non-stationary frequency analysis, as well as evapotranspiration data. The model linkage technique built input data for agricultural watershed models such as TANK and Agricultural Watershed Supply (AWS). This study may contribute to the development of intelligent water cycle analysis by providing data processing and model linkage techniques for agricultural water-resource analysis.

Reliability Evaluation of EDR Data Using PC-Crash & Vbox

  • 박종찬;김종혁;오원택;최지훈;박종진
    • 한국자동차공학회논문집 / Vol. 25, No. 3 / pp. 317-325 / 2017
  • The EDR (Event Data Recorder) is one of the functions of the ACU (Airbag Control Unit) mounted on a vehicle. EDR data comprise pre-crash data and post-crash data. Pre-crash data are recorded within 5 sec of time zero (AE) at 0.5 sec resolution and include vehicle speed, engine rotation speed, throttle opening, brake pedal operation, accelerator pedal position, and steering angle, among others. Using EDR data, the investigation of a traffic accident can become more objective and scientific. Crash tests of three vehicles equipped with the EDR function were performed successfully. The reliability of the EDR data was evaluated using Vbox and the sequence table function of PC-Crash. Based on the results, we confirmed the reliability and availability of EDR data for traffic accident analysis.

A Comparison of Starbucks between South Korea and U.S.A. through Big Data Analysis

  • 조아라;김학선
    • 한국조리학회지 / Vol. 23, No. 8 / pp. 195-205 / 2017
  • The purpose of this study was to compare Starbucks in South Korea with Starbucks in the U.S.A. through semantic network analysis of big data. Online data were collected with the SCTM (Smart Crawling & Text Mining) program, a data collecting and processing program developed by the big data research institute at Kyungsung University. The data collection period was from January 1st 2014 to December 7th 2017, and Netdraw packaged with UCINET 6.0 was used for data analysis and visualization. After performing CONCOR (convergence of iterated correlation) analysis and centrality analysis, this study illustrated the current characteristics of Starbucks in Korea and the U.S.A. as reflected in the social network, and the differences between the two countries. Since Starbucks has developed greatly, especially in Korea, this study also aimed to provide significant, social-network-oriented suggestions for Starbucks USA, Starbucks Korea, and the coffee industry as a whole. This study also revealed that big data analytics can generate new insights into variables that have been extensively studied in the existing hospitality literature. In addition, implications for theory and practice as well as directions for future research are discussed.

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
    • Journal of Advanced Marine Engineering and Technology / Vol. 40, No. 8 / pp. 726-732 / 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA only for full data sets. In this paper, to specifically and robustly evaluate the performance of PCA for clustering analysis, we exploit an improved FCBF (fast correlation-based filter) feature selection method for supervised clustering data sets and employ two well-known clustering algorithms: k-means and k-medoids. Computational results from supervised data sets show that the performance of PCA is very poor for large-scale features.
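
The pipeline this abstract describes (reduce dimensionality with PCA, then cluster the reduced data with k-means) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: it keeps a single principal component found by power iteration, omits the FCBF feature selection and k-medoids steps, and runs on hypothetical toy data. The names `center`, `first_component`, and `kmeans_1d` are assumptions made for this example.

```python
def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def first_component(centered, iters=100):
    """Leading eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed start vector; must not be orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def kmeans_1d(values, k=2, iters=20):
    """Plain k-means (k >= 2) on scalars, with centers seeded from the sorted values."""
    ordered = sorted(values)
    centers = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


# Hypothetical toy data: two well-separated groups in three dimensions.
data = [[1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.9, 1.0, 1.1],
        [5.0, 5.1, 4.9], [5.2, 4.9, 5.0], [4.9, 5.0, 5.1]]
centered = center(data)
pc1 = first_component(centered)
# Project each centered row onto the first principal component.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in centered]
labels = kmeans_1d(scores, k=2)
print(labels)
```

On this toy input the first three rows should fall into one cluster and the last three into the other. With many features and weaker separation, a single component may discard the structure that distinguishes clusters, which is the kind of degradation the abstract reports for large-scale features.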

A review of big data analytics and healthcare

  • 문석재;이남주
    • 한국응용과학기술학회지 / Vol. 37, No. 1 / pp. 76-82 / 2020
  • Big data analysis in healthcare research seems to be a necessary strategy for the convergence of sports science and technology in the era of the Fourth Industrial Revolution. The purpose of this study is to provide a basic review to secure the diversity of big data and healthcare convergence by discussing the concept, analysis methods, and application examples of big data and by exploring its applications. Text mining, data mining, opinion mining, process mining, cluster analysis, and social network analysis are currently used. Identifying high-risk factors for a certain condition, determining specific health determinants for diseases, monitoring bio signals, predicting diseases, providing training and treatments, and analyzing healthcare measurements would be possible via big data analysis. As further work, the characteristics of big data provide a very appropriate basis for using promising software platforms to develop applications that can handle big data in healthcare and, even more, in sports science.

Data Scholarship: Data Journals and Data Repositories

  • 박형주
    • 문화기술의 융합 / Vol. 10, No. 1 / pp. 443-451 / 2024
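  • To understand data scholarship, this study analyzed and visualized the intellectual structure of journals indexed as publishing data papers and compared how data repositories are operated. We examined the types of peer review and conducted co-occurrence analysis and network analysis. The top 10 journals indexed in WoS as publishing data papers published a mix of traditional article types and data paper types. Most data repositories indexed in DCI are operated in North American and European countries. Most data repositories in Korea are operated by research institutes. We hope the results of this study help in understanding the practices of data scholarship, including data journals and data repositories.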
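
As a rough illustration of the co-occurrence analysis mentioned in this abstract, the sketch below counts how often two keywords appear together in the same record; the resulting pair counts can serve as edge weights for a network analysis. The keyword lists are hypothetical, and the function name `cooccurrence_counts` is an assumption for this example rather than anything taken from the study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count unordered keyword pairs that appear together in one record."""
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(keywords))
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical records: each inner list holds the keywords of one paper.
papers = [
    ["data paper", "peer review", "repository"],
    ["data paper", "repository"],
    ["peer review", "data journal"],
]
edges = cooccurrence_counts(papers)
for pair, weight in edges.most_common():
    print(pair, weight)
```

Each key in `edges` is an alphabetically ordered keyword pair, so the same pair is never counted under two different orderings.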

The Structural Relationship of Customer Data Integration and CRM Performances

  • 강재정;문태수
    • 한국정보시스템학회지:정보시스템연구 / Vol. 15, No. 3 / pp. 87-106 / 2006
  • The customer-focused enterprise is interested in integrating every record of an interaction with a customer. This study investigates the structural relationship among data integration, customer analysis capability, marketing & sales capability, customer service capability, and CRM performance. A total of 205 survey responses were collected from companies that had implemented a CRM package. SEM analysis shows that data integration influences CRM performance through the improvement of customer analysis capability, marketing & sales capability, and customer service capability. The revised model, fitted for better goodness of fit, shows that data integration influences the improvement of customer analysis capability, marketing & sales capability, and customer service capability, but that customer analysis capability has only an indirect influence on CRM performance, through the improvement of marketing & sales capability and customer service capability.
