• Title/Summary/Keyword: Data analysis study

Comparative Study of Prediction Performance and Variable Importance in SEM-ANN Two-stage Analysis (SEM-ANN 2단계 분석에서 예측성능과 변수중요도의 비교연구)

  • Sun-Dong Kwon;Yi Zhao;Hua-Long Fang
    • Journal of Information Technology Applications and Management / v.31 no.1 / pp.11-25 / 2024
  • This study investigates how much prediction performance improves and how variable importance changes in SEM-ANN two-stage analysis. A total of 366 survey responses related to cosmetics repurchase were analyzed. The results are summarized as follows. First, the SEM and ANN models were each trained on the training data and evaluated on the test data, and R² was reported for both. Prediction performance roughly doubled, from R² = 0.3364 for SEM to R² = 0.6836 for ANN. Expressed as Cohen's (1988) effect size f², this improvement is about 1.10 (110%), a very large effect. Second, a comparison of normalized variable importance across the two stages showed that variables with high importance in SEM also had high importance in ANN, while some variables with little or no importance in SEM became important in ANN. The study is meaningful in that it increased the validity of the comparison by applying the same training and evaluation procedure to both stages, and in that it compared both the degree of improvement in prediction performance and the change in variable importance.
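
The effect-size figure in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the standard form of Cohen's f² for an improvement in R², that is, (R² of the better model minus R² of the baseline) divided by (1 minus R² of the better model). The abstract does not state the exact formula used, and the function name `cohens_f2` is only illustrative.

```python
def cohens_f2(r2_base: float, r2_full: float) -> float:
    """Cohen's f^2 for the gain in explained variance from r2_base to r2_full.

    Assumed formula: (r2_full - r2_base) / (1 - r2_full).
    """
    if not 0.0 <= r2_base <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_base <= r2_full < 1")
    return (r2_full - r2_base) / (1.0 - r2_full)


# R^2 values reported in the abstract: SEM 0.3364, ANN 0.6836.
f2 = cohens_f2(0.3364, 0.6836)
print(f"f2 = {f2:.2f}")  # about 1.10, i.e. roughly 110%
```

Under this assumption the result agrees with the reported 110%, which is well above the 0.35 level Cohen conventionally labels a large effect.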

A Public Open Civil Complaint Data Analysis Model to Improve Spatial Welfare for Residents - A Case Study of Community Welfare Analysis in Gangdong District - (거주민 공간복지 향상을 위한 공공 개방 민원 데이터 분석 모델 - 강동구 공간복지 분석 사례를 중심으로 -)

  • Shin, Dongyoun
    • Journal of KIBIM / v.13 no.3 / pp.39-47 / 2023
  • This study aims to introduce a model for enhancing community well-being through the utilization of public open data. To objectively assess abstract notions of residential satisfaction, text data from complaints is analyzed. By leveraging accessible public data, costs related to data collection are minimized. Initially, relevant text data containing civic complaints is collected and refined by removing extraneous information. This processed data is then combined with meaningful datasets and subjected to topic modeling, a text mining technique. The insights derived are visualized using Geographic Information System (GIS) and Application Programming Interface (API) data. The efficacy of this analytical model was demonstrated in the Godeok/Gangil area. The proposed methodology allows for comprehensive analysis across time, space, and categories. This flexible approach involves incorporating specific public open data as needed, all within the overarching framework.

A Big Data-Driven Business Data Analysis System: Applications of Artificial Intelligence Techniques in Problem Solving

  • Donggeun Kim;Sangjin Kim;Juyong Ko;Jai Woo Lee
    • The Journal of Bigdata / v.8 no.1 / pp.35-47 / 2023
  • It is crucial to develop effective and efficient big data analytics methods for problem-solving in the field of business in order to improve the performance of data analytics and reduce costs and risks in the analysis of customer data. In this study, a big data-driven data analysis system using artificial intelligence techniques is designed to increase the accuracy of big data analytics along with the rapid growth of the field of data science. We present a key direction for big data analysis systems through missing value imputation, outlier detection, feature extraction, utilization of explainable artificial intelligence techniques, and exploratory data analysis. Our objective is not only to develop big data analysis techniques with complex structures of business data but also to bridge the gap between the theoretical ideas in artificial intelligence methods and the analysis of real-world data in the field of business.
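
Two of the preprocessing steps named in this abstract, missing value imputation and outlier detection, can be illustrated with the standard library alone. The abstract does not say which methods the authors use, so the sketch below substitutes two common stand-ins: median imputation and Tukey's interquartile-range rule. The sample values and function names are hypothetical.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical customer spending figures with two missing entries.
spend = [120.0, 135.0, None, 128.0, 131.0, 2500.0, 126.0, None, 133.0]
clean = impute_median(spend)
print(clean)
print(iqr_outliers(clean))
```

Imputing before screening for outliers keeps the sample size intact. The median is used here because, unlike the mean, it is not pulled toward the extreme value.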

A Study on Curriculum Development for Big Data Driven Digital Marketer (빅데이터 기반 디지털 마케터 전문가 양성을 위한 교육과정 개발 관련 연구)

  • Yi, Myongho
    • Journal of Digital Convergence / v.19 no.5 / pp.105-115 / 2021
  • Many services are provided through big data analysis to individuals, the private sector, and governments, and interest in training data scientists to provide these services is growing. Interest in big data-based marketing curricula is particularly high. This study analyzed big data-based marketing curricula at domestic and foreign universities in order to make use of vast and diverse types of information from a marketing perspective in the era of big data. An analysis of 3,523 course subjects related to digital marketing, big data marketing, data analysis, and developers, collected according to the analysis criteria, found that the specialized curricula for training the data scientists required in the era of the fourth industrial revolution were inadequate. The curriculum proposed in this study is expected to be useful for developing digital marketing and big data-based marketing curricula.

Uncertainty Analysis of Hyung San River Discharge due to the methods of Discharge Measurement (유량측정방법에 따른 형산강유량의 불확실도 분석)

  • Seo, Kyu-Woo;Kim, Su-Hyun;Kim, Dai-Gon
    • Proceedings of the Korea Water Resources Association Conference / 2005.05b / pp.1538-1542 / 2005
  • This study aims to obtain more accurate and reliable hydrological discharge data by comparing the current discharge measurement method with an ISO-based method. Using Hyung San River data measured under the current standard, it compares and analyzes the measurements, examines their general elements and uncertainty, and suggests an applicable and reliable discharge measurement method. The results show that the discharge measurement method needs to be improved in order to secure reliable and accurate hydrological data. Hydrological data are essential for carrying out domestic river works and for installing structures in rivers or on the coast. Securing reliable and accurate hydrological data, and continuing research on it, should therefore precede any construction in rivers or on the coast.

Conversations about Open Data on Twitter

  • Jalali, Seyed Mohammad Jafar;Park, Han Woo
    • International Journal of Contents / v.13 no.1 / pp.31-37 / 2017
  • Using network analysis, this study investigates the communication structure of Open Data in the Twitter sphere. It traces communication paths by mapping influential activities and comparing the contents of tweets about Open Data. In 2015 and 2016, the NodeXL software was used to collect tweets containing the term "opendata" from the Twitter network. The structural patterns of social media communication were analyzed through several network characteristics. The results indicate that the most common activities on the Twitter network relate to subjects such as new applications and new technologies in Open Data. The study is the first to focus on the structural and informational patterns of Open Data based on social network analysis and content analysis. It will help researchers, activists, and policy-makers gain a clearer understanding of how Open Data is discussed on Twitter.

A Study on Satisfaction Survey Based on Regression Analysis to Improve Curriculum for Big Data Education (빅데이터 양성 교육 교과과정 개선을 위한 회귀분석 기반의 만족도 조사에 관한 연구)

  • Choi, Hyun
    • Journal of the Korean Society of Industry Convergence / v.22 no.6 / pp.749-756 / 2019
  • Big data refers to structured and unstructured data that is difficult to collect, store, and process because of its sheer volume. Many institutions, including universities, are building student convergence systems to foster talent for data science and AI convergence, but there is a severe lack of research on what kind of education students need and demand. Therefore, to improve the curriculum by identifying the satisfaction and demands of participants in the "2019 Big Data Youth Talent Training Course" held at K University, this paper conducted a correlation analysis of questionnaire responses on basic information and courses, followed by a regression analysis. The results show that the higher the overall satisfaction, the satisfaction with class or job connection, and the self-development, the more positive the evaluation of program efficiency.

Understanding the Food Hygiene of Cruise through the Big Data Analytics using the Web Crawling and Text Mining

  • Shuting, Tao;Kang, Byongnam;Kim, Hak-Seon
    • Culinary science and hospitality research / v.24 no.2 / pp.34-43 / 2018
  • The objective of this study was to gain a general, text-based understanding of how cruise food hygiene is perceived, using big data analytics. To this end, data were collected by searching the keywords "food hygiene, cruise" in web pages and news on Google from October 1, 2015 to October 1, 2017 (two years). The data were collected and processed with SCTM, a data collection and processing program, yielding 899 KB of text (approximately 20,000 words). For the data analysis, UCINET 6.0, packaged with the visualization tool NetDraw, was used. The analysis showed that words such as "jobs" and "news" had high frequency, while the centrality results (Freeman's degree centrality and eigenvector centrality) and proximity produced rankings distinct from the frequency ranking. The CONCOR analysis produced four segments: "food hygiene group", "person group", "location related group", and "brand group". This big data-based diagnosis of food hygiene in the cruise industry is expected to provide useful implications for both academic research and practical application.
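
Freeman's degree centrality, mentioned in this abstract, is simple enough to sketch by hand: a word's centrality is the number of distinct words it co-occurs with, divided by the largest possible number of neighbours (n - 1). The code below is an illustrative reconstruction in plain Python, not UCINET's interface. The three sample documents and both function names are hypothetical, and treating "appears in the same document" as a co-occurrence link is an assumption.

```python
from itertools import combinations


def cooccurrence_edges(documents):
    """Build undirected word pairs that appear together in at least one document."""
    edges = set()
    for words in documents:
        for a, b in combinations(sorted(set(words)), 2):
            edges.add((a, b))
    return edges


def degree_centrality(edges):
    """Normalized degree centrality: neighbours / (n - 1) for each node."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical tokenized snippets standing in for collected web text.
docs = [
    ["cruise", "food", "hygiene"],
    ["cruise", "news"],
    ["food", "hygiene", "inspection"],
]
centrality = degree_centrality(cooccurrence_edges(docs))
for word, score in sorted(centrality.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{word}: {score:.2f}")
```

A word can be frequent yet weakly connected, or rare yet well connected, which is one reason a centrality ranking can differ from a frequency ranking.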

Estimation of Extreme Wind Speeds in the Western North Pacific Using Reanalysis Data Synthesized with Empirical Typhoon Vortex Model (모조 태풍 합성 재분석 바람장을 이용한 북서태평양 극치 해상풍 추정)

  • Kim, Hye-In;Moon, Il-Ju
    • Ocean and Polar Research / v.43 no.1 / pp.1-14 / 2021
  • In this study, extreme wind speeds in the Western North Pacific (WNP) were estimated using reanalysis wind fields synthesized with an empirical typhoon vortex model. The reanalysis wind data are the fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5), which was deemed the most suitable for extreme value analysis in this study. The empirical typhoon vortex model has the advantage of realistically reproducing the asymmetric winds of a typhoon by using gale- and storm-force wind radii in the four quadrants of a typhoon. Using a total of 39 years of synthesized reanalysis wind fields in the WNP, extreme value analysis was performed with the Generalized Pareto Distribution (GPD) model based on the Peak-Over-Threshold (POT) method, which can be used effectively when data are insufficient. The results showed that extreme value analysis using the synthesized wind data significantly reduced the tendency to underestimate extreme wind speeds compared with using reanalysis wind data alone. Considering the difficulty of obtaining long-term observational wind data at sea, the synthesized wind fields and extreme value analysis results developed in this study can be used as basic data for the design of offshore structures.
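
The POT approach named in this abstract models the excesses over a high threshold with a GPD and then converts the fitted parameters into an N-year return level. A minimal sketch of that last step follows, using the textbook return-level expression u + (σ/ξ)((λN)^ξ - 1), where λ is the mean number of threshold exceedances per year. All numeric values are hypothetical placeholders, not results from the paper, and the parameters are assumed to have been estimated elsewhere (for example by maximum likelihood).

```python
import math


def pot_excesses(series, threshold):
    """Peak-over-threshold step: keep the amounts by which values exceed the threshold."""
    return [x - threshold for x in series if x > threshold]


def gpd_return_level(threshold, scale, shape, rate_per_year, return_period_years):
    """Wind speed exceeded on average once every `return_period_years` years.

    threshold: POT threshold u
    scale, shape: GPD parameters sigma and xi fitted to the excesses
    rate_per_year: mean number of threshold exceedances per year (lambda)
    """
    m = rate_per_year * return_period_years
    if m <= 0 or scale <= 0:
        raise ValueError("rate, return period, and scale must be positive")
    if abs(shape) < 1e-9:
        # Limit of the formula as the shape parameter tends to zero.
        return threshold + scale * math.log(m)
    return threshold + (scale / shape) * (m ** shape - 1.0)


# Hypothetical inputs: 25 m/s threshold, sigma = 4.0, xi = -0.1, 3 exceedances per year.
for years in (10, 50, 100):
    level = gpd_return_level(25.0, 4.0, -0.1, 3.0, years)
    print(f"{years}-year return level: {level:.1f} m/s")
```

A negative shape parameter, as in this example, implies a bounded upper tail, so the return level grows ever more slowly as the return period lengthens.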

A Study on How to Nurture New Players using Data Analysis (데이터 분석을 활용한 신인급 선수 육성 방안 연구)

  • You, Kangsoo
    • Journal of Industrial Convergence / v.19 no.4 / pp.17-21 / 2021
  • Recently, in the field of sports, the use of data in conducting games, planning seasons, and operating teams has increased significantly. To develop better players, it has also become necessary to use data to analyze their performance accurately. In this study, various data on rookie players were therefore collected and pre-processed in order to analyze and visualize their performance. In addition, an analysis was conducted to determine the minimum number of opportunities that rookie players should be given in order to develop. A data analysis method for nurturing athletes with sports data was then presented. This study is expected to contribute to fostering rookie players through the use of data.