• Title/Summary/Keyword: data model

Deep Learning-based Evolutionary Recommendation Model for Heterogeneous Big Data Integration

  • Yoo, Hyun;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.9 / pp.3730-3744 / 2020
  • This study proposes a deep learning-based evolutionary recommendation model for heterogeneous big data integration, for which collaborative filtering and a neural-network algorithm are employed. The proposed model is used to apply an individual's importance or sensory level to formulate a recommendation using the decision-making feedback. The evolutionary recommendation model is based on the Deep Neural Network (DNN), which is useful for analyzing and evaluating the feedback data among various neural-network algorithms, and the DNN is combined with collaborative filtering. The designed model is used to extract health information from data collected by the Korea National Health and Nutrition Examination Survey, and the collaborative filtering-based recommendation model was compared with the deep learning-based evolutionary recommendation model to evaluate its performance. The RMSE is used to evaluate the performance of the proposed model. According to the comparative analysis, the accuracy of the deep learning-based evolutionary recommendation model is superior to that of the collaborative filtering-based recommendation model.
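The abstract above names RMSE as its evaluation metric. As a minimal sketch of how two sets of predictions could be compared on that metric, assuming plain Python lists of ratings: the function name `rmse` and all numbers below are made up for illustration and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length numeric sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings and two hypothetical sets of predictions.
actual = [4.0, 3.0, 5.0, 2.0]
pred_a = [3.0, 3.0, 4.0, 4.0]
pred_b = [4.0, 3.0, 4.0, 2.0]

print(rmse(actual, pred_a))
print(rmse(actual, pred_b))
```

A lower RMSE means the predictions sit closer to the observed values, which is the sense in which the abstract calls one model more accurate than the other.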

Re-Considering Aggregated Data Bias by Extending "Koyck Model" of Advertising Effect (광고 효과 확장 코익 모델을 이용한 Aggregated data bias의 재조명)

  • Song, Tea-Ho;Yuan, Xina;Kim, Ji-Yoon
    • Journal of the Korean Operations Research and Management Science Society / v.34 no.2 / pp.91-100 / 2009
  • "How does advertising affect sales?" is the fundamental question of modern advertising research. Estimating the carryover effects of advertising on sales raises an interesting issue: aggregated data biases arise in the estimated duration of the advertising effect. This research proposes an extension of the Koyck model (Koyck 1954), which was designed for micro-data, to handle aggregated advertising data, and shows the aggregated data bias empirically. With actual advertising data at the aggregated level, the extended model is more appropriate than the basic Koyck model for micro-data. The results show that it is important to consider the level of data disaggregation when analyzing dynamic effects of advertising such as carryover effects.
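For orientation, the textbook form of the basic Koyck model can be written as below. This is the standard geometric-lag formulation, not the extended specification developed in the paper; the symbols $S_t$ (sales), $A_t$ (advertising), $\alpha$, $\beta$, $\lambda$, and $u_t$ are notation chosen here.

```latex
\begin{align}
S_t &= \alpha + \beta \sum_{k=0}^{\infty} \lambda^{k} A_{t-k} + u_t,
      \qquad 0 \le \lambda < 1 \\
S_t &= \alpha(1-\lambda) + \beta A_t + \lambda S_{t-1} + (u_t - \lambda u_{t-1})
\end{align}
```

The second line follows from subtracting $\lambda S_{t-1}$ from the first. The carryover parameter $\lambda$ is defined per observation interval, and the implied long-run effect is $\beta/(1-\lambda)$, so estimates of how long advertising lasts depend on the interval at which the data are aggregated.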

THE DEVELOPMENT OF A ZERO-INFLATED RASCH MODEL

  • Kim, Sungyeun;Lee, Guemin
    • The Pure and Applied Mathematics / v.20 no.1 / pp.59-70 / 2013
  • The purpose of this study was to develop a zero-inflated Rasch (ZI-Rasch) model, a combination of the Rasch model and the ZIP model. The ZI-Rasch model was considered in this study as an appropriate alternative to the Rasch model for zero-inflated data. To investigate the relative appropriateness of the ZI-Rasch model, several analyses were conducted using PROC NLMIXED procedures in SAS under various simulation conditions. Sets of criteria for model evaluations (-2LL, AIC, AICC, and BIC) and parameter estimations (RMSE, and $r$) from the ZI-Rasch model were compared with those from the Rasch model. In the data-model fit indices, regardless of the simulation conditions, the ZI-Rasch model produced better fit statistics than did the Rasch model, even when the response data were generated from the Rasch model. In terms of item parameter ${\lambda}$ estimations, the ZI-Rasch model produced estimates similar to those of the Rasch model.

A Mixed Model for Ordered Response Categories

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.339-345 / 2004
  • This paper deals with a mixed logit model for ordered polytomous data. Two types of factors affect the response variable. One is a fixed factor with finite quantitative levels, and the other is a random factor arising from an experimental structure such as a randomized complete block design. The paper discusses how to set up the model for analyzing ordered polytomous data and illustrates how to estimate the parameters of the given model.

Analysis of latent growth model using repeated measures ANOVA in the data from KYPS (청소년패널자료 분석에서의 반복측정분산분석을 활용한 잠재성장모형)

  • Lee, Hwa-Jung;Kang, Suk-Bok
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1409-1419 / 2013
  • We analyzed the data from KYPS using the latent growth model, which has been widely studied as a method for analyzing longitudinal data. In this study, we applied repeated measures ANOVA to the unconditional model in order to decide more quickly on the form of the unconditional latent growth model. We also compared the six-type models, the quadratic model, and the model to which repeated measures ANOVA is applied.

GIS Application Model for Spatial Simulation of Surface Runoff from a Small Watershed (II) (소유역 지표유출의 공간적 해석을 위한 지리정보시스템의 응용모형(II) - 격자 물수지 모형을 위한 GIS응용 모형 개발 -)

  • 김대식;정하우;김성준;최진용
    • Magazine of the Korean Society of Agricultural Engineers / v.37 no.5 / pp.35-42 / 1995
  • This paper develops a GIS application model (GISCELWAB) for the spatial simulation of surface runoff from a small watershed. The model consists of three submodels: the input data extraction model (GISINDATA), which automatically prepares cell-based input data for a given watershed; the cell water balance model (CELWAB), which calculates the water balance for each cell and simultaneously simulates the surface runoff of the watershed through the interaction of cells; and the output data management model (GISOUTDISP), which visualizes the temporal and spatial variation of surface runoff. The input data extraction model was developed to remove the time-consuming preparation of input data for a distributed hydrologic model. The input data for CELWAB can be obtained by extracting ASCII data from a vector map. The output data management model was developed to convert the storage depth and discharge of cells into a grid map. This model makes it possible to visualize, across the whole watershed, how storage depth and surface runoff form spatially at each time increment.

A maximum likelihood approach to infer demographic models

  • Chung, Yujin
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.385-395 / 2020
  • We present a new maximum likelihood approach to estimate demographic history using genomic data sampled from two populations. A demographic model such as an isolation-with-migration (IM) model explains the genetic divergence of two populations that split from their common ancestral population. The standard probability model for an IM model contains a latent variable called genealogy that represents gene-specific evolutionary paths and links the genetic data to the IM model. Under an IM model, a genealogy consists of two kinds of evolutionary paths of genetic data: vertical inheritance paths (coalescent events) through generations and horizontal paths (migration events) between populations. The computational complexity of IM model inference is one of the major limitations in analyzing genomic data. We propose a fast maximum likelihood approach to estimate IM models from genomic data. The first step analyzes genomic data and maximizes the likelihood of a coalescent tree that contains the vertical paths of a genealogy. The second step analyzes the estimated coalescent trees and finds the parameter values of an IM model that maximize the distribution of the coalescent trees after taking account of possible migration events. We evaluate the performance of the new method by analyzing simulated data and genomic data from two subspecies of common chimpanzees in Africa.

Evaluation of Organization and Use of Data Model for Structural Experiment Information (구조실험정보를 위한 데이터 모델의 구성 및 사용성 평가)

  • Lee, Chang-Ho
    • Journal of the Computational Structural Engineering Institute of Korea / v.28 no.6 / pp.579-588 / 2015
  • The data model for structural experiment information formally organizes the information involved in structural experiments before the data repository that uses the data model is implemented. The data model is particularly required for data repositories that hold large-scale structural experiment information and general information for various types of experiments, such as the NEEShub Project Warehouse developed by NEES. This paper proposes criteria for evaluating the organization and the use of a data model for structural experiment information. The term AVE (attribute value existence) denotes the ratio of attributes whose values exist in objects, and is then used to define the Attribute AVE for the use of an attribute, the Class AVE for a class, the Class Level AVE for a class including its lower-level classes, the Project AVE for a project including all classes at all class levels, and the Data Model AVE for a data model including its projects. These criteria are applied to the projects in the NEES data model, making it possible to describe numerically how the classes and attributes in the data model are used.
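The AVE ratios described above can be sketched as follows. This is an illustrative reading of the abstract, not the paper's exact definitions: objects are assumed to be plain dictionaries, a value counts as missing when it is `None`, an empty string, or absent, and the names `has_value`, `attribute_ave`, `class_ave`, and the sample `specimens` are hypothetical.

```python
def has_value(value):
    """Treat None and empty strings as missing (an assumption of this sketch)."""
    return value is not None and value != ""


def attribute_ave(objects, attribute):
    """Share of objects in which `attribute` has a value."""
    if not objects:
        return 0.0
    filled = sum(1 for obj in objects if has_value(obj.get(attribute)))
    return filled / len(objects)


def class_ave(objects, attributes):
    """Share of filled (object, attribute) slots across a whole class."""
    slots = len(objects) * len(attributes)
    if slots == 0:
        return 0.0
    filled = sum(
        1 for obj in objects for attr in attributes if has_value(obj.get(attr))
    )
    return filled / slots


# Hypothetical objects of one class; values are invented for illustration.
specimens = [
    {"name": "S1", "material": "steel", "length_mm": 1200},
    {"name": "S2", "material": "", "length_mm": None},
    {"name": "S3", "material": "concrete"},
    {"name": "S4", "material": None, "length_mm": 900},
]

print(attribute_ave(specimens, "material"))
print(class_ave(specimens, ["name", "material", "length_mm"]))
```

The higher-level ratios named in the abstract (Class Level, Project, Data Model) could be built the same way by pooling filled and total slots over the classes or projects they include.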

Improvement of PM Forecasting Performance by Outlier Data Removing (Outlier 데이터 제거를 통한 미세먼지 예보성능의 향상)

  • Jeon, Young Tae;Yu, Suk Hyun;Kwon, Hee Yong
    • Journal of Korea Multimedia Society / v.23 no.6 / pp.747-755 / 2020
  • In this paper, we deal with the outlier data problems that occur when constructing a PM2.5 fine dust forecasting system using a neural network. In general, when a neural network is trained, some of the data do not help learning but rather disturb it. These are called outlier data. When they are included in the training data, various problems such as overfitting occur. In building a PM2.5 fine dust concentration forecasting system using a neural network, we found several outliers in the training data. We therefore remove them and train the network in three ways. The Over_outlier model removes outlier data for which the target concentration is low but the model forecast is high. The Under_outlier model removes outlier data for which the target concentration is high but the model forecast is low. The All_outlier model removes both Over_outlier and Under_outlier data. We compare the three models with a conventional outlier removal model and a non-removal model. Our outlier removal model shows better performance than the others.
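The three removal rules can be sketched as a simple filter over (target, forecast) pairs. The abstract does not say how "low" and "high" are defined, so the thresholds `LOW` and `HIGH`, the function names, and the sample values below are all assumptions made for illustration.

```python
LOW = 15.0   # hypothetical cut-off for a "low" concentration
HIGH = 35.0  # hypothetical cut-off for a "high" concentration


def is_over_outlier(target, forecast):
    """Target concentration is low but the model forecast is high."""
    return target <= LOW and forecast >= HIGH


def is_under_outlier(target, forecast):
    """Target concentration is high but the model forecast is low."""
    return target >= HIGH and forecast <= LOW


def remove_outliers(samples, mode):
    """Drop flagged (target, forecast) pairs; mode is 'over', 'under', or 'all'."""
    checks = {
        "over": is_over_outlier,
        "under": is_under_outlier,
        "all": lambda t, f: is_over_outlier(t, f) or is_under_outlier(t, f),
    }
    if mode not in checks:
        raise ValueError(f"unknown mode: {mode!r}")
    flagged = checks[mode]
    return [(t, f) for t, f in samples if not flagged(t, f)]


# Invented (target, forecast) pairs, not data from the paper.
samples = [(10.0, 40.0), (50.0, 12.0), (20.0, 22.0), (45.0, 43.0)]

print(remove_outliers(samples, "over"))
print(remove_outliers(samples, "under"))
print(remove_outliers(samples, "all"))
```

In a training pipeline, the filtered list would replace the original training set before the network is retrained, giving one retrained model per mode.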

Extending the Multidimensional Data Model to Handle Complex Data

  • Mansmann, Svetlana;Scholl, Marc H.
    • Journal of Computing Science and Engineering / v.1 no.2 / pp.125-160 / 2007
  • Data Warehousing and OLAP (On-Line Analytical Processing) have turned into the key technology for comprehensive data analysis. Originally developed for the needs of decision support in business, data warehouses have proven to be an adequate solution for a variety of non-business applications and domains, such as government, research, and medicine. The analytical power of OLAP technology comes from its underlying multidimensional data model, which allows users to see data from different perspectives. However, this model displays a number of deficiencies when applied to non-conventional scenarios and analysis tasks. This paper presents an attempt to systematically summarize various extensions of the original multidimensional data model that have been proposed by researchers and practitioners in recent years. The presented concepts are arranged into a formal classification consisting of fact types, factual and fact-dimensional relationships, and dimension types, supplied with explanatory examples from real-world usage scenarios. The classification covers both the static elements of the model, such as types of fact and dimension hierarchy schemes, and dynamic features, such as support for advanced operators and derived elements. We also propose a semantically rich graphical notation called X-DFM that extends the popular Dimensional Fact Model by refining and modifying the set of constructs so as to make it coherent with the formal model. An evaluation of our framework against a set of common modeling requirements summarizes the contribution.