• Title/Summary/Keyword: Statistical Decision Making

Doubly-robust Q-estimation in observational studies with high-dimensional covariates (고차원 관측자료에서의 Q-학습 모형에 대한 이중강건성 연구)

  • Lee, Hyobeen;Kim, Yeji;Cho, Hyungjun;Choi, Sangbum
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.309-327 / 2021
  • Dynamic treatment regimes (DTRs) are decision-making rules designed to provide personalized treatment to individuals in multi-stage randomized trials. Unlike classical methods, in which all individuals are prescribed the same type of treatment, DTRs prescribe patient-tailored treatments that take into account individual characteristics that may change over time. The Q-learning method, one of the regression-based algorithms for finding optimal treatment rules, has become popular because it is easy to implement. However, the performance of the Q-learning algorithm relies heavily on correct specification of the Q-function for the response, especially in observational studies. In this article, we examine a number of doubly robust weighted least-squares estimating methods for Q-learning in high-dimensional settings, where treatment models for the propensity score and penalization for sparse estimation are also investigated. We further consider flexible ensemble machine learning methods for the treatment model to achieve double robustness, so that the optimal decision rule can be correctly estimated as long as at least one of the outcome model or the treatment model is correct. Extensive simulation studies show that the proposed methods work well with practical sample sizes. The practical utility of the proposed methods is demonstrated with a real data example.

Vulnerability Assessment for Forest Ecosystem to Climate Change Based on Spatio-temporal Information (시공간 정보기반 산림 생태계의 기후변화 취약성 평가)

  • Byun, Jung-Yeon;Lee, Woo-Kyun;Choi, Sung-Ho;Oh, Su-Hyun;Yoo, Seong-Jin;Kwon, Tae-Sung;Sung, Joo-Han;Woo, Jae-Wook
    • Korean Journal of Remote Sensing / v.28 no.1 / pp.159-169 / 2012
  • The purpose of this study was to assess the vulnerability of forest ecosystems to climate change in South Korea using socio-environmental indicators and the results of two vegetation models, the Hydrological and Thermal Analogy Group (HyTAG) and MAPSS-Century 1 (MC1). The frequency and direction of change in biome types estimated by the HyTAG model were used to quantify the sensitivity and adaptive capacity of forest distribution. Similarly, the variation and trend of net primary production and soil carbon storage estimated by the MC1 model were used to quantify the sensitivity and adaptive capacity of forest function. As socio-environmental indicators, statistical data such as the financial autonomy rate and the number of forestry officers were prepared. All indicators were standardized and then combined using the vulnerability assessment equation. The assessment period was divided into the past (1971-2000) and the future (2021-2050). To identify which policies should take priority in responding to climate change, distribution maps of each indicator were drawn and the vulnerability results were compared among administrative districts. Clear differences were found across the study area. These differences were mostly derived from region-specific adaptive capacity. The results and methodology of this study should be helpful for developing decision-support systems and for policy making in forest management with respect to climate change.

Long-Term Arrival Time Estimation Model Based on Service Time (버스의 정차시간을 고려한 장기 도착시간 예측 모델)

  • Park, Chul Young;Kim, Hong Geun;Shin, Chang Sun;Cho, Yong Yun;Park, Jang Woo
    • KIPS Transactions on Computer and Communication Systems / v.6 no.7 / pp.297-306 / 2017
  • Citizens want more accurate forecast information from the Bus Information System (BIS). However, most bus information systems use an average-based short-term prediction algorithm and produce many errors because they do not consider the effects of traffic flow, signal period, and halting time. In this paper, we try to improve the precision of forecast information by analyzing the factors that influence the error, thereby improving convenience for citizens. We analyzed these factors using BIS data. The analysis shows that the effects of time characteristics and geographical conditions are mixed, and that their effects on halting time and passing speed differ. Therefore, halting time is modeled using a Generalized Additive Model with explanatory variables such as hour, GPS coordinates, and number of routes, and a Hidden Markov Model is used to construct a pattern that captures the influence of traffic flow on each unit section. With these patterns, accurate real-time forecasting and long-term prediction of route travel time were possible. Finally, statistical tests comparing observed and predicted data show that the model is suitable for travel time prediction. As a result, more precise forecast information can be provided to citizens, and we expect that long-term forecasting can play an important role in decision making such as route scheduling.

A Study on Purchasing Tendency and Brand loyalty of Eco-Friendly Cosmetics According to GreenWashing Awareness: Mediating effect of Brand authenticity (그린워싱 인식에 따른 친환경 화장품 구매성향과 브랜드 충성도에 관한 연구: 브랜드 진정성의 매개효과)

  • Hye-In Gwon;Jae-Nam Lee
    • Journal of the Korean Applied Science and Technology / v.41 no.4 / pp.903-919 / 2024
  • This study investigated the mediating effect of brand authenticity on the influence of purchasing tendency on brand loyalty according to greenwashing awareness in eco-friendly cosmetics, with the goal of contributing to the growth and development of the eco-friendly cosmetics industry. A survey was conducted on women in their 20s and 40s who use eco-friendly cosmetics, and the final 836 responses were analyzed with the SPSS WIN 20.0 statistical program. The results were as follows: awareness of greenwashing influenced purchasing tendency, brand authenticity, and brand loyalty. Purchasing tendency influenced brand authenticity and brand loyalty. Brand authenticity influenced brand loyalty and showed a mediating effect in the relationship between purchasing tendency and brand loyalty. We therefore hope that this study will help consumers make rational buying decisions and help develop the eco-friendly cosmetics industry.

A Web Application for Open Data Visualization Using R (R 이용 오픈데이터 시각화 웹 응용)

  • Kim, Kwang-Seob;Lee, Ki-Won
    • Journal of the Korean Association of Geographic Information Studies / v.17 no.2 / pp.72-81 / 2014
  • As big data has become one of the main issues in recent years, interest in its technologies is also increasing. Among several technological bases, this study focuses on data visualization and on R, which is open source. In general, data visualization can be summarized as the web technologies for constructing, manipulating, and displaying various types of graphic objects in an interactive mode. R is an operating environment and language for statistical data analysis, from the basic to the advanced level. In this study, a web application with these technological aspects and components is newly implemented and demonstrated with data visualization for geo-based open data provided by public organizations and government agencies. This application model does not require users to build their own data or install proprietary software. Furthermore, it is designed for users in the geo-spatial application field who have little experience with or knowledge of R. The data visualization results produced by this application can support the decision-making process of web users who can access the service. It is expected that more practical and varied applications with R-based geo-statistical analysis functions and complex operations linked to big data will help expand the scope and range of geo-spatial applications.

Comparison of Deterministic and Probabilistic Approaches through Cases of Exposure Assessment of Child Products (어린이용품 노출평가 연구에서의 결정론적 및 확률론적 방법론 사용실태 분석 및 고찰)

  • Jang, Bo Youn;Jeong, Da-In;Lee, Hunjoo
    • Journal of Environmental Health Sciences / v.43 no.3 / pp.223-232 / 2017
  • Objectives: In response to increased interest in the safety of children's products, a risk management system is being prepared through exposure assessment of hazardous chemicals. To estimate exposure levels, risk assessors use deterministic and probabilistic statistical approaches, along with commercial Monte Carlo simulation-based tools (MCTools) that efficiently support calculation of the probability density functions. This study was conducted to analyze and discuss the usage patterns and problems associated with the results of these two approaches, and with the MCTools used in the probabilistic approach, by reviewing research reports on exposure assessment for children's products. Methods: We collected six research reports on exposure and risk assessment of children's products and summarized the exposure dose and concentration results estimated through the deterministic and probabilistic approaches, together with the underlying distributions used. We examined in detail the mechanisms of, and differences between, the MCTools used for decision making with probabilistic distributions in order to validate the adequacy of the simulations. Results: The exposure dose and concentration estimates from the deterministic approach were 0.19-3.98 times those from the probabilistic approach. For the probabilistic approach, the lognormal, Student's t, and Weibull distributions were the most frequently used underlying distributions for the input parameters. However, we could not examine the reasons for the selection of each distribution because test statistics were not reported. In addition, in some cases a discrete probability distribution was used as the underlying distribution for a continuous variable, such as weight. To find the cause of the abnormal simulations, we ran the two MCTools used across the reports and described the ways in which the MCTools were used improperly. Conclusions: For transparent and realistic exposure assessment, it is necessary to 1) establish standardized guidelines for the proper use of the two statistical approaches, including notes for each MCTool, and 2) consider the development of a new software tool with proper configurations and features specialized for risk assessment. Such guidelines and software will make exposure assessment more user-friendly, consistent, and rapid in the future.

A study on optimal environmental factors of tomato using smart farm data (스마트팜 데이터를 이용한 토마토 최적인자에 관한 연구)

  • Na, Myung Hwan;Park, Yuha;Cho, Wan Hyun
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1427-1435 / 2017
  • The smart farm is a notable system because it applies information and communication technologies to agriculture to achieve high productivity and excellent crop quality. Using multi-variable control of environmental factors, it automatically measures the growth environment of the crops growing in the smart farm and accumulates huge amounts of environmental information in real time. A statistical model built on the collected big data will be helpful for decision making aimed at controlling the optimal growth environment of crops in smart farms. Using data collected from a tomato smart farm, we carried out multiple regression analysis to determine the relationship between yield and environmental factors and to predict tomato yield. In this study, the environmental factors were transformed appropriately to account for tomato growth. Using these new factors, we fit the model and derived the optimal environmental factors that affect tomato yield. Based on this, we could predict tomato yield. It is expected that the growth environment can be controlled to improve tomato productivity by using the statistical model.

The Effect of Price and Brand Names on the Evaluation of Cosmetics (가격 및 인지도가 화장품 평가에 미치는 영향)

  • Lim, Hyo-Jung;Kim, Ju-Duck
    • Journal of the Society of Cosmetic Scientists of Korea / v.33 no.2 / pp.117-126 / 2007
  • This study investigated the effect of price and brand name on consumers' evaluation of cosmetics. A total of 363 women in their 20s to 50s living in Seoul and the metropolitan area were asked to use and describe the given samples of cosmetic products for one week, with different information about price and brand name. The results of this study are as follows. First, the assessment of the facial toner, moisturizer, and cream does not show a statistically significant difference between the 'renowned' and 'renownless' groups. Second, the assessment of the facial toner, moisturizer, and cream shows a statistically significant difference between the user groups that were told in advance whether the cosmetics were 'high price' or 'low price'. Third, users' satisfaction with the three kinds of cosmetic products mentioned above is influenced by 'renown' and 'price'. Finally, the interaction of the factors 'renown' and 'price' significantly influences the perceived effectiveness of the cosmetics. This study found that the evaluation of, and degree of satisfaction with, cosmetics were influenced by price and brand name. This will improve the understanding of consumer behavior and personal decision-making, which can be key to marketing strategy.

Statistical Analysis of Extreme Values of Financial Ratios (재무비율의 극단치에 대한 통계적 분석)

  • Joo, Jihwan
    • Knowledge Management Research / v.22 no.2 / pp.247-268 / 2021
  • Among financial ratios, investors mainly use the price-earnings ratio (PER) and the price-to-book ratio (PBR) for valuation and investment decision-making. I analyze these two basic financial ratios from a statistical perspective. Financial ratios contain key accounting numbers that reflect firm fundamentals and are useful for valuation and for risk analysis such as enterprise credit evaluation and default prediction. The distribution of financial data tends to be extremely heavy-tailed; PER and PBR show exceedingly high levels of kurtosis, and their extreme cases often contain significant information on financial risk. In this respect, Extreme Value Theory is required to fit the right tail more precisely. I introduce not only the generalized Pareto distribution (GPD) but also exGPD. GPD is the conventionally preferred model in Extreme Value Theory, and exGPD is the log-transformed distribution of GPD. exGPD has recently been proposed as an alternative to GPD (Lee and Kim, 2019). First, I conduct a simulation to compare the performance of the two distributions using goodness-of-fit measures and estimation of the 90th-99th percentiles. I also conduct an empirical analysis of Information Technology firms in Korea. The results show that exGPD performs better, especially for PBR, suggesting that exGPD could be an alternative to GPD for the analysis of financial ratios.

Water Quality Assessment and Turbidity Prediction Using Multivariate Statistical Techniques: A Case Study of the Cheurfa Dam in Northwestern Algeria

  • ADDOUCHE, Amina;RIGHI, Ali;HAMRI, Mehdi Mohamed;BENGHAREZ, Zohra;ZIZI, Zahia
    • Applied Chemistry for Engineering / v.33 no.6 / pp.563-573 / 2022
  • This work aimed to develop a new equation for turbidity (Turb) simulation and prediction using statistical methods based on principal component analysis (PCA) and multiple linear regression (MLR). For this purpose, water samples were collected monthly over a five-year period from the Cheurfa dam, an important reservoir in Northwestern Algeria, and analyzed for 12 parameters: temperature (T), pH, electrical conductivity (EC), turbidity (Turb), dissolved oxygen (DO), ammonium (NH₄⁺), nitrate (NO₃⁻), nitrite (NO₂⁻), phosphate (PO₄³⁻), total suspended solids (TSS), biochemical oxygen demand (BOD₅), and chemical oxygen demand (COD). The results revealed strong mineralization of the water and low DO content during the summer period. High levels of TSS and Turb were recorded during rainy periods. In addition, the water was loaded with phosphate (PO₄³⁻) throughout the study period. The PCA results revealed ten factors, three of which were significant (eigenvalues > 1) and together explained 75.5% of the total variance. The F1 and F2 factors explained 36.5% and 26.7% of the total variance, respectively, and indicated anthropogenic pollution of domestic, agricultural, and industrial origin. The MLR turbidity simulation model exhibited a high coefficient of determination (R² = 92.20%), indicating that 92.20% of the data variability can be explained by the model. TSS, DO, EC, NO₃⁻, NO₂⁻, and COD were the most significant contributing parameters (p-values << 0.05) in turbidity prediction. The present study can help with decision-making on the management and monitoring of the water quality of the dam, which is the primary source of drinking water in this region.