• Title/Summary/Keyword: statistical models

A Study on Clinical Variables Contributing to Differentiation of Delirium and Non-Delirium Patients in the ICU (중환자실 섬망 환자와 비섬망 환자 구분에 기여하는 임상 지표에 관한 연구)

  • Ko, Chanyoung;Kim, Jae-Jin;Cho, Dongrae;Oh, Jooyoung;Park, Jin Young
    • Korean Journal of Psychosomatic Medicine / v.27 no.2 / pp.101-110 / 2019
  • Objectives: It is not clear which clinical variables are most closely associated with delirium in the intensive care unit (ICU). By comparing clinical data of ICU patients with and without delirium, we sought to identify the variables that most effectively differentiate delirium from non-delirium. Methods: Medical records of 6,386 ICU patients were reviewed. Random Subset Feature Selection and Principal Component Analysis were used to select a set of clinical variables with the highest discriminatory capacity. Statistical analyses were employed to determine the separation capacity of two models: one using only the selected clinical variables and the other using all clinical variables associated with delirium. Results: There was a significant difference between delirium and non-delirium patients across 32 clinical variables. The Richmond Agitation Sedation Scale (RASS), urinary catheterization, vascular catheterization, the Hamilton Anxiety Rating Scale (HAM-A), blood urea nitrogen, and the Acute Physiology and Chronic Health Evaluation II (APACHE II) most effectively differentiated delirium from non-delirium. Multivariable logistic regression analysis showed that, with the exception of vascular catheterization, these clinical variables were independent risk factors associated with delirium. The separation capacity of the logistic regression model using only these 6 clinical variables was measured with the Receiver Operating Characteristic curve, yielding an Area Under the Curve (AUC) of 0.818. The same analyses were performed using all 32 clinical variables; the AUC was 0.881, denoting a very high separation capacity. Conclusions: The six aforementioned variables most effectively separate delirium from non-delirium. This highlights the importance of closely monitoring patients who have undergone invasive medical procedures and were rated with very low RASS and HAM-A scores.
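
The AUC values quoted in this abstract can be read as the probability that a randomly chosen delirium patient receives a higher model score than a randomly chosen non-delirium patient. A minimal sketch of that rank-based calculation follows; the function name `auc_from_scores` and the six scores are illustrative assumptions, not code or data from the study.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities (1 = delirium, 0 = non-delirium).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(round(auc_from_scores(labels, scores), 3))
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which 0.818 and 0.881 are being compared.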

Environmental Health Surveillance of Low Birth Weight in Seoul using Air Monitoring and Birth Data (2002년 서울시 대기오염과 출생 자료를 이용한 저체중아 환경보건감시체계 연구)

  • Seo, Ju-Hee;Kim, Ok-Jin;Kim, Byung-Mi;Park, Hye-Sook;Leem, Jong-Han;Hong, Yun-Chul;Kim, Young-Ju;Ha, Eun-Hee
    • Journal of Preventive Medicine and Public Health / v.40 no.5 / pp.363-370 / 2007
  • Objectives: The principal objective of this study was to determine the relationship between maternal exposure to air pollution and low birth weight and to propose a possible environmental health surveillance system for low birth weight. Methods: We acquired air monitoring data for Seoul from the Ministry of Environment, meteorological data from the Korean Meteorological Administration, exposure assessments from the National Institute of Environmental Research, and birth data from the Korean National Statistical Office for the period between January 1, 2002 and December 31, 2003. The final birth data were limited to singletons born at 37 to 44 weeks of gestational age. We defined the Low Birth Weight (LBW) group as infants with birth weights of less than 2500 g and calculated the annual LBW rate by district. The air monitoring data comprised CO, $SO_2$, $NO_2$, and $PM_{10}$ concentrations measured at 27 monitoring stations in Seoul. We used two models to evaluate the effects of air pollution on low birth weight: the first examined the relationship between the annual concentration of air pollution and LBW by individual and by district, and the second was a GIS exposure model constructed with ArcView 3.1. Results: LBW risk by Gu (district) was significantly increased, to 1.113 (95% CI: 1.111-1.116) for CO, 1.004 (95% CI: 1.003-1.005) for $NO_2$, 1.202 (95% CI: 1.199-1.206) for $SO_2$, and 1.077 (95% CI: 1.075-1.078) for $PM_{10}$, with each interquartile range change. Personal LBW risk was significantly increased, to 1.081 (95% CI: 1.002-1.166) for CO, 1.145 (95% CI: 1.036-1.267) for $SO_2$, and 1.053 (95% CI: 1.002-1.108) for $PM_{10}$, with each interquartile range change. Personal LBW risk was increased to 1.003 (95% CI: 0.954-1.055) for $NO_2$, but this was not statistically significant. 
The air pollution concentrations predicted by GIS correlated positively with the number of low birth weight cases, particularly in highly polluted regions. Conclusions: Environmental health surveillance is a systematic, ongoing collection effort that includes the analysis of data on environmentally associated diseases and exposures. In addition, environmental health surveillance allows for the timely dissemination of information to those who require it in order to take effective action. GIS modeling is crucially important for this purpose, and thus we attempted to develop a GIS-based environmental surveillance system for low birth weight.

Data Mining Approaches for DDoS Attack Detection (분산 서비스거부 공격 탐지를 위한 데이터 마이닝 기법)

  • Kim, Mi-Hui;Na, Hyun-Jung;Chae, Ki-Joon;Bang, Hyo-Chan;Na, Jung-Chan
    • Journal of KIISE: Information Networking / v.32 no.3 / pp.279-290 / 2005
  • As the damage caused by DDoS attacks grows more serious, rapid detection and proper response mechanisms are urgently needed. However, existing security mechanisms do not defend effectively against these attacks, or their defense capability is limited to specific DDoS attacks. In this paper, we propose a DDoS attack detection architecture using data mining technology that can classify the latest types of DDoS attack and can detect modifications of existing attacks as well as novel attacks. The architecture consists of a Misuse Detection Module, which models and classifies existing attacks, and an Anomaly Detection Module, which models and detects novel attacks. It uses models generated off-line to detect DDoS attacks in real-time traffic. We gathered NetFlow data generated at an access router of our network in order to model and test real network traffic. NetFlow provides useful flow-based statistical information without extensive preprocessing. We also ran well-known DDoS attack tools to gather attack traffic. Our experimental results show that the approach achieves outstanding performance against existing attacks and offers the possibility of detecting novel attacks.

Development of a Predictive Growth Model of Staphylococcus aureus and Shelf-life Estimation of Cooked Mung Bean Sprouts Served in School Foodservice Operations (학교급식에서 제공되는 숙주나물의 Staphylococcus aureus 성장예측모델 개발 및 섭취유효기간 설정)

  • Park, Hyoung-Su;Kim, Min-Young;Jeong, Hyun-Suk;Park, Ki-Hwan;Ryu, Kyung
    • Journal of the Korean Society of Food Science and Nutrition / v.38 no.11 / pp.1618-1624 / 2009
  • This study was conducted to estimate the shelf-life of cooked mung bean sprouts contaminated with Staphylococcus aureus according to storage temperature after cooking in school foodservice operations. A predictive growth model of S. aureus in cooked mung bean sprouts prepared using a standard recipe was developed at 4 storage temperatures (5, 15, 25, and $35^{\circ}C$). To determine the effect of vinegar on the shelf-life of cooked mung bean sprouts, the growth of S. aureus in sprouts prepared using vinegar and in sprouts prepared using the standard recipe was compared. The $R^2$ values of the specific growth rate (SGR) and lag time (LT) determined using the Gompertz model were greater than 0.90 at all temperatures except $5^{\circ}C$, which confirmed that it would be appropriate to use these parameters for a secondary model. The secondary model, which describes changes in LT and SGR values according to storage temperature, was calculated using response surface models. The suitability of the developed model was confirmed by calculating the $R^2$, Bf, Af, and MSE values as statistical parameters. The $R^2$ values of LT and SGR were 0.94 or higher, and the MSE, Bf, and Af values were 0.02 and 0.002, 0.97 and 1.03, and 1.31 and 1.10, respectively, indicating high statistical suitability. The growth rate of S. aureus was higher when the standard recipe was used than when vinegar was used at all temperatures. Indeed, no growth of S. aureus was observed in mung bean sprouts prepared using vinegar. Based on the model developed, cooked mung bean sprouts prepared using the standard recipe for school foodservice should be stored at $10^{\circ}C$ or less. Additionally, sprouts stored at 25 or $35^{\circ}C$ should be consumed within 6 or 12 hours after cooking. Finally, the addition of vinegar will prevent the growth of S. aureus in cooked mung bean sprouts.
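
To make the roles of the specific growth rate (SGR) and lag time (LT) concrete, here is a minimal sketch of a primary growth curve. It assumes the commonly used modified Gompertz form (the Zwietering reparametrization); the abstract does not state which Gompertz form the authors fitted, and the parameter values below are hypothetical, not results from the study.

```python
import math


def modified_gompertz(t, a, mu, lag):
    """Log increase in cell count at time t (hours).

    a   : upper asymptote of the log increase, log10(N_max / N_0)
    mu  : maximum specific growth rate (log units per hour)
    lag : lag time (hours)
    Assumed form: y(t) = a * exp(-exp(mu * e / a * (lag - t) + 1)).
    """
    return a * math.exp(-math.exp(mu * math.e / a * (lag - t) + 1.0))


# Hypothetical parameters for one storage temperature.
a, mu, lag = 5.0, 0.5, 4.0
for hour in (0, 4, 8, 12, 24):
    print(hour, round(modified_gompertz(hour, a, mu, lag), 3))
```

In a two-stage approach such as the one described, `mu` and `lag` would be estimated separately at each storage temperature and then modeled as functions of temperature by the secondary model.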

Price Volatility, Seasonality and Day-of-the Week Effect for Aquacultural Fishes in Korean Fishery Markets (수산물 시장에서의 양식 어류 가격변동성.계절성.요일효과에 관한 연구 - 노량진수산시장의 넙치와 조피볼락을 중심으로 -)

  • Ko, Bong-Hyun
    • The Journal of Fisheries Business Administration / v.40 no.2 / pp.49-70 / 2009
  • This study applies the GARCH model (Bollerslev, 1986) to analyze the structural characteristics of price volatility in the domestic aquacultural fish market of Korea. As a case study, flatfish and rockfish are analyzed as major species that account for a relatively high share of production volume among fish captured in Korea. For the analysis, this study uses daily market data (from January 1, 2000 to June 30, 2008) published by the Noryangjin Fisheries Wholesale Market, located in Seoul, Korea. As a preliminary step in the empirical analysis, this study performs a normality test on the trading volume and price volatility of flatfish and rockfish. The normality test adopted is the Jarque-Bera test. First, the null hypothesis that the empirical distribution follows a normal distribution was rejected for both species. The distributions of the daily market data were positive in both skewness and kurtosis, and were leptokurtic with a long right tail. Second, serial correlation was found in the trading volume and price volatility of both species over a very long period. Third, the results of the unit root test and the ARCH-LM test showed that all time series were stationary and exhibited ARCH effects. These statistical characteristics provide reasonable grounds for using the GARCH model to estimate the conditional variances that represent price volatility in the empirical analysis. From the empirical analysis above, this study drew the following conclusions. First, in the analysis of the potential effects of seasonality and day of the week on the price volatility of aquacultural fish, Monday effects were found in both species, and Thursday and Friday effects were also found in flatfish. 
This indicates that Monday increases price volatility in the aquacultural fish market, and that Monday affects price volatility more than other days of the week because it carries more new information accumulated over the weekend. Second, the empirical analysis led to the common conclusion that the price volatility of flatfish and rockfish was very high. Specifically, the persistence parameter ($\lambda$), an index of how likely current volatility is to persist in the future, was higher than 0.8 (that is, close to 1) in both flatfish and rockfish, which indicates volatility clustering. This study also estimated and compared models under different distributional assumptions for the error term, including a normal distribution, in order to determine the fit of the respective models. The GARCH(1, 1)-t model, in which the error term is assumed to follow a t-distribution, fit better than the model assuming a normal distribution, owing to the fat-tailed distribution described in the results of the basic statistical analysis. In conclusion, this study is meaningful in that it is the first in Korea to investigate the price volatility of Korean aquacultural fishery products, although official statistical data were partly limited. It is therefore expected that the results will be useful as reference material for making and assessing government policies. The results are also expected to help producers build fishery business plans and take timely measures against potential price fluctuations of fishery products in the market. Further studies of price volatility in the fishery market should therefore extend to a wider variety of items and issues in the near future.
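
The conditional-variance recursion behind a GARCH(1, 1) model can be sketched in a few lines. This is an illustration under stated assumptions: the abstract does not define its persistence parameter $\lambda$, so the sketch assumes the usual measure, the sum of the ARCH and GARCH coefficients (`alpha + beta`); the function name and all parameter values are hypothetical.

```python
def garch11_variances(residuals, omega, alpha, beta):
    """Conditional variances of a GARCH(1, 1) process:

        sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1]

    Returns len(residuals) + 1 values; the last one is the one-step-ahead
    forecast. The recursion starts from the unconditional variance.
    """
    persistence = alpha + beta  # assumed meaning of the abstract's lambda
    if omega <= 0 or alpha < 0 or beta < 0 or persistence >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    sigma2 = [omega / (1.0 - persistence)]
    for eps in residuals:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


# Hypothetical coefficients with persistence 0.9 and two residuals.
print(garch11_variances([0.0, 2.0], omega=0.1, alpha=0.1, beta=0.8))
```

With persistence close to 1, a shock to the variance decays slowly, which is what produces the volatility clustering the abstract describes.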

Significance Analysis of Facility Fires Through Spatial Econometrics Assessment (공간계량분석 방법에 따른 시설물 화재 발생 유의성 분석)

  • Seo, Min Song;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.3 / pp.281-293 / 2020
  • Recently, large and small fires have been occurring more often in Korea. Fire is, along with traffic accidents, one of the most frequent disasters in Korean cities, and its frequency is closely related to land use and facility type. In this study, therefore, the significance of fire occurrence was analyzed by considering land use, facility types, and human and social factors, using 10 years of fire data from Jinju city. On this basis, OLS (Ordinary Least Squares) regression, the SLM (Spatial Lag Model), and the SEM (Spatial Error Model) using spatial weights were compared, considering the location of each fire and each factor, and the statistical model with the best fit was presented. LISA analysis of the spatial distribution pattern of fires in Jinju city showed that fire frequency was highest in the central commercial area, followed by the industrial area and the residential area. Multiple regression analysis was performed by integrating demographic, social, and physical variables. The three models were then compared by applying spatial weights to the derived factors. In the significance test, the spatial error model proved to be the most significant. The facilities most highly correlated with fire occurrence were second type neighborhood facilities, followed by detached houses, first type neighborhood facilities, number of households, and sales facilities. The results of this study are expected to be used as significant data to identify factors and manage fire safety in urban areas. In addition, through analysis of the standard deviational ellipse, the distribution characteristics of each facility in the residential, industrial, and central commercial areas among the land-use zones were analyzed. The second type neighborhood facilities, which had the highest fire risk, were concentrated in the center. 
The results of this study are expected to provide useful data for identifying factors and managing fire safety in urban areas.

Optimization of Supercritical Water Oxidation(SCWO) Process for Decomposing Nitromethane (Nitromethane 분해를 위한 초임계수 산화(SCWO) 공정 최적화)

  • Han, Joo Hee;Jeong, Chang Mo;Do, Seung Hoe;Han, Kee Do;Sin, Yeong Ho
    • Korean Chemical Engineering Research / v.44 no.6 / pp.659-668 / 2006
  • The optimization of a supercritical water oxidation (SCWO) process for decomposing nitromethane was studied by means of design of experiments. The optimum operating region of the SCWO process for minimizing the COD and T-N of treated water was obtained in a lab-scale unit. The authors compared the results from an SCWO pilot plant with those from the lab-scale system to explore the problems of scaling up the SCWO process. The COD and T-N of the treated water were selected as key process output variables (KPOV) for optimization, and the reaction temperature (Temp) and the mole ratio of nitromethane to ammonium hydroxide (NAR) were selected as key process input variables (KPIV) through preliminary tests. A central composite design was applied as the statistical design of experiments, and the experimental results were analyzed by the response surface method. The main effects analysis showed that the COD of treated water decreased steeply with increasing Temp but only slightly with increasing NAR, and that T-N decreased with increases in both Temp and NAR. At lower Temp of $420-430^{\circ}C$, T-N decreased steeply with increasing NAR; however, its variation was negligible at higher Temp above $450^{\circ}C$. The regression equations for COD and T-N were obtained as quadratic models in coded Temp and NAR, and they were confirmed with the coefficient of determination ($r^2$) and the normality of standardized residuals. The optimum operating region was defined as Temp $450-460^{\circ}C$ and NAR 1.03-1.08, from the intersection of the areas where the regression equations give COD < 2 mg/L and T-N < 40 mg/L, and with corrosion prevention taken into account. To confirm the optimization results and investigate the scale-up problems of the SCWO process, nitromethane was decomposed in a pilot plant. The experimental results from the SCWO pilot plant were compared with the regression equations for COD and T-N, respectively. 
The COD and T-N results from the pilot plant were predicted well by the regression equations derived from the lab-scale SCWO system, although the errors of the pilot plant data were larger than those of the lab data. The predictability was confirmed by parity plots and normality analyses of standardized residuals.

Effects of the Deer Antler Extract on Scopolamine-induced Memory Impairment and Its Related Enzyme Activities (녹용 추출물이 치매 동물모델의 기억력 개선과 관련효소 활성에 미치는 효과)

  • Lee, Mi-Ra;Sun, Bai-Shen;Gu, Li-Juan;Wang, Chun-Yan;Fang, Zhe-Ming;Wang, Zhen;Mo, Eun-Kyoung;Ly, Sun-Young;Sung, Chang-Keun
    • Journal of the Korean Society of Food Science and Nutrition / v.38 no.4 / pp.409-414 / 2009
  • The aim of this study was to investigate the ameliorating effects of deer antler extract on the learning and memory impairments induced by administration of scopolamine (2 mg/kg, i.p.) in rats. Tacrine was used as a positive control agent for evaluating the cognition-enhancing activity of deer antler extract in scopolamine-induced amnesia models. In the Morris water maze test, scopolamine-induced amnesia was significantly ameliorated in the deer antler extract-treated group (200 mg/kg, p.o.) and the tacrine-treated group (10 mg/kg, p.o.). Although brain ACh contents did not differ significantly among the experimental groups, those of the deer antler extract-treated group were slightly higher than those of the scopolamine-treated group. Brain acetylcholinesterase activity in the deer antler extract-treated group was significantly lower than in the scopolamine-treated group, indicating an inhibitory effect of the extract. The tacrine- and deer antler extract-treated groups showed lower MAO-B activity than the scopolamine-treated group, but the difference was not significant. These results suggest that deer antler extract could be an effective agent for the prevention of cognitive impairment induced by cholinergic dysfunction.

Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market (데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례)

  • Lee, Seon Ah;Chang, Namsik
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.161-177 / 2015
  • With the rapid evolution of technology, the size, number, and types of databases have increased, and data mining approaches face many challenging applications. One such application is the discovery of fraud patterns in agricultural product wholesale transactions. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions are made every day. The demand for agricultural products continues to grow, and the use of electronic auction systems raises the operational efficiency of the wholesale market. The number of unusual transactions is also assumed to increase in proportion to the trading volume, and an unusual transaction is often the first sign of fraud. However, it is very difficult to identify these transactions and the corresponding fraud in the agricultural product wholesale market because the types of fraud are more sophisticated than ever before. Fraud can be detected by manually verifying all transaction records, but this requires a significant amount of human resources and is ultimately not practical. Fraud can also be revealed by a victim's report or complaint, but there are usually no victims in agricultural product wholesale fraud because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and efforts made to prevent fraud, because fraud not only disturbs the fair trade order of the market but also rapidly reduces its credibility. Applying data mining in such an environment is very useful since it can discover unknown fraud patterns or features from a large volume of transaction data. 
The objective of this research is to empirically investigate the factors necessary to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major type of fraud is the phantom transaction, in which the seller (auction company or forwarder) and the buyer (intermediary wholesaler) collude. They pretend to fulfill the transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to the overstatement of sales performance and illegal money transfers, which reduces the credibility of the market. This paper reviews the wholesale market environment, including the types of transactions, the roles of market participants, and the various types and characteristics of fraud, and introduces the whole process of developing the phantom transaction detection model. The process consists of the following 4 modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using a decision-tree induction approach, and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final model, built with the decision-tree induction approach, revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results. However, even though the performance of the results is satisfactory, sensitive issues remain in improving classification accuracy and the conciseness of the rules. One such issue is the robustness of the data mining model. Data mining is very much data-oriented, so data mining models tend to be very sensitive to changes in data or situations. 
Thus, this lack of robustness means that the data mining model requires continuous remodeling as data or situations change. We hope that this paper offers valuable guidelines to organizations and companies that consider introducing or constructing a fraud detection model in the future.

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce provides customers with many advantageous purchase opportunities. In this situation, customers who do not have enough knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect users' preferences and provide recommendation lists to users. Thus, the product recommender system in an online store has become known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect users' preferences disappoint users and waste their time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting users' preferences precisely. The research data were collected from a real-world online store that sells products from famous art galleries and museums in Korea. The data initially contained 5759 transactions, of which 3167 remained after deletion of records with null values. We transform the categorical variables into dummy variables and exclude outliers. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. We perform these data mining techniques using SAS E-Miner software. We partition the dataset into modeling and validation sets for the logistic regression and decision trees, and into training, test, and validation sets for the artificial neural network model. The validation dataset is the same for all experiments. 
Then we combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging is an abbreviation of "Bootstrap Aggregation"; it combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification. This technique is a special form of the averaging method. Bumping is an abbreviation of "Bootstrap Umbrella of Model Parameter," and it considers only the model that has the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group. For the "Poster" product group, the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support associations at 5%, the maximum number of items in an association at 4, and the minimum confidence for rule generation at 10%. This study also excludes extracted association rules with a lift value below 1. We finally obtain fifteen association rules by excluding duplicate rules. Among the fifteen association rules, eleven contain associations between products in the "Office Supplies" product group, one includes an association between the "Office Supplies" and "Fashion" product groups, and the other three contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides lists of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we construct the prototype system using ASP, JavaScript, and Microsoft Access. 
In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The survey participants are 173 people who use MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction using a five-point Likert scale. This study also performs a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level. This means that users were significantly more satisfied with the recommended product list. The results also show that the proposed system may be useful in real-world online stores.
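
The rule filter described in this abstract (support of at least 5%, confidence of at least 10%, lift of at least 1) can be illustrated with a short sketch. The function names and the five toy baskets are hypothetical and only show how the three measures are computed; they are not the study's data or its SAS E-Miner workflow.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_c = sum(1 for t in transactions if c <= set(t))
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def keep_rule(metrics, min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds quoted in the abstract."""
    support, confidence, lift = metrics
    return (support >= min_support
            and confidence >= min_confidence
            and lift >= min_lift)


# Hypothetical baskets.
baskets = [
    {"notebook", "pen"},
    {"notebook", "pen", "scarf"},
    {"notebook"},
    {"scarf"},
    {"pen"},
]
m = rule_metrics(baskets, {"notebook"}, {"pen"})
print([round(x, 3) for x in m], keep_rule(m))
```

A lift above 1 means the two items are bought together more often than would be expected if they were purchased independently, which is why rules with a lift below 1 are discarded.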